2026-06-24 04:00 UTC | Updated: 2026-06-24 07:44 UTC

Safe and Generalizable Hierarchical Multi-Agent RL via Constraint Manifold Control

arXiv:2606.24010v1

Abstract: Multi-agent systems are widely used in safety-critical applications that require coordinated behavior under strict safety constraints. Existing approaches face a fundamental trade-off: learning-based methods achieve strong empirical performance but lack theoretical safety guarantees, while control-theoretic methods enforce safety but often lead to overly conservative and inefficient behaviors. We propose a hierarchical multi-agent reinforcement learning framework that enforces hard safety constraints under mild assumptions at the low level via a constraint manifold, while enabling effective coordination through high-level policy learning. Our approach provides theoretical safety guarantees in the multi-agent setting and yields stationary learning dynamics, thereby enabling stable and efficient training. Empirically, our method achieves competitive performance while maintaining nearly perfect safety rates, and generalizes effectively to varying numbers of agents and obstacles.
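The abstract does not spell out how the low-level constraint manifold is constructed. As a rough illustration of the general idea only, and not the authors' algorithm, the sketch below uses a common null-space projection recipe: an inequality "stay outside the obstacle" is turned into an equality with a slack variable, and the commanded velocity is projected onto the tangent space of the resulting manifold. Everything here is an assumption made for illustration: a single agent with single-integrator dynamics, one circular obstacle, the slack form `0.5 * mu**2`, the gain value, and the hypothetical names `constraint`, `safe_velocity`, and `rollout`. The "high-level" command is a stand-in that simply drives straight at the obstacle.

```python
import numpy as np

def constraint(p, mu, obstacle, radius):
    """c(p, mu) = k(p) + 0.5 * mu**2, where k(p) <= 0 means 'outside the obstacle'."""
    diff = p - obstacle
    return radius**2 - diff @ diff + 0.5 * mu**2

def safe_velocity(p, mu, desired, obstacle, radius, gain=5.0):
    """Project a desired (p, mu) velocity onto the tangent space of {c = 0}."""
    diff = p - obstacle
    J = np.concatenate([-2.0 * diff, [mu]])[None, :]  # 1 x 3 Jacobian of c
    J_pinv = np.linalg.pinv(J)                        # 3 x 1 pseudo-inverse
    P = np.eye(3) - J_pinv @ J                        # projector onto null(J)
    c = constraint(p, mu, obstacle, radius)
    # Tangential motion plus a correction that pulls drift back toward c = 0.
    return P @ desired - gain * c * J_pinv[:, 0]

def rollout(steps=500, dt=0.01):
    """Euler-integrate an agent that is always commanded to hit the obstacle."""
    obstacle, radius = np.zeros(2), 1.0
    p = np.array([2.0, 0.0])
    diff = p - obstacle
    mu = np.sqrt(2.0 * (diff @ diff - radius**2))     # start exactly on c = 0
    min_dist = np.linalg.norm(diff)
    for _ in range(steps):
        heading = (obstacle - p) / np.linalg.norm(obstacle - p)
        desired = np.append(heading, 0.0)             # unsafe stand-in command
        v = safe_velocity(p, mu, desired, obstacle, radius)
        p, mu = p + dt * v[:2], mu + dt * v[2]
        min_dist = min(min_dist, np.linalg.norm(p - obstacle))
    return min_dist, np.linalg.norm(p - obstacle)
```

Calling `rollout()` returns the closest approach to the obstacle center and the final distance. In this toy setting the agent moves toward the obstacle but slows as the slack shrinks, so the command is filtered rather than followed blindly. Discrete-time integration can introduce small drift off the manifold, which is why the correction term is included.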

Source: arXiv AI
Authors: Zihao Guo, Jianing Zhao, Ling Li, Hao Liang, Giuseppe Loianno, Yali Du

[Submitted on 22 Jun 2026]

Comments: 10 pages

Subjects: Artificial Intelligence (cs.AI)

Cite as: arXiv:2606.24010 [cs.AI]

(or arXiv:2606.24010v1 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2606.24010

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Zihao Guo
[v1] Mon, 22 Jun 2026 23:32:23 UTC (4,947 KB)
