AI News HubLIVE
原文2 min read

VASO: Formally Verifiable Self-Evolving Skills for Physical AI Agents

VASO is a framework that uses formal verification to guide the self-evolution of LLM-generated robot skill contracts. On Clearpath Jackal and PX4 quadcopter tasks, it achieves 97.2% formal-specification compliance with fewer than 100 optimization samples, outperforming execution-feedback, prompt-optimization, and fine-tuning baselines. It is the first framework to close the loop between formal verification and self-evolving skills for physical AI agents.

SourcearXiv RoboticsAuthor: Yunhao Yang, Neel P. Bhatt, Kevin Wang, Samuel Tetteh, Zhangyang Wang, Ufuk Topcu

[2606.05395] VASO: Formally Verifiable Self-Evolving Skills for Physical AI Agents

[Submitted on 3 Jun 2026]

Title:VASO: Formally Verifiable Self-Evolving Skills for Physical AI Agents

View a PDF of the paper titled VASO: Formally Verifiable Self-Evolving Skills for Physical AI Agents, by Yunhao Yang and 5 other authors

View PDF HTML (experimental)

Abstract:Reusable robot skills are becoming the basic units through which embodied agents turn open-ended instructions into long-horizon physical behavior. We argue that, while foundation models have collapsed the cost of creating these skills, the cost of trusting them has not. Existing skill-evolution loops refine skills through execution feedback, unit tests, environment reward, or LLM self-critique, but these signals provide only trace-level evidence: they show that a skill worked on sampled executions, not that skill-induced plans satisfy temporal safety contracts under untested conditions. We introduce VASO, a framework for verification-guided self-evolution of LLM-generated robot skill contracts. In VASO, each skill is represented as a semantic contract with two coupled interfaces: a formal interface that aligns robot states, observations, and control commands with logical propositions for model checking, and a planner-facing interface that guides executable behavior generation. A model checker first filters logically inconsistent skill contracts, then verifies plans induced by the skill against global and local temporal specifications. When verification fails, VASO translates the counterexample trace into a textual gradient that updates the reusable skill contract while keeping foundation-model weights frozen. On Clearpath Jackal and PX4 quadcopter tasks, VASO reaches 97.2% formal-specification compliance using fewer than 100 optimization samples, outperforming execution-feedback, prompt-optimization, and fine-tuning baselines. To our knowledge, VASO is the first framework that closes the loop between formal verification and self-evolving LLM-generated skills for physical AI agents: formal counterexamples become optimization feedback for reusable robot skill contracts, rather than merely verifying one-off plans, tuning planner prompts, or fine-tuning model weights.

Comments: Project webpage: this https URL

Subjects:

Robotics (cs.RO); Artificial Intelligence (cs.AI)

Cite as: arXiv:2606.05395 [cs.RO]

(or arXiv:2606.05395v1 [cs.RO] for this version)

https://doi.org/10.48550/arXiv.2606.05395

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Yunhao Yang [view email] [v1] Wed, 3 Jun 2026 20:02:35 UTC (18,510 KB)

Full-text links:

Access Paper:

View a PDF of the paper titled VASO: Formally Verifiable Self-Evolving Skills for Physical AI Agents, by Yunhao Yang and 5 other authors

View PDF

HTML (experimental)

TeX Source

view license

Current browse context:

cs.RO

new | recent | 2026-06

Change to browse by:

cs cs.AI

References & Citations

NASA ADS

Google Scholar

Semantic Scholar

Loading...

Data provided by:

Bibliographic Tools

Bibliographic and Citation Tools

Bibliographic Explorer Toggle

Bibliographic Explorer (What is the Explorer?)

Connected Papers Toggle

Connected Papers (What is Connected Papers?)

Litmaps Toggle

Litmaps (What is Litmaps?)

scite.ai Toggle

scite Smart Citations (What are Smart Citations?)

Code, Data, Media

Code, Data and Media Associated with this Article

alphaXiv Toggle

alphaXiv (What is alphaXiv?)

Links to Code Toggle

CatalyzeX Code Finder for Papers (What is CatalyzeX?)

DagsHub Toggle

DagsHub (What is DagsHub?)

GotitPub Toggle

Gotit.pub (What is GotitPub?)

Huggingface Toggle

Hugging Face (What is Huggingface?)

ScienceCast Toggle

ScienceCast (What is ScienceCast?)

Demos

Demos

Replicate Toggle

Replicate (What is Replicate?)

Spaces Toggle

Hugging Face Spaces (What is Spaces?)

Spaces Toggle

TXYZ.AI (What is TXYZ.AI?)

Related Papers

Recommenders and Search Tools

Link to Influence Flower

Influence Flower (What are Influence Flowers?)

Core recommender toggle

CORE Recommender (What is CORE?)

Author

Venue

Institution

Topic

About arXivLabs

arXivLabs: experimental projects with community collaborators

arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.

Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.

Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.

Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)