2026-06-09 (updated 2026-06-09)

VisualLeakBench: Reproducible Action-Boundary Propagation Failures in Vision-Language Agents

This paper introduces VisualLeakBench, a benchmark for action-boundary propagation failures in vision-language agents, in which sensitive or unsafe text visible in an image is copied into tool arguments. At baseline, target strings propagate into tool arguments in 78.8% of PII cases and 85.5% of rendered unsafe-text cases. A defensive system prompt cuts PII propagation to 2.0%, largely by suppressing tool use rather than preserving utility, while rendered unsafe-text propagation remains at 52.6%.

Source: arXiv Computer Vision
Authors: Youting Wang, Yuan Tang, Yitian Qian, Chen Zhao


[Submitted on 29 May 2026]


Abstract: Vision-language agents increasingly consume screenshots, documents, and user interfaces before writing to memory, sending messages, or invoking external tools. We study a concrete failure mode in this setting: action-boundary propagation, where sensitive or unsafe visible text is copied from an image into downstream tool arguments. We present VisualLeakBench, a diversified 500-image benchmark spanning UI, chat, document, form, and dashboard scenes, and evaluate a stratified 100-image agent subset with four production VLM systems under two workflows: note capture and external handoff. At baseline, target strings are propagated into tool arguments in 78.8% of PII cases and 85.5% of rendered unsafe-text cases. Under a defensive system prompt, rendered unsafe-text propagation remains high at 52.6%, while PII tool propagation falls to 2.0%, largely by suppressing tool use rather than preserving utility. Rates are tool-surface dependent: search-like tools suppress PII propagation, but rendered unsafe text still crosses tool boundaries. We measure visual-to-tool propagation rather than downstream instruction execution. We additionally provide a labeled-target oracle upper-bound diagnostic that localizes most failures at the tool boundary while leaving response-side leakage as residual risk.
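The propagation rates quoted in the abstract can be read as a per-case check of whether a labeled target string shows up in any tool argument. The sketch below illustrates that reading only; it is not the authors' evaluation code. The `ToolCall` and `Case` record types, the `normalize` helper, and the case-insensitive substring match are all assumptions made for illustration, and the paper's actual matching rule may differ.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """One tool invocation emitted by the agent (hypothetical record format)."""
    name: str
    arguments: dict = field(default_factory=dict)


@dataclass
class Case:
    """One benchmark case: a labeled target string plus the agent's tool calls."""
    target: str
    tool_calls: list = field(default_factory=list)


def normalize(text: str) -> str:
    # Assumption: matching ignores case and collapses runs of whitespace.
    return " ".join(str(text).lower().split())


def propagated(case: Case) -> bool:
    """Return True if the target appears in any argument of any tool call."""
    needle = normalize(case.target)
    if not needle:
        # An empty target would trivially match everything, so treat it as no match.
        return False
    return any(
        needle in normalize(value)
        for call in case.tool_calls
        for value in call.arguments.values()
    )


def propagation_rate(cases: list) -> float:
    """Fraction of cases whose target string crossed the tool boundary."""
    if not cases:
        return 0.0
    return sum(propagated(c) for c in cases) / len(cases)


if __name__ == "__main__":
    # Made-up example cases; none of these values come from the benchmark.
    cases = [
        Case("jane@example.com",
             [ToolCall("save_note", {"text": "Contact: Jane@Example.com"})]),
        Case("555-0100", [ToolCall("search", {"query": "weather"})]),
        Case("secret", []),  # agent made no tool call at all
    ]
    print(f"propagation rate: {propagation_rate(cases):.3f}")
```

Under this definition a case with no tool calls counts as non-propagating, so the rate can drop simply because the agent stops calling tools. That is consistent with the abstract's remark that the defensive prompt lowers PII propagation largely by suppressing tool use, and it is why a separate utility measure would be needed alongside the rate.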

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

Cite as: arXiv:2606.07595 [cs.CV]

(or arXiv:2606.07595v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2606.07595

arXiv-issued DOI via DataCite

Submission history

From: Yuan Tang
[v1] Fri, 29 May 2026 05:17:03 UTC (73 KB)
