When Does Adaptive Guidance Help? Belief-Aware Privileged Distillation for Autonomous Driving Under Partial Observability
This paper proposes Belief-Aware GSAC (BA-GSAC), which adaptively modulates the distillation coefficient λ via ensemble disagreement, and uses it to study when adaptive guidance helps autonomous driving under partial observability. Preliminary single-seed runs suggest benefits under mild and moderate partial observability, but under severe occlusion (evaluated with 3 seeds) the adaptive coefficient collapses to its minimum within about 3K steps. The author attributes this to 'observability blindness': the ensemble predicts partial observations, so its disagreement stays low and it cannot detect missing information. The proposed fix, training the ensemble on full-state predictions, is not validated in the paper. A simple linear decay schedule achieves the best severe-POMDP performance (mean 116.5, CV=8.9%), suggesting that the stability gain comes from the scheduling effect rather than from the ensemble.
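The observability blindness argument can be illustrated with a toy calculation. The sketch below is not the paper's code: the numbers are made up, the `disagreement` helper is hypothetical, and it assumes occluded slots are zero-padded in the partial observation. It only shows why an ensemble that predicts partial observations can agree closely even when half the state is hidden.

```python
import statistics

def disagreement(member_predictions):
    """Mean per-dimension population variance across ensemble members."""
    dims = list(zip(*member_predictions))
    return sum(statistics.pvariance(d) for d in dims) / len(dims)

# Made-up predictions from 3 ensemble members over 4 dimensions.
# The last two dimensions are occluded.

# Target = partial observation (assumed zero-padded where occluded):
# every member learns to output zeros there, so the members agree.
partial_obs_preds = [
    [0.50, 0.20, 0.0, 0.0],
    [0.52, 0.19, 0.0, 0.0],
    [0.49, 0.21, 0.0, 0.0],
]

# Target = full state: members have to guess the hidden dimensions,
# so their predictions can spread out and expose the uncertainty.
full_state_preds = [
    [0.50, 0.20, 0.10, 0.80],
    [0.52, 0.19, 0.60, 0.30],
    [0.49, 0.21, 0.35, 0.55],
]

blind = disagreement(partial_obs_preds)
aware = disagreement(full_state_preds)
print(f"partial-observation targets: {blind:.5f}")
print(f"full-state targets:          {aware:.5f}")
```

In this toy setting the partial-observation ensemble reports lower disagreement, which is the failure mode the summary describes; whether full-state targets fix it in practice is left unvalidated by the paper.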
Article intelligence
Key points
- BA-GSAC dynamically adjusts the distillation coefficient λ, using ensemble disagreement, when distilling a privileged full-state teacher into a partial-observation student for autonomous driving.
- Adaptive guidance appears to help under mild to moderate partial observability (preliminary single-seed runs) but fails under severe occlusion because of observability blindness.
- A simple linear decay schedule achieves the best performance under severe POMDP conditions, with the stability benefit driven by scheduling rather than adaptivity.
- The author proposes, but does not validate, training the ensemble on full-state predictions to improve uncertainty awareness.
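The two coefficient strategies contrasted in the key points can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formulation: the saturating mapping in `adaptive_lambda`, the `scale` parameter, and the bounds `LAMBDA_MIN` and `LAMBDA_MAX` (borrowed from the fixed values 0.01 and 0.1 as placeholders) are all hypothetical.

```python
LAMBDA_MIN = 0.01  # assumed lower bound; the source does not give its value
LAMBDA_MAX = 0.1   # assumed upper bound

def adaptive_lambda(disagreement, scale=1.0):
    """Map ensemble disagreement (>= 0) into [LAMBDA_MIN, LAMBDA_MAX].

    Assumed saturating form: zero disagreement gives LAMBDA_MIN and
    large disagreement approaches LAMBDA_MAX.
    """
    weight = disagreement / (disagreement + scale)
    return LAMBDA_MIN + (LAMBDA_MAX - LAMBDA_MIN) * weight

def linear_decay_lambda(step, total_steps):
    """Deterministic schedule: LAMBDA_MAX at step 0, LAMBDA_MIN at the end."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return LAMBDA_MAX + (LAMBDA_MIN - LAMBDA_MAX) * frac

# If the ensemble reports almost no disagreement, the adaptive
# coefficient sits at its floor regardless of how much is occluded.
print(adaptive_lambda(0.0))
print(linear_decay_lambda(50, 100))
```

The design difference is that the adaptive rule depends entirely on the quality of the disagreement signal, while the linear schedule ignores it and depends only on the training step.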
Why it matters
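The findings offer practical guidance for designing uncertainty-aware teacher-student frameworks: what the uncertainty ensemble is trained to predict is an important design choice, and under severe partial observability a simple deterministic schedule can outperform an adaptive coefficient.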
Technical impact
May affect model selection, inference cost, product capability, and evaluation benchmarks.
[Submitted on 24 May 2026]
Author: Mehmet Haklidir
Abstract: Guided Soft Actor-Critic (GSAC) distills knowledge from a privileged full-state teacher to a partial-observation student for autonomous driving, but uses a fixed distillation coefficient lambda regardless of the agent's uncertainty. We present Belief-Aware GSAC (BA-GSAC), which modulates lambda via ensemble disagreement, and use it as a testbed for a systematic empirical study asking: when does adaptive guidance actually help? Evaluating five strategies (fixed lambda in {0.01, 0.1}, adaptive, linear decay, and vanilla SAC) across three POMDP difficulty levels on Highway-Env, we find that preliminary single-seed runs suggest benefits under mild and moderate partial observability, but under severe occlusion (evaluated with 3 seeds for all methods) the adaptive coefficient collapses to lambda_min within about 3K steps. We trace this to an observability blindness phenomenon: because the ensemble predicts partial observations, it achieves low disagreement even under heavy occlusion, modeling what is visible but unable to detect what is missing. We diagnose the root cause and propose an architectural fix (training the ensemble on full-state predictions using the guiding actor's privileged access); while not validated here, we show that even with current limitations, the warmup phase provides measurable stabilization (CV=13.3% vs. 29.8% for constant lambda=0.01). In fact, a simple deterministic linear decay schedule achieves the best severe-POMDP performance across all metrics (mean 116.5, CV=8.9%), suggesting that the scheduling effect, not the ensemble, drives the stability benefit. These findings provide practical guidance for designing uncertainty-aware teacher-student frameworks and highlight ensemble prediction targets as an important design choice.
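The stability figures in the abstract are coefficients of variation (CV), presumably computed over per-seed returns. A minimal sketch of that metric follows; the use of the sample standard deviation and the example returns are assumptions for illustration, not values or conventions taken from the paper.

```python
import statistics

def coefficient_of_variation(returns):
    """CV in percent: sample standard deviation divided by the mean."""
    mean = statistics.mean(returns)
    return 100.0 * statistics.stdev(returns) / mean

# Hypothetical per-seed returns for one method (3 seeds).
seed_returns = [100.0, 110.0, 120.0]
cv = coefficient_of_variation(seed_returns)
print(f"mean={statistics.mean(seed_returns):.1f}, CV={cv:.1f}%")
```

A lower CV means the method's return varies less across seeds relative to its mean, which is how the abstract compares the stability of the strategies.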
Comments: 9 pages, 3 figures, 7 tables. Accepted at CVPR 2026 Workshop on Autonomous Driving (WAD)
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
ACM classes: I.2.9; I.2.6
Cite as: arXiv:2605.26155 [cs.RO]
(or arXiv:2605.26155v1 [cs.RO] for this version)
https://doi.org/10.48550/arXiv.2605.26155
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Mehmet Haklidir [view email] [v1] Sun, 24 May 2026 04:41:30 UTC (1,079 KB)