Quantifying Aleatoric Uncertainty of In-Context Learning for Robust Measure of LLM Prediction Confidence
This paper introduces self-function vectors to directly estimate aleatoric uncertainty in in-context learning under a Bayesian framework, along with the first rigorous evaluation protocol to separate aleatoric from epistemic uncertainty. Experiments show the method reliably measures LLM prediction uncertainty and can be used for hallucination detection. Accepted to ACL 2026.
[Submitted on 28 Apr 2026]
Authors: Jinseok Chung and 3 other authors
Abstract: In-Context Learning (ICL) allows LLMs to adapt to new tasks from a few demonstrations, but its reliability remains a concern: predictions are highly sensitive to both prompt design and the model's ability to understand the context, which obscures whether failures arise from data properties or model limitations. Uncertainty decomposition, that is, separating aleatoric from epistemic sources, is particularly crucial in this setting, yet existing methods, designed for standard generation tasks, fail to capture the unique dynamics of ICL. To address this, we introduce the concept of self-function vectors, built upon Bayesian views and the mechanistic interpretability of ICL. These vectors leverage internal model representations to model the latent concept learned during in-context prompting, thereby enabling direct estimation of aleatoric uncertainty within a Bayesian framework and avoiding reliance on brittle input or decoding manipulations. Given the lack of established benchmarks and suitable evaluation protocols, we also propose the first rigorous evaluation protocol, in which data is manipulated in controlled ways so as to quantify aleatoric uncertainty precisely and separately from epistemic uncertainty. With this new evaluation framework, initially grounded in synthetic tasks for conceptual development and subsequently extended to real-world datasets, we show that our proposed methodology measures the uncertainty of LLM predictions made under ICL more reliably than existing alternative methods. Moreover, we show that it can be used as a practical tool for trustworthiness-related applications, such as hallucination detection. Our findings open a new direction for connecting the quantitative view of uncertainty with the mechanistic understanding of model behavior.
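For readers unfamiliar with the aleatoric/epistemic split the abstract refers to, the sketch below shows the classical entropy-based decomposition over a set of predictive distributions. It is not the paper's self-function vector method, which the abstract does not specify in detail. The function names `entropy` and `decompose_uncertainty` are hypothetical, and the assumption that each distribution comes from a different demonstration set or model sample is ours, made only for illustration.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def decompose_uncertainty(dists):
    """Classical entropy decomposition (illustrative, not the paper's method).

    dists: list of K probability vectors over the same label set, e.g. one
    per sampled demonstration set (an assumption made for this sketch).

    Returns (total, aleatoric, epistemic) where
      total     = entropy of the averaged distribution,
      aleatoric = average entropy of the individual distributions,
      epistemic = total - aleatoric (the mutual information term).
    """
    k = len(dists)
    n = len(dists[0])
    mean = [sum(d[i] for d in dists) / k for i in range(n)]
    total = entropy(mean)
    aleatoric = sum(entropy(d) for d in dists) / k
    # Clamp tiny negative values caused by floating-point rounding.
    epistemic = max(0.0, total - aleatoric)
    return total, aleatoric, epistemic


# All samples agree that the label is a coin flip: the uncertainty is aleatoric.
agree = decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]])

# Samples are individually confident but contradict each other: epistemic.
disagree = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])
```

In the first case the whole of the total entropy is attributed to the aleatoric term, while in the second the same total is attributed to the epistemic term, which is the distinction the abstract argues existing methods fail to draw reliably under ICL.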
Comments: Accepted to ACL 2026
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2606.19353 [cs.CL]
(or arXiv:2606.19353v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2606.19353
arXiv-issued DOI via DataCite
Submission history
From: Jinseok Chung
[v1] Tue, 28 Apr 2026 09:47:40 UTC (1,668 KB)