AI News HubLIVE
原文2 min read

Quantifying Aleatoric Uncertainty of In-Context Learning for Robust Measure of LLM Prediction Confidence

This paper introduces self-function vectors to directly estimate aleatoric uncertainty in in-context learning under a Bayesian framework, along with the first rigorous evaluation protocol to separate aleatoric from epistemic uncertainty. Experiments show the method reliably measures LLM prediction uncertainty and can be used for hallucination detection. Accepted to ACL 2026.

SourcearXiv Computational LinguisticsAuthor: Jinseok Chung, Minkyoung Song, Hyunji Jung, Namhoon Lee

[2606.19353] Quantifying Aleatoric Uncertainty of In-Context Learning for Robust Measure of LLM Prediction Confidence

[Submitted on 28 Apr 2026]

Title:Quantifying Aleatoric Uncertainty of In-Context Learning for Robust Measure of LLM Prediction Confidence

View a PDF of the paper titled Quantifying Aleatoric Uncertainty of In-Context Learning for Robust Measure of LLM Prediction Confidence, by Jinseok Chung and 3 other authors

View PDF HTML (experimental)

Abstract:In-Context Learning (ICL) allows LLMs to adapt to new tasks from a few demonstrations, but its reliability remains a concern: predictions are highly sensitive to both prompt design and the model's ability to understand the context, obscuring whether failures arise from data properties or model limitations. Uncertainty decomposition-separating aleatoric from epistemic sources-is particularly crucial in this setting, yet existing methods, designed for standard generation tasks, fail to capture the unique dynamics of ICL. To address this, we introduce a concept of self-function vectors, built upon Bayesian views and the mechanistic interpretability of ICL. These vectors leverage internal model representations to model the latent concept learned during in-context prompting, thereby enabling a direct estimation of aleatoric uncertainty within a Bayesian framework and circumventing the reliance on brittle input or decoding manipulations. Given the lack of established benchmarks and suitable evaluation protocols, we also propose the first and rigorous evaluation protocol, in which data is manipulated in controlled ways so as to quantify aleatoric uncertainty precisely and separately from epistemic uncertainty. With this new evaluation framework, initially grounded in synthetic tasks for conceptual development and subsequently extended to real-world datasets, we show that our proposed methodology can measure uncertainty of LLM predictions made under ICL more reliably than existing alternative methods. Moreover, we show it can be used as a practical tool for trustworthy-related applications, such as hallucination detection. Our findings pave a new direction for connecting the quantitative view of uncertainty with the mechanistic understanding of model behavior.

Comments: Accepted to ACL 2026

Subjects:

Computation and Language (cs.CL); Machine Learning (cs.LG)

Cite as: arXiv:2606.19353 [cs.CL]

(or arXiv:2606.19353v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2606.19353

arXiv-issued DOI via DataCite

Submission history

From: Jinseok Chung [view email] [v1] Tue, 28 Apr 2026 09:47:40 UTC (1,668 KB)

Full-text links:

Access Paper:

View a PDF of the paper titled Quantifying Aleatoric Uncertainty of In-Context Learning for Robust Measure of LLM Prediction Confidence, by Jinseok Chung and 3 other authors

View PDF

HTML (experimental)

TeX Source

view license

Current browse context:

cs.CL

new | recent | 2026-06

Change to browse by:

cs cs.LG

References & Citations

NASA ADS

Google Scholar

Semantic Scholar

Loading...

Data provided by:

Bibliographic Tools

Bibliographic and Citation Tools

Bibliographic Explorer Toggle

Bibliographic Explorer (What is the Explorer?)

Connected Papers Toggle

Connected Papers (What is Connected Papers?)

Litmaps Toggle

Litmaps (What is Litmaps?)

scite.ai Toggle

scite Smart Citations (What are Smart Citations?)

Code, Data, Media

Code, Data and Media Associated with this Article

alphaXiv Toggle

alphaXiv (What is alphaXiv?)

Links to Code Toggle

CatalyzeX Code Finder for Papers (What is CatalyzeX?)

DagsHub Toggle

DagsHub (What is DagsHub?)

GotitPub Toggle

Gotit.pub (What is GotitPub?)

Huggingface Toggle

Hugging Face (What is Huggingface?)

ScienceCast Toggle

ScienceCast (What is ScienceCast?)

Demos

Demos

Replicate Toggle

Replicate (What is Replicate?)

Spaces Toggle

Hugging Face Spaces (What is Spaces?)

Spaces Toggle

TXYZ.AI (What is TXYZ.AI?)

Related Papers

Recommenders and Search Tools

Link to Influence Flower

Influence Flower (What are Influence Flowers?)

Core recommender toggle

CORE Recommender (What is CORE?)

Author

Venue

Institution

Topic

About arXivLabs

arXivLabs: experimental projects with community collaborators

arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.

Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.

Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.

Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)