2026-06-02 04:00 UTC · Updated: 2026-06-30 13:03 UTC

AEyeDE: An Attention-Based Attribution Framework for AI-Generated Text Detection

Detecting AI-generated text is becoming increasingly challenging as modern language models approach human-level fluency. AEyeDE proposes an attribution-driven approach that uses model attention as a discriminative signal. A proxy Transformer model extracts attention-based attribution matrices, and a lightweight CNN learns representations from them. The method outperforms a text-only baseline in encoder-decoder translation settings, performs strongly on generator-specific detection in decoder-only settings, and is robust under cross-dataset transfer and alternative-spelling perturbations. The findings suggest that attention-based attribution maps provide a complementary and interpretable signal for AI-generated text detection.

Source: arXiv Computational Linguistics

Authors: Aria Nourbakhsh, Adelaide Danilov, Christoph Schommer, Salima Lamsiyah

[Submitted on 13 Apr 2026]

Abstract: Detecting AI-generated text is becoming increasingly challenging as modern language models approach human-level fluency and can evade detectors that rely on surface statistics or likelihood-based signals. We propose AEyeDE, an attribution-driven approach to human-AI authorship detection that leverages model attention as a discriminative signal. Specifically, we extract attention-based attribution matrices for both human- and AI-generated text using a proxy Transformer model with white-box access and train a lightweight Convolutional Neural Network to learn representations from these attribution maps. Across encoder-decoder translation settings, our method consistently outperforms a text-only baseline. In decoder-only settings, it performs strongly in generator-specific detection, remains competitive on standard benchmarks, and shows robustness under cross-dataset transfer and alternative-spelling perturbations. We further show that attention maps exhibit recurring local structures whose relative frequencies differ consistently between human- and AI-generated text across datasets and proxy models. These findings suggest that attention-based attribution maps provide a complementary and interpretable signal for AI-generated text detection. We will make the code publicly available to support future research.
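To make the idea of "recurring local structures" in an attention map concrete, here is a minimal toy sketch. It is not the authors' implementation: the abstract does not specify the proxy model, the attribution method, or the CNN architecture. The sketch assumes a single head of plain scaled dot-product attention over hand-made token vectors, and it replaces the learned CNN with a simple count of binarized 2x2 patches. The function names, the patch size, and the uniform-attention threshold are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_map(queries, keys):
    """Row-wise softmax(Q K^T / sqrt(d)) for one toy attention head."""
    d = len(queries[0])
    scores = [
        [sum(q_i * k_i for q_i, k_i in zip(q, k)) / math.sqrt(d) for k in keys]
        for q in queries
    ]
    return [softmax(row) for row in scores]


def local_pattern_frequencies(attn, size=2):
    """Relative frequency of each binarized size x size patch in the map.

    Assumption: a cell counts as "active" when it exceeds the uniform
    attention level 1 / n_cols. This threshold is illustrative only.
    """
    n_rows, n_cols = len(attn), len(attn[0])
    threshold = 1.0 / n_cols
    counts = Counter()
    for i in range(n_rows - size + 1):
        for j in range(n_cols - size + 1):
            patch = tuple(
                int(attn[i + di][j + dj] > threshold)
                for di in range(size)
                for dj in range(size)
            )
            counts[patch] += 1
    total = sum(counts.values())
    return {pattern: count / total for pattern, count in counts.items()}


# Toy input: three orthogonal "token" vectors, so each token attends
# almost entirely to itself and the map is close to the identity matrix.
toy_tokens = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
attn = attention_map(toy_tokens, toy_tokens)
freqs = local_pattern_frequencies(attn)
```

On this toy input, two of the four 2x2 patches lie on the diagonal, so the pattern `(1, 0, 0, 1)` has relative frequency 0.5. Comparing such frequency tables between human-written and AI-generated samples is one simple way to picture the kind of signal the abstract describes; in the paper, a CNN learns its representations from the attribution maps rather than relying on hand-counted patterns.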

Comments: 24 pages, 2 figures

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

ACM classes: I.2.7

Cite as: arXiv:2606.00016 [cs.CL] (or arXiv:2606.00016v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2606.00016

arXiv-issued DOI via DataCite

Submission history

From: Adelaide Danilov

[v1] Mon, 13 Apr 2026 19:30:40 UTC (2,228 KB)
