2026-06-24 04:00 UTC (updated 2026-06-24 07:44 UTC)

One Year Later...The Harms Persist, But So Do We!

arXiv:2606.23884v1

Abstract: General-purpose large language models (LLMs) are increasingly used for mental health-related conversations, yet safety safeguards remain inadequate and inconsistent across clinical conditions. This study evaluates six proprietary LLMs across 16 DSM-5 conditions using four adversarial attack variants, introducing an eight-dimension harm taxonomy and a multi-dimensional evaluation framework. Results show that safeguards hold reliably only for suicide and self-harm, while conditions such as eating disorders, substance use disorder, and major depressive disorder exhibit failure rates of up to 100%. We argue that ethical design and deployment of these LLMs demand clearly defined harm categories across clinical conditions and implementation of safeguards accordingly. Until such safeguards are in place, these models pose significant risks to vulnerable populations, making their growing integration into educational settings particularly concerning.

Source: arXiv Computational Linguistics

Authors: Annika Marie Schoene, Cansu Canca, Gautham Vijay Kumar, Anson Antony

[Submitted on 22 Jun 2026]

Comments: 20 pages, 8 tables

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Cite as: arXiv:2606.23884 [cs.CL]

(or arXiv:2606.23884v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2606.23884

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Annika Marie Schoene

[v1] Mon, 22 Jun 2026 19:30:14 UTC (460 KB)
