ZAYA1-8B Technical Report
ZAYA1-8B is a reasoning-focused mixture-of-experts model with 700M active and 8B total parameters, trained on AMD hardware. It matches or exceeds DeepSeek-R1-0528 on several math and coding benchmarks and introduces Markovian RSA, a test-time compute method.
Article intelligence
Key points
- ZAYA1-8B features 700M active parameters and 8B total parameters, trained on a full-stack AMD platform.
- It matches or exceeds DeepSeek-R1-0528 on multiple math and coding benchmarks.
- Post-training uses a four-stage RL cascade: reasoning warmup, RLVE-Gym, math/code RL, and behavioral RL.
- Markovian RSA test-time compute boosts AIME'25 accuracy to 91.9% and HMMT'25 to 89.6%.
Why it matters
With under 1B active parameters, ZAYA1-8B matches or exceeds DeepSeek-R1-0528 on several challenging math and coding benchmarks, and its core training ran on a full-stack AMD platform.
Technical impact
Because the model activates only 700M of its 8B parameters, the results may affect model selection, inference cost, product capability, and evaluation benchmarks.
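As a rough illustration of the parameter counts above, the following sketch computes the share of parameters that is active per token. The function names are hypothetical, and the figure of about 2 FLOPs per active parameter is a common rule of thumb assumed here, not a number from the report.

```python
# Parameter counts as stated in the report summary.
ACTIVE_PARAMS = 700e6
TOTAL_PARAMS = 8e9


def active_fraction(active: float, total: float) -> float:
    """Return the share of all parameters that is active for each token."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


def approx_forward_flops_per_token(active: float) -> float:
    """Rule-of-thumb estimate (an assumption, not from the report):
    roughly 2 FLOPs per active parameter for one forward pass."""
    return 2.0 * active


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
flops = approx_forward_flops_per_token(ACTIVE_PARAMS)
print(f"active fraction: {fraction:.4f}")
print(f"approx forward FLOPs per token: {flops:.2e}")
```

Under these assumptions, fewer than one in ten parameters takes part in each token's forward pass, which is why active rather than total parameter count is the more relevant figure for a rough per-token compute estimate.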
Computer Science > Artificial Intelligence
arXiv:2605.05365 (cs)
[Submitted on 6 May 2026]
Title: ZAYA1-8B Technical Report
Authors: Robert Washbourne, Rishi Iyer, Tomas Figliolia, Henry Zheng, Ryan Lorig-Roach, Sungyeon Yang, Pritish Yuvraj, Quentin Anthony, Yury Tokpanov, Xiao Yang, Ganesh Nanduru, Stephen Ebert, Praneeth Medepalli, Skyler Szot, Srivatsan Rajagopal, Alex Ong, Bhavana Mehta, Beren Millidge
Abstract: We present ZAYA1-8B, a reasoning-focused mixture-of-experts (MoE) model with 700M active and 8B total parameters, built on Zyphra's MoE++ architecture. ZAYA1-8B's core pretraining, midtraining, and supervised fine-tuning (SFT) were performed on a full-stack AMD compute, networking, and software platform. With under 1B active parameters, ZAYA1-8B matches or exceeds DeepSeek-R1-0528 on several challenging mathematics and coding benchmarks, and remains competitive with substantially larger open-weight reasoning models. ZAYA1-8B was trained from scratch for reasoning, with reasoning data included from pretraining onward using an answer-preserving trimming scheme. Post-training uses a four-stage RL cascade: reasoning warmup on math and puzzles; a 400-task RLVE-Gym curriculum; math and code RL with test-time compute traces and synthetic code environments built from competitive-programming references; and behavioral RL for chat and instruction following. We also introduce Markovian RSA, a test-time compute method that recursively aggregates parallel reasoning traces while carrying forward only bounded-length reasoning tails between rounds. In TTC evaluation, Markovian RSA raises ZAYA1-8B to 91.9% on AIME'25 and 89.6% on HMMT'25 while carrying forward only a 4K-token tail, narrowing the gap to much larger reasoning models including Gemini-2.5 Pro, DeepSeek-V3.2, and GPT-5-High.
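The abstract describes Markovian RSA only in one sentence: parallel reasoning traces are aggregated recursively, and only a bounded-length tail of each trace is carried into the next round. The sketch below is one possible reading of that sentence, not the authors' implementation. The `generate` callable, the population and subset sizes, the random subset sampling, the prompt wording, and the use of whitespace-separated words as stand-ins for tokens are all assumptions made for illustration.

```python
import random
from typing import Callable, List

# Hypothetical model interface: takes a prompt, returns a reasoning trace.
Generate = Callable[[str], str]


def tail(trace: str, max_tokens: int) -> str:
    """Keep only the last max_tokens whitespace-separated tokens of a trace."""
    if max_tokens <= 0:
        return ""
    tokens = trace.split()
    return " ".join(tokens[-max_tokens:])


def markovian_aggregate(
    question: str,
    generate: Generate,
    population: int = 4,   # assumed number of parallel traces per round
    subset: int = 2,       # assumed number of tails shown to each aggregation call
    rounds: int = 3,       # assumed number of aggregation rounds
    tail_tokens: int = 8,  # bound on what is carried forward per trace
    seed: int = 0,
) -> List[str]:
    """Recursively aggregate parallel traces, passing only bounded tails forward."""
    rng = random.Random(seed)
    # Round 0: independent parallel traces for the same question.
    traces = [generate(question) for _ in range(population)]
    for _ in range(rounds):
        # Only the tails survive between rounds, so the carried state is bounded.
        tails = [tail(t, tail_tokens) for t in traces]
        new_traces = []
        for _ in range(population):
            picked = rng.sample(tails, subset)
            prompt = (
                question
                + "\n\nTails of previous attempts:\n"
                + "\n---\n".join(picked)
                + "\n\nCombine these into an improved solution."
            )
            new_traces.append(generate(prompt))
        traces = new_traces
    return traces


def toy_generate(prompt: str) -> str:
    """Stand-in for a model call: a long trace that ends in an answer."""
    return " ".join(["step"] * 50) + " answer: 42"


final_traces = markovian_aggregate("What is 6 * 7?", toy_generate)
print(len(final_traces), "traces after aggregation")
```

In this reading, the prompt for each round depends only on the question and the truncated tails of the previous round, so the context passed forward stays bounded no matter how long the individual traces grow. The abstract reports a 4K-token tail in its evaluation; the default of 8 here is only for the toy example.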
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as: arXiv:2605.05365 [cs.AI]
(or arXiv:2605.05365v1 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2605.05365
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Quentin Anthony
[v1] Wed, 6 May 2026 18:44:08 UTC (1,590 KB)