ReasoningBank: Enabling agents to learn from experience
Google Cloud researchers introduce ReasoningBank, a novel agent memory framework that distills generalizable reasoning strategies from both successful and failed experiences, enabling agents to continuously learn after deployment. It outperforms baselines on web and software engineering benchmarks.
April 21, 2026
Jun Yan and Chen-Yu Lee, Research Scientists, Google Cloud
ReasoningBank is a novel agent memory framework that uses successful and failed experiences to distill generalizable reasoning strategies, enabling an agent to continuously learn from experience after deployment.
Agents are becoming increasingly crucial in tackling complex real-world tasks, ranging from general web navigation to assisting with extensive software engineering codebases. However, as these agents transition into persistent, long-running roles in the real world, they face a critical limitation: they struggle to analyze and learn from successful and failed experiences after deployment.
Agents approaching each new task without a memory mechanism will repeatedly make the same strategic errors and discard valuable insights. To address this, various forms of agent memory have been introduced to store information about past interactions for reuse. However, existing methods generally either save exhaustive records of every action taken (such as the trajectory memory used in Synapse) or document only workflows summarized from successful attempts (as in Agent Workflow Memory). These approaches have two fundamental drawbacks: first, by recording detailed actions instead of tactical foresight, they fail to distill higher-level, transferable reasoning patterns; second, by over-emphasizing successful experiences, they miss out on a primary source of learning: their own failures.
To bridge this gap, in our ICLR paper, "ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory", we introduce a novel agent memory framework (code available on GitHub) that distills useful insights from both successful and failed experiences for test-time self-evolution. When evaluated on web browsing and software engineering benchmarks, ReasoningBank improves both agent effectiveness (higher success rates) and efficiency (fewer task steps) compared to baseline approaches.
Memory content comparison: existing strategies and ReasoningBank.
Distilling insights with ReasoningBank
ReasoningBank distills global reasoning patterns into high-level, structured memories. Each structured memory item contains the following:
Title: A concise identifier summarizing the core strategy.
Description: A brief summary of the memory item.
Content: The distilled reasoning steps, decision rationales, or operational insights extracted from past experiences.
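The three fields above map naturally onto a small record type. The following is a minimal sketch, not the schema from the released code: the class name `MemoryItem`, the `render` helper, and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryItem:
    """Hypothetical container for one structured memory (names are illustrative)."""
    title: str        # concise identifier summarizing the core strategy
    description: str  # brief summary of the memory item
    content: str      # distilled reasoning steps, rationales, or insights

    def render(self) -> str:
        """Format the item as plain text that could be placed in an agent's context."""
        return f"# {self.title}\n{self.description}\n{self.content}"


# Example item paraphrasing the kind of preventative lesson discussed in this post.
item = MemoryItem(
    title="Verify page identifier before loading more",
    description="Guards against infinite scroll traps.",
    content="Check the current page identifier first, then attempt to load more results.",
)
print(item.render())
```

Keeping the title and description short leaves most of the context budget for the content field, which carries the actual reasoning.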
The memory workflow operates in a continuous, closed loop of retrieval, extraction, and consolidation. Before taking action, the agent draws upon the ReasoningBank to gather relevant memories into its context. It then interacts with the environment and uses an LLM-as-a-judge to self-assess the resulting trajectory, labeling it a success or a failure. Notably, this self-judgment does not need to be perfectly accurate, as we find ReasoningBank to be quite robust against judgment noise. During extraction, the agent distills success insights or failure reflections from the trajectory into new memories. For simplicity, we directly append these to the ReasoningBank, leaving more sophisticated consolidation strategies for future work.
Crucially, unlike existing workflow memory strategies that only focus on successful runs, ReasoningBank actively analyzes failed experiences to source counterfactual signals and pitfalls. By distilling these mistakes into preventative lessons, ReasoningBank builds powerful strategic guardrails. For example, instead of merely learning a procedural rule like "click the 'Load More' button", the agent might learn from a past failure to "always verify the current page identifier first to avoid infinite scroll traps before attempting to load more results".
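The retrieve, act, judge, extract, and append loop described above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the released implementation: retrieval here uses simple word overlap as a stand-in for whatever retriever the real system uses, and `toy_act`, `toy_judge`, and `toy_extract` are hypothetical stubs standing in for the agent and LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MemoryItem:
    title: str
    description: str
    content: str


def overlap(query: str, item: MemoryItem) -> int:
    """Toy relevance score: shared lowercase words (assumption, not the real retriever)."""
    words = set(f"{item.title} {item.description}".lower().split())
    return len(set(query.lower().split()) & words)


def retrieve(bank: List[MemoryItem], query: str, k: int = 1) -> List[MemoryItem]:
    """Return up to k memories with a non-zero relevance score."""
    ranked = sorted(bank, key=lambda m: overlap(query, m), reverse=True)
    return [m for m in ranked[:k] if overlap(query, m) > 0]


def run_task(bank: List[MemoryItem], query: str,
             act: Callable, judge: Callable, extract: Callable) -> bool:
    memories = retrieve(bank, query)                  # 1. retrieval
    trajectory = act(query, memories)                 # 2. interact with the environment
    success = judge(query, trajectory)                # 3. self-assessment (may be noisy)
    bank.extend(extract(query, trajectory, success))  # 4. extraction, then plain append
    return success


# Hypothetical stubs: the agent only succeeds when guided by a memory.
def toy_act(query, memories):
    return "verified page, then loaded more" if memories else "clicked load more repeatedly"


def toy_judge(query, trajectory):
    return "verified" in trajectory


def toy_extract(query, trajectory, success):
    if success:
        return [MemoryItem(f"Strategy: {query}", "What worked", trajectory)]
    return [MemoryItem(f"Pitfall: {query}", "Lesson from failure",
                       "Verify the current page identifier before loading more results.")]


bank: List[MemoryItem] = []
first = run_task(bank, "list all orders", toy_act, toy_judge, toy_extract)
second = run_task(bank, "list all orders", toy_act, toy_judge, toy_extract)
print(first, second, [m.title for m in bank])
```

In this toy run the first attempt fails and stores a pitfall, and the second attempt retrieves that pitfall and succeeds, which is the behavior the failure-aware design is meant to produce.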
Workflow of ReasoningBank integrated with an agent during test time.
Memory-aware test-time scaling (MaTTS)
Test-time scaling (TTS) — scaling compute at inference time — has shown immense effectiveness in reasoning domains like math and competitive programming. However, in agentic environments, existing TTS methods often discard the exploration trajectory and treat the final answer as the only useful outcome. This overlooked exploration is actually a rich data source that could accelerate an agent's ability to learn from experience over time.
We bridge this gap by explicitly linking memory with scaling through memory-aware test-time scaling (MaTTS). By using ReasoningBank as a powerful experience learner, MaTTS distills extensive exploration into high-quality memories via contrastive and refinement signals. MaTTS takes two distinct forms of scaling:
Parallel scaling: The agent generates multiple distinct trajectories for the same query under the guidance of memory. Through self-contrast, ReasoningBank compares successful and spuriously reasoned trajectories to distill more robust strategies and synthesize higher-quality memories.
Sequential scaling: The agent iteratively refines its reasoning within a single trajectory to produce strong intermediate rationales. ReasoningBank captures these intermediate insights from the agent's trial and error and progressive improvement as high-quality memory items.
MaTTS establishes a strong synergy: high-quality memory from ReasoningBank steers the scaled exploration towards more promising strategies, and in return, the scaled interactions generate significantly richer learning signals that feed back into an even smarter ReasoningBank to help the agent.
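To make the parallel form concrete, here is a minimal sketch of sampling k trajectories for one query and contrasting the successful ones against the failed ones. It is a toy under stated assumptions: in the actual system the contrast is performed by a language model, whereas `self_contrast` below uses plain set operations over step strings, and `toy_sample` and `toy_judge` are hypothetical stubs.

```python
from typing import Callable, List, Tuple

Trajectory = List[str]


def parallel_scale(query: str, sample: Callable[[str, int], Trajectory],
                   judge: Callable[[Trajectory], bool],
                   k: int = 5) -> Tuple[List[Trajectory], List[Trajectory]]:
    """Generate k trajectories for the same query and split them by judged outcome."""
    trajectories = [sample(query, i) for i in range(k)]
    successes = [t for t in trajectories if judge(t)]
    failures = [t for t in trajectories if not judge(t)]
    return successes, failures


def self_contrast(successes: List[Trajectory], failures: List[Trajectory]) -> List[str]:
    """Toy contrast: steps shared by every success and absent from every failure."""
    if not successes:
        return []
    common = set(successes[0]).intersection(*map(set, successes[1:]))
    seen_in_failures = set().union(*map(set, failures))
    return sorted(common - seen_in_failures)


# Hypothetical stubs: even-numbered samples include the verification step.
def toy_sample(query: str, i: int) -> Trajectory:
    steps = ["open results page"]
    if i % 2 == 0:
        steps.append("verify page identifier")
    steps.append("click load more")
    return steps


def toy_judge(trajectory: Trajectory) -> bool:
    return "verify page identifier" in trajectory


successes, failures = parallel_scale("list all orders", toy_sample, toy_judge, k=5)
lessons = self_contrast(successes, failures)
print(lessons)
```

The step that separates successes from failures is the candidate for a new memory item. Sequential scaling would instead refine a single trajectory over several passes and is not shown here.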
Comparison of memory-aware test-time scaling (MaTTS) with ReasoningBank.
Performance & emergent capabilities
We evaluated ReasoningBank across challenging benchmarks covering dynamic environments. Using the ReAct prompting strategy as the foundation for all agents, we compared ReasoningBank against three memory configurations: a memory-free baseline (Vanilla ReAct), Synapse (Trajectory Memory) and AWM (Workflow Memory). From our main evaluation results with Gemini-2.5-Flash on WebArena and SWE-Bench-Verified, we have the following key observations:
Superior success rates: ReasoningBank without scaling outperformed memory-free agents by 8.3% on WebArena and 4.6% on SWE-Bench-Verified.
Efficiency gains: Because the agent actively accesses past decision rationales, it executes commands with vastly reduced aimless exploration. On SWE-Bench-Verified, ReasoningBank saved almost 3 total execution steps per task over memory-free baselines.
MaTTS synergy: Adding MaTTS (parallel scaling with a scaling factor k=5) boosts success rates further. On WebArena, ReasoningBank with MaTTS raises the success rate by a further 3% and uses 0.4 fewer steps than ReasoningBank alone.
Performance comparison (task success rates and average steps per task) of different agent memory strategies on WebArena and SWE-Bench-Verified.
Importantly, during evaluation, we observed the emergence of strategic maturity. In a web-browsing example, the agent's initial curated rules resembled simple procedural checklists (e.g., "Look for page links"). As the agent persisted through more problem sets, these memories were incorporated during execution. Building upon existing knowledge, the agent distilled new trajectories into more advanced memories. Over time, simple checklists evolved into memories with compositional, preventative logic structures (e.g., "Cross-reference tasks continuously with active page filters to ensure retrieved datasets aren't paginated prematurely"). See the paper for more details.
Conclusion
ReasoningBank provides a powerful framework for enabling LLM agents to learn from experience and evolve into continuous learners at test time. We believe memory-driven experience scaling represents a crucial new frontier for agent scaling.
We are excited to share this with the broader research community.
Acknowledgements
This research was conducted by Siru Ouyang, Jun Yan, I-Hung Hsu, Yanfei Chen, Ke Jiang, Zifeng Wang, Rujun Han, Long T. Le, Samira Daruki, Xiangru Tang, Vishy Tirumalashetty, George Lee, Mahsan Rofouei, Hangfei Lin, Jiawei Han, Chen-Yu Lee, and Tomas Pfister.
Labels:
Generative AI
Machine Intelligence
Natural Language Processing