2026-04-08 20:01 UTC · Original source · 6 min read · Updated: 2026-06-27 00:25 UTC

Improving the academic workflow: Introducing two AI agents for better figures and peer review

Google Cloud researchers introduce PaperVizAgent for generating publication-ready figures and ScholarPeer for automated, rigorous peer review. Both systems outperform existing baselines by significant margins, demonstrating the potential of multi-agent AI in academic research.

Source: Google Research Blog

Article intelligence

Engineers · Advanced

Key points

PaperVizAgent generates high-quality academic figures from text using a five-agent collaborative system with iterative refinement.
ScholarPeer emulates senior reviewers through context acquisition, active verification, and multi-aspect questioning.
Both agents significantly outperform baselines like GPT-Image-1.5 and Paper2Any in respective evaluations.
These experimental prototypes aim to assist, not replace, human researchers in the scientific workflow.

Why it matters

This matters because PaperVizAgent generates high-quality academic figures from text using a five-agent collaborative system with iterative refinement.

Technical impact

May affect model selection, inference cost, product capability, and evaluation benchmarks.

This panel is AI-generated and reviewed for accuracy.

April 8, 2026

Jinsung Yoon, Research Scientist, and Tomas Pfister, Director, Google Cloud

We introduce two AI agents to streamline academic research: PaperVizAgent, a visualizer agent for drawing academic figures, and ScholarPeer, a reviewer agent that automatically and rigorously evaluates academic papers.

Quick links

PaperVizAgent paper

PaperVizAgent code

ScholarPeer paper

Academic research is evolving at an unprecedented pace, driven by rapid advancements in AI. The academic research workflow is notoriously rigorous, involving far more than just conceptualizing an idea and writing a paper. One hurdle many researchers face is how to effectively visualize their research. While AI can draft text, creating the complex methodology diagrams and precise statistical plots required for top-tier conferences and journals is significantly more difficult. Furthermore, the scientific community relies on the peer review process to maintain the integrity of published research. However, the exponential growth of paper submissions has severely strained this system, leading to reviewer fatigue and inconsistent evaluations. As language models and multi-agent systems become more sophisticated, we see their potential not just as subjects of study, but as active participants in the scientific process itself.

To that end, we introduce two novel agentic frameworks: (i) PaperVizAgent (formerly known as PaperBanana), a visualizer agent for drawing academic figures, and (ii) ScholarPeer, a reviewer agent that automatically and rigorously evaluates academic papers (including inlined diagrams). These agents are designed specifically to assist with the academic research lifecycle, empowering scientists to focus on innovation rather than administrative overhead. Our evaluations show that PaperVizAgent consistently generates expert-quality figures that significantly outperform leading baselines (GPT-Image-1.5, Nano-Banana-Pro, Paper2Any), while ScholarPeer delivers highly critical, literature-grounded reviews that beat state-of-the-art automated reviewers.

PaperVizAgent: Generating publication-ready figures

PaperVizAgent is an autonomous framework designed to generate publication-ready academic illustrations from academic text. By bridging the gap between technical descriptions and visual communication, PaperVizAgent allows researchers to create professional-grade figures directly from their manuscripts. To initiate the process, a researcher provides two inputs:

Source context: Typically the method sections of a manuscript with technical details of the research.

Communicative intent: A detailed figure caption that describes what the visual should convey.

The PaperVizAgent framework orchestrates a collaborative team of five specialized AI agents: (1) a retriever, (2) a planner, (3) a stylist, (4) a visualizer, and (5) a critic. First, the retriever and planner agents gather references (e.g., relevant academic figures from existing literature) and organize the content. Next, the stylist agent synthesizes aesthetic guidelines to ensure the output matches academic standards. The visualizer then renders an image or generates executable Python code for statistical plots. Finally, the critic agent evaluates the output against the original text. If inconsistencies are found, the critic provides targeted feedback to the visualizer agent, triggering a loop of iterative refinement. Through this loop, the multi-agent system ensures the final illustration is both visually appealing and technically accurate.

Given the source context and communicative intent, PaperVizAgent retrieves relevant reference examples and synthesizes a stylistically optimized description. It then uses an iterative refinement loop to transform the description into the final illustration.
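The control flow described above (retrieve, plan, style, then a visualize-and-critique loop) can be sketched roughly as follows. This is a minimal illustration, not the released PaperVizAgent code: every name here (`FigureRequest`, `Critique`, `generate_figure`, the `toy_*` stand-ins, and the `max_rounds` cap) is hypothetical, and each agent is reduced to a plain callable where the real system would call a model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FigureRequest:
    source_context: str        # e.g. the method section of the manuscript
    communicative_intent: str  # the caption describing what the figure should convey


@dataclass
class Critique:
    consistent: bool    # True when the figure matches the source text
    feedback: str = ""  # targeted feedback for the visualizer otherwise


def generate_figure(
    request: FigureRequest,
    retrieve: Callable[[FigureRequest], List[str]],
    plan: Callable[[FigureRequest, List[str]], str],
    style: Callable[[str], str],
    visualize: Callable[[str, Optional[str]], str],
    critique: Callable[[FigureRequest, str], Critique],
    max_rounds: int = 3,  # assumed cap; the post does not state one
) -> str:
    """Run the five agents in order, looping visualizer and critic."""
    references = retrieve(request)
    description = style(plan(request, references))
    feedback: Optional[str] = None
    figure = ""
    for _ in range(max_rounds):
        figure = visualize(description, feedback)
        verdict = critique(request, figure)
        if verdict.consistent:
            break
        feedback = verdict.feedback  # feed the critique into the next round
    return figure


# Toy stand-ins so the sketch runs end to end without any model calls.
def toy_retrieve(request: FigureRequest) -> List[str]:
    return ["reference figure A", "reference figure B"]


def toy_plan(request: FigureRequest, references: List[str]) -> str:
    return f"Diagram for '{request.communicative_intent}' using {len(references)} references"


def toy_style(description: str) -> str:
    return description + " [academic style]"


def toy_visualize(description: str, feedback: Optional[str]) -> str:
    return description if feedback is None else f"{description} (revised: {feedback})"


def toy_critique(request: FigureRequest, figure: str) -> Critique:
    if "revised" in figure:
        return Critique(consistent=True)
    return Critique(consistent=False, feedback="label the encoder block")


toy_request = FigureRequest("Our method has an encoder and a decoder.", "Overview of the method")
toy_figure = generate_figure(
    toy_request, toy_retrieve, toy_plan, toy_style, toy_visualize, toy_critique
)
print(toy_figure)
```

With these stand-ins the critic rejects the first draft and accepts the second, so the visualizer runs twice. Passing the agents in as callables keeps the loop itself independent of whichever models back each role.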

Examples of methodology diagrams generated by PaperVizAgent.

Results

In comprehensive experiments, PaperVizAgent consistently outperformed leading baselines, including direct prompting, few-shot prompting, and Paper2Any (a state-of-the-art approach for visualization). The system was rigorously evaluated using a comparative scoring metric (on a 0-100 scale, where a higher score is better) across four critical dimensions: faithfulness, conciseness, readability, and aesthetics. In this evaluation, we used an LLM judge that was calibrated using human-generated figures as inputs, with the human performance baseline set to 50.0.
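To make the idea of a comparative 0-100 score with a human baseline of 50.0 concrete, here is one possible scheme. It is an assumption for illustration only: the post does not spell out how verdicts are mapped to points or how the four dimensions are aggregated. In this sketch a judge compares each generated figure with the human-drawn one per dimension, a model win counts 100, a tie 50, and a human win 0, and the overall score is a plain mean over dimensions. `VERDICT_POINTS`, `dimension_scores`, and `overall_score` are hypothetical names.

```python
from statistics import mean
from typing import Dict, List

# Assumed mapping from a pairwise verdict to points; a tie lands on the
# human baseline of 50.0.
VERDICT_POINTS = {"model": 100.0, "tie": 50.0, "human": 0.0}

DIMENSIONS = ("faithfulness", "conciseness", "readability", "aesthetics")


def dimension_scores(judgments: List[Dict[str, str]]) -> Dict[str, float]:
    """Average the per-figure verdicts for each dimension."""
    return {
        dim: mean(VERDICT_POINTS[judgment[dim]] for judgment in judgments)
        for dim in DIMENSIONS
    }


def overall_score(scores: Dict[str, float]) -> float:
    """Unweighted mean over the four dimensions (an assumed aggregation)."""
    return mean(scores[dim] for dim in DIMENSIONS)


# Two made-up judgments, one per evaluated figure.
example_judgments = [
    {"faithfulness": "tie", "conciseness": "model", "readability": "tie", "aesthetics": "model"},
    {"faithfulness": "human", "conciseness": "model", "readability": "tie", "aesthetics": "tie"},
]
example_scores = dimension_scores(example_judgments)
print(example_scores)
print(overall_score(example_scores))  # 62.5 for the judgments above
```

Under this scheme a system that ties with the human figure on every comparison scores exactly 50.0, which is what makes 50.0 a natural reference line.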

PaperVizAgent achieved an impressive overall score of 60.2, significantly surpassing all evaluated baselines such as GPT-Image-1.5, Nano-Banana-Pro, and Paper2Any. Notably, it stands as the only framework to exceed the established human baseline of 50.0 in its overall rating. When breaking down the specific dimensions, the system particularly excels in Conciseness and Aesthetics, scoring well above the human threshold in both categories. It also achieved human-competitive results in generating statistical plots, proving its versatility. These results represent a significant leap forward in automated illustration.

PaperVizAgent outperforms all baseline models across five key metrics (the four dimensions plus the overall score), achieving results competitive with the human baseline.

Emulating senior reviewers with ScholarPeer

ScholarPeer is a context-aware, search-enabled multi-agent framework designed to automate and elevate the peer review process by following the workflow of a senior researcher.

Unlike standard language models that treat reviewing as a simple text-generation task, ScholarPeer relies on a dual-stream process of context acquisition and active verification. It dynamically constructs a domain narrative using a sub-domain historian agent, which grounds the review in live, web-scale literature. A baseline scout acts as an adversarial auditor, specifically hunting for datasets or comparative baselines the authors may have missed. Finally, a multi-aspect Q&A engine rigorously verifies the paper's technical claims, ensuring a deep and fact-based critique. The final review report includes a detailed summary, strengths, weaknesses, and questions for the authors, much like a standard expert peer review.

Given an input paper, ScholarPeer employs a dual-stream information retrieval process. The context and knowledge module uses a summarizer and a search-enabled literature review to compress internal and external information. These inputs feed into the multi-aspect Q&A engine, which generates and answers probing questions regarding novelty and technical soundness. Finally, the review generator uses these inputs and conference-specific review guidelines to generate the final review.
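The data flow in this description can be sketched as a small pipeline. This is a hedged illustration under stated assumptions rather than ScholarPeer's implementation: the `Context` and `Review` types, the `review_paper` function, and the `toy_*` stand-ins are hypothetical names, and each stage is a plain callable where the real system would use search-enabled agents.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

QAPair = Tuple[str, str]  # (probing question, answer)


@dataclass
class Context:
    paper_summary: str            # internal stream: compressed view of the paper
    domain_narrative: str         # external stream: literature-grounded background
    missing_baselines: List[str]  # datasets or baselines the authors may have missed


@dataclass
class Review:
    summary: str
    strengths: List[str]
    weaknesses: List[str]
    questions: List[str]


def review_paper(
    paper: str,
    guidelines: str,
    summarize: Callable[[str], str],
    historian: Callable[[str], str],
    baseline_scout: Callable[[str, str], List[str]],
    qa_engine: Callable[[str, Context], List[QAPair]],
    write_review: Callable[[Context, List[QAPair], str], Review],
) -> Review:
    """Context acquisition first, then verification, then review writing."""
    summary = summarize(paper)
    narrative = historian(summary)
    context = Context(summary, narrative, baseline_scout(summary, narrative))
    qa_pairs = qa_engine(paper, context)
    return write_review(context, qa_pairs, guidelines)


# Toy stand-ins so the sketch runs without search or model calls.
def toy_summarize(paper: str) -> str:
    return paper[:40]


def toy_historian(summary: str) -> str:
    return "Prior work mostly evaluates on benchmark X."


def toy_scout(summary: str, narrative: str) -> List[str]:
    return ["baseline Y"]


def toy_qa(paper: str, context: Context) -> List[QAPair]:
    missing = context.missing_baselines[0]
    return [("Is the comparison complete?", f"No, {missing} is absent.")]


def toy_write(context: Context, qa_pairs: List[QAPair], guidelines: str) -> Review:
    return Review(
        summary=context.paper_summary,
        strengths=["Clear problem statement"],
        weaknesses=[answer for _, answer in qa_pairs],
        questions=[question for question, _ in qa_pairs],
    )


toy_review = review_paper(
    "We propose a new method ...",
    "Be specific and constructive.",
    toy_summarize, toy_historian, toy_scout, toy_qa, toy_write,
)
print(toy_review)
```

The point of the structure is that the review writer never sees the paper in isolation: everything it receives has already passed through the context and verification stages.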

ScholarPeer's performance demonstrates the immense potential of integrating active web search with multi-agent orchestration for academic evaluation. When tested on extensive public datasets, ScholarPeer achieved significant win rates against state-of-the-art automated reviewing approaches in side-by-side evaluations. More importantly, the system's active verification workflow drastically reduced the gap between AI-generated feedback and human-level diversity, producing reviews that are highly critical, realistic, and deeply grounded in existing literature.

Comparative evaluation of ScholarPeer against existing frameworks on public datasets. Left: Win rate of ScholarPeer against reviews from fine-tuned models and agentic baselines. Middle: Average single-side score of reviews generated by various frameworks (best human review is considered as 5). Right: Correlation between human-expert evaluations and automated review framework rankings.
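For readers unfamiliar with the quantities in this figure, the sketch below shows how a side-by-side win rate and a rank correlation are typically computed. Both choices are assumptions for illustration: counting a tie as half a win and using Spearman correlation are common conventions, but the post does not say which definitions were used. The function names and sample data are made up.

```python
import math
from statistics import mean
from typing import List, Sequence


def win_rate(outcomes: Sequence[str]) -> float:
    """Share of side-by-side comparisons won; a tie counts as half a win (assumed)."""
    if not outcomes:
        raise ValueError("need at least one comparison")
    points = {"win": 1.0, "tie": 0.5, "loss": 0.0}
    return sum(points[outcome] for outcome in outcomes) / len(outcomes)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            result[order[position]] = average_rank
        start = end + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the two rank vectors.

    Assumes both inputs have the same length and are not constant.
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread_x = sum((a - mx) ** 2 for a in rx)
    spread_y = sum((b - my) ** 2 for b in ry)
    return covariance / math.sqrt(spread_x * spread_y)


# Made-up data: four pairwise outcomes, and scores for four reviewing frameworks.
print(win_rate(["win", "win", "tie", "loss"]))  # 0.625
human_scores = [4.5, 3.0, 2.0, 3.5]
automated_scores = [4.0, 2.5, 1.0, 3.0]
print(spearman(human_scores, automated_scores))  # same ordering in both lists
```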

What’s next for the scientific community

PaperVizAgent and ScholarPeer are part of our broader efforts exploring AI-assisted research more generally. By tackling two distinct but equally demanding phases of the publication lifecycle, these tools serve as collaborators that elevate the quality of scientific discourse and can, alongside other tools, accelerate the dissemination of knowledge.

While these two frameworks offer immediate and tangible benefits to the academic community, they are just the beginning of our journey. We envision a future where researchers have access to a rich, interconnected ecosystem of AI assistants seamlessly integrated into every facet of the scientific workflow, and we are actively continuing our work in this space.

Acknowledgements

We would like to thank Palash Goyal, Dawei Zhu, Mihir Parmar, Rui Meng, Yiwen Song, Yale Song, Hamid Palangi, Xiyu Wei, Sujian Li and Burak Gokturk for their valuable contributions to this work.

Disclaimer

PaperVizAgent and ScholarPeer are experimental research prototypes, not production-ready tools. Their automated feedback, figures, and reviews are intended only for research exploration and should not be relied upon as the definitive basis for editorial or publication decisions.

Labels:

Generative AI

Natural Language Processing
