2026-05-27 01:14 UTC | Updated: 2026-06-30 13:03 UTC

DeepSeek Researcher Develops Automated Research Skill: Writing a Paper with Only 2 Hours of Human Brain Time

DeepSeek researcher Chen Deli used his self-developed DeliAutoResearch skill, collaborating with DeepSeek-V4-Pro and GPT-Image2, to complete a 46-page paper in just 6 days. The paper introduces an L1-L5 autonomy classification for research agents, analyzes four architectural patterns and 17 mainstream systems, and identifies six open problems. Chen Deli says only about 2 hours of human 'CPU time' were needed, with the rest handled by AI agents.

Source: 量子位 | Author: 梦晨

DeepSeek researcher Chen Deli has demonstrated a striking example of AI-driven research automation. Using his self-developed DeliAutoResearch skill, Chen relied on DeepSeek-V4-Pro for research and writing and on GPT-Image2 for figure generation, producing a 46-page survey paper in just six days. The paper went through six iterations across three versions (four on V1, one on V2, one on V3), totaling approximately 108 agent calls, 648,000 tokens, and 2,234 lines of LaTeX. All 103 references were verified, and the paper includes seven figures and four tables.
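To put the reported figures in proportion, the short calculation below derives a few averages from them. It is only back-of-the-envelope arithmetic on the numbers quoted above; the variable names are illustrative, and the article does not say how calls or tokens were actually distributed across iterations.

```python
# Figures as quoted in the article (the agent-call count is approximate).
agent_calls = 108
total_tokens = 648_000
latex_lines = 2_234
pages = 46
iterations = {"V1": 4, "V2": 1, "V3": 1}

# Derived averages; these assume an even spread, which the article does not claim.
total_iterations = sum(iterations.values())
tokens_per_call = total_tokens / agent_calls
calls_per_iteration = agent_calls / total_iterations
latex_lines_per_page = latex_lines / pages

print(total_iterations)                 # 6
print(tokens_per_call)                  # 6000.0
print(calls_per_iteration)              # 18.0
print(round(latex_lines_per_page, 1))   # 48.6
```

On these numbers, an average agent call handled about 6,000 tokens, and each iteration involved roughly 18 calls.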

Chen Deli claims that only about 1% of the paper was directly written by him, with the remaining 99% generated by AI agents. The human effort, he notes, amounted to less than two hours of "CPU time" for his brain, whereas similar work would have previously required at least a month. The paper itself addresses the chaotic landscape of autonomous research agents by introducing a clear L1–L5 autonomy classification system, inspired by the SAE levels for autonomous driving.

The classification ranges from L1 (basic autocomplete, like early GitHub Copilot) to L5 (fully autonomous agenda-setting, still unrealized). According to the paper, the current frontier is at L4, where agents can execute multi-step experiments and write papers within a restricted domain, but cannot independently choose research questions. The paper argues that the true bottlenecks are not model capabilities but "continuous knowledge accumulation" and "reliable self-assessment."
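The scale described above can be sketched as a small ordered type. This is a hypothetical illustration, not code from the paper: the names `AutonomyLevel`, `CURRENT_FRONTIER`, and both helper functions are assumptions, and L2 and L3 are left undescribed because this article does not define them.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """L1-L5 autonomy scale, with descriptions limited to what the article states."""
    L1 = 1  # basic autocomplete (the article's example: early GitHub Copilot)
    L2 = 2  # not described in this article
    L3 = 3  # not described in this article
    L4 = 4  # multi-step experiments and paper writing within a restricted domain
    L5 = 5  # fully autonomous agenda-setting, still unrealized


# The article places the current frontier at L4.
CURRENT_FRONTIER = AutonomyLevel.L4


def can_choose_research_questions(level: AutonomyLevel) -> bool:
    """Per the article, only an L5 agent would set its own research agenda."""
    return level >= AutonomyLevel.L5


def is_realized(level: AutonomyLevel) -> bool:
    """True for levels at or below the frontier reported in the article."""
    return level <= CURRENT_FRONTIER


print(can_choose_research_questions(AutonomyLevel.L4))  # False
print(is_realized(AutonomyLevel.L5))                    # False
```

Using an ordered enum keeps the levels comparable, which mirrors how the SAE driving levels are usually read: each level is a strictly higher degree of autonomy than the one before.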

In addition to the autonomy levels, the paper identifies four major architectural patterns for research agents: single-agent loop (e.g., ReAct, Reflexion), multi-agent collaboration (e.g., CAMEL, AutoGen), hierarchical scheduling (e.g., Claude Code, Devin), and tool-enhanced execution (e.g., SWE-Agent). Each pattern has its strengths and is suited to different tasks. The paper then evaluates 17 existing autonomous research systems using a six-dimensional feature matrix, revealing that the field has evolved from early fragile prototypes to L4 specialized systems, with code agents being the most mature.
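The four patterns and their example systems can be laid out as a simple lookup table. This is a minimal sketch that only restates the groupings named above; the `Pattern` class and `pattern_of` helper are hypothetical names, and the paper's six-dimensional feature matrix is not reproduced because the article does not list its dimensions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Pattern:
    name: str
    examples: Tuple[str, ...]


# Groupings exactly as listed in the article.
PATTERNS = (
    Pattern("single-agent loop", ("ReAct", "Reflexion")),
    Pattern("multi-agent collaboration", ("CAMEL", "AutoGen")),
    Pattern("hierarchical scheduling", ("Claude Code", "Devin")),
    Pattern("tool-enhanced execution", ("SWE-Agent",)),
)


def pattern_of(system: str) -> Optional[str]:
    """Return the pattern name the article assigns to a system, or None if unlisted."""
    for pattern in PATTERNS:
        if system in pattern.examples:
            return pattern.name
    return None


print(pattern_of("AutoGen"))    # multi-agent collaboration
print(pattern_of("SWE-Agent"))  # tool-enhanced execution
```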

Finally, the paper outlines six open problems: cognitive loop traps, context limitations, innovation assessment, reproducibility, safety and ethics, and cost issues. Chen Deli also shares a personal note: thanks to AI agents, he has been able to resume blogging and other creative work that he had put aside due to burnout. He emphasizes that the human role is shifting from executor to initiator.