A New Study from Harvard and Perplexity Finds AI Agents Perform 26 Minutes of Autonomous Work per Session vs 33 Seconds for Search

A new Harvard and Perplexity paper uses matched-pair sessions to compare an autonomous agent with a search assistant. It finds large gains in autonomy, time, and cost, plus broader scope of work attempted.

Source: MarkTechPost. Author: Asif Razzaq

A new working paper from Perplexity and Harvard offers field evidence on what AI agents do to knowledge work. It draws on production data from two Perplexity products: Search and Computer.

The setup is a natural comparison. Search is a conversational answer engine. Computer is an agent that plans and executes tasks end to end. The same users touch both products, so the team can hold the task roughly constant.

What the Study Actually Measures

The study covers a 90-day window, February 27 through May 27, 2026. Computer launched two days before that window opened.

The core method matches near-identical query pairs across the two products. The research team found 10,000 session pairs with cosine similarity above 0.99. Each pair is effectively the same task attempted both ways.
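As a rough illustration of the matching step, the sketch below pairs each Computer query with its closest Search query and keeps the pair only when cosine similarity exceeds 0.99. The embedding vectors, the session IDs, and the best-match-per-query rule are assumptions made for this example; the article does not describe the paper's embedding model or its exact pairing procedure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def match_pairs(search_vecs, computer_vecs, threshold=0.99):
    """For each Computer query, keep its best Search match above the threshold.

    Both arguments map a (hypothetical) session ID to an embedding vector.
    Returns a list of (search_id, computer_id, similarity) tuples.
    """
    pairs = []
    for c_id, c_vec in computer_vecs.items():
        best_id, best_sim = None, -1.0
        for s_id, s_vec in search_vecs.items():
            sim = cosine_similarity(s_vec, c_vec)
            if sim > best_sim:
                best_id, best_sim = s_id, sim
        if best_id is not None and best_sim > threshold:
            pairs.append((best_id, c_id, best_sim))
    return pairs


# Toy three-dimensional "embeddings", invented for illustration only.
search_vecs = {"s1": [1.0, 0.0, 0.0], "s2": [0.0, 1.0, 0.0]}
computer_vecs = {"c1": [0.999, 0.01, 0.0], "c2": [0.5, 0.5, 0.7]}

pairs = match_pairs(search_vecs, computer_vecs)
for search_id, computer_id, sim in pairs:
    print(search_id, computer_id, round(sim, 4))
```

Only the c1 query is close enough to a Search query to clear the threshold; c2 is dropped. A production pipeline would likely use an approximate nearest-neighbor index rather than this brute-force loop.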

Computer pairs are gated to sessions that invoke an execution tool. These ‘do’ tools include code execution, browser actions, file writes, and connector calls. That gate ensures every Computer session does real autonomous work.
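The gate can be pictured as a simple filter over session logs. In this sketch the tool labels, the session record layout, and the `web_search` tool are hypothetical names chosen for illustration; only the four categories of ‘do’ tools come from the article.

```python
# Hypothetical labels for the four execution-tool categories named above.
DO_TOOLS = {"code_execution", "browser_action", "file_write", "connector_call"}


def invokes_execution_tool(session):
    """True if the session called at least one 'do' tool."""
    return any(tool in DO_TOOLS for tool in session["tools"])


# Invented session records; "web_search" stands in for a non-execution tool.
sessions = [
    {"id": "a", "tools": ["web_search"]},
    {"id": "b", "tools": ["web_search", "code_execution"]},
    {"id": "c", "tools": []},
    {"id": "d", "tools": ["file_write", "connector_call"]},
]

gated_ids = [s["id"] for s in sessions if invokes_execution_tool(s)]
print(gated_ids)  # ['b', 'd']
```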

Adoption rose over the window. Cumulative Computer queries reached 84× their first-week total. A matched analysis found Computer adoption also raised users’ daily Search queries by 1.05. The positive effect points to complementarity, not substitution.

https://research.perplexity.ai/articles/how-ai-agents-reshape-knowledge-work

The Cost-Structure Framework

The research grounds its data in a simple task-based model. Each task has a step count, and longer tasks carry weakly higher value.

Agents change the cost structure. They charge a higher fixed cost per task, for delegation and review. But they charge a lower marginal cost per step, since the system executes.

This produces a breakeven step count. Below it, the conversational mode is cheaper. Above it, the agent mode wins. Short lookups stay manual; long workflows move to the agent.
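The breakeven follows from treating each mode's cost as linear in the step count: a fixed cost plus a marginal cost per step. The sketch below solves for the step count at which the two lines cross. The linear form and every number in it are illustrative assumptions, not figures from the paper.

```python
from dataclasses import dataclass


@dataclass
class Mode:
    name: str
    fixed_cost: float      # paid once per task (e.g. delegation and review)
    marginal_cost: float   # paid per step

    def cost(self, steps):
        """Total cost of a task with the given number of steps."""
        return self.fixed_cost + self.marginal_cost * steps


def breakeven_steps(conversational, agent):
    """Step count at which the two modes cost the same.

    Assumes the agent has the higher fixed cost and the lower marginal cost.
    """
    return (agent.fixed_cost - conversational.fixed_cost) / (
        conversational.marginal_cost - agent.marginal_cost
    )


def cheaper_mode(steps, conversational, agent):
    """Name of the cheaper mode; ties go to the conversational mode."""
    if agent.cost(steps) < conversational.cost(steps):
        return agent.name
    return conversational.name


# Placeholder costs in arbitrary units, chosen only to make the arithmetic easy.
conv = Mode("conversational", fixed_cost=1.0, marginal_cost=2.0)
agent = Mode("agent", fixed_cost=10.0, marginal_cost=0.5)

print(breakeven_steps(conv, agent))   # 6.0
print(cheaper_mode(3, conv, agent))   # conversational
print(cheaper_mode(10, conv, agent))  # agent
```

With these placeholder values, tasks shorter than six steps are cheaper in the conversational mode and longer tasks are cheaper with the agent, which is the sorting behavior the framework describes.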

Autonomy: 26 Minutes vs 33 Seconds

The first autonomy measure is execution time. Computer runs 26 minutes of machine work per session. Search runs 33 seconds. That is a 48× gap.

Medians show the same pattern: 9 minutes versus 14 seconds. The gap varies by domain. Local tasks show 75×; Science shows 26×, since plain answers often suffice.

Higher autonomy did not lower quality here. The research team inferred dissatisfaction from what users do in the next turn. Computer’s meaningful dissatisfaction rate was 1.3%, against 2.9% for Search, a 55% relative reduction.

Follow-up turns also shift toward review and extension on Computer, though the changes are small. Connector usage rose more clearly. Computer invoked at least one connector in 7.9% of sessions, versus 1.8% for Search. Computer chains external tools that Search users would otherwise run by hand.

Efficiency: Where the Savings Come From

The efficiency section estimates a Search + Human counterfactual. A human with Search alone takes 269 minutes per matched task. Computer + Human takes 36 minutes.

That is 87% less time and 94% less cost overall. Cost savings exceed time savings because domain wages amplify the effect. Computer’s model cost runs $4–10 per task; Search runs about $0.05.
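The time figure can be checked directly from the two task times quoted above, and the same formula reproduces the dissatisfaction reduction reported earlier. This is a quick arithmetic check on the article's rounded numbers, not the paper's own calculation. The 94% cost figure cannot be recomputed this way because the article does not give the wage inputs.

```python
def percent_reduction(baseline, new):
    """Relative reduction from baseline to new, in percent."""
    return (baseline - new) / baseline * 100.0


# Search + Human: 269 minutes per matched task; Computer + Human: 36 minutes.
time_saved = percent_reduction(269, 36)
print(f"{time_saved:.1f}%")  # 86.6%

# Meaningful dissatisfaction: 2.9% for Search versus 1.3% for Computer.
dissatisfaction_drop = percent_reduction(2.9, 1.3)
print(f"{dissatisfaction_drop:.1f}%")  # 55.2%
```

Both values round to the 87% and 55% that the article reports.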

The marginal numbers support the framework. Computer + Human costs $0.16 per step, versus $2.05 for Search + Human. Matched Computer sessions also ran longer prompts, 652 versus 448 characters at the median. That supports the higher fixed-cost assumption for agents.

Breakeven analysis says a professional must finish all manual steps in under 20 minutes to match Computer. The research team cross-checked with an independent LLM estimate and user interviews. The LLM method found 84% time and 93% cost savings. Interviewees reported speedups from 5× to 300×.

Horizontal and Vertical Expansion

Scope is where this research extends past prior work. Autonomy does not just speed up tasks. It changes which tasks users attempt.

Horizontally, Computer queries cross occupational lines more often. Cross-occupation share averaged 59% on Computer, versus 50% on Search. Management and Entrepreneurship showed the largest gap, at 19 points.

Vertically, Computer queries are more demanding. On Bloom’s Revised Taxonomy, 76% required higher-order cognition, versus 55% for Search. Create-level work was 50% of Computer queries, against 26% for Search.

Computer tasks also span more knowledge domains. Each Computer query touched 2.40 O*NET Knowledge domains on average, versus 1.74 for Search. A Computer query was nearly three times as likely to need three or more domains.

Composability climbs as the O*NET hierarchy gets finer. At the Task Statement level, Computer engaged 60% more activities per query than Search. About 23% of Computer queries hit a Task Statement that the same users never sent to Search.

Comparison Table: Search vs Computer

| Dimension | Perplexity Search | Perplexity Computer |
| --- | --- | --- |
| Mode in the framework | Conversational answer engine | Agent orchestrator |
| Machine time per session | 33 seconds (median 14 s) | 26 minutes (median 9 min) |
| Queries per session | 2.8 | 5.3 |
| Meaningful (mid+high) dissatisfaction | 2.9% | 1.3% |
| Sessions with a connector call | 1.8% | 7.9% |
| Counterfactual task time | 269 min (Search + Human) | 36 min (Computer + Human) |
| Cost per step | $2.05 | $0.16 |
| Model cost per task | ~$0.05 | $4–10 |
| Cross-occupation query share | 50% | 59% |
| Higher-order Bloom cognition | 55% | 76% |
| O*NET Knowledge domains per query | 1.74 | 2.40 |

Key Takeaways

Computer runs 26 minutes of autonomous work per session versus 33 seconds for Search, a 48× gap.

On matched tasks, Computer + Human cuts estimated time 87% and cost 94% versus Search + Human.

Computer’s meaningful dissatisfaction rate is 1.3% versus 2.9% for Search, a 55% reduction.

Computer queries cross occupations more (59% vs 50%) and demand more higher-order cognition (76% vs 55%).

About 23% of Computer queries hit a Task Statement the same users never sent to Search.

Marktechpost’s Visual Explainer

Research Guide

Harvard × Perplexity

How AI Agents Reshape Knowledge Work

Autonomy, Efficiency, and Scope — field evidence from production data.

A new study compares an autonomous agent with a conversational search assistant.

It uses real usage data from Perplexity Search and Perplexity Computer.

Jeremy Yang (Harvard) · Kate Zyskowski, Noah Yonack, Jerry Ma (Perplexity) · arXiv:2606.07489v1

What the Study Measures

A matched-pair design holds the task roughly constant across products.

90-day window: February 27 to May 27, 2026.

10,000 matched session pairs with cosine similarity above 0.99.

Computer sessions are gated to “do” tools: code execution, browser actions, file writes, connector calls.

The same dual-product users appear on both sides.

The Cost-Structure Framework

A simple task-based model explains when delegation pays off.

The agent mode charges a higher fixed cost per task, for delegation and review.

It charges a lower marginal cost per step, since it executes.

A breakeven step count sorts work: short below it, agent above it.

Task selection is modeled as a 0-1 knapsack problem.
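For readers unfamiliar with the 0-1 knapsack problem, the sketch below shows the standard dynamic-programming solution: pick a subset of items, each taken whole or not at all, that maximizes total value within a capacity. Reading the items as tasks, the weights as time costs, and the capacity as a time budget is an assumption made for this example; the explainer states only that task selection is modeled as a 0-1 knapsack and does not spell out the paper's formulation.

```python
def knapsack_01(values, weights, capacity):
    """Solve 0-1 knapsack with integer weights.

    Returns (best_total_value, chosen_indices).
    """
    n = len(values)
    # best[i][cap] = best value using only the first i items within cap.
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        value, weight = values[i - 1], weights[i - 1]
        for cap in range(capacity + 1):
            best[i][cap] = best[i - 1][cap]  # option 1: skip item i-1
            if weight <= cap:
                candidate = best[i - 1][cap - weight] + value  # option 2: take it
                if candidate > best[i][cap]:
                    best[i][cap] = candidate

    # Walk back through the table to recover which items were taken.
    chosen = []
    cap = capacity
    for i in range(n, 0, -1):
        if best[i][cap] != best[i - 1][cap]:
            chosen.append(i - 1)
            cap -= weights[i - 1]
    chosen.reverse()
    return best[n][capacity], chosen


# Invented example: three candidate tasks with values and time costs.
task_values = [60, 100, 120]
task_minutes = [10, 20, 30]
total_value, chosen = knapsack_01(task_values, task_minutes, capacity=50)
print(total_value, chosen)  # 220 [1, 2]
```

In this reading, lowering the per-step cost of long tasks shrinks their weights, so more of them fit within the same budget.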

Autonomy: Machine Work per Session

Higher autonomy did not come at a quality cost here.

26 min vs 33 s: autonomous work per session (48× gap)

9 min vs 14 s: median session time (40× gap)

1.3% vs 2.9%: meaningful dissatisfaction (55% lower)

7.9% vs 1.8%: sessions invoking a connector call

Efficiency: Time and Cost

Estimated against a Search + Human counterfactual on matched tasks.

269 → 36 min: average task completion time

87% / 94%: time saved / cost saved overall

$0.16 vs $2.05: cost per step (Computer + Human vs Search + Human)

< 20 min: manual-step breakeven to match Computer

Scope: Broader and Harder Work

Autonomy changes which tasks users attempt, not just their speed.

Horizontal: cross-occupation share 59% vs 50% (Management & Entrepreneurship +19 pp).

Vertical: higher-order Bloom cognition 76% vs 55%.

Create-level work: 50% of Computer queries vs 26% for Search.

Knowledge breadth: 2.40 vs 1.74 O*NET domains per query (+38%).

What Computer Unlocks

Distinctiveness lies in fine-grained executional work, not topical range.

60% more O*NET Task Statements engaged per query than Search.

23% of Computer queries hit a Task Statement the same users never sent to Search.

Gains concentrate in software and web development, documentation, and data visualization.

Search vs Computer

Side-by-side across the study’s main measures.

| Dimension | Search | Computer |
| --- | --- | --- |
| Machine time per session | 33 s (median 14 s) | 26 min (median 9 min) |
| Queries per session | 2.8 | 5.3 |
| Meaningful dissatisfaction | 2.9% | 1.3% |
| Cost per step | $2.05 | $0.16 |
| Cross-occupation share | 50% | 59% |
| Higher-order cognition | 55% | 76% |
| O*NET domains per query | 1.74 | 2.40 |

Use Cases for Engineers

How the findings map to day-to-day technical work.

Data scientists: single tasks span Design, Mathematics, and Economics and Accounting.

Software engineers: the agent writes files, runs code, and deploys; you supervise.

AI engineers: route short lookups to a conversational path, long workflows to an agent.

The Takeaway

From speed to scope.

Time and cost savings are large but expected.

The sharper finding is broader, more complex work attempted.

The practical lesson is task-tool fit: match the tool to the step count.

Source: How AI Agents Reshape Knowledge Work: Autonomy, Efficiency, and Scope (arXiv:2606.07489v1).
