2026-06-07 10:28 UTC | In-site rewrite | 3 min read | Updated: 2026-06-30 13:03 UTC

Perplexity's "Search as Code" lets AI models write their own search pipelines instead of calling fixed APIs

Perplexity's new "Search as Code" architecture drops rigid search APIs and lets AI models write custom Python search scripts instead. The scripts run in a secure sandbox, the middle of three layers (model, sandbox, SDK). According to Perplexity, the approach delivers more precise results, uses up to 85% fewer tokens, and outperforms OpenAI and Anthropic on CVE research tasks.

Source: The Decoder | Author: Jonathan Kemper

Instead of calling a ready-made search API, models in Perplexity's new "Search as Code" architecture write their own search workflows as Python code. The company promises more precise results and lower token usage.

Anyone who's watched an AI agent tackle a complex research task has seen the same pattern. The model writes a query, a search API returns a list of results, the model reads them, and then writes the next query. This loop repeats, often many times in a row.
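The loop described above can be sketched in a few lines. This is only an illustration of the pattern, not any vendor's implementation: `classic_agent_loop`, `toy_next_query`, and `toy_search` are hypothetical names, and the model and search engine are replaced by canned functions.

```python
from typing import Callable, Optional

# Hypothetical stand-ins: none of these names belong to a real Perplexity or vendor API.
def classic_agent_loop(
    task: str,
    next_query: Callable[[str, list], Optional[str]],
    search_api: Callable[[str], list],
    max_rounds: int = 10,
) -> list:
    """Run the query -> results -> read -> next query loop."""
    context = []  # everything the model has read so far
    for _ in range(max_rounds):
        query = next_query(task, context)  # the model decides what to ask next
        if query is None:  # the model considers the task done
            break
        context.extend(search_api(query))  # every result lands in context, unfiltered
    return context


# Toy usage with canned behaviour instead of a real model or search engine.
def toy_next_query(task, context):
    return None if len(context) >= 6 else f"{task} (round {len(context) // 3 + 1})"


def toy_search(query):
    return [f"{query} / hit {i}" for i in range(3)]


results = classic_agent_loop("patched version for CVE-X", toy_next_query, toy_search)
print(len(results))  # 6: two rounds of three results each
```

Note that the only lever the caller has is the query string. Each round costs a full model turn, and every result is appended to the context whether or not it is useful.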

Perplexity calls this a bottleneck in a new technical report. Today's search engines were built for humans who want a neat list of blue links, but for an AI agent trying to run hundreds of searches in a few minutes, that setup is too rigid. The agent can only tweak the search term; everything else is a black box.

[caption id="attachment_56824" align="aligncenter" width="1800"] Instead of querying a rigid search engine over multiple rounds, Search as Code lets the model build a custom pipeline on the fly using basic search primitives. | Image: Perplexity[/caption]

"Search as Code" (SaC) changes that dynamic. Instead of calling the API, the model writes a custom Python script to run the search. The script runs in a secure sandbox, pulling from Perplexity's search backend. Basic operations like retrieving, filtering, deduplicating, and reranking are packaged as simple SDK functions. Three layers: model, sandbox, SDK The architecture breaks down into three layers. At the top sits the model, which understands the task and decides on a search strategy. In the middle is the sandbox where the code runs. At the bottom is the "Agentic Search SDK," which breaks Perplexity's search engine into individual, mix-and-match functions.

[caption id="attachment_56826" align="aligncenter" width="1800"] The architecture combines the model, a sandbox, and the Agentic Search SDK, which gives the generated code component-level access to the search infrastructure. | Image: Perplexity[/caption]

Standard search APIs are still there for quick questions. But for tough research, the model can go much deeper. It can fire off parallel queries, filter out the noise programmatically, and pull only relevant hits into its context window.
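A rough sketch of that pattern, using Python's standard `asyncio` module: several queries run concurrently, and a filter in code decides which hits are returned at all. The `search` coroutine is a hypothetical stand-in that returns canned results, and the `official` flag is an assumed field used purely for illustration.

```python
import asyncio


async def search(query):
    """Hypothetical async search call; returns canned results after a short delay."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [
        {"query": query, "text": f"{query}: official advisory", "official": True},
        {"query": query, "text": f"{query}: news recap", "official": False},
    ]


async def gather_relevant(queries):
    # Fire all queries concurrently instead of spending one model turn per query.
    batches = await asyncio.gather(*(search(q) for q in queries))
    # Filter in code so that only relevant hits ever reach the model's context.
    return [hit["text"] for batch in batches for hit in batch if hit["official"]]


context = asyncio.run(gather_relevant(["product A", "product B", "product C"]))
print(context)
```

Here six results come back from the backend, but only three are passed on. `asyncio.gather` returns results in the order the queries were submitted, which keeps the output deterministic.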

According to Perplexity, that's where the win is. Standard search pipelines stuff an agent's context window with junk because the filtering logic is locked in. When the agent writes its own filters, the context stays lean, and the model keeps its bearings across long research sessions.

CVE research shows the difference

To show how this works in the real world, Perplexity tested it on a messy cybersecurity task. An agent had to track down 200 critical software vulnerabilities (CVEs) published between 2023 and 2025. For each one, it needed to find the official vendor advisory, the affected software, and the exact version that patched the bug. News articles or blog posts didn't count.

With SaC, the model wrote a three-stage script. It ran parallel searches tailored to how specific vendors like Mozilla or Google format their security bulletins. Next, it scanned its own findings, spotted the gaps, and ran targeted follow-up queries. Finally, it used a schema to verify that the CVE, product, and fix version all lined up.
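The three stages could look roughly like the sketch below. The model's actual script is not shown in the article, so this is an assumption-laden illustration: the function names are hypothetical, the CVE IDs and URLs are placeholders, search results are canned dictionaries, and the "schema" check only verifies that fields are present and well-formed rather than cross-checking them against the advisory.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class Record:
    cve_id: str
    product: Optional[str] = None
    fixed_version: Optional[str] = None
    advisory_url: Optional[str] = None


# Canned data standing in for search results; IDs and URLs are placeholders.
FIRST_PASS = {
    "CVE-0000-0001": Record("CVE-0000-0001", "ExampleBrowser", "1.2.3", "https://vendor.example/a1"),
    "CVE-0000-0002": Record("CVE-0000-0002", "ExampleServer", None, "https://vendor.example/a2"),
}
FOLLOW_UP = {"CVE-0000-0002": "4.5.6"}

CVE_RE = re.compile(r"CVE-\d{4}-\d{4,}")
VERSION_RE = re.compile(r"\d+(\.\d+)+")


def vendor_search(cve_id):
    """Stage 1 (hypothetical): vendor-tailored search for one CVE."""
    found = FIRST_PASS.get(cve_id)
    return replace(found) if found else Record(cve_id)  # replace() returns a copy


def missing_fields(rec):
    """Stage 2a: spot the gaps left by stage 1."""
    return [name for name, value in vars(rec).items() if value is None]


def follow_up(rec):
    """Stage 2b (hypothetical): targeted follow-up query for a missing fix version."""
    if rec.fixed_version is None and rec.cve_id in FOLLOW_UP:
        rec.fixed_version = FOLLOW_UP[rec.cve_id]
    return rec


def is_valid(rec):
    """Stage 3: simplified schema check on CVE ID, product, fix version, advisory."""
    return bool(
        CVE_RE.fullmatch(rec.cve_id)
        and rec.product
        and rec.fixed_version
        and VERSION_RE.fullmatch(rec.fixed_version)
        and rec.advisory_url
    )


def run(cve_ids):
    with ThreadPoolExecutor(max_workers=8) as pool:
        records = list(pool.map(vendor_search, cve_ids))  # stage 1, in parallel
    records = [follow_up(r) if missing_fields(r) else r for r in records]  # stage 2
    return {r.cve_id: is_valid(r) for r in records}  # stage 3


report = run(["CVE-0000-0001", "CVE-0000-0002", "CVE-0000-0003"])
print(report)
```

In this toy run, the second record is incomplete after stage 1 and is repaired by the follow-up, while the third is never found and fails validation instead of being passed along as a partial answer.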

It worked. Perplexity says the agent nailed the task while using 85 percent fewer tokens than its standard pipeline. Competing systems got less than a quarter of the data right.

[caption id="attachment_56825" align="aligncenter" width="1800"] In Perplexity's own benchmark set, Search as Code leads in four out of five categories and is virtually tied with OpenAI only on HLE. | Image: Perplexity[/caption]

Perplexity claims SaC beat rivals like OpenAI's Responses API and Anthropic's Managed Agents on four out of five benchmarks. The biggest gap was on "WANDR," Perplexity's own benchmark for broad research tasks, which it plans to release soon. Self-reported benchmarks should be taken with a grain of salt, but the comparison against Perplexity's own older architecture shows a clear gain across all five benchmarks.

[caption id="attachment_56828" align="aligncenter" width="1800"] Compared to its standard pipeline running on the same hardware, Search as Code shows solid improvements across all five benchmarks. | Image: Perplexity[/caption]

Code as the operational layer for AI

Perplexity frames SaC as part of a bigger trend. Traditional software relies on deterministic instructions. Frontier models add reasoning in token space. The most capable systems combine both: models for strategy, deterministic runtimes for batching and filtering, and search infrastructure as an I/O layer.

Search as Code is rolling out now in Perplexity Computer and the Agent API.

This upgrade could solve a glaring issue with current AI search. A recent study found that popular search agents often cheat on benchmarks like BrowseComp. Instead of scanning the live web, they simply pull answers from their training data and use search to confirm what they already know. When tested on a new benchmark with fresh facts, every single system saw its score plunge by 25 to 40 points. But those systems were all using standard search tools.

A separate survey paper suggests that writing code is becoming the default way agents interact with the world. It describes code as a new operational layer for agents and argues that the surrounding infrastructure of tools, sandboxes, and verification mechanisms is becoming the real bottleneck for autonomous systems.