We break down the technical architecture behind our multi-stage vulnerability discovery harness and automated triage loop. Learn how we manage state controls, squash false positives through adversarial review, and route around LLM context limits.
Treat models as interchangeable components; cross-test with different models to avoid single-lens coverage.
Evolve from a single skill to a pipeline with Recon, Hunt, Validate, Gapfill, Dedup, Trace, Feedback, and Report stages.
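As a rough illustration of the staged design, the sketch below runs the eight named stages in order over a shared state object. Only the stage names come from the summary; `PipelineState`, `make_stage`, and the stage bodies are hypothetical placeholders, not the harness described in the post.

```python
from dataclasses import dataclass, field
from typing import Callable

# Stage names come from the summary above; the stage bodies are hypothetical placeholders.
STAGES = ["Recon", "Hunt", "Validate", "Gapfill", "Dedup", "Trace", "Feedback", "Report"]

@dataclass
class PipelineState:
    findings: list[str] = field(default_factory=list)
    history: list[str] = field(default_factory=list)  # stages completed so far

def make_stage(name: str) -> Callable[[PipelineState], PipelineState]:
    def stage(state: PipelineState) -> PipelineState:
        # A real stage would call a model or tool here; this sketch only records the step.
        state.history.append(name)
        return state
    return stage

def dedup(state: PipelineState) -> PipelineState:
    # Illustrative Dedup stage: drop exact duplicate findings, keeping first-seen order.
    state.findings = list(dict.fromkeys(state.findings))
    state.history.append("Dedup")
    return state

def run_pipeline(state: PipelineState) -> PipelineState:
    handlers = {name: make_stage(name) for name in STAGES}
    handlers["Dedup"] = dedup
    for name in STAGES:
        state = handlers[name](state)
    return state

result = run_pipeline(PipelineState(
    findings=["overflow in parse()", "overflow in parse()", "missing auth check"]
))
```

Keeping each stage as a function over explicit state makes it straightforward to swap the model behind a single stage, which is one way to read the "interchangeable components" point.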
Cloudflare is opening up its Agents SDK primitives for any agent framework to build on, with Flue as the first open-source framework targeting the SDK. Flue, built on the Pi harness, leverages Durable Objects for durable execution, dynamic code execution via Code Mode, and a virtual filesystem via @cloudflare/shell, enabling production-grade agent deployment.
Cloudflare makes Agents SDK primitives available to all agent frameworks, starting with Flue.
Flue is a declarative agent framework that eliminates the need for orchestration loops by describing what the agent knows.
Cloudflare is deepening its investment in AI by adding team members from Ensemble AI, focusing on machine learning infrastructure and efficiency. The Ensemble team brings expertise in model compression and efficient inference, including NdLinear technology, to enhance Workers AI's performance and cost-effectiveness.
Ensemble AI team members join Cloudflare to focus on ML infrastructure and efficiency.
Ensemble developed NdLinear and NdLinear-LoRA for model compression and efficient inference.
In our post about Project Glasswing, we made the argument that the architecture around a vulnerability matters more than the speed of the patch. Here we walk through what that architecture looks like, the threats it defends against, and how we run it ourselves as Cloudflare's customer zero.
Frontier models accelerate vulnerability discovery, exploit chain construction, and PoC generation, requiring defenses to focus on discovery speed, exploit volume/adaptation, and post-exploitation impact.
Cloudflare leverages global network visibility via Cloudforce One threat intelligence and a dedicated WAF team for rapid response.
AI Gateway now features real-time spend limits to prevent runaway token bills across multiple AI providers. By integrating with Cloudflare Access, companies can use identity-driven budgets and policies.
Cloudflare AI Gateway introduces spend limits, allowing budgets by model, provider, or custom attributes.
Integration with Cloudflare Access enables identity-driven budgets and policies per user or team.
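To make the budgeting idea concrete, here is a minimal sketch of a limiter that checks a request against every budget it matches. The `SpendLimiter` class, the attribute names, and the dollar amounts are assumptions for illustration; this is not AI Gateway's API or configuration format.

```python
from collections import defaultdict

class SpendLimiter:
    """Tracks spend per budget key and rejects requests that would exceed a limit."""

    def __init__(self, limits_usd: dict[tuple[str, str], float]):
        # Keys are (dimension, value) pairs, e.g. ("provider", "example-provider").
        self.limits = limits_usd
        self.spent = defaultdict(float)

    def allow(self, attrs: dict[str, str], cost_usd: float) -> bool:
        keys = [(dim, val) for dim, val in attrs.items() if (dim, val) in self.limits]
        # Reject if any matching budget would be exceeded by this request.
        if any(self.spent[k] + cost_usd > self.limits[k] for k in keys):
            return False
        for k in keys:
            self.spent[k] += cost_usd
        return True

# Hypothetical budgets: one per provider, one per team (an identity-derived attribute).
limiter = SpendLimiter({("provider", "example-provider"): 10.0, ("team", "search"): 2.0})
request = {"provider": "example-provider", "model": "example-model", "team": "search"}
first = limiter.allow(request, 1.5)   # within both budgets
second = limiter.allow(request, 1.0)  # would push the team budget past 2.0
```

A rejected request leaves the recorded spend unchanged, so one oversized call does not consume the remaining budget.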
Cloudflare processes over a billion events per second, but that data was scattered and hard to access. It built Town Lake, a unified analytics platform, and Skipper, an AI agent that lets anyone ask questions in plain English and get auditable answers. The article details platform architecture, governance (default-closed), and the AI agent's workings.
Cloudflare built Town Lake (unified data platform) and Skipper (AI agent) to solve data sprawl.
Town Lake uses a data lakehouse architecture with Trino, R2, and Iceberg for unified querying.
Cloudflare has integrated with Anthropic's Claude Managed Agents to provide a fast, isolated execution environment for autonomous code delivery. This means builders can scale agent workflows globally while strictly controlling access to private backends and easily customizing their agent’s tools and runtimes.
Cloudflare integrates Claude Managed Agents with its sandbox environment for enhanced control and security.
Features include lightweight isolates for fast scaling, private service connectivity, and browser observability.
In recent weeks, we pointed Mythos and other security-focused LLMs at live code across critical parts of our infrastructure. We share what we observed, the models’ strengths and weaknesses, and what the work around them needs to look like before any of it can scale.
Mythos Preview excels at exploit chain construction and proof generation.
Model refusals are inconsistent, requiring additional safeguards.
Starting today, agents can be Cloudflare customers. They can create a Cloudflare account, start a paid subscription, register a domain, and get back an API token to deploy code right away. Humans can be in the loop to grant permission, but there’s no need to go to the dashboard, copy and paste API tokens, or enter credit card details.
AI agents can autonomously provision Cloudflare accounts and resources via Stripe Projects.
The protocol includes discovery, authorization, and payment components with spending limits.
As AI assistants and privacy proxies challenge the capabilities of traditional bot detection, the Web needs new models for accountability. Cloudflare argues that control should remain with the client, and that an open ecosystem of anonymous credentials is key to preserving user privacy while protecting origins from abuse.
Traditional bot detection methods like IP addresses and fingerprints fail to capture intent; websites need behavior-based access control.
Privacy Pass enables clients to provide privacy-preserving proofs without tracking.
Learn about how Cloudflare built a CI-native AI code reviewer using OpenCode that helps engineers ship better, safer code. It uses specialized agents, a coordinator, risk tiers, and circuit breakers to scale across thousands of repositories.
Cloudflare built an AI code review system using OpenCode, with specialized reviewers for security, performance, code quality, etc.
The system uses a coordinator agent to deduplicate findings and judge severity, with circuit breakers and failback chains for resilience.
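The resilience pattern can be sketched as a circuit breaker wrapped around an ordered list of reviewers. Everything named here (`CircuitBreaker`, `review_with_fallback`, the two reviewer functions, and the failure threshold) is hypothetical and only illustrates the general technique, not the system's actual implementation.

```python
class CircuitBreaker:
    """Opens after `threshold` consecutive failures; an open breaker is skipped."""

    def __init__(self, threshold: int = 2):
        self.threshold = threshold
        self.failures = 0

    @property
    def open(self) -> bool:
        return self.failures >= self.threshold

    def record(self, ok: bool) -> None:
        self.failures = 0 if ok else self.failures + 1

def review_with_fallback(diff, reviewers, breakers):
    """Try each (name, fn) reviewer in order, skipping any whose breaker is open."""
    for name, fn in reviewers:
        breaker = breakers[name]
        if breaker.open:
            continue
        try:
            result = fn(diff)
        except Exception:
            breaker.record(False)
            continue
        breaker.record(True)
        return name, result
    return None, []  # every reviewer failed or was skipped

def flaky_reviewer(diff):
    raise TimeoutError("model unavailable")  # stand-in for an unavailable model

def backup_reviewer(diff):
    return ["possible unchecked input"]

reviewers = [("primary", flaky_reviewer), ("backup", backup_reviewer)]
breakers = {name: CircuitBreaker() for name, _ in reviewers}
outcomes = [review_with_fallback("diff text", reviewers, breakers)[0] for _ in range(3)]
```

This sketch never closes an open breaker again; a production version would typically add a cool-down after which the primary reviewer is retried.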
93% of Cloudflare's R&D organization uses AI coding tools powered by Cloudflare's own platform. In the last 30 days, AI Gateway processed 20.18M requests and 241.37B tokens; Workers AI processed 51.47B input tokens. The internal stack includes zero-trust authentication, centralized routing, MCP Server Portal, AI Code Reviewer, and a knowledge graph, all running on shipped Cloudflare products.
3,683 internal users actively use AI coding tools (60% company-wide, 93% across R&D).
AI Gateway handles 20.18M requests/month and 241.37B tokens; Workers AI handles 51.47B input tokens.
Agents Week 2026 is a wrap. Let’s take a look at everything we announced and shipped for the agentic cloud, from compute and security to the agent toolbox, platform tools, and the emerging agentic web.
Cloudflare hosted its first Agents Week, focusing on infrastructure for the age of agents.
Announcements spanned compute, security, agent toolbox, platform tools, and the agentic web.
Cloudflare launches isitagentready.com to help site owners assess how well their websites support AI agents across discoverability, content accessibility, bot access control, and capabilities. The article also introduces new Radar data tracking standard adoption and details how Cloudflare overhauled its developer docs to reduce token consumption and improve response speed.
Cloudflare introduces the Agent Readiness score and isitagentready.com for evaluating site support for AI agents.
The score is based on four dimensions: discoverability, content accessibility, bot access control, and capabilities.
Cloudflare announces support for shared compression dictionaries to reduce redundant data transfers, with an open beta starting April 30, 2026. The technology leverages delta compression to send only diffs of updated resources, achieving up to 99% reduction in transfer size compared to gzip.
Web pages grow 6-9% heavier each year, a cost compounded by agentic crawlers and frequent deployments.
Shared dictionaries use delta compression to send only file diffs, dramatically reducing bandwidth.
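The delta idea can be demonstrated with Python's standard `zlib` module, which accepts a preset dictionary through its `zdict` parameter. This is a stand-in chosen because it ships with the standard library; the summary does not say which codec the feature uses, and the file contents below are synthetic.

```python
import zlib

def compress_with_dictionary(new: bytes, dictionary: bytes) -> bytes:
    # The previous version of the resource acts as the shared dictionary.
    comp = zlib.compressobj(level=9, zdict=dictionary)
    return comp.compress(new) + comp.flush()

def decompress_with_dictionary(payload: bytes, dictionary: bytes) -> bytes:
    # The receiver must already hold the same dictionary bytes.
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(payload) + decomp.flush()

# Synthetic "old" bundle, plus a "new" bundle that differs by one small edit.
old_version = "".join(
    f"function handler_{i}() {{ return {i * 7919 % 1000}; }}\n" for i in range(200)
).encode()
new_version = old_version.replace(b"handler_42()", b"handler_42(request)")

plain = zlib.compress(new_version, 9)                       # no dictionary
delta = compress_with_dictionary(new_version, old_version)  # old version as dictionary
restored = decompress_with_dictionary(delta, old_version)
```

Comparing `len(delta)` with `len(plain)` shows the saving for this synthetic input. Note that zlib only looks back 32 KB, so this particular stand-in stops helping once the dictionary grows beyond that window.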
Cloudflare launches Redirects for AI Training to automatically redirect verified AI training crawlers from deprecated pages to canonical URLs using a single toggle. The feature leverages existing canonical tags to issue 301 redirects without origin changes. Additionally, Cloudflare Radar's AI Insights now includes response status code analysis for AI crawler traffic.
Cloudflare introduces Redirects for AI Training, using canonical tags to auto-redirect verified AI training crawlers with 301 responses.
Verified AI crawlers include GPTBot, ClaudeBot, and Bytespider; human and other traffic is unaffected.
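The redirect rule reduces to a small decision function, sketched below. The function name and URLs are hypothetical, and matching on the User-Agent string is a simplification: the summary refers to verified crawlers, and that verification step is not modeled here.

```python
from typing import Optional

# Crawler names are taken from the summary above. Substring matching on the
# User-Agent is an assumption for illustration; real verification is not shown.
AI_TRAINING_CRAWLERS = ("GPTBot", "ClaudeBot", "Bytespider")

def redirect_for_ai_training(user_agent: str, requested_url: str,
                             canonical_url: Optional[str]) -> Optional[tuple[int, str]]:
    """Return (301, canonical_url) for AI training crawlers on non-canonical pages."""
    is_training_crawler = any(bot in user_agent for bot in AI_TRAINING_CRAWLERS)
    if is_training_crawler and canonical_url and canonical_url != requested_url:
        return 301, canonical_url
    return None  # serve the page normally

crawler = redirect_for_ai_training(
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "https://example.com/docs/v1/setup",
    "https://example.com/docs/setup",
)
human = redirect_for_ai_training(
    "Mozilla/5.0 (X11; Linux x86_64)",
    "https://example.com/docs/v1/setup",
    "https://example.com/docs/setup",
)
```

Pages without a canonical tag, or whose canonical URL already matches the request, fall through to normal serving.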
Cloudflare Agent Memory is a managed service that gives AI agents persistent memory, allowing them to recall what matters, forget what doesn't, and get smarter over time.
Cloudflare launches private beta of Agent Memory, a managed service for persistent AI agent memory.
Extracts and retrieves session information to provide context without filling the context window.
Running LLMs across Cloudflare’s network requires us to be smarter and more efficient about GPU memory bandwidth. That’s why we developed Unweight, a lossless inference-time compression system that achieves up to a 22% model footprint reduction, so that we can deliver faster and cheaper inference than ever before.
Unweight is a lossless compression system that reduces the size of model weights by 15-22% without sacrificing quality. It exploits the redundancy in BF16 exponent bytes using Huffman coding, targeting MLP weight matrices.
It offers four execution pipelines (full decode, exponent-only decode, palette transcode, and direct palette) with an autotuner that selects the best one per weight matrix and batch size.
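The reason exponent bytes compress well can be shown with a small experiment: weights clustered near zero use only a handful of distinct exponents, so a Huffman code needs far fewer than 8 bits for most of them. The sketch below uses synthetic Gaussian values as a stand-in for MLP weights and textbook Huffman code lengths; it does not reproduce Unweight's encoding, its pipelines, or its measured ratios.

```python
import heapq
import random
import struct
from collections import Counter

def bf16_exponent(value: float) -> int:
    """Exponent byte of a value truncated to BF16 (top 16 bits of its float32 encoding)."""
    bits32 = struct.unpack(">I", struct.pack(">f", value))[0]
    bf16 = bits32 >> 16          # 1 sign bit, 8 exponent bits, 7 mantissa bits
    return (bf16 >> 7) & 0xFF

def huffman_code_lengths(freqs: Counter) -> dict[int, int]:
    """Code length in bits for each symbol, built with the standard Huffman merge."""
    if len(freqs) == 1:
        return {next(iter(freqs)): 1}
    # Heap entries: (weight, unique tie-breaker, symbols in this subtree)
    heap = [(w, i, [sym]) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:
            lengths[sym] += 1    # every merge adds one bit to the symbols beneath it
        heapq.heappush(heap, (w1 + w2, counter, s1 + s2))
        counter += 1
    return lengths

random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(10_000)]  # synthetic, not real model weights
freqs = Counter(bf16_exponent(w) for w in weights)
lengths = huffman_code_lengths(freqs)
avg_bits = sum(freqs[s] * lengths[s] for s in freqs) / len(weights)
```

Here `avg_bits` is the average encoded size of one exponent, to be compared against the 8 bits it occupies uncompressed; the sign and mantissa bits are left untouched in this sketch.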