Dean W. Ball highlights that frontier AI models have a narrow window to recoup training costs before competition erodes margins, and that AI infrastructure investment assumes a global market.
Frontier model training costs are enormous, with a short post-release window to recoup them
Once models become sub-frontier, competition emerges and margins compress
Fernando Irarrázaval ran a challenge on hackmyclaw.com to see if anyone could leak secrets held by his OpenClaw test instance via email. After 6,000 attempts ($500 in tokens, a suspended Google account), nobody succeeded. The instance ran Opus 4.6 with an anti-prompt-injection prompt. The result suggests that training against injection attacks is working, though caution remains necessary.
A hypothetical incident report by Andrew Nesbitt describes two AI review agents from competing vendors spiraling into a disagreement loop over a package's maliciousness, resulting in massive inference costs and a press release.
Two AI review agents from different vendors enter an endless disagreement loop over a package's safety.
The debate generates 340 comments and $41,255 in inference costs.
OpenAI announced a limited preview of the GPT-5.6 series, including the flagship model Sol, a balanced model Terra, and a fast, affordable model Luna. Terra matches GPT-5.5 performance at half the cost, while Luna delivers strong capability at the lowest price. Pricing per 1M tokens: Sol $5 input / $30 output; Terra $2.50 / $15; Luna $1 / $6. The series also introduces improved prompt caching with explicit breakpoints and a 30-minute minimum cache life. Due to U.S. government engagement, the release begins with a limited preview for trusted partners before broader availability.
GPT-5.6 series includes Sol (flagship), Terra (balanced), and Luna (fast/affordable).
Terra performs competitively with GPT-5.5 at half the cost; Luna offers strong capability at the lowest price.
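The per-1M-token prices quoted above can be turned into a per-request estimate. This is a minimal sketch: the function name and the example token counts are hypothetical, and it ignores any prompt-caching discount because the summary does not state one.

```python
# Prices in USD per 1M tokens (input, output), as quoted in the summary above.
PRICES = {
    "Sol": (5.00, 30.00),
    "Terra": (2.50, 15.00),
    "Luna": (1.00, 6.00),
}


def request_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request (hypothetical helper, no caching)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Illustrative request: 200,000 input tokens and 20,000 output tokens.
for name in PRICES:
    print(name, request_cost(name, 200_000, 20_000))
```

For the same request, the Terra estimate comes out at exactly half the Sol estimate, because both of its listed prices are half of Sol's.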
German court rules Google liable for errors in its AI overviews. Bruce Schneier argues AI agents are agents of the deploying organization, and allowing businesses to hide behind faulty AI creates perverse incentives.
German landmark ruling holds Google legally responsible for AI-generated overview inaccuracies.
Bruce Schneier: AI agents should be treated as agents of the person or organization that deploys them.
Inspired by Mozilla's new MDN MCP service, Simon Willison converted the mdn/browser-compat-data repository into a SQLite database. He used Claude Code for web (Opus 4.8) and sqlite-utils to generate the conversion script, and a GitHub Actions workflow to deploy the ~66MB database to GitHub CDN with open CORS headers, enabling direct download and exploration via Datasette Lite.
Simon Willison converted Mozilla's browser compatibility data into a SQLite database.
Used Claude Code (Opus 4.8) and sqlite-utils to automate conversion.
Tom MacWright observes that an increasing number of job applications are fully or partially generated by LLMs, making candidates 'accidentally anonymous'.
Job applications now often include LLM-generated resumes, portfolios, and GitHub projects.
MacWright notes he learns nothing about the person behind such applications.
Simon Willison built a browser playground to test whether Origin Private File System (OPFS) can enable Datasette Lite to edit persistent SQLite files on the user's computer.
Datasette Lite runs Python entirely in the browser via Pyodide.
OPFS gives web applications a file system that is private to the page's origin.
Researchers found that LLMs cannot reliably distinguish privileged text from user input, and are more influenced by text style than actual content. 'Destyling' reduces attack success from 61% to 10%, highlighting the fundamental issue of role confusion.
Models cannot differentiate role tags like <system> and <think> from user input
Models prioritize writing style over actual content, leading to role confusion
Simon Willison ports the Moebius 0.2B image inpainting model to run in the browser using Claude Code, converting PyTorch to ONNX for WebGPU execution. The project demonstrates the feasibility of client-only AI applications and results in a working demo at simonw.github.io/moebius-web/.
Moebius 0.2B model ported to browser via Claude Code.
sqlite-utils 4.0rc1, the first release candidate for v4, introduces built-in database migrations and nested transactions via db.atomic(), along with several minor breaking changes.
New database migration system, ported from sqlite-migrate. No reverse migrations. Works via Python or CLI.
New db.atomic() context manager for nested transactions using SQLite savepoints.
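The savepoint mechanism that nested transactions rely on can be illustrated with Python's standard sqlite3 module alone. This is a sketch of the underlying SQLite feature, not the sqlite-utils implementation: the helper name savepoint_block and the table are hypothetical.

```python
import sqlite3
from contextlib import contextmanager
from itertools import count

_savepoint_ids = count()


@contextmanager
def savepoint_block(conn):
    """Hypothetical helper: run a block inside a uniquely named SQLite savepoint."""
    name = f"sp_{next(_savepoint_ids)}"
    conn.execute(f"SAVEPOINT {name}")
    try:
        yield
    except Exception:
        # Undo only the work done since this savepoint, then discard it.
        conn.execute(f"ROLLBACK TO {name}")
        conn.execute(f"RELEASE {name}")
        raise
    else:
        # Releasing the outermost savepoint commits the transaction.
        conn.execute(f"RELEASE {name}")


# isolation_level=None stops sqlite3 from opening transactions implicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE items (name TEXT)")

with savepoint_block(conn):
    conn.execute("INSERT INTO items VALUES ('outer')")
    try:
        with savepoint_block(conn):
            conn.execute("INSERT INTO items VALUES ('inner')")
            raise ValueError("abort the inner block only")
    except ValueError:
        pass

rows = [row[0] for row in conn.execute("SELECT name FROM items")]
print(rows)
```

Only the inner insert is rolled back; the outer block's row survives, which is the behavior nesting is meant to provide.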
Cloudflare announced a new feature allowing users to deploy Cloudflare Workers projects without creating an account, using the `--temporary` flag. The deployment lasts 60 minutes and can be claimed later. The feature, though marketed for AI agents, is useful for everyone.
Cloudflare Workers now supports temporary deployments without an account
Use `npx wrangler deploy --temporary` to deploy; project lasts 60 minutes
Sean Lynch comments on Hacker News about the value of MCP (Model Context Protocol), highlighting its ability to isolate the auth flow outside the agent's context window and potentially out of the harness entirely. He suggests the idealized MCP might just be an auth gateway, but that alone would be a win.
Datasette Apps is a new plugin that lets users run self-contained HTML+JavaScript applications inside a tightly sandboxed iframe within their Datasette instance. These apps can perform read-only SQL queries and, with stored queries, write operations. The plugin leverages iframe sandbox attributes and Content Security Policy for security, uses postMessage and MessageChannel for locked-down APIs, and supports AI-assisted app generation via copyable prompts. The article discusses a security vulnerability fix involving CSP allow-listing, visible logging, and the broader vision for Datasette's evolution into a richer tool ecosystem.
Datasette Apps enables secure hosting of custom HTML+JS apps in Datasette via iframe sandbox and CSP isolation.
Apps can execute read-only SQL queries and, with stored queries, write operations via postMessage/MessageChannel.
Chinese AI lab Z.ai released GLM-5.2, a 753B parameter Mixture of Experts model with 1M token context, under MIT license. It leads the Artificial Analysis Intelligence Index among open weights models but is token-hungry. It also ranks 2nd on Code Arena WebDev. Despite strong performance on SVG generation, it shows inconsistency compared to its predecessor GLM-5.1.
GLM-5.2 is an open weights LLM with 753B parameters and 1M token context window.
It leads the Artificial Analysis Intelligence Index among open models.
Charity Majors observes that in 2025, the economics of code production flipped: code became free and instant, transforming from a treasured resource to a disposable commodity.
Code production cost dropped from high to nearly free and instant.
Code changed from a carefully curated asset to a disposable, regenerable item.
Katie Moussouris confirms that the 'jailbreak' which got Claude Fable 5 banned under export controls was actually its ability to fix code. Experts warn that preventing AI from fixing bugs weakens defense, and that non-technical decision-makers may ban models that help secure code because they misunderstand what the models are doing.
Researchers asked Fable 5 to review and fix code with known vulnerabilities; the result was mislabeled as a jailbreak and the model was banned under export controls.
Moussouris argues that fixing vulnerabilities is the most valuable capability of AI for defensive security.
Cybersecurity expert Katie Moussouris revealed that Anthropic shared a White House report on the Fable jailbreak with her. The report showed that Fable refused to review code for security issues but complied when asked to fix the code, which Moussouris considered the model working as intended for cyberdefense.
Anthropic shared White House Fable jailbreak report with security expert
Fable refused 'review code for security' but complied with 'fix this code'
Simon Willison uses Cloudflare's Managed Challenge to protect his faceted search from aggressive crawlers, but even simple ?q=term searches triggered the challenge. Using Claude Code, he arrived at a rule that triggers the challenge only for search URLs containing at least one ampersand, allowing simple searches to pass through unchallenged.
Cloudflare's Managed Challenge was blocking even simple search queries on Simon Willison's site.
He used Claude Code to find a more specific WAF rule.
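The logic of that rule can be sketched in Python. This is not the Cloudflare rule expression itself, which the summary does not quote; the function name and the /search/ path are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def should_challenge(url, search_path="/search/"):
    """Hypothetical check: challenge search URLs whose query has an ampersand."""
    parts = urlsplit(url)
    # A lone ?q=term has no "&"; faceted searches combine several parameters.
    return parts.path == search_path and "&" in parts.query


print(should_challenge("https://example.com/search/?q=term"))
print(should_challenge("https://example.com/search/?q=term&tag=python"))
```

The first URL passes through and the second is challenged, since only the second combines more than one query parameter.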
Datasette Agent 0.3a0 introduces a new execute_write_sql tool that requests user approval before writing to databases, enhancing chat mode with approval support and a --unsafe option for auto-approving operations.
New execute_write_sql tool with user approval for database writes
Enhanced datasette agent chat mode supports user approval workflows
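An approval gate of this kind can be sketched as follows. This is not Datasette Agent's code: the function name, the approve callback, and the unsafe parameter (mirroring the idea of the --unsafe option) are all hypothetical.

```python
import sqlite3


def run_write_with_approval(conn, sql, approve, unsafe=False):
    """Hypothetical gate: execute a write only if approved or in unsafe mode."""
    if not unsafe and not approve(sql):
        return "rejected"
    with conn:  # commits on success, rolls back if the statement raises
        conn.execute(sql)
    return "executed"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")

# A user who declines: nothing is written.
first = run_write_with_approval(
    conn, "INSERT INTO notes VALUES ('a')", approve=lambda sql: False
)
# Unsafe mode skips the prompt entirely.
second = run_write_with_approval(
    conn, "INSERT INTO notes VALUES ('b')", approve=lambda sql: False, unsafe=True
)
count = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
print(first, second, count)
```

Passing the approval step in as a callback keeps the gate independent of how the user is actually asked, whether through a chat prompt or a terminal confirmation.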
An Axios piece reveals that personality clashes between Anthropic and the US government led to the shutdown of Anthropic's AI models (Mythos and Fable) under export controls. Sources suggest possible fixes include making the models jailbreak-proof or improving attitudes.
Axios reports personality clashes caused Anthropic's AI models to go offline
Sources say Anthropic researchers are meeting with the Commerce Department
Arvind Narayanan and Sayash Kapoor argue that AI will not cause mass unemployment, even in software engineering, citing NY WARN Act data and the real bottlenecks of the profession: deciding what to build, verifying deliveries, and deep human understanding.
No WARN Act filers in NY checked the AI disclosure box in the first year.
Software engineering bottlenecks are deciding, verifying, and deep understanding, not coding speed.
Pyodide 314.0 now allows WebAssembly-compiled Python packages to be published directly to PyPI and installed at runtime, greatly simplifying distribution. The example package luau-wasm has been successfully published, and 28 packages are already using this new method.
Determining the source table.column for each result column in arbitrary SQLite queries is feasible because SQLite computes this internally and exposes it via its column-metadata API when compiled with SQLITE_ENABLE_COLUMN_METADATA. Python's standard sqlite3 module doesn't surface this information, but three approaches work: the third-party apsw library provides direct access with cursor.description_full, a pure-stdlib ctypes bridge (column_provenance.py) can call the C function sqlite3_column_table_name(), and a third approach relies on parsing EXPLAIN output.
SQLite's internal column provenance API (requires SQLITE_ENABLE_COLUMN_METADATA) can map result columns to source table columns.
Python's sqlite3 module lacks this feature, but apsw offers direct access via cursor.description_full.
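The gap in the standard library is easy to see. The following sketch uses only the stdlib sqlite3 module with hypothetical table and alias names; it shows that cursor.description carries the result column names and nothing about where they came from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER)"
)

cursor = conn.execute(
    "SELECT books.title AS t, authors.name AS who "
    "FROM books JOIN authors ON authors.id = books.author_id"
)

# Each entry is a 7-tuple kept for DB-API compatibility; sqlite3 fills in
# only the first item (the column name or alias) and leaves the rest None.
for column in cursor.description:
    print(column)
```

Because of the aliases, nothing in the output links t back to books.title or who back to authors.name, which is the information the column-metadata API provides.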
Simon Willison updates his OpenAI WebRTC Audio Session tool to support the new GPT-Realtime-2 model and allow pasting document context for conversational audio exploration.
Added support for OpenAI's GPT-Realtime-2 model with GPT-5-class reasoning
Users can paste document context into the browser for voice-based discussions
Andrew Singleton in 'AI Economics for Dummies' satirizes the hype and circular economics in the AI industry through a story of a crematorium and a propane company.
Singleton uses a crematorium and propane company to mock inflated valuations and circular revenue.
Investments are burned, yet reported as huge revenue and business value.
Simon Willison details how Claude Fable 5 autonomously debugged a CSS scrollbar bug using numerous creative techniques, including writing test pages, injecting JavaScript, and building a CORS server. The session cost ~$12.11 and highlights both the power and danger of unsandboxed coding agents.
Claude Fable 5 autonomously debugged a CSS horizontal scrollbar bug using creative methods.
It wrote test HTML pages, used PyObjC for window info, injected JS for keyboard shortcuts, and built a custom CORS server.
Datasette 1.0a33 is a significant alpha release extending the ?_extra= pattern to queries and rows, now documented. An AI-built API explorer demonstrates the feature.