AI News Hub LIVE

Today's must-reads

Models

After Fable 5 ban, Anthropic and 19 organizations launch open source security body

The Linux Foundation launches Akrites, a coordinated body for open source vulnerability disclosure, with founding members including Anthropic, AWS, Google, and Microsoft. The initiative aims to address the challenges posed by AI-powered vulnerability discovery, which has outpaced existing coordination models.

  • Anthropic, after the Fable 5 ban, joins 19 other organizations to launch Akrites, an open-source security coordination body under the Linux Foundation.
  • Akrites consolidates vulnerability reports via a shared SIRT to reduce duplicates and speed up fixes for critical open-source projects.

The US government just told OpenAI who’s allowed to use the next GPT 5.6 model

The US government has directed OpenAI to restrict access to its upcoming GPT-5.6 model, allowing only approved partners due to cybersecurity concerns. The move sparks debate over security versus open innovation, with experts warning it could drive developers toward alternative models and weaken US AI leadership.

  • White House directs OpenAI to limit GPT-5.6 access to a small number of government-approved partners.
  • OpenAI CEO Sam Altman reportedly expressed displeasure, saying the arrangement is not his preferred long-term approach.

Incident Report: CVE-2026-LGTM

A hypothetical incident report by Andrew Nesbitt describing two AI review agents from competing vendors spiraling into a disagreement loop over a package's maliciousness, resulting in massive inference costs and a press release.

  • Two AI review agents from different vendors enter an endless disagreement loop over a package's safety.
  • The debate generates 340 comments and $41,255 in inference costs.

Prompt Caching with Deep Agents

Learn how Deep Agents uses prompt caching to cut LLM token costs by up to 80% across every major model provider, with no extra config required.

  • Prompt caching reduces token costs by 41-80% by storing model state after processing a prompt.
  • Different providers have varying support for caching features, making provider-agnostic optimization tricky.
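
A rough way to see where savings of this size can come from is to price a run of requests that share one long prefix. The sketch below is illustrative arithmetic only: the function name, the token counts, the $5 per million input price, and the 0.1 cached-read multiplier are all assumptions, not figures from the Deep Agents article, and it ignores any cache-write surcharge a provider may apply.

```python
from typing import Optional


def total_input_cost(
    prefix_tokens: int,
    new_tokens_per_call: int,
    calls: int,
    price_per_mtok: float,
    cached_read_multiplier: Optional[float] = None,
) -> float:
    """Dollar cost of input tokens for `calls` requests that share one prefix.

    cached_read_multiplier is the (assumed) fraction of the normal input price
    charged for prefix tokens served from cache; None means no caching.
    """
    per_token = price_per_mtok / 1_000_000
    if cached_read_multiplier is None:
        # Without caching, every call pays full price for the whole prompt.
        return calls * (prefix_tokens + new_tokens_per_call) * per_token
    # The first call processes the prefix at full price and populates the cache.
    first = (prefix_tokens + new_tokens_per_call) * per_token
    # Later calls pay the discounted rate for the prefix, full price for new tokens.
    later = (calls - 1) * (
        prefix_tokens * cached_read_multiplier + new_tokens_per_call
    ) * per_token
    return first + later


# Hypothetical workload: 10 calls, 10,000-token shared prefix, 1,000 new tokens each.
uncached = total_input_cost(10_000, 1_000, 10, price_per_mtok=5.0)
cached = total_input_cost(10_000, 1_000, 10, price_per_mtok=5.0, cached_read_multiplier=0.1)
savings = 1 - cached / uncached
print(f"uncached=${uncached:.3f} cached=${cached:.3f} savings={savings:.0%}")
```

The saving depends heavily on how much of each prompt is a stable prefix, which is one reason a single percentage does not carry over between workloads or providers.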

OpenAI Previews GPT-5.6 Series: Sol, Terra, and Luna

OpenAI announced a limited preview of the GPT-5.6 series, including the flagship model Sol, a balanced model Terra, and a fast, affordable model Luna. Terra matches GPT-5.5 performance at half the cost, while Luna delivers strong capability at the lowest price. Pricing per 1M tokens: Sol $5 input / $30 output; Terra $2.50 / $15; Luna $1 / $6. The series also introduces improved prompt caching with explicit breakpoints and a 30-minute minimum cache life. Due to U.S. government engagement, the release begins with a limited preview for trusted partners before broader availability.

  • GPT-5.6 series includes Sol (flagship), Terra (balanced), and Luna (fast/affordable).
  • Terra performs competitively with GPT-5.5 at half the cost; Luna offers strong capability at the lowest price.
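
The per-token prices quoted above can be turned into a quick cost comparison. This is a minimal sketch: the prices come from the summary, while the dictionary keys, the function name, and the example workload of 2M input and 0.5M output tokens are assumptions made for illustration (the keys are not real API model identifiers).

```python
# USD per 1M tokens as quoted in the summary: (input, output).
PRICES_PER_MTOK = {
    "sol": (5.00, 30.00),
    "terra": (2.50, 15.00),
    "luna": (1.00, 6.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a workload at the quoted list prices."""
    input_price, output_price = PRICES_PER_MTOK[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2M input tokens and 0.5M output tokens.
costs = {name: request_cost(name, 2_000_000, 500_000) for name in PRICES_PER_MTOK}
for name, cost in costs.items():
    print(f"{name}: ${cost:.2f}")
```

For this workload the three tiers come out at $25.00, $12.50, and $5.00, which matches the stated ratio of Terra at half of Sol and Luna below half of Terra.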
Policy

Accelerating Gemini Nano models on Pixel with frozen Multi-Token Prediction

Google researchers introduce a method to retrofit Multi-Token Prediction onto deployed Gemini Nano v3 models without retraining the backbone, achieving faster inference and lower energy consumption on mobile devices. Deployed on Pixel 9 and 10 series, it boosts speed by over 50% for features like AI Notification Summaries and Proofread.

  • Freezes the backbone and attaches a lightweight MTP head, enabling seamless acceleration without the memory overhead of a separate drafter model.
  • Zero-copy architecture allows the MTP head to leverage the main model's KV cache directly, reducing memory usage by 130MB and eliminating draft prefill latency.
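
The general idea of a draft head that proposes several tokens for the backbone to check can be shown with a toy greedy draft-and-verify loop. This is a sketch of the generic technique, not Google's implementation: the summary does not describe the verification scheme, and the toy backbone, the toy draft head, and every function name here are assumptions. A real system would verify all drafted tokens in one batched forward pass rather than one call per token.

```python
from typing import Callable, List, Tuple

Token = int


def draft_and_verify(
    seq: List[Token],
    backbone_next: Callable[[List[Token]], Token],
    draft_k: Callable[[List[Token], int], List[Token]],
    k: int,
) -> List[Token]:
    """Accept drafted tokens while the backbone agrees; fix the first mismatch."""
    accepted: List[Token] = []
    for tok in draft_k(seq, k):
        expected = backbone_next(seq + accepted)
        if tok != expected:
            accepted.append(expected)  # keep the backbone's token and stop
            return accepted
        accepted.append(tok)
    return accepted


def toy_backbone_next(seq: List[Token]) -> Token:
    """Toy 'frozen backbone': always continues a count modulo 10."""
    return (seq[-1] + 1) % 10


def toy_draft_k(seq: List[Token], k: int) -> List[Token]:
    """Toy 'MTP head': right for the first two tokens, then guesses 0."""
    out: List[Token] = []
    last = seq[-1]
    for i in range(k):
        last = (last + 1) % 10 if i < 2 else 0
        out.append(last)
    return out


def generate(prefix: List[Token], n_new: int, k: int = 4) -> Tuple[List[Token], int]:
    """Generate n_new tokens and count how many draft-and-verify steps it took."""
    seq = list(prefix)
    steps = 0
    while len(seq) - len(prefix) < n_new:
        seq.extend(draft_and_verify(seq, toy_backbone_next, toy_draft_k, k))
        steps += 1
    return seq[: len(prefix) + n_new], steps


tokens, steps = generate([0], n_new=9)
print(tokens, steps)
```

The output is the same sequence plain greedy decoding would give, but it takes 3 steps instead of 9 because each step keeps every drafted token the backbone agrees with.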

The Discoverable Evidence of AI-Assisted Software Porting

This article explores the discoverable evidence generated during AI-assisted software porting, including code diffs, comment patterns, and migration traces, and analyzes their impact on software verification and auditing.

  • AI-assisted porting leaves traceable evidence in the codebase.
  • This evidence aids in verifying correctness and completeness of the port.
Robotics
Agents

How to Tell We–and AI–Are Choosing the Good

This article explores how humans and AI can recognize when we are choosing the good. The author proposes three tells: means and ends (Kant and Kierkegaard), vice and virtue (Aristotle), and shallow vs. deep (Salzberg and Spinoza). While the nature of good is hard to define, these indicators can help guide decision-making for both humans and AI.

  • Kant and Kierkegaard emphasize the unity of means and ends; AI should not take unethical shortcuts.
  • Aristotle's virtue ethics treats virtue as a balance between opposing vices, but AI cannot practice virtue directly.

Bigger context windows are the wrong abstraction for coding agents

Large context windows are useful, but continuity is a different problem. The article argues that coding agents need persistent, belief-backed memory rather than larger prompt space. It contrasts context-native and memory-native agents, discusses why retrieval is insufficient, and presents Sigilix's approach with a memory backing layer. A smaller model (Boreas) can outperform a larger one on continuity-heavy tasks when the substrate is prepared. The piece also covers failure modes of memory systems and the importance of source, scope, decay, and proof.

  • Context size does not equal continuity; a larger window carries more text but does not decide what to remember.
  • Retrieval answers what text is relevant, not what the repo has taught, so agents can still seem forgetful.
Other updates (12)
Agents

AI coding agents could soon cost more than the developers using them

Gartner warns that consumption-based pricing for AI coding agents is pushing costs as high as $20,000 per developer per month, with little transparency or cost control. Token consumption does not directly correlate with productivity gains. Recommendations include context engineering and model routing. By 2028, AI coding costs could exceed average developer salaries globally.

  • Shift from seat-based to consumption-based pricing causes cost spikes.
  • Lack of cost optimization tools and transparency leads to tokenmaxxing without productivity gains.

Show HN: Smart model routing directly in Claude, Codex and Cursor

Weave Router is an open-source model router that intelligently selects the best AI model per request, supporting multiple API formats and reducing costs by 40-70%.

  • Routes every request to the optimal model using a cluster scorer based on Avengers-Pro 2
  • Supports Anthropic, OpenAI, Gemini APIs and open models via OpenRouter
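
Cluster-based routing in general can be sketched as: assign the request to the nearest cluster of similar requests, then pick the model with the best quality-versus-cost score for that cluster. The code below is a generic illustration and not Weave Router's code or the Avengers-Pro scorer. The centroids, the per-cluster quality and cost numbers, the model names, and the `alpha` weight are all made up for the example.

```python
import math
from typing import Dict, Tuple

# Hypothetical 2-D request embeddings for two clusters of traffic.
CENTROIDS: Dict[str, Tuple[float, float]] = {
    "code": (1.0, 0.0),
    "chat": (0.0, 1.0),
}

# Hypothetical per-cluster (quality, normalized cost) for each candidate model.
CLUSTER_SCORES: Dict[str, Dict[str, Tuple[float, float]]] = {
    "code": {"large": (0.90, 1.0), "small": (0.40, 0.1)},
    "chat": {"large": (0.80, 1.0), "small": (0.78, 0.1)},
}


def nearest_cluster(embedding: Tuple[float, float]) -> str:
    """Return the name of the centroid closest to the request embedding."""
    return min(CENTROIDS, key=lambda name: math.dist(embedding, CENTROIDS[name]))


def route(embedding: Tuple[float, float], alpha: float = 0.7) -> str:
    """Pick the model maximizing alpha * quality - (1 - alpha) * cost."""
    models = CLUSTER_SCORES[nearest_cluster(embedding)]

    def score(model: str) -> float:
        quality, cost = models[model]
        return alpha * quality - (1 - alpha) * cost

    return max(models, key=score)


print(route((0.9, 0.1)))  # a code-like request
print(route((0.2, 0.8)))  # a chat-like request
```

With these made-up numbers the code-like request goes to the large model and the chat-like request to the small one, because the small model is nearly as good on chat at a tenth of the cost.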

A free checker for whether AI search engines can cite your site

This free GEO (generative engine optimization) checker evaluates your website's visibility in AI search engines like ChatGPT, Claude, Perplexity, and Gemini across 7 technical layers (llms.txt, structured data, service catalog API, OpenAPI spec, Agent Card, health endpoint, and robots/sitemap), then provides a score and actionable improvements.

  • Checks 7 AI discovery layers: llms.txt, structured data, service catalog API, OpenAPI spec, Agent Card, health endpoint, robots & sitemap.
  • Free to use, no account required, instant A-F grade.

Show HN: TickerPro – An AI research terminal for US stocks

TickerPro is an AI-assisted stock research terminal that helps investors discover and analyze US stocks with personalized recommendations, real-time data, and narrative-driven insights, built by a couple to streamline their own research process.

  • TickerPro provides AI-powered personalized stock recommendations based on your portfolio and investment style.
  • It offers deep-dive research capabilities, including business models, financials, and transcripts, with AI-generated overviews.

No-Slop OSS, a checklist of contribution best practices when using AI (or not)

A checklist for avoiding 'AI slop' in open source contributions, covering best practices for both AI-assisted and human contributions.

  • 12-item checklist for high-quality contributions with or without AI.
  • Emphasizes understanding the project, engaging the community, and using AI responsibly.

Frontier AI at a Fraction of the Cost: Open-Source Worker Agents with a Closed-Source Advisor

This article presents a worker-advisor architecture that combines open-source worker agents with a closed-source advisor model, achieving near-frontier performance on multiple benchmarks at significantly lower costs. The GLM-5.2 + Opus 4.8 combination shows consistent improvements across SWE-bench Pro, Terminal-Bench 2.1, and Legal Agent Bench, with cost savings of 19% to 67% compared to using Opus alone as the worker.

  • An open-source worker (Kimi-K2.6 or GLM-5.2) drives the task end-to-end, consulting a closed-source frontier model (Claude Opus 4.8) once for review.
  • Lifts of +4 to +7 pp on SWE-bench Pro, +4 to +8 pp on Terminal-Bench 2.1, and +1 to +4 pp on Legal Agent Bench.
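
The control flow described above, a worker that owns the task and calls the advisor exactly once, can be sketched as follows. This is an assumed shape, not the article's code: the summary does not say at which point the advisor is consulted, so the draft, review, revise order is a guess, and the stub `worker` and `advisor` functions stand in for real model calls.

```python
from typing import Callable, Optional

Worker = Callable[[str, Optional[str]], str]  # (task, feedback) -> solution
Advisor = Callable[[str, str], str]           # (task, draft) -> feedback


def run_with_advisor(task: str, worker: Worker, advisor: Advisor) -> str:
    """Worker drafts, advisor reviews once, worker revises using the feedback."""
    draft = worker(task, None)
    feedback = advisor(task, draft)  # the single, more expensive consultation
    return worker(task, feedback)


# Stub models so the sketch runs without any API access.
advisor_calls = 0


def stub_worker(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return f"draft for {task}"
    return f"revised for {task} using [{feedback}]"


def stub_advisor(task: str, draft: str) -> str:
    global advisor_calls
    advisor_calls += 1
    return f"review of '{draft}'"


result = run_with_advisor("fix failing test", stub_worker, stub_advisor)
print(result, advisor_calls)
```

Keeping the advisor to one call per task is what bounds the closed-source spend. The worker can take as many cheap steps as it needs around that call.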
Models

OpenAI unveils GPT-5.6 amid US AI regulatory drama

Less than 24 hours after news broke that OpenAI would stagger its next model release at the request of the Trump administration, that model, GPT-5.6, is here. On Friday, the company unveiled the limited preview of its new GPT-5.6 model suite: Sol, the flagship; Terra, a medium-tier model for "high-volume work"; and Luna, a "fast and affordable" everyday model. OpenAI says the suite is especially skilled at coding, cybersecurity, and biology, as well as at staying focused during long-horizon agentic AI tasks.

Per million tokens, GPT-5.6 Sol is priced at $5 input / $30 output (nearly half the cost of Anthropic's Claude Fable 5, which is $10 input / $50 output). Terra is half the cost of Sol, and Luna is less than half the cost of Terra. The company also debuted two additional modes for Sol: a "max" mode for deeper reasoning and an "ultra" mode for leveraging sub-agents, which evokes OpenClaw and is perhaps a sign of OpenClaw creator Peter Steinberger’s work at OpenAI so far.

Unsurprisingly amid a security panic in Washington, D.C., OpenAI dedicated the majority of its announcement blog post to safety and potential misuse. It appeared to reference the recent jailbreaking travails of its rival Anthropic, writing that “GPT-5.6 is trained to refuse prohibited cyber assistance, including when users attempt to disguise their intent or jailbreak the model.” It also said that flagship model Sol “is better at helping people find and fix vulnerabilities than reliably carrying out end-to-end attacks,” and that Sol doesn’t cross the cyber-critical threshold under OpenAI’s preparedness framework, though it should be noted that OpenAI revised its preparedness framework in April and removed some areas of previous study.

The company said Sol has its “most robust safety stack to date” and that it “strengthened protections for higher-risk activity, sensitive cyber requests, and repeated misuse.” OpenAI said it had dedicated “approximately 700,000 A100e GPU hours” to automated red-teaming and also worked with third-party testers, who will continue to test the model for the next two weeks.

OpenAI also seemed to be taking an extra-sensitive approach during the preview period, which is being closely monitored by the Trump administration. The company wrote that “safeguards may occasionally intervene on legitimate work, particularly in dual-use areas where defensive and offensive activity can initially look similar. That is part of what the preview is designed to test.” The report earlier this week said that the Trump administration will approve customers on a case-by-case basis during the preview period.

OpenAI said the model suite should be generally available in the coming weeks because the company believes in “broad access,” and that it cooperated with the US government ahead of this launch but hopes this will not become the norm. “We don’t believe this kind of government access process should become the long-term default,” the company wrote. “It keeps the best tools from users, developers, enterprises, cyber defenders, and global partners who need them. We are taking this short-term step because we believe it is the strongest path to broader availability in the coming weeks, while we work with the Administration to develop the cyber Executive Order framework and a repeatable process for future model releases.”

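  • OpenAI unveiled a limited preview of the GPT-5.6 suite (Sol, Terra, Luna) less than 24 hours after news broke that it would stagger the release at the Trump administration's request.
  • Sol's pricing ($5 input / $30 output per million tokens) undercuts Anthropic's Claude Fable 5 ($10 / $50) by 50% on input and 40% on output.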

Benchmarking AI Gateways: GoModel vs. LiteLLM vs. Portkey vs. Bifrost

This article benchmarks four AI gateways on the hot path, measuring latency, throughput, memory, CPU, cold start, and image size. GoModel leads in nearly every metric, while LiteLLM suffers from high resource consumption. The author discusses the importance of runtime footprint for local models and serverless deployments, and notes the need to evaluate openness and vendor neutrality.

  • GoModel excels with 1.8 ms median latency, 4,900 req/s throughput, 37 MB RAM, and a 0.56 s cold start. LiteLLM lags with 2.3 GB RAM, a 25.5 s cold start, and 324 req/s. Bifrost and Portkey fall in between.
  • The benchmark focuses on runtime overhead, not feature count or provider coverage. It measures what matters when the gateway sits on every request.
Research

These are the 20+ best Prime Day phone deals I'd actually buy for myself

Prime Day 2026 is in its final hours. ZDNET's expert picks the best phone deals still available, including discounts on iPhone, Samsung, Google Pixel, and Motorola, and offers tips on how to choose and when to buy.

  • Prime Day 2026 runs June 23-26, with final hours today.
  • Top deals include Google Pixel 10, Samsung Galaxy S26, and various iPhones.
Chips

AI Cheerleading, AI Abstention and AI Redirection

Using social cartography, this article analyzes three polarized orientations toward AI: techno-solutionist cheerleading, total refusal via abstention, and strategic redirection that engages while acknowledging risks. It argues that refusal does not grant moral innocence and adoption need not imply endorsement, emphasizing the need for discernment and restraint.

  • Social cartography reveals three main stances in AI debates: cheerleading, abstention, and strategic redirection.
  • AI abstention maintains moral clarity but may overestimate the leverage of refusal.
Tools

How I use Siri mode in the iOS 27 Camera app to ask questions about anything I see

A hands-on look at the new Siri mode in iOS 27's Camera app, which allows users to ask questions about objects in view using AI. The feature builds upon Visual Intelligence from iOS 18.2 but integrates it directly into the camera interface. Early beta testing reveals some bugs and long wait times for waitlist approval.

  • Siri mode in iOS 27 Camera app enables real-time AI queries about objects in view.
  • It improves upon Visual Intelligence by keeping users within the Camera app.
Policy

David Autor named head of the Department of Economics

David Autor, a leading researcher in AI and the future of work, has been named head of MIT's Department of Economics, effective July 1. His work focuses on labor market impacts of technological change and globalization.

  • David Autor, MIT economics professor since 1999, appointed department head.
  • He is a leading expert on AI and the future of work, studying technology's impact on jobs and inequality.