LWiAI Podcast #235: Sonnet 4.6, Deep-Thinking Tokens, and Anthropic vs Pentagon

The 235th episode of Last Week in AI covers Anthropic's Sonnet 4.6 with 1M context and strong ARC-AGI-2 results, Google's Gemini 3.1 Pro, xAI's Grok 4.2 beta, and various tool updates. Business news includes Meta's up-to-$100B AMD chip deal and funding rounds for MatX ($500M), World Labs ($1B), and Simile ($100M). Research highlights include deep-thinking tokens, masked optimizer updates, and LLM attractor states. Policy discussions focus on Anthropic's stance on Pentagon contracts and on distillation attacks.

Source: Last Week in AI. Author: Last Week in AI

Our 235th episode with a summary and discussion of last week’s big AI news!

Recorded on 02/27/2026. Hosted by Andrey Kurenkov and Jeremie Harris.

Note from Andrey: my startup Astrocade is hiring for engineers, marketing, product, growth, and more! If you’re in the Bay Area, would like to join a small but growing startup, and think building a YouTube-of-games sounds exciting, feel free to email me at [email protected]

Check out Astrocade!

Feel free to email us your questions and feedback at [email protected] and/or [email protected]

In this episode:

Model and tool updates highlight Anthropic’s Sonnet 4.6 (1M context; strong ARC-AGI-2 results), Google’s Gemini 3.1 Pro (major ARC-AGI-2 jump and multimodal demos), xAI’s Grok 4.2 beta (multi-agent debate), plus Anthropic’s Claude Code “Remote Control” and Perplexity’s multi-agent “Computer” coordinator.

Compute and business moves include Meta’s reported up-to-$100B AMD chip deal with warrant/equity incentives, MatX raising $500M to build specialized transformer chips shipping in 2027, World Labs raising $1B for world-model/3D environment tech, and Simile raising $100M to simulate/predict human behavior.

Infrastructure and geopolitics cover Stargate data-center delays amid OpenAI/Oracle/SoftBank control disputes and cash concerns, and China’s plan to scale 7nm/5nm wafer output despite yield and tooling constraints.

Research and safety/policy discuss optimizer gains from masked updates, “deep thinking tokens” as a reasoning-effort signal, LLM attractor-state behaviors in bot-to-bot chats, mechanistic interpretability of counting/line-wrapping, methods to map task difficulty to human time horizons, plus Anthropic–Pentagon contract tensions, Anthropic’s report on distillation attacks (DeepSeek/Moonshot/Minimax), and OpenAI’s report on disrupting malicious use.

A thank you to our current sponsors:

Box - visit Box.com/AI to learn more

ODSC AI - go to odsc.ai/east and use promo code LWAI for an additional 15% off your pass to ODSC AI East 2026.

Factor - head to factormeals.com/lwai50off and use code lwai50off to get 50 percent off and free breakfast for a year

Timestamps:

(00:00:10) Intro / Banter

(00:01:52) News Preview

Tools & Apps

(00:03:20) Anthropic releases Sonnet 4.6 | TechCrunch

(00:11:24) Google Rolls Out Latest AI Model, Gemini 3.1 Pro - CNET

(00:14:54) Elon Musk says Grok 4.20 public beta is now available: Capabilities of AI chatbot offered by xAI - The Times of India

(00:18:06) Anthropic just released a mobile version of Claude Code called Remote Control | VentureBeat

(00:21:01) Perplexity announces “Computer,” an AI agent that assigns work to other AI agents - Ars Technica

Applications & Business

(00:23:40) Meta strikes up to $100B AMD chip deal as it chases ‘personal superintelligence’ | TechCrunch

(00:27:05) Nvidia challenger AI chip startup MatX raised $500M | TechCrunch

(00:31:00) World Labs lands $1B, with $200M from Autodesk, to bring world models into 3D workflows | TechCrunch

(00:33:07) Simile Raises $100 Million for AI Aiming to Predict Human Behavior

(00:33:52) Stargate AI data centers for OpenAI reportedly delayed by squabbles between partners — sources say OpenAI, Oracle, and SoftBank disagreed on who would have ultimate control of the planned data centers

(00:36:43) China to increase leading-edge chip output by 5x in two years, report claims — aims to lift 7nm and 5nm production to 100,000 wafers per month, targeting half a million monthly by 2030

Research & Advancements

(00:40:33) On Surprising Effectiveness of Masking Updates in Adaptive Optimizers

(00:48:03) Think Deep, Not Just Long: Measuring LLM Reasoning Effort via Deep-Thinking Tokens

(00:54:52) models have some pretty funny attractor states

(01:01:41) When Models Manipulate Manifolds: The Geometry of a Counting Task

(01:05:16) BRIDGE: Predicting Human Task Completion Time From Model Performance

(01:12:00) NESSiE: The Necessary Safety Benchmark -- Identifying Errors that should not Exist

(01:13:15) The least understood driver of AI progress

(01:21:45) The Persona Selection Model: Why AI Assistants might Behave like Humans

Policy & Safety

(01:25:04) Anthropic CEO Amodei says Pentagon’s threats ‘do not change our position’ on AI

(01:33:04) Musk’s xAI, Pentagon reach deal to use Grok in classified systems

(01:34:17) Detecting and preventing distillation attacks

(01:38:36) OpenAI details expanding efforts to disrupt malicious use of AI in new report - SiliconANGLE