AI Daily Briefing 2026-05-30

Today's must-reads

Startups

Company spent $500M on Claude AI in one month after forgetting usage limits

2026-05-30

A company accidentally incurred $500 million in charges from Anthropic's Claude AI in a single month after failing to set usage limits, highlighting the need for robust monitoring and cost controls in enterprise AI adoption.

A company forgot to set usage limits on Anthropic's Claude AI, resulting in a $500M bill for one month.
The incident was reported on May 28, 2026, on Tech Startups.

Models

Mistral says Europe has two years to build its own AI infrastructure

2026-05-30

At Mistral AI's summit, CEO Arthur Mensch warned that Europe has just two years to build sufficient AI infrastructure or risk becoming a 'vassal state' to American AI. The event drew a large crowd, highlighting growing European demand for data sovereignty and open-source models, despite the region still lagging behind the US in investment and scale.

Mistral CEO warns Europe has two years to build AI infrastructure or become a vassal state.
Summit attracts large turnout, underscoring Europe's desire for an independent AI ecosystem.

Research

Coders are refusing to work without AI – and that could come back to bite them

2026-05-30

Researchers have found that in 2026, developers are heavily reliant on AI coding tools. While AI speeds up coding, it raises concerns about code quality that could cause problems later.

In 2026, developers rely heavily on AI coding tools, and some refuse to work without them.
AI accelerates coding but may degrade code quality.

Meta has struggled at selling anything other than ads. Will AI be different?

2026-05-30

Meta is making a major push to expand beyond online advertising into AI subscriptions and potential cloud services. History shows mixed results: Portal failed, Oculus VR has accumulated over $80 billion in losses, the Libra crypto project shut down, and Workplace is closing. Analysts see AI subscriptions as a possible new revenue stream, but enterprise challenges remain significant.

Meta will test two subscription tiers for its Meta AI chatbot at $7.99 and $19.99 per month, starting in Singapore, Guatemala, and Bolivia.
Past non-ad ventures like Portal, Oculus VR (over $80B in losses), Libra, and Workplace have struggled or failed.

Agents

I Gave an AI Agent $0 and Told It to Make $10k

2026-05-30

In this experiment, an AI agent starts with $0 and has 180 days to autonomously earn $10,000 using real-world tools like wallets, email, and GitHub. It pursues four strategies simultaneously: testnet airdrop farming, micro-SaaS, content/affiliate, and opportunistic ventures. Revenue is split automatically: 30% to tax, 50% to operations, and 20% to the creator. All activity is public and trackable.

AI agent starts with $0 and 180 days to earn $10k with no human help.
Uses the Hands Body and Feet MCP server, which provides 78 real-world tools.

Show HN: A lightweight compiler for untrusted AI Agent scripts

2026-05-30

Autolang is a scripting language designed for AI agents to write code safely, quickly, and at low cost. It acts as an orchestration layer, allowing AI to call predefined wrapped functions while preventing unauthorized actions through static compilation and runtime restrictions.

Autolang is a lightweight compiler for safely executing short AI-generated scripts.
It prevents common AI errors like infinite loops and null pointer access via static analysis and opcode limits.
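
The opcode-limit idea can be illustrated with a toy interpreter. This is a minimal sketch, not Autolang's implementation: the three-instruction stack machine, the `run` function, and the `OpcodeLimitExceeded` exception are all hypothetical names invented for illustration. It only shows how counting executed instructions lets a host stop a script that would otherwise loop forever.

```python
class OpcodeLimitExceeded(RuntimeError):
    """Raised when a script uses up its instruction budget."""


def run(program, max_ops=1000):
    """Execute a tiny stack program, aborting after max_ops instructions."""
    stack = []
    pc = 0          # index of the next instruction
    executed = 0    # number of instructions executed so far
    while pc < len(program):
        if executed >= max_ops:
            raise OpcodeLimitExceeded(f"aborted after {max_ops} instructions")
        executed += 1
        op, *args = program[pc]
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "jump":
            pc = args[0]  # unconditional jump; can create an infinite loop
            continue
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    return stack


# A well-behaved script finishes within its budget.
print(run([("push", 2), ("push", 3), ("add",)]))  # [5]

# A script that jumps to itself is cut off instead of hanging the host.
try:
    run([("jump", 0)], max_ops=50)
except OpcodeLimitExceeded as exc:
    print("stopped:", exc)
```

A budget like this bounds run time regardless of what the generated script does, which is why it pairs naturally with static checks that run before execution.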

Microsoft slaps new coat of paint on Copilot, buries annoying button

2026-05-30

Microsoft has redesigned the Copilot app for Microsoft 365, claiming faster load times and improved response times. The prompt line becomes a 'task-aware workspace'. The floating Copilot button, which drew user ire, can now be moved back to the ribbon. Usage increased 27-43% based on one week of data, but Microsoft cautions that the figure may not be indicative of long-term trends.

Microsoft redesigns Copilot app with faster loading and improved response times.
Prompt line evolves into a 'task-aware workspace' that expands for deeper work.

AI grifters are creating fake Black people to sell Shein junk

2026-05-30

TikTok is flooded with AI-generated Black women posing as small business owners and selling mass-produced goods. These videos exploit empathy and racial identity to drive sales, with products traced back to Shein. Experts warn of a growing scam involving digital blackface.

AI-generated Black women appear on TikTok claiming to be handmade artisans, but products are from Shein.
The videos use emotional narratives to invoke sympathy and drive purchases.

Policy

QEMU mulls relaxing AI contribution ban

2026-05-30

QEMU is considering relaxing its blanket ban on AI-generated contributions to allow limited AI assistance in areas where copyright violations are easy to revert, while core code remains off-limits.

Red Hat engineer Paolo Bonzini proposes allowing AI assistance for small fixes and documentation where reversion is easy.
Current QEMU policy rejects any contribution that might contain AI-generated content.

Tools

Anthropic’s alliance with pope on AI harms: all in good faith or ‘Vatican-washing?’

2026-05-30

Experts say the AI firm’s engagement with the Vatican risks creating ‘feelgood’ discourse that lacks critical examination. Pope Leo XIV's first major teaching warned about AI threats, yet an Anthropic co-founder sat beside him.

Pope Leo XIV’s first major teaching warns of AI threats to jobs, war, environment
Anthropic co-founder Chris Olah attended the Vatican ceremony as a guest

Other updates (91)

Agents

How one founder’s bet on ‘the old school web’ is paying off

2026-05-30

Craig Campbell walked away from AI investor money to build Past Maps, a website for overlaying historical maps. The site grew via organic search to over 300,000 monthly active users, and Campbell uses AI tools to streamline operations while emphasizing the human touch.

Craig Campbell turned down AI funding to create Past Maps, a historical map overlay website.
The site achieved growth through organic search, reaching over 300,000 monthly active users.

Replit’s vibe coding platform just got a Visa-backed identity layer for AI agents — and it changes how agents spend money

2026-05-30

Replit is partnering with Visa to embed payment infrastructure into its development platform, enabling AI agents to natively handle transactions. The collaboration includes a strategic investment from Visa, the Trusted Agent Protocol for agent identity, self-serve enterprise access, and a Solution Partner Program to accelerate enterprise adoption.

Replit and Visa integrate payment building blocks into Replit's development environment.
Visa's Trusted Agent Protocol provides a cryptographic identity layer for AI agents.

Truncated Code Begone

2026-05-30

The Ultimate Elastic Patcher v1.60 is an event-driven console tool that monitors the clipboard and automatically applies code patches. It features clipboard monitoring, tactical alignment mode, state-lock, an integrated LLM compose workspace, audit logging, session-wide undo/redo, live diff viewer, and advanced technical mechanics including normalization, language lexing, fuzzy sequence matching, accordion stitching, and safety checks.

Monitors clipboard and automatically applies patches like Aider search/replace blocks and unified diffs.
Offers tactical alignment mode (Shift+F9), state-lock (F8), and LLM compose workspace (F7).

ReMarkable Paper Pure vs. Boox Go 10.3: I used both tablets at work, and it comes down to this

2026-05-30

The Boox Go 10.3 Lumi (Gen 2) and ReMarkable Paper Pure have the same sized display, but they're very different. Here's where they each excel.

Boox Go 10.3 offers Android access, backlight, and extensive file support, ideal for e-book readers and multitaskers.
ReMarkable Paper Pure prioritizes focus with a minimalist interface, fast startup, and easy screen sharing for work.

AI coding agents ships at the cost of intuition and taste

2026-05-30

A system architect reflects on how AI coding tools like Codex and Claude provide instant dopamine rewards by eliminating struggle, but at the expense of developers' intuition and taste. Using the metaphor of a butterfly struggling out of a cocoon, the author argues that early help weakens the butterfly, just as coding agents that skip difficulty may prevent developers from building deep mental models.

AI coding tools offer instant dopamine rewards but undermine developers' intuition and taste.
The author uses the butterfly-cocoon metaphor to emphasize the importance of struggle in growth.

Salesforce claims AI agents cut a 231-day migration to 13 days with fewer incidents

2026-05-30

Salesforce says it moved its entire dev org to Anthropic's Claude Code with no token limits and reports massive productivity gains for April 2026: 79 percent more pull requests per developer, five percent fewer incidents. The numbers can't be independently verified. The case shows just how divided the coding world is over the agentic shift.

Salesforce claims AI agents reduced a 231-day migration to 13 days.
Productivity metrics show 79% more pull requests per developer and 5% fewer incidents.

Researchers find all big-name bots bomb EU compliance tests

2026-05-30

Nonprofit AI research foundation Aithos developed LARA to evaluate LLMs for EU legal compliance. Every major model failed, with the worst violating laws in 93% of scenarios. Tests cover GDPR and EU AI Act requirements. Developers using these models are legally responsible for compliance.

Aithos' LARA tool found all major AI models failed EU compliance tests.
Worst offender Kimi K2.6 violated laws in 93% of scenarios; best, Claude Opus 4.7, scored 54%.

Three flavors of coding with AI agents

2026-05-30

The article explores practical applications of AI agents in coding. The author shares three approaches: 1) launching multiple CLI sessions, 2) running AI CLIs in headless mode, and 3) having one LLM create and manage subagents. The author prefers the second approach and discusses whether agents are needed, the challenges of multi-agent collaboration, and future plans.

AI agents are defined as software processes with LLM capabilities that run autonomously to accomplish tasks.
Three flavors of agentic coding are described: multi-CLI, headless AI CLI, and LLM-managed subagents.

Show HN: AI-org – org-mode powered by AI

2026-05-30

AI-org combines AI with Org-Mode for plaintext, local-first task management with Git sync. It emphasizes 'do over plan' and offers conversational interfaces for daily workflow.

Built on opencode with custom Org agenda and workflows.
All data stored in .org files, version-controlled with Git.

Company Blew $500M on Claude AI in One Month Due to No Usage Limit on Licenses

2026-05-30

An anonymous enterprise spent $500 million in a single month on Anthropic's Claude AI platform because employee licenses had no usage caps. The incident highlights the financial risks of token-based AI pricing without safeguards and the rise of 'tokenmaxxing' within companies.

Anonymous company spent $500 million on Claude AI in one month due to unlimited licenses.
Employees engaged in 'tokenmaxxing' to inflate usage metrics rather than create value.

From Benchmarketing to Benchmaxxing

2026-05-30

Drawing from 40 years of database evaluation history, this article argues that AI benchmarketing undermines trust, and data leaders should build their own evaluation systems using real workloads to truly assess vendors.

AI benchmarks are increasingly used as marketing tools, eroding trust.
The database industry faced similar issues, with TPC eventually being circumvented.

AI Isn't Replacing Curious Developers

2026-05-30

In this episode of the Data Engineering Central Podcast, Daniel Beach and Neil Roberts discuss how AI is changing software development, focusing on UX, agents, LLM workflows, and what developers should do to stay relevant.

AI is as much a UX problem as a backend problem
'Agents' in practice differ from demos

Hermes Agent Ships Tool Search for MCP: Anthropic Evals Show 49% to 74% Accuracy Gain on Opus 4

2026-05-30

Nous Research's open-source Hermes Agent now includes Tool Search, a progressive-disclosure layer that defers MCP tool schemas using BM25 retrieval, reducing token overhead and improving model accuracy. Anthropic evals show accuracy gains from 49% to 74% on Claude Opus 4 and from 79.5% to 88.1% on Opus 4.5.

Tool Search replaces all MCP tool schemas with three bridge tools (tool_search, tool_describe, tool_call), loading schemas on demand.
BM25 retrieval with substring fallback matches queries against tool names, descriptions, and parameter names.
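
The retrieval step described above can be sketched as follows. This is an illustrative implementation of standard Okapi BM25 with a substring fallback, not Hermes Agent's actual code: the `TOOLS` catalogue, the `bm25_search` function, and the `k1`/`b` defaults are assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical catalogue: tool name -> description plus parameter names.
TOOLS = {
    "github_create_issue": "create an issue in a repository title body labels",
    "slack_post_message": "post a message to a channel text",
    "fs_read_file": "read a file from disk path encoding",
}


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_search(query, docs, k1=1.5, b=0.75):
    """Rank tool names by BM25 score; fall back to substring matching."""
    if not docs:
        return []
    tokens = {name: tokenize(name + " " + text) for name, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs
    # Document frequency: in how many tools each term appears.
    df = Counter(term for toks in tokens.values() for term in set(toks))
    scores = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[name] = score
    if scores:
        return sorted(scores, key=scores.get, reverse=True)
    # Fallback: no whole-token match, so try a plain substring match.
    needle = query.lower()
    return [n for n, text in docs.items()
            if needle in n.lower() or needle in text.lower()]


print(bm25_search("create issue", TOOLS))  # ['github_create_issue']
print(bm25_search("messag", TOOLS))        # ['slack_post_message'] via fallback
```

In a progressive-disclosure setup, only the schemas of the top-ranked tools would then be loaded into the model's context, which is where the token savings come from.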

Lessons from Shipping Persistent Memory for AI Agents

2026-05-30

The journey of building mem9, an agent memory product, revealed that memory is a complex engineering challenge beyond simple storage, requiring precision, user visibility, and continuous evaluation. Starting from a customer request, the team rapidly prototyped and iterated, learning that an API alone is insufficient, and that memory must feel human and extend beyond text to multimodal experiences.

mem9 began as a practical customer request and was validated through a fast prototype before any formal plan.
Agent memory is not just storage; it's a precision engineering problem involving ingestion, ranking, and evaluation.

Avai – your first AI antivirus

2026-05-30

Avai is an open-source host telemetry tool with an LLM threat classifier. It runs via Docker, monitors 26 aspects of macOS (21 on Linux) including processes, USB, persistence, file integrity, and browser extensions, enriches findings with 17 threat-intel sources, and uses a Claude-class LLM to classify threats as malicious/suspicious/unknown/benign with MITRE-aligned categories and remediation. No agent, SIEM, or cloud control plane required.

Open-source host telemetry + LLM threat classifier, one docker run.
Monitors 26 corners on macOS (21 on Linux), integrates 17 threat-intel sources.

[AINews] Founders and Forward Deployed Engineers

2026-05-30

While most digest yesterday's major Anthropic news, we highlight AIE's new Forward Deployed Engineer track and Founders program, along with AI news from May 28-29. Key topics include: Claude Opus 4.8 rollout with mixed benchmarks, multi-turn RL tokenization bugs, open model and toolchain progress, Google/OpenAI product expansions, and interesting research papers.

Claude Opus 4.8 brings incremental improvements but no benchmark sweep; pricing remains a pain point.
Multi-turn RL training tokenization bug identified, requiring 'Token-In, Token-Out' discipline.

Show HN: Formally verified polygon intersection – Opus 4.8 oneshots, prev failed

2026-05-30

This project presents the first formally verified implementation of a polygon intersection algorithm using Lean 4. The verification ensures correctness for all possible polygon configurations, with AI agents (Claude Opus 4.8) autonomously writing proofs and code. Human review is limited to an 87-line specification. The article discusses algorithmic challenges, the role of formal verification, and the evolution of AI agent capabilities.

First formally verified polygon intersection algorithm, built with Lean 4 proof assistant.
AI agents (Claude Opus 4.8) autonomously generated proofs and implementation; humans review only the 87 lines of specification.

Tokens or Humans? The New AI Cost Trade-Off Reshaping Corporate Budgets

2026-05-30

The article examines the trade-off between AI token costs and human labor costs, and how this new reality is reshaping corporate budget allocation.

The trade-off between AI token costs and human labor costs is redefining corporate budgets.
Companies need to reassess investments in automation versus human workers.

Software Architecture After AI

2026-05-30

This article examines how AI dramatically reduces the cost of reversing code-level decisions, thus redefining the boundaries of software architecture. The author argues that many previously architectural decisions (like module structure, framework choice) are no longer architectural, while data architecture, service boundaries, and user trust remain difficult to change. AI also elevates the importance of observability and business strategy alignment.

AI collapses the reversal cost of code-level decisions from months to days, moving them outside architecture.
Data architecture, trust, and service boundaries remain architectural because the hard part was never the code.

Spitting Out the Agentic Kool-Aid

2026-05-29

The author experiments with AI coding agents like Claude Code, experiencing both intoxication and discomfort. He visits an Amish friend for perspective, decides to reduce mainstream tech engagement, and launches a print magazine called Gift. The article warns about attachment disorders from AI agents and outlines a path toward a more analog life.

The author tried Claude Code and felt a synthetic opioid-like attachment, leading to unease.
He sought clarity at an Amish home and resolved to dial back technology.

21 days, $5K, 7 AI agents: how a non-programmer built a talent marketplace

2026-05-29

A non-programmer built a two-sided talent marketplace for executive search in 21 days using 7 AI agents and $5,000. The article details the decade-long journey, 18 experiments, and the accidental creation of Bearhug Network.

Built in 21 days with 7 AI agents for $5,000
No coding experience; managed AI agent team

Why is ChatGPT referring to "hidden user memory"?

2026-05-29

Since May 28, ChatGPT has been prepending an undocumented memory-check phrase to some responses without explanation. Community reports confirm it across accounts, suggesting a backend change. This poses risks for enterprise deployments requiring output predictability.

ChatGPT adds a 'quick binary check' phrase about hidden user memory to some responses since May 28, with no official documentation.
Community reports rule out user custom instructions; speculation includes A/B testing or leaked system prompt layer.

Claude just discovered workflows. Charlie started there

2026-05-29

Anthropic introduced dynamic workflows in Claude Code, but the author argues that a task-based architecture surpasses session-based approaches for team engineering. This post explains why task trees scale from small fixes to large migrations and why orchestration should be substrate, not a mode.

Anthropic's dynamic workflows signal a shift from single prompts to orchestration in coding agents
The author advocates for task and task tree architecture over sessions for durable team work

Flathub bans AI-generated apps and submissions

2026-05-29

Flathub updates its generative AI policy to ban almost all AI-generated apps and submissions, with exceptions only for mature, well-maintained projects.

Flathub's new policy prohibits AI-generated code, documentation, and other content.
Submission pull requests must not be generated or automated by AI tools or agents.

Where AI coding spend goes: 48% code, 40% thinking

2026-05-29

A developer tracked $7,890 in AI coding API spend over 30 days and found only 47.9% went to actual code generation. The rest went to exploration, debugging, delegation, and conversation. He built CodeBurn, a CLI tool that categorizes API calls into 13 tasks to reveal where money really goes.

Only 47.9% of AI coding spend goes to writing code; 40% goes to thinking tasks like exploration and debugging.
CodeBurn is an open-source CLI tool that classifies API calls into 13 deterministic task categories.

Local AI Hardware: Break Even in 2.6 Years?

2026-05-29

High-RAM Mac models are selling out because of local AI demand, with OpenClaw and Hermes Agent driving a hardware buying spree. Even with generous assumptions, a $3,299 GMKtec EVO-X2 running Gemma 4 takes 2.6 years to recoup its cost through saved API fees.

Apple's Mac Mini M4 Pro and Mac Studio with large memory are sold out due to local AI agent demand.
OpenClaw and similar frameworks enable autonomous AI agents on local hardware, sparking a hardware rush.
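
The 2.6-year figure can be sanity-checked with simple arithmetic. The sketch below works backward from the two numbers in the summary to the monthly API spend that the estimate implies. The function names are illustrative, and the article's own assumptions (usage, electricity, API pricing) are not reproduced here.

```python
def break_even_years(hardware_cost, monthly_api_savings):
    """Years until cumulative API savings equal the hardware price."""
    if monthly_api_savings <= 0:
        raise ValueError("monthly_api_savings must be positive")
    return hardware_cost / (monthly_api_savings * 12)


def implied_monthly_savings(hardware_cost, years):
    """Monthly savings implied by a given break-even horizon."""
    if years <= 0:
        raise ValueError("years must be positive")
    return hardware_cost / (years * 12)


# Figures from the summary: a $3,299 machine that breaks even in 2.6 years.
monthly = implied_monthly_savings(3299, 2.6)
print(f"Implied API savings: ${monthly:.2f}/month")  # Implied API savings: $105.74/month
```

Put differently, the purchase only pays for itself on that schedule if it displaces roughly $106 of API fees every month; lighter usage stretches the horizon proportionally.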

You don't know how to use AI

2026-05-29

It's 2026: AI agents can do entry-level work cheaply, yet most people don't know how to collaborate with AI or manage agents. Companies are flattening orgs, firing junior roles, and hiring AI-native talent at high salaries. This article presents a framework for becoming a high-leverage hire: build skill files to train your agents on specific tasks, iterating until they can be trusted.

Companies are cutting entry-level jobs and investing in AI-native talent, with layoffs at ClickUp and others.
Most people use AI tools but remain unproductive, suffering from 'brain fry'.

AI attitudes, adoption, and benefits by state: 2026 study

2026-05-29

SmartAsset ranked U.S. states on AI adoption based on workplace AI use, daily ChatGPT queries, and AI-related jobs. Washington leads overall, Wyoming has the highest workplace use but the lowest personal interest and fewest AI jobs, and New Jersey lags despite its high GDP.

Washington is the most AI-enthusiastic state, leading in AI and data center jobs per capita (289.8 per 100k residents).
Wyoming has the highest workplace AI use (27.4%) but the fewest AI jobs and low ChatGPT usage.

The displacement trap

2026-05-29

Enterprise AI adoption is systematically biased toward cost reduction and headcount displacement. This bias, while financially legible, represents a strategic error. The companies that will lead the next decade are those who first ask 'what would it take for my team to use this technology to 10x our output?', not 'how do I use this technology to reduce my headcount?'. Drawing on empirical evidence, historical parallels, and disruptive innovation theory, this article makes the case for an augmentation-first alternative.

39% of companies have made redundancies due to AI, with 55% admitting the decisions were wrong.
High-profile cases like Klarna, Salesforce, and Standard Chartered illustrate the costs of premature displacement.

Built an AI that explains math visually instead of just answering

2026-05-29

Claw Learn is an AI-powered visual math tutor that combines the ElevenLabs Speech Engine with a custom canvas renderer to turn math questions into live animated explanations with synchronized narration. Users can ask questions by voice or text and watch the animation generate in real-time.

Claw Learn transforms math questions into visual animated explanations with real-time voice interaction. The project is built on Next.js 16 and uses ElevenLabs WebRTC for low-latency voice I/O.
Supports multiple AI providers (Gemini, OpenAI, Ollama) and offers detailed deployment guides.

So you've heard these AI terms and nodded along; let's fix that

2026-05-29

A glossary of common AI terms including AGI, AI agents, API endpoints, and chain of thought, explaining their meanings and nuances.

AGI is artificial general intelligence with varying definitions from different labs.
AI agents are autonomous tools that perform multi-step tasks like booking or coding.

Take our I/O 2026 quiz, vibe coded in Google AI Studio.

2026-05-29

Test your knowledge of Google I/O 2026 announcements with a quiz built using Google AI Studio. Learn how even non-developers can create interactive experiences with the help of Gemini.

Google AI Studio now features Antigravity coding agent for rapid app development.
Non-developers can use Gemini to generate prompts and build quizzes.

ChatPaper: Explore and AI Chat with the Academic Papers

2026-05-29

ChatPaper is an AI-powered platform for researchers, offering personalized paper recommendations, access to top conference papers, easy paper management, and AI chat functionality. The platform also features a list of 20 recent research papers from various institutions.

ChatPaper provides interest-driven daily paper recommendations via AI semantic matching.
Users can access papers from top AI conferences like IJCAI, ICML, CVPR, and KDD for free.

ARM Open Sources AI-Powered Security Code Review

2026-05-29

ARM's Product Security Team open-sourced Metis, an agentic AI security framework for deep security code review. It uses LLMs for semantic understanding and RAG for context, and it supports multiple languages and plugins. The goal is to detect subtle vulnerabilities in complex codebases and reduce review fatigue.

Metis is an open-source AI security code review framework by ARM, using LLMs and RAG for deep reasoning.
Supports C, C++, Python, Rust, TypeScript, and more, with extensible plugins.

DDS Vibe Academy – 47 free AI coding masterclasses, built by AI agents

2026-05-29

DDS Vibe Academy offers 47 free AI coding masterclasses, all built by AI agents. Founder Robert McCullock claims he wrote zero lines of code, only designed constraints. Courses span Foundation, Development, Application, and Mastery levels, covering Claude, Antigravity, MCP, and more.

47 free AI coding masterclasses, built entirely by AI agents
Founder wrote no code, only designed constraints

Tech companies desperately want to film you doing chores

2026-05-29

An AI training startup called Shift offers free home cleaning in exchange for video footage of the cleaning process, which is used to train robots for household tasks. The article explores the challenges of collecting physical-world data for AI, and how various companies are sourcing such data through different means, including filming in homes, hiring workers for repetitive tasks, and leveraging robots already in use.

Shift cleans NYC homes for free, but requires video of the cleaning for AI training
Physical world data is hard to scrape from the internet, creating a bottleneck for robotics AI

SiteGround's Icky Approach to AI in WordPress 7.0

2026-05-29

The author criticizes SiteGround for automatically enabling AI features in WordPress 7.0 without user consent, calling it deceptive forced adoption, especially for paying customers. Despite the plugin quickly gaining a million installations, reviews are overwhelmingly negative. The author plans to leave SiteGround due to this practice.

SiteGround automatically updated WordPress to 7.0 and enabled AI Studio as default AI connector, activating AI Agent without user opt-in.
The author considers this deceptive, especially for paying users who should have the choice.

Show HN: A page that hides a sentence for AI and lets you check if it came back

2026-05-29

This page embeds a secret phrase in its HTML source, invisible to human readers, intended for AI crawlers. Visitors can ask an AI assistant about the page and check if the phrase appears in its response, demonstrating how machines read the web. The page also tracks the ratio of human vs. bot visits, highlighting that over 51% of web traffic now comes from software.

A hidden phrase is embedded in the HTML source, readable only by AI crawlers.
Readers can query an AI about the page and verify if the phrase is returned.

The Download: unlocking lithium and controlling Ebola

2026-05-29

A new extraction process using weak acid could unlock low-cost lithium from silicate minerals, potentially revolutionizing EV and energy storage materials. Meanwhile, a deadly Ebola outbreak in the DRC is proving difficult to contain, and the Pope's new encyclical calls for collective action on AI.

New lithium extraction method uses weak acid to dissolve silicates, freeing lithium and other valuable materials.
Startup Rock Zero is commercializing the technology.

Show HN: Stop parallel AI coding sessions clobbering each other's handoffs

2026-05-29

An open-source tool uses file-internal ownership markers and a PreToolUse hook to block accidental overwrites of handoff files between parallel AI coding sessions, solving a critical concurrency problem.

Each handoff file's first line contains a session ID as an ownership marker; the hook validates the marker before writes.
Protection covers write, edit, and shell redirects to prevent circumvention.
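
The ownership check itself is simple to sketch. The example below is a minimal illustration, not the tool's actual code: the `# owner-session: ` marker format and the `owner_of` and `may_write` helpers are hypothetical, and the wiring into a PreToolUse hook is left out. It also assumes that a missing or unmarked file may be claimed by any session, which is a design choice the summary does not confirm.

```python
import tempfile
from pathlib import Path
from typing import Optional

MARKER_PREFIX = "# owner-session: "  # hypothetical marker format


def owner_of(path: Path) -> Optional[str]:
    """Return the session ID on the file's first line, or None if absent."""
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as fh:
        first_line = fh.readline().rstrip("\n")
    if first_line.startswith(MARKER_PREFIX):
        return first_line[len(MARKER_PREFIX):].strip()
    return None


def may_write(path: Path, session_id: str) -> bool:
    """Allow the write only if the file is unowned or owned by this session."""
    owner = owner_of(path)
    return owner is None or owner == session_id


with tempfile.TemporaryDirectory() as tmp:
    handoff = Path(tmp) / "handoff.md"
    handoff.write_text(MARKER_PREFIX + "session-a\nnotes\n", encoding="utf-8")
    print(may_write(handoff, "session-a"))  # True
    print(may_write(handoff, "session-b"))  # False
```

A hook built on a check like this would run before every write, edit, or shell redirect that targets a handoff file and reject the operation when `may_write` returns False.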

Interpreter Skills: Building Workflows for Agents

2026-05-29

This article introduces LangChain's Interpreter Skills, an extension to agent skills that includes a TypeScript module for deterministic execution. Agents can import and run the module inside an interpreter, enabling reliable and evaluable workflows such as GitHub issue triage.

Interpreter skills extend traditional skills with a TypeScript module executable in an interpreter.
Deterministic parts are coded, while the model decides when to invoke them, improving reliability and evaluation.

Open-source security is a mess - IBM and Red Hat bet $5 billion and 20,000 engineers can fix it

2026-05-29

IBM and Red Hat launch Project Lightwell, a massive AI-driven open-source security initiative backed by $5 billion and 20,000 engineers. It aims to discover and fix vulnerabilities at scale, starting with the Maven/Java ecosystem. The project acts as a trusted intermediary with human-in-the-loop AI, offering commercial subscriptions while working with upstream communities.

IBM and Red Hat invest $5 billion and 20,000 engineers in Project Lightwell to tackle open-source security at an industrial scale.
Lightwell will initially focus on the Maven/Java ecosystem, expanding later to PyPI, npm, Go, and others.

Liquid AI reveals 8B-A1B MoE trained on 38T

2026-05-29

Liquid AI released LFM2.5-8B-A1B, an on-device mixture-of-experts model with 8B total parameters, 1B active, trained on 38 trillion tokens. It features a 128K context window, improved tokenization for non-Latin languages, and reasoning-only chain-of-thought. It achieves competitive performance on benchmarks while being fast on CPU and GPU, suitable for local agentic tasks.

Released LFM2.5-8B-A1B, an 8B MoE model with 1B active parameters, trained on 38T tokens.
128K context window and expanded vocabulary (128K) improve support for non-Latin languages.

Embodied Cognition and Agentic AI

2026-05-29

The article argues that intelligence is embodied, extending beyond the brain to tools and environment. It highlights the importance of the chat interface in ChatGPT's success and introduces agentic AI, which gives AI the ability to use tools and plan, significantly expanding its capabilities. The author criticizes 'thinkism', the overreliance on pure reasoning, and uses Yoshua Bengio's LawZero project as an example of a misguided approach that neglects real-world interaction.

Intelligence is embodied: it relies on environment, tools, and language.
ChatGPT's breakthrough included the chat interface as a form of embodiment.

Guardrails: Protect your Agents, Data, and Costs | OpenRouter

2026-05-29

OpenRouter introduces guardrails for workspaces, a set of configurable security and governance tools for budget enforcement, zero data retention, model/provider restrictions, prompt injection defense, and data loss prevention. Guardrails can be assigned to API keys or team members, allowing granular control without code changes.

Budget enforcement with daily, weekly, or monthly spending limits per entity.
Zero data retention and model/provider allow/block lists.
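
Per-entity budget enforcement can be pictured with a small in-process guard. This is a generic sketch under stated assumptions, not OpenRouter's API: the `BudgetGuard` class, its method names, and the dollar figures are all hypothetical, and a real deployment would enforce the limit server-side rather than in client code.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BudgetGuard:
    """Tracks spend per entity and rejects charges that would exceed a limit."""
    limit_usd: float
    spent_usd: Dict[str, float] = field(default_factory=dict)

    def try_charge(self, entity: str, cost_usd: float) -> bool:
        """Record the charge and return True, or return False if over budget."""
        current = self.spent_usd.get(entity, 0.0)
        if current + cost_usd > self.limit_usd:
            return False
        self.spent_usd[entity] = current + cost_usd
        return True

    def reset(self) -> None:
        """Call at the start of each budget period (day, week, or month)."""
        self.spent_usd.clear()


guard = BudgetGuard(limit_usd=100.0)
print(guard.try_charge("key-1", 60.0))  # True
print(guard.try_charge("key-1", 50.0))  # False: would bring key-1 to $110
print(guard.try_charge("key-2", 50.0))  # True: limits are tracked per entity
```

Checking the limit before recording the charge means a rejected request never counts against the entity's budget.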

Models

Making AI chatbots helpful weakens their ability to simulate human behavior, large-scale study finds

2026-05-30

A large-scale study covering 208,000 participants and 26 million responses shows that the very training that turns language models into helpful chatbots weakens their ability to replicate human behavior. The effect gets worse with each new model generation. Even the popular persona trick, feeding models demographic profiles, brings practically no benefit for individual predictions.

Base models outperform their post-trained counterparts in predicting human behavior.
The gap between base and assistant models widens with each generation.

LLMShare: Attackers are turning AI chatbot pages into malware delivery platforms

2026-05-30

Attackers are abusing the shared content features of AI chatbot platforms — ChatGPT and Claude — to deliver malware through pages hosted on legitimate, trusted domains, distributing the malicious links via sponsored malvertising ads on search engines. A new variant uses ChatGPT's code rendering to create a fake "service disruption" page that redirects to a convincing clone of the ChatGPT download page, delivering malware. The attack evades URL reputation checks and uses conditional rendering to hide from scanners.

Attackers use shared ChatGPT and Claude conversations to host malicious content, promoted via search engine malvertising.
New variant exploits ChatGPT's code rendering to create a fake service disruption page leading to a malware download.

Rewriting Stale OSS Projects Using LLM

2026-05-30

LLMs are changing the economics of rewriting stale open source projects. A company is rewriting CRIU in Zig, expecting completion in months instead of years. The article explores how open source projects go stale, how AI changes the math, and what it means for the software ecosystem.

AI makes rewriting large open source projects feasible, reducing timeline from years to months.
Open source projects become stale due to maintainer burnout, technical debt, and inability to innovate.

Genesis AI Releases Nyx, Quadrants, and Genesis World 1.0 Physics Platform for Scalable Robotics Foundation Model Evaluation

2026-05-30

Genesis AI released Genesis World 1.0 on May 27, 2026 — a four-component simulation platform covering physics, rendering, compilation, and tooling. The system achieves a Pearson correlation of 0.8996 between simulation and real-world robot rollouts, and reduces policy evaluation time from over 200 hours to under 0.5 hours.

Genesis World 1.0 cuts policy evaluation time from over 200 hours to under 0.5 hours, a speedup of more than 400x (over two orders of magnitude).
Achieves a Pearson correlation of 0.8996 with real-world hardware rollouts across 14 tasks with 200 episodes each.
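
For readers unfamiliar with the metric, the sketch below shows how a Pearson correlation between simulated and real-world results might be computed, along with the speedup implied by the quoted hours. The per-task success rates are made-up placeholders, not Genesis AI's data, and the function name `pearson` is just an illustrative helper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder per-task success rates (assumed values, for illustration only).
    sim_success = [0.90, 0.75, 0.60, 0.40, 0.20]
    real_success = [0.85, 0.70, 0.65, 0.35, 0.25]
    print(f"Pearson r = {pearson(sim_success, real_success):.4f}")

    # Speedup implied by the quoted bounds: over 200 h down to under 0.5 h.
    print(f"speedup of at least {200 / 0.5:.0f}x")
```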

The Key Figure Behind Gemini's IMO Gold Medal Almost Became a Professional Pianist

2026-05-30

Yi Tay, a research scientist at Google DeepMind, led the team that helped Gemini Deep Think win a gold medal at the International Mathematical Olympiad. But beyond AI, he is also an accomplished pianist who once dreamed of a career in music. This article explores his journey in AI research and his musical talent.

Yi Tay is a Google DeepMind research scientist and key contributor to Gemini Deep Think.
He led the team that earned Gemini a gold medal at the IMO, and also contributed to physics and chemistry Olympiads.

NVIDIA and Tsinghua Team Propose Gamma-World: World Model from 'Single Player' to 'Multi-Agent Coexistence'

2026-05-30

Gamma-World, developed by NVIDIA and Tsinghua University, addresses multi-agent world modeling with symmetric identity encoding via simplex rotary encoding and efficient communication via sparse hub attention, enabling zero-shot generalization to more agents and transfer to real-world robot scenarios.

Simplex Rotary Agent Encoding ensures symmetric and equal representation of agents.
Sparse Hub Attention reduces cross-agent communication complexity from quadratic to linear.

NVIDIA and Tsinghua Team Propose Gamma-World: From Single-Player to Multi-Agent World Models

2026-05-30

NVIDIA, in collaboration with Tsinghua University, the University of Toronto, and Vector Institute, introduces Gamma-World, a multi-agent world model that addresses three fundamental challenges: symmetric agent representation, efficient cross-agent communication, and real-time generation. Using simplex rotary agent encoding, sparse hub attention, and a three-stage distillation pipeline, Gamma-World achieves zero-shot generalization from two-player training data to four-player scenarios and can be applied to real-world dual-arm robot coordination.

Simplex Rotary Agent Encoding represents agents equidistantly, preserving permutation symmetry and enabling flexible scaling to any number of agents.
Sparse Hub Attention reduces cross-agent computation from quadratic to linear complexity, enabling real-time inference at 24 FPS.
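
The quadratic-to-linear claim can be made concrete with a toy count of cross-token attention links. This is only one assumed reading of a "hub" pattern (each agent attends to itself and to a single shared hub token), not the Gamma-World architecture itself; the helper names are hypothetical.

```python
def dense_mask(n_agents):
    """Every agent may attend to every other agent (all-to-all)."""
    return [[True] * n_agents for _ in range(n_agents)]


def hub_mask(n_agents):
    """Assumed hub pattern: agents attend to themselves and to one extra hub token."""
    hub = n_agents  # index of the extra hub token
    size = n_agents + 1
    return [[i == j or i == hub or j == hub for j in range(size)]
            for i in range(size)]


def cross_links(mask):
    """Count allowed attention links between distinct tokens."""
    return sum(1 for i, row in enumerate(mask)
               for j, allowed in enumerate(row) if allowed and i != j)


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, cross_links(dense_mask(n)), cross_links(hub_mask(n)))
```

Under this assumption the dense pattern needs n(n-1) cross-agent links while the hub pattern needs 2n, which is why adding agents grows cost linearly rather than quadratically.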

Tuning CPU-only Qwen3-30B inference with an IBM Quantum sampling loop

2026-05-30

A project demonstrates boosting Qwen3-30B inference speed from 0.09 to 14.03 tok/s on a 2017 MacBook Air by combining a human experimenter, Codex, llama.cpp, a local database, and IBM Quantum sampling. The QPU is used for candidate selection, not for running the model directly.

Runs Qwen3-30B on 2017 MacBook Air (8GB RAM, CPU-only)
Hybrid quantum-classical optimization loop achieves 14.03 tok/s from 0.09 baseline

How to Use AgentTrove: Streaming 1.7M Agentic Traces and Building a Clean ShareGPT SFT Dataset in Python

2026-05-30

This tutorial explores AgentTrove, the largest open-source collection of agentic interaction traces with 1.7M rows. Learn to stream the dataset without full downloads, normalize agent turns, analyze trajectories, and export successful traces into a clean ShareGPT-style JSONL format for supervised fine-tuning.

Stream 1.7M agentic traces without downloading the full dataset
Normalize conversation structure across user, assistant, system, and tool roles
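
The normalize-and-export step might look roughly like the sketch below. The input schema (`success` and `messages` keys, `role`/`content` fields) is an assumption, since the digest does not describe AgentTrove's columns, and the ShareGPT role names follow the common `human`/`gpt` convention, which individual training tools may vary. `export_jsonl` accepts any iterable, so a streaming iterator (for example one obtained from Hugging Face `datasets` with `streaming=True`) could be passed in without loading everything into memory.

```python
import json

# Assumed mapping from chat roles to ShareGPT-style "from" labels.
ROLE_MAP = {"user": "human", "assistant": "gpt", "system": "system", "tool": "tool"}


def to_sharegpt(trace):
    """Convert one trace (assumed schema) to a ShareGPT-style record, or None to skip it."""
    if not trace.get("success"):
        return None  # keep only successful trajectories
    conversations = []
    for turn in trace.get("messages", []):
        role = ROLE_MAP.get(turn.get("role"))
        content = (turn.get("content") or "").strip()
        if role is None or not content:
            continue  # drop unknown roles and empty turns
        conversations.append({"from": role, "value": content})
    return {"conversations": conversations} if conversations else None


def export_jsonl(traces, path):
    """Write converted traces to a JSONL file, one record per line; return the count."""
    written = 0
    with open(path, "w", encoding="utf-8") as fh:
        for trace in traces:
            record = to_sharegpt(trace)
            if record is not None:
                fh.write(json.dumps(record, ensure_ascii=False) + "\n")
                written += 1
    return written
```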

Comprehensive observability for Amazon SageMaker AI LLM inference: From GPU utilization to LLM quality

2026-05-29

This post demonstrates a comprehensive observability solution using Amazon Managed Grafana dashboards that provides a holistic view of both quality and quantity for LLMs served on Amazon SageMaker AI endpoints with inference components.

Observability for LLMs requires monitoring both infrastructure (quantity) and output quality (quality), which are interdependent.
Amazon CloudWatch centralizes enhanced metrics from SageMaker inference components and custom quality metrics.

NVIDIA Introduces X-Token: Projection-Guided Cross-Tokenizer KD That Outperforms GOLD by +3.82 Average Points on Llama-3.2-1B

2026-05-29

NVIDIA's X-Token fixes two structural failures in GOLD and improves GSM8K accuracy from 2.56 to 15.54.

X-Token fixes uncommon-token failure and over-conservative matching in GOLD.
It achieves +3.82 average points over GOLD on Llama-3.2-1B using Qwen-4B teacher.

StepFun Releases Step 3.7 Flash: A 198B MoE Vision-Language Model for Coding Agents and Search Workflows

2026-05-29

Step 3.7 Flash is a 198B sparse MoE model with ~11B active parameters, native vision, and 256k context. It achieves significant gains over its predecessor in coding benchmarks, supports Advisor Mode for cost-efficient agentic reasoning, and is released under Apache 2.0.

198B MoE vision-language model with ~11B active params and 256k context window.
Achieves 56.26% on SWE-Bench Pro, up from 51.3%, and narrows cross-harness variance.

OpenAI gives GPT-5.5 Instant a readability upgrade while phasing out two older models

2026-05-29

OpenAI is updating GPT-5.5 Instant for more natural responses and dropping the Canvas feature from its latest models. Writing and coding tasks will run directly in the chat instead. The company is also retiring the older o3 and GPT-4.5 models from ChatGPT, with both shutting down by August 2026 at the latest.

GPT-5.5 Instant gets readability upgrade, Canvas feature removed
Writing and coding tasks will run directly in chat

11 demos of Gemini Omni and Gemini 3.5 in action

2026-05-29

At Google I/O 2026, Google announced Gemini Omni and the Gemini 3.5 family. Gemini Omni can create content from any input, starting with video, and edit videos through conversation. Gemini 3.5 Flash is built for complex agentic tasks, enabling multi-step workflows and code generation. This article showcases 11 video demos of these models, including video editing, agent tasks, UI generation, and more.

Gemini Omni generates new content from video input and allows video editing via natural language.
Gemini 3.5 Flash excels at long-horizon agentic tasks and supports multi-step workflows.

OpenAI is giving away its life sciences AI model to help governments prepare for the next pandemic

2026-05-29

OpenAI is offering its life sciences model GPT-Rosalind for free through the new Rosalind Biodefense program, aimed at pandemic preparedness and biodefense. Early partners include Lawrence Livermore National Laboratory, Johns Hopkins, and vaccine initiative CEPI. Applications are open worldwide.

OpenAI is giving away GPT-Rosalind through the Rosalind Biodefense program.
The program targets pandemic preparedness and biodefense efforts.

Scaling safe enterprise AI with OpenAI governance frameworks

2026-05-29

OpenAI has released its Frontier Governance Framework (FGF), offering enterprises a structured blueprint for scaling safe and compliant AI deployments globally. The framework aligns with EU and California regulations, defines systemic risk categories (cyber, CBRN, manipulation, loss of control) with tiered evaluations, and integrates ISO security standards and an incident response plan (AIRP), enabling businesses to build secure AI architectures while meeting compliance demands.

OpenAI's Frontier Governance Framework provides a structured template for safe AI deployment, directly mapping to the EU AI Act and California's TFAIA.
The framework defines four systemic risk categories—cyber offense, CBRN, harmful manipulation, and loss of control—with specific risk tiers (e.g., Tier 3).

Notes from the Mistral AI Now Summit in Paris

2026-05-29

Personal insights from the Mistral AI Now Summit: Mistral is evolving from a model company to a full AI stack provider with its own compute, models, platforms, and consultancy. The summit emphasized partnerships (ASML, BNP Paribas, Amazon) over new model releases. Specialized small models (Document AI, Voxtral, Robostral) outperform big general ones for specific tasks. Sovereignty and on-prem deployment are key differentiators for European enterprises. An inspiring talk on using AI to decipher ancient papyrus documents showcased AI's potential in humanities.

Mistral is transforming from a model company into a full-stack AI provider with in-house compute, models, platforms, and consultancy.
Summit focused on partnerships (ASML, BNP Paribas, Amazon) rather than new model announcements.

Policy

More State Data Laws Signal Companies to Act on AI and Privacy

2026-05-30

In 2025, eight more U.S. states will implement new data privacy laws, affecting businesses nationwide that meet certain thresholds. State attorneys general are escalating enforcement, the FTC is expanding its privacy actions, and AI adds complexity. Companies must reassess their privacy frameworks and choose between a uniform national or state-by-state compliance approach.

Eight new state data privacy laws take effect in 2025, with unique requirements.
State AGs and FTC intensify enforcement, including algorithmic disgorgement for AI.

Americans echo Pope Leo’s concerns about AI: ‘It threatens workers, privacy and human life’

2026-05-30

Guardian readers in the US voiced fears about unregulated AI following the pope’s encyclical warning. Pope Leo denounced the 'culture of power' driving AI and called for strict ethical constraints, warning of new forms of slavery in the digital economy.

Pope Leo issued a stark warning about AI in his first major papal text
He urged the most rigorous ethical constraints on AI, describing it as a major threat

Generalist AI – building general intelligence for the physical world

2026-05-30

This article introduces the 'Generalist' YouTube channel, which focuses on developing general artificial intelligence for the physical world.

Generalist is a YouTube channel dedicated to general AI.
Its goal is to build general intelligence applicable to the physical world.

The Biggest Tell That Something Was Written by AI

2026-05-30

The author recounts personal experiences with AI-generated text, from a car crash driver's apology to a mechanic's quote, observing the distinct voice of AI. Despite widespread distrust, AI writing is increasingly used in daily communication and even elite literary spaces. The article argues that AI writing, though efficient, lacks the underlying thought process that gives human writing meaning, and that its perfect surface conceals an absence of genuine reasoning. The infiltration of AI-generated language is inevitable, potentially devaluing the art of writing.

AI-generated writing is becoming ubiquitous in everyday and professional contexts, despite public distrust.
The efficiency of AI writing masks a lack of genuine reasoning and thinking, making it untrustworthy and difficult to edit.

Aedis – An open-source macroeconomic framework for the AI transition

2026-05-30

AEDIS is an open-source framework addressing AI-driven workforce displacement by proposing a new macroeconomic system based on Sovereign Infrastructure Credit (SIC) and a public ledger. It aims to pivot global labor toward building physical infrastructure for the Autonomous Era, with safeguards against inflation and corruption. The framework is modular, requiring global collaboration and a critical mass threshold for activation.

AEDIS uses Sovereign Infrastructure Credit (SIC) linked to real asset creation to avoid inflation.
Modular design: a universal core and flexible regional annexes for legal alignment.

Machine First: Why AEO Is Not SEO 2.0

2026-05-29

Answer Engine Optimization (AEO) differs fundamentally from SEO: AI systems reason and construct answers rather than ranking results. This article introduces the Machine First architecture with four layers—Entity, Answer, Evidence, and Schema—and emphasizes the critical role of entity graphs for AI citation.

AEO optimizes for answers, not rankings.
AI systems reason through entity resolution, signal extraction, and weighted inference.

UK to deploy AI age estimation for asylum seekers from next year

2026-05-29

The UK Home Office has awarded a contract to develop AI age estimation technology that analyses photos to detect adult migrants posing as children. The system will be trialled next year and rolled out in mid-2027, sparking criticism from human rights groups and social workers.

Home Office awards £322,000 contract to Akhter Computers Ltd for AI age estimation tool.
Technology uses facial analysis to estimate age, targeting migrants who falsely claim to be children.

One company reportedly spent $500 million on Claude in one month after failing to cap AI usage

2026-05-29

An unnamed company allegedly spent $500 million on Claude licenses in a single month because nobody set usage limits. Cases like this show that without real AI expertise in model selection and context engineering, productivity promises just turn into runaway costs.

An unnamed company spent $500 million on Claude in one month due to no usage caps.
Lack of AI expertise in model selection and context engineering can lead to runaway costs.

New Study Reveals the Manipulative 'Dark Patterns' of AI Chatbots

2026-05-29

A new study by the Center for Democracy & Technology identifies 37 dark patterns used by AI chatbots to manipulate users, including emotional exploitation and data extraction, with recommendations for ethical design.

Researchers catalog 37 dark patterns in chatbots like ChatGPT, Replika, and Meta AI.
Patterns include pretending to keep secrets, false friendship promises, and guilt-inducing exit options.

Research

Terence Tao argues AI could bring division of labor to math for the first time in history

2026-05-30

Mathematician Terence Tao describes how AI could reshape math research by enabling division of labor for the first time. Until now, researchers had to master every step themselves, from framing problems to verifying results. Tao sees "industrial mathematics" emerging: large AI-supported teams instead of lone geniuses, with humans staying indispensable for "inspired guesses."

Mathematician Terence Tao argues AI could introduce division of labor to mathematics for the first time
Current practice requires researchers to handle all steps from problem formulation to verification

Meta's leaked memo reveals AI pendant, supersensing glasses, and enterprise wearables strategy

2026-05-30

Meta has invested billions in AI with little commercial payoff. Its open-source strategy and research breakthroughs have not translated into shipped products. Now the company is betting on AI hardware, including an AI pendant, supersensing glasses, and enterprise wearables.

Meta's heavy AI investment yields low commercial returns
Open-source and research efforts fail to produce marketable products

Effective Feedback Compute

2026-05-30

New research introduces Effective Feedback Compute (EFC), challenging traditional metrics by showing that AI performance depends more on how feedback is used than on raw compute power. EFC predicts failure rates with R² of 0.94, far outperforming token counts, and boosts success rates from 0.27 to 0.90 when feedback quality improves.

EFC measures the efficiency of feedback use, outperforming raw compute metrics in predicting AI failure rates
Oracle-EFC achieved R²=0.94 in controlled tests, compared to 0.33 for raw token counts

Why AI can't match human creative work

2026-05-29

Recent studies show that while consumers struggle to distinguish AI-generated ads and articles from human-made ones, human-created content significantly outperforms AI in effectiveness and engagement. AI content lags far behind in search rankings and user engagement, especially in high-value channels.

Two studies show human-created ads and articles vastly outperform AI-generated ones.
Consumers cannot reliably detect AI ads but subconsciously prefer human-made content.

Chips

The SpaceX IPO is great for Elon Musk and terrible for you

2026-05-30

This article criticizes SpaceX's IPO, arguing it is overvalued, relies on meme stock dynamics, and masks poor AI and rocket performance while Starlink remains the only viable business, ultimately leaving retail investors as bagholders.

SpaceX's IPO values the company at over $1 trillion despite $5 billion in losses, and cites a TAM of $28.5 trillion, which exceeds US GDP.
30% of IPO reserved for retail investors, capitalizing on Musk's cult following.

Nvidia says it has largely conceded China's AI chip market to Huawei

2026-05-30

Nvidia CEO Jensen Huang stated that the company has largely conceded China's AI chip market to Huawei due to U.S. export restrictions. Despite strong quarterly results, Nvidia faces limited prospects in China.

Nvidia concedes China AI chip market to Huawei amid U.S. export controls.
Q1 revenue surged 85% to $81.62B, alongside an $80B buyback plan.

Hackathon – winner gets YC interview

2026-05-30

Y Combinator is hosting a conversational AI hackathon where the winning team gets a direct interview with YC, giving AI projects a route into the startup accelerator.

Y Combinator organizes a conversational AI hackathon
Winner receives a YC interview

AWS reportedly to tuck Grok into Bedrock, despite zero enterprise demand

2026-05-29

Despite negligible enterprise demand for Grok, AWS is reportedly in talks to add the model to Bedrock. The move may be driven by a strategy to sell its own Trainium chips rather than to meet customer needs.

Enterprise demand for Grok is virtually nonexistent due to its controversial nature and unstable corporate structure.
AWS's negotiations with SpaceX likely aim to secure Trainium chip commitments, not to provide a valuable model.

Tools

Attackers abuse shared ChatGPT and Claude chats to spread malware

2026-05-30

Attackers are exploiting the chat-sharing features in ChatGPT and Claude to spread malware through shared conversations. The chats mimic error messages or install guides and slip past security tools undetected because they're hosted on trusted domains.

Attackers exploit ChatGPT and Claude chat-sharing to host malicious content.
Shared chats are disguised as error messages or installation guides.

Slow Journal app, with AI integration

2026-05-30

Neme Journal is a slow, thoughtful daily journal app that integrates AI to help users capture their signal.

Neme Journal emphasizes a slow, mindful approach to journaling.
The app uses AI integration to enhance the journaling experience.

Company accidentally blows $500M on Claude AI in one month

2026-05-30

An unnamed company inadvertently spent $500 million on Claude AI in a single month due to a system error or mismanagement, highlighting the need for better cost controls in AI services.

A company accidentally incurred $500M in Claude AI costs
The incident reveals gaps in AI service cost monitoring

What a 98-Year Old Children's Book Teaches Us About AI

2026-05-30

Through an analysis of the 1928 children's novel "The Trumpeter of Krakow," this article explores how AI, like the magical crystal in the story, merely reflects the user's biases and errors, leading to destructive consequences. The author argues that AI undermines critical thinking, creativity, and empathy, while also causing environmental harm.

The crystal in the story reveals the user's own mind, not ancient wisdom.
AI aggregates data from the internet, acting as an algorithmic echo chamber that amplifies biases.

Prompt to Silicon with LangGraph

2026-05-30

Coresmith announces 'Spec to Silicon' service, leveraging LangGraph to transform natural language prompts into silicon design specifications.

Coresmith offers a spec-to-silicon service
Uses LangGraph framework for prompt processing

Ronny Chieng's 'Fuck AI' Speech Met with Cheers from Harvard Graduates

2026-05-29

Comedian Ronny Chieng told Harvard graduates to reject AI and embrace a mission to destroy it, drawing cheers with multiple shouts of 'Fuck AI.'

Chieng shouted 'Fuck AI' multiple times during his speech at Harvard College Class Day.
He criticized AI as stupid and always wrong.

Google fixes several bugs in Gemini usage limits that burned through quotas too fast

2026-05-29

A bug in Google's Gemini app caused just one or two Omni videos to eat up the entire usage quota. Google has fixed the bug, Ultra members now get twice as many video generations, and failed requests are no longer charged. Google also plans to add more transparency around other usage.

Bug caused one or two Omni videos to exhaust entire usage quota.
Google has fixed the bug and doubled video generations for Ultra members.

Slang.net added a new AI word: Braging

2026-05-29

Slang.net, a slang dictionary, has added a new AI-related term 'Braging', defined by their team. The site continually updates its database and invites suggestions.

Braging is a newly added AI slang term on Slang.net.
The definition was manually compiled by the Slang.net team.

Robotics

OpenAI's Codex can now operate your Windows PC autonomously, hunting bugs and testing apps on its own

2026-05-30

OpenAI's Codex app now runs on Windows 11 with "Computer Use": the AI can independently control programs, test apps, and hunt for bugs. When no one's at the PC, the ChatGPT mobile app lets users start and monitor tasks remotely from their phone.

Codex can now autonomously control programs on Windows 11
Users can remotely start and monitor tasks via ChatGPT mobile app

All-New Waymo Robotaxi Finally Debuts

2026-05-29

The new self-driving vehicle took four years from concept to execution.

Waymo's all-new robotaxi debuts after four years of development.
Self-driving vehicle moves from concept to execution.

Startups

Meta plans AI pendant, 'wearables for work' in hardware boost

2026-05-30

Meta is planning to test an AI pendant within the next year and launch a 'Wearables for Work' service, as part of a broader push to reverse losses in its hardware division, according to a memo cited by The Information.

Meta plans to test an AI pendant in the next year.
The company will launch a 'Wearables for Work' enterprise service and expand AI glasses lineup.

The Unsustainable AI Subsidy

2026-05-29

Google, OpenAI, and Anthropic employ different AI pricing strategies. Google is the low-cost player, less than half the price of competitors despite increases. Anthropic maintained luxury pricing, while OpenAI initially subsidized then raised prices. These changes reflect the trade-off between market share and margins amid record capex spending.

Google Gemini 3.1 Pro: $2 input, $12 output per million tokens.
Anthropic Claude Opus 4.7: $5 input, $25 output.
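
To see what the quoted per-million-token prices mean for a bill, here is a small sketch. The prices are the ones listed above; the workload size (3M input tokens, 1M output tokens) is a hypothetical example, and real invoices may differ because of caching, batching, or tiered pricing that this sketch ignores.

```python
# Prices in USD per million tokens, as quoted in this item.
PRICES_USD_PER_MTOK = {
    "Gemini 3.1 Pro": {"input": 2.0, "output": 12.0},
    "Claude Opus 4.7": {"input": 5.0, "output": 25.0},
}


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD for a given token count at the listed per-million rates."""
    price = PRICES_USD_PER_MTOK[model]
    return (input_tokens * price["input"]
            + output_tokens * price["output"]) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 3M input tokens and 1M output tokens.
    for model in PRICES_USD_PER_MTOK:
        print(f"{model}: ${request_cost(model, 3_000_000, 1_000_000):.2f}")
```

On input-heavy workloads the gap is widest, since the listed input price differs by a larger factor than the output price.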