DeepSeek AI News

Source Mix

Hacker News AI: 28
The Decoder: 5
MarkTechPost: 4
量子位: 4
AI Weekly: 1
Analytics Vidhya: 1
arXiv AI: 1
arXiv Machine Learning: 1

Topic Mix

Agents: 37
Models: 24
Chips: 23
Research: 16
Policy: 4
Startups: 2
Tools: 1

Timeline

2026-06-27: 5
2026-05-26: 3
2026-06-19: 3
2026-06-25: 3
2026-05-23: 2
2026-05-27: 2
2026-05-29: 2
2026-06-03: 2

Latest Updates

My AI Model Tier List for Mid-2026

2026-07-11 15:43 UTC

A personal, non-benchmark tier list of AI models for coding and auditing as of mid-2026, covering Anthropic Fable, OpenAI Sol, Mistral, Gemini, and DeepSeek, with commentary on US export controls and European perspectives.

Fable (Anthropic) gets a B: fluent but unreliable, prone to hiding bugs.
Sol (OpenAI) gets an S: trustworthy for low-level code and testing.

DeepSeek V3.2 Released on Hugging Bay

2026-07-11 01:44 UTC

DeepSeek V3.2 is now available on Hugging Bay, an open-source AI artifact registry offering provenance, license verification, and trusted hosting.

DeepSeek V3.2 has been published on Hugging Bay.
Hugging Bay is an open registry with provenance and trust features.

DeepSeek aims to make its own AI chip

2026-07-09 14:42 UTC

DeepSeek, a Hangzhou-based AI startup, is designing its own inference chip to reduce dependence on Nvidia and Huawei, leveraging its strengths in cost optimization and co-design. This move reflects China's adaptation to US export controls and could intensify the AI pricing war.

DeepSeek is developing its own chip targeting AI inference, not training.
The chip aims to cut serving costs and reduce reliance on Nvidia and Huawei.

DeepSeek DSpark: The Speculative Decoding Trick Behind 400% Faster LLM

2026-07-08 18:26 UTC

DeepSeek's new DSpark module brings speculative decoding to DeepSeek-V4, boosting per-user generation speed by 60-85% with no quality loss. It tackles both weak draft quality and verification waste simultaneously via a semi-autoregressive draft model with a Markov head. This article explains the method, the open-source DeepSpec toolkit, and experimental results.

DSpark uses a semi-autoregressive draft model combining parallel speed with sequential coherence.
A Markov head delivers near-full benefits with minimal overhead, chosen over an RNN head for production.

AI Models Overthink Problems—and It’s a Security Risk

2026-07-08 11:00 UTC

Research shows that large language models with reasoning capabilities can be tricked into 'overthinking' using logically inconsistent prompts, leading to a denial-of-service attack. Researchers from Zhejiang University and Alibaba developed an evolutionary algorithm that generates malicious prompts, causing outputs up to 26 times longer in leading models like DeepSeek-R1, Qwen3-Thinking, GPT-o3, and Gemini 2.5 Flash.

Researchers demonstrate a new attack exploiting 'overthinking' in AI reasoning models, causing excessive computation.
An evolutionary algorithm corrupts prompts to produce outputs up to 26 times longer than normal.

Chinese AI models are gaining ground with U.S. companies as costs surge

2026-07-07 21:48 UTC

Chinese-built AI models are gaining traction among U.S. companies as they narrow the performance gap with leading American rivals while remaining significantly cheaper to use. Recent model releases from DeepSeek and Z.ai are highly competitive with Anthropic and OpenAI. This comes as token prices for advanced models rise at U.S. labs, making companies seek cost-effective alternatives.

Chinese AI models are closing the performance gap with US leaders like Anthropic and OpenAI.
DeepSeek and Z.ai offer competitive models at lower token prices.

DeepSeek V4 Is Earning Agentic Token Share

2026-07-06 20:27 UTC

DeepSeek V4, released April 24, 2026, doubled its token share on OpenRouter from 9% to 18% within six months, driven primarily by agentic workloads. Its cost efficiency ($0.09/$0.18 per million tokens vs GPT-5.5's $5/$30) attracts diverse users, and Chinese models surpass US models in total token share.

DeepSeek V4 increased token share from 9% to 18% in six months post-release.
Agentic workloads are the main driver; V4-Flash accounts for 70% of DeepSeek's agentic tokens.

How NVIDIA’s Inference Software Stack Powers the Lowest Token Cost

2026-06-30 15:00 UTC

NVIDIA's inference software stack, co-designed with GPUs, CPUs, networking, and systems and strengthened by open source, continuously improves hardware performance. On Blackwell, it reduced token costs by up to 5x for DeepSeek V4 in one month. The article details how software optimizations across production operations, application acceleration, and infrastructure access compound to lower cost per token.

NVIDIA's full-stack inference software reduced token costs by 5x on Blackwell for DeepSeek V4 within a month.
Companies like Baseten, Cognition, Deep Infra, and Together AI leverage TensorRT-LLM and Dynamo for significant gains.

AI News: Not Much Happened Today

2026-06-30 06:47 UTC

A quiet day in AI news, but notable developments include Meta's non-invasive brain-computer interface Brain2Qwerty v2, Cursor's iOS launch with remote agents, DeepSeek's DSpark speculative decoding technique, growing commercial access to open-weight models, and Snowflake's Arctic RL training infrastructure. The Reddit community discussed running GLM-5.2 753B locally across two Macs.

Meta releases Brain2Qwerty v2, a non-invasive decoder achieving ~61% word accuracy in real-time typing tasks.
Cursor launches iOS app with always-on cloud agents and remote control of desktop agents.

Low-cost Chinese AI models like DeepSeek gain traction in the U.S.

2026-06-29 15:15 UTC

U.S. developers and small companies are turning to Chinese AI models to cut costs. Though lagging in performance, these models handle most tasks at a fraction of the price. Microsoft is also exploring DeepSeek as a cheaper alternative for Copilot. Chinese companies face challenges turning popularity into revenue under political scrutiny.

Stu Clott uses DeepSeek for coding, costing under 50 cents vs. $10 on Claude.
Chinese models cost less because of lower salaries and cheaper infrastructure in China.

GitHub DeepSeek-AI/DeepSpec

2026-06-27 20:16 UTC

DeepSpec is a full-stack codebase for training and evaluating draft models for speculative decoding. It provides data preparation utilities, draft model implementations, training code, and evaluation scripts for algorithms such as DSpark, DFlash, and Eagle3.

Offers a complete pipeline from data preparation to evaluation for speculative decoding.
Supports multiple draft model algorithms including DSpark, DFlash, and Eagle3.

DeepSeek Releases DSpark, a Speculative Decoding Framework That Accelerates DeepSeek-V4 Per-User Generation 60–85% Over MTP-1

2026-06-27 16:59 UTC

DeepSeek open-sourced DSpark, a speculative decoding framework that attaches a draft module to existing DeepSeek-V4 weights. It pairs a parallel draft backbone with a lightweight Markov head to cut suffix decay, then adds confidence-scheduled verification that tailors how many tokens get checked to real-time GPU load. Offline, accepted length rises 16–31% over DFlash and Eagle3; in production it speeds per-user generation 57–85% over the MTP-1 baseline, losslessly. The training repo, DeepSpec, ships under MIT.

DSpark pairs a parallel draft backbone with a lightweight Markov head to improve suffix acceptance.
Confidence-scheduled verification adjusts tokens checked based on GPU load.
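
For readers unfamiliar with the draft-then-verify idea that DSpark builds on, here is a minimal sketch of generic greedy speculative decoding. It is not DSpark: it has no Markov head and no confidence-scheduled verification, and the names `speculative_step`, `generate`, `toy_draft`, and `toy_target` are hypothetical. The toy "models" are plain functions that map a context to the next token.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function


def speculative_step(prefix: List[Token], draft: NextToken,
                     target: NextToken, k: int) -> List[Token]:
    """Run one draft-then-verify round and return the accepted tokens."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposed: List[Token] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # 2. The target model checks each proposed position. A real system scores
    #    all k positions in one batched forward pass; this loop is for clarity.
    accepted: List[Token] = []
    for tok in proposed:
        expected = target(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # keep the target's token, then stop
            return accepted
        accepted.append(tok)

    # 3. Every draft token matched, so the target adds one bonus token.
    accepted.append(target(prefix + accepted))
    return accepted


def generate(prompt: List[Token], draft: NextToken, target: NextToken,
             k: int, max_new: int) -> List[Token]:
    """Generate max_new tokens by repeating speculative rounds."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        out.extend(speculative_step(out, draft, target, k))
    return out[: len(prompt) + max_new]


# Hypothetical toy models: the target counts upward modulo 10, and the
# draft agrees except that it is wrong whenever the true next token is 5.
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    nxt = (ctx[-1] + 1) % 10
    return 0 if nxt == 5 else nxt


tokens = generate([3], toy_draft, toy_target, k=4, max_new=8)
```

Because a draft token is kept only when it equals the target's own greedy choice, the output is identical to decoding with the target alone. That is the sense in which this family of methods is lossless; the speedup comes from how many draft tokens are accepted per round.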

Native Hacker News TUI client with AI comments summary written in Golang

2026-06-27 16:04 UTC

cwnews is a terminal-based Hacker News reader with six feeds, threaded comment folding, three themes, and AI summarization via DeepSeek V4 Flash. Built in Go with Bubbletea v2, it caches everything in SQLite for instant access.

Terminal UI client supporting all six Hacker News feeds (Top/New/Best/Ask/Show/Jobs).
Threaded comments with collapsible nesting and depth-colored indentation.

DeepSeek open-sources inference optimizations with 60–85% faster generation

2026-06-27 09:18 UTC

DeepSeek has open-sourced a set of inference optimizations that speed up generation by 60–85%, as detailed in a technical paper.

DeepSeek released inference optimization techniques
Achieves 60-85% speedup in generation

cwmail: A terminal email client in native Golang with LLM-based drafting

2026-06-27 03:36 UTC

cwmail is a terminal email client written in Go using Bubbletea v2. It features proper HTML rendering, inline image support, multi-account IMAP with IDLE push, and AI-drafted replies powered by DeepSeek V4 Pro. It includes undo delete, draft auto-save, CLI send mode, and full offline capability, with all data stored locally.

Written in Go with Bubbletea v2, providing a full TUI for email management in the terminal.
Supports multiple IMAP accounts side-by-side with IDLE push notifications, avoiding polling.

DeepSeek Flash Inverts the Economics of Agent Products

2026-06-25 22:56 UTC

DeepSeek Flash shatters the adversarial pricing relationship between developers and big AI labs by offering a cheap, fast, text-only code generation model. It enables agent builders to switch from expensive multimodal APIs to open-source models acting as compilers, drastically cutting costs and reshaping browser agent architectures.

DeepSeek Flash upends agent economics, letting developers stop subsidizing their competitors.
By turning the model from worker to compiler, agent workflows are reduced from dozens of API calls to a single planning call.

We got DeepSeek-V4-Pro serving in 20 seconds

2026-06-25 20:49 UTC

Inferize says it got DeepSeek-V4-Pro serving in 20 seconds, showcasing highly optimized, elastic AI inference for LLMs. A waitlist is now open.

Inferize deployed DeepSeek-V4-Pro in 20 seconds
Provides highly optimized, elastic AI inference

Baidu Releases Unlimited OCR, a 3B Model That Keeps the KV Cache Flat for Long-Document Parsing

2026-06-25 05:39 UTC

Baidu open-sourced Unlimited OCR, a 3B-parameter MoE model that uses Reference Sliding Window Attention to keep the KV cache constant, enabling efficient parsing of dozens of pages in a single pass. It achieves 93.23 on OmniDocBench v1.5, surpassing DeepSeek OCR by 6.22 points, under an MIT license.

Unlimited OCR is a 3B MoE model with only 500M active parameters.
It uses Reference Sliding Window Attention to maintain constant KV cache size.
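
To illustrate why a sliding window keeps the cache flat, here is a minimal sketch of a generic fixed-window KV cache. It is an assumption-laden toy, not Baidu's Reference Sliding Window Attention: the summary does not describe the "reference" part, so it is not modeled here. The class name `SlidingWindowKVCache` is hypothetical, and strings stand in for key and value tensors.

```python
from collections import deque


class SlidingWindowKVCache:
    """Keep only the most recent `window` key/value entries."""

    def __init__(self, window: int) -> None:
        # deque(maxlen=...) silently evicts the oldest entry once full.
        self.keys = deque(maxlen=window)
        self.values = deque(maxlen=window)

    def append(self, key, value) -> None:
        self.keys.append(key)
        self.values.append(value)

    def __len__(self) -> int:
        return len(self.keys)


cache = SlidingWindowKVCache(window=4)
sizes = []
for t in range(10):
    cache.append(f"k{t}", f"v{t}")  # placeholders for real tensors
    sizes.append(len(cache))

# The cache grows to the window size and then stays there,
# no matter how many more tokens are processed.
```

With a full-attention cache, memory grows linearly with document length; with a bounded window it plateaus, which is what makes single-pass parsing of many pages affordable.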

Show HN: eBook to Audiobook Narration with Realistic AI Voices

2026-06-24 15:04 UTC

A developer built ebookaloud, a service that converts eBooks to audiobooks using the open-source Kokoro model. The code was 99% written by AI (DeepSeek V4) in a multi-agent workflow. It offers pay-as-you-go pricing and aims for good-enough quality, with future plans for more languages and PDF extraction.

Uses the open-source Kokoro model for realistic, non-fatiguing AI voices.
99% of the code was generated by DeepSeek V4 in a multi-agent coding workflow costing $12.

A cheaper and safer agentic AI workflow

2026-06-21 18:39 UTC

A developer shares their experience with agentic AI coding, achieving low costs ($0.034) and high efficiency through models like GLM-5.2 and DeepSeek V4 Flash, while ensuring privacy via a VirtualBox sandbox. The article details the setup, cost comparisons, and reflections on the AI industry's business models.

Agentic task completed for $0.034 in 3 minutes using DeepSeek V4 Flash, with only 2 minor errors versus a human's 4 errors in 1 hour.
Privacy protected by running the agent in a Debian VM within VirtualBox, isolating project data.

Plotting AI model release cadence: two labs are accelerating, three aren't

2026-06-21 02:16 UTC

Analysis of frontier model release data shows Anthropic and OpenAI are accelerating their release cadence, while Google, Meta, and DeepSeek are not. The article explores the recursive self-improvement hypothesis and proposes a falsifiable test.

Anthropic and OpenAI show accelerating model release cadence; three other labs do not.
Acceleration may be due to recursive self-improvement, where labs use their own models to build successors.

Beyond the $7.4B Headline: DeepSeek's Series A signals Chinese AI alliance shift

2026-06-20 23:47 UTC

3 Takeaways This Week: DeepSeek's $7.4B Series A led by Tencent signals a shift in Chinese AI funding away from ecosystem players; Japan targets $65B in physical AI infrastructure by 2040; Zhipu AI's GLM 5.2 surpasses Anthropic's Claude in design benchmarks.

DeepSeek's $7.4B Series A led by Tencent, with Alibaba and ByteDance absent.
Japan plans $65B public-private investment in physical AI infrastructure by 2040.

VibeThinker-3B: A 3B Dense Reasoning Model Built on Qwen2.5-Coder-3B With the Spectrum-to-Signal Post-Training Pipeline

2026-06-19 22:06 UTC

VibeThinker-3B is a compact 3B-parameter reasoning model that matches large models like DeepSeek V3.2 on math and code benchmarks, using an efficient post-training pipeline and test-time scaling.

VibeThinker-3B is a 3B dense model, MIT-licensed, built on Qwen2.5-Coder-3B for verifiable reasoning.
It scores 94.3 on AIME26, comparable to DeepSeek V3.2 (671B) and Kimi K2.5 (1T).

Huawei chips refine DeepSeek model in major leap for China's AI self-reliance

2026-06-19 17:33 UTC

A research team including Huawei has successfully used Ascend 910C chips to complete post-training of the DeepSeek-V4-Pro model, marking a milestone in China's ability to perform complex AI training with domestic hardware. The project, involving over 1,000 chips and a 1.6 trillion parameter model, demonstrates a shift from inference-only capabilities to full training, bolstering China's AI self-sufficiency amid US sanctions.

Huawei and partners used Ascend 910C chips to post-train DeepSeek-V4-Pro.
The cluster of at least 1,000 chips performed full-parameter tuning on a 1.6 trillion parameter model.

Show HN: Wolffish – An OS personal desktop AI agent

2026-06-19 11:32 UTC

Wolffish is a desktop AI agent that installs and works out of the box, supporting DeepSeek, GLM, Claude, GPT, and offline models. It prioritizes privacy and security by keeping data local, is open source, and free to use.

Wolffish is a simple, easy-to-use desktop AI agent with no complex setup.
It supports multiple AI models, including local models for offline use.

Attribution-Guided and Coverage-Maximized Pruning for Structural MoE Compression

2026-06-18 04:00 UTC

This paper proposes a structural pruning framework for Mixture-of-Experts models by reformulating prune-ratio allocation as a channel-score coverage maximization problem, solved efficiently via attribution-based approximation. Experiments on DeepSeek and Qwen MoE models show accuracy preservation under 50% or 25% structured pruning with 4-bit quantization, achieving 5.27× memory reduction on Qwen3-30B-A3B and outperforming baselines.

Observation: information within MoE experts is highly concentrated in a small subset of channels, leaving substantial redundancy even in important experts
Proposes a channel-level structural pruning framework that models prune-ratio allocation as a coverage maximization problem

Native Coding Agent Optimized for Local LLM and DeepSeek v4 with Vector Memory

2026-06-16 22:36 UTC

cwcode is a Go-based terminal coding agent leveraging DeepSeek V4 Pro, Qwen3.6-27B, and more. It offers file editing, sub-agents, semantic memory, and autonomous recovery. Key features: low cost (~$0.40/hour), high cache hit ratio (>85%), hash-anchored edits, checkpoint/rewind, and no SaaS lock-in.

Go-based terminal coding agent supporting DeepSeek V4 Pro, Qwen3.6-27B, etc.
Hash-anchored edits and sticky prefix cache reduce token usage and cost

Track token usage and AI subscriptions across major AI platforms

2026-06-14 04:09 UTC

Tokens 4 Breakfast is a macOS menu bar app for tracking token usage, subscriptions, and rate limits across major AI platforms like Claude, OpenAI, Cursor, Copilot, Gemini, DeepSeek, Mistral, and more. It helps developers avoid unexpected overspending with real-time alerts, budgeting, and cross-provider visibility. It is free for one provider; a one-time $7.99 Pro purchase unlocks all of them.

Real-time menu bar display of AI spend, rate limits, and subscription costs.
Supports 8 major AI providers including Claude, OpenAI, Cursor, etc.

How to Build a QwenPaw Agent Workspace with Custom Skills, Model Providers, Console Access, and Streaming API Testing

2026-06-13 17:27 UTC

This tutorial provides a step-by-step guide to setting up a QwenPaw agent workspace in Google Colab, including installation, configuration, authentication, connecting model providers (OpenAI, OpenRouter, DashScope, DeepSeek, Gemini), creating custom skills and local knowledge files, launching the console with optional Cloudflare tunnel, and testing the streaming chat API.

Step-by-step instructions for installing and initializing QwenPaw with a configured working directory.
Support for multiple model providers, auto-configured via Colab secrets.

China cracks down on Western AI models while US companies flock to DeepSeek

2026-06-13 02:51 UTC

China's Ministry of State Security warns of security risks in using Western AI models, while US firms increasingly adopt Chinese open-source models like DeepSeek due to cost advantages. Both nations' users circumvent restrictions, fueling a proxy market for AI access.

China's MSS warns against using third-party tools to access US AI models, citing security risks.
US companies flock to Chinese models such as DeepSeek and Alibaba's Qwen for lower costs.

Pythagoras-Prover: Advancing Efficient Formal Proving via Augmented Lean Formalisation

2026-06-12 04:00 UTC

Pythagoras-Prover is a compute-efficient family of open-source Lean theorem provers, featuring autoregressive models (4B and 32B) and a diffusion-based prover (4B). It uses curriculum SFT with stratified data and dynamic proof filtering for training efficiency, and introduces Augmented Lean Formalisation (ALF) to expand verified corpora via self-distillation. The 4B model outperforms DeepSeek-Prover-V2-671B on MiniF2F-Test (86.1% vs 82.4%) with ~167x fewer parameters, while the 32B model sets a new open-source SOTA at 93.0% and solves 93 PutnamBench problems.

Pythagoras-Prover includes autoregressive models at 4B and 32B parameters and a 4B diffusion-based prover that refines proofs iteratively.
Training efficiency is achieved via curriculum SFT with stratified difficulty levels and dynamic proof reasoning filtering within an 8k-token context.

DeepSeek Made AI Cheap. Now It Needs Billions to Keep It Cheap

2026-06-08 12:36 UTC

DeepSeek's low-cost AI models reshaped the industry, but a US evaluation shows its latest model trails the frontier by 8 months. Now the company is raising billions to fund its next phase, facing a capital race it helped create.

CAISI evaluation: DeepSeek V4 Pro is the most capable Chinese model but 8 months behind the US frontier.
DeepSeek is raising ~$7.4B at a $50-60B valuation to sustain operations.

DeepSeek topped Ramp's trending software vendors in June 2026 as US companies chase cheaper AI

2026-06-07 16:06 UTC

In June 2026, DeepSeek became the top paid software vendor on Ramp's platform as US companies send data directly to the service. Ramp chief economist Ara Kharazian cites cost awareness as a driver but warns about the security risks of using Chinese models.

DeepSeek ranked first among Ramp's trending software vendors in June 2026.
US companies are turning to DeepSeek's paid AI service to reduce costs.

Job Searcher

2026-06-06 15:36 UTC

Job Searcher is an AI-powered job search assistant for new grads. It analyzes resumes, generates LinkedIn search queries, and scores job postings across five dimensions: skills, experience, education, industry, and seniority. Built with a teacher-student model (DeepSeek V4 Pro and Qwen3-8B), it uses a curated dataset of 2,500 resumes and 10,000 job postings. Open-source and available on HuggingFace Spaces.

Automates LinkedIn job search with resume-based queries and multi-dimension scoring
Uses DeepSeek V4 Pro as teacher and Qwen3-8B as student

DeepSWE results are unreliable – 3/3 DSv4 "failed" tasks solved with same model

2026-06-04 16:32 UTC

An audit of the DeepSWE benchmark reveals that deepseek-v4-pro's reported results (8% solve rate, $4.22 avg cost) are invalid due to multiple issues: cost inflated ~5x by ignoring cache pricing, all three reported failures were solved with the same model, OpenRouter privacy settings silently block DeepSeek, and the model received no reasoning/effort tuning unlike competitors.

Cost inflated ~5x: benchmark bills all input tokens at cache-miss rate, ignoring 78% cache hits at 99.2% discount.
All three 'failed' tasks solved with same model deepseek-v4-pro for ~$0.86 total.
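
The cache-pricing point can be checked with a few lines of arithmetic. This is a minimal sketch using only the two figures quoted above (78% cache hits, 99.2% discount on hits). The function name `input_cost` is hypothetical, and the cache-miss price of 1.0 is a placeholder, since the ratio does not depend on the actual price.

```python
def input_cost(tokens: int, miss_price_per_m: float,
               hit_rate: float, hit_discount: float) -> float:
    """Cost of input tokens when a share of them is served from cache."""
    hit_tokens = tokens * hit_rate
    miss_tokens = tokens - hit_tokens
    hit_price_per_m = miss_price_per_m * (1 - hit_discount)
    total = miss_tokens * miss_price_per_m + hit_tokens * hit_price_per_m
    return total / 1_000_000


TOKENS = 1_000_000
MISS_PRICE = 1.0  # placeholder unit price per million tokens (assumption)

# How the benchmark reportedly billed: every input token at the miss rate.
billed_all_miss = input_cost(TOKENS, MISS_PRICE, hit_rate=0.0, hit_discount=0.992)

# Cache-aware billing with the hit rate and discount from the audit.
billed_with_cache = input_cost(TOKENS, MISS_PRICE, hit_rate=0.78, hit_discount=0.992)

inflation = billed_all_miss / billed_with_cache  # about 4.4
```

On input tokens alone this gives an inflation factor of roughly 4.4x, in the neighborhood of the audit's "~5x". The exact figure depends on details the summary does not give.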

More US Firms Turn to China's DeepSeek over Pricey Silicon Valley AI

2026-06-04 07:18 UTC

Chinese AI startup DeepSeek tops a US corporate spending index as US firms adopt cheaper Chinese models, replacing expensive options like OpenAI and Anthropic, amid a broader shift to open-source models.

DeepSeek ranked first on Ramp's June trending software vendors list, surpassing PheedLoop and Fireworks AI.
US companies are paying DeepSeek directly, sending data to China-hosted servers, a sign of seeking cost-effective AI.

DigitalOcean says it is now an OpenRouter AI model provider

2026-06-03 08:25 UTC

DigitalOcean announced on X that it is now a model provider on OpenRouter, offering DeepSeek V3.2, Kimi K2.6, and DeepSeek V4 Flash. The move signals the company's expansion from cloud infrastructure into AI inference.

DigitalOcean announced on X that it has become a model provider on OpenRouter
Initial models include DeepSeek V3.2, Kimi K2.6, and DeepSeek V4 Flash

Dropstone 1.5: 2× Claude Code's usage at $15/mo

2026-06-03 03:59 UTC

Dropstone 1.5 is an AI coding agent for the terminal, offering roughly 450 deep coding sessions per week for $15/month—about twice what Claude Code Pro delivers for $20. It runs on DeepSeek and Kimi models hosted in the US, with no data stored. Safety features require permission for file writes, shell commands, and network calls.

$15/month for ~450 deep coding sessions per week, 2x Claude Code Pro's usage.
Uses DeepSeek V4 Flash, V4 Pro, and Kimi K2.6 models hosted in the US, no data stored.

Show HN: Tkcore AI – A multi-model workspace with custom knowledge grounding

2026-06-02 07:26 UTC

Tkcore AI offers a multi-model workspace integrating various AI models like DeepSeek, Qwen, GLM, Kimi, and MiniMax, with features for low-latency responses, long context, multimodal inputs, and custom knowledge grounding via file uploads.

Supports models from DeepSeek, Qwen, GLM, Kimi, and MiniMax for diverse tasks.
Provides low-latency, high-throughput text and multimodal capabilities with image/video input.

New review paper argues code is how AI agents think and act, not just what they produce

2026-05-29 13:10 UTC

A new review paper argues that the real bottleneck for autonomous AI agents is the software layer around the language model—tools, memory, testing, and permissions. DeepSeek is building a dedicated 'Harness' team in Beijing, confirming the formula: model + harness = AI agent.

The paper claims the bottleneck for AI agents is the software harness, not the model.
Key components include tools, memory, testing, and permission boundaries.

PPIO Selected for '2026 Global AI 100' by FeiFan Research, Leading the New Wave of AI Globalization

2026-05-29 11:24 UTC

PPIO has been named to the '2026 Global AI 100' list by FeiFan Research, recognized at the FeiFan Awards – Annual AI Globalization Summit. The list honors AI-native companies with global vision. PPIO offers a global distributed computing infrastructure, full-stack cloud services, a model platform supporting DeepSeek, GLM, MiniMax, Kimi, Qwen, and an innovative Agent Sandbox. As of April 2026, PPIO has integrated over 4,800 distributed nodes, with daily token calls exceeding 1 trillion, over 570,000 developers, and Agent Sandbox business growing more than 50x since launch. PPIO was also designated as a pilot unit for Shanghai's Digital Overseas Service Platform and a GDA Pilot Service Station.

PPIO selected for '2026 Global AI 100', highlighting its leadership in AI globalization.
Provides global distributed computing infrastructure with full GPU coverage for training and inference.

Show HN: I packaged a Python AI agent and Vue dashboard into one Electron app

2026-05-28 10:12 UTC

Hermes Desktop is a cross-platform desktop app that bundles a Python runtime, hermes-agent (a self-improving AI agent), and hermes-web-ui (a Vue 3 + Koa chat dashboard) into a single Electron application, requiring no separate Python or Node installation. It integrates with DingTalk and is powered by DeepSeek.

Bundles Python runtime and hermes-agent for a zero-dependency user experience
Uses Electron shell with hermes-web-ui frontend

DeepSeek Researcher Develops Automated Research Skill: Writing a Paper with Only 2 Hours of Human Brain Time

2026-05-27 01:14 UTC

DeepSeek researcher Chen Deli used his self-developed DeliAutoResearch skill, collaborating with DeepSeek-V4-Pro and GPT-Image2, to complete a 46-page paper in just 6 days. The paper introduces an L1-L5 autonomy classification for research agents, analyzes four architectural patterns and 17 mainstream systems, and identifies six open problems. Chen Deli says only about 2 hours of human 'CPU time' were needed, with the rest handled by AI agents.

Chen Deli's DeliAutoResearch skill enabled the paper to be 99% written by AI agents.
The paper proposes an L1-L5 autonomy classification for research agents, analogous to SAE levels for autonomous driving.

AI Weekly Issue #496: Anthropic's Pentagon model is now everyone's model

2026-05-27 00:00 UTC

Anthropic released its formerly classified Mythos model to the public, collapsing the gap between sovereign and developer AI. DeepMind's Demis Hassabis moved AGI timeline to 2029. Critical vulnerabilities in Starlette impacted millions of AI agents, and a coordinated takedown dismantled the Glassworm botnet. BNP Paribas partnered with Mistral for sovereign AI security, while China restricted travel for top AI engineers at Alibaba and DeepSeek. Corporate AI spending and layoffs made headlines: Uber burned its full-year AI budget by April, ClickUp restructured with a 3:1 AI-to-human ratio, and Sam Altman reversed his white-collar apocalypse prediction. However, MIT Technology Review data showed AI-exposed roles have lower unemployment.

Anthropic releases Mythos, previously limited to government contractors, now available via standard API.
DeepMind CEO Hassabis advances AGI timeline to 2029, citing AlphaProof Nexus solving nine Erdős problems cheaply.

China reportedly now requires top AI researchers to get permission before leaving the country

2026-05-26 14:25 UTC

China is restricting overseas travel for top AI researchers at private firms like Alibaba and DeepSeek, requiring official approval to leave the country, due to fears of data leaks and talent poaching.

China requires top AI researchers to obtain permission before traveling abroad.
The policy applies to private companies like Alibaba and DeepSeek.

Introducing DSA Attention to Multimodal: Kuaishou Keye 2.0 Opens a New Paradigm of Enhanced Reasoning

2026-05-26 10:17 UTC

Kuaishou releases Keye-VL-2.0-30B-A3B, a multimodal large language model that is the first to apply DeepSeek Sparse Attention (DSA) to multimodal scenarios, enabling deep perception over a 256K ultra-long context. It achieves SOTA on long-video temporal understanding benchmarks and introduces built-in Agent collaboration, paving the way for enhanced reasoning and real-world business applications.

First to integrate DSA into a multimodal model, addressing long-video understanding bottlenecks.
Achieves SOTA on TimeLens, LongVideoBench, MLVU; reverses long-context decay by boosting accuracy from 35.34% to 42.44% when scaling from 64 to 512 frames.

Cited AI Workspace: No More Re-Uploading Files

2026-05-26 02:18 UTC

UUMuse is a cloud AI knowledge base platform where you upload files once and use them across GPT, Claude, DeepSeek, Qwen, and more — with cited answers, persistent memory, agent mode, a multi-expert debate feature (Spark), and flexible deployment as docs sites, APIs, or MCP servers.

Upload files once and query multiple AI models (GPT, Claude, DeepSeek, Qwen) with source citations.
Persistent memory remembers your writing style and project context across conversations.

DeepSeek V4 Gets Even Cheaper: New Tool Boasts 99.82% Cache Hit Rate, Slashes Bills to 20%

2026-05-25 04:40 UTC

One month after DeepSeek V4's release, the open-source community unveiled Reasonix, a tool specifically designed to minimize API costs by maximizing cache efficiency. It achieves a staggering 99.82% cache hit rate, reducing a $61 bill for 400M+ tokens to just $12.

Reasonix is a dedicated coding harness for DeepSeek, focusing on cost reduction.
Its cache-first loop, tool-call repair, and automatic context compression maintain over 90% cache hit rate in long sessions.

DeepSeek makes its 75 percent discount permanent, pricing output tokens at least 34x below GPT-5.5

2026-05-23 17:10 UTC

DeepSeek is making the 75 percent discount on its top model, V4-Pro, permanent. At $0.435 per million input tokens, it is at least 11.5 times cheaper than GPT-5.5 on input and over 34 times cheaper on output. For token-hungry agentic systems, this kind of pricing could squeeze Western providers hard.

DeepSeek's 75% discount on V4-Pro is now permanent.
Input token price is $0.435 per million, 11.5x cheaper than GPT-5.5.
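
The pricing claims can be sanity-checked with simple arithmetic. This is a rough sketch, not official pricing: the GPT-5.5 prices of $5 and $30 per million input and output tokens are an assumption taken from another item in this feed, and the V4-Pro output price is not stated here, so only an upper bound implied by the "34x" claim is derived.

```python
DISCOUNT = 0.75           # permanent discount on V4-Pro (from the summary)
V4_PRO_INPUT = 0.435      # USD per million input tokens, after discount

# Assumption: GPT-5.5 list prices as quoted in another item in this feed.
GPT55_INPUT = 5.00        # USD per million input tokens
GPT55_OUTPUT = 30.00      # USD per million output tokens

# Price before the 75% discount: the discounted price is 25% of it.
pre_discount_input = V4_PRO_INPUT / (1 - DISCOUNT)   # 1.74

# How many times cheaper V4-Pro input is than GPT-5.5 input.
input_ratio = GPT55_INPUT / V4_PRO_INPUT              # about 11.5

# "Over 34 times cheaper on output" implies an output price below this bound.
max_v4_pro_output = GPT55_OUTPUT / 34                 # about 0.88
```

Under these assumptions the input ratio lands at about 11.5, matching the headline figure, and the output claim implies a V4-Pro output price below roughly $0.88 per million tokens.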

Alibaba's latest AI model ran autonomously for 35 hours to optimize code for its own custom chip

2026-05-23 10:17 UTC

Alibaba's Qwen team releases Qwen3.7-Max, a proprietary model built for long-running autonomous agent tasks. It matches Claude Opus 4.6 on benchmarks and beats Chinese rivals like DeepSeek V4 Pro and Kimi K2.6. The team also demos the model steering a four-legged robot.

Qwen3.7-Max designed for long-running autonomous tasks
Matches Claude Opus 4.6, beats Chinese rivals
