General AI company VAST has announced nearly $200 million in new funding and publicly unveiled its world model roadmap, Project Eden. The approach decouples state prediction from visual rendering, enabling persistent environments, scene reuse, and native multi-player interaction, targeting both creators and embodied AI research.
Secured nearly $200 million in Series A+ and A++ funding
Unveiled world model roadmap Project Eden with state-rendering decoupling
Agnes AI, a top-10 AI lab, has made the APIs for its core text, image, and video models free indefinitely. The models covered are Agnes-2.0-Flash for text, Agnes-Image-2.0-Flash for images, and Agnes-Video-2.0 for video. The move aims to lower barriers for developers and creators, enabling high-frequency testing and experimentation.
Agnes AI (Top 10 AI Lab) releases free APIs for text, image, and video models.
Models include Agnes-2.0-Flash (text), Agnes-Image-2.0-Flash (image), Agnes-Video-2.0 (video).
OpenAI is aggressively hiring robotics engineers, posting four core roles in electrical engineering, simulation, actuator design, and control systems software, with annual base salaries reaching up to $310,000 plus equity. The move signals a renewed push into embodied AI, reviving efforts that were shelved in 2020. The company has also recruited prominent researchers, including several Chinese scientists, to advance its robotics agenda.
OpenAI re-enters robotics with four key engineering positions: electrical, simulation, actuator design, and control systems software. Salaries range from $210,000 to $310,000 annually.
The company previously developed the Dactyl robotic hand (2017-2019) but disbanded the team in 2020 due to data scarcity and a shift to large language models.
At the 2026 China AIGC Industry Summit, Wang Xiaoye, Technical Director of AWS Product Technology, pointed out that 87% of enterprises claim to have deployed AI at scale, but only 10% have gained actual value. He emphasized the huge gap between personal and enterprise-level agent deployment, and proposed that enterprises need to focus on five layers: compute, models, data & knowledge, agentic platform, and applications. He also noted that token costs are often high because too much useless information is fed to the model.
87% of enterprises have deployed AI, but only 10% see value
Personal and enterprise agent deployment are fundamentally different
Deep Principle's new materials foundation model MPA (Materials Property Axiom) leverages LLM-inspired three-stage training to achieve state-of-the-art results on 40 real-world industrial datasets. By incorporating physics-guided alignment during mid-training and a hybrid readout head, MPA excels at predicting properties of unseen structures, marking a significant advance in AI for science.
MPA uses pre-training, mid-training, and fine-tuning with physics-guided alignment.
Hybrid Readout head handles both size-dependent and size-independent properties.
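One common way to handle both kinds of property in a single readout, sketched here as an assumption rather than as MPA's actual design, is to sum per-atom contributions for size-dependent targets and average them for size-independent ones. The function name `hybrid_readout` and the `extensive` flag are hypothetical.

```python
from statistics import fmean


def hybrid_readout(atom_contributions, extensive):
    """Pool per-atom scalar contributions into one structure-level value.

    Hypothetical sketch: sum pooling for size-dependent properties,
    mean pooling for size-independent ones.
    """
    if not atom_contributions:
        raise ValueError("structure has no atoms")
    if extensive:
        return sum(atom_contributions)
    return fmean(atom_contributions)


cell = [0.5, 1.5, 1.0]
supercell = cell * 2  # same material, twice as many atoms

# Sum pooling grows with system size; mean pooling does not.
total_small = hybrid_readout(cell, extensive=True)
total_large = hybrid_readout(supercell, extensive=True)
mean_small = hybrid_readout(cell, extensive=False)
mean_large = hybrid_readout(supercell, extensive=False)
```

Doubling the structure doubles the summed value and leaves the averaged value unchanged, which is the behavior a size-aware readout needs to reproduce.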
Jiaming Song, known as the father of DDIM (Denoising Diffusion Implicit Models), announced his departure from Luma AI on LinkedIn. DDIM dramatically accelerated the sampling process of diffusion models, laying the foundation for tools like Stable Diffusion, DALL-E, and Midjourney. During his nearly three years at Luma AI, he led key technical transitions from 3D generation to video generation and finally to multimodal foundation models. His next move remains undisclosed.
Jiaming Song, creator of DDIM, left Luma AI after nearly three years
DDIM enabled fast sampling in diffusion models, powering mainstream AI image generators
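The speedup comes from DDIM's deterministic update, which lets a sampler jump between non-adjacent timesteps. Below is a minimal scalar sketch of one such step (the eta = 0 case). Here `alpha_bar_t` and `alpha_bar_prev` are cumulative noise-schedule products, and `eps_pred` stands in for a trained network's noise prediction, which is assumed rather than implemented.

```python
import math


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) on a scalar sample."""
    # Estimate the clean sample implied by the current noise prediction.
    x0_pred = (x_t - math.sqrt(1.0 - alpha_bar_t) * eps_pred) / math.sqrt(alpha_bar_t)
    # Re-noise that estimate to the (possibly much earlier) target timestep.
    return math.sqrt(alpha_bar_prev) * x0_pred + math.sqrt(1.0 - alpha_bar_prev) * eps_pred


# Toy check: build x_t from a known x0 and noise, then step to two targets.
x0, eps = 2.0, -0.7
alpha_bar_t = 0.3
x_t = math.sqrt(alpha_bar_t) * x0 + math.sqrt(1.0 - alpha_bar_t) * eps
recovered = ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev=1.0)
partial = ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev=0.8)
```

With an exact noise prediction a single jump recovers `x0`. Real samplers still take tens of steps because the network's prediction is only approximate.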
Fudan University and Tongyi Lab jointly introduce ToolCUA, a Computer Use Agent designed to master hybrid GUI-Tool action spaces. It achieves 46.85% accuracy on OSWorld-MCP, surpassing Claude-4-Sonnet, through a two-stage training pipeline that teaches agents when to use GUI actions and when to call tools.
Hybrid GUI-Tool action spaces cause path confusion, reducing accuracy even with strong models
ToolCUA uses two-stage training: synthetic data generation for interleaved trajectories, then online RL with tool-efficient rewards
NVIDIA teases a new PC era with its upcoming N1X chip, which pairs an ARM CPU with a Blackwell GPU and targets AI-native laptops. The chip features 20 CPU cores, 6144 CUDA cores, and 128GB of unified memory, but gaming performance is limited.
NVIDIA posted a teaser with coordinates pointing to Computex Taipei, hinting at a new PC era with its N1X chip.
The N1X chip, developed with MediaTek on TSMC N3B, includes a 20-core ARM CPU and a Blackwell GPU with 6144 CUDA cores, the same core count as the RTX 5070.
At the 2026 China AIGC Industry Summit, Wang Xiaoye, Technical Director of Amazon Web Services, pointed out that 87% of enterprises claim to have deployed AI at scale, but only 10% have gained real production value. He emphasized that enterprise-grade Agent deployment must bridge four major gaps: model selection, construction complexity, usage threshold, and talent shortage. He introduced AWS's five-layer architecture—compute, model, data, harness platform, and agent applications—and products like Quick to help enterprises move from demo to production.
87% of enterprises deploy AI, but only 10% gain production value.
Enterprise-grade agents differ vastly from personal ones, requiring solutions for security, stability, and trust.
The τ0-World Model (τ0-WM), a 5B-parameter open-source embodied world model, is pre-trained on nearly 30,000 hours of data, including 17,800 hours of real-world teleoperation data. It incorporates test-time computation to let robots simulate and evaluate multiple action sequences before execution, achieving state-of-the-art results on long-horizon manipulation tasks.
τ0-WM is the largest open-source pre-trained embodied world model with 5B parameters and ~30,000 hours of training data.
Real robot teleoperation data (17,800 hours) dominates the pre-training, a first in the field.
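The test-time computation described above can be pictured as planning in imagination: roll candidate action sequences through the world model, score the imagined outcomes, and execute only the best one. The sketch below is a toy illustration under stated assumptions. `toy_world_model` is a hypothetical 1-D stand-in for a learned model, and exhaustive enumeration replaces whatever candidate sampling τ0-WM actually uses.

```python
from itertools import product


def toy_world_model(state, action):
    """Hypothetical stand-in for a learned world model: 1-D position update."""
    return state + action


def rollout(state, actions):
    """Imagine the final state reached by applying actions in order."""
    for action in actions:
        state = toy_world_model(state, action)
    return state


def plan(state, goal, action_set=(-1, 0, 1), horizon=3):
    """Score every candidate sequence in imagination; return the best one."""
    candidates = product(action_set, repeat=horizon)
    return min(candidates, key=lambda seq: abs(rollout(state, seq) - goal))


best = plan(state=0, goal=2)
```

Only the selected sequence would be sent to the robot. The rejected candidates cost compute but no physical trial and error.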
Professor Huang Chao from the University of Hong Kong proposes rebuilding digital infrastructure for the Agent era: instead of forcing AI to mimic human interfaces, make software speak AI's native language (CLI). His team's lightweight open-source Agent nanobot has surpassed 200,000 downloads, and innovations like CLI-Anything demonstrate a paradigm shift toward AI-native computer use.
Huang argues for redesigning the digital world to optimize for Agents rather than forcing Agents to adapt to human tools.
Open-sourced nanobot saw 100 days of daily updates and over 200,000 downloads.
MiniMax, an AI startup focusing on multimodal models, went public on the Hong Kong Stock Exchange in January 2026. The company follows a dual strategy of large models plus applications and ToC plus ToB. Internally, it provides unlimited tokens to all employees, uses agents to automate workflows, and targets high-value tasks that humans dislike, which it says has significantly improved efficiency and flattened the organization. The company expects AI to integrate deeply with various industries over the next 2-3 years.
MiniMax has been committed to next-generation AI since its founding, advocating 'Intelligence with Everyone' and dual driving of models/applications and ToC/ToB.
Internal practices: unlimited tokens for all, agent-assisted HR and coding, flatter organization, and 30% R&D efficiency boost.
Yi Tay, a research scientist at Google DeepMind, led the team that helped Gemini Deep Think win a gold medal at the International Mathematical Olympiad. But beyond AI, he is also an accomplished pianist who once dreamed of a career in music. This article explores his journey in AI research and his musical talent.
Yi Tay is a Google DeepMind research scientist and key contributor to Gemini Deep Think.
He led the team that earned Gemini a gold medal at the IMO, and also contributed to physics and chemistry Olympiads.
Gamma-World, developed by NVIDIA and Tsinghua University, addresses multi-agent world modeling with symmetric identity encoding via simplex rotary encoding and efficient communication via sparse hub attention, enabling zero-shot generalization to more agents and transfer to real-world robot scenarios.
Simplex Rotary Agent Encoding ensures symmetric and equal representation of agents.
Sparse Hub Attention reduces cross-agent communication complexity from quadratic to linear.
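The quadratic-to-linear claim can be illustrated by counting communication links. The sketch assumes a hub topology in which agents exchange information only through shared hub tokens. That is one plausible reading of "sparse hub attention", not a confirmed description of Gamma-World's mechanism, and both function names are hypothetical.

```python
def full_attention_pairs(num_agents):
    """Directed agent-to-agent links when every agent attends to every other."""
    return num_agents * (num_agents - 1)


def hub_attention_pairs(num_agents, num_hubs=1):
    """Directed links when agents only write to and read from shared hubs."""
    return 2 * num_agents * num_hubs


counts = {n: (full_attention_pairs(n), hub_attention_pairs(n)) for n in (2, 4, 8, 16)}
```

At two agents the hub route is no cheaper, but the gap widens quickly as agents are added, since one count grows with the square of the agent count and the other grows linearly.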
NVIDIA, in collaboration with Tsinghua University, the University of Toronto, and Vector Institute, introduces Gamma-World, a multi-agent world model that addresses three fundamental challenges: symmetric agent representation, efficient cross-agent communication, and real-time generation. Using simplex rotary agent encoding, sparse hub attention, and a three-stage distillation pipeline, Gamma-World achieves zero-shot generalization from two-player training data to four-player scenarios and can be applied to real-world dual-arm robot coordination.
Simplex Rotary Agent Encoding represents agents equidistantly, preserving permutation symmetry and enabling flexible scaling to any number of agents.
Sparse Hub Attention reduces cross-agent computation from quadratic to linear complexity, enabling real-time inference at 24 FPS.
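Equidistant agent representation can be illustrated with the vertices of a regular simplex, where every pair of points is the same distance apart and no agent is geometrically privileged. The sketch covers only that geometric property and uses the standard basis vectors as simplex vertices. It does not reproduce the rotary part of the encoding, and the helper names are hypothetical.

```python
import math
from itertools import combinations


def simplex_vertices(num_agents):
    """Vertices of a regular simplex: the standard basis of R^num_agents."""
    return [[1.0 if i == j else 0.0 for j in range(num_agents)]
            for i in range(num_agents)]


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


distances = pairwise_distances(simplex_vertices(4))
```

Because all pairwise distances are equal, relabeling the agents leaves the geometry unchanged, and adding an agent only adds one more vertex.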
BYD unveiled its first self-developed 4nm automotive-grade smart driving chip, Xuanji A3, achieving over 2100 TOPS with three chips combined. The dedicated NPU architecture offers 20% lower power per unit and 100% higher compute utilization compared to general-purpose GPUs. BYD also promises full compensation for accidents during city navigation.
BYD unveils fully self-developed 4nm smart driving chip Xuanji A3
Dedicated NPU delivers 20% lower power consumption and 100% higher compute utilization than general-purpose GPUs
LightSail Technology announced a strategic partnership with Tencent Travel Services to integrate its AI full-sensing wearable device into the mobility platform. The device previously topped JD.com's bestseller list and sold out; now a new pre-sale round is open with discounts.
LightSail Technology and Tencent Travel Services partner to integrate AI wearable into travel services.
The LightSail AI wearable topped JD.com's bestseller list for 8 consecutive days and sold out.
PPIO has been named to the '2026 Global AI 100' list by FeiFan Research, recognized at the FeiFan Awards – Annual AI Globalization Summit. The list honors AI-native companies with global vision. PPIO offers a global distributed computing infrastructure, full-stack cloud services, a model platform supporting DeepSeek, GLM, MiniMax, Kimi, Qwen, and an innovative Agent Sandbox. As of April 2026, PPIO has integrated over 4,800 distributed nodes, with daily token calls exceeding 1 trillion, over 570,000 developers, and Agent Sandbox business growing more than 50x since launch. PPIO was also designated as a pilot unit for Shanghai's Digital Overseas Service Platform and a GDA Pilot Service Station.
PPIO selected for '2026 Global AI 100', highlighting its leadership in AI globalization.
Provides global distributed computing infrastructure with full GPU coverage for training and inference.
From May 25 to 29, ModelBest and the OpenBMB community jointly ran an 'On-Device LLM Open Source Week', releasing five technologies that together form a full-stack closed loop: BitCPM-CANN (a 1.58-bit low-bit training model with Ascend support), MiniCPM5-1B (which outperforms models twice its size), ForgeTrain (an AI-written training framework 10% faster than Megatron), PilotDeck (an agent operating system), and UltraData (a core dataset). The releases frame on-device AI competition as a systems engineering challenge rather than a single-technology race. MiniCPM5-1B reportedly surpasses GPT-4o in some areas, which ModelBest presents as validation of the 'density law.' With a two-year lead and a deep tech stack, ModelBest is positioned as a key player in the shift from cloud to edge.
ModelBest held an On-Device LLM Open Source Week from May 25-29, 2026, releasing one key technology each day.
The five releases cover training framework, model compression, data, and agent OS, showcasing systemic innovation.
Lenovo launches the world's first commercial AI host series, designed for one-person companies (OPCs) and growing enterprises. By combining local and cloud inference in a hybrid architecture, it addresses high token costs and data security concerns, and it ships with generous token bonuses and an out-of-the-box experience.
Lenovo unveils three AI hosts: mini 100, 300, and Pro 700, catering from individuals to teams.
Local inference plus cloud elasticity reduces token costs by 70%-95%.
The next wave of AI creation is hitting gaming. Tencent has unveiled 'Project Craft', an AI-powered game creation platform that lets users generate playable games through natural language, supports 2D and 3D, and comes with AIGC tools and free assets to slash the barrier to game development.
Tencent launches 'Project Craft', an AI game creation platform that generates playable games from natural language prompts
Supports both 2D and 3D games, with a full AIGC pipeline and over 20,000 free assets
Tencent has released Miora, an AI-powered creative studio that integrates image, video, UI/UX, and 3D generation. It features a memory system, multi-modal canvas, and customizable Skills, aiming to enable one person to have a whole creative studio.
Tencent launches Miora, a creative AI agent studio
Supports generation of images, videos, UI/UX, and 3D content
A startup founded by Tsinghua University alumni, Shishi Technology, has developed a proprietary parallel optimization technology that integrates heterogeneous computing resources and inference optimization engines, reducing per-token cost by 40%. The company aims to build a domestic token optimization factory to lower the barrier for AI deployment.
Founded in 2021 by the core team of the National Supercomputing Center in Wuxi; founder Yan Bowen completed postdoctoral research at Tsinghua.
A unified heterogeneous computing pool supports NVIDIA GPUs and various domestic AI chips, turning idle resources into usable compute.
Anthropic released Claude Opus 4.8, showing improvements in terminal engineering and knowledge work, outperforming Mythos in certain benchmarks. The model features enhanced honesty and a new Dynamic Workflows capability that orchestrates hundreds of parallel sub-agents. Early testers report significant gains in code quality and task reliability.
Claude Opus 4.8 was released just 43 days after 4.7, with notable gains in coding and knowledge tasks
Dynamic Workflows: Claude generates JavaScript orchestration scripts to coordinate hundreds of parallel sub-agents
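The fan-out pattern behind this kind of orchestration can be sketched as follows. The source says Claude generates JavaScript scripts. This sketch uses Python's asyncio purely for illustration, and `sub_agent`, `orchestrate`, and the `max_parallel` cap are hypothetical names that do not reflect Anthropic's actual API.

```python
import asyncio


async def sub_agent(task_id):
    """Hypothetical sub-agent: yields control once, then returns a result."""
    await asyncio.sleep(0)
    return f"task-{task_id}: done"


async def orchestrate(num_tasks, max_parallel=50):
    """Run sub-agents concurrently, capped by a semaphore."""
    limit = asyncio.Semaphore(max_parallel)

    async def guarded(task_id):
        async with limit:
            return await sub_agent(task_id)

    # gather preserves the order of its arguments in the result list.
    return await asyncio.gather(*(guarded(i) for i in range(num_tasks)))


results = asyncio.run(orchestrate(200))
```

Capping concurrency with a semaphore is a design choice of this sketch. It keeps hundreds of sub-tasks from overwhelming a rate-limited backend while still collecting results in submission order.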
Jijia Vision unveiled what it calls the world's first physical AGI 'Dual Pyramid' system and launched the home robot Shiguang S1, which has secured orders for 100 units from households. The company is targeting the 'GPT-3 moment' of physical AGI within 12 months.
Jijia Vision introduces the 'Dual Pyramid' system comprising a data pyramid and an algorithm pyramid for physical AGI.
The Shiguang S1 home robot adopts a wheeled-arm configuration and has secured orders for 100 units from real households.
Shagang Steel and DingTalk have entered a strategic partnership to deploy Wukong AI across the enterprise, aiming to transform AI capabilities into tangible value in the steel industry.
Shagang partners with DingTalk to integrate AI into steel manufacturing
Wukong AI serves as the core engine for a unified collaboration platform
Axiom Math, founded by Chinese entrepreneur Hong Letong, who was born after 2000, has had 5 of its 8 AI-generated math papers accepted by peer-reviewed journals. The company raised $2 billion in March at a $16 billion valuation.
Five of eight math papers generated by Axiom Math's AI system, AxiomProver, have been accepted by academic journals.
Founder Hong Letong dropped out of Stanford to start the company, which secured $2 billion in funding and is valued at $16 billion.
The LeapQuest team at Shanghai Innovation Institute, in collaboration with multiple universities, introduces a new medical AI paradigm in which models actively use visual tools during reasoning, shifting from passive input receivers to active evidence seekers. Two papers have been accepted at ICML 2026.
LeapQuest proposes Ophiuchus and MedScope for medical images and videos, adopting the Think with Images/Videos paradigm.
Ophiuchus-7B achieves an average score of 68.0 on 8 VQA benchmarks, surpassing o3 (62.2) and GPT-5 (59.9).
At the 2026 China AIGC Industry Summit, Baidu's Miaoda product director Zhu Guangxiang shared how AI has lowered the barrier to programming from writing code to chatting. According to Zhu, 87% of Miaoda users cannot code, an 8-year-old has built an OS with it, and one-person companies (OPCs) have landed million-dollar contracts. Vibe Coding turns the demand side into the supply side, enabling mass entrepreneurship.
Fourth programming revolution: natural language programming, massively expanding creators
87% of Miaoda users have no coding skills; OPCs are the largest user group (16% entrepreneurs)
Liang Dai, a 2021 Sloan Research Fellowship recipient and former assistant professor at UC Berkeley, has joined Fudan University as a full professor in the Department of Physics and the Center for Astronomy and Astrophysics. Fudan has recently recruited several top talents, including Hao Su, Feng Yuan, and Suoqing Ji.
Liang Dai (2021 Sloan Fellow) joins Fudan University full-time
Former assistant professor at UC Berkeley, alumnus of Peking University Physics