2026-06-01 05:32 UTC | In-site rewrite | 2 min read | Updated: 2026-06-30 13:03 UTC

Tokens Are Expensive Because You Feed the Model Too Much Junk | AWS's Wang Xiaoye @ AIGC2026

At the 2026 China AIGC Industry Summit, Wang Xiaoye, Technical Director of AWS Product Technology, pointed out that 87% of enterprises claim to have deployed AI at scale, but only 10% have gained actual value. He emphasized the huge gap between personal and enterprise-level agent deployment, and proposed that enterprises need to focus on five layers: compute, models, data & knowledge, agentic platform, and applications. He also noted that token costs are often high because too much useless information is fed to the model.

Source: QbitAI (量子位) | Author: 梦晨

At the 2026 China AIGC Industry Summit hosted by QbitAI, Wang Xiaoye, Technical Director of Amazon Web Services (AWS) Product Technology, delivered a keynote on bridging the gap between AI demos and production-grade enterprise agents. He presented striking data: while 87% of enterprises report large-scale AI deployment, only 10% have actually derived business value from it. According to Wang, running a fun agent on a Mac Mini is completely different from deploying thousands of agents securely, reliably, and continuously in a distributed enterprise environment.

Wang identified four major gaps enterprises face when adopting agents: model selection and response speed, construction complexity, usability for non-technical staff, and a shortage of talent able to integrate agents end to end. He emphasized that AI is not just about large language models: the "harness" (the operational, control, and production-grade capabilities around the model) is equally important. He compared a model to a CPU: no one hands users a bare motherboard; they need the full system, with an operating system, software, and usability built in.

The core of AWS's solution is a five-layer architecture for enterprise agent deployment. The first layer is AI compute, where AWS custom chips such as Graviton and Trainium aim to deliver better price-performance. The second layer is models, with Amazon Bedrock offering a wide selection of models while protecting enterprise data. The third layer is data and knowledge: traditional data platforms were built to serve humans, whereas AI agents need AI-ready data platforms with features such as memory sharing and isolation, lifecycle management, and token efficiency. Wang noted that high token costs often come from feeding the model too much irrelevant information rather than from the per-token price itself. The fourth layer is the agentic platform, Amazon Bedrock AgentCore, which provides runtime, memory, code interpreter, identity, gateway, policy, evaluation, and observability modules. The fifth layer is agent applications such as Amazon Q, a personalized working agent that acts as a proactive assistant and integrates with email, chat, and CRM tools.
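Wang's point about token cost can be illustrated with a little arithmetic. The sketch below is not from the talk: the per-token price, the word-count token estimate, and the keyword-overlap filter are all assumptions chosen only to show that input cost scales with how much context is sent, so dropping irrelevant chunks lowers the bill at the same unit price.

```python
# Hypothetical price for illustration only; not a real Bedrock rate.
PRICE_PER_1K_INPUT_TOKENS = 0.003


def count_tokens(text: str) -> int:
    """Crude token estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def relevance(query: str, chunk: str) -> int:
    """Number of distinct query words that also appear in the chunk."""
    query_words = set(query.lower().split())
    chunk_words = set(chunk.lower().split())
    return len(query_words & chunk_words)


def select_context(query: str, chunks: list[str], min_overlap: int = 2) -> list[str]:
    """Keep only chunks sharing at least `min_overlap` words with the query."""
    return [c for c in chunks if relevance(query, c) >= min_overlap]


def prompt_cost(query: str, chunks: list[str]) -> float:
    """Input cost of sending the query plus the given chunks."""
    total_tokens = count_tokens(query) + sum(count_tokens(c) for c in chunks)
    return total_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS


query = "refund policy for enterprise customers"
chunks = [
    "The refund policy for enterprise customers allows returns within 30 days.",
    "Our office cafeteria menu changes every Monday and Thursday.",
    "Enterprise customers can request a refund through the billing portal.",
    "The company picnic was moved to the second week of July.",
]

# Filter out chunks unrelated to the query before building the prompt.
selected = select_context(query, chunks)

print(f"all chunks:      {prompt_cost(query, chunks):.6f}")
print(f"selected chunks: {prompt_cost(query, selected):.6f}")
```

A production system would use a real tokenizer and a proper retrieval or ranking step instead of word overlap; the relationship between context size and cost stays the same.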

Wang also announced AWS's collaboration with OpenAI to offer managed agents and advanced models on Bedrock, giving enterprises a choice between flexible open frameworks and ready-to-use managed solutions. He concluded that every application will be reinvented, and that AWS aims to accelerate this process by providing better models, trusted data, and production-grade platforms. The talk drew on real customer examples, such as Zixun, which used AgentCore to spend less effort on infrastructure planning and focus on business value.