Updated: 2026-06-08 | 6 min read

DeepSeek Made AI Cheap. Now It Needs Billions to Keep It Cheap

DeepSeek's low-cost AI models reshaped the industry, but a U.S. government evaluation estimates that its latest model trails the frontier by about eight months. Now the company is reportedly raising billions to fund its next phase, facing a capital race it helped create.

Source: Hacker News AI | Author: zacfire

Zac

Jun 04, 2026

DeepSeek’s story used to be easy to tell.

A small Chinese AI lab, backed by the profits of a quantitative trading firm rather than Silicon Valley-style venture capital, released open models that seemed to perform far above what their resource base should have allowed. It made frontier AI feel less like a closed priesthood and more like an engineering problem. It made intelligence cheaper. It made open source feel dangerous again.

That story is no longer enough.

On May 1, the U.S. government’s Center for AI Standards and Innovation, or CAISI, published an evaluation of DeepSeek V4 Pro. The conclusion cut in two directions. CAISI said V4 Pro was the most capable Chinese model it had evaluated so far. It also estimated that the model lagged the leading U.S. frontier by about eight months.

That sounds like a demotion.

But the same evaluation also made a more commercially uncomfortable point: DeepSeek V4 Pro was often cheaper than a U.S. reference model at a similar capability level. In other words, DeepSeek may not have caught the frontier, but it may still be changing the price of being near it.

Then came the financing reports.

Reuters reported in early May that DeepSeek could be valued at up to $50 billion in its first external funding round. On June 3, the South China Morning Post reported that DeepSeek was finalizing a round of more than 50 billion yuan, or roughly $7.4 billion, at a valuation just under $60 billion. A Chinese market report put the high end of the valuation at $59 billion. Axios, citing Bloomberg, reported a similar $7.4 billion raise, but at around a $52 billion valuation. The numbers do not match perfectly. DeepSeek has not confirmed the transaction. The investor list, valuation basis, and final terms may still change.

Still, the direction is hard to ignore.

The company that made AI look cheap may now need billions of dollars to keep doing it.

That is the real DeepSeek story now. Not “China has caught OpenAI.” Not “DeepSeek is just another overvalued AI startup.” The more interesting question is this:

Can a research-led, open-weight, low-cost AI lab survive the capital race it helped create?

The Wrong Scoreboard

Most English-language coverage wants to turn DeepSeek into a scoreboard.

China versus America. Open source versus closed source. Cheap models versus expensive models. Export controls versus algorithmic efficiency. These frames are not useless. They are just too flat.

The CAISI evaluation is useful precisely because it makes the scoreboard harder to read.

If you only care about the absolute frontier, the result is clear enough: DeepSeek V4 Pro is not the best model in the world. CAISI’s benchmark suite, which included non-public or held-out tests, placed it closer to an earlier U.S. frontier tier than to the newest top-end systems. That matters. DeepSeek’s own public benchmark comparisons made V4 Pro look closer to the latest Opus and GPT models, but independent evaluation suggests the gap is real.

The mistake is to stop there.

Most users and companies do not buy the absolute best intelligence available. They buy enough intelligence at a usable price, in a form that fits their workflow. That is especially true for agentic workflows, coding assistants, document processing, routing systems, long-context retrieval, and high-volume API calls. When a task consumes many tokens, a slightly weaker model can become the better business decision if it is cheap, open, and good enough.

That is why “eight months behind” can still be commercially dangerous.

The frontier is not a single line. It is a stack of tradeoffs: raw capability, price, context length, latency, tool use, deployment flexibility, model availability, trust, ecosystem support, and legal permission to modify or host the model yourself. DeepSeek’s advantage is not that it wins every dimension. It is that it puts pressure on several dimensions at once.

This is the part of the story global readers should watch. DeepSeek does not need to be the best model in every benchmark to change the economics of AI adoption. It only needs to make a large enough category of work feel too expensive on closed frontier models.

The Price Cut Is the Product

As of June 5, DeepSeek’s official API pricing page lists V4 Pro at $0.003625 per million cached input tokens, $0.435 per million uncached input tokens, and $0.87 per million output tokens. Those prices are lower than the developer-reported token prices CAISI used in its May 1 cost comparison, and they may change again. But the message is unmistakable: DeepSeek is trying to make high-context, agent-ready model use feel cheap enough to be routine.
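To make those list prices concrete, here is a minimal back-of-the-envelope sketch in Python. The three prices are the ones quoted above. The workload (one request that fills a 1 million token context, plus 2,000 output tokens) and the function name `request_cost` are illustrative assumptions, not anything DeepSeek publishes, and real bills depend on how much of a prompt actually hits the cache.

```python
# Listed V4 Pro prices quoted in the article, in USD per million tokens.
PRICE_PER_MILLION = {
    "cached_input": 0.003625,
    "uncached_input": 0.435,
    "output": 0.87,
}


def request_cost(cached_input: int, uncached_input: int, output: int) -> float:
    """Return the USD cost of one request, given token counts per category."""
    tokens = {
        "cached_input": cached_input,
        "uncached_input": uncached_input,
        "output": output,
    }
    return sum(tokens[k] * PRICE_PER_MILLION[k] / 1_000_000 for k in tokens)


# Hypothetical workload: fill a 1M-token context once with no cache hits,
# then send the same context again with every input token served from cache.
cold = request_cost(cached_input=0, uncached_input=1_000_000, output=2_000)
warm = request_cost(cached_input=1_000_000, uncached_input=0, output=2_000)

print(f"cold request: ${cold:.5f}")
print(f"warm request: ${warm:.5f}")
```

Under these assumptions, a cold 1 million token request costs about $0.44 and a fully cached repeat costs about half a cent. At the listed prices, cached input is billed at 1/120 of the uncached rate, which is why repeated long-context calls are where the pricing bites hardest.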

That is not a side detail. It is part of the product.

In the AI industry, price is often treated as a go-to-market lever. A company cuts prices to gain share, then raises prices once customers are dependent. DeepSeek’s pricing strategy is more interesting because it is tied to a technical identity. Since the V2 era, founder Liang Wenfeng has framed DeepSeek’s low prices not as subsidy warfare, but as the result of architecture and systems work. In a 2024 interview with 36Kr’s An Yong, later reposted by Sina Finance, Liang argued that DeepSeek’s pricing shock came from cost reduction through technical exploration rather than a deliberate plan to burn money for users.

That claim should not be accepted as a full explanation. Every AI company has strategic reasons to describe its pricing as efficient rather than subsidized. But the claim is not empty either.

V4 is not merely a cheaper API wrapper. DeepSeek’s own release describes V4 Pro as a 1.6 trillion parameter mixture-of-experts model with 49 billion active parameters, 1 million context length through official services, and stronger agentic coding and reasoning capabilities. The model is also distributed through open-weight channels under a permissive license.

The Chinese technical discussion around V4 is useful here because it is less interested in the geopolitical scoreboard and more interested in the engineering mechanism. In a LateTalk discussion shortly after V4’s release, AI engineers described V4 as an “infra whale”: not another R1-style paradigm shock, but a system-level engineering push that combines new attention mechanisms, Muon optimization, FP4-related training and inference work, TileLang kernels, and long-context efficiency. One key point from that discussion is that 1 million context is not just a product bullet. It only matters if the cost of using long context becomes tolerable.

This is where DeepSeek’s work matters.

Long context, agents, and coding workflows are token-hungry. A model can advertise a huge context window, but if the cost of filling that window is too high, the feature remains theatrical. DeepSeek’s strategic move is to make long-context intelligence cheap enough that developers actually use it.

The simplest way to put it:

DeepSeek is not only selling a model. It is selling permission to spend more intelligence.

Open Weights as Distribution

DeepSeek’s open-weight strategy is often discussed as ideology. It is also distribution.

Open weights allow developers to inspect, host, modify, benchmark, fine-tune, route, and integrate the model in ways a closed API cannot easily match. They also move part of the integration burden from the company to the ecosystem. When a model is open and useful, inference frameworks, cloud platforms, local deployment tools, coding-agent wrappers, and model routers all have incentives to support it.

This has real commercial value even if DeepSeek does not capture every dollar of it.

That is the paradox. Open source spreads the model faster than a closed commercial product might. It also allows others to monetize the model without paying DeepSeek directly. Cloud platforms can host DeepSeek. Developers can deploy DeepSeek. Enterprises can run DeepSeek inside their own stack. The model can become infrastructure without the original lab owning every customer relationship.

For a normal startup, that looks like leakage.

For DeepSeek, it may be part of the strategy. Liang has argued that closed-source secrecy is not the company’s real moat. The moat, in his telling, is the team’s accumulated know-how, innovation culture, and ability to keep pushing model architecture under constraint.

That is a noble answer. It is also an expensive one.

If the business model is not to extract maximum margin from every token, then the company needs another way to fund the next round of compute, talent, and experimentation. For the first phase of DeepSeek, that answer was High-Flyer, the quantitative trading firm Liang co-founded. High-Flyer’s profits provided a rare internal funding base. It allowed DeepSeek to avoid the normal startup sequence: pitch deck, venture round, growth narrative, commercialization pressure.

That independence became part of the myth.

Now the myth is meeting the next phase of the industry.

Why Cheap AI Still Needs Capital

The June fundraising reports do not mean DeepSeek’s earlier strategy was fake. They mean the game has changed.

There is a simple but underappreciated distinction here:

Making one breakthrough model under constraint is not the same as sustaining a frontier-adjacent AI lab through repeated model cycles.

DeepSeek’s R1 moment proved that a focused Chinese team could produce a globally important model with far less visible capital than the U.S. frontier labs. But the AI race in 2026 is not only a model-release competition. It is becoming an agent-infrastructure competition.

Agents change the energy requirement.

A chatbot can answer a prompt and stop. An agent may plan, call tools, write code, inspect files, browse, retry, evaluate, and recover from failure. That means more tokens, more context, more inference, more coding ability, more product feedback, and more real-world failure data. The model is still central, but the surrounding loop becomes more important.
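A rough sketch shows why the loop matters so much for token volume. It assumes a naive agent that re-reads its entire history on every step and appends a fixed number of tokens (tool output plus its own reply) each time. The function name and all the numbers are hypothetical, and real agent frameworks often trim, summarize, or cache context, so treat this as an upper-bound illustration rather than a measurement.

```python
def total_input_tokens(prompt_tokens: int, tokens_added_per_step: int, steps: int) -> int:
    """Input tokens processed when every step re-reads the whole history so far."""
    total = 0
    context = prompt_tokens
    for _ in range(steps):
        total += context                    # the model reads the full context on this step
        context += tokens_added_per_step    # tool output and model reply are appended
    return total


# Hypothetical comparison: a single chat turn versus a 20-step agent run
# that starts from the same 2,000-token prompt and adds 3,000 tokens per step.
chat = total_input_tokens(prompt_tokens=2_000, tokens_added_per_step=0, steps=1)
agent = total_input_tokens(prompt_tokens=2_000, tokens_added_per_step=3_000, steps=20)

print(chat, agent, agent // chat)
```

With these made-up numbers, the single chat turn processes 2,000 input tokens while the agent run processes 610,000, roughly 300 times more. Input volume grows with the square of the step count in this naive model, which is the mechanism behind the claim that agents raise the cost of everything around the model.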

Chinese reporting before V4 captured this pressure well. Caijing framed DeepSeek as facing a crossroads: could it maintain a research-led, low-frequency release rhythm while OpenAI, Anthropic, Google, ByteDance, Alibaba, Moonshot, Zhipu, MiniMax, and others kept accelerating model iteration, coding ability, agent products, and commercial revenue? A separate Chinese model-market essay made the same point in capital-market language: the foundation-model race is becoming an “energy pool” competition, not a one-model sprint. That was not just a media question. It was a company problem.

DeepSeek also has a talent problem, or more precisely, a talent-pricing problem.

LatePost’s organizational reporting portrays DeepSeek as one of the strangest AI labs in China: flat, research-heavy, unusually resistant to normal fundraising, with a rhythm that is almost anti-involution by Chinese technology standards. Employees were described as having no rigid clock-in culture, no obvious hard deadlines, and a rare non-overtime working rhythm in a field where 70-hour weeks are common.

That culture is part of the company’s advantage. But it exists inside a market that prices top AI researchers aggressively. Rival AI companies are raising large rounds, courting public-market narratives, and offering clearer compensation benchmarks. Internet giants can offer enormous packages. Frontier research contributors now have external options. If DeepSeek employees hold equity or options in a company that has never raised outside capital, the question “what is this company worth?” becomes very practical.

This is why the reported personal contribution from Liang Wenfeng is important if it is accurate. SCMP and Bloomberg-linked reporting both suggest Liang may contribute a large amount of his own capital to the round. The details remain unconfirmed, but the logic is clear enough: this would not be an ordinary founder follow-on check. It would be a control signal.

DeepSeek m
