DeepSeek V4 Gets Even Cheaper: New Tool Boasts 99.82% Cache Hit Rate, Slashes Bills to 20%
One month after DeepSeek V4's release, the open-source community unveiled Reasonix, a tool specifically designed to minimize API costs by maximizing cache efficiency. It achieves a staggering 99.82% cache hit rate, reducing a $61 bill for 400M+ tokens to just $12.
Article intelligence
Key points
- Reasonix is a dedicated coding harness for DeepSeek, focusing on cost reduction.
- Its cache-first loop, tool-call repair, and automatic context compression maintain over 90% cache hit rate in long sessions.
- Easy to set up with a simple command, it prioritizes cheaper models and auto-switches to Pro when needed.
- The community is excited, though some question the need for a specialized DeepSeek agent.
Why it matters
This matters because reasonix is a dedicated coding harness for DeepSeek, focusing on cost reduction.
Technical impact
May affect model selection, inference cost, product capability, and evaluation benchmarks.
DeepSeek V4, the latest model from DeepSeek, has been making waves with its aggressive pricing strategy. But one month after its release, the open-source community has found a way to make it even cheaper. A new tool called Reasonix, specifically designed for DeepSeek, claims to achieve a cache hit rate of up to 99.82%, slashing API costs to just 20% of the original.
For context, a typical long session using DeepSeek V4 might involve over 400 million tokens and cost $61. With Reasonix, the same session can be completed for as little as $12. The tool has quickly gained popularity on GitHub, with users praising its efficiency.
Reasonix is built around a cache-first loop and an append-only run cycle that leverages DeepSeek's byte-precise prefix-cache design. The key insight is that most AI agent loops tend to reorder, rewrite, or inject timestamps with each interaction, breaking the cache. Reasonix fixes this by dividing the context into three zones: fixed prefixes that are computed only once per session, history that is appended but never rewritten, and a draft area where tool calls are refined before being logged.
Another critical feature is tool-call repair. DeepSeek occasionally struggles with tool calls that are generated but not included in the final message, malformed JSON parameters, repeated calls with identical arguments, or truncated JSON. Reasonix attempts to fix these issues through a four-round repair process before execution.
Cost control is further enhanced by defaulting to the cheaper V4 Flash model for most tasks, only switching to the more powerful V4 Pro for difficult problems. The tool automatically compresses context at the end of each round, and users can manually trigger a model upgrade with the "/pro" command. If a task fails repeatedly, the tool automatically escalates to a higher model for the remainder of the round.
Installation is straightforward: navigate to the project directory and run "npx reasonix code" to start a TUI session. A desktop version is also available. However, the developers emphasize that Reasonix is not a general-purpose tool; it is tightly coupled with DeepSeek's features and will not support other APIs.
The community response has been overwhelmingly positive, with hundreds of discussions. While some users question whether a dedicated DeepSeek agent is necessary, early adopters report significant savings. For example, one user built a micro-bridge that achieves over 95% cache hit rate with DeepSeek V4 Pro in Codex, merely by adjusting the API format. Nonetheless, harness quality varies: some note that using DeepSeek V4 within Claude Code is more cost-effective than in OpenCode.
Whether you choose Reasonix, a custom bridge, or another solution, the message is clear: with the right approach, DeepSeek V4 can be remarkably affordable. Share your experiences in the comments!