AI News HubLIVE
In-site rewrite4 min read

Show HN: Smart model routing directly in Claude, Codex and Cursor

Weave Router is an open-source model router that intelligently selects the best AI model per request, supporting multiple API formats and reducing costs by 40-70%.

SourceHacker News AIAuthor: adchurch

Uh oh!

There was an error while loading. Please reload this page.

Notifications You must be signed in to change notification settings

Fork 7

Star 61

BranchesTags

Open more actions menu

Folders and files

NameName

Last commit message

Last commit date

Latest commit

History

455 Commits

455 Commits

.claude/skills

.claude/skills

.github/workflows

.github/workflows

cmd

cmd

db

db

docs

docs

frontend

frontend

install

install

internal

internal

scripts

scripts

.dockerignore

.dockerignore

.env.example

.env.example

.gitignore

.gitignore

AGENTS.md

AGENTS.md

CLAUDE.md

CLAUDE.md

CONTRIBUTING.md

CONTRIBUTING.md

Dockerfile

Dockerfile

LICENSE

LICENSE

Makefile

Makefile

README.md

README.md

docker-compose.yml

docker-compose.yml

go.mod

go.mod

go.sum

go.sum

Repository files navigation

What it does

Point Claude Code, Codex, Cursor, or your own app at localhost:8080. The router:

🎯 Routes per request. A cluster scorer derived from Avengers-Pro 2 picks the right model from your enabled providers, every turn.

🔌 Speaks everyone's API. Anthropic Messages, OpenAI Chat Completions, Gemini native. Streaming, tools, vision, the works.

🧠 Knows OSS too. DeepSeek, Kimi, GLM, Qwen, Llama, Mistral via OpenRouter (or any OpenAI-compatible endpoint).

🔒 BYOK by default. Provider keys stay on your box, encrypted at rest.

📊 Observable. OTLP traces out of the box. See your dashboard in the Weave dashboard (http://localhost:8080/ui/dashboard) or drop in Honeycomb, Datadog, Grafana, whatever.

30-second quickstart

The fastest way: point Claude Code, Codex, or opencode at the hosted Weave Router with one command. No clone, no Docker, no Postgres.

npx @workweave/router

That's it. The installer asks which tool (Claude Code, Codex, or opencode), walks you through scope (user vs. project), grabs a router key, and wires the right config file. Other flavors:

npx @workweave/router --claude # skip the picker, Claude Code npx @workweave/router --codex # skip the picker, OpenAI Codex CLI npx @workweave/router --opencode # skip the picker, opencode npx @workweave/router --scope project # per-repo, commits settings.json (or .codex/ / opencode.json) npx @workweave/router --local # self-hosted localhost:8080 npx @workweave/router --base-url https://router.acme.internal npx @workweave/[email protected] # pin a version

Requires Node ≥ 18 (Claude Code and opencode paths also need jq). Full flag reference: install/npm/README.md.

Or: self-host the whole stack

If you want the router (and dashboard) running on your own box:

1. Drop a provider key in. OpenRouter is the recommended baseline.

echo "OPENROUTER_API_KEY=sk-or-v1-..." >> .env.local

2. Boot Postgres + router on :8080 and seed an rk_ key.

make full-setup

The router is up at http://localhost:8080, the dashboard at http://localhost:8080/ui/ (password: admin), and your rk_... key prints in the logs.

Call it like Anthropic

curl -sS http://localhost:8080/v1/messages \ -H "Authorization: Bearer rk_..." \ -d '{"model":"claude-sonnet-4-5","max_tokens":256, "messages":[{"role":"user","content":"hi"}]}'

...or like OpenAI

curl -sS http://localhost:8080/v1/chat/completions \ -H "Authorization: Bearer rk_..." \ -d '{"model":"gpt-4o-mini", "messages":[{"role":"user","content":"hi"}]}'

Peek at the routing decision without proxying

curl -sS http://localhost:8080/v1/route -H "Authorization: Bearer rk_..." -d '...'

Wire it into your tools

Claude Code. Run make install-cc to wire Claude Code at the local self-hosted router (it's also invoked automatically at the end of make full-setup). For the hosted router, use npx @workweave/router above.

Codex (OpenAI CLI). npx @workweave/router --codex patches ~/.codex/config.toml (or /.codex/config.toml with --scope project) with a managed [model_providers.weave] block and sets model_provider = "weave". Codex's existing OPENAI_API_KEY flows through to api.openai.com for the plan-based passthrough; the router key rides in an X-Weave-Router-Key HTTP header. Re-install and --uninstall --codex rewrite/remove only the managed block, leaving the rest of your Codex config untouched.

opencode. npx @workweave/router --opencode merges a provider.weave entry into ~/.config/opencode/opencode.json (or /opencode.json with --scope project). It uses opencode's bundled @ai-sdk/anthropic provider pointed at the router's /v1 endpoint — the router speaks the Anthropic Messages API natively, so opencode works unmodified. The router key and identity headers ride alongside the provider config; re-install rewrites only the managed block and --uninstall --opencode strips it.

Cursor (early beta, performance may not be the best). Settings → Models → Override OpenAI Base URL → http://localhost:8080/v1, paste rk_... as the API key.

Switching on/off. After installing, npx @workweave/router off --claude (or --codex / --opencode) routes that client straight to its provider again without discarding the router config; on flips it back, and status reports which way it's pointing. Claude Code also gets /router-off, /router-on, and /router-status slash commands. Cursor toggles via the same Settings → Models override above. See install/README.md.

Two keys, don't mix them up:

sk-or-... / sk-ant-... / sk-... = your upstream provider key. Lives in .env.local.

rk_... = your router key. Clients send this as a Bearer token.

Endpoints

Endpoint Format

POST /v1/messages Anthropic Messages, routed

POST /v1/chat/completions OpenAI Chat Completions, routed

POST /v1beta/models/:action Gemini generateContent, routed

POST /v1/route Returns the decision, no upstream call

GET /v1/models  ·  POST /v1/messages/count_tokens Anthropic passthrough

GET /health  ·  GET /validate liveness + key check

Deeper docs

📐 Configuration reference: every env var, BYOK encryption, OTel knobs, cluster routing.

🛠️ Contributing: layering rules, hot-reload dev, migrations, tests, the whole engineering loop.

🏗️ Architecture: package layout, import contracts, recipes for adding endpoints / providers / strategies.

Roadmap

Token-aware rate limiting (Redis sliding window per installation)

Sub-installations for tenant hierarchies

Speculative dispatch + hedging for tail latency

Footnotes

Lu, Y., Liu, R., Yuan, J., Cui, X., Zhang, S., Liu, H., & Xing, J. RouterArena: An Open Platform for Comprehensive Comparison of LLM Routers. arXiv:2510.00202, 2025. https://arxiv.org/abs/2510.00202 ↩

Zhang, Y. et al. Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing (Avengers-Pro). arXiv:2508.12631, 2025. https://arxiv.org/abs/2508.12631 ↩

About

Model router for agentic systems. Routes every prompt to the right model in <50ms. Cut costs 40-70% with just an endpoint change.

weaverouter.com

Topics

codex

anthropic

ai-gateway

openai-compatible

claude-code

agentic-coding

model-router

Resources

Readme

License

View license

Contributing

Contributing

Uh oh!

There was an error while loading. Please reload this page.

Activity

Custom properties

Stars

61 stars

Watchers

0 watching

Forks

7 forks

Report repository

Releases

12 tags

Uh oh!

There was an error while loading. Please reload this page.

Contributors

Uh oh!

There was an error while loading. Please reload this page.

Languages

Go 83.5%

TypeScript 8.7%

Shell 5.0%

PLpgSQL 1.3%

Python 0.8%

Dockerfile 0.3%

Other 0.4%