AI News HubLIVE
站内改写

Anthropic ships Claude Opus 4.8 as a "modest but tangible improvement" that tops GPT-5.5 in most benchmarks

Anthropic releases Claude Opus 4.8, which beats GPT-5.5 and Gemini 3.1 Pro in most benchmarks. The model also catches its own coding errors four times more often than its predecessor. Alongside the launch, Anthropic is rolling out dynamic workflows that can spin up hundreds of parallel sub-agents to handle tasks like codebase-wide migrations.

Article intelligence

EngineersAdvanced

Key points

  • Claude Opus 4.8 outperforms GPT-5.5 and Gemini 3.1 Pro in most benchmarks.
  • The model catches its own coding errors four times more often than its predecessor.
  • Anthropic introduces dynamic workflows supporting hundreds of parallel sub-agents.

Why it matters

This matters because claude Opus 4.8 outperforms GPT-5.5 and Gemini 3.1 Pro in most benchmarks.

Technical impact

May affect model selection, inference cost, product capability, and evaluation benchmarks.

Anthropic releases Claude Opus 4.8, which beats GPT-5.5 and Gemini 3.1 Pro in most benchmarks. The model also catches its own coding errors four times more often than its predecessor. Alongside the launch, Anthropic is rolling out dynamic workflows that can spin up hundreds of parallel sub-agents to handle tasks like codebase-wide migrations.

The article Anthropic ships Claude Opus 4.8 as a "modest but tangible improvement" that tops GPT-5.5 in most benchmarks appeared first on The Decoder.