Ollama Blog AI News Source

Public articles 13Collected articles 14Trust 84Refresh 120 min

Health HealthySource type OfficialFull-text rights Official full textLast ingested 2026-06-12ID ollama-blogStatus Enabled

Official local AI model runtime blog; confirm reuse terms before full body display.

Latest public articles

Ollama's highest performance on Apple Silicon yet with MLX

2026-06-11 00:00 UTC

Ollama's MLX engine has been updated to deliver its highest performance on Apple Silicon yet. By leaning more heavily on Apple's unified memory and the Metal-backed MLX framework, models output higher quality responses, respond faster, and use less memory. The update includes support for NVFP4 format, up to 20% faster output, and a snapshot system for agent workflows.

Ollama's MLX engine now supports NVFP4 format, halving quantization quality loss.
Output speed increased by up to 20% due to fused Metal kernels and optimized sampling.

Improved performance and model support with GGUF

2026-06-05 00:00 UTC

Ollama 0.30 is now available with improved performance and GGUF model compatibility through llama.cpp, augmenting MLX on Apple silicon and supporting more models on wider hardware.

Up to 20% faster throughput on NVIDIA GPUs
Vulkan enabled by default for AMD and Intel GPUs

NVIDIA Nemotron 3 Ultra

2026-06-04 00:00 UTC

NVIDIA Nemotron 3 Ultra is a 550 billion parameter (55B active) open model designed for long-running agentic workflows, with 1M token context and NVFP4 optimization, leading in agentic benchmarks and cost efficiency.

550B total parameters with 55B active per token, optimized for agent orchestration and coding agents.
1M token context window for entire codebases and tool histories.

OpenJarvis: a local-first personal AI is now available to run with Ollama

2026-05-28 00:00 UTC

OpenJarvis v1.0 is now available: an open-source framework for building personal AI agents that run on your own hardware, with Ollama support built-in.

OpenJarvis v1.0 is released with native Ollama support.
Developed by Stanford's Hazy Research and Scaling Intelligence labs.

Ollama is now powered by MLX on Apple Silicon in preview

2026-03-30 00:00 UTC

Ollama announces a preview release powered by Apple's MLX framework, delivering significant performance improvements on Apple Silicon, including NVFP4 support and enhanced caching.

Ollama preview leverages MLX framework for fastest performance on Apple Silicon.
Supports NVFP4 quantization for higher quality and production parity.

The simplest and fastest way to setup OpenClaw

2026-02-23 00:00 UTC

Setup OpenClaw in under two minutes with a single Ollama command.

Ollama 0.17 introduces `ollama launch openclaw` for one-command setup.
OpenClaw is a personal AI assistant for inbox, email, calendar, and chat apps.

Subagents and web search in Claude Code

2026-02-16 00:00 UTC

Ollama now supports subagents and web search in Claude Code. No MCP servers or API keys required. Subagents can run tasks in parallel, keeping context clean. Web search is built-in via the Anthropic compatibility layer.

Ollama integrates subagents and web search into Claude Code.
Subagents parallelize tasks like code exploration and research.

OpenClaw: A Local AI Assistant for Coding

2026-02-01 00:00 UTC

OpenClaw is a personal AI assistant that connects your messaging apps to local AI coding agents, all running on your own device for privacy.

Connects messaging apps to local AI coding agents.
Supports WhatsApp, Telegram, Slack, Discord, iMessage.

ollama launch

2026-01-23 00:00 UTC

Ollama introduces `ollama launch`, a new command that sets up and runs coding tools like Claude Code, OpenCode, and Codex with local or cloud models, without needing environment variables or config files.

New `ollama launch` command simplifies coding tool setup.
Supports Claude Code, OpenCode, Codex, and Droid.

Claude Code with Anthropic API compatibility

2026-01-16 00:00 UTC

Ollama v0.14.0+ now supports the Anthropic Messages API, enabling tools like Claude Code to work with open-source models. Run locally or connect to cloud models via ollama.com.

Ollama v0.14.0 added compatibility with the Anthropic Messages API, allowing Claude Code to use open models.
Configure environment variables to connect Ollama locally or use cloud models.

OpenAI Codex with Ollama

2026-01-15 00:00 UTC

Open models can be used with OpenAI's Codex CLI through Ollama. Codex can read, modify, and execute code in your working directory using models such as gpt-oss:20b, gpt-oss:120b, or other open-weight alternatives.

OpenAI Codex CLI now supports open models via Ollama, including gpt-oss:20b and gpt-oss:120b.
Users must install the npm package and start Codex with the --oss flag; default model is local gpt-oss:20b.

OpenAI gpt-oss-safeguard

2025-10-29 00:00 UTC

Ollama partners with OpenAI and ROOST to launch the gpt-oss-safeguard reasoning models for safety classification. Available in 20B and 120B sizes under Apache 2.0 license, these models support custom policies, interpretable reasoning, and configurable effort.

Ollama collaborates with OpenAI and ROOST to release gpt-oss-safeguard safety models.
Two model sizes (20B and 120B) with permissive Apache 2.0 license.

MiniMax M2

2025-10-28 00:00 UTC

MiniMax M2 is now available on Ollama's cloud. It is a model built for coding and agentic workflows, with 10 billion activated parameters (230B total). It ranks #1 among open-source models in composite intelligence benchmarks and excels at multi-file edits, agentic tool use, and long-horizon task execution.

MiniMax M2 is a coding and agentic-focused open-source model now available on Ollama cloud.
It achieves top composite intelligence score among open-source models per Artificial Analysis benchmarks.

Ollama Blog