AI News HubLIVE
Public articles 13Collected articles 14Trust 84Refresh 120 min
Health HealthySource type OfficialFull-text rights Official full textLast ingested 2026-06-12ID ollama-blogStatus Enabled

Official local AI model runtime blog; confirm reuse terms before full body display.

Latest public articles

Ollama's highest performance on Apple Silicon yet with MLX

Ollama's MLX engine has been updated to deliver its highest performance on Apple Silicon yet. By leaning more heavily on Apple's unified memory and the Metal-backed MLX framework, models output higher quality responses, respond faster, and use less memory. The update includes support for NVFP4 format, up to 20% faster output, and a snapshot system for agent workflows.

  • Ollama's MLX engine now supports NVFP4 format, halving quantization quality loss.
  • Output speed increased by up to 20% due to fused Metal kernels and optimized sampling.
In-site article

Improved performance and model support with GGUF

Ollama 0.30 is now available with improved performance and GGUF model compatibility through llama.cpp, augmenting MLX on Apple silicon and supporting more models on wider hardware.

  • Up to 20% faster throughput on NVIDIA GPUs
  • Vulkan enabled by default for AMD and Intel GPUs
In-site article

NVIDIA Nemotron 3 Ultra

NVIDIA Nemotron 3 Ultra is a 550 billion parameter (55B active) open model designed for long-running agentic workflows, with 1M token context and NVFP4 optimization, leading in agentic benchmarks and cost efficiency.

  • 550B total parameters with 55B active per token, optimized for agent orchestration and coding agents.
  • 1M token context window for entire codebases and tool histories.
In-site article

OpenJarvis: a local-first personal AI is now available to run with Ollama

OpenJarvis v1.0 is now available: an open-source framework for building personal AI agents that run on your own hardware, with Ollama support built-in.

  • OpenJarvis v1.0 is released with native Ollama support.
  • Developed by Stanford's Hazy Research and Scaling Intelligence labs.
In-site article

Ollama is now powered by MLX on Apple Silicon in preview

Ollama announces a preview release powered by Apple's MLX framework, delivering significant performance improvements on Apple Silicon, including NVFP4 support and enhanced caching.

  • Ollama preview leverages MLX framework for fastest performance on Apple Silicon.
  • Supports NVFP4 quantization for higher quality and production parity.
In-site article

The simplest and fastest way to setup OpenClaw

Setup OpenClaw in under two minutes with a single Ollama command.

  • Ollama 0.17 introduces `ollama launch openclaw` for one-command setup.
  • OpenClaw is a personal AI assistant for inbox, email, calendar, and chat apps.
In-site article

Subagents and web search in Claude Code

Ollama now supports subagents and web search in Claude Code. No MCP servers or API keys required. Subagents can run tasks in parallel, keeping context clean. Web search is built-in via the Anthropic compatibility layer.

  • Ollama integrates subagents and web search into Claude Code.
  • Subagents parallelize tasks like code exploration and research.
In-site article

OpenClaw: A Local AI Assistant for Coding

OpenClaw is a personal AI assistant that connects your messaging apps to local AI coding agents, all running on your own device for privacy.

  • Connects messaging apps to local AI coding agents.
  • Supports WhatsApp, Telegram, Slack, Discord, iMessage.
In-site article

ollama launch

Ollama introduces `ollama launch`, a new command that sets up and runs coding tools like Claude Code, OpenCode, and Codex with local or cloud models, without needing environment variables or config files.

  • New `ollama launch` command simplifies coding tool setup.
  • Supports Claude Code, OpenCode, Codex, and Droid.
In-site article

Claude Code with Anthropic API compatibility

Ollama v0.14.0+ now supports the Anthropic Messages API, enabling tools like Claude Code to work with open-source models. Run locally or connect to cloud models via ollama.com.

  • Ollama v0.14.0 added compatibility with the Anthropic Messages API, allowing Claude Code to use open models.
  • Configure environment variables to connect Ollama locally or use cloud models.
In-site article

OpenAI Codex with Ollama

Open models can be used with OpenAI's Codex CLI through Ollama. Codex can read, modify, and execute code in your working directory using models such as gpt-oss:20b, gpt-oss:120b, or other open-weight alternatives.

  • OpenAI Codex CLI now supports open models via Ollama, including gpt-oss:20b and gpt-oss:120b.
  • Users must install the npm package and start Codex with the --oss flag; default model is local gpt-oss:20b.
In-site article

OpenAI gpt-oss-safeguard

Ollama partners with OpenAI and ROOST to launch the gpt-oss-safeguard reasoning models for safety classification. Available in 20B and 120B sizes under Apache 2.0 license, these models support custom policies, interpretable reasoning, and configurable effort.

  • Ollama collaborates with OpenAI and ROOST to release gpt-oss-safeguard safety models.
  • Two model sizes (20B and 120B) with permissive Apache 2.0 license.
In-site article

MiniMax M2

MiniMax M2 is now available on Ollama's cloud. It is a model built for coding and agentic workflows, with 10 billion activated parameters (230B total). It ranks #1 among open-source models in composite intelligence benchmarks and excels at multi-file edits, agentic tool use, and long-horizon task execution.

  • MiniMax M2 is a coding and agentic-focused open-source model now available on Ollama cloud.
  • It achieves top composite intelligence score among open-source models per Artificial Analysis benchmarks.
In-site article

All sources