AI News Hub

MCP updates

Building Supercharger: How Rocket Close optimized title operations with agentic AI

Rocket Close collaborated with AWS to create Supercharger, an agentic AI solution using Strands Agents, Amazon Bedrock, and MCP tools to streamline title operations. By centralizing knowledge and automating research-heavy tasks, the solution reduced contact center calls and emails by 30%, improved title exam accuracy, and enhanced client satisfaction. This post details the technical architecture, business impact, and lessons learned.

  • Supercharger automates research-heavy title operations using agentic AI, reducing manual queries across multiple systems.
  • Built with Strands Agents and MCP tools for a modular architecture, allowing easy addition of new data sources.

Cortex – Agent-Native Knowledge OS on Markdown (Karpathy's LLM Wiki, via MCP)

PULSE8.ai Cortex is an agent-native knowledge OS built on Markdown, providing a shared vault for AI agents and humans with a typed knowledge graph, full-text search, and a MarkItDown-powered compiler, all accessible via a unified MCP interface. Inspired by Andrej Karpathy's LLM Wiki pattern, it requires no database.

  • Cortex is an agent-native knowledge OS on Markdown, inspired by Karpathy's LLM Wiki pattern. It provides a knowledge graph, QMD search, a file compiler, and an MCP server.
  • Supports converting PDF, DOCX, PPTX, etc. to Markdown; features bulk ingest with SHA-256 dedup, real-time vault watching, daily activity logging, and flexible authentication.
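
The SHA-256 dedup mentioned above can be illustrated with a minimal sketch. This is not Cortex's actual code: the function names and the in-memory `documents` mapping are assumptions made for illustration, and only the idea of keying files by their content hash comes from the summary.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def dedup_ingest(documents: dict[str, bytes]) -> dict[str, str]:
    """Map each unique content hash to the first file name seen with it.

    Hypothetical helper: files whose bytes are identical share a digest,
    so later copies are skipped instead of being ingested twice.
    """
    seen: dict[str, str] = {}
    for name, data in documents.items():
        seen.setdefault(sha256_of(data), name)
    return seen


docs = {
    "a.pdf": b"same bytes",
    "copy-of-a.pdf": b"same bytes",
    "b.docx": b"other bytes",
}
print(sorted(dedup_ingest(docs).values()))  # ['a.pdf', 'b.docx']
```

Hashing content rather than file names means a renamed copy is still recognized as a duplicate.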

Show HN: Nenya – A lightweight, highly secure AI API Gateway/Proxy written in Go

Nenya is a lightweight, zero-dependency AI API gateway written in Go. It sits between AI coding clients and upstream LLM providers, adding secret redaction, context management, agent routing, and MCP tool integration with transparent SSE streaming. Security-hardened features include non-root execution, mlock for secrets, seccomp, and no-new-privileges.

  • Written in Go with zero external dependencies, compatible with OpenAI and Anthropic APIs.
  • Built-in adapters for 23 providers, with routing, fallback chains, and circuit breakers.

Build a meeting prep and follow-up assistant with Amazon Quick and Cisco Webex MCP servers

This post shows how to build a custom meeting prep and follow-up assistant using Amazon Quick and Cisco Webex MCP servers. From a single prompt, the agent finds an upcoming Webex meeting, reviews prior meeting summaries and transcripts, and pulls related Vidcast highlights and transcript context. It then searches Webex message threads for unresolved follow-ups and creates a concise prep brief. After the meeting, the same assistant can summarize the discussion and identify action items. It can also find related Vidcast updates and draft a follow-up message for the right Webex space.

  • Amazon Quick integrates with Cisco Webex MCP servers to create a conversational meeting assistant that simplifies pre-meeting preparation and post-meeting follow-up.
  • The assistant leverages Webex Meetings MCP, Vidcast MCP, and Webex Messaging MCP to retrieve meeting information, video content, and messages.

Evoflux: Inference-Time Evolution of Executable Tool Workflows for Compact Agents

Compact language models face challenges beyond isolated function calling when using tools. Evoflux uses evolutionary search at inference time to repair executable tool workflows, raising execution feasibility from 3% to 17-24% on MCP-Bench tasks, outperforming SFT and DPO baselines.

  • Small language models struggle with tool workflow dependencies and execution.
  • Evoflux evolves typed workflow graphs via structured edits and execution feedback.

xAI Ships Grok Build Plugin Marketplace With MongoDB, Vercel, Sentry, Chrome DevTools, Cloudflare, and Superpowers Plugins at Launch

xAI today released the Grok Build Plugin Marketplace, a built-in catalog of plugins for its terminal coding agent. Plugins bundle skills, commands, agents, hooks, MCP servers, and LSPs into one package, installable without leaving the terminal. Six plugins launch with partners including MongoDB, Vercel, Sentry, Chrome DevTools, Cloudflare, and Superpowers, with commit-SHA pinning for security.

  • xAI launches Grok Build Plugin Marketplace, built into the terminal coding agent.
  • Plugins bundle skills, commands, agents, hooks, MCP, and LSP in one install.

Three insights you may have missed from theCUBE’s coverage of Snowflake Summit 2026

The next wave of enterprise AI focuses on software and data infrastructure needed to make models useful in real businesses. Snowflake is positioning itself as a connector between proprietary data and AI models. Key insights include strong data foundations, security and governance frameworks, and the importance of trusted, governed intelligence for production AI.

  • Strong data foundations turn enterprise AI into business outcomes, as seen with DoorDash and Fanatics.
  • Enterprise AI requires new frameworks for security, governance, and trust, including practices from Tenable and Komodo Health.

Empower your healthcare agents with ready-to-use MCP on Databricks Marketplace

Databricks Marketplace now offers pre-built MCP servers from partners including Climb, Atropos Health, Kythera Labs, and Redox, covering biomedical, clinical evidence, medical semantics, and interoperability. These servers are centrally governed in the MCP Catalog with Unity AI Gateway, enabling rapid development of secure healthcare AI agents via low-code or custom coding.

  • Ready-to-use MCP servers on Databricks Marketplace lower the barrier for healthcare AI agent development.
  • Partner-provided MCP servers cover target-drug interactions, clinical trials, FDA labels, medical semantic translation, and data interoperability.

Nous Research Ships Hermes Agent Profile Builder: Identity, Model, Skills, and MCP Servers in One Dashboard Flow

Nous Research has released a Profile Builder for Hermes Agent within its local web dashboard, replacing the multi-step CLI setup with a single guided flow. Users can define identity, select model/provider, toggle built-in skills, install skills from the hub, and attach MCP servers, all producing isolated profile directories for running multiple agents without state collision.

  • Hermes Agent's new Profile Builder consolidates multi-step CLI profile creation into a single browser-based guided flow.
  • Users configure agent identity, model/provider, built-in and hub skills, and MCP servers in one place.

CrustRecruiter

CrustRecruiter is a set of MCP-based skills for Claude that combines Claude's reasoning with Crustdata's live database of 800M+ candidate profiles to automate recruiting tasks such as sourcing, market mapping, email verification, preference memory, and outreach/ATS sync, enabling personalized recruiting at scale.

  • CrustRecruiter turns Claude into a recruiter via MCP, merging reasoning with automation.
  • Accesses 800M+ candidate profiles with real-time career updates.

CLI Market: Commerce Infrastructure for AI Agents Launches with 51K+ Verified Prices Across Latin America

CLI Market is a new platform providing verified retail pricing data for AI agents and commercial teams in Latin America, covering 51,000+ prices from 68 retailers across 8 countries. Prices are normalized per kg/L and updated every 4 hours, and the platform offers 22 MCP tools and an API.

  • CLI Market offers 51,000+ verified retail prices across 8 Latin American countries, covering 68 retailers with data refreshed every 4 hours.
  • The platform provides 22 MCP tools and a REST API, enabling AI agents to autonomously search, compare, and build multi-retailer baskets.
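
To show what normalizing prices per kg/L involves, here is a minimal sketch. It is not CLI Market's implementation: the unit table, function name, and sample prices are assumptions chosen for illustration.

```python
# Hypothetical unit table: how many of each unit make up one base unit.
UNITS_PER_BASE = {
    "g": ("kg", 1000),
    "kg": ("kg", 1),
    "ml": ("L", 1000),
    "l": ("L", 1),
}


def price_per_base_unit(price: float, size: float, unit: str) -> tuple[float, str]:
    """Convert a shelf price into a price per kilogram or per litre."""
    base, per_base = UNITS_PER_BASE[unit.lower()]
    quantity = size / per_base  # package size expressed in kg or L
    if quantity <= 0:
        raise ValueError("package size must be positive")
    return price / quantity, base


# A 500 g package priced at 45.0 costs 90.0 per kg.
print(price_per_base_unit(45.0, 500, "g"))  # (90.0, 'kg')
```

Comparing per-kg or per-litre prices lets an agent rank products that are sold in different package sizes.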

Show HN: CalmSEO – Keyword and Google Search Console Tools for Your AI Agent

CalmSEO is an MCP server that exposes Google Search Console data, live SERPs, keyword volumes, and on-page audits to AI agents like Claude, ChatGPT, Cursor, and Codex. It offers a free tier and paid plans with credit-based usage.

  • CalmSEO provides SEO tools to AI agents via MCP.
  • Includes Google Search Console, live SERPs, keyword volumes, ranked keywords, and on-page audits.

Show HN: HeadlessTracker – MCP server that gives your AI eyes on your portfolio

HeadlessTracker is an MCP server for portfolio tracking across crypto exchanges, on-chain wallets, and prediction markets. It supports connectors for Bybit, Binance, MetaMask, Solana, and Polymarket, offering 15 MCP tools, an interactive dashboard, and CLI queries. Users can query their portfolio via AI hosts like Claude Desktop using natural language, without building a UI.

  • HeadlessTracker is an MCP server that aggregates portfolio data from multiple platforms, including Bybit, Binance, MetaMask, Solana, and Polymarket.
  • It provides 15 MCP tools, an interactive dashboard with portfolio, weekly, and risk tabs, and CLI queries.

Agent-First Authentication and Authorization

AI agents should be first-class software users with their own identity, credentials, and permissions, rather than borrowing human sessions or acting as faceless service accounts. This article explores the agent-first authentication and authorization model, analyzes the problems with borrowed identity, discusses converging industry trends (e.g., MCP, GitHub Apps, Microsoft Entra), and outlines concrete authentication and authorization requirements, along with a GitHub-compatible implementation called agent-git-service.

  • Agents should be durable software users with their own accounts, credentials, and lifecycle.
  • Borrowed human tokens or generic service accounts lead to excessive permissions and unclear audit trails.

Deep Memory – Vocabulary-driven graph memory for AI agents

Deep Memory is an open-source library that provides AI agents with structured, persistent memory using vocabulary-driven knowledge graphs. It solves the cold-start problem by predefining entity types, relationships, and property constraints, enabling agents to create and traverse entities efficiently. It supports multiple storage backends, MCP server integration, and domain starter kits.

  • Vocabulary-driven approach eliminates guesswork and inconsistency when AI agents interact with knowledge graphs. The agent sees a complete schema of entity types, relationships, and constraints before storing any data.
  • Offers multiple storage backends (in-memory, SQL Server, Neo4j, CosmosDB) and an MCP server exposing 20+ tools for repository management, entity creation, and semantic search.

Show HN: Web tools an AI agent pays for per call in USDC, no API key (x402+MCP)

Superhighway is a web-search API designed for AI agents: no signup, no API key, and agents pay per call (0.001 USDC each) via the x402 protocol on Base. The API returns clean JSON results, and the entire payment flow is automated, requiring zero human involvement.

  • AI agents can autonomously use Superhighway's search API without human intervention.
  • No signup or API key required; payments are made in USDC on Base via the x402 protocol.

The AI Agents Stack (2026 Edition)

This article updates the 2024 AI agents stack diagram, introducing a six-layer architecture for 2026: Models & Inference, Protocols & Tools, Memory & Knowledge, Frameworks & SDKs, Eval & Observability, and a sixth, emerging layer. Key changes include MCP standardization, reasoning models, and memory as a first-class primitive. It also gives candid assessments of each layer and guidance on how to evaluate it.

  • The AI agents stack has evolved significantly from 2024 to 2026, with MCP becoming the standard protocol and reasoning models transforming agent capabilities.
  • The six layers are Models & Inference, Protocols & Tools, Memory & Knowledge, Frameworks & SDKs, Eval & Observability, and an emerging layer.

CogCore: An API-native TypeScript runtime for building agents

CogCore is a lightweight TypeScript runtime library that helps developers build AI agents around their application APIs. It offers model role separation, tool calling, worker agents, sandboxed execution, skill learning, and more, allowing apps to integrate AI capabilities securely while retaining their own UI, data, permissions, and release workflow.

  • CogCore embeds as a runtime library in existing TypeScript apps, not a framework replacement.
  • Supports model role separation: chat, execute, and text can use different models for cost and quality optimization.

Substrate vs. Broker: Two Emerging Strategies for Enterprise AI

Salesforce and SAP have adopted opposite enterprise AI strategies on top of the same protocol, MCP: Salesforce exposes its platform as a substrate for any agent to call directly, while SAP forces external agents to go through its first-party agent Joule. This divergence has profound implications for security, liability, pricing, and the entire enterprise software ecosystem.

  • Salesforce's Headless 360 exposes CRM capabilities as MCP tools for any agent to invoke, shifting to consumption-based pricing.
  • SAP's MCP Gateway requires external agents to interact via its first-party agent Joule, prioritizing agent-to-agent (A2A) protocol for control.

Show HN: AI Boost – an MCP for accessing your everyday patterns

AI Boost is an MCP server that lets developers capture, index, and automatically inject their coding patterns and conventions into future LLM agent sessions. It emphasizes privacy and community sharing.

  • AI Boost captures expertise from developer sessions and creates private 'boosters'.
  • Boosters are indexed via semantic and keyword search and proactively suggested in future sessions.

Show HN: Context Mode Insight – observability layer for AI coding agents

Context Mode Insight is an observability platform for enterprise AI engineering, built on an open-source plugin trusted by 250K+ developers. It supports 14 AI assistants, analyzes 222 patterns, and provides role-aware insights via a privacy-first design. The paid tier ($20/seat/month) offers org-level dashboards, REST API, and remote MCP for agents, addressing needs of CTOs, EMs, CISOs, and more.

  • Context Mode Insight is the first observability layer for AI coding agents, priced at $20 per seat per month.
  • Built on an open-source plugin with 250K+ developers, supporting 14 AI assistants and 222 patterns.

Moonshot AI Releases Kimi Code CLI: A Terminal AI Coding Agent Built in TypeScript for Next-Gen Agents

Moonshot AI has launched Kimi Code CLI, an open-source terminal AI coding agent built with TypeScript. It features subagents for parallel tasks, MCP configuration, video input, and lifecycle hooks. The tool is MIT-licensed and supports Kimi models or other providers.

  • Kimi Code CLI is an MIT-licensed terminal AI coding agent from Moonshot AI.
  • Built in TypeScript, it offers subagents (coder, explore, plan) and MCP configuration.

Claude-tinderbox: Search your Claude.ai conversation history locally via MCP

A personal project called tinderbox lets users export Claude.ai conversations, index them locally, and search them from any Claude session via an MCP server. It supports hybrid retrieval, Supabase storage, and Ollama embeddings.

  • Export Claude.ai conversation ZIPs, automatically parse and ingest them
  • Hybrid semantic + full-text search over messages and artifacts

Show HN: Apple Contacts MCP – Local AI Access to macOS Contacts

A new local-first MCP server enables AI agents to safely search, edit, and maintain Apple Contacts on macOS using AppleScript automation, with dry-run writes and privacy-first design.

  • Provides tools like search_contacts, create_contact, update_contact with dry-run by default
  • Requires macOS with Contacts.app and Node.js 18+

OpenAI Codex Tech Lead Does AI-Assisted Engineering

Michael Bolin, tech lead for OpenAI Codex, shares his straightforward AI-assisted engineering workflow: write a spec, give a simple prompt, review the code. He keeps requirements in Notion docs, uses Codex's Notion connector to read that context automatically, breaks work into right-sized PRs, and lets Codex handle merge conflicts and CI babysitting. The approach emphasizes code review quality and fast iteration.

  • Workflow: write spec → simple prompt → review code
  • Uses Notion docs for requirements, Codex reads directly

agentgateway Joins AAIF as an Open Gateway for Agentic AI Infrastructure

agentgateway, a unified open source gateway for AI and agent workloads, has joined the Agentic AI Foundation (AAIF) under the Linux Foundation as its fourth hosted project. It manages MCP, A2A, LLM inference, HTTP, and gRPC traffic through a single plane, providing security, observability, routing, and governance.

  • agentgateway becomes the fourth AAIF-hosted project under the Linux Foundation.
  • Offers a unified control and data plane for MCP, A2A, LLM, HTTP, and gRPC traffic.

AI assistant shouldn't have your passwords

Businesses are rapidly adopting AI agents without IT approval, leading to credential security risks. Bitwarden offers solutions such as Secrets Manager, Access Intelligence, the Agent Access SDK, and an MCP server to secure AI agent access to credentials.

  • Shadow AI poses credential security risks as employees deploy unvetted AI agents.
  • Over-scoped access, unapproved actions, and data leakage are key dangers.

Show HN: Nexus, ask AI about sensitive spreadsheets locally

Nexus is a local-first open-source tool that lets AI agents (like Claude Code) query and manipulate local spreadsheets (CSV, XLSX, SQLite, Google Sheets) without uploading data to the cloud. It exposes data via MCP, supports non-destructive derivations (views, branches, snapshots), and includes an optional semantic reading layer called Iris.

  • Supports CSV, XLSX, SQLite, and Google Sheets as input sources.
  • Exposes data via MCP server for local AI agent querying and manipulation.

Show HN: MCP for the ChatGPT Ads API – Query ChatGPT Ads from Claude and Codex

A new open-source MCP server enables AI assistants like Claude and Codex to read OpenAI Ads data through natural language queries. The read-only tool supports 11 API endpoints for campaigns, ad groups, ads, and insights. Works with Claude Desktop, Cursor, VS Code, and other MCP clients. Requires Node.js 20+ and an OpenAI Ads API key.

  • MCP server for OpenAI Ads API allows AI assistants to query ads data using natural language.
  • Read-only design: view campaigns, ad groups, ads, and performance insights; no write operations.

Gate – deterministic PII redaction for AI agent tool output (Rust)

Gate is a deterministic PII redaction tool written in Rust for AI agents. It uses regex, column heuristics, and Luhn checks instead of LLMs. It intercepts Bash commands and MCP tool calls via hooks, supports multiple harnesses, and provides scanning, real-time redaction, and aggregated reporting while keeping data local.

  • Gate uses deterministic methods (regex, column heuristics, Luhn) for PII redaction, not LLMs, ensuring consistent results and low latency.
  • It intercepts agent Bash commands and MCP tool calls via hooks, automatically rewriting commands to add a redaction layer.
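
The regex-plus-Luhn approach described above can be sketched as follows. Gate itself is written in Rust and its actual patterns are not shown in this summary, so the candidate regex, the `[REDACTED_CARD]` placeholder, and the function names below are assumptions. The Luhn checksum itself is the standard algorithm.

```python
import re

# Hypothetical candidate pattern: 13 to 19 digits, optionally separated
# by single spaces or hyphens, starting and ending on a digit.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum over a string of decimal digits."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact_cards(text: str) -> str:
    """Replace digit runs that pass the Luhn check; leave the rest alone."""
    def replace(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        return "[REDACTED_CARD]" if luhn_valid(digits) else match.group()

    return CARD_CANDIDATE.sub(replace, text)


print(redact_cards("paid with 4111 1111 1111 1111 today"))
# paid with [REDACTED_CARD] today
```

The checksum filters out long digit runs such as order numbers that merely look like card numbers, and the same input always produces the same output.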

Show HN: Lookspan – local-first observability for AI agents (npx lookspan)

Lookspan is a local-first observability dashboard for AI agents, supporting MCP, LangGraph, CrewAI, and OpenTelemetry. All data stays in local SQLite, no cloud required. Features include real-time tracing, cost tracking, alerts, replay evaluation, and dataset experiments. Launch with one command.

  • Local-first: data never leaves your machine, zero infrastructure cost
  • Supports multiple AI agent frameworks including MCP, LangGraph, CrewAI, and OpenTelemetry

DNS-AID

DNS-AID leverages existing DNS infrastructure for AI agent discovery, publishing, and verification without new overlay networks, using DNSSEC for trust. It supports MCP, A2A, and HTTPS, and integrates into existing DNS zones.

  • Uses DNS instead of centralized registries for AI agent discovery.
  • End-to-end cryptographic verification via DNSSEC.

Bring Databricks into Kiro IDE with the AI Dev Kit Power

This article outlines two methods to connect Kiro IDE to Databricks: a quick setup using four managed MCP servers, or a one-click install via the Databricks AI Dev Kit Power. Both approaches leverage Unity Catalog metadata to ground AI assistants, ensuring they only access authorized data and reducing hallucinations. The article also highlights Databricks' advantages for AI-assisted development, including unified governance, a single data copy, and an integrated AI stack.

  • Two integration paths: lightweight MCP server setup (Path A) and comprehensive AI Dev Kit Power (Path B).
  • Both paths inherit Unity Catalog permissions, ensuring accurate and secure AI queries.

Five Levels of Adding AI to Your SaaS App

A practical framework for moving from simple SaaS to an AI-native platform, outlining five levels of AI integration: from MCP server and personal access tokens to embedded chat, conversation history, custom UI generation, and finally an agentic harness with planning and scheduling. The author shares insights from building multiple internal agents and retrofitting AI into existing flows.

  • Level 1: Expose API endpoints via MCP server without UI changes. Build prompt library and evals.
  • Level 2: Embed AI chat window in the dashboard with streaming and page context.

I built an API that stops AI hallucinating colour

Colour Memory API provides deterministic colour matching, search, and brand audit tools based on historical archives, with REST and MCP interfaces, no LLM required for core functions.

  • Matches hex values to over 19,000 archived colours with CIEDE2000 distance, Lab/LCh metrics, WCAG contrast, and provenance.
  • Offers 65 REST endpoints and MCP tools for LLMs, design systems, and automated workflows.
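
Of the metrics listed above, WCAG contrast is compact enough to sketch. The code below follows the WCAG 2.x definitions of relative luminance and contrast ratio. It is not the Colour Memory API's code, and the function names are chosen for illustration.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of a colour given as '#RRGGBB'."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(first: str, second: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical) up to 21.0 (black on white)."""
    la, lb = relative_luminance(first), relative_luminance(second)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


print(round(contrast_ratio("#000000", "#FFFFFF"), 2))  # 21.0
```

Because the ratio is a closed-form calculation, it can be computed deterministically without any model in the loop, which matches the summary's "no LLM required" framing.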

Composer: Multiplayer Markdown for Humans and AI Agents

Composer is a real-time multiplayer markdown editor that enables people and AI agents to work side-by-side on documents, with features like real-time editing, comments, suggestions, and agent collaboration via MCP.

  • Real-time multiplayer markdown editor for humans and AI agents.
  • Agents can access documents, reply to comments, and post suggestions via MCP.

An MCP tool that lets ChatGPT check if a store is AI-readable

BridgeToAgent launches an MCP connector enabling AI assistants to check a store's AI-readiness and simulate agent shopping.

  • BridgeToAgent's MCP tool lets AI agents evaluate store AI-readiness with a score from 0 to 100.
  • The connector works with Claude, ChatGPT, and Cursor, and requires no API key to set up.

Adding MCP Tools to Reachy Mini

Reachy Mini can now use remote tools hosted in Hugging Face Spaces via MCP, allowing it to check weather or search the web with a single command. The article covers built-in tools, profile-based control, tool installation, naming conventions, and current limitations.

  • Add remote MCP tools from Hugging Face Hub with one command.
  • Remote tools are managed alongside built-in tools via profiles (tools.txt).

Scholar Sidekick: Citation Verifier for 'Real DOI, Wrong Paper'

Scholar Sidekick is a free tool that generates formatted citations from DOIs, PubMed IDs, arXiv IDs, and more. It also verifies citations and checks open-access and retraction status. It offers a REST API and MCP server for developers and AI agents.

  • Supports 8 identifier types and 10,000+ CSL citation styles
  • Includes tools to verify citations, check retractions, and detect open access

Building a secure auth code flow setup using AgentCore Gateway with MCP clients

This post demonstrates how to implement the OAuth authorization code flow as an inbound authorization mechanism for MCP servers hosted on Amazon Bedrock AgentCore Gateway. By the end of this guide, you will have a production-ready setup where each AI assistant request is authenticated with a valid user identity token issued from your organization’s identity provider.

  • Implement OAuth authorization code flow as inbound authorization for MCP servers.
  • Configure identity provider and AgentCore Gateway.

Personalizing Genie Code with instructions, skills, memory, and MCP

Databricks introduces new personalization features for Genie Code, including custom instructions, skills, and MCP servers, enabling the AI coding assistant to better align with team coding standards, internal workflows, and external tools. Instructions set global preferences, skills capture repeatable workflows, and MCP servers connect to systems like Jira, GitHub, and Google Drive.

  • Genie Code personalization with instructions, skills, and MCP servers adapts to team workflows.
  • Custom instructions set global preferences; skills provide task-specific guidance; MCP connects external tools.

Equibles – Open-source, self-hosted mini Bloomberg Terminal for AI agents

Equibles is an open-source, self-hosted mini Bloomberg Terminal designed for AI agents. It scrapes, stores, and serves SEC filings, institutional holdings, insider trading, congressional trades, short data, economic indicators, and daily stock prices, and exposes them via MCP for direct AI assistant queries.

  • Open-source, self-hosted financial data aggregation platform
  • Covers 8+ data categories including SEC filings, holdings, insider trading

Extending MCP support for Amazon Bedrock AgentCore Gateway

Amazon Bedrock AgentCore Gateway has been extended with new capabilities to support enterprise deployments of Model Context Protocol (MCP) servers, including enhanced tool schema support, dynamic listing, streaming, session management, and OAuth 2.0 delegated authentication, providing centralized governance, security, and observability.

  • AgentCore Gateway centralizes credential management, policy enforcement, and observability for MCP servers.
  • New capabilities include extended MCP tool schema support, and MCP prompts and resources as first-class primitives.

Paste MCP & AI Tools

An infinite clipboard tool for Claude, Codex, and other AI assistants.

  • Provides unlimited clipboard functionality
  • Supports Claude, Codex, and other AI tools

Amazon Quick integration with time-series databases for market intelligence using MCP

This post walks through a practical implementation using KDB-X MCP server integration with Amazon Quick, demonstrating how traders and analysts can ask questions using conversational language and receive actionable insights from datasets. The integration pattern applies to various domains, from financial market analysis to IoT sensor monitoring to DevOps performance dashboards.

  • Amazon Quick integrates with MCP to eliminate complex database queries for time-series data.
  • The KDB-X MCP server is deployed on EC2 and connected via Amazon Bedrock AgentCore Gateway.

Uselink: Turn HTML into a Controlled Link with Comments and Permissions

Uselink is a newly launched tool that converts any HTML or Markdown into a link you control. Paste a document, get a URL under your handle, decide who can read and who can comment. Readers can reply in threads without an account, which Google Docs and Notion can't do. Interactive HTML actually runs. It supports password, expiry, view limits, and versioning. New features include a visual editor and an MCP server for AI tools to publish automatically.

  • Uselink converts HTML or Markdown into a controlled link with comment and permission settings
  • Readers can comment without an account, solving a pain point with Google Docs and Notion

MAVEN: Improving Generalization in Agentic Tool Calling

MAVEN (Modular Agentic Verification and Execution Network) is a lightweight symbolic reasoning scaffold designed to enhance generalization in tool-calling environments through structured decomposition, adaptive tool orchestration, and intermediate verification. On the MAVEN-Bench stress test, MAVEN improves the GPT-OSS-120b base model from 48% to 71% accuracy without additional training, using an open-weight backbone at roughly 1/10 the cost of proprietary baselines.

  • MAVEN is a lightweight symbolic reasoning scaffold for improving generalization in agentic tool calling.
  • On MAVEN-Bench, MAVEN boosts GPT-OSS-120b accuracy from 48% to 71% without extra training.

Moxie Docs

Moxie Docs creates living documentation for GitHub repos, provides MCP context for AI agents, a searchable workspace, and PR checks to keep docs honest.

  • Indexes GitHub repos once
  • Provides repo context to AI agents via MCP

New AI Agent Architecture to fix LLM deviations and token costs

Botcircuits is an open-source AI agent that combines LLM step reasoning with a deterministic state machine, enabling token-efficient and predictable multi-step automation. It features a CLI, workflow system with natural language authoring, skills, and MCP support.

  • Uses a state machine to control workflow, reducing LLM deviations and token costs.
  • Supports natural language authoring of workflows via CLI slash commands.
