
Show HN: Cortex – local-first encrypted memory for AI agents (Rust, MCP)

Cortex is a local-first, encrypted memory engine for AI agents, written in Rust. It features a four-tier memory model (working, episodic, semantic, procedural), a Bayesian belief system, a people graph, and sub-millisecond performance. Everything runs locally with no cloud dependency and at no cost, and the code is released under the MIT license. The project claims advantages over Mem0 and OpenAI Memory in privacy, latency, features, and cost.

Source: Hacker News AI | Author: gambletan


🧠 Try Cortex in your browser — zero install, 124KB WASM, runs entirely client-side.

If Cortex helps your AI remember, give it a ⭐ — it takes 1 second and helps others discover the project.


Private. Free. Local. — Memory engine for personal AI agents.

Your AI's memory lives on your device — your data never leaves, never costs, never spies. Pure Rust. 3.8MB binary. No third-party servers in the data path, zero telemetry, zero cost. Syncs through your own cloud storage. (On-device semantic search downloads a ~30MB model once on first use, then runs fully offline — or go 100% offline with CORTEX_NO_EMBEDDINGS=1. See Security & Privacy.)

Philosophy: Your memories are yours — not a cloud provider's training data, not a startup's monetization asset, not a government's surveillance target. Cortex runs 100% on your hardware, stores everything in your own database, and syncs only through your own cloud storage (iCloud, Google Drive, OneDrive, Dropbox). No middleman ever sees your data. No API key required. No account to create. Just plug it into your AI agent and it remembers — privately, permanently, and at sub-millisecond speed.

LLMs start blank every session. Your assistant forgets your name, your preferences, the conversation you had yesterday, the decision you made last week. Current "memory" solutions are flat text files, keyword grep, or cloud APIs that add 200-500ms latency, charge you for the privilege, and send your personal data to someone else's server.

Cortex fixes this. It gives your AI a structured, queryable, self-evolving long-term memory that persists across sessions, channels, and contexts — with Bayesian beliefs that self-correct, a people graph that resolves identities across platforms, and sub-millisecond performance on everything. All running locally, all yours.

Cortex vs Mem0 vs OpenAI Memory

| Feature | Cortex | Mem0 | OpenAI Memory |
|---|---|---|---|
| Privacy | 100% local, zero cloud | Cloud API (your data on their servers) | OpenAI servers |
| Latency | 156µs ingest, 568µs search | ~200-500ms | ~300-800ms |
| Cost | Free forever, unlimited | $99+/mo (Pro) | $20/mo (ChatGPT Plus) |
| Memory tiers | 4 (Working/Episodic/Semantic/Procedural) | 1 (flat) | 1 (flat) |
| Bayesian beliefs | Self-correcting with evidence | No | No |
| People graph | Cross-channel identity resolution | Paid tier only | No |
| Conversation compression | Automatic session summarization | No | No |
| Relationship inference | Pattern-based (EN + CN) | No | No |
| Temporal retrieval | Intent-aware ("recently" / "first time") | No | No |
| Contradiction detection | Automatic with confidence scores | No | No |
| Consolidation | Episodic → Semantic auto-promotion | No | No |
| Context injection | Token-budgeted LLM-ready output | Manual | Automatic but opaque |
| Import/Export | Full JSON backup & restore | API only | No export |
| Self-hosted | Native binary, Docker, MCP | Cloud only | Cloud only |
| Binary size | 3.8 MB | npm package | N/A |
| Dependencies | 0 runtime services (single binary) | Node.js + cloud | N/A |
| Open source | MIT | Partial | No |
| Encryption | AES-256-GCM encrypted sync (opt-in) | No | No |
| Key rotation | Versioned envelopes, forward secrecy | No | No |
| Privacy levels | Private (default, never syncs) / Shared / Public; per-memory opt-in, demoting retracts from other devices | No | No |
| Tool authorization | Deny-by-default capability policy on the MCP surface | No | No |
| Zero telemetry | No analytics, no phone-home, verifiable | Unknown | No |
| Chinese NLP | Native (inference, retrieval, relationships) | No | Limited |
| Namespace isolation | Per-user/context memory separation | No | No |
| Plugin system | Compile-time hooks for ingest/retrieve/consolidation | No | No |
| MCP tools | 30 tools for Claude/LLM integration | 3rd party | N/A |

Performance Benchmarks

| Operation | Cortex | Mem0 (cloud) | File-based |
|---|---|---|---|
| Ingest | 156µs | ~200ms | ~1ms |
| Search (top-10) | 568µs | ~300ms | ~10ms |
| Context generation | 621µs | ~500ms | manual |
| Belief update | 66µs | N/A | N/A |
| People graph | 51µs | paid tier | N/A |
| Structured facts | 45µs | N/A | N/A |
| 1K memories search | 1.6ms | ~500ms | ~50ms |

About 528x faster than Mem0 cloud on top-10 search (568µs vs. ~300ms), with features that neither Mem0 nor OpenAI Memory offers.

Note: Benchmarks include proactive inference (auto-extracting facts, preferences, relationships) on every ingest. Raw ingest without inference is ~15µs. Numbers are from cargo bench on an M-series Mac.

LoCoMo Benchmark (ACL 2024)

Academic-grade long-term conversation memory evaluation — 10 conversations, 1540 QA pairs across 4 categories.

| System | Single-hop | Multi-hop | Open-domain | Temporal | Overall |
|---|---|---|---|---|---|
| Backboard | 89.4% | 75.0% | 91.2% | 91.9% | 90.0% |
| MemMachine v0.2 | - | - | - | - | 84.9% |
| Cortex v1.7 | 72.5% | 59.5% | 88.8% | 74.1% | 73.7% |
| Mem0-Graph | 65.7% | 47.2% | 75.7% | 58.1% | 68.4% |
| Mem0 | 67.1% | 51.2% | 72.9% | 55.5% | 66.9% |
| OpenAI Memory | - | - | - | - | 52.9% |

Key findings:

Open-domain: 88.8%, ahead of Mem0 (72.9%) by 15.9 percentage points

Temporal: 74.1%, ahead of Mem0 (55.5%) by 18.6 percentage points

Single-hop: 72.5%, ahead of Mem0 (67.1%) by 5.4 percentage points

Multi-hop: 59.5%, ahead of Mem0 (51.2%) by 8.3 percentage points

Overall: 73.7%, ahead of Mem0 (66.9%) by 6.8 percentage points and of OpenAI Memory (52.9%) by 20.8 percentage points

Cortex outperforms Mem0 on all 4 categories — while running 100% locally, end-to-end encrypted, at $0 cost.

Setup: Claude Sonnet 4 (QA + judge), nomic-embed-text (embeddings via Ollama), top-30 retrieval. Fully reproducible: python3 bench/locomo_bench.py

Architecture

Cortex implements a 4-tier memory model inspired by human cognition:

    +---------------------+
    |   Working Memory    |  Current session context
    +---------------------+
              |
    +---------------------+
    |   Episodic Memory   |  Raw experiences: conversations, events, observations
    +---------------------+
              |  consolidation (decay, promotion, pattern extraction)
    +---------------------+
    |   Semantic Memory   |  Distilled facts, preferences, relationships
    +---------------------+
              |
    +---------------------+
    |  Procedural Memory  |  Learned routines, user-specific workflows
    +---------------------+

Working holds the current session scratch pad. Episodic stores raw experiences with timestamps and source metadata. The Consolidation Engine periodically promotes recurring patterns into Semantic facts and decays stale episodes. Procedural captures learned workflows and routines.

Key Components

People Graph

Cross-channel identity resolution. The same person messaging you on Telegram, emailing you, and showing up in calendar events gets unified into a single identity node. Interactions, relationship strength, and communication patterns are tracked per-person.
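One way to picture the unification step is a union-find structure over (channel, handle) aliases, where each linked set stands for one person. The sketch below is illustrative only: `IdentityGraph`, `link`, and `resolve` are hypothetical names, not Cortex's API, and how Cortex decides that two aliases match is not described here.

```rust
use std::collections::HashMap;

/// A (channel, handle) pair such as ("telegram", "@alice"). Hypothetical type.
type Alias = (String, String);

/// Union-find over aliases: each connected set is one resolved person.
#[derive(Default)]
pub struct IdentityGraph {
    index: HashMap<Alias, usize>, // alias -> node id
    parent: Vec<usize>,           // union-find parent pointers
}

impl IdentityGraph {
    /// Node id for an alias, creating a new singleton set if unseen.
    fn node(&mut self, channel: &str, handle: &str) -> usize {
        let key = (channel.to_string(), handle.to_string());
        if let Some(&id) = self.index.get(&key) {
            return id;
        }
        let id = self.parent.len();
        self.parent.push(id);
        self.index.insert(key, id);
        id
    }

    /// Find the set representative, compressing paths by halving.
    fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            self.parent[x] = self.parent[self.parent[x]];
            x = self.parent[x];
        }
        x
    }

    /// Record evidence that two aliases belong to the same person.
    pub fn link(&mut self, a: (&str, &str), b: (&str, &str)) {
        let na = self.node(a.0, a.1);
        let nb = self.node(b.0, b.1);
        let ra = self.find(na);
        let rb = self.find(nb);
        if ra != rb {
            self.parent[rb] = ra;
        }
    }

    /// Resolve an alias to the current representative id of its person.
    pub fn resolve(&mut self, channel: &str, handle: &str) -> usize {
        let n = self.node(channel, handle);
        self.find(n)
    }
}
```

Linking a Telegram handle to an email address, and that address to a calendar name, makes all three resolve to the same id, so per-person statistics can be keyed on it.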

Bayesian Belief System

Self-correcting understanding of the world. Beliefs are formed from evidence, updated with each new observation, and can be contradicted. Confidence scores reflect actual certainty rather than recency bias.

    cortex.observe_belief("user_prefers_morning_meetings", true, 0.8)?;
    cortex.observe_belief("user_prefers_morning_meetings", false, 0.6)?;
    // Confidence adjusts automatically via Bayesian update
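The exact update rule is not spelled out here. As a rough sketch of how two conflicting observations could combine, the code below tracks a binary belief in log-odds form and assumes the third argument is the probability that the observation is correct. The `Belief` type and its methods are hypothetical and may differ from what Cortex actually does.

```rust
/// A binary belief tracked in log-odds form. Hypothetical type for illustration.
#[derive(Debug, Clone, Copy)]
pub struct Belief {
    log_odds: f64, // 0.0 corresponds to a 50/50 prior
}

impl Belief {
    pub fn new() -> Self {
        Belief { log_odds: 0.0 }
    }

    /// Fold in one observation. `reliability` is the assumed probability that
    /// the observation is correct; it is clamped away from 0 and 1.
    pub fn observe(&mut self, value: bool, reliability: f64) {
        let r = reliability.clamp(0.01, 0.99);
        let evidence = (r / (1.0 - r)).ln();
        self.log_odds += if value { evidence } else { -evidence };
    }

    /// Posterior probability that the belief is true.
    pub fn confidence(&self) -> f64 {
        1.0 / (1.0 + (-self.log_odds).exp())
    }
}
```

Under these assumptions, a `true` observation at 0.8 followed by a `false` observation at 0.6 leaves the confidence at 8/11 (about 0.73): the weaker contradiction lowers the belief without overturning it.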

Consolidation Engine

Episodic-to-semantic promotion, decay of stale memories, and pattern extraction. Runs as a background cycle that keeps the memory store lean and queryable. Returns a report of what was promoted, decayed, and merged.
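A consolidation cycle of this kind can be sketched as three steps: decay, prune, promote. The example below assumes an exponential half-life decay and a simple repeat-count threshold; the field names, parameters, and thresholds are hypothetical, and the merge step mentioned above is left out.

```rust
use std::collections::HashMap;

/// A raw episodic memory. Field names are hypothetical.
#[derive(Debug, Clone)]
pub struct Episode {
    pub pattern: String, // normalized statement, e.g. "prefers tea"
    pub salience: f64,   // importance in [0, 1]
    pub age_days: f64,
}

/// Summary of one consolidation cycle.
#[derive(Debug, Default, PartialEq)]
pub struct Report {
    pub promoted: Vec<String>,
    pub decayed: usize,
}

/// One consolidation pass: decay and prune episodes, then promote
/// recurring patterns into the semantic store.
pub fn consolidate(
    episodes: &mut Vec<Episode>,
    semantic: &mut Vec<String>,
    half_life_days: f64,
    min_salience: f64,
    min_repeats: usize,
) -> Report {
    let mut report = Report::default();

    // Exponential decay: effective salience halves every `half_life_days`.
    let before = episodes.len();
    episodes.retain(|e| e.salience * 0.5f64.powf(e.age_days / half_life_days) >= min_salience);
    report.decayed = before - episodes.len();

    // Count surviving episodes per pattern.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for e in episodes.iter() {
        *counts.entry(e.pattern.as_str()).or_insert(0) += 1;
    }

    // Promote patterns seen often enough that are not already semantic facts.
    let mut promoted: Vec<String> = counts
        .into_iter()
        .filter(|(p, n)| *n >= min_repeats && !semantic.iter().any(|s| s.as_str() == *p))
        .map(|(p, _)| p.to_string())
        .collect();
    promoted.sort(); // deterministic order for the report
    semantic.extend(promoted.iter().cloned());
    report.promoted = promoted;
    report
}
```

Keeping the pass a pure function over the two stores makes it easy to run on a timer and to log the returned report.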

Multi-signal Retrieval

Queries combine five signals for relevance ranking:

Similarity -- vector cosine distance against query embedding

Temporal -- recency weighting with configurable decay

Salience -- importance scoring from access patterns and explicit hints

Social -- boost for memories involving specific people

Channel -- filter or boost by source channel
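How the five signals are combined is not specified here. A common approach, shown below as a sketch, is a weighted sum of cosine similarity, an exponential recency term, salience, and two boolean boosts. The struct fields, the weights, and the half-life parameter are all assumptions made for illustration.

```rust
/// Per-memory features used for ranking. Field names are hypothetical.
pub struct Candidate {
    pub embedding: Vec<f32>,
    pub age_hours: f64,
    pub salience: f64,         // in [0, 1]
    pub involves_person: bool, // memory mentions the person of interest
    pub same_channel: bool,    // memory came from the queried channel
}

/// Weights for the five signals; any values used are illustrative only.
pub struct Weights {
    pub similarity: f64,
    pub temporal: f64,
    pub salience: f64,
    pub social: f64,
    pub channel: f64,
}

/// Cosine similarity; returns 0.0 if either vector has zero norm.
pub fn cosine(a: &[f32], b: &[f32]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| (*x as f64) * (*y as f64)).sum();
    let na: f64 = a.iter().map(|x| (*x as f64).powi(2)).sum::<f64>().sqrt();
    let nb: f64 = b.iter().map(|x| (*x as f64).powi(2)).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Weighted sum of the five signals for one candidate.
pub fn score(query: &[f32], c: &Candidate, w: &Weights, half_life_hours: f64) -> f64 {
    // Recency halves every `half_life_hours`.
    let recency = 0.5f64.powf(c.age_hours / half_life_hours);
    w.similarity * cosine(query, &c.embedding)
        + w.temporal * recency
        + w.salience * c.salience
        + w.social * (c.involves_person as u8 as f64)
        + w.channel * (c.same_channel as u8 as f64)
}

/// Indices of the top-k candidates, best first (brute-force scan).
pub fn top_k(
    query: &[f32],
    cs: &[Candidate],
    w: &Weights,
    half_life_hours: f64,
    k: usize,
) -> Vec<usize> {
    let mut scored: Vec<(usize, f64)> = cs
        .iter()
        .enumerate()
        .map(|(i, c)| (i, score(query, c, w, half_life_hours)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}
```

With equal similarity, recency, and salience, a memory that involves the person of interest ranks above one that does not, which is the effect the Social signal describes.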

Context Injection Protocol

Generates LLM-ready context strings from memory state. Pass a token budget, optional channel/person filters, and get back a structured text block your LLM can consume directly.
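Token budgeting can be sketched as greedy packing: walk the ranked memories and keep each one that still fits. The example below assumes roughly four characters per token (a heuristic, not a real tokenizer) and omits the channel and person filters. `build_context` and the output layout are hypothetical, not Cortex's format.

```rust
/// Rough token estimate: about 4 characters per token. This is a common
/// heuristic, not a real tokenizer.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Build a context block from memories already sorted best-first.
/// Memories that would exceed the budget are skipped, not truncated.
pub fn build_context(ranked: &[&str], token_budget: usize) -> String {
    let header = "Relevant memories:\n";
    let mut out = String::new();
    let mut used = estimate_tokens(header);
    if used > token_budget {
        return out; // not even the header fits
    }
    out.push_str(header);
    for m in ranked {
        let line = format!("- {}\n", m);
        let cost = estimate_tokens(&line);
        if used + cost > token_budget {
            continue; // a shorter, lower-ranked memory may still fit
        }
        out.push_str(&line);
        used += cost;
    }
    out
}
```

Skipping an oversized memory instead of stopping at it lets shorter, lower-ranked memories fill the remaining budget. A stricter variant would stop at the first memory that does not fit, to preserve rank order.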

Storage

SQLite for persistence, in-memory vector index for fast similarity search. Single-file database, no external services required. Designed for edge deployment -- runs on a laptop, a Raspberry Pi, or a server.

Cloud Sync

Sync memories across devices through your own cloud storage — no third-party server involved.

    Device A (Mac)         Your Cloud Storage          Device B (iPhone)
    ┌───────────┐        ┌──────────────────────┐        ┌───────────┐
    │ SQLite DB │ ──W──> │ iCloud / GDrive /    │        │           │
    └───────────┘        └──────────────────────┘        └───────────┘

Changelog-based: Each device writes append-only operation logs to its own subfolder

No conflicts: Devices never write to the same file. Merge uses Last-Writer-Wins with Hybrid Logical Clocks

Encrypted: AES-256-GCM encryption (opt-in). Even if your cloud account is compromised, memories stay private

Tamper-evident: the sync manifest and every operation carry an HMAC; tampered or plaintext-injected oplog lines are rejected, and a manifest without integrity protection refuses to load (no key-rollback path)

Key rotation & forward secrecy: rotate to a new key version (ENC2 envelopes) without re-encrypting history; old versions stay readable, new writes are unreadable to a leaked old key

Privacy-aware, per-memory opt-in: Private memories (the default) never leave your device. Mark a memory shared to sync it; demote it back to private and a retraction deletes it from your other devices (local copy kept)

Survives restarts: sync settings persist in the database (passphrase never touches disk — macOS login Keychain or CORTEX_SYNC_PASSPHRASE); the server resumes sync and starts background pull (30s poll + fs watcher) automatically

Supported providers: iCloud Drive, Google Drive, OneDrive, Dropbox (auto-detected).
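The merge rule described in the list above (per-device append-only logs, Last-Writer-Wins ordered by Hybrid Logical Clocks) can be sketched as follows. The `Hlc`, `Op`, and `merge` names are hypothetical, the log is reduced to key/value writes, and only the local-event half of the HLC algorithm is shown. Cortex's on-disk format and clock handling may differ.

```rust
use std::collections::HashMap;

/// Hybrid Logical Clock timestamp. Deriving `Ord` compares fields in order:
/// wall-clock millis, then logical counter, then device id as a tie-breaker.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Hlc {
    pub wall_ms: u64,
    pub counter: u32,
    pub device: String,
}

impl Hlc {
    /// Timestamp for the next local event, given the physical clock reading.
    pub fn tick(&self, now_ms: u64) -> Hlc {
        if now_ms > self.wall_ms {
            Hlc { wall_ms: now_ms, counter: 0, device: self.device.clone() }
        } else {
            // Physical clock did not advance: bump the logical counter.
            Hlc { wall_ms: self.wall_ms, counter: self.counter + 1, device: self.device.clone() }
        }
    }
}

/// One entry in a device's append-only log. `value: None` is a delete.
#[derive(Debug, Clone)]
pub struct Op {
    pub key: String,
    pub value: Option<String>,
    pub ts: Hlc,
}

/// Last-Writer-Wins merge of several per-device logs into one key/value state.
pub fn merge(logs: &[Vec<Op>]) -> HashMap<String, String> {
    let mut latest: HashMap<&str, &Op> = HashMap::new();
    for op in logs.iter().flatten() {
        // Keep this op only if it is newer than what we have for the key.
        let newer = latest.get(op.key.as_str()).map_or(true, |cur| op.ts > cur.ts);
        if newer {
            latest.insert(op.key.as_str(), op);
        }
    }
    // Drop keys whose winning op is a delete.
    latest
        .into_iter()
        .filter_map(|(k, op)| op.value.clone().map(|v| (k.to_string(), v)))
        .collect()
}
```

Because every device appends only to its own log and the timestamp order is total, replaying the same set of logs on any device yields the same state regardless of the order in which the logs are read.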

    use cortex_core::sync::SyncConfig;
    use cortex_core::types::PrivacyLevel;

    // Enable sync with encryption (settings persist; passphrase goes to the OS keychain)
    let config = SyncConfig::new(sync_dir, device_id, device_name)
        .with_encryption("my-strong-passphrase");
    cortex.enable_sync(config)?;

    // Opt a memory into sync; everything is Private unless you say otherwise
    cortex.set_memory_privacy(mem_id, PrivacyLevel::Shared { scope: "all".into() })?;

    // Pull changes from other devices (also happens automatically

[truncated for AI cost control]