Show HN: Cortex – local-first encrypted memory for AI agents (Rust, MCP)
Cortex is a local-first, encrypted memory engine for AI agents, written in Rust. It features a four-tier memory model (working, episodic, semantic, procedural), a Bayesian belief system, a people graph, and sub-millisecond performance. All data stays on your device: no cloud dependency, zero cost, and an MIT open-source license. The project reports advantages over Mem0 and OpenAI Memory in privacy, latency, features, and cost; the comparison tables and benchmarks below give the figures.
🧠 Try Cortex in your browser — zero install, 124KB WASM, runs entirely client-side.
If Cortex helps your AI remember, give it a ⭐ — it takes 1 second and helps others discover the project.
中文 | 日本語 | 한국어
Private. Free. Local. — Memory engine for personal AI agents.
Your AI's memory lives on your device — your data never leaves, never costs, never spies. Pure Rust. 3.8MB binary. No third-party servers in the data path, zero telemetry, zero cost. Syncs through your own cloud storage. (On-device semantic search downloads a ~30MB model once on first use, then runs fully offline — or go 100% offline with CORTEX_NO_EMBEDDINGS=1. See Security & Privacy.)
Philosophy: Your memories are yours — not a cloud provider's training data, not a startup's monetization asset, not a government's surveillance target. Cortex runs 100% on your hardware, stores everything in your own database, and syncs only through your own cloud storage (iCloud, Google Drive, OneDrive, Dropbox). No middleman ever sees your data. No API key required. No account to create. Just plug it into your AI agent and it remembers — privately, permanently, and at sub-millisecond speed.
LLMs start blank every session. Your assistant forgets your name, your preferences, the conversation you had yesterday, the decision you made last week. Current "memory" solutions are flat text files, keyword grep, or cloud APIs that add 200-500ms latency, charge you for the privilege, and send your personal data to someone else's server.
Cortex fixes this. It gives your AI a structured, queryable, self-evolving long-term memory that persists across sessions, channels, and contexts — with Bayesian beliefs that self-correct, a people graph that resolves identities across platforms, and sub-millisecond performance on everything. All running locally, all yours.
Cortex vs Mem0 vs OpenAI Memory
| Feature | Cortex | Mem0 | OpenAI Memory |
|---|---|---|---|
| Privacy | 100% local, zero cloud | Cloud API (your data on their servers) | OpenAI servers |
| Latency | 156µs ingest, 568µs search | ~200-500ms | ~300-800ms |
| Cost | Free forever, unlimited | $99+/mo (Pro) | ChatGPT Plus ($20/mo) |
| Memory tiers | 4 (Working/Episodic/Semantic/Procedural) | 1 (flat) | 1 (flat) |
| Bayesian beliefs | Self-correcting with evidence | No | No |
| People graph | Cross-channel identity resolution | Paid tier only | No |
| Conversation compression | Automatic session summarization | No | No |
| Relationship inference | Pattern-based (EN + CN) | No | No |
| Temporal retrieval | Intent-aware ("recently" / "first time") | No | No |
| Contradiction detection | Automatic with confidence scores | No | No |
| Consolidation | Episodic → Semantic auto-promotion | No | No |
| Context injection | Token-budgeted LLM-ready output | Manual | Automatic but opaque |
| Import/Export | Full JSON backup & restore | API only | No export |
| Self-hosted | Native binary, Docker, MCP | Cloud only | Cloud only |
| Binary size | 3.8 MB | npm package | N/A |
| Dependencies | 0 runtime services (single binary) | Node.js + cloud | N/A |
| Open source | MIT | Partial | No |
| Encryption | AES-256-GCM encrypted sync (opt-in) | No | No |
| Key rotation | Versioned envelopes, forward secrecy | No | No |
| Privacy levels | Private (default, never syncs) / Shared / Public; per-memory opt-in, demote retracts from other devices | No | No |
| Tool authorization | Deny-by-default capability policy on the MCP surface | No | No |
| Zero telemetry | No analytics, no phone-home, verifiable | Unknown | No |
| Chinese NLP | Native (inference, retrieval, relationships) | No | Limited |
| Namespace isolation | Per-user/context memory separation | No | No |
| Plugin system | Compile-time hooks for ingest/retrieve/consolidation | No | No |
| MCP tools | 30 tools for Claude/LLM integration | 3rd party | N/A |
Performance Benchmarks
| Operation | Cortex | Mem0 (cloud) | File-based |
|---|---|---|---|
| Ingest | 156µs | ~200ms | ~1ms |
| Search (top-10) | 568µs | ~300ms | ~10ms |
| Context generation | 621µs | ~500ms | manual |
| Belief update | 66µs | N/A | N/A |
| People graph | 51µs | paid tier | N/A |
| Structured facts | 45µs | N/A | N/A |
| 1K memories search | 1.6ms | ~500ms | ~50ms |
Top-10 search is about 528x faster than Mem0 cloud (~300ms vs 568µs), with features that neither Mem0 nor OpenAI Memory offers.
Note: Benchmarks include proactive inference (auto-extracting facts, preferences, relationships) on every ingest. Raw ingest without inference is ~15µs. Numbers from cargo bench on M-series Mac.
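To make the headline ratio easy to check, here is a small sketch that recomputes speedups from the figures in the table above. The function names are illustrative, and the Mem0 inputs are the approximate "~" values from the table rather than independent measurements.

```rust
/// Ratio of a baseline latency to a Cortex latency, both in microseconds.
fn speedup(baseline_us: f64, cortex_us: f64) -> f64 {
    baseline_us / cortex_us
}

/// Speedups for three rows of the benchmark table.
/// Inputs are the table's own numbers; the Mem0 values are approximate.
fn table_speedups() -> [(&'static str, f64); 3] {
    [
        ("ingest", speedup(200_000.0, 156.0)),
        ("search (top-10)", speedup(300_000.0, 568.0)),
        ("context generation", speedup(500_000.0, 621.0)),
    ]
}
```

Rounded to whole numbers, the search row gives 528x, which matches the headline figure; the ingest and context-generation rows work out to larger ratios.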
LoCoMo Benchmark (ACL 2024)
Academic-grade long-term conversation memory evaluation — 10 conversations, 1540 QA pairs across 4 categories.
| System | Single-hop | Multi-hop | Open-domain | Temporal | Overall |
|---|---|---|---|---|---|
| Backboard | 89.4% | 75.0% | 91.2% | 91.9% | 90.0% |
| MemMachine v0.2 | - | - | - | - | 84.9% |
| Cortex v1.7 | 72.5% | 59.5% | 88.8% | 74.1% | 73.7% |
| Mem0-Graph | 65.7% | 47.2% | 75.7% | 58.1% | 68.4% |
| Mem0 | 67.1% | 51.2% | 72.9% | 55.5% | 66.9% |
| OpenAI Memory | - | - | - | - | 52.9% |
Key findings:
Open-domain 88.8%: leads Mem0 (72.9%) by 15.9 percentage points
Temporal 74.1%: leads Mem0 (55.5%) by 18.6 percentage points
Single-hop 72.5%: leads Mem0 (67.1%) by 5.4 percentage points
Multi-hop 59.5%: leads Mem0 (51.2%) by 8.3 percentage points
Overall 73.7%: beats Mem0 (66.9%) by 6.8 percentage points and OpenAI Memory (52.9%) by 20.8 percentage points
Cortex outperforms Mem0 on all 4 categories — while running 100% locally, end-to-end encrypted, at $0 cost.
Setup: Claude Sonnet 4 (QA + judge), nomic-embed-text (embeddings via Ollama), top-30 retrieval. Fully reproducible: python3 bench/locomo_bench.py
Architecture
Cortex implements a 4-tier memory model inspired by human cognition:
    +---------------------+
    |   Working Memory    |  Current session context
    +---------------------+
              |
    +---------------------+
    |   Episodic Memory   |  Raw experiences: conversations, events, observations
    +---------------------+
              |  consolidation (decay, promotion, pattern extraction)
    +---------------------+
    |   Semantic Memory   |  Distilled facts, preferences, relationships
    +---------------------+
              |
    +---------------------+
    |  Procedural Memory  |  Learned routines, user-specific workflows
    +---------------------+
Working holds the current session scratch pad. Episodic stores raw experiences with timestamps and source metadata. The Consolidation Engine periodically promotes recurring patterns into Semantic facts and decays stale episodes. Procedural captures learned workflows and routines.
Key Components
People Graph
Cross-channel identity resolution. The same person messaging you on Telegram, emailing you, and showing up in calendar events gets unified into a single identity node. Interactions, relationship strength, and communication patterns are tracked per-person.
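A minimal sketch of how cross-channel identity resolution could be modelled, assuming each (channel, handle) pair maps to exactly one person node. The `Person` and `PeopleGraph` types and their methods are hypothetical and are not Cortex's actual schema or API.

```rust
use std::collections::HashMap;

/// Hypothetical identity node; field names are illustrative only.
#[derive(Debug, Default)]
struct Person {
    display_name: String,
    handles: Vec<(String, String)>, // (channel, handle) pairs
    interactions: u32,
}

/// Hypothetical people graph: a node list plus a handle index.
#[derive(Default)]
struct PeopleGraph {
    people: Vec<Person>,
    by_handle: HashMap<(String, String), usize>,
}

impl PeopleGraph {
    /// Return the node id for (channel, handle), creating a node if unseen.
    fn resolve(&mut self, channel: &str, handle: &str, name: &str) -> usize {
        let key = (channel.to_string(), handle.to_string());
        if let Some(&id) = self.by_handle.get(&key) {
            return id;
        }
        let id = self.people.len();
        self.people.push(Person {
            display_name: name.to_string(),
            handles: vec![key.clone()],
            interactions: 0,
        });
        self.by_handle.insert(key, id);
        id
    }

    /// Attach another (channel, handle) to an existing person.
    /// A handle that is already indexed is left untouched.
    fn link(&mut self, id: usize, channel: &str, handle: &str) {
        let key = (channel.to_string(), handle.to_string());
        if !self.by_handle.contains_key(&key) {
            self.by_handle.insert(key.clone(), id);
            self.people[id].handles.push(key);
        }
    }

    /// Count one interaction for whoever owns this handle, if known.
    fn record_interaction(&mut self, channel: &str, handle: &str) -> Option<u32> {
        let key = (channel.to_string(), handle.to_string());
        let id = *self.by_handle.get(&key)?;
        self.people[id].interactions += 1;
        Some(self.people[id].interactions)
    }
}
```

Once a Telegram handle and an email address are linked to the same node, interactions from either channel accumulate on one person. How Cortex decides that two handles belong together is not described here, so the sketch leaves that decision to an explicit `link` call.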
Bayesian Belief System
Self-correcting understanding of the world. Beliefs are formed from evidence, updated with each new observation, and can be contradicted. Confidence scores reflect actual certainty rather than recency bias.
    cortex.observe_belief("user_prefers_morning_meetings", true, 0.8)?;
    cortex.observe_belief("user_prefers_morning_meetings", false, 0.6)?;
    // Confidence adjusts automatically via Bayesian update
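The exact update rule is not spelled out above. One common way to get this behaviour is a Beta-Bernoulli model with weighted evidence, sketched below. The `Belief` and `BeliefStore` types, the uniform Beta(1, 1) prior, and the reading of the third argument as an evidence weight in [0, 1] are all assumptions for illustration, not Cortex's implementation.

```rust
use std::collections::HashMap;

/// Beta(alpha, beta) pseudo-counts for one boolean belief (assumed model).
#[derive(Debug, Clone, Copy)]
struct Belief {
    alpha: f64, // weighted evidence for "true"
    beta: f64,  // weighted evidence for "false"
}

impl Belief {
    /// Uniform prior: no evidence either way, confidence 0.5.
    fn new() -> Self {
        Belief { alpha: 1.0, beta: 1.0 }
    }

    /// Add one observation, weighted by how strong the evidence is.
    fn observe(&mut self, supports: bool, weight: f64) {
        let w = weight.clamp(0.0, 1.0);
        if supports {
            self.alpha += w;
        } else {
            self.beta += w;
        }
    }

    /// Posterior mean of the Beta distribution.
    fn confidence(&self) -> f64 {
        self.alpha / (self.alpha + self.beta)
    }
}

/// Hypothetical store keyed by belief name.
#[derive(Default)]
struct BeliefStore {
    beliefs: HashMap<String, Belief>,
}

impl BeliefStore {
    /// Record an observation and return the updated confidence.
    fn observe_belief(&mut self, key: &str, supports: bool, weight: f64) -> f64 {
        let b = self.beliefs.entry(key.to_string()).or_insert_with(Belief::new);
        b.observe(supports, weight);
        b.confidence()
    }
}
```

Under these assumptions, the two observations from the snippet above move the confidence from 0.5 to 1.8/2.8 and then to 1.8/3.4. The contradicting observation lowers the confidence without discarding the earlier evidence, so the result depends on accumulated weight rather than on which observation came last.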
Consolidation Engine
Episodic-to-semantic promotion, decay of stale memories, and pattern extraction. Runs as a background cycle that keeps the memory store lean and queryable. Returns a report of what was promoted, decayed, and merged.
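As a rough illustration of one consolidation cycle, the sketch below promotes topics that recur often enough and drops episodes whose decayed salience falls under a floor. The `Episode` and `ConsolidationReport` types, the repeat-count promotion rule, and the half-life decay formula are assumptions; merging is left out to keep the example short.

```rust
use std::collections::BTreeMap;

/// Hypothetical episodic record.
#[derive(Debug, Clone)]
struct Episode {
    topic: String,
    age_days: f64,
    salience: f64,
}

/// What one cycle did (the "merged" part of the real report is omitted).
#[derive(Debug, Default, PartialEq)]
struct ConsolidationReport {
    promoted: Vec<String>,
    decayed: usize,
}

/// Promote topics seen at least `min_repeats` times, then drop episodes
/// whose salience, halved every `half_life_days`, is below `floor`.
fn consolidate(
    episodes: &mut Vec<Episode>,
    min_repeats: usize,
    half_life_days: f64,
    floor: f64,
) -> ConsolidationReport {
    // Count how often each topic recurs across episodes.
    let mut counts: BTreeMap<String, usize> = BTreeMap::new();
    for e in episodes.iter() {
        *counts.entry(e.topic.clone()).or_insert(0) += 1;
    }
    // Topics that recur often enough become semantic facts.
    let promoted: Vec<String> = counts
        .into_iter()
        .filter(|(_, n)| *n >= min_repeats)
        .map(|(topic, _)| topic)
        .collect();

    // Exponential decay: keep only episodes still above the floor.
    let before = episodes.len();
    episodes.retain(|e| e.salience * 0.5_f64.powf(e.age_days / half_life_days) >= floor);

    ConsolidationReport {
        promoted,
        decayed: before - episodes.len(),
    }
}
```

With a 30-day half-life and a floor of 0.2, a 60-day-old episode with salience 0.5 decays to 0.125 and is dropped, while a topic seen three times in the last day is promoted.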
Multi-signal Retrieval
Queries combine five signals for relevance ranking:
Similarity -- vector cosine distance against query embedding
Temporal -- recency weighting with configurable decay
Salience -- importance scoring from access patterns and explicit hints
Social -- boost for memories involving specific people
Channel -- filter or boost by source channel
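The five signals above could be combined as a weighted sum, as in the sketch below. The linear combination, the weight values, the half-life recency curve, and the `Weights` and `Candidate` types are assumptions for illustration; the actual ranking formula is not given here.

```rust
/// Hypothetical per-signal weights.
#[derive(Debug, Clone, Copy)]
struct Weights {
    similarity: f64,
    temporal: f64,
    salience: f64,
    social: f64,
    channel: f64,
}

/// Hypothetical retrieval candidate.
struct Candidate<'a> {
    embedding: &'a [f64],
    age_days: f64,
    salience: f64,         // 0.0..=1.0
    involves_person: bool, // memory mentions the person being asked about
    channel_match: bool,   // memory came from the requested channel
}

/// Cosine similarity; returns 0.0 when either vector has zero length.
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Weighted sum of the five signals.
fn score(query: &[f64], c: &Candidate, w: &Weights, half_life_days: f64) -> f64 {
    // Recency halves every `half_life_days`.
    let recency = 0.5_f64.powf(c.age_days / half_life_days);
    let social = if c.involves_person { 1.0 } else { 0.0 };
    let channel = if c.channel_match { 1.0 } else { 0.0 };
    w.similarity * cosine(query, c.embedding)
        + w.temporal * recency
        + w.salience * c.salience
        + w.social * social
        + w.channel * channel
}
```

A linear blend keeps each signal's contribution easy to inspect and tune. This sketch only boosts on channel; using the channel as a hard filter instead would mean discarding non-matching candidates before scoring.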
Context Injection Protocol
Generates LLM-ready context strings from memory state. Pass a token budget, optional channel/person filters, and get back a structured text block your LLM can consume directly.
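A token-budgeted context block can be built by greedy packing: sort memories by relevance and add each one that still fits. The sketch below assumes a crude whitespace token estimate and a hypothetical `Memory` type; a real implementation would use the target model's tokenizer and could apply the channel and person filters before packing.

```rust
/// Hypothetical memory item with a precomputed relevance score.
struct Memory {
    text: String,
    relevance: f64,
}

/// Crude token estimate: whitespace-separated words (an assumption).
fn estimate_tokens(s: &str) -> usize {
    s.split_whitespace().count()
}

/// Pack the most relevant memories into a bulleted text block
/// without exceeding `token_budget` estimated tokens.
fn build_context(mut memories: Vec<Memory>, token_budget: usize) -> String {
    // Highest relevance first.
    memories.sort_by(|a, b| b.relevance.total_cmp(&a.relevance));

    let mut out = String::new();
    let mut used = 0;
    for m in &memories {
        let line = format!("- {}\n", m.text);
        let cost = estimate_tokens(&line);
        // Skip items that do not fit; smaller ones later may still fit.
        if used + cost > token_budget {
            continue;
        }
        used += cost;
        out.push_str(&line);
    }
    out
}
```

Skipping an oversized item instead of stopping at it lets a short, lower-ranked memory fill the remaining budget.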
Storage
SQLite for persistence, in-memory vector index for fast similarity search. Single-file database, no external services required. Designed for edge deployment -- runs on a laptop, a Raspberry Pi, or a server.
Cloud Sync
Sync memories across devices through your own cloud storage — no third-party server involved.
    Device A (Mac)            Your Cloud Storage            Device B (iPhone)
    ┌───────────┐          ┌──────────────────────┐          ┌──────────┐
    │ SQLite DB │  ──W──>  │ iCloud / GDrive /    │          │          │
    └───────────┘          └──────────────────────┘          └──────────┘
Changelog-based: Each device writes append-only operation logs to its own subfolder
No conflicts: Devices never write to the same file. Merge uses Last-Writer-Wins with Hybrid Logical Clocks
Encrypted: AES-256-GCM encryption (opt-in). Even if your cloud account is compromised, memories stay private
Tamper-evident: the sync manifest and every operation carry an HMAC; tampered or plaintext-injected oplog lines are rejected, and a manifest without integrity protection refuses to load (no key-rollback path)
Key rotation & forward secrecy: rotate to a new key version (ENC2 envelopes) without re-encrypting history; old versions stay readable, new writes are unreadable to a leaked old key
Privacy-aware, per-memory opt-in: Private memories (the default) never leave your device. Mark a memory shared to sync it; demote it back to private and a retraction deletes it from your other devices (local copy kept)
Survives restarts: sync settings persist in the database (passphrase never touches disk — macOS login Keychain or CORTEX_SYNC_PASSPHRASE); the server resumes sync and starts background pull (30s poll + fs watcher) automatically
Supported providers: iCloud Drive, Google Drive, OneDrive, Dropbox (auto-detected).
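The merge rule in the list above, Last-Writer-Wins ordered by Hybrid Logical Clocks, can be sketched as follows. The `Hlc` and `Op` types and the `merge` function are hypothetical and use the textbook HLC update rules with the device id as a tie-breaker; Cortex's on-disk format and API may differ.

```rust
use std::collections::HashMap;

/// Hybrid logical clock timestamp. The derived ordering compares
/// wall time, then counter, then device id as a final tie-breaker.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Hlc {
    wall_ms: u64,
    counter: u32,
    device: u32,
}

impl Hlc {
    /// Advance the clock for a local write at physical time `now_ms`.
    fn tick(self, now_ms: u64) -> Hlc {
        if now_ms > self.wall_ms {
            Hlc { wall_ms: now_ms, counter: 0, ..self }
        } else {
            Hlc { counter: self.counter + 1, ..self }
        }
    }

    /// Advance the clock after seeing a remote timestamp.
    fn observe(self, remote: Hlc, now_ms: u64) -> Hlc {
        let wall = now_ms.max(self.wall_ms).max(remote.wall_ms);
        let counter = if wall == self.wall_ms && wall == remote.wall_ms {
            self.counter.max(remote.counter) + 1
        } else if wall == self.wall_ms {
            self.counter + 1
        } else if wall == remote.wall_ms {
            remote.counter + 1
        } else {
            0
        };
        Hlc { wall_ms: wall, counter, device: self.device }
    }
}

/// One logged operation; `value: None` stands for a delete or retraction.
#[derive(Debug, Clone, PartialEq)]
struct Op {
    key: String,
    value: Option<String>,
    ts: Hlc,
}

/// Apply a device's log to local state, keeping the newest write per key.
fn merge(state: &mut HashMap<String, Op>, log: &[Op]) {
    for op in log {
        let newer = state.get(&op.key).map_or(true, |cur| op.ts > cur.ts);
        if newer {
            state.insert(op.key.clone(), op.clone());
        }
    }
}
```

Because every timestamp is totally ordered and each key keeps only its maximum, applying the per-device logs in any order produces the same state, which is what lets devices write only to their own log files.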
    use cortex_core::sync::SyncConfig;
    use cortex_core::types::PrivacyLevel;

    // Enable sync with encryption (settings persist; passphrase goes to the OS keychain)
    let config = SyncConfig::new(sync_dir, device_id, device_name)
        .with_encryption("my-strong-passphrase");
    cortex.enable_sync(config)?;

    // Opt a memory into sync; everything is Private unless you say otherwise
    cortex.set_memory_privacy(mem_id, PrivacyLevel::Shared { scope: "all".into() })?;

    // Pull changes from other devices (also happens automatically
[truncated for AI cost control]