2026-06-03 10:45 UTC | In-site rewrite | 5 min read | Updated: 2026-06-30 13:03 UTC

Which LLM Memory for AI Agents?

An analysis of top GitHub projects for AI agent memory, covering mem0, MemPalace, and others, with architectural comparisons and strengths/limitations.

Source: Hacker News AI | Author: grigio

Executive Summary

Project Breakdowns

mem0ai/mem0 (⭐57.3k)

MemPalace/mempalace (⭐53.2k)

Lum1104/Understand-Anything (⭐47.8k)

pingcap/tidb (⭐40.1k)

volcengine/OpenViking (⭐25k)

supermemoryai/supermemory (⭐23.5k)

humanlayer/12-factor-agents (⭐22.9k)

rohitg00/agentmemory (⭐20.3k)

memvid/memvid (⭐15.6k)

vectorize-io/hindsight (⭐15.4k)

Cross-Cutting Analysis

Conflict Resolution Taxonomy

Recommendations

Executive Summary

The GitHub memory topic spans 6,187+ public repositories — a sprawling landscape that includes system memory profilers, AI agent memory layers, distributed databases, and knowledge graphs. The top 10, however, tell a more focused story: eight out of ten are AI agent memory projects, a category that barely existed two years ago and now dominates the conversation.

What the top 10 reveal:

AI Agent Memory (8 projects): mem0, MemPalace, Understand-Anything, OpenViking, supermemory, agentmemory, memvid, hindsight

Database/Infrastructure (1): TiDB — a distributed SQL database that has repositioned itself for agentic workloads

Principles/Framework (1): 12-Factor Agents — a methodology, not software

A fundamental architectural divide runs through the ecosystem: embedded/local-first projects (MemPalace, memvid, agentmemory) keep data and inference on-device, while client-server/cloud projects (mem0, supermemory, OpenViking, hindsight) rely on remote infrastructure. A small subset — supermemory on Cloudflare, mem0 on FastAPI+Postgres — leans fully into cloud-native architectures.

Project Breakdowns

mem0ai/mem0 — Universal Memory Layer for AI Agents

| Aspect | Detail |
| --- | --- |
| Stars | ⭐57,257 |
| Language | Python (53%), TypeScript (42%) |
| License | Apache 2.0 |
| Funding | YC S24, $24M raised |
| Latest | May 31, 2026; 326 releases |
| Website | https://mem0.ai |

Overview. mem0 positions itself as a universal memory layer for AI agents, offering multi-level memory (User/Session/Agent), graph memory support, multi-signal retrieval (semantic, BM25, entity), and integrations with over 30 vector stores. It is the best-funded project in the space, with Y Combinator backing and a $24M raise.

Architecture & dependencies. Built on Python 3.9+ with qdrant-client, pydantic, openai, and sqlalchemy at its core. The optional ecosystem is vast: 30+ vector stores (Chroma, Pinecone, Weaviate, Milvus, pgvector, FAISS), 24+ LLM providers, 15+ embedders, and 5 rerankers. Graph memory uses Neo4j 5.x. Self-hosted deployments require FastAPI, PostgreSQL, and Docker.

Strengths.

Top benchmark scores: 91.6 LoCoMo, 94.8 LongMemEval, 64.1 BEAM

Single-pass ADD-only algorithm avoids the complexity of in-place updates

Massive provider ecosystem with no single-vendor lock-in at the infrastructure level

Multi-signal retrieval combining entity linking with temporal reasoning

Rich surface area: MCP server, browser extension, CLI, Python and TypeScript SDKs

Peer-reviewed publication at ECAI 2025

Limitations.

Requires an external LLM (defaults to OpenAI, creating a de facto dependency)

Self-hosted setup is complex — Docker, PostgreSQL, and Neo4j are all prerequisites

The pre-April 2026 algorithm was significantly less capable

Deduplication only activates with infer=True, which is easy to miss

A known issue: silent memory loss when batch embedding partially fails

Graph memory adds meaningful overhead for marginal gain in some use cases

Conflict resolution approach. mem0's architecture is fundamentally ADD-only: memories accumulate and nothing is overwritten. Conflicts are resolved at retrieval time through multi-signal ranking (semantic similarity, BM25, entity matching, temporal recency). The old (pre-April 2026) algorithm used a more traditional pipeline: detection → recency evaluation → explicitness check → merge-or-replace → logging. Graph memory introduces LLM-driven entity/relation extraction with duplicate merging via semantic similarity. Deduplication relies on a cosine-similarity threshold and runs only when infer=True.
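To make the ADD-only idea concrete, here is a minimal sketch of an append-only store that resolves a contradiction at query time instead of at write time. It is not mem0's implementation: the class and function names are hypothetical, plain token overlap stands in for the semantic, BM25, and entity signals, and the 0.7/0.3 weights and one-day half-life are arbitrary assumptions.

```python
import re
from dataclasses import dataclass


def _tokens(text: str) -> set[str]:
    """Lowercased word set; a crude stand-in for real lexical/semantic signals."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass(frozen=True)
class MemoryRecord:
    text: str
    created_at: float  # seconds since an arbitrary epoch; larger means newer


class AppendOnlyMemory:
    """Hypothetical ADD-only store: records are appended, never edited or deleted."""

    def __init__(self) -> None:
        self._records: list[MemoryRecord] = []

    def add(self, text: str, created_at: float) -> None:
        self._records.append(MemoryRecord(text, created_at))

    def search(self, query: str, now: float, half_life: float = 86_400.0) -> list[MemoryRecord]:
        """Rank all records by a blend of token overlap and exponential recency."""
        q_tokens = _tokens(query)

        def score(rec: MemoryRecord) -> float:
            overlap = len(q_tokens & _tokens(rec.text)) / max(len(q_tokens), 1)
            recency = 0.5 ** ((now - rec.created_at) / half_life)
            return 0.7 * overlap + 0.3 * recency  # assumed weights

        return sorted(self._records, key=score, reverse=True)


DAY = 86_400.0
store = AppendOnlyMemory()
store.add("User lives in Berlin", created_at=0.0)
store.add("User lives in Lisbon", created_at=30 * DAY)  # contradicts the first record

ranked = store.search("where does the user live", now=31 * DAY)
best = ranked[0]
```

Both records survive the second `add`; the newer one simply ranks first because the two tie on overlap and differ on recency. The trade-off is that stale facts stay in the store and can resurface if the ranking signals are weak.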

MemPalace/mempalace — Local-First AI Memory System

| Aspect | Detail |
| --- | --- |
| Stars | ⭐53,198 |
| Language | Python (94%) |
| License | MIT |
| Latest | v3.3.6 (May 24, 2026) |
| Website | https://mempalaceofficial.com |

Overview. MemPalace is a local-first AI memory system inspired by the method of loci, a classical mnemonic technique. It stores content verbatim (never summarizes or lossy-compresses) and retrieves via semantic search. A knowledge graph with temporal validity, an AAAK compression index, and an MCP server with 29 tools round out the offering.

Architecture & dependencies. Pure Python 3.9+ with ChromaDB 1.5+, huggingface_hub, and ONNX Runtime. The default multilingual embedding model is ~300 MB, with a 30 MB English-only alternative. The knowledge graph lives in SQLite. All embeddings run locally via ONNX — no API keys required.

Strengths.

Exceptional benchmark results: 96.6% R@5 raw, 98.4% hybrid, 99%+ with LLM reranking

Truly local-first: zero external API calls by default, no telemetry

Verbatim storage guarantee — never summarizes or applies lossy compression

Knowledge graph with temporal validity windows for time-aware queries

Living memory dynamics: Hebbian potentiation (strengthening frequently accessed paths) and Ebbinghaus decay (fading unused memories)

MIT license — completely free, no SaaS fees or vendor lock-in
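The "living memory dynamics" item can be illustrated with the textbook exponential forgetting curve, R = exp(-t / S), where t is the time since last access and S is a stability term. The sketch below is an assumption about how such dynamics might be modeled, not MemPalace's actual formulas: the `MemoryTrace` class, the multiplicative `boost`, and the day-based units are all hypothetical.

```python
import math


class MemoryTrace:
    """Hypothetical memory trace with Ebbinghaus-style decay and access-based strengthening."""

    def __init__(self, stability_days: float = 1.0) -> None:
        self.stability_days = stability_days  # larger means slower forgetting
        self.last_access_day = 0.0

    def retention(self, today: float) -> float:
        """Exponential forgetting curve: exp(-elapsed / stability), in (0, 1]."""
        elapsed = max(today - self.last_access_day, 0.0)
        return math.exp(-elapsed / self.stability_days)

    def access(self, today: float, boost: float = 1.5) -> None:
        """Loosely Hebbian: each access increases stability and resets the clock."""
        self.stability_days *= boost
        self.last_access_day = today


trace = MemoryTrace()
before = trace.retention(today=1.0)   # one day after creation, never accessed
trace.access(today=1.0)               # strengthen the trace
after = trace.retention(today=2.0)    # one day after the access
```

One day of neglect costs less after the access than before it, which is the behavior the bullet describes: frequently used memories fade more slowly than unused ones.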

Limitations.

Beta status (PyPI classifier Development Status :: 4 - Beta); not yet production-hardened

ChromaDB dependency introduces fragility: HNSW segment corruption and Windows deadlocks have been reported

The default embedding model is large (~300 MB)

Primarily designed for Claude Code; integrations with other tools are less mature

The palace/wing/room/drawer conceptual model has a steep learning curve

Historical silent data loss issues (now fixed, but trust takes time)

Multi-hour rebuild times on large palaces

Conflict resolution approach. Every fact carries explicit valid_from/valid_to timestamps; as_of queries return state at any point in time. Cosine-similarity deduplication (default threshold 0.15) keeps the longest version of near-duplicate entries. Entity disambiguation uses context pattern matching for ambiguous names (e.g., "Apple" the company vs. "apple" the fruit). An entity registry priority system resolves conflicts by source: Onboarding (1.0) > Learned (0.75) > Researched > Wiki Cache. Repair tools include HNSW rebuild, SQLite recovery, segment quarantine, and truncation guards. Operations are designed to be idempotent with deterministic IDs, atomic writes, and triple deduplication.
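The temporal-validity idea can be sketched in a few lines. This is an illustration under stated assumptions rather than MemPalace's API: the `Fact` dataclass and the `as_of` function are hypothetical, validity windows are treated as half-open intervals (valid_from inclusive, valid_to exclusive), and a missing valid_to means the fact is still current.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: int           # e.g. a year, day number, or Unix timestamp
    valid_to: Optional[int]   # None means "still valid"


def as_of(facts: list[Fact], subject: str, predicate: str, when: int) -> Optional[Fact]:
    """Return the fact whose validity window contains `when`, or None."""
    for fact in facts:
        if fact.subject != subject or fact.predicate != predicate:
            continue
        ends = fact.valid_to if fact.valid_to is not None else float("inf")
        if fact.valid_from <= when < ends:
            return fact
    return None


facts = [
    Fact("alice", "works_at", "Acme", valid_from=2020, valid_to=2023),
    Fact("alice", "works_at", "Globex", valid_from=2023, valid_to=None),
]

in_2021 = as_of(facts, "alice", "works_at", 2021)
in_2023 = as_of(facts, "alice", "works_at", 2023)
in_2019 = as_of(facts, "alice", "works_at", 2019)
```

Because the old fact is closed off rather than deleted, the two statements about Alice's employer never conflict: each is true for its own window, and the query decides which window applies.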

Lum1104/Understand-Anything — Interactive Knowledge Graphs

| Aspect | Detail |
| --- | --- |
| Stars | ⭐47,847 |
| Language | TypeScript (70%) |
| License | MIT |
| Latest | v2.7.3 (May 19, 2026) |
| Website | https://understand-anything.com |

Overview. A plugin for AI coding assistants that transforms codebases into interactive knowledge graphs. It uses a multi-agent pipeline combining Tree-sitter (deterministic structural analysis) with LLM-based semantic enrichment. Strictly speaking, this is not a memory system — it is a codebase-understanding tool that uses knowledge graphs as a memory substrate.

Architecture & dependencies. Node.js 22+, pnpm 10+, web-tree-sitter (WASM) with grammars for 10+ languages. The dashboard uses React 19, @xyflow/react, graphology, d3-force, and dagre. Search relies on Fuse.js, validation on Zod.

Strengths.

Compatible with 15+ AI coding platforms (Claude Code, Cursor, Codex, Copilot)

Tree-sitter + LLM hybrid gives you a deterministic structural graph with semantic enrichment on top

Incremental updates via fingerprint-based change detection — no full rebuilds

Guided tours, diff impact analysis, business domain mapping, persona-adaptive UI

Shareable as plain JSON — team-friendly without proprietary formats

MIT license

Limitations.

Requires Node.js 22+, which is too new for some enterprise environments

LLM dependency for the semantic layer adds latency and API costs

pnpm-specific; not compatible with npm or yarn

Very new (March 2026) — limited real-world track record

No formal conflict resolution for contradictory LLM claims

Large graphs require git-lfs

Conflict resolution approach. The deterministic structural layer (Tree-sitter) produces identical output for identical input — no conflicts are possible at this level. A graph-reviewer agent validates completeness and referential integrity. Zod schema validation catches malformed graphs on load. The LLM semantic layer is purely advisory: annotations are overlaid on the deterministic skeleton, and the latest pass simply overwrites the previous one. Fingerprint-based change detection preserves validated edges during incremental updates. For accumulated issues, a --full flag triggers a complete re-analysis from scratch.
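Fingerprint-based change detection usually boils down to hashing each file and comparing digests between runs. The sketch below shows that general technique in Python for consistency with the other examples; the project itself is TypeScript, and the function names, the choice of SHA-256, and the file paths are assumptions for illustration only.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Content hash used as a cheap 'has this file changed?' check."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def changed_files(
    previous: dict[str, str], current_sources: dict[str, str]
) -> tuple[set[str], set[str]]:
    """Return (paths to re-analyze, paths removed since the last run).

    `previous` maps path -> stored fingerprint; `current_sources` maps
    path -> current file content.
    """
    current = {path: fingerprint(src) for path, src in current_sources.items()}
    stale = {path for path, digest in current.items() if previous.get(path) != digest}
    removed = set(previous) - set(current)
    return stale, removed


# Fingerprints saved by an earlier (hypothetical) analysis run.
old = {
    "a.ts": fingerprint("export const a = 1"),
    "b.ts": fingerprint("export const b = 2"),
    "d.ts": fingerprint("export const d = 0"),
}
# Current working tree: a.ts unchanged, b.ts edited, c.ts new, d.ts deleted.
new_sources = {
    "a.ts": "export const a = 1",
    "b.ts": "export const b = 3",
    "c.ts": "export const c = 4",
}

stale, removed = changed_files(old, new_sources)
```

Only the paths in `stale` need a fresh pass; graph nodes and edges derived from unchanged files can be kept as they are, which is what makes the update incremental.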

pingcap/tidb — Distributed SQL Database for Agentic Workloads

| Aspect | Detail |
| --- | --- |
| Stars | ⭐40,122 |
| Language | Go (94.5%) |
| License | Apache 2.0 |
| Latest | v8.5.6 (April 14, 2026) |
| Website | https://pingcap.com/tidb |

Overview. A cloud-native distributed SQL database with MySQL compatibility, ACID transactions, HTAP (hybrid transactional/analytical processing), vector search via HNSW, and database branching designed for AI agents. TiDB can serve all four agent memory layers — short-term, episodic, procedural, semantic — in a single system.

Architecture & dependencies. Go runtime, Bazel build system. Storage is handled by TiKV (RocksDB + Raft consensus), with TiFlash providing columnar storage via Raft Learner nodes. A Placement Driver (PD) manages TSO timestamp allocation and load balancing. Kubernetes is the standard deployment mechanism via TiDB Operator.

Strengths.

MySQL compatible — a drop-in replacement for many existing workloads

HTAP eliminates the need for separate ETL pipelines

True horizontal scaling with separated compute and storage

AI-native features: database branching, durable agent state, vector search

ACID distributed transactions via Raft + Percolator two-phase commit

Generous free serverless tier: 5 GB storage, 50M RUs per month

Limitations.

MySQL compatibility is not 100% — limited stored procedures and triggers

Distributed system complexity requires significant operational expertise

Dedicated clusters are expensive at scale

Vector search is still in beta

Self-hosted deployments need a minimum of 3 PD + 3 TiKV nodes

Conflict resolution approach. Raft consensus requires a majority quorum for every write (for example, 2 of 3 replicas). Percolator two-phase commit uses TSO timestamps for global ordering. MVCC is implemented across three TiKV column families (lock, default for row data, and write) and provides Snapshot Isolation. Both optimistic and pessimistic transaction modes are available, configurable per workload. Region splitting and rebalancing are handled automatically by the Placement Driver.
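Two of these mechanisms are easy to show in miniature: the majority quorum size and a timestamp-ordered snapshot read. The toy model below is not TiDB or TiKV code; `majority` and `VersionedStore` are hypothetical names, and the store ignores locks, two-phase commit, and distribution entirely.

```python
from typing import Optional


def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


class VersionedStore:
    """Toy multi-version store: every write is kept under its commit timestamp."""

    def __init__(self) -> None:
        self._versions: dict[str, list[tuple[int, str]]] = {}

    def write(self, key: str, value: str, commit_ts: int) -> None:
        versions = self._versions.setdefault(key, [])
        versions.append((commit_ts, value))
        versions.sort()  # keep versions ordered by commit timestamp

    def read(self, key: str, start_ts: int) -> Optional[str]:
        """Snapshot read: newest version committed at or before start_ts."""
        visible = [value for ts, value in self._versions.get(key, []) if ts <= start_ts]
        return visible[-1] if visible else None


store = VersionedStore()
store.write("agent:42:state", "planning", commit_ts=10)
store.write("agent:42:state", "executing", commit_ts=20)

snapshot_at_15 = store.read("agent:42:state", start_ts=15)
snapshot_at_25 = store.read("agent:42:state", start_ts=25)
snapshot_at_5 = store.read("agent:42:state", start_ts=5)
```

A reader that started at timestamp 15 keeps seeing "planning" even after the later write commits, which is the essence of snapshot isolation: readers never block on, or observe, writes newer than their snapshot.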

volcengine/OpenViking — Context Database for AI Agents

| Aspect | Detail |
| --- | --- |
| Stars | ⭐24,987 |
| Language | Python (primary) |
| License | AGPL-3.0 |
| Latest | v0.3.22 (May 29, 2026) |
| Website | https://openviking.ai |

Overview. OpenViking is an open-source context database for AI agents that organizes memories, resources, and skills hierarchically using a filesystem paradigm (viking:// URIs). Its tiered context loading system (L0/L1/L2) is designed for token efficiency, and recursive directory retrieval makes it easy to pull in related context.
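A rough sketch of how tiered loading can save tokens follows. The source does not define the tiers, so this assumes L0, L1, and L2 are progressively longer renderings of the same node (for instance a one-line abstract, an overview, and the full text). The `ContextNode` type, the whitespace token estimate, the greedy upgrade policy, and the viking:// paths are all hypothetical and not taken from OpenViking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextNode:
    uri: str
    tiers: tuple[str, str, str]  # (L0, L1, L2) text, shortest to longest


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: count whitespace-separated words."""
    return len(text.split())


def load_context(nodes: list[ContextNode], budget: int) -> dict[str, str]:
    """Give every node its L0 text, then upgrade tiers while the budget allows.

    Nodes are assumed to be ordered by relevance, most relevant first.
    L0 is always included, even if the L0 texts alone exceed the budget.
    """
    chosen = {node.uri: 0 for node in nodes}
    used = sum(estimate_tokens(node.tiers[0]) for node in nodes)
    for node in nodes:
        while chosen[node.uri] < 2:
            level = chosen[node.uri]
            extra = estimate_tokens(node.tiers[level + 1]) - estimate_tokens(node.tiers[level])
            if used + extra > budget:
                break
            chosen[node.uri] = level + 1
            used += extra
    return {node.uri: node.tiers[chosen[node.uri]] for node in nodes}


nodes = [
    ContextNode("viking://memories/a", ("a", "a b c", "a b c d e f")),
    ContextNode("viking://memories/b", ("x", "x y z", "x y z w v u")),
]
context = load_context(nodes, budget=8)
```

With a budget of 8 tokens, the most relevant node is expanded to its full L2 text while the second stays at its L0 abstract, so the prompt carries detail only where it is most likely to matter.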

Architecture & dependencies. Python 3.10+, Rust toolchain (Cargo), C++ compiler (GCC 9+). Supports VLM providers including Volcengine Doubao, OpenAI, Kimi, GLM, Gemini, and Ollama. Embedding providers span 13+ options. Deployment via Docker or Helm on Kubernetes.

Strengths.

Up to 91% token reduction with 3.39x accuracy improvement over baseline

Sub-0.2s retrieval latency

Observable retrieval trajectories make RAG debuggable

Self-evolving memory through automatic session management

The filesystem paradigm is intuitive for developers

Published at VLDB 2026 (peer-reviewed)

Limitations.

Very early stage (v0.3.x, created January 2026)

Complex setup requiring Python, Rust, C++, and model provider accounts

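AGPL-3.0 is a strong copyleft license: anyone who offers a modified version over a network must publish its source, which can rule it out for proprietary products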
