Show HN: MemoryOps – governed memory infrastructure for AI assistants
MemoryOps is an enterprise-shaped, loop-engineered memory governance layer for AI assistants. It implements a governed memory lifecycle with capture, policy evaluation, typed storage, hybrid retrieval, controlled forgetting, auditability, and tenant isolation, treating memory as a governed decision system rather than a simple database.
MemoryOps AI is an enterprise-shaped, loop-engineered memory governance layer for AI assistants. It implements a ChatGPT-style memory lifecycle with capture, policy evaluation, typed storage, hybrid retrieval, controlled forgetting, auditability, and tenant isolation.
Most demos treat memory as a vector database. MemoryOps AI treats memory as governed state.
Tagline: Enterprise memory governance for AI assistants.
Core claim: Memory is not a database. Memory is a governed decision system that decides what information is valuable enough to carry into the future.
Why this exists
Most AI "memory" demos do this:
chat message → vector database → retrieve later
MemoryOps AI does this:
WRITE PATH Message → Extractor → Evaluator / Policy Broker → Write Service → Typed Memory Stores → Audit Log
READ PATH Message → Retriever → Ranker → Context Composer → Response LLM
BACKGROUND Decay Job → Reflection Agent → Conflict Resolver → Compression Worker
CROSS-CUTTING PLANES Security · Governance · Observability · Evaluation · Reliability
The five verbs the system must demonstrate:
Capture → Store → Retrieve → Update → Forget (Governance wraps all five)
flowchart LR
  M["chat message"] --> GW["Gateway"]
  GW --> EX["Extractor"] --> PB["Policy Broker"] --> WS["Write Service"] --> ST[("Typed Store")]
  GW --> RT["Retriever"] --> RK["Ranker"] --> CC["Context Composer"] --> RESP["Response"]
  PB --> AUD[["Audit Log (append-only)"]]
  WS --> AUD
  ST -. background .-> BG["Decay · Reflection · Conflict · Compression"]
More diagrams (system architecture, lifecycle state machine, request sequence) are in docs/architecture.md.
Enterprise invariants
These are non-negotiable and are enforced in code and tests.
Tenant isolation — User A's memory is never returned to User B or another tenant.
Deletion guarantee — Deleted memories are never retrieved again.
Provenance — Every stored memory traces back to its source message/document/manual input.
Graceful degradation — Retrieval failure never blocks response generation.
Policy-before-storage — Unsafe / secret-like content is filtered before it reaches the store.
Temporary chat — Temporary sessions never write or retrieve memory.
Auditability — Every memory lifecycle event produces an append-only audit event.
Explainability — The system can show which memories affected a response.
Typed memory — Episodic, semantic, procedural, project, knowledge, system memories differ.
Evaluation — Memory quality is testable through a golden set, not just manual inspection.
See docs/architecture.md for the full design and where each invariant is enforced.
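Three of the invariants above (tenant isolation, deletion guarantee, temporary chat) can be illustrated with a minimal in-memory sketch. The `Memory` and `InMemoryStore` names and their fields are hypothetical, chosen for illustration; they are not taken from the project's actual repository code.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    memory_id: str
    tenant_id: str
    user_id: str
    text: str
    source_message_id: str  # provenance: the message this memory came from
    deleted: bool = False


@dataclass
class InMemoryStore:
    """Hypothetical store used only to illustrate the invariants."""
    _items: dict = field(default_factory=dict)

    def write(self, memory: Memory, temporary: bool = False) -> bool:
        # Temporary chat: never write memory.
        if temporary:
            return False
        self._items[memory.memory_id] = memory
        return True

    def delete(self, tenant_id: str, user_id: str, memory_id: str) -> None:
        m = self._items.get(memory_id)
        # Only the owning tenant/user may delete; others are ignored.
        if m is not None and (m.tenant_id, m.user_id) == (tenant_id, user_id):
            m.deleted = True

    def retrieve(self, tenant_id: str, user_id: str, temporary: bool = False) -> list:
        # Temporary chat: never retrieve memory.
        if temporary:
            return []
        # Tenant isolation and deletion guarantee are applied on every read.
        return [
            m for m in self._items.values()
            if m.tenant_id == tenant_id and m.user_id == user_id and not m.deleted
        ]
```

Applying the filters inside `retrieve` rather than in callers means no read path can skip them, which is the property the invariant tests would need to check.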
Repository layout
memoryops-ai/
  apps/web/              Next.js frontend (chat, memories, governance, audit, loops, admin, architecture)
  services/api/          FastAPI backend (gateway, extractor, policy broker, write/read path, audit)
  services/worker/       Background jobs (decay, reflection, conflict resolution, compression)
  packages/shared/       Shared types
  infra/db/              Postgres + pgvector migrations and seed
  infra/adr/             Architecture Decision Records
  infra/observability/   OpenTelemetry / metrics notes
  evals/                 Golden + adversarial cases and the eval runner
  docs/                  architecture, security, governance, rollout, demo-script
  docker-compose.yml
Quickstart
Option A — API only, no infra (fastest)
The API ships with an in-memory repository so you can run the write path and tests without Postgres.
cd services/api
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
export MEMORYOPS_STORAGE=memory # default; uses in-memory store
uvicorn app.main:app --reload --port 8000
open http://localhost:8000/docs
Run the invariant test suite:
cd services/api
pip install -r requirements-dev.txt
pytest -q
Run the eval harness against a running API (or in-process):
cd evals
python run_evals.py
Option B — Full stack with Docker Compose
cp .env.example .env
docker compose up --build
web → http://localhost:3000
api → http://localhost:8000/docs
db → localhost:5432 (postgres/pgvector)
redis → localhost:6379
Compose runs migrations from infra/db/migrations on first boot and sets MEMORYOPS_STORAGE=postgres for the API.
Embeddings (v0.3)
Retrieval uses a swappable embedding provider. The default is a deterministic, offline stub — no API key required — so tests and demos are reproducible.
export MEMORYOPS_EMBEDDING_PROVIDER=stub # default; deterministic, no key
optional real embeddings:
export MEMORYOPS_EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=sk-...
export OPENAI_EMBEDDING_MODEL=text-embedding-3-small
An unconfigured or failing provider degrades to the stub, and a query-embedding failure degrades retrieval to keyword-only (retrieval_mode="fallback").
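The degradation chain described above can be sketched as follows. This is an illustrative reading of the behavior, not the project's code: `stub_embed`, `embed_query`, the 8-dimensional vector, and the token-hashing scheme are all assumptions made for the example.

```python
import hashlib
import math

DIM = 8  # assumed dimension, for illustration only


def stub_embed(text: str, dim: int = DIM) -> list[float]:
    """Deterministic offline embedding: hash each token into a bucket, then L2-normalize."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def embed_query(text: str, provider=None, fallback=stub_embed):
    """Return (vector, retrieval_mode).

    An unconfigured (None) or failing provider degrades to the fallback embedder.
    If no embedding can be produced at all, the caller gets no vector and should
    run keyword-only retrieval (retrieval_mode="fallback").
    """
    for embed in (provider, fallback):
        if embed is None:
            continue
        try:
            return embed(text), "hybrid"
        except Exception:
            continue
    return None, "fallback"
```

Because the stub depends only on a hash of the input, the same text always maps to the same vector, which is what makes tests and demos reproducible without a network call.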
LLM provider adapters (v0.4)
Extraction and conflict detection run through a provider-neutral LLM layer (app/llm/). The default is a deterministic, offline stub — no API key — so behavior is reproducible and tests never touch the network. Optional OpenAI, Anthropic, and Gemini adapters are used only when their key is set.
export MEMORYOPS_LLM_PROVIDER=stub # default; deterministic, no key
optional real providers (used only when the key is present):
export MEMORYOPS_LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY=... ANTHROPIC_MODEL=claude-haiku-4-5-20251001
also: openai (OPENAI_API_KEY/OPENAI_MODEL), gemini (GEMINI_API_KEY/GEMINI_MODEL)
export MEMORYOPS_LLM_FALLBACK_TO_HEURISTIC=true # invalid JSON / failure → heuristic
LLM output is advisory: the deterministic policy broker runs after extraction and stays authoritative — a model can never override policy, and secret-like content is still blocked. See docs/provider-llm-adapters.md, docs/structured-memory-intelligence.md, and ADR-008.
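A minimal sketch of the advisory-LLM pattern with heuristic fallback might look like the following. The function names, the candidate schema (a JSON list of objects with a `text` string), and the trigger phrases in the heuristic are assumptions for illustration; the real extractor and its schema live in app/llm/ and may differ. In either branch the returned candidates would still go through the policy broker before anything is stored.

```python
import json


def heuristic_extract(message: str) -> list[dict]:
    """Hypothetical deterministic fallback: keep sentences that state a preference."""
    candidates = []
    for sentence in message.split("."):
        s = sentence.strip()
        if s.lower().startswith(("i prefer", "i like", "my ")):
            candidates.append({"text": s, "type": "semantic", "source": "heuristic"})
    return candidates


def extract_candidates(message: str, llm_call=None, fallback_to_heuristic: bool = True) -> list[dict]:
    """Try the LLM first; on invalid JSON or any provider failure, use the heuristic."""
    if llm_call is not None:
        try:
            parsed = json.loads(llm_call(message))
            valid = isinstance(parsed, list) and all(
                isinstance(c, dict) and isinstance(c.get("text"), str) for c in parsed
            )
            if not valid:
                raise ValueError("structured output failed validation")
            return [{**c, "source": "llm"} for c in parsed]
        except Exception:
            if not fallback_to_heuristic:
                raise
    return heuristic_extract(message)
```

With `fallback_to_heuristic=True` (mirroring `MEMORYOPS_LLM_FALLBACK_TO_HEURISTIC=true`), a malformed model response never surfaces as a chat error.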
Verify enforced Row-Level Security against a running Postgres:
python scripts/check_rls_policies.py # SKIPs cleanly if no DB is reachable
Frontend
cd apps/web
npm install
npm run dev # http://localhost:3000
The frontend reads NEXT_PUBLIC_API_URL (defaults to http://localhost:8000).
Deployment — Railway only (v0.3.2)
MemoryOps deploys to Railway only. There is no Vercel path. One Railway project (memoryops-ai) runs five services:
| Service | Role | Source |
| --- | --- | --- |
| memoryops-web | Next.js frontend | apps/web/Dockerfile |
| memoryops-api | FastAPI backend | services/api/Dockerfile |
| memoryops-worker | Background loops | services/worker/Dockerfile |
| Railway Postgres | Store + pgvector | plugin |
| Railway Redis | Queue / cache | plugin |
Build/deploy is config-as-code under railway/. Docs:
docs/deployment/railway.md — topology, order, rollback
docs/deployment/railway-env.md — env var matrix
docs/deployment/railway-smoke-test.md — post-deploy checks
Post-deploy verification:
python scripts/railway_smoke_test.py \
  --api-url https://memoryops-api.up.railway.app \
  --web-url https://memoryops-web.up.railway.app
What works today (Phase 0 + Phase 1)
Full design spine: README, architecture/security/governance/rollout docs, 5 ADRs, DB schema.
FastAPI write path: Gateway → Extractor → Policy Broker → Write Service → Memory Store → Audit.
Heuristic extractor + policy broker (works with no API keys); pluggable LLM adapter interface.
Typed memory classification, importance/confidence/sensitivity scoring, provenance capture.
Policy decisions: SAVE, PENDING_APPROVAL, BLOCK, DROP_LOW_UTILITY, UPDATE_EXISTING, MERGE_WITH_EXISTING.
Secret / PII detection blocks API keys and credentials before storage.
Append-only audit log for every lifecycle event.
Temporary chat short-circuits both read and write.
Memory dashboard + admin/audit + architecture pages (frontend skeleton).
Invariant test suite + eval harness scaffolding.
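To make the policy-before-storage step in the list above concrete, here is a hedged sketch of a deterministic broker. The regex patterns, the threshold values, and the `evaluate` signature are assumptions for illustration; the project's real rules are not shown in this README. `UPDATE_EXISTING` and `MERGE_WITH_EXISTING` are omitted because they depend on comparing against stored memories.

```python
import re
from enum import Enum


class Decision(str, Enum):
    SAVE = "SAVE"
    PENDING_APPROVAL = "PENDING_APPROVAL"
    BLOCK = "BLOCK"
    DROP_LOW_UTILITY = "DROP_LOW_UTILITY"


# Illustrative patterns only; a real detector would need a much broader rule set.
SECRET_PATTERNS = [
    re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),                      # API-key-like token
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),                         # AWS access key ID shape
    re.compile(r"(?i)\b(password|passwd|secret)\s*[:=]\s*\S+"),  # inline credential
]


def evaluate(text: str, importance: float, sensitivity: float,
             min_importance: float = 0.3, approval_sensitivity: float = 0.7) -> Decision:
    """Decide what happens to a candidate memory before it reaches the store."""
    # Secrets are blocked first, regardless of how important the text looks.
    if any(p.search(text) for p in SECRET_PATTERNS):
        return Decision.BLOCK
    if importance < min_importance:
        return Decision.DROP_LOW_UTILITY
    if sensitivity >= approval_sensitivity:
        return Decision.PENDING_APPROVAL
    return Decision.SAVE
```

The ordering matters: the secret check runs before any scoring, so a high importance score (for example, one suggested by an LLM) cannot promote secret-like content into storage.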
Loop Engineering Layer (v0.3.1)
MemoryOps models memory as a set of governed loops rather than a passive store.
The core loops are:
Memory Write Loop
Memory Read Loop
Governance Loop
Evaluation Loop
Release Gate Loop
Continuous Learning Loop
Each loop has explicit states, policy gates, audit events, fallback behavior, and evidence requirements. Loop definitions live in services/api/app/loops/, loop runs/events are exposed through /api/loops, and the frontend includes a Loops page.
See docs/loop-engineering.md, docs/loop-contracts.md, and docs/release-loop.md.
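As a rough illustration of what "explicit states, policy gates, audit events, fallback behavior" could mean in code, the sketch below models one loop run as a small state machine. The state names, the transition table, and the `LoopRun` / `run_loop` names are hypothetical; the actual loop contracts are defined in the docs referenced above.

```python
from dataclasses import dataclass, field
from enum import Enum


class LoopState(str, Enum):
    PENDING = "pending"
    GATED = "gated"
    RUNNING = "running"
    COMPLETED = "completed"
    REJECTED = "rejected"
    FAILED = "failed"


# Allowed transitions; anything else is rejected as illegal.
ALLOWED = {
    LoopState.PENDING: {LoopState.GATED},
    LoopState.GATED: {LoopState.RUNNING, LoopState.REJECTED},
    LoopState.RUNNING: {LoopState.COMPLETED, LoopState.FAILED},
}


@dataclass
class LoopRun:
    name: str
    state: LoopState = LoopState.PENDING
    events: list = field(default_factory=list)  # append-only audit trail

    def transition(self, new_state: LoopState, reason: str) -> None:
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state.value} -> {new_state.value}")
        self.events.append({"loop": self.name, "from": self.state.value,
                            "to": new_state.value, "reason": reason})
        self.state = new_state


def run_loop(run: LoopRun, policy_gate, step) -> LoopRun:
    """Drive one run: the policy gate is consulted before the step executes."""
    run.transition(LoopState.GATED, "submitted to policy gate")
    if not policy_gate():
        run.transition(LoopState.REJECTED, "policy gate denied")
        return run
    run.transition(LoopState.RUNNING, "policy gate passed")
    try:
        step()
    except Exception as exc:
        # Fallback behavior: record the failure instead of propagating it.
        run.transition(LoopState.FAILED, f"step failed: {exc}")
    else:
        run.transition(LoopState.COMPLETED, "step succeeded")
    return run
```

Every state change appends an event, so the event list doubles as the evidence trail for the run.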
Token Compression Layer (v0.2.1)
MemoryOps supports an optional Headroom-powered context compression layer. Compression runs after policy checks, governance filtering, and context composition, and only on the composed context block — never the raw user message and never before the policy broker. It reduces tokens sent to the LLM while preserving MemoryOps invariants (provenance, deletion guarantee, tenant isolation, temporary-chat behavior, explainability metadata).
It is off by default and not a dependency — the app runs without headroom-ai installed, and any compression failure degrades safely to the uncompressed context.
pip install "headroom-ai[all]" # optional
export MEMORYOPS_CONTEXT_COMPRESSION=headroom # default: none
Each chat response carries a compression block with estimated tokens saved and the compression ratio. See docs/token-compression.md, docs/integrations/headroom.md, and ADR-007. Headroom is Apache-2.0; MemoryOps integrates it via an adapter and does not vendor its source.
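The safe-degradation behavior can be sketched without depending on Headroom itself. In the example below the compressor is passed in as a plain callable, so nothing here reflects Headroom's real API; `compress_context`, `estimate_tokens`, the result keys, and the 4-characters-per-token estimate are all assumptions for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate, assuming about 4 characters per token."""
    return (len(text) + 3) // 4


def compress_context(context: str, compressor=None) -> dict:
    """Compress an already-composed context block, degrading to the original on any failure."""
    original = estimate_tokens(context)
    uncompressed = {"text": context, "applied": False, "tokens_saved": 0, "ratio": 1.0}
    if compressor is None or original == 0:
        return uncompressed
    try:
        compressed = compressor(context)
    except Exception:
        # Compression failure must never block the response.
        return uncompressed
    after = estimate_tokens(compressed)
    if after >= original:
        # No savings: keep the original context.
        return uncompressed
    return {
        "text": compressed,
        "applied": True,
        "tokens_saved": original - after,
        "ratio": after / original,
    }
```

Here `ratio` is defined as compressed tokens divided by original tokens, so lower is better; the project may define its compression ratio differently.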
What works as of v0.3 (real data layer)
Swappable embedding provider (app/embeddings/): deterministic offline stub + optional OpenAI.
Hybrid retrieval: pgvector cosine (search_candidates) + keyword overlap, blended by the ranker.
Per-memory score_breakdown + response retrieval_mode (hybrid / fallback / none).
Enforced Postgres Row-Level Security (migration 004, FORCE + tenant policy + session GUC).
Expanded evals (semantic / keyword / archived / score-breakdown) + new tests; RLS test is DB-guarded.
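The hybrid blend and per-memory `score_breakdown` listed above could look roughly like this. The weights (0.6 / 0.4), the candidate dictionary shape, and the `rank` signature are assumptions for the sketch; in the real system the vector side comes from pgvector rather than a Python cosine function, and the `none` retrieval mode is not modeled here.

```python
import math


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of query tokens that also appear in the memory text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def rank(query: str, query_vec, candidates: list, w_vec: float = 0.6, w_kw: float = 0.4):
    """Return (retrieval_mode, ranked results with a per-memory score breakdown)."""
    # Without a query vector, degrade to keyword-only scoring.
    mode = "hybrid" if query_vec is not None else "fallback"
    ranked = []
    for c in candidates:
        kw = keyword_overlap(query, c["text"])
        if mode == "hybrid":
            vec = cosine(query_vec, c["embedding"])
            score = w_vec * vec + w_kw * kw
        else:
            vec = None
            score = kw
        ranked.append({"id": c["id"], "score": score,
                       "score_breakdown": {"vector": vec, "keyword": kw}})
    ranked.sort(key=lambda r: r["score"], reverse=True)
    return mode, ranked
```

Returning the breakdown alongside the final score is what lets a response explain which signal (semantic or keyword) pulled a given memory into context.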
What works as of v0.4 (provider LLM adapters)
Provider-neutral LLM layer (app/llm/): deterministic StubProvider default + optional OpenAI/Anthropic/Gemini adapters, selected by MEMORYOPS_LLM_PROVIDER.
Structured memory intelligence: schema-validated extraction + minimal conflict detection, with prompt registry and deterministic heuristic fallback.
Invalid JSON / provider failure / timeout degrades to the heuristic and never blocks chat; LLM output is advisory and cannot override the policy broker.
New observability events (llm_provider_call, llm_provider_failure, structured_output_invalid, llm_fallback_used, memory_extraction_structured, conflict_detection_result) + structured/conflict evals; tests need no API keys.
What works as of v0.5 (governance UI + memory control plane)
Browser control plane over the governed lifecycle: /memories (filterable inventory), /memories/[id] (detail + provenance + per-memory audit timeline + inline edit), /governance (approval queue + recorded policy decisions), /audit (tenant-wide append-only history).
Additive read routes: GET /api/memories/{id}, /{id}/provenance, /{id}/audit, plus a memory_id filter on /api/audit. Approve/reject/edit/archive/restore/delete reuse the existing PATCH/DELETE routes; every action is audited and the policy broker stays authoritative.
Deletion guarantee holds in the UI: deleted memories are nev
[truncated for AI cost control]