OctaMem: Auditable memory for AI agents, no vector DB to run
OctaMem provides a persistent memory layer for AI agents with three memory types (semantic, episodic, procedural), eliminating the need for vector databases. It offers auditable, role-based memory with file ingestion and supports multiple runtimes.
An infrastructure for everything your agents shouldn’t forget.
Memory is all you need.
The persistent memory layer for AI agents. Built for the stacks where forgetting isn’t an option.
Start free · Talk to sales
No card. 2 GB on the free tier.
Works with the stack you already run: OpenAI · MCP · Cursor · LangGraph · Vercel AI SDK
Design partners: initial cohort in healthcare, finance, defense, and legal.
Apply to join
§ 01 Problem & solution
The cost of forgetting.
Abstract
Agents without memory fail in two ways that show up on the balance sheet: they burn tokens re-reading context, and they let hard-won institutional knowledge walk out the door. A memory layer answers both.
01 The cost
Token spend compounds every turn.
Without memory, the same context is re-sent on every call. Conversations re-explain themselves, prompts balloon, and you pay frontier-model rates to re-read what the model was already told an hour ago.
The solution
Send less. Repeat nothing. Pay less.
Less context per call — only what's relevant is retrieved and injected
No repetition — facts and decisions persist instead of being re-sent
Cheaper models hold their own once the context they receive is sharper
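A rough arithmetic sketch of why re-sent context compounds: it compares cumulative input tokens for a conversation that re-sends its whole history on every call with one that sends only the new turn plus a fixed slice of retrieved context. The turn count, tokens per turn, and retrieval budget are illustrative assumptions, not OctaMem measurements.

```python
def tokens_resending_history(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when every call re-sends all earlier turns."""
    # Call k carries k turns of text, so the total is 1 + 2 + ... + turns.
    return tokens_per_turn * turns * (turns + 1) // 2


def tokens_with_retrieval(turns: int, tokens_per_turn: int, retrieved: int) -> int:
    """Total input tokens when each call sends one new turn plus retrieved context."""
    return turns * (tokens_per_turn + retrieved)


if __name__ == "__main__":
    # Illustrative assumptions: 50 turns, 200 tokens each, 600 retrieved tokens per call.
    print(tokens_resending_history(50, 200))    # grows quadratically with turns
    print(tokens_with_retrieval(50, 200, 600))  # grows linearly with turns
```

Under these assumptions the first total grows with the square of the turn count and the second grows linearly, which is the gap the section describes.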
02 The leak
Institutional knowledge isn't centralised.
What your agents and teams learn lives in scattered sessions, local notes, and individual heads. When an employee leaves, it leaves with them. Nothing compounds, and nothing is owned by the organisation.
The solution
One memory the whole organisation owns.
Organisation-wide intelligence — every agent reads from one shared layer
Knowledge stays when people leave — it lives in the memory, not the person
Context compounds across teams instead of resetting every session
Memory has to be infrastructure — not a patch.
→See how the architecture solves it
§ From failure to system
§ 02 The Architecture
Memory in motion.
Every request passes through the same disciplined cycle. OctaMem doesn’t fire a generic search across one bucket of text. It rebuilds context from three memory types that each serve a distinct purpose, then reassembles them for the model.
Read the technical brief
fig. 1 · Search cycle, five stages:
01 / Caller: App or MCP
02 / Access · quota: Security layer
03 / OctaMem agent: Retrieval service
04 / Three layers: Memory layers
05 / Back to app: Unified context
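The five stages of the search cycle can be sketched as a small pipeline. This is a toy model under stated assumptions, not OctaMem's implementation: every class and function name here is hypothetical, the layers are plain lists, and retrieval is a naive keyword overlap.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Request:
    caller: str   # stage 1: the app or MCP client making the call
    api_key: str
    query: str


@dataclass
class MemoryLayers:
    # Stage 4: plain lists standing in for the three memory layers.
    semantic: list = field(default_factory=list)
    episodic: list = field(default_factory=list)
    procedural: list = field(default_factory=list)


def tokens(text):
    """Lowercased word set, punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))


def check_access(request, allowed_keys, quota_left):
    """Stage 2: reject unknown keys and exhausted quotas."""
    if request.api_key not in allowed_keys:
        raise PermissionError("unknown API key")
    if quota_left <= 0:
        raise RuntimeError("quota exhausted")


def retrieve(query, layers):
    """Stages 3 and 4: query each layer separately (naive keyword overlap)."""
    wanted = tokens(query)

    def hits(items):
        return [item for item in items if wanted & tokens(item)]

    return {
        "semantic": hits(layers.semantic),
        "episodic": hits(layers.episodic),
        "procedural": hits(layers.procedural),
    }


def search_cycle(request, layers, allowed_keys, quota_left):
    """Run all five stages and return one unified context string."""
    check_access(request, allowed_keys, quota_left)
    found = retrieve(request.query, layers)
    # Stage 5: reassemble the per-layer hits into one block for the model.
    return "\n".join(
        f"[{kind}] {text}" for kind, items in found.items() for text in items
    )
```

The point of the sketch is the shape: access is checked before any retrieval happens, each layer is queried on its own, and the caller receives one reassembled context.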
§ 03 File ingestion
Any file. Now memory.
Hand OctaMem the document itself. Contracts, decks, spreadsheets, emails, PDFs. We parse, structure, and store it as typed memory your agents can query forever.
Not embeddings of a blob. Clauses, parties, obligations.
Batch upload: 5 files
Avg pages: 40
Max file: 30 MB
Retention: Configurable
Drop the file. Memory does the rest.
Parsed memory record: contract-v3.pdf (PDF)
Master Services Agreement, v3 · executed 2026-04-12
› parties: Acme Corp, OctaMem Inc.
› term: 24 months, auto-renew 12
› obligations: 99.9% uptime SLA, 30-day deletion
Searchable across the account under previous_context: legal-msas.
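To make "typed memory" concrete, here is a minimal sketch of how the parsed record above could be represented as a typed structure instead of an opaque blob. The `ContractRecord` class and its field names are hypothetical; only the values come from the example on this page.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ContractRecord:
    """Hypothetical typed view of one parsed contract."""
    source_file: str
    title: str
    executed: date
    parties: tuple
    term_months: int
    auto_renew_months: int
    obligations: tuple
    previous_context: str


record = ContractRecord(
    source_file="contract-v3.pdf",
    title="Master Services Agreement, v3",
    executed=date(2026, 4, 12),
    parties=("Acme Corp", "OctaMem Inc."),
    term_months=24,
    auto_renew_months=12,
    obligations=("99.9% uptime SLA", "30-day deletion"),
    previous_context="legal-msas",
)


def has_party(contract: ContractRecord, name: str) -> bool:
    """Typed fields allow exact questions, such as whether a party is named."""
    return name in contract.parties
```

With fields like these, an agent can ask about a party or a term directly instead of searching raw text.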
01 / Documents
Contracts, briefs, reports, knowledge bases.
Supported: .docx, .docm, .dotx, .dotm, .odt, .rtf, .txt
02 / Spreadsheets
Tables, ledgers, datasets, reasoned over rows.
Supported: .xlsx, .xls, .csv
03 / Presentations
Decks parsed slide by slide. Factual recall, not pixels.
Supported: .pptx
04 / Email & Data
Threads, payloads, structured exports.
Supported: .eml, .json
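A client-side pre-flight check against the limits and extensions listed above might look like the sketch below. It is an assumption-laden illustration: the function names are hypothetical, "30 MB" is read as 30 MiB, and only the extensions listed in the four categories are mapped.

```python
from pathlib import PurePath
from typing import Optional

# Extensions copied from the four categories listed on this page.
CATEGORIES = {
    "documents": {".docx", ".docm", ".dotx", ".dotm", ".odt", ".rtf", ".txt"},
    "spreadsheets": {".xlsx", ".xls", ".csv"},
    "presentations": {".pptx"},
    "email_and_data": {".eml", ".json"},
}
MAX_FILES_PER_BATCH = 5
MAX_FILE_BYTES = 30 * 1024 * 1024  # assumption: "30 MB" means 30 MiB


def classify(filename: str) -> Optional[str]:
    """Return the category for a filename, or None if the extension is not listed."""
    suffix = PurePath(filename).suffix.lower()
    for category, extensions in CATEGORIES.items():
        if suffix in extensions:
            return category
    return None


def validate_batch(files):
    """files is a list of (filename, size_in_bytes); returns a list of problems."""
    problems = []
    if len(files) > MAX_FILES_PER_BATCH:
        problems.append(f"batch has {len(files)} files, limit is {MAX_FILES_PER_BATCH}")
    for name, size in files:
        if classify(name) is None:
            problems.append(f"{name}: extension not in the supported list")
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: larger than 30 MB")
    return problems
```

Checking before upload keeps rejected files out of the batch instead of discovering them after a round trip.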
§ From input to inheritance
§ 04 Compounding intelligence
Intelligence that compounds.
Every session without memory is a reset. Every session with memory is an upgrade.
fig. 2 · Capability over time: Day 1 (Recognition), Day 30 (Pattern awareness), Day 180 (Operational depth). Day 360 is off this chart; the curve keeps climbing.
Day 1
Recognition.
Names, preferences, initial constraints. Conversations feel slightly personalized. The kind a thoughtful intern manages on day one.
Day 30
Pattern awareness.
The agent remembers your decisions, avoids past mistakes, and follows your workflows without repeated instruction. Fewer questions, fewer corrections.
Day 180
Operational depth.
Deep institutional context. The agent operates with continuity across teams, releases, and tools. A system of record your AI can actually use.
+ Compounds with every session
One memory layer. Two paths.
Start on the general cloud, or run on a vertical-specific memory cloud tuned to your sector’s schemas, policies, and compliance posture.
fig. 3 · One memory layer: the search path reads and the add path writes, over the same three layers.
S · semantic: facts & knowledge
E · episodic: events & history
P · procedural: workflows & rules
01 / FACTS & KNOWLEDGE
Semantic.
Stable knowledge, preferences, account facts, business rules, domain context. Relationship-aware retrieval over a graph store.
Cycle: Read-heavy
02 / EVENTS & HISTORY
Episodic.
What happened, when, and why. Time-based retrieval gives the model a sense of history, ordered and explainable.
Cycle: Append-only
03 / WORKFLOWS & RULES
Procedural.
How work should be done, escalation paths, compliance workflows, recurring routines. Retrieved by intent.
Cycle: Versioned
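The three memory types differ mainly in how they are written and retrieved. The sketch below is a toy in-memory model of those access patterns (graph lookup, time window, versioned lookup by intent). All class and method names are hypothetical and say nothing about how OctaMem stores data internally.

```python
from collections import defaultdict
from datetime import datetime


class SemanticMemory:
    """Facts as subject -> [(relation, object)] edges: a toy graph store."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add_fact(self, subject, relation, obj):
        self._edges[subject].append((relation, obj))

    def related(self, subject):
        return list(self._edges.get(subject, []))


class EpisodicMemory:
    """Append-only event log, retrieved by time window in chronological order."""

    def __init__(self):
        self._events = []

    def append(self, when: datetime, what: str):
        self._events.append((when, what))

    def between(self, start: datetime, end: datetime):
        return sorted(e for e in self._events if start <= e[0] <= end)


class ProceduralMemory:
    """Versioned procedures keyed by intent; older versions are kept."""

    def __init__(self):
        self._versions = defaultdict(list)

    def publish(self, intent, steps) -> int:
        self._versions[intent].append(list(steps))
        return len(self._versions[intent])  # 1-based version number

    def latest(self, intent):
        versions = self._versions.get(intent)
        return versions[-1] if versions else None
```

Keeping the three behind separate interfaces is what lets each one use a retrieval strategy suited to its content, which is the distinction the section draws.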
Foundation
General Memory Cloud.
Sector-agnostic persistent memory for any agent workflow. Model-agnostic. Protocol-native.
Cross-model memory under one account
High-recall retrieval and context rebuild
API and MCP-compatible access
Semantic, episodic, and procedural memory types
Granular deletion controls
Start building
Vertical clouds
In progress
Specialized Memory Clouds.
Domain-aware memory structures, sector-specific behavioural models, and compliance-aligned memory policies.
Everything in General Memory Cloud
Domain-specific memory schemas and retrieval
Sector-aware behavioural continuity
Policy-bound enforcement and guardrails
Vertical-optimized context rebuild
Priority support and deployment options
Healthcare, Legal, and Defense memory clouds in development.
§ 06 Use cases
Built for real systems.
The same memory layer, accessed however your team already builds. No bespoke vertical stack. No rewrite. The platform shapes to the workflow, not the other way around.
Coverage at a glance
Verticals: 08. Healthcare, finance, defense, public sector.
Runtimes: 08. REST, MCP, SDKs, IDE plugins.
Memory layer: 01. Unified across stacks.
Stack rewrites: 00. Drop in through existing interfaces.
i. Enterprise verticals
01 / Healthcare
Patient continuity across visits. Treatment history that persists across care teams and sessions.
02 / Legal
Case memory and precedent tracking. Client interaction continuity across matters.
03 / Finance
Portfolio context, trade history, and risk awareness that compounds across agent sessions.
04 / Insurance
Claims history and policy context. Adjuster memory that carries across every touchpoint.
05 / Defense
Mission context that persists across briefings, operations, and multi-agent coordination.
06 / Technology
Product context, customer success history, and engineering knowledge that carries across teams and releases.
07 / Retail & logistics
Inventory, fulfillment, and partner memory across channels, warehouses, and agent-assisted operations.
08 / And more…
Energy, media, public sector, and other high-stakes domains. We'll shape memory around your workflows.
ii. Builder workflows
REST API
Direct HTTP endpoints. Full control without MCP or an SDK.
MCP Server
Remote MCP. Use OctaMem from any MCP-compatible assistant or agent.
Claude Desktop
Connectors in Settings, or config file with Node on Mac and Windows.
Cursor
Tools & MCP in Cursor settings, or mcp.json, on Mac, Windows, and Linux.
Claude.ai (Web)
Custom MCP connector URL in the web app.
OpenClaw
Plugin with auto-recall and capture for open agent stacks.
Python SDK
pip install octamem. Typed client for scripts and services.
JavaScript SDK
npm install @octamem/octamem-js. For Node, browsers, Deno, and Bun.
§ From market to stack
§ 07 For the enterprise
Built for the high-stakes stack.
When memory integrity matters, when decisions need traceability, and when continuity is not optional, OctaMem is the layer your security, compliance, and infrastructure teams will actually approve.
Trust & security · Talk to enterprise
Policy
Policy-aware memory.
Agents respect organizational rules, constraints, and boundaries embedded in the memory layer. Not in the prompt, not in the model.
Role-based access · Scoped retrieval · Tenant isolation
Audit
Audit-ready continuity.
Every memory write and read is traceable. Full lineage from source document to model output, with cryptographic integrity for regulated environments.
Immutable audit log · Source attribution · Retention policies
Access
Role-based memory access.
Teams control who sees what. Memory isolation between departments, projects, and roles, enforced at the storage layer, not the application.
SSO · SCIM provisioning · Department scopes
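One way to read "enforced at the storage layer, not the application" is that the store itself refuses to return out-of-scope records, so no caller can forget to filter. The sketch below illustrates that idea only. The `ScopedStore` class, the role names, and the scope names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    content: str
    scope: str  # for example a department such as "legal" or "finance"


class ScopedStore:
    """Every read takes a role; filtering happens inside the store."""

    def __init__(self, role_scopes):
        # role_scopes maps a role name to the set of scopes it may read.
        self._role_scopes = role_scopes
        self._records = []

    def write(self, memory: Memory):
        self._records.append(memory)

    def read(self, role: str):
        allowed = self._role_scopes.get(role, frozenset())
        # Unknown roles get an empty set, so they see nothing by default.
        return [m for m in self._records if m.scope in allowed]


store = ScopedStore({
    "legal-copilot": frozenset({"legal"}),
    "cfo-agent": frozenset({"finance", "legal"}),
})
store.write(Memory("MSA v3 auto-renews for 12 months.", "legal"))
store.write(Memory("Q1 budget approved.", "finance"))
```

Because there is no unfiltered read method, application code cannot bypass the scope check by accident.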
Deploy
Deploy where you must.
Cloud, private cloud, or on-premise. Single-tenant, dedicated keys, and customer-managed encryption available for the highest-stakes environments.
Cloud · VPC · On-prem · BYO-KMS
§ 08 In practice
Same memory. Five runtimes.
The full integration. No vector DB to operate. No embedding pipeline to maintain. No chunking. OctaMem holds the memory; you keep your stack — Python, JavaScript, REST, or MCP.
› add(). Capture a memory with its previous context.
› get() / search(). Recall it from any agent, any session.
› MCP. Same operations as tool-calls in any MCP-compatible client.
Read the docs · SDK reference
quickstart.py · python

from octamem import OctaMem

# Your API key from platform.octamem.com.
client = OctaMem(api_key="sk-om-live-...")

# Capture a memory.
client.add(
    content="Beta opens March 20.",
    previous_context="Q1 product launch",
)

# Recall it later, possibly from a different agent.
results = client.get(
    query="When does beta open?",
    previous_context="Q1 product launch",
)
print(results)
response · memory.search() · 200 · application/json

{
  "results": [
    {
      "id": "rec_01HV4Z…",
      "type": "semantic",
      "score": 0.94,
      "content": "Beta opens March 20.",
      "source": "planning_doc_q1",
      "created_at": "2026-02-14T09:12Z"
    },
    {
      "id": "rec_01HV7M…",
      "type": "episodic",
      "score": 0.88,
      "content": "Approved Q1 scope reduction on 2026-02-09."
    },
    {
      "id": "rec_01HV9F…",
      "type": "procedural",
      "score": 0.81
    }
  ],
  "tokens": 642,
  "previous_context": "Q1 product launch"
}
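A short sketch of how a caller might consume a response shaped like the one above, using only the Python standard library. The helper names and the 0.85 score threshold are illustrative assumptions. The sample payload is copied from this page, including its truncated ids, and the code tolerates records that omit `content` or `source`.

```python
import json
from collections import defaultdict

# Sample payload copied from the response shown on this page.
RESPONSE = """
{
  "results": [
    {"id": "rec_01HV4Z…", "type": "semantic", "score": 0.94,
     "content": "Beta opens March 20.", "source": "planning_doc_q1",
     "created_at": "2026-02-14T09:12Z"},
    {"id": "rec_01HV7M…", "type": "episodic", "score": 0.88,
     "content": "Approved Q1 scope reduction on 2026-02-09."},
    {"id": "rec_01HV9F…", "type": "procedural", "score": 0.81}
  ],
  "tokens": 642,
  "previous_context": "Q1 product launch"
}
"""


def group_by_type(payload: dict) -> dict:
    """Bucket records by memory type (semantic, episodic, procedural)."""
    grouped = defaultdict(list)
    for record in payload.get("results", []):
        grouped[record.get("type", "unknown")].append(record)
    return dict(grouped)


def usable_context(payload: dict, min_score: float = 0.85) -> list:
    """Contents of records at or above min_score, skipping records without content."""
    return [
        record["content"]
        for record in payload.get("results", [])
        if record.get("score", 0.0) >= min_score and "content" in record
    ]


payload = json.loads(RESPONSE)
by_type = group_by_type(payload)
context_lines = usable_context(payload)
```

Using `.get()` for optional fields matters here because the third sample record carries only an id, a type, and a score.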
Source-linked. Every record carries id, type, score, content, and source — auditable end-to-end, deletable by id or by previous_context.
p50 retrieve: 84 ms. Edge-cached search across all three memory layers, single region.
p99 retrieve: 210 ms. Worst-case end-to-end, cold cache, with reranking.
Write ack: 32 ms. Synchronous acknowledgement before async embedding & indexing.
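The write path described above (acknowledge first, index afterwards) is a common pattern. The sketch below shows the general idea with a queue and a background worker. It is a generic illustration, not OctaMem's code: the `WriteBuffer` class is hypothetical and the "indexing" step is a stand-in that lowercases the text.

```python
import queue
import threading


class WriteBuffer:
    """Acknowledge writes immediately; index them on a background thread."""

    def __init__(self):
        self._pending = queue.Queue()
        self.indexed = []
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def add(self, content: str) -> str:
        # The caller gets its acknowledgement as soon as the item is queued.
        self._pending.put(content)
        return "ack"

    def _run(self):
        while True:
            item = self._pending.get()
            # Stand-in for the slower embedding and indexing work.
            self.indexed.append(item.lower())
            self._pending.task_done()

    def flush(self):
        """Block until everything queued so far has been indexed."""
        self._pending.join()
```

The trade-off is that a record is acknowledged slightly before it becomes searchable, so a caller that must read its own write needs something like `flush()`.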
§ From code to control
§ 09 Trust & control
Your memory. Your control.
Memory is sensitive. See what is stored, keep it structured and traceable, and delete it whenever you want. No opaque embeddings. No locked-in vendor format.
Audit chain
Every memory action leaves a mark.
Reads, writes, redactions, and policy checks are chained together so the record can be inspected after the fact.
Active event: context.delivered · hash: sha256:ad72f9019c
EVT_4182 · prev: 0b91ce774a · recall.requested · agent:legal-copilot · acme/legal/msas
EVT_4183 · prev: 8f4a2c91b0 · policy.checked · policy:contract-scope · redact: pricing / pii
EVT_4184 · prev: 1c68bd044e · context.delivered · octamem:renderer · 642 tokens / 7 sources
EVT_4185 · prev: ad72f9019c · memory.captured · agent:legal-copilot · retention: 365 days
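The audit chain above, where each event records the hash of the one before it, follows the general hash-chain pattern. A minimal sketch with Python's `hashlib` follows. The event fields and function names are hypothetical, and the page does not say how OctaMem computes or encodes its hashes.

```python
import hashlib
import json


def event_hash(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the event body, prev link included."""
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_event(chain: list, kind: str, detail: str) -> dict:
    """Append an event whose prev field is the hash of the current tail."""
    prev = chain[-1]["hash"] if chain else None
    body = {"kind": kind, "detail": detail, "prev": prev}
    event = {**body, "hash": event_hash(body)}
    chain.append(event)
    return event


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check every prev link; False on any mismatch."""
    prev = None
    for event in chain:
        body = {key: event[key] for key in ("kind", "detail", "prev")}
        if event["prev"] != prev or event["hash"] != event_hash(body):
            return False
        prev = event["hash"]
    return True


chain = []
append_event(chain, "recall.requested", "agent:legal-copilot")
append_event(chain, "policy.checked", "policy:contract-scope")
append_event(chain, "context.delivered", "octamem:renderer")
append_event(chain, "memory.captured", "agent:legal-copilot")
```

Because each hash covers the previous hash, editing or removing an earlier event changes every hash after it, which is what makes the record inspectable after the fact.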
i.
Visibl
[truncated for AI cost control]