2026-06-22 11:26 UTC · In-site rewrite · 5 min read · Updated: 2026-06-22 12:05 UTC

OctaMem: Auditable memory for AI agents, no vector DB to run

OctaMem provides a persistent memory layer for AI agents with three memory types (semantic, episodic, procedural), eliminating the need for vector databases. It offers auditable, role-based memory with file ingestion and supports multiple runtimes.

Source: Hacker News AI · Author: Mossiah

Infrastructure for everything your agents shouldn’t forget.

Memory is all you need.

The persistent memory layer for AI agents. Built for the stacks where forgetting isn’t an option.

No card. 2 GB on the free tier.

Works with the stack you already run: OpenAI · MCP · Cursor · LangGraph · Vercel AI SDK

Design partners: Initial cohort in healthcare, finance, defense, and legal.

Apply to join

§ 01 Problem & solution

The cost of forgetting.

Agents without memory fail in two ways that show up on the balance sheet: they burn tokens re-reading context, and they let hard-won institutional knowledge walk out the door. A memory layer answers both.

01 The cost

Token spend compounds every turn.

Without memory, the same context is re-sent on every call. Conversations re-explain themselves, prompts balloon, and you pay frontier-model rates to re-read what the model was already told an hour ago.

The solution

Send less. Repeat nothing. Pay less.

Less context per call — only what's relevant is retrieved and injected

No repetition — facts and decisions persist instead of being re-sent

Cheaper models hold their own once the context they receive is sharper
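
To make the compounding concrete, here is a back-of-the-envelope sketch. The turn count, token sizes, and function names are hypothetical and chosen only for illustration; they are not OctaMem measurements.

```python
def tokens_resending_history(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when every call re-sends all earlier turns."""
    # Call k carries k turns of context, so the total is 1 + 2 + ... + turns.
    return tokens_per_turn * turns * (turns + 1) // 2


def tokens_with_memory(turns: int, tokens_per_turn: int, retrieved_tokens: int) -> int:
    """Total input tokens when each call sends the new turn plus a fixed retrieved slice."""
    return turns * (tokens_per_turn + retrieved_tokens)


# Hypothetical workload: 50 turns of 400 tokens, 600 retrieved tokens per call.
without_memory = tokens_resending_history(50, 400)
with_memory = tokens_with_memory(50, 400, 600)
print(without_memory, with_memory)
```

Under these assumed numbers the re-sent history grows quadratically with the number of turns, while the retrieval-based total grows linearly.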

02 The leak

Institutional knowledge isn't centralised.

What your agents and teams learn lives in scattered sessions, local notes, and individual heads. When an employee leaves, it leaves with them. Nothing compounds, and nothing is owned by the organisation.

The solution

One memory the whole organisation owns.

Organisation-wide intelligence — every agent reads from one shared layer

Knowledge stays when people leave — it lives in the memory, not the person

Context compounds across teams instead of resetting every session

Memory has to be infrastructure — not a patch.

§ From failure to system

§ 02 The Architecture

Memory in motion.

Every request passes through the same disciplined cycle. OctaMem doesn’t fire a generic search across one bucket of text. It rebuilds context from three memory types that each serve a distinct purpose, then reassembles them for the model.

fig. 1 · Search cycle, five stages:

01 / Caller: App or MCP

02 / Access · quota: Security layer

03 / OctaMem agent: Retrieval service

04 / Three layers: Memory layers

05 / Back to app: Unified context
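
The five-stage cycle can be sketched as a small pipeline. This is a toy model under stated assumptions, not OctaMem's implementation: `Record`, `LAYERS`, `check_access`, and `search` are hypothetical names, the in-memory lists stand in for the real stores, and the procedural entry is made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    layer: str    # "semantic", "episodic", or "procedural"
    content: str
    score: float


# Toy stand-ins for the three memory layers (stage 04).
LAYERS = {
    "semantic": [Record("semantic", "Beta opens March 20.", 0.94)],
    "episodic": [Record("episodic", "Approved Q1 scope reduction on 2026-02-09.", 0.88)],
    "procedural": [Record("procedural", "Escalate launch blockers to the release owner.", 0.81)],
}


def check_access(api_key: str, allowed_keys: set) -> None:
    """Stage 02: reject unknown callers (quota checks are omitted here)."""
    if api_key not in allowed_keys:
        raise PermissionError("unknown API key")


def search(api_key: str, allowed_keys: set, min_score: float = 0.0) -> str:
    """Stages 01 to 05: caller, access check, retrieval, three layers, unified context."""
    check_access(api_key, allowed_keys)
    hits = [r for records in LAYERS.values() for r in records if r.score >= min_score]
    hits.sort(key=lambda r: r.score, reverse=True)
    # Stage 05: hand one reassembled context string back to the app.
    return "\n".join(f"[{r.layer}] {r.content}" for r in hits)


context = search("key-1", {"key-1"})
print(context)
```

The point of the sketch is the shape of the cycle: access is checked before any layer is touched, and the caller receives one merged context rather than three separate result sets.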

§ 03 File ingestion

Any file. Now memory.

Hand OctaMem the document itself. Contracts, decks, spreadsheets, emails, PDFs. We parse, structure, and store it as typed memory your agents can query forever.

Not embeddings of a blob. Clauses, parties, obligations.

Batch upload: 5 files

Avg pages: 40

Max file: 30 MB

Retention: Configurable

Drop the file. Memory does the rest.

contract-v3.pdf · PDF

Master Services Agreement · parties · term · obligations

Parsed memory record: contract-v3.pdf

Master Services Agreement, v3 · executed 2026-04-12

› parties: Acme Corp, OctaMem Inc.

› term: 24 months, auto-renew 12

› obligations: 99.9% uptime SLA, 30-day deletion

Searchable across the account under previous_context: legal-msas.
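
One way to picture "typed memory" is as a structured record rather than a block of text. The sketch below models the parsed contract above as a Python dataclass. The class name `ContractMemory` and its field names are assumptions made for illustration; the page does not show OctaMem's actual schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ContractMemory:
    # Hypothetical field names; the values come from the parsed record above.
    source_file: str
    title: str
    executed: date
    parties: tuple
    term_months: int
    auto_renew_months: int
    obligations: tuple
    previous_context: str


record = ContractMemory(
    source_file="contract-v3.pdf",
    title="Master Services Agreement, v3",
    executed=date(2026, 4, 12),
    parties=("Acme Corp", "OctaMem Inc."),
    term_months=24,
    auto_renew_months=12,
    obligations=("99.9% uptime SLA", "30-day deletion"),
    previous_context="legal-msas",
)

# A typed field can be checked directly, with no text search over the whole document.
print("Acme Corp" in record.parties, record.term_months)
```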

01 / Documents

Contracts, briefs, reports, knowledge bases.

Supported: .pdf · .docx · .docm · .dotx · .dotm · .odt · .rtf · .txt

02 / Spreadsheets

Tables, ledgers, datasets, reasoned over rows.

Supported: .xlsx · .xls · .csv

03 / Presentations

Decks parsed slide by slide. Factual recall, not pixels.

Supported: .pptx

04 / Email & Data

Threads, payloads, structured exports.

Supported: .eml · .json
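
A client could pre-check a batch against the limits and file types listed above before uploading. The helper names below (`categorize`, `validate_batch`) are hypothetical and not part of any OctaMem SDK, and reading "30 MB" as binary megabytes is an assumption.

```python
from pathlib import Path
from typing import Optional

SUPPORTED = {
    "documents": {".pdf", ".docx", ".docm", ".dotx", ".dotm", ".odt", ".rtf", ".txt"},
    "spreadsheets": {".xlsx", ".xls", ".csv"},
    "presentations": {".pptx"},
    "email_and_data": {".eml", ".json"},
}
MAX_FILE_BYTES = 30 * 1024 * 1024  # "Max file: 30 MB"; binary megabytes assumed
MAX_BATCH_FILES = 5                # "Batch upload: 5 files"


def categorize(filename: str) -> Optional[str]:
    """Map a filename to one of the four listed categories, or None if unsupported."""
    suffix = Path(filename).suffix.lower()
    for category, extensions in SUPPORTED.items():
        if suffix in extensions:
            return category
    return None


def validate_batch(files: dict) -> list:
    """Return readable problems for a {filename: size_in_bytes} batch."""
    problems = []
    if len(files) > MAX_BATCH_FILES:
        problems.append(f"batch has {len(files)} files, limit is {MAX_BATCH_FILES}")
    for name, size in files.items():
        if categorize(name) is None:
            problems.append(f"{name}: unsupported file type")
        elif size > MAX_FILE_BYTES:
            problems.append(f"{name}: larger than 30 MB")
    return problems


print(validate_batch({"contract-v3.pdf": 1_000, "photo.png": 10}))
```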

§ From input to inheritance

§ 04 Compounding intelligence

Intelligence that compounds.

Every session without memory is a reset. Every session with memory is an upgrade.

fig. 2 · capability over time: Day 1, Recognition; Day 30, Pattern awareness; Day 180, Operational depth. Day 360 is off this chart; the curve keeps climbing.

Day 1

Recognition.

Names, preferences, initial constraints. Conversations feel slightly personalized. The kind a thoughtful intern manages on day one.

Day 30

Pattern awareness.

The agent remembers your decisions, avoids past mistakes, and follows your workflows without repeated instruction. Fewer questions, fewer corrections.

Day 180

Operational depth.

Deep institutional context. The agent operates with continuity across teams, releases, and tools. A system of record your AI can actually use.

+ Compounds with every session

One memory layer. Two paths.

Start on the general cloud, or run on a vertical-specific memory cloud tuned to your sector’s schemas, policies, and compliance posture.

fig. 3 · One memory layer: the search path reads and the add path writes, against the same three layers.

S · semantic: facts & knowledge

E · episodic: events & history

P · procedural: workflows & rules

01 / FACTS & KNOWLEDGE

Semantic.

Stable knowledge, preferences, account facts, business rules, domain context. Relationship-aware retrieval over a graph store.

Cycle: Read-heavy

02 / EVENTS & HISTORY

Episodic.

What happened, when, and why. Time-based retrieval gives the model a sense of history, ordered and explainable.

Cycle: Append-only

03 / WORKFLOWS & RULES

Procedural.

How work should be done, escalation paths, compliance workflows, recurring routines. Retrieved by intent.

Cycle: Versioned
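
The three cycles (read-heavy, append-only, versioned) imply different write semantics. The toy classes below illustrate those semantics only; `SemanticGraph`, `EpisodicLog`, and `ProceduralStore` are hypothetical names, the sample entries are made up, and nothing here reflects how OctaMem stores data internally.

```python
from datetime import datetime


class SemanticGraph:
    """Read-heavy: facts stored as (subject, relation, object) edges."""

    def __init__(self) -> None:
        self._edges = {}

    def relate(self, subject: str, relation: str, obj: str) -> None:
        self._edges.setdefault(subject, set()).add((relation, obj))

    def about(self, subject: str) -> set:
        return set(self._edges.get(subject, set()))


class EpisodicLog:
    """Append-only: events are added, never edited, and read back in time order."""

    def __init__(self) -> None:
        self._events = []

    def append(self, when: datetime, what: str) -> None:
        self._events.append((when, what))

    def between(self, start: datetime, end: datetime) -> list:
        return [what for when, what in sorted(self._events) if start <= when <= end]


class ProceduralStore:
    """Versioned: a new rule for an intent is kept alongside the older versions."""

    def __init__(self) -> None:
        self._versions = {}

    def publish(self, intent: str, steps: str) -> int:
        self._versions.setdefault(intent, []).append(steps)
        return len(self._versions[intent])  # 1-based version number

    def current(self, intent: str) -> str:
        return self._versions[intent][-1]

    def version(self, intent: str, number: int) -> str:
        return self._versions[intent][number - 1]


facts = SemanticGraph()
facts.relate("beta", "opens_on", "March 20")

log = EpisodicLog()
log.append(datetime(2026, 2, 14), "Beta date recorded")
log.append(datetime(2026, 2, 9), "Q1 scope reduction approved")

rules = ProceduralStore()
rules.publish("escalate", "Page the on-call lead.")
rules.publish("escalate", "Page the on-call lead, then open an incident.")
```

Keeping old procedural versions around is what lets a later audit ask which rule was in force when a past decision was made.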

Foundation

General Memory Cloud.

Sector-agnostic persistent memory for any agent workflow. Model-agnostic. Protocol-native.

Cross-model memory under one account

High-recall retrieval and context rebuild

API and MCP-compatible access

Semantic, episodic, and procedural memory types

Granular deletion controls

Start building

Vertical clouds

In progress

Specialized Memory Clouds.

Domain-aware memory structures, sector-specific behavioural models, and compliance-aligned memory policies.

Everything in General Memory Cloud

Domain-specific memory schemas and retrieval

Sector-aware behavioural continuity

Policy-bound enforcement and guardrails

Vertical-optimized context rebuild

Priority support and deployment options

Healthcare, Legal, and Defense memory clouds in development.

§ 06 Use cases

Built for real systems.

The same memory layer, accessed however your team already builds. No bespoke vertical stack. No rewrite. The platform shapes to the workflow, not the other way around.

Coverage at a glance

Verticals

Healthcare, finance, defense, public sector.

Runtimes

REST, MCP, SDKs, IDE plugins.

Memory layer

Unified across stacks.

Stack rewrites

Drop in through existing interfaces.

Enterprise verticals

Healthcare

Patient continuity across visits. Treatment history that persists across care teams and sessions.

Legal

Case memory and precedent tracking. Client interaction continuity across matters.

Finance

Portfolio context, trade history, and risk awareness that compounds across agent sessions.

Insurance

Claims history and policy context. Adjuster memory that carries across every touchpoint.

Defense

Mission context that persists across briefings, operations, and multi-agent coordination.

Technology

Product context, customer success history, and engineering knowledge that carries across teams and releases.

Retail & logistics

Inventory, fulfillment, and partner memory across channels, warehouses, and agent-assisted operations.

And more…

Energy, media, public sector, and other high-stakes domains — we'll shape memory around your workflows.

ii.

Builder workflows

REST API

Direct HTTP endpoints. Full control without MCP or an SDK.

MCP Server

Remote MCP: use OctaMem from any MCP-compatible assistant or agent.

Claude Desktop

Connectors in Settings, or config file with Node on Mac and Windows.

Cursor

Tools & MCP in Cursor settings, or mcp.json, on Mac, Windows, and Linux.

Claude.ai (Web)

Custom MCP connector URL in the web app.

OpenClaw

Plugin with auto-recall and capture for open agent stacks.

Python SDK

pip install octamem (typed client for scripts and services).

JavaScript SDK

npm install @octamem/octamem-js (for Node, browsers, Deno, and Bun).

§ From market to stack

§ 07 For the enterprise

Built for the high-stakes stack.

When memory integrity matters, when decisions need traceability, when continuity is not optional. OctaMem is the layer your security, compliance, and infrastructure teams will actually approve.

Policy

Policy-aware memory.

Agents respect organizational rules, constraints, and boundaries embedded in the memory layer. Not in the prompt, not in the model.

Role-based access · Scoped retrieval · Tenant isolation

Audit

Audit-ready continuity.

Every memory write and read is traceable. Full lineage from source document to model output, with cryptographic integrity for regulated environments.

Immutable audit log · Source attribution · Retention policies

Access

Role-based memory access.

Teams control who sees what. Memory isolation between departments, projects, and roles, enforced at the storage layer, not the application.

SSO · SCIM provisioning · Department scopes
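
Tenant isolation plus scoped retrieval boils down to a filter applied before anything reaches the model. The sketch below shows that rule in miniature. `Memory`, `Caller`, and `visible` are hypothetical names and the sample records are made up; the page says real enforcement happens at the storage layer, which this in-process example does not attempt to reproduce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    tenant: str
    scope: str    # e.g. "legal/msas"
    content: str


@dataclass(frozen=True)
class Caller:
    tenant: str
    scopes: frozenset    # scope prefixes this role may read


def visible(memories: list, caller: Caller) -> list:
    """Keep only memories in the caller's tenant and inside one of its scopes."""
    return [
        m for m in memories
        if m.tenant == caller.tenant
        and any(m.scope == s or m.scope.startswith(s + "/") for s in caller.scopes)
    ]


memories = [
    Memory("acme", "legal/msas", "MSA v3 term is 24 months."),
    Memory("acme", "finance/pricing", "Pricing sheet for Q1."),
    Memory("other-tenant", "legal/msas", "Another tenant's contract."),
]
legal_agent = Caller("acme", frozenset({"legal"}))
print([m.content for m in visible(memories, legal_agent)])
```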

Deploy

Deploy where you must.

Cloud, private cloud, or on-premise. Single-tenant, dedicated keys, and customer-managed encryption available for the highest-stakes environments.

Cloud · VPC · On-prem · BYO-KMS

§ 08 In practice

Same memory. Five runtimes.

The full integration. No vector DB to operate. No embedding pipeline to maintain. No chunking. OctaMem holds the memory; you keep your stack — Python, JavaScript, REST, or MCP.

›add(). Capture a memory with its previous context.

›get() / search(). Recall it from any agent, any session.

›MCP. Same operations as tool-calls in any MCP-compatible client.

quickstart.py · python

from octamem import OctaMem

# Your API key from platform.octamem.com.
client = OctaMem(api_key="sk-om-live-...")

# Capture a memory.
client.add(
    content="Beta opens March 20.",
    previous_context="Q1 product launch",
)

# Recall it later, possibly from a different agent.
results = client.get(
    query="When does beta open?",
    previous_context="Q1 product launch",
)
print(results)

response · memory.search() · 200 · application/json

{
  "results": [
    {
      "id": "rec_01HV4Z…",
      "type": "semantic",
      "score": 0.94,
      "content": "Beta opens March 20.",
      "source": "planning_doc_q1",
      "created_at": "2026-02-14T09:12Z"
    },
    {
      "id": "rec_01HV7M…",
      "type": "episodic",
      "score": 0.88,
      "content": "Approved Q1 scope reduction on 2026-02-09."
    },
    {
      "id": "rec_01HV9F…",
      "type": "procedural",
      "score": 0.81
    }
  ],
  "tokens": 642,
  "previous_context": "Q1 product launch"
}

Source-linked. Every record carries id, type, score, content, and source — auditable end-to-end, deletable by id or by previous_context.
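
A caller might turn the response above into prompt-ready lines like this. The sketch uses only the standard library and the JSON shown on this page; `summarize` is a hypothetical helper, and because the sample shows some records without `content` or `source`, the code treats both as optional.

```python
import json

# The sample response from this page, with the elided ids kept as shown.
response = json.loads("""
{
  "results": [
    {"id": "rec_01HV4Z…", "type": "semantic", "score": 0.94,
     "content": "Beta opens March 20.", "source": "planning_doc_q1",
     "created_at": "2026-02-14T09:12Z"},
    {"id": "rec_01HV7M…", "type": "episodic", "score": 0.88,
     "content": "Approved Q1 scope reduction on 2026-02-09."},
    {"id": "rec_01HV9F…", "type": "procedural", "score": 0.81}
  ],
  "tokens": 642,
  "previous_context": "Q1 product launch"
}
""")


def summarize(payload: dict) -> list:
    """One line per record; missing fields get an explicit placeholder."""
    lines = []
    for record in payload["results"]:
        content = record.get("content", "<no content returned>")
        source = record.get("source", "unknown source")
        lines.append(f'{record["type"]} ({record["score"]:.2f}) {content} [{source}]')
    return lines


for line in summarize(response):
    print(line)
```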

p50 retrieve: 84 ms. Edge-cached search across all three memory layers, single region.

p99 retrieve: 210 ms. Worst-case end-to-end, cold cache, with reranking.

Write ack: 32 ms. Synchronous acknowledgement before async embedding & indexing.
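
"Synchronous acknowledgement before async indexing" is a common write pattern: accept the record, confirm it at once, and do the slow work in the background. The sketch below shows the pattern with a standard-library queue and thread. The names `write`, `indexer`, `pending`, and `index` are hypothetical, and this is not a description of OctaMem's internals.

```python
import queue
import threading

pending = queue.Queue()   # records accepted but not yet indexed
index = []                # stand-in for the searchable index


def write(record: dict) -> str:
    """Accept the record and acknowledge immediately; indexing happens later."""
    pending.put(record)
    return "ack"


def indexer() -> None:
    """Background worker: drain the queue and do the slow work off the write path."""
    while True:
        record = pending.get()
        index.append(record)   # stand-in for embedding and indexing
        pending.task_done()


threading.Thread(target=indexer, daemon=True).start()

status = write({"content": "Beta opens March 20."})
pending.join()   # wait for the worker, only so the example is deterministic
print(status, len(index))
```

The trade-off in this pattern is that a record is acknowledged slightly before it becomes searchable, so readers that need read-after-write behaviour have to account for that gap.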

§ From code to control

§ 09 Trust & control

Your memory. Your control.

Memory is sensitive. See what is stored, keep it structured and traceable, and delete it whenever you want. No opaque embeddings. No locked-in vendor format.

Audit chain

Every memory action leaves a mark.

Reads, writes, redactions, and policy checks are chained together so the record can be inspected after the fact.

Active event: context.delivered · hash: sha256:ad72f9019c

EVT_4182 · prev: 0b91ce774a · recall.requested · agent:legal-copilot · acme/legal/msas

EVT_4183 · prev: 8f4a2c91b0 · policy.checked · policy:contract-scope · redact: pricing / pii

EVT_4184 · prev: 1c68bd044e · context.delivered · octamem:renderer · 642 tokens / 7 sources

EVT_4185 · prev: ad72f9019c · memory.captured · agent:legal-copilot · retention: 365 days
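
The `hash` and `prev` fields suggest a hash chain, where each event commits to the one before it. The sketch below shows how such a chain can be built and verified with SHA-256. How OctaMem actually serialises and hashes events is not stated on the page, so the JSON payload format and the names `event_hash`, `append_event`, and `verify` are assumptions.

```python
import hashlib
import json


def event_hash(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the event body (assumed format)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_event(chain: list, kind: str, detail: str) -> None:
    """Link the new event to the previous one through its hash."""
    prev = chain[-1]["hash"] if chain else ""
    body = {"kind": kind, "detail": detail, "prev": prev}
    chain.append({**body, "hash": event_hash(body)})


def verify(chain: list) -> bool:
    """Recompute every hash and check each prev pointer; any edit breaks the chain."""
    prev = ""
    for event in chain:
        body = {k: v for k, v in event.items() if k != "hash"}
        if event["prev"] != prev or event["hash"] != event_hash(body):
            return False
        prev = event["hash"]
    return True


chain = []
append_event(chain, "recall.requested", "agent:legal-copilot")
append_event(chain, "policy.checked", "redact: pricing / pii")
append_event(chain, "context.delivered", "642 tokens / 7 sources")
append_event(chain, "memory.captured", "retention: 365 days")
print(verify(chain))
```

Because every hash covers the previous hash, changing one past event invalidates that event and every later link, which is what makes after-the-fact inspection meaningful.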

Visibl

[truncated for AI cost control]