Show HN: Mneme HQ – repo-native architectural rules for AI coding agents
Mneme HQ provides architectural governance for AI-assisted development by enforcing constraints before code generation, preventing architectural drift and reducing review overhead. It integrates directly into the AI coding agent workflow, blocking banned frameworks, cross-boundary calls, and superseded decisions before they reach the PR queue.
Article intelligence
Key points
- Enforces architectural rules before AI agents generate code, stopping violations at the source
- Works with major AI coding assistants and agent frameworks
- Blocks unauthorized framework introduction, cross-boundary calls, and ADR conflicts automatically
- Acts as a pre-generation governance layer, complementing post-generation observability tools
Why it matters
This matters because enforces architectural rules before AI agents generate code, stopping violations at the source.
Technical impact
May affect model selection, inference cost, product capability, and evaluation benchmarks.
Mneme HQ is the architectural governance layer for AI-assisted development. It compiles your team's architectural intent into enforceable constraints that govern AI coding agents at the pre-generation stage, before architectural drift reaches review. Rules files document standards. Memory tools recall context. RAG retrieves knowledge. Mneme governs implementation.
Architectural governance for AI-assisted development
Govern AI coding agents before they generate the code.
Stop architectural drift before it reaches review. Mneme catches violations at the moment AI generates code — so your standards are enforced, not just documented.
Block banned frameworks, cross-boundary calls, and superseded decisions before generation
No re-prompting — constraints apply on every call, every session, across every agent
Surface violations before the PR, not during it — cut review overhead at the source
Works with direct API integrations, coding assistants, agent frameworks, and managed agent platforms.
Request pilot access Walk the flagship demos View on GitHub
Works with
Claude Code
Cursor
GitHub Actions
GitHub Copilot
Windsurf
OpenAI
Aider
+ more →
The bottleneck
AI increased code output. Review capacity did not.
Coding assistants generate code faster than teams can review it.
But review bandwidth has not increased.
That means more surface area to validate, more architectural drift to catch, and more governance pushed downstream into PR review.
AI agents do not just create more code. They expose intent debt: undocumented, stale, or unenforced architectural decisions that human reviewers used to catch manually.
The issue is not model quality. It is that coding agents do not retain your architectural decisions by default.
Throughput vs. review capacity · 2023–2026
More PR Surface Area
AI increases the amount of code reviewers must validate per change.
Reactive Governance
Architectural violations are caught after generation, during review.
Session Amnesia
Coding agents forget prior decisions unless re-prompted every time.
Where Mneme sits
Adjacent tools solve adjacent problems.
Mneme is not a memory tool, not a rules file, and not a RAG system. Each of those exists for a reason. None of them govern implementation.
Rules files document standards.
Mneme enforces them.
Memory tools recall context.
Mneme governs implementation.
RAG retrieves knowledge.
Mneme operationalizes decisions.
The AI coding governance stack
Pre-generation governance
Mneme. Compiles architectural intent into enforceable constraints before the agent generates code.
Generation and runtime
Agent frameworks and runtime harnesses. Cursor, Claude Code, agent platforms.
Post-generation observability
Tools like SentRux. Detect violations after the agent has acted.
SentRux tells you when the agent violated architecture. Mneme helps prevent the violation from being proposed in the first place. The two layers are complementary.
How it works
Five stages. No vector store. No ML.
Where Mneme sits · generative AI software engineering stack
07 Human oversight review · approvals
06 Validation & eval benchmarks · tracing
05 Governance & control Mneme HQ
04 Tooling & execution MCP · CI/CD · shells
03 Agent runtime LangGraph · Claude Code
02 Context & retrieval RAG · vectors · memory
01 Foundation models OpenAI · Anthropic · Gemini
Almost everyone is competing in layers 01–03. Mneme is layer 05 — the governance layer above the agent runtime. Read the full layer-by-layer breakdown →
project_memory.json → MemoryStore → Retriever → ContextBuilder → LLMAdapter → Evaluator
1
Load
Your decisions become durable rules. Engineers edit a JSON file once; Mneme loads it on every call — no re-prompting, no session amnesia.
2
Retrieve
The right rules reach the agent every time. Deterministic scoring means the same task always surfaces the same constraints — no probabilistic gaps, no missed standards.
3
Build
Only relevant constraints reach the agent. A targeted packet keeps latency low and prevents rule dilution — the agent gets what applies, not everything you've ever decided.
4
Inject
Every AI call runs under your standards. The context packet is injected as the system prompt before generation — regardless of agent, IDE, or platform.
5
Evaluate
Violations surface before review, not during it. Responses are scored against the injected constraints — giving you a blocking gate before code reaches your PR queue.
Why Existing Approaches Fail
Every current approach shares a common flaw: none of them enforce decisions before the model writes the code.
Approach Why It Breaks at Scale Mneme HQ
Rules Files Static, manually maintained, silently ignored by tools
Deterministic pre-generation enforcement. Structured decisions with a precedence engine, scope-aware retrieval, and hook-level blocking.
Prompt Templates Drift between sessions, omitted by integrators, inconsistent across agents
RAG / Vector Search Probabilistic retrieval, no authority model, no enforcement
Code Review Reactive, linear capacity, too late to prevent architectural debt
Why RAG fails → Why code review doesn't scale →
What Mneme prevents
Concrete violations, not abstract rules.
Mneme injects your team's architectural decisions into AI-assisted generation. Below is what that catches in practice — the kinds of changes an agent will otherwise ship, because nothing told it not to.
Example scenario
A developer asks Claude Code to add analytics to a checkout route. The agent proposes importing the BigQuery client directly into the frontend service — violating your layered architecture decision that data-platform calls belong in a backend service only.
Mneme detects the cross-boundary call before generation completes. The violation is flagged and blocked — the agent never writes the code, and nothing reaches your PR queue.
Unauthorized framework introduction
Redux pulled into a Zustand-standardized app. Banned ORM imported into a service that already chose another.
Cross-boundary architecture violations
BigQuery client instantiated inside a frontend route. Business logic dropped into a controller. Layering decisions ignored.
ADR supersession conflicts
Celery re-introduced after the team moved to Pub/Sub. Old decisions reappearing because the agent didn't see the new one.
Restricted path modifications
Codegen agent writing to db/prod/migrations/*. Billing agent touching the auth package.
Security policy violations
Raw SQL string concatenation. Mock auth shipped in production paths. Credentials handled outside the approved surface.
Non-approved dependency usage
GPL packages added to a license-restricted repo. Internal-only libraries imported into externally-shipped services.
See all twelve examples across five governance categories →
Operational proof
Three flagship demos. One worldview.
Each flagship is a different manifestation of the same structural problem: AI accelerates entropy, review does not scale linearly with AI output, drift compounds. Together they sell the category, not a feature. Each ships with a runnable example that drives real Mneme enforcement against scripted diffs — deterministic, no LLM call required.
Flagship 01 · Centerpiece Runnable Mneme dogfoods this
The ADR compiler — turn architectural decisions into infrastructure
Most teams already have ADRs. They sit in docs/adr/ and are quietly ignored by every AI coding agent. The compiler reads the same files, parses an optional ## Constraints section, and emits enforceable, precedence-aware decisions that govern generation and CI. No rewrite.
Walk through the compiler →
Flagship 02 Runnable
Architectural drift prevention — the AI SDLC entropy demo
Six-step timeline. An agent proposes reasonable-looking code that violates ADR-001. Three downstream changes amplify the divergence. A reviewer would plausibly miss it. Mneme blocks the first divergence upstream and the system converges instead of forking.
Walk the timeline →
Flagship 03 Forward-looking Runnable
Governance continuity across multiple actors
Three actors act sequentially against the same codebase with no shared memory. The compiled corpus is the only thing they share. The architectural invariants stay coherent because they live outside any single actor — in the layer the governance evaluates against.
See the governance trace →
All three flagships, supporting enforcement examples, and operational evidence on the demo hub →
Works with
Model-agnostic. Agent-agnostic.
Frontier and open-weight models. IDE agents, CLI agents, and orchestration frameworks. The decision corpus is the constant; everything upstream of it can change.
Models
OpenAI, Anthropic, Gemini, Llama, Qwen, DeepSeek, Mistral — direct APIs and OpenAI-compatible endpoints.
Coding agents
Claude Code & Cursor (native). Copilot, Aider, Cline, OpenHands designed-to-support.
Frameworks & CI
LangGraph, CrewAI, AutoGen, OpenAI Agents SDK. GitHub Actions (native), self-hosted runners.
Full compatibility surface → · Native integrations →
Get started
Running in under two minutes.
install
$ git clone https://github.com/TheoV823/mneme $ cd mneme $ pip install -e .
run demo
Runs the before/after demo without an API key
$ python demo.py --dry-run
governance gate (CI)
$ mneme check --memory .mneme/project_memory.json \ --input pr.diff --query "$PR_TITLE" --mode strict
Full CLI reference → · Run the benchmark → · Python API →
Vision & roadmap
Building the governance layer for AI-assisted development.
Mneme is evolving from local governance tooling into the governance infrastructure layer for AI-assisted software development. As coding workflows mature, teams will need more than prompt files to maintain architectural consistency at scale.
Phase 1 — Current
OSS Developer Wedge
Architectural governance for individual developers and early engineering adopters.
Phase 2
Team Governance Layer
Shared policy and decision stores for teams adopting AI-assisted development.
Phase 3
Agent Platform Integrations
Governance for enterprise agent workflows and managed coding platforms.
Phase 4
Governance Infrastructure
Policy-as-code enforcement and drift analytics across engineering organizations.
See full roadmap →
Frequently asked
Common questions.
What is Mneme HQ?
Mneme HQ is the architectural governance layer for AI-assisted development. It compiles architectural intent into enforceable constraints that govern AI coding agents before code is generated. As agent platforms proliferate, governance becomes infrastructure, and Mneme is positioned as the pre-generation governance layer of that stack.
How is Mneme different from Cursor Rules or CLAUDE.md?
Rules files document standards. Mneme enforces them. Cursor Rules and CLAUDE.md are prompt files that describe preferences to the model. Mneme is a governance layer that compiles architectural decisions into enforceable constraints, retrieves them at prompt time based on what the agent is doing, and validates outputs against them.
How is Mneme different from RAG or vector databases?
RAG retrieves knowledge. Mneme operationalizes decisions. RAG systems surface documents that the model may or may not act on. Mneme compiles architectural decisions into structured rules and evaluates AI-generated code against them. There is no embedding model, no vector store, and no probabilistic retrieval in the governance path.
How is Mneme different from observability tools like SentRux?
SentRux tells you when the agent violated architecture. Mneme helps prevent the violation from being proposed in the first place. Pre-generation governance and post-generation observability are complementary layer
[truncated for AI cost control]