AI News HubLIVE
站内改写

Show HN: Mneme HQ – repo-native architectural rules for AI coding agents

Mneme HQ provides architectural governance for AI-assisted development by enforcing constraints before code generation, preventing architectural drift and reducing review overhead. It integrates directly into the AI coding agent workflow, blocking banned frameworks, cross-boundary calls, and superseded decisions before they reach the PR queue.

Article intelligence

EngineersIntermediate

Key points

  • Enforces architectural rules before AI agents generate code, stopping violations at the source
  • Works with major AI coding assistants and agent frameworks
  • Blocks unauthorized framework introduction, cross-boundary calls, and ADR conflicts automatically
  • Acts as a pre-generation governance layer, complementing post-generation observability tools

Why it matters

This matters because enforces architectural rules before AI agents generate code, stopping violations at the source.

Technical impact

May affect model selection, inference cost, product capability, and evaluation benchmarks.

Mneme HQ is the architectural governance layer for AI-assisted development. It compiles your team's architectural intent into enforceable constraints that govern AI coding agents at the pre-generation stage, before architectural drift reaches review. Rules files document standards. Memory tools recall context. RAG retrieves knowledge. Mneme governs implementation.

Architectural governance for AI-assisted development

Govern AI coding agents before they generate the code.

Stop architectural drift before it reaches review. Mneme catches violations at the moment AI generates code — so your standards are enforced, not just documented.

Block banned frameworks, cross-boundary calls, and superseded decisions before generation

No re-prompting — constraints apply on every call, every session, across every agent

Surface violations before the PR, not during it — cut review overhead at the source

Works with direct API integrations, coding assistants, agent frameworks, and managed agent platforms.

Request pilot access Walk the flagship demos View on GitHub

Works with

Claude Code

Cursor

GitHub Actions

GitHub Copilot

Windsurf

OpenAI

Aider

+ more →

The bottleneck

AI increased code output. Review capacity did not.

Coding assistants generate code faster than teams can review it.

But review bandwidth has not increased.

That means more surface area to validate, more architectural drift to catch, and more governance pushed downstream into PR review.

AI agents do not just create more code. They expose intent debt: undocumented, stale, or unenforced architectural decisions that human reviewers used to catch manually.

The issue is not model quality. It is that coding agents do not retain your architectural decisions by default.

Throughput vs. review capacity · 2023–2026

More PR Surface Area

AI increases the amount of code reviewers must validate per change.

Reactive Governance

Architectural violations are caught after generation, during review.

Session Amnesia

Coding agents forget prior decisions unless re-prompted every time.

Where Mneme sits

Adjacent tools solve adjacent problems.

Mneme is not a memory tool, not a rules file, and not a RAG system. Each of those exists for a reason. None of them govern implementation.

Rules files document standards.

Mneme enforces them.

Memory tools recall context.

Mneme governs implementation.

RAG retrieves knowledge.

Mneme operationalizes decisions.

The AI coding governance stack

Pre-generation governance

Mneme. Compiles architectural intent into enforceable constraints before the agent generates code.

Generation and runtime

Agent frameworks and runtime harnesses. Cursor, Claude Code, agent platforms.

Post-generation observability

Tools like SentRux. Detect violations after the agent has acted.

SentRux tells you when the agent violated architecture. Mneme helps prevent the violation from being proposed in the first place. The two layers are complementary.

How it works

Five stages. No vector store. No ML.

Where Mneme sits · generative AI software engineering stack

07 Human oversight review · approvals

06 Validation & eval benchmarks · tracing

05 Governance & control Mneme HQ

04 Tooling & execution MCP · CI/CD · shells

03 Agent runtime LangGraph · Claude Code

02 Context & retrieval RAG · vectors · memory

01 Foundation models OpenAI · Anthropic · Gemini

Almost everyone is competing in layers 01–03. Mneme is layer 05 — the governance layer above the agent runtime. Read the full layer-by-layer breakdown →

project_memory.json → MemoryStore → Retriever → ContextBuilder → LLMAdapter → Evaluator

1

Load

Your decisions become durable rules. Engineers edit a JSON file once; Mneme loads it on every call — no re-prompting, no session amnesia.

2

Retrieve

The right rules reach the agent every time. Deterministic scoring means the same task always surfaces the same constraints — no probabilistic gaps, no missed standards.

3

Build

Only relevant constraints reach the agent. A targeted packet keeps latency low and prevents rule dilution — the agent gets what applies, not everything you've ever decided.

4

Inject

Every AI call runs under your standards. The context packet is injected as the system prompt before generation — regardless of agent, IDE, or platform.

5

Evaluate

Violations surface before review, not during it. Responses are scored against the injected constraints — giving you a blocking gate before code reaches your PR queue.

Why Existing Approaches Fail

Every current approach shares a common flaw: none of them enforce decisions before the model writes the code.

Approach Why It Breaks at Scale Mneme HQ

Rules Files Static, manually maintained, silently ignored by tools

Deterministic pre-generation enforcement. Structured decisions with a precedence engine, scope-aware retrieval, and hook-level blocking.

Prompt Templates Drift between sessions, omitted by integrators, inconsistent across agents

RAG / Vector Search Probabilistic retrieval, no authority model, no enforcement

Code Review Reactive, linear capacity, too late to prevent architectural debt

Why RAG fails → Why code review doesn't scale →

What Mneme prevents

Concrete violations, not abstract rules.

Mneme injects your team's architectural decisions into AI-assisted generation. Below is what that catches in practice — the kinds of changes an agent will otherwise ship, because nothing told it not to.

Example scenario

A developer asks Claude Code to add analytics to a checkout route. The agent proposes importing the BigQuery client directly into the frontend service — violating your layered architecture decision that data-platform calls belong in a backend service only.

Mneme detects the cross-boundary call before generation completes. The violation is flagged and blocked — the agent never writes the code, and nothing reaches your PR queue.

Unauthorized framework introduction

Redux pulled into a Zustand-standardized app. Banned ORM imported into a service that already chose another.

Cross-boundary architecture violations

BigQuery client instantiated inside a frontend route. Business logic dropped into a controller. Layering decisions ignored.

ADR supersession conflicts

Celery re-introduced after the team moved to Pub/Sub. Old decisions reappearing because the agent didn't see the new one.

Restricted path modifications

Codegen agent writing to db/prod/migrations/*. Billing agent touching the auth package.

Security policy violations

Raw SQL string concatenation. Mock auth shipped in production paths. Credentials handled outside the approved surface.

Non-approved dependency usage

GPL packages added to a license-restricted repo. Internal-only libraries imported into externally-shipped services.

See all twelve examples across five governance categories →

Operational proof

Three flagship demos. One worldview.

Each flagship is a different manifestation of the same structural problem: AI accelerates entropy, review does not scale linearly with AI output, drift compounds. Together they sell the category, not a feature. Each ships with a runnable example that drives real Mneme enforcement against scripted diffs — deterministic, no LLM call required.

Flagship 01 · Centerpiece Runnable Mneme dogfoods this

The ADR compiler — turn architectural decisions into infrastructure

Most teams already have ADRs. They sit in docs/adr/ and are quietly ignored by every AI coding agent. The compiler reads the same files, parses an optional ## Constraints section, and emits enforceable, precedence-aware decisions that govern generation and CI. No rewrite.

Walk through the compiler →

Flagship 02 Runnable

Architectural drift prevention — the AI SDLC entropy demo

Six-step timeline. An agent proposes reasonable-looking code that violates ADR-001. Three downstream changes amplify the divergence. A reviewer would plausibly miss it. Mneme blocks the first divergence upstream and the system converges instead of forking.

Walk the timeline →

Flagship 03 Forward-looking Runnable

Governance continuity across multiple actors

Three actors act sequentially against the same codebase with no shared memory. The compiled corpus is the only thing they share. The architectural invariants stay coherent because they live outside any single actor — in the layer the governance evaluates against.

See the governance trace →

All three flagships, supporting enforcement examples, and operational evidence on the demo hub →

Works with

Model-agnostic. Agent-agnostic.

Frontier and open-weight models. IDE agents, CLI agents, and orchestration frameworks. The decision corpus is the constant; everything upstream of it can change.

Models

OpenAI, Anthropic, Gemini, Llama, Qwen, DeepSeek, Mistral — direct APIs and OpenAI-compatible endpoints.

Coding agents

Claude Code & Cursor (native). Copilot, Aider, Cline, OpenHands designed-to-support.

Frameworks & CI

LangGraph, CrewAI, AutoGen, OpenAI Agents SDK. GitHub Actions (native), self-hosted runners.

Full compatibility surface → · Native integrations →

Get started

Running in under two minutes.

install

$ git clone https://github.com/TheoV823/mneme $ cd mneme $ pip install -e .

run demo

Runs the before/after demo without an API key

$ python demo.py --dry-run

governance gate (CI)

$ mneme check --memory .mneme/project_memory.json \ --input pr.diff --query "$PR_TITLE" --mode strict

Full CLI reference → · Run the benchmark → · Python API →

Vision & roadmap

Building the governance layer for AI-assisted development.

Mneme is evolving from local governance tooling into the governance infrastructure layer for AI-assisted software development. As coding workflows mature, teams will need more than prompt files to maintain architectural consistency at scale.

Phase 1 — Current

OSS Developer Wedge

Architectural governance for individual developers and early engineering adopters.

Phase 2

Team Governance Layer

Shared policy and decision stores for teams adopting AI-assisted development.

Phase 3

Agent Platform Integrations

Governance for enterprise agent workflows and managed coding platforms.

Phase 4

Governance Infrastructure

Policy-as-code enforcement and drift analytics across engineering organizations.

See full roadmap →

Frequently asked

Common questions.

What is Mneme HQ?

Mneme HQ is the architectural governance layer for AI-assisted development. It compiles architectural intent into enforceable constraints that govern AI coding agents before code is generated. As agent platforms proliferate, governance becomes infrastructure, and Mneme is positioned as the pre-generation governance layer of that stack.

How is Mneme different from Cursor Rules or CLAUDE.md?

Rules files document standards. Mneme enforces them. Cursor Rules and CLAUDE.md are prompt files that describe preferences to the model. Mneme is a governance layer that compiles architectural decisions into enforceable constraints, retrieves them at prompt time based on what the agent is doing, and validates outputs against them.

How is Mneme different from RAG or vector databases?

RAG retrieves knowledge. Mneme operationalizes decisions. RAG systems surface documents that the model may or may not act on. Mneme compiles architectural decisions into structured rules and evaluates AI-generated code against them. There is no embedding model, no vector store, and no probabilistic retrieval in the governance path.

How is Mneme different from observability tools like SentRux?

SentRux tells you when the agent violated architecture. Mneme helps prevent the violation from being proposed in the first place. Pre-generation governance and post-generation observability are complementary layer

[truncated for AI cost control]