Show HN: Forensic-deepdive: code knowledge graph and MCP server for AI agents
Forensic-deepdive is an open-source tool that builds a persistent code knowledge graph for any codebase (9 supported languages) and exposes it to AI coding agents through an MCP server. It also generates 5 human-readable Markdown documents and 10 integration files, all offline, with no LLM or network access required.
A persistent code knowledge graph + MCP server for AI coding agents. Five durable markdown artifacts as the human-readable projection. Apache-2.0.
forensic-deepdive analyzes any codebase (9 languages, polyglot) and produces:
A persistent embedded graph at /.deepdive/graph.lbug — File, Symbol, Module, Commit, Author, Endpoint, and DbTable nodes plus DEFINES, MEMBER_OF, IMPORTS, CALLS, EXTENDS, IMPLEMENTS, TOUCHED_BY_COMMIT, AUTHORED_BY, CO_CHANGES_WITH, and the cross-boundary HANDLES / CALLS_ENDPOINT / ROUTES_TO / INJECTS / PERSISTS_TO edges. Every edge carries a confidence tag (EXTRACTED / INFERRED / AMBIGUOUS) — no hidden heuristics. The single Endpoint join node unifies five cross-boundary protocols (HTTP, MCP tools, registry-dispatch, gRPC, messaging/AMQP), so a frontend call resolves to its backend handler across the stack as one ROUTES_TO edge.
An MCP server (forensic serve) exposing 9 composite tools (impact, context, archaeology, flow, query, record_insight, recall_insights, visualize, trace) consumable by Claude Code, Cursor, Codex, Continue, Cline, Windsurf — and any other MCP-aware agent.
Five durable markdown artifacts under /docs/codebase/, regenerated from the graph on every extract:
MAP.md — what's where, ranked by centrality.
HOTPATHS.md — the dependency hot spots, with a per-row confidence-mix column so you see exactly how cleanly each symbol resolves.
ARCHAEOLOGY.md — why the code looks the way it does (git history, top authors with %, bus factor, co-change clusters, defect proximity).
MENTAL_MODEL.md — the doc the original author would write to onboard a new hire.
AGENT_BRIEF.md — ≤5 KB of assertive Never/Always rules with per-rule confidence tags. Drop-in CLAUDE.md for any project.
Ten shims into the target repo — 4 editor rule files (CLAUDE.md, AGENTS.md, .cursor/rules/codebase.mdc, .continue/rules/codebase.md), 5 single-intent Claude skills under .claude/skills/codebase-{exploring,debugging,impact-analysis,refactoring,onboarding}/, and a .claude-plugin/plugin.json manifest. All write-if-absent — hand-edited files are never overwritten.
An agent-insight layer — record_insight / recall_insights MCP tools backed by /.deepdive/insights.jsonl by default (zero dependencies, human-readable, git-friendly). The optional [graphiti] extra upgrades to a temporal knowledge graph backend above a 2-of-5 repo-size threshold.
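The default insight store described above can be pictured as an append-only JSON Lines file. The following is a minimal sketch under stated assumptions, not the project's actual implementation: the field names mirror the record_insight(symbol, claim, evidence, verified_by) signature, while the recorded_at field, the on-disk layout, and matching the query against the symbol field are hypothetical. The since filter is left out for brevity.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_insight(store: Path, symbol: str, claim: str,
                   evidence: str, verified_by: str) -> dict:
    """Append one insight as a single JSON line (append-only, diff-friendly)."""
    entry = {
        "symbol": symbol,
        "claim": claim,
        "evidence": evidence,
        "verified_by": verified_by,
        # Hypothetical timestamp field, not confirmed by the README.
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    store.parent.mkdir(parents=True, exist_ok=True)
    with store.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry


def recall_insights(store: Path, symbol: str, limit: int = 10) -> list[dict]:
    """Return newest-first entries whose symbol contains the query substring."""
    if not store.exists():
        return []
    with store.open(encoding="utf-8") as fh:
        entries = [json.loads(line) for line in fh if line.strip()]
    matches = [e for e in entries if symbol in e["symbol"]]
    # The file is written oldest-first, so reverse before applying the limit.
    return list(reversed(matches))[:limit]
```

Because each insight is one line, the file stays readable in a plain editor and produces small, line-oriented diffs when committed.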
Extract also regenerates ARCHITECTURE.md — a system-level Mermaid view of the cross-boundary graph (ROUTES_TO / INJECTS / PERSISTS_TO, confidence-styled), a separate human-validation surface (not one of the five contract artifacts, exactly like forensic visualize and serve --ui). Regenerate it on its own with forensic diagram --repo . Use it to sanity-check the graph — a wrong edge there is a wrong edge everywhere.
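To make the idea of a confidence-styled Mermaid view concrete, here is a small sketch that renders a list of edges as a Mermaid flowchart. The mapping from confidence tag to link style (thick, solid, dotted) is an assumption chosen for illustration, as are the function name and the example symbols; the README does not state which styles the tool actually emits.

```python
# Hypothetical mapping from confidence tag to Mermaid link syntax.
ARROWS = {
    "EXTRACTED": "==>",   # thick link: deterministic fact
    "INFERRED": "-->",    # solid link: heuristic resolved cleanly
    "AMBIGUOUS": "-.->",  # dotted link: resolver could not disambiguate
}


def to_mermaid(edges):
    """Render (source, edge_type, target, confidence) tuples as a flowchart.

    Node names are assumed not to contain double quotes.
    """
    ids = {}
    lines = ["flowchart LR"]

    def node_id(name):
        # Mermaid ids must be simple tokens, so names become quoted labels.
        if name not in ids:
            ids[name] = f"n{len(ids)}"
            lines.append(f'    {ids[name]}["{name}"]')
        return ids[name]

    for src, kind, dst, confidence in edges:
        a, b = node_id(src), node_id(dst)
        lines.append(f"    {a} {ARROWS[confidence]}|{kind}| {b}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_mermaid([
        ("fetchUsers", "ROUTES_TO", "list_users", "INFERRED"),
        ("list_users", "PERSISTS_TO", "users", "EXTRACTED"),
    ]))
```

A reviewer scanning such a diagram can spot low-trust edges by line style alone, which is the point of using it as a validation surface.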
Add --emit-vault to also write an Obsidian-friendly vault under /vault/ — every artifact gets summary:/tags: frontmatter, cross-references become [[wikilinks]], and an INDEX.md MOC ties them together (with a .obsidian/ config). A local-first second brain for humans (graph view, backlinks) and agents (triage by summary: without opening files, a traversable index). Opt-in; off by default.
Status
v0.8.0 "USABLE → USEFUL + public release" — the first public PyPI release. Builds on the frozen five-protocol cross-boundary graph (HTTP/MCP/registry/gRPC/messaging on one Endpoint join node) with a precision pass (honest call-graph confidence, distinct-caller counts, low-history/solo-repo guards), a human-validation ARCHITECTURE.md diagram surface, distribution (PyPI + MCP Registry + a Claude Code plugin), and an opt-in --emit-vault Obsidian export. The 5-artifact + 9-MCP-tool contract is frozen.
What's proven, and what isn't (honest framing). v0.8 is an assisted-analysis tool: a real fresh-agent onboarding test confirmed it's usable and that an agent auto-discovers AGENT_BRIEF.md and routes to the right skill unprompted, and a grounded MCP tool review found the git-archaeology + curated briefs are the high-trust core. The autonomous end-to-end question — does deepdive-seeding make an agent resolve real issues measurably faster — is not yet proven: a model-free localization pilot is recorded (experiments/fastcontext/RESULTS.md — the static seed is a weak prior), and the end-to-end measurement is deferred to v0.9 (it needs a GPU + a frontier main-agent endpoint). No autonomous-execution claims are made. Accepted across real repos including Apache Superset, wagtail (Django), spring-petclinic, ripgrep, fastapi, and Iris-Nearby (Flutter/Dart) — see docs/findings/.
Quick start
install from PyPI (puts forensic on PATH); or run ephemerally with uvx
uv tool install forensic-deepdive
forensic info # banner + capability panel
forensic extract /path/to/repo
…or from source for development:
git clone https://github.com/Dhevenddra/forensic-deepdive && cd forensic-deepdive
uv sync --all-extras
what can it do? (banner + capability panel: artifacts, protocols, MCP tools, confidence legend)
uv run forensic info
run on any repo
uv run forensic extract /path/to/repo
graph lands at /.deepdive/graph.lbug
5 markdown artifacts at /docs/codebase/
10 shims at /.claude/, .cursor/, .continue/, root
trace a cross-stack feature slice (frontend call -> endpoint -> handler -> tail)
uv run forensic trace --repo /path/to/repo
query the graph as an MCP server (point it at the analyzed repo)
uv run forensic serve --repo /path/to/repo
inspect every repo you've analyzed
uv run forensic list
Install from PyPI
Published as forensic-deepdive — no clone needed:
uv tool install forensic-deepdive # puts forensic on PATH
forensic extract /path/to/repo
…or run ephemerally, no install:
uvx forensic-deepdive extract /path/to/repo
Optional extras: uv tool install "forensic-deepdive[semantic]" (offline ONNX NL query), [openapi] (YAML spec parsing), [graphiti] (temporal insight backend). pip install forensic-deepdive works too if you're not on uv.
Use it as an MCP server
forensic serve is a stdio MCP server exposing the 9 composite tools to any MCP-aware agent (Claude Code, Cursor, VS Code/Copilot, Codex, Continue, Cline, Windsurf). First build the graph once (forensic extract ), then wire the server. Three ways, easiest first:
- Claude Code plugin (self-hosted marketplace — no PyPI step):
/plugin marketplace add Dhevenddra/forensic-deepdive
/plugin install forensic-deepdive@dhevenddra
- From the MCP Registry: indexed as io.github.Dhevenddra/forensic-deepdive, so registry-aware clients and discovery hubs (PulseMCP, MCPJungle, the VS Code @mcp index) can find and install it directly.
- Manual config — generate a client snippet with forensic mcp-config, or paste:
{
  "mcpServers": {
    "forensic-deepdive": {
      "command": "uvx",
      "args": ["forensic-deepdive", "serve", "--repo", "."]
    }
  }
}
Per-client copy-paste blocks (Cursor, VS Code, Codex, the uvx-not-found GUI gotcha) are in docs/install.md.
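If a client config file already lists other MCP servers, the snippet above has to be merged rather than pasted over the top. The sketch below shows one way to do that on an in-memory dict; the add_server helper is hypothetical and is not part of forensic-deepdive, and where a given client keeps its config file is left to docs/install.md.

```python
import json

# The server entry from the manual-config snippet above.
SERVER_ENTRY = {
    "command": "uvx",
    "args": ["forensic-deepdive", "serve", "--repo", "."],
}


def add_server(config: dict, name: str = "forensic-deepdive") -> dict:
    """Return a copy of config with the server added.

    Existing servers are kept, and an entry already registered under
    the same name is left untouched (a design choice of this sketch).
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers.setdefault(name, SERVER_ENTRY)
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {"other-server": {"command": "other"}}}
    print(json.dumps(add_server(existing), indent=2))
```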
The 9 supported languages
Python, C, Dart, Swift, TypeScript, JavaScript, Java, Go, Rust.
The 9 MCP tools
| Tool | What it does |
| --- | --- |
| impact(symbol, depth, direction, min_confidence) | Blast-radius BFS over CALLS edges, depth-bucketed, confidence-filterable. |
| context(symbol) | Single-call kitchen sink: definition + callers + callees + parent/siblings/members + extends/implements + recent commits + dominant author + recent insights. |
| archaeology(file_or_symbol) | Churn, top authors with %, bus factor, co-change cluster, defect proximity, recent commits. |
| flow(entry_point, max_depth) | DFS over CALLS with cycle detection. |
| query(cypher \| natural_language) | Raw Cypher, or hybrid NL retrieval (FTS5/BM25 + structural graph signal + opt-in offline semantic, RRF-fused and shaped) with per-hit provenance + confidence. |
| record_insight(symbol, claim, evidence, verified_by) | Persist a verified learning. |
| recall_insights(symbol, since, limit) | Newest-first substring match against stored insights. |
| visualize(target, format, depth, max_nodes, ...) | Bounded Mermaid diagram of a symbol/file neighborhood (or central); edge dash style encodes confidence. |
| trace(symbol, direction, max_depth) | Cross-stack feature slice across the Endpoint join node: downstream walks frontend call → CALLS_ENDPOINT → endpoint → HANDLES → handler → CALLS tail; upstream answers "who calls this endpoint". |
Tool descriptions are individually ≤200 tokens so the 9-tool envelope stays comfortably inside Anthropic's per-turn skill metadata budget.
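The impact tool is described above as a depth-bucketed, confidence-filterable BFS over CALLS edges. A minimal sketch of that technique follows, assuming edges arrive as (caller, callee, confidence) tuples, that "upstream" means walking towards callers, and that confidence tags are ordered AMBIGUOUS < INFERRED < EXTRACTED. None of these details are confirmed by the README; the real tool runs against the graph database.

```python
from collections import deque

# Assumed ordering of confidence tags, lowest trust first.
RANK = {"AMBIGUOUS": 0, "INFERRED": 1, "EXTRACTED": 2}


def impact(calls, symbol, depth=3, direction="upstream",
           min_confidence="AMBIGUOUS"):
    """Return {distance: [symbols]} reachable from symbol over CALLS edges."""
    adjacency = {}
    for caller, callee, confidence in calls:
        if RANK[confidence] < RANK[min_confidence]:
            continue  # drop edges below the requested trust level
        if direction == "upstream":
            src, dst = callee, caller   # walk from callee to its callers
        else:
            src, dst = caller, callee   # walk from caller to its callees
        adjacency.setdefault(src, []).append(dst)

    buckets = {}
    seen = {symbol}
    queue = deque([(symbol, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand beyond the requested depth
        for nxt in adjacency.get(node, []):
            if nxt not in seen:  # the seen set also guards against cycles
                seen.add(nxt)
                buckets.setdefault(dist + 1, []).append(nxt)
                queue.append((nxt, dist + 1))
    return buckets
```

Bucketing by distance lets a caller separate direct dependents (distance 1) from the wider blast radius, and raising min_confidence trades recall for precision.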
The confidence taxonomy
Every edge and every emitted claim carries EXTRACTED / INFERRED / AMBIGUOUS:
EXTRACTED — deterministic from AST or git log. Facts.
INFERRED — a heuristic resolved cleanly (import-graph walk, receiver-type inference, single same-name candidate cross-file). High-trust but derived.
AMBIGUOUS — multiple candidates surfaced; the resolver couldn't disambiguate. You see every candidate, not a silent guess.
HOTPATHS shows a per-row confidence-mix column so at a glance you can tell Logger (4 EXTRACTED + 1458 INFERRED — mostly clean) from ChatToolResponse (449 AMBIGUOUS — same-name cross-file collision).
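As an illustration of the confidence-mix column, the sketch below tags each edge with one of the three confidence values and summarises the inbound edges of a symbol in the same "N TAG + N TAG" form used above. It is an assumption that the mix counts inbound edges, and the Confidence enum and confidence_mix function are hypothetical names.

```python
from collections import Counter
from enum import Enum


class Confidence(str, Enum):
    EXTRACTED = "EXTRACTED"   # deterministic from AST or git log
    INFERRED = "INFERRED"     # a heuristic resolved cleanly
    AMBIGUOUS = "AMBIGUOUS"   # multiple candidates, none chosen


def confidence_mix(edges, symbol):
    """Summarise inbound edges of symbol as a 'count TAG + count TAG' string.

    edges is an iterable of (source, target, Confidence) tuples.
    """
    counts = Counter(conf for _, dst, conf in edges if dst == symbol)
    # Iterate the enum so tags always appear in a fixed order.
    return " + ".join(f"{counts[c]} {c.value}" for c in Confidence if counts[c])


if __name__ == "__main__":
    edges = [
        ("app.main", "Logger", Confidence.EXTRACTED),
        ("db.connect", "Logger", Confidence.INFERRED),
        ("api.handler", "Logger", Confidence.INFERRED),
    ]
    print(confidence_mix(edges, "Logger"))
```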
Honest-mode (pure-static, zero LLM, zero network)
forensic extract works end-to-end with no ANTHROPIC_API_KEY, no OPENAI_API_KEY, no Ollama, no network. Graphiti is opt-in via the [graphiti] PyPI extra plus a 2-of-5 repo-size threshold (≥50 k LOC, ≥25 contributors, ≥18 mo old, ≥200 PRs/12 mo, ≥100 issues with discussion). The JsonlInsightStore is the always-available floor.
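The 2-of-5 repo-size threshold can be read as a simple count of satisfied criteria. The sketch below encodes the five thresholds listed above; the RepoStats type, its field names, and the function name are hypothetical, and how the tool actually measures each statistic is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoStats:
    loc: int                 # lines of code
    contributors: int
    age_months: int
    prs_last_12_months: int
    discussed_issues: int    # issues with discussion


def meets_graphiti_threshold(stats: RepoStats, required: int = 2) -> bool:
    """True when at least `required` of the five size criteria hold."""
    checks = [
        stats.loc >= 50_000,
        stats.contributors >= 25,
        stats.age_months >= 18,
        stats.prs_last_12_months >= 200,
        stats.discussed_issues >= 100,
    ]
    return sum(checks) >= required


if __name__ == "__main__":
    small = RepoStats(loc=12_000, contributors=3, age_months=24,
                      prs_last_12_months=40, discussed_issues=5)
    print(meets_graphiti_threshold(small))  # only the age criterion holds
```

Under this reading, a small or young repository stays on the JSONL store even when the [graphiti] extra is installed.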
Why this and not [GitNexus / CodeGraphContext / DeepWiki / Sourcegraph]
| | forensic-deepdive | GitNexus | CodeGraphContext | DeepWiki | Sourcegraph |
| --- | --- | --- | --- | --- | --- |
| License | Apache-2.0 | PolyForm Noncommercial | MIT | proprietary (open variant: MIT) | partial |
| Persistent code knowledge graph | ✅ LadybugDB | ✅ LadybugDB | partial | ❌ | partial |
| MCP server | ✅ 9 composite tools | ✅ 16 tools | partial | ❌ | ❌ |
| Per-edge confidence taxonomy | ✅ EXTRACTED / INFERRED / AMBIGUOUS | ❌ | ❌ | ❌ | ❌ |
| Git archaeology as a first-class layer | ✅ | ❌ | ❌ | ❌ | partial |
| Durable committed markdown artifacts | ✅ 5 files | partial | partial | ✅ (wiki) | ❌ |
| Agent-insight layer (record_insight / recall_insights) | ✅ | ❌ | ❌ | ❌ | ❌ |
| Multi-platform skill emission | ✅ 10 shims | partial | partial | ❌ | ❌ |
| Local-only (no cloud required) | ✅ co-equal | ✅ | ✅ | ❌ | ❌ |
GitNexus is the runaway leader — but the PolyForm Noncommercial license locks every commercial user out. That's the wedge: Apache-2.0 + honest confidence + git archaeology + persistent agent memory + the 5 markdown artifacts as a fallback for any agent that doesn't speak MCP.
Local development
git clone https://github.com/Dhevenddra/forensic-deepdive
cd forensic-deepdive
uv sync --all-extras
uv run forensic --version
uv run pytest -x # 830 tests at v0.8.0
uv run ruff check src/ tests/
uv run forensic extract tests/fixtures/tiny_fixture
Read CLAUDE.md, DECISIONS.md (81 active DECs), and PROGRESS.md before making changes. This repo dogfoods its own pattern: every session starts with the protocol in CLAUDE.md, every architectural choice is captured as
[truncated for AI cost control]