Show HN: Adaptive Runtime – AI agent layer, no GPU, crash recovery
Adaptive Runtime is an open-source Python library that provides a runtime intelligence layer for stateful AI systems. It features five core engines (State, Context, Confidence, Decision, Recovery) that address production runtime issues like crash recovery, state persistence, confidence scoring, and more. No GPU required, runs on low-cost VPS.
Article intelligence
Key points
- Adaptive Runtime is a runtime intelligence layer for stateful AI systems, addressing production runtime problems.
- It includes five core engines: State, Context, Confidence, Decision, and Recovery.
- No GPU needed, minimal dependencies, runs on Raspberry Pi or $5 VPS.
- Built-in SQLite persistence, crash recovery, and adaptive confidence scoring.
Why it matters
This matters because adaptive Runtime is a runtime intelligence layer for stateful AI systems, addressing production runtime problems.
Technical impact
May affect model selection, inference cost, product capability, and evaluation benchmarks.
Notifications You must be signed in to change notification settings
Fork 0
Star 0
BranchesTags
Open more actions menu
Folders and files
NameName
Last commit message
Last commit date
Latest commit
History
2 Commits
2 Commits
.github/workflows
.github/workflows
core
core
examples
examples
observability
observability
runtime
runtime
storage
storage
tests
tests
.gitignore
.gitignore
LICENSE
LICENSE
README.md
README.md
init.py
init.py
demo.yml
demo.yml
demo_record.py
demo_record.py
pyproject.toml
pyproject.toml
render1779972984877.gif
render1779972984877.gif
Repository files navigation
Runtime Intelligence Layer for Stateful AI Systems
Not a chatbot framework. Not an LLM wrapper. Not a workflow builder.
An adaptive runtime intelligence layer — the missing piece between your AI logic and production reality.
The Problem
Most AI frameworks solve the model problem.
Nobody solves the runtime problem.
Your AI agent in development: Works perfectly. Your AI agent in production: Crashes. Forgets state. Retries blindly. Dies silently.
Production AI systems fail because of:
💥 No crash recovery — state lost on restart
🧠 No memory — agent forgets context between sessions
🔁 Retry chaos — blind retries with no back-off
📉 No confidence scoring — decisions made without certainty
🌊 No contextual awareness — can't adapt to changing conditions
Adaptive Runtime fixes this.
See It Running
[16:08:13][RUNTIME] Event received: service_overload [16:08:13][CONTEXT_ENGINE] risk=high stability=low pressure=0.65 [16:08:13][CONFIDENCE_ENGINE] confidence=0.84 [16:08:13][DECISION_ENGINE] ACTION: RESTART_SERVICE [16:08:13][STATE_ENGINE] State persisted [16:08:13][RECOVERY_ENGINE] Checkpoint #3 created
→ restart_service [high] conf=0.840
[16:08:14][RUNTIME] Event received: anomaly_detected [16:08:14][CONTEXT_ENGINE] risk=low stability=stable pressure=0.32 [16:08:14][CONFIDENCE_ENGINE] confidence=0.62 [16:08:14][DECISION_ENGINE] ACTION: FLAG_FOR_REVIEW [16:08:14][STATE_ENGINE] State persisted
→ flag_for_review [low] conf=0.620
The runtime thinks, decides, remembers, and recovers — automatically.
How It Works
Event (CPU spike, anomaly, timeout, auth failure...) │ ▼ ┌─────────────────┐ │ Context Engine │ → Analyzes conditions: risk, stability, pressure score └────────┬────────┘ │ ▼ ┌──────────────────────┐ │ Confidence Engine │ → Calculates adaptive confidence (with decay + history) └────────┬─────────────┘ │ ▼ ┌──────────────────┐ │ Decision Engine │ → Selects action: restart / throttle / rollback / recover... └────────┬─────────┘ │ ▼ ┌──────────────────┐ │ State Engine │ → Persists state to SQLite (survives crashes) └────────┬─────────┘ │ ▼ ┌──────────────────────┐ │ Recovery Engine │ → Creates checkpoint, handles retry with back-off └──────────────────────┘
Quick Start
pip install pydantic aiosqlite
import asyncio from adaptive_runtime import Runtime
async def main(): runtime = Runtime(agent_id="my-agent") await runtime.start()
result = await runtime.process({ "type": "service_overload", "severity": 0.82, "cpu": 94, "memory": 88, })
print(result.action) # "restart_service" print(result.confidence) # 0.7831 print(result.reason) # "high_resource_pressure" print(result.priority) # "high"
await runtime.stop()
asyncio.run(main())
That's it. No API keys. No cloud setup. No GPU. Runs on a $5 VPS.
Killer Example: Adaptive Monitoring System
import asyncio from adaptive_runtime import Runtime
async def monitor(): runtime = Runtime(agent_id="prod-monitor", checkpoint_every=5)
Subscribe to critical events
@runtime.bus.subscribe("anomaly_detected") async def on_anomaly(event): print(f" ⚠ Anomaly handler fired — severity={event['severity']}")
await runtime.start()
Simulate real production events
events = [ {"type": "service_overload", "severity": 0.91, "cpu": 96, "memory": 92}, {"type": "anomaly_detected", "severity": 0.74, "error_rate": 0.6}, {"type": "auth_failure", "severity": 0.55}, {"type": "timeout", "severity": 0.45, "latency_ms": 4200}, {"type": "recovery_needed", "severity": 0.30}, ]
for event in events: result = await runtime.process(event) print(f" [{result.priority.upper()}] {event['type']:25s} → {result.action}")
Runtime remembers everything
history = await runtime.event_history(limit=5) print(f"\n Last {len(history)} events remembered across sessions.")
await runtime.stop()
asyncio.run(monitor())
Output:
[HIGH] service_overload → scale_up_immediate [NORMAL] anomaly_detected → flag_for_review ⚠ Anomaly handler fired — severity=0.74 [NORMAL] auth_failure → trigger_security_audit [LOW] timeout → cache_warmup [LOW] recovery_needed → run_recovery
Last 5 events remembered across sessions.
Why Not LangChain?
This question will come up. Here's the honest answer:
LangChain / AutoGen Adaptive Runtime
Purpose LLM orchestration Runtime behavior
Core abstraction Prompt chains Stateful events
Intelligence Language model Probabilistic engine
Dependencies Heavy (openai, tiktoken, ...) Minimal (pydantic, aiosqlite)
GPU required Sometimes Never
Crash recovery ❌ ✅ Built-in
State persistence External setup ✅ Built-in SQLite
Confidence scoring ❌ ✅ Adaptive
Runs on $5 VPS Barely ✅ Designed for it
Use case Chat, RAG, agents Runtime resilience
TL;DR: LangChain makes LLMs useful. Adaptive Runtime makes AI systems reliable.
They solve different problems. Use both, or use this standalone.
Runtime Philosophy
Most AI problems in production are not model problems.
They are runtime problems.
Adaptive Runtime is built around the belief that future AI systems need:
Memory — state that survives crashes and restarts
Resilience — self-healing with checkpoints and retry logic
Contextual behavior — decisions that adapt to real conditions
Confidence awareness — knowing how certain a decision is
Lightweight cognition — intelligence without neural dependency
Not just prompts. Not just workflows. Runtime intelligence.
The 5 Core Engines
- State Engine
Persistent agent memory. Survives crashes. SQLite by default.
await state_engine.save_state({"health": "ok", "version": "1.2"}) state = await state_engine.load_state() # Restored after restart await state_engine.patch_state({"last": "ok"}) # Partial update
- Context Engine
Transforms raw signals into contextual understanding — no ML needed.
ctx = context_engine.analyze({ "type": "service_overload", "cpu": 94, "memory": 88, "severity": 0.82 })
→ risk="high", stability="low", context="resource_pressure", pressure=0.65
- Confidence Engine
Adaptive probabilistic scoring with historical weighting and decay.
conf = confidence_engine.calculate(event, context_risk="high")
→ conf.final = 0.7831 (lower when risk is high, adapts from history)
confidence_engine.record_outcome(success=True, confidence=0.78, context_risk="high")
- Decision Engine
Explainable rule-based action selection. Extensible with custom rules.
decision = decision_engine.decide(event, "resource_pressure", "high", 0.78)
→ action="restart_service", reason="high_resource_pressure", priority="high"
Add your own rules:
custom_rules = [("my_context", "high", 0.70, "my_action", "my_reason")] engine = DecisionEngine(custom_rules=custom_rules)
- Recovery Engine
Crash recovery, checkpoint snapshots, exponential back-off retry.
await recovery_engine.create_checkpoint(state) # Save checkpoint state = await recovery_engine.restore_latest() # Restore after crash result = await recovery_engine.retry(fn, fallback=fallback_fn) # Retry with back-off
Designed for Constrained Environments
✅ Raspberry Pi ✅ $5 VPS (512MB RAM) ✅ Old laptop ✅ Edge devices ✅ Offline / air-gapped systems ✅ Serverless (cold start friendly)
No GPU. No cloud lock-in. No heavy ML frameworks.
Just Python + asyncio + SQLite.
Project Structure
adaptive_runtime/ │ ├── core/ │ ├── state_engine.py # State persistence and memory │ ├── context_engine.py # Event → contextual classification │ ├── confidence_engine.py # Adaptive probabilistic confidence │ ├── decision_engine.py # Rule-based action selection │ └── recovery_engine.py # Crash recovery + retry orchestration │ ├── runtime/ │ ├── runtime_manager.py # Main orchestrator (Runtime class) │ ├── event_bus.py # Async pub/sub event bus │ └── cache.py # TTL-based in-memory cache │ ├── storage/ │ ├── sqlite_store.py # Async SQLite persistence │ └── memory_store.py # In-process ephemeral store (testing) │ ├── observability/ │ ├── logger.py # Structured color logger │ └── metrics.py # Lightweight in-memory metrics │ ├── examples/ │ ├── agent_demo.py # Basic event processing │ ├── monitoring_demo.py # Continuous monitoring + event bus │ └── automation_demo.py # Retry + crash recovery │ └── tests/ └── test_engines.py # 12 unit tests — all engines
Run the Examples
Clone
git clone https://github.com/stateflow-dev/adaptive-runtime.git cd adaptive-runtime
Install
pip install pydantic aiosqlite
Run demos
python examples/agent_demo.py python examples/monitoring_demo.py python examples/automation_demo.py
Run tests
pip install pytest pytest-asyncio pytest tests/ -v
→ 12 passed
Roadmap
Feature Status
✅ 5 Core Engines Tier 1 — Released
✅ SQLite + Memory store Tier 1 — Released
✅ Async event bus Tier 1 — Released
✅ Retry + crash recovery Tier 1 — Released
🔜 REST API adapter (FastAPI) Tier 2
🔜 Multi-agent orchestration Tier 2
🔜 Plugin system Tier 2
🔜 Real-time dashboard Tier 2
🔜 Distributed runtime Tier 3
Benchmarks
Measured on a mid-range Windows laptop (Python 3.10, SQLite, no GPU).
Metric Result
Cold start 446 ms
Idle memory 29 MB
CPU idle usage <0%
SQLite save latency 36.5 ms avg (n=50)
SQLite load latency 2.7 ms avg (n=50)
Event processing 109.2 ms avg (n=50)
GPU required ❌ Never
Runs comfortably on a $5 VPS (512MB RAM). No GPU. No cloud lock-in.
Contributing
Issues and PRs welcome. Please open an issue first for major changes.
License
MIT © Stateflow Labs
"The biggest AI problems in production are not model problems. They are runtime problems."
About
Adaptive Runtime Layer for Stateful AI Systems
Topics
python
ai
runtime
asyncio
event-driven
stateful-ai
Resources
Readme
License
MIT license
Uh oh!
There was an error while loading. Please reload this page.
Activity
Stars
0 stars
Watchers
0 watching
Forks
0 forks
Report repository
Releases 1
v0.1.0 — Tier 1 Adaptive Runtime Core
Latest
May 28, 2026
Packages 0
Uh oh!
There was an error while loading. Please reload this page.
Contributors
Uh oh!
There was an error while loading. Please reload this page.
Languages
Python 100.0%