Avai – your first AI antivirus
Avai is an open-source host telemetry tool with an LLM threat classifier. It runs via Docker, monitors 26 aspects of macOS (21 on Linux) including processes, USB, persistence, file integrity, and browser extensions, enriches findings with 17 threat-intel sources, and uses a Claude-class LLM to classify threats as malicious/suspicious/unknown/benign with MITRE-aligned categories and remediation. No agent, SIEM, or cloud control plane required.
Article intelligence
Key points
- Open-source host telemetry + LLM threat classifier, one docker run.
- Monitors 26 corners on macOS (21 on Linux), integrates 17 threat-intel sources.
- Verdicts include MITRE category, confidence, and one-line remediation.
- No agent contract, no SIEM, no cloud control plane; read-only dashboard.
Know what's actually running on your machines. Open-source host telemetry + LLM threat classifier. One docker run.
avai snapshots 26 corners of your host on macOS (21 on Linux) — processes, USB, persistence, file integrity, browser extensions, exec events — enriches each new finding with up to 17 threat-intel sources (VirusTotal, MalwareBazaar, URLhaus, CISA KEV, Shodan, AbuseIPDB, OSV, NVD, …), and lets a Claude-class LLM tell you which ones are worth caring about. Verdicts come back as malicious / suspicious / unknown / benign with a MITRE-aligned category, a confidence, and a one-line remediation.
- No agent contract, no SIEM, no cloud control plane.
- Dedup by content hash: the same artifact is never sent to the LLM twice.
- 17 plug-and-play threat-intel sources behind the LLM (see .env.example); missing keys disable a source cleanly.
- Read-only Flask + HTMX + Chart.js dashboard on :8765.
- BYO key (ANTHROPIC_API_KEY / CLAUDE_CODE_OAUTH_TOKEN), or swap to any litellm-supported provider.
→ Marketing site & screenshots: https://getavai.com
→ Source: https://github.com/iklobato/avai
One image, two roles
| Run | Command | Where it makes sense |
| --- | --- | --- |
| Dashboard (default) | docker run iklob1/avai | any host; read-only Flask + HTMX on :8765 |
| Monitor | docker run ... iklob1/avai avai monitor ... | Linux hosts only; needs pid=host, network=host, and host filesystem bind-mounts |
The image's default CMD is the dashboard. Override the command at docker run / compose level to run the monitor instead. Native install is also possible (pip install avai-monitor, then avai monitor / avai dashboard) but is not the documented path.
The image carries a HEALTHCHECK against the dashboard's /api/notifications/new endpoint — starting → healthy in ~10 s on first launch. docker compose ps and docker inspect --format '{{.State.Health.Status}}' will both reflect it.
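If you want to wait for the dashboard from a script rather than through Docker's health status, the same endpoint can be polled directly. The following is a minimal sketch, not part of avai: the helper names `dashboard_is_up` and `wait_until_up` are hypothetical, and it assumes only that the endpoint answers with HTTP 200 once the dashboard is serving (the response body is not inspected).

```python
import time
import urllib.error
import urllib.request


def dashboard_is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200 (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response: not ready yet.
        return False


def wait_until_up(check, attempts: int = 15, delay: float = 1.0) -> bool:
    """Call check() until it returns True or the attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False
```

A typical call would be `wait_until_up(lambda: dashboard_is_up("http://localhost:8765/api/notifications/new"))`, using the default port and the health-check path named above.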
TL;DR — 60-second test, no LLM key
A safe first run on any host (macOS or Linux), no privileges, no credentials, no host bind-mounts. Produces a populated DB and a green dashboard you can poke at.
    mkdir -p ~/.avai && cd ~/.avai

    # 1. populate the DB with one snapshot of the container's view
    docker run --rm -v "$PWD":/data iklob1/avai \
      avai monitor --once --no-streaming --no-judge --db /data/avai.db

    # 2. serve it
    docker run -d --name avai -p 8765:8765 -v "$PWD":/data iklob1/avai

    open http://localhost:8765/   # macOS; xdg-open on Linux
You'll see ~14 collectors' worth of rows (processes, network_connections, listening_ports, network_interfaces, usb_devices, launch_items, installed_apps, mounts, setuid_files, etc.) — read off the container itself rather than the host, since the run above doesn't bind-mount host state. To get real data, jump to §2 / §3 below.
Stop with docker stop avai && docker rm avai.
1 — Dashboard only (any host, including macOS)
The dashboard reads a SQLite database written by the monitor (or by a previous run). It needs no privileges, no host namespace, no capabilities — just a directory containing avai.db mounted at /data.
    mkdir -p ~/.avai && cd ~/.avai

    docker run -d \
      --name avai-dashboard \
      -p 8765:8765 \
      -v "$PWD":/data \
      iklob1/avai

    open http://localhost:8765/
If the database file doesn't exist yet, the dashboard creates an empty schema on launch and every panel renders empty until the monitor produces rows. Stop with docker stop avai-dashboard && docker rm avai-dashboard.
Override port or DB path
    docker run --rm -p 9000:9000 \
      -v /var/lib/avai:/data \
      iklob1/avai \
      avai dashboard --host 0.0.0.0 --port 9000 --db /data/custom.db
The image entry point is avai; anything after the image name is passed to it.
2 — Monitor: one-shot scan (Linux host)
A single cycle on the local Linux host. No streaming, no LLM judge — fast smoke test that the bind mounts are wired right.
    mkdir -p ~/.avai && cd ~/.avai

    docker run --rm \
      --pid=host \
      --network=host \
      --user 0:0 \
      --cap-add SYS_PTRACE --cap-add NET_ADMIN --cap-add NET_RAW --cap-add DAC_READ_SEARCH \
      -e HOST_PREFIX=/host \
      -v /proc:/host/proc:ro \
      -v /sys:/host/sys:ro \
      -v /etc:/host/etc:ro \
      -v /var/lib/bluetooth:/host/var/lib/bluetooth:ro \
      -v /var/lib/dpkg:/host/var/lib/dpkg:ro \
      -v /usr/share/applications:/host/usr/share/applications:ro \
      -v /lib/systemd:/host/lib/systemd:ro \
      -v /usr/lib/systemd:/host/usr/lib/systemd:ro \
      -v /run/systemd:/run/systemd:ro \
      -v /run/dbus:/run/dbus:ro \
      -v /etc/machine-id:/etc/machine-id:ro \
      -v /dev/mapper:/dev/mapper:ro \
      -v /home:/host/home:ro \
      -v /root:/host/root:ro \
      -v "$PWD":/data \
      iklob1/avai \
      avai monitor --once --no-streaming --no-judge --db /data/avai.db
When the command exits, ~/.avai/avai.db contains one collection_runs row plus the populated collector tables. Verify:
    docker run --rm -v "$PWD":/data iklob1/avai python -c "
    import sqlite3
    c = sqlite3.connect('/data/avai.db')
    for n, in c.execute(\"select name from sqlite_master where type='table'\"):
        print(f'{n: [...]

    [...] /dev/null
    docker rm avai-dashboard avai-monitor 2>/dev/null
    # Wipe the database (also wipes verdicts; monitor will re-judge from scratch)
    rm -f data/avai.db data/avai.db-wal data/avai.db-shm

    # Pull the latest image
    docker pull iklob1/avai
7 — Recipes
Practical, copy‑paste scenarios beyond the basics above.
Native install on a Linux server (full host visibility)
Inside a container on a real Linux host the monitor already works, but the simplest way to watch a server is to install it natively and let it see everything directly:
    pip install 'avai-monitor[judge]'      # [judge] pulls litellm + anthropic
    export ANTHROPIC_API_KEY=sk-ant-...    # or CLAUDE_CODE_OAUTH_TOKEN
    export ABUSE_CH_AUTH_KEY=...           # optional, free; adds 3 sources

    sudo -E avai monitor --db /var/lib/avai/avai.db --interval 300 &
    avai dashboard --db /var/lib/avai/avai.db --host 0.0.0.0 --port 8765
sudo lets the collectors read root‑owned state (/etc/shadow, other users' crontabs, every process). -E preserves your API keys across the sudo boundary.
Keep it running with systemd
/etc/systemd/system/avai.service:
    [Unit]
    Description=avai host monitor
    After=network-online.target

    [Service]
    Environment=ANTHROPIC_API_KEY=sk-ant-...
    Environment=ABUSE_CH_AUTH_KEY=...
    ExecStart=/usr/local/bin/avai monitor --db /var/lib/avai/avai.db --interval 300
    Restart=always
    User=root

    [Install]
    WantedBy=multi-user.target

Then enable it and follow the logs:

    sudo systemctl enable --now avai
    journalctl -u avai -f   # watch cycles
Read findings straight from the command line (no dashboard)
Everything lives in one SQLite file, so you can query it directly — handy for scripting, cron mail, or a server with no browser:
    # The active dangerous + suspicious findings, newest first
    sqlite3 -box /var/lib/avai/avai.db "
      SELECT verdict, collector, substr(reasoning,1,60) AS why
      FROM judgements
      WHERE verdict IN ('malicious','suspicious')
      ORDER BY created_at DESC LIMIT 20;"

    # Count by verdict
    sqlite3 /var/lib/avai/avai.db \
      "SELECT verdict, count(*) FROM judgements GROUP BY verdict;"

    # What did the threat-intel sources say?
    sqlite3 -box /var/lib/avai/avai.db "
      SELECT source, verdict_hint, substr(summary,1,70)
      FROM enrichment_evidence
      WHERE verdict_hint IN ('malicious','suspicious');"
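For scripting, the first query can also be run from Python's standard `sqlite3` module. This is a minimal sketch that assumes only the `judgements` table and the column names used in the sqlite3 commands in this section; the function name `active_findings` is hypothetical. It opens the file with `mode=ro` so the script cannot modify the monitor's database.

```python
import sqlite3
from pathlib import Path


def active_findings(db_path: str, limit: int = 20) -> list[tuple[str, str, str]]:
    """Return (verdict, collector, short reasoning) rows, newest first."""
    # Build a file: URI and request a read-only connection.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return conn.execute(
            """
            SELECT verdict, collector, substr(reasoning, 1, 60)
            FROM judgements
            WHERE verdict IN ('malicious', 'suspicious')
            ORDER BY created_at DESC
            LIMIT ?
            """,
            (limit,),
        ).fetchall()
    finally:
        conn.close()
```

Calling `active_findings("/var/lib/avai/avai.db")` returns a plain list of tuples, which is easy to format into a cron mail or pipe into another tool.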
Run a one‑shot scan from cron (instead of the always‑on daemon)
    # /etc/cron.d/avai: scan once an hour, no streaming
    # (a cron entry must stay on one line; cron does not join lines ending in a backslash)
    0 * * * * root ANTHROPIC_API_KEY=sk-ant-... avai monitor --once --no-streaming --db /var/lib/avai/avai.db
Split setup: monitor on the server, dashboard on your laptop
The monitor writes the DB; the dashboard only reads it. Sync the file (rsync/scp/NFS) and view it anywhere:
    # on the server (writer)
    avai monitor --db /var/lib/avai/avai.db --interval 300

    # pull it to your laptop and view (reader; any OS, no privileges)
    scp server:/var/lib/avai/avai.db ./avai.db
    docker run --rm -p 8765:8765 -v "$PWD":/data iklob1/avai
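The database is accompanied by avai.db-wal and avai.db-shm files, so a plain file copy taken while the monitor is writing may miss recent rows that still sit in the write-ahead log. One way around that, sketched here as an assumption rather than something avai documents, is to take a consistent snapshot on the server with SQLite's online backup API and sync the snapshot instead. The function name `snapshot_db` and both paths are hypothetical.

```python
import sqlite3


def snapshot_db(src_path: str, dest_path: str) -> None:
    """Copy a live SQLite database to dest_path via the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # backup() copies a consistent view even while another
        # connection keeps writing to the source database.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

For example, `snapshot_db("/var/lib/avai/avai.db", "/tmp/avai-snapshot.db")` produces a single self-contained file that can be copied with scp or rsync.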
Keep LLM cost low
    # --judge-model: cheapest tier (default)
    # --judge-max-per-collector: cap new items judged per cycle
    # --judge-batch-size: entries per API call
    avai monitor \
      --judge-model claude-haiku-4-5-20251001 \
      --judge-max-per-collector 20 \
      --judge-batch-size 20
Cost is near‑zero in steady state anyway — only new artifacts are judged, and threat‑intel verdicts are cached, so quiet hosts make almost no API calls after the first cycle.
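To make the effect of the two limits concrete, here is a small sketch of how a per-collector cap followed by fixed-size batching bounds the number of API calls in a cycle. It illustrates the idea only and is not avai's implementation: the function name `plan_batches`, the `collector` key, and the choice to batch across collectors are all assumptions.

```python
from collections import defaultdict
from typing import Iterator


def plan_batches(
    items: list[dict], max_per_collector: int = 20, batch_size: int = 20
) -> Iterator[list[dict]]:
    """Cap new items per collector, then yield them in fixed-size batches."""
    taken: dict[str, int] = defaultdict(int)
    selected = []
    for item in items:
        # Keep at most max_per_collector new items from each collector.
        if taken[item["collector"]] < max_per_collector:
            taken[item["collector"]] += 1
            selected.append(item)
    # Each yielded batch would correspond to one API call.
    for start in range(0, len(selected), batch_size):
        yield selected[start : start + batch_size]
```

Under these assumptions, 50 new process entries and 5 new mounts with a cap of 20 and a batch size of 10 produce three calls covering 25 items; the remaining 30 process entries are left unjudged in that cycle.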
Turn enrichment on/off and debug one source
    avai monitor --no-enrich                  # collectors + judge only
    avai monitor --enrich-only cisa_kev       # just this source (repeatable)
    avai monitor --enrich-only virustotal --enrich-only abuseipdb
Source names: malware_bazaar urlhaus threatfox circl_hashlookup shodan_internetdb feodo_tracker osv cisa_kev nvd endoflife crtsh virustotal abuseipdb greynoise safe_browsing phishtank github_advisory.
Bring your own LLM provider
--judge-model is a litellm model id, so any supported provider works:
    avai monitor --judge-model gpt-4o-mini              # OpenAI (OPENAI_API_KEY)
    avai monitor --judge-model ollama/llama3.1          # local, free, offline
    avai monitor --judge-model gemini/gemini-1.5-pro    # Google
What's collected (one-line summary)
Snapshot collectors (run every cycle, default 300s):
| Group | Sources |
| --- | --- |
| Processes / network | processes, network_connections, listening_ports, network_interfaces (psutil) |
| Hardware | usb_devices (/sys/bus/usb), bluetooth_devices (/var/lib/bluetooth), wifi_state (sysfs + iw) |
| Persistence | launch_items (systemd unit files + cron) |
| Files | file_integrity (passwd / shadow / sudoers / SSH config / dotfiles), setuid_files, mounts |
| Apps | installed_apps (dpkg-query + XDG .desktop), browser_extensions |
| Posture | system_integrity (SELinux / AppArmor / ufw / sshd / vnc / LUKS) |
| Posture (macOS only) | tcc_permissions (camera/mic/location/screen grants), quarantine_events, mdm_profiles, kernel_extensions, system_extensions |
Streaming collectors (events as they happen):
| Collector | Source |
| --- | --- |
| auth_events | journalctl -f (Linux) / unified log (macOS), filtered to security-relevant subsystems. LLM-judged by unique (process, subsystem, message) pattern: each event template is classified once regardless of how many times it fires. |
| process_exec_events | journalctl -f _AUDIT_TYPE_NAME=EXECVE (needs an auditd rule: auditctl -a always,exit -F arch=b64 -S execve) |
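The per-pattern judging of auth_events can be pictured as grouping events by a (process, subsystem, message template) key, so that only unseen keys ever need a verdict. The sketch below illustrates that grouping under stated assumptions: the field names, the `ts` timestamp, and the digit-masking rule used to turn a message into a template are hypothetical, since the source does not say how avai derives its templates.

```python
import re
from collections import Counter


def pattern_key(event: dict) -> tuple[str, str, str]:
    """Build the (process, subsystem, template) key for one event."""
    # Assumption: variable parts are masked by replacing digit runs.
    template = re.sub(r"\d+", "<n>", event["message"])
    return (event["process"], event["subsystem"], template)


def aggregate(events: list[dict]) -> tuple[Counter, dict]:
    """Count occurrences and track the last-seen timestamp per pattern."""
    counts: Counter = Counter()
    last_seen: dict = {}
    for ev in events:
        key = pattern_key(ev)
        counts[key] += 1
        last_seen[key] = max(ev["ts"], last_seen.get(key, ev["ts"]))
    return counts, last_seen


def needs_judging(counts: Counter, judged: set) -> list:
    """Return the patterns that have not been classified yet."""
    return [key for key in counts if key not in judged]
```

With this shape, a pattern that fires a thousand times still costs one classification, and the counts and last-seen values are available for display.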
For every entity collected (deduped by a content hash over the collector's "judge fields"), the LLM judge classifies it as malicious / suspicious / unknown / benign with a confidence, MITRE-aligned category, and one-line remediation. Judgments are persisted; the same artifact is never sent twice.
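As an illustration of deduplication by content hash, the sketch below hashes only a chosen subset of an entity's fields, so volatile fields such as a PID do not make the same artifact look new. It is a sketch under assumptions, not avai's code: the function names, the example field names, and the use of SHA-256 over canonical JSON are all hypothetical choices.

```python
import hashlib
import json


def content_hash(entity: dict, judge_fields: list[str]) -> str:
    """Hash only the fields that matter for judging."""
    subset = {field: entity.get(field) for field in judge_fields}
    # sort_keys gives a stable serialization regardless of dict order.
    payload = json.dumps(subset, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def unjudged(entities: list[dict], judge_fields: list[str], seen: set) -> list[dict]:
    """Return entities whose hash is new, recording each hash in seen."""
    fresh = []
    for entity in entities:
        digest = content_hash(entity, judge_fields)
        if digest not in seen:
            seen.add(digest)
            fresh.append(entity)
    return fresh
```

In a persistent setup, `seen` would be backed by the stored judgements rather than an in-memory set, which is what lets a quiet host skip the LLM entirely after its first cycle.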
Dashboard
The Flask + HTMX dashboard at :8765 has full filter and pagination on every table:
- Findings: filter by verdict, collector, category, status (active/resolved), free-text search; sortable columns; configurable page size (10/25/50/100).
- Network flows: filter by verdict and IP/host/process search; summary stats (destinations, volume, malicious count).
- Listening ports: filter by verdict and bind scope (all interfaces / routable / loopback); process search.
- DNS queries: filter by verdict, resolution level (DoH / external DNS / local resolver), domain search.
- Persistence: SSH authorized keys, /etc/hosts mappings, and privilege config, each with independent pagination.
- Auth events: aggregated by unique (process, subsystem, message) pattern with occurrence counts and last-seen timestamps. Filter by subsystem (TCC, securityd, syspolicy, loginwindow, Authorization) or verdict. Sort by count or verdict severity. LLM verdicts
[truncated for AI cost control]