
Avai – your first AI antivirus

Avai is an open-source host telemetry tool with an LLM threat classifier. It runs via Docker, monitors 26 aspects of macOS (21 on Linux) including processes, USB, persistence, file integrity, and browser extensions, enriches findings with 17 threat-intel sources, and uses a Claude-class LLM to classify threats as malicious/suspicious/unknown/benign with MITRE-aligned categories and remediation. No agent, SIEM, or cloud control plane required.


Key points

  • Open-source host telemetry + LLM threat classifier, one docker run.
  • Monitors 26 corners on macOS (21 on Linux), integrates 17 threat-intel sources.
  • Verdicts include MITRE category, confidence, and one-line remediation.
  • No agent contract, no SIEM, no cloud control plane; read-only dashboard.

Why it matters

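This matters because it packages open-source host telemetry and an LLM threat classifier into a single docker run, so host-level threat triage needs no agent, SIEM, or cloud control plane.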

Technical impact

May affect model selection, inference cost, product capability, and evaluation benchmarks.


Know what's actually running on your machines. Open-source host telemetry + LLM threat classifier. One docker run.

avai snapshots 26 corners of your host on macOS (21 on Linux) — processes, USB, persistence, file integrity, browser extensions, exec events — enriches each new finding with up to 17 threat-intel sources (VirusTotal, MalwareBazaar, URLhaus, CISA KEV, Shodan, AbuseIPDB, OSV, NVD, …), and lets a Claude-class LLM tell you which ones are worth caring about. Verdicts come back as malicious / suspicious / unknown / benign with a MITRE-aligned category, a confidence, and a one-line remediation.

No agent contract, no SIEM, no cloud control plane.

Dedup by content hash — the same artifact is never sent to the LLM twice.

17 plug-and-play threat-intel sources behind the LLM — see .env.example; missing keys disable a source cleanly.

Read-only Flask + HTMX + Chart.js dashboard on :8765.

BYO key (ANTHROPIC_API_KEY / CLAUDE_CODE_OAUTH_TOKEN), or swap to any litellm-supported provider.

→ Marketing site & screenshots: https://getavai.com
→ Source: https://github.com/iklobato/avai

One image, two roles

| Run | Command | Where it makes sense |
| --- | --- | --- |
| Dashboard (default) | docker run iklob1/avai | any host; read-only Flask + HTMX on :8765 |
| Monitor | docker run ... iklob1/avai avai monitor ... | Linux hosts only; needs pid=host, network=host, and host filesystem bind-mounts |

The image's default CMD is the dashboard. Override the command at docker run / compose level to run the monitor instead. Native install is also possible (pip install avai-monitor, then avai monitor / avai dashboard) but is not the documented path.

The image carries a HEALTHCHECK against the dashboard's /api/notifications/new endpoint — starting → healthy in ~10 s on first launch. docker compose ps and docker inspect --format '{{.State.Health.Status}}' will both reflect it.
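For scripted deployments, the health state described above can be polled before anything tries to open the dashboard. The following is a minimal sketch, not part of avai: `docker_health` and `wait_until_healthy` are hypothetical helper names, and the container name passed in is whatever `--name` you chose.

```python
import subprocess
import time
from typing import Callable


def docker_health(container: str) -> str:
    """Return the container's health status as reported by `docker inspect`."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", container],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_until_healthy(probe: Callable[[], str], timeout: float = 60.0,
                       interval: float = 2.0) -> bool:
    """Poll `probe` until it returns "healthy" or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if probe() == "healthy":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Example (requires Docker and a running container named "avai"):
#   wait_until_healthy(lambda: docker_health("avai"))
```

Taking the probe as a callable keeps the polling loop independent of Docker, so the same loop could wrap an HTTP check against the dashboard instead.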

TL;DR — 60-second test, no LLM key

A safe first run on any host (macOS or Linux), no privileges, no credentials, no host bind-mounts. Produces a populated DB and a green dashboard you can poke at.

mkdir -p ~/.avai && cd ~/.avai

1. populate the DB with one snapshot of the container's view

docker run --rm -v "$PWD":/data iklob1/avai \
  avai monitor --once --no-streaming --no-judge --db /data/avai.db

2. serve it

docker run -d --name avai -p 8765:8765 -v "$PWD":/data iklob1/avai

open http://localhost:8765/ # macOS; xdg-open on Linux

You'll see ~14 collectors' worth of rows (processes, network_connections, listening_ports, network_interfaces, usb_devices, launch_items, installed_apps, mounts, setuid_files, etc.) — read off the container itself rather than the host, since the run above doesn't bind-mount host state. To get real data, jump to §2 / §3 below.

Stop with docker stop avai && docker rm avai.

1 — Dashboard only (any host, including macOS)

The dashboard reads a SQLite database written by the monitor (or by a previous run). It needs no privileges, no host namespace, no capabilities — just a directory containing avai.db mounted at /data.

mkdir -p ~/.avai && cd ~/.avai

docker run -d \
  --name avai-dashboard \
  -p 8765:8765 \
  -v "$PWD":/data \
  iklob1/avai

open http://localhost:8765/

If the database file doesn't exist yet, the dashboard creates an empty schema on launch and every panel renders empty until the monitor produces rows. Stop with docker stop avai-dashboard && docker rm avai-dashboard.

Override port or DB path

docker run --rm -p 9000:9000 \
  -v /var/lib/avai:/data \
  iklob1/avai \
  avai dashboard --host 0.0.0.0 --port 9000 --db /data/custom.db

The image entry point is avai; anything after the image name is passed to it.

2 — Monitor: one-shot scan (Linux host)

A single cycle on the local Linux host. No streaming, no LLM judge — fast smoke test that the bind mounts are wired right.

mkdir -p ~/.avai && cd ~/.avai

docker run --rm \
  --pid=host \
  --network=host \
  --user 0:0 \
  --cap-add SYS_PTRACE --cap-add NET_ADMIN --cap-add NET_RAW --cap-add DAC_READ_SEARCH \
  -e HOST_PREFIX=/host \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  -v /etc:/host/etc:ro \
  -v /var/lib/bluetooth:/host/var/lib/bluetooth:ro \
  -v /var/lib/dpkg:/host/var/lib/dpkg:ro \
  -v /usr/share/applications:/host/usr/share/applications:ro \
  -v /lib/systemd:/host/lib/systemd:ro \
  -v /usr/lib/systemd:/host/usr/lib/systemd:ro \
  -v /run/systemd:/run/systemd:ro \
  -v /run/dbus:/run/dbus:ro \
  -v /etc/machine-id:/etc/machine-id:ro \
  -v /dev/mapper:/dev/mapper:ro \
  -v /home:/host/home:ro \
  -v /root:/host/root:ro \
  -v "$PWD":/data \
  iklob1/avai \
  avai monitor --once --no-streaming --no-judge --db /data/avai.db

When the command exits, ~/.avai/avai.db contains one collection_runs row plus the populated collector tables. Verify:

docker run --rm -v "$PWD":/data iklob1/avai python -c "
import sqlite3
c = sqlite3.connect('/data/avai.db')
for n, in c.execute(\"select name from sqlite_master where type='table'\"):
    print(f'{n:…

… /dev/null
docker rm avai-dashboard avai-monitor 2>/dev/null
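The verification step for the one-shot scan lists the tables in avai.db. As a rough stand-alone equivalent, the sketch below maps each table to its row count using only the standard library. `table_row_counts` is a hypothetical helper, not something avai ships, and it assumes nothing about the schema beyond what sqlite_master reports.

```python
import sqlite3


def table_row_counts(db_path: str) -> dict[str, int]:
    """Map every table in the SQLite file to its row count."""
    conn = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        # Names come straight from sqlite_master, so double-quoting them is
        # enough to build the count query.
        return {
            name: conn.execute(f'SELECT count(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


# Example:
#   for name, rows in table_row_counts("avai.db").items():
#       print(f"{name:<30} {rows}")
```

After a successful one-shot scan, collection_runs should report one row and the collector tables should be non-empty.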

Wipe the database (also wipes verdicts; monitor will re-judge from scratch)

rm -f data/avai.db data/avai.db-wal data/avai.db-shm

Pull the latest image

docker pull iklob1/avai

7 — Recipes

Practical, copy‑paste scenarios beyond the basics above.

Native install on a Linux server (full host visibility)

Inside a container on a real Linux host the monitor already works, but the simplest way to watch a server is to install it natively and let it see everything directly:

pip install 'avai-monitor[judge]'      # [judge] pulls litellm + anthropic
export ANTHROPIC_API_KEY=sk-ant-...    # or CLAUDE_CODE_OAUTH_TOKEN
export ABUSE_CH_AUTH_KEY=...           # optional, free; adds 3 sources

sudo -E avai monitor --db /var/lib/avai/avai.db --interval 300 &
avai dashboard --db /var/lib/avai/avai.db --host 0.0.0.0 --port 8765

sudo lets the collectors read root‑owned state (/etc/shadow, other users' crontabs, every process). -E preserves your API keys across the sudo boundary.

Keep it running with systemd

/etc/systemd/system/avai.service:

[Unit]
Description=avai host monitor
After=network-online.target

[Service]
Environment=ANTHROPIC_API_KEY=sk-ant-...
Environment=ABUSE_CH_AUTH_KEY=...
ExecStart=/usr/local/bin/avai monitor --db /var/lib/avai/avai.db --interval 300
Restart=always
User=root

[Install]
WantedBy=multi-user.target

sudo systemctl enable --now avai
journalctl -u avai -f   # watch cycles

Read findings straight from the command line (no dashboard)

Everything lives in one SQLite file, so you can query it directly — handy for scripting, cron mail, or a server with no browser:

The active dangerous + suspicious findings, newest first

sqlite3 -box /var/lib/avai/avai.db "
  SELECT verdict, collector, substr(reasoning,1,60) AS why
  FROM judgements
  WHERE verdict IN ('malicious','suspicious')
  ORDER BY created_at DESC LIMIT 20;"

Count by verdict

sqlite3 /var/lib/avai/avai.db \
  "SELECT verdict, count(*) FROM judgements GROUP BY verdict;"

What did the threat-intel sources say?

sqlite3 -box /var/lib/avai/avai.db "
  SELECT source, verdict_hint, substr(summary,1,70)
  FROM enrichment_evidence
  WHERE verdict_hint IN ('malicious','suspicious');"
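For scripting beyond one-liners, the same queries can be run from Python's standard sqlite3 module. This is a sketch under the assumption that the judgements table has the columns used in the queries above (verdict, collector, reasoning, created_at); the function names are illustrative, not part of avai.

```python
import sqlite3


def active_findings(conn: sqlite3.Connection, limit: int = 20) -> list[tuple]:
    """Newest malicious/suspicious judgements, mirroring the first query."""
    return conn.execute(
        """
        SELECT verdict, collector, substr(reasoning, 1, 60) AS why
        FROM judgements
        WHERE verdict IN ('malicious', 'suspicious')
        ORDER BY created_at DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


def verdict_counts(conn: sqlite3.Connection) -> dict[str, int]:
    """Count judgements per verdict, mirroring the second query."""
    return dict(conn.execute(
        "SELECT verdict, count(*) FROM judgements GROUP BY verdict"))


# Example:
#   conn = sqlite3.connect("/var/lib/avai/avai.db")
#   print(verdict_counts(conn))
```

A cron job could, for instance, send mail only when `active_findings` returns a non-empty list.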

Run a one‑shot scan from cron (instead of the always‑on daemon)

/etc/cron.d/avai — scan once an hour, no streaming

0 * * * * root ANTHROPIC_API_KEY=sk-ant-... avai monitor --once --no-streaming --db /var/lib/avai/avai.db

Split setup: monitor on the server, dashboard on your laptop

The monitor writes the DB; the dashboard only reads it. Sync the file (rsync/scp/NFS) and view it anywhere:

on the server (writer)

avai monitor --db /var/lib/avai/avai.db --interval 300

pull it to your laptop and view (reader — any OS, no privileges)

scp server:/var/lib/avai/avai.db ./avai.db
docker run --rm -p 8765:8765 -v "$PWD":/data iklob1/avai
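The wipe command earlier lists avai.db-wal and avai.db-shm next to avai.db, which suggests the database runs in SQLite's WAL mode. If so, copying avai.db alone while the monitor is writing may miss recently committed rows. One way around that, sketched here as an assumption rather than a documented avai workflow, is to take a consistent snapshot on the server with SQLite's online backup API and copy the snapshot instead. `snapshot_db` is a hypothetical helper name.

```python
import os
import sqlite3


def snapshot_db(src_path: str, dest_path: str) -> None:
    """Write a consistent copy of a live SQLite database to dest_path."""
    if not os.path.exists(src_path):
        # sqlite3.connect would otherwise silently create an empty file.
        raise FileNotFoundError(src_path)
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # The backup API reads through SQLite itself, so committed data
        # still sitting in the WAL file is included in the copy.
        src.backup(dest)
    finally:
        dest.close()
        src.close()


# Example (on the server, before scp):
#   snapshot_db("/var/lib/avai/avai.db", "/tmp/avai-snapshot.db")
```

The snapshot is a single self-contained file, so it can be copied with scp or rsync and mounted at /data for the dashboard as shown above.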

Keep LLM cost low

# --judge-model: cheapest tier (default)
# --judge-max-per-collector: cap new items judged per cycle
# --judge-batch-size: entries per API call
avai monitor \
  --judge-model claude-haiku-4-5-20251001 \
  --judge-max-per-collector 20 \
  --judge-batch-size 20

Cost is near‑zero in steady state anyway — only new artifacts are judged, and threat‑intel verdicts are cached, so quiet hosts make almost no API calls after the first cycle.
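To get a feel for how the two caps bound API usage, the arithmetic can be sketched as below. This assumes a simple model that the source does not confirm: each collector's new items are capped at --judge-max-per-collector and then sent in batches of --judge-batch-size, with batches never spanning collectors. `max_judge_calls` is a hypothetical helper for estimation only.

```python
import math


def max_judge_calls(new_items_per_collector: dict[str, int],
                    max_per_collector: int = 20,
                    batch_size: int = 20) -> int:
    """Upper bound on LLM calls in one cycle under the assumed batching model."""
    calls = 0
    for new_items in new_items_per_collector.values():
        judged = min(new_items, max_per_collector)  # per-collector cap
        calls += math.ceil(judged / batch_size)     # one API call per batch
    return calls


# First cycle on a busy host vs. a quiet steady-state cycle:
print(max_judge_calls({"processes": 150, "usb_devices": 3, "mounts": 0}))  # 2
print(max_judge_calls({"processes": 0, "usb_devices": 0, "mounts": 0}))    # 0
```

Under this model, a cycle with no new artifacts makes no judge calls at all, which matches the near-zero steady-state cost described above.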

Turn enrichment on/off and debug one source

avai monitor --no-enrich                 # collectors + judge only
avai monitor --enrich-only cisa_kev      # just this source (repeatable)
avai monitor --enrich-only virustotal --enrich-only abuseipdb

Source names: malware_bazaar urlhaus threatfox circl_hashlookup shodan_internetdb feodo_tracker osv cisa_kev nvd endoflife crtsh virustotal abuseipdb greynoise safe_browsing phishtank github_advisory.

Bring your own LLM provider

--judge-model is a litellm model id, so any supported provider works:

avai monitor --judge-model gpt-4o-mini             # OpenAI (OPENAI_API_KEY)
avai monitor --judge-model ollama/llama3.1         # local, free, offline
avai monitor --judge-model gemini/gemini-1.5-pro   # Google

What's collected (one-line summary)

Snapshot collectors (run every cycle, default 300s):

| Group | Sources |
| --- | --- |
| Processes / network | processes, network_connections, listening_ports, network_interfaces (psutil) |
| Hardware | usb_devices (/sys/bus/usb), bluetooth_devices (/var/lib/bluetooth), wifi_state (sysfs + iw) |
| Persistence | launch_items (systemd unit files + cron) |
| Files | file_integrity (passwd / shadow / sudoers / SSH config / dotfiles), setuid_files, mounts |
| Apps | installed_apps (dpkg-query + XDG .desktop), browser_extensions |
| Posture | system_integrity (SELinux / AppArmor / ufw / sshd / vnc / LUKS) |
| Posture (macOS only) | tcc_permissions (camera/mic/location/screen grants), quarantine_events, mdm_profiles, kernel_extensions, system_extensions |

Streaming collectors (events as they happen):

| Collector | Source |
| --- | --- |
| auth_events | journalctl -f (Linux) / macOS unified log (macOS), filtered to security-relevant subsystems. LLM-judged by unique (process, subsystem, message) pattern: each event template is classified once regardless of how many times it fires. |
| process_exec_events | journalctl -f _AUDIT_TYPE_NAME=EXECVE (needs auditd auditctl -a always,exit -F arch=b64 -S execve rule) |

For every entity collected (deduped by a content hash over the collector's "judge fields"), the LLM judge classifies it as malicious / suspicious / unknown / benign with a confidence, MITRE-aligned category, and one-line remediation. Judgments are persisted; the same artifact is never sent twice.
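The dedup step described above can be illustrated with a short sketch. It is not avai's implementation: SHA-256, the JSON canonicalization, the in-memory `seen` set, and the field names in the example are all assumptions chosen for illustration. A real deployment would persist the hashes, as avai does with its judgements.

```python
import hashlib
import json


def content_hash(entity: dict, judge_fields: list[str]) -> str:
    """Stable SHA-256 over the selected fields only."""
    subset = {field: entity.get(field) for field in judge_fields}
    # Sorted keys and fixed separators make the serialization deterministic.
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def select_unjudged(entities: list[dict], judge_fields: list[str],
                    seen: set[str]) -> list[dict]:
    """Return entities whose hash is new, recording each hash in `seen`."""
    fresh = []
    for entity in entities:
        digest = content_hash(entity, judge_fields)
        if digest not in seen:
            seen.add(digest)
            fresh.append(entity)
    return fresh


# Two sshd processes differ only by pid, which is not a judge field here,
# so only the first one would be sent to the judge.
seen: set[str] = set()
procs = [
    {"name": "sshd", "exe": "/usr/sbin/sshd", "pid": 101},
    {"name": "sshd", "exe": "/usr/sbin/sshd", "pid": 202},
]
print(len(select_unjudged(procs, ["name", "exe"], seen)))  # 1
print(len(select_unjudged(procs, ["name", "exe"], seen)))  # 0
```

Hashing only the judge fields is what keeps volatile attributes such as a pid from triggering a fresh classification on every cycle.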

Dashboard

The Flask + HTMX dashboard at :8765 has full filter and pagination on every table:

Findings — filter by verdict, collector, category, status (active/resolved), free-text search; sortable columns; configurable page size (10/25/50/100).

Network flows — filter by verdict and IP/host/process search; summary stats (destinations, volume, malicious count).

Listening ports — filter by verdict and bind scope (all interfaces / routable / loopback); process search.

DNS queries — filter by verdict, resolution level (DoH / external DNS / local resolver), domain search.

Persistence — SSH authorized keys, /etc/hosts mappings, and privilege config each with independent pagination.

Auth events — aggregated by unique (process, subsystem, message) pattern with occurrence counts and last-seen timestamps. Filter by subsystem (TCC, securityd, syspolicy, loginwindow, Authorization) or verdict. Sort by count or verdict severity. LLM verdicts

[truncated for AI cost control]