Your Own Private AI, Part 2: Secure Access from Anywhere with Tailscale Aperture
A guide to securely accessing a self-hosted LLM from any device using Tailscale's private network and Aperture AI gateway, without exposing the model to the public internet.
Jun 20, 2026 · 8 min read
Sovereign AI
Local LLM
NVIDIA DGX Spark
Tailscale
Aperture
AI Infrastructure
Self-Hosting
In Part 1, we worked through Stages 1 and 2 to get a Qwen3.6-35B-A3B-FP8 Mixture of Experts (MoE) model serving an OpenAI-compatible API on a “SparkStation”, a GB10 NVIDIA DGX Spark-class machine. However, the model is only accessible on the machine itself, via localhost:8000. Here in Part 2, we run through Stages 3 to 5 to make the model securely reachable from any other device you choose, without ever exposing it to the public internet. These instructions should be helpful even if you run a different local AI model, served by something other than a SparkStation.
The problem, and the plan
I am constantly looking for a better way to operate and access sovereign AI solutions. I want self-hosted models running on private infrastructure that are as flexibly accessible as the solutions from OpenAI or Anthropic. But a model answering at localhost:8000 is only usable by the machine it runs on. The obvious way to make it accessible from anywhere is to forward a port through my router. However, this is also dangerous: it puts an unauthenticated AI endpoint on the open internet for anyone to find and abuse.
Instead, my current approach has two layers:
Tailscale — a private mesh network (“tailnet”) that encrypts direct connections between approved (“allowlisted”) devices, as if they were on the same LAN, no matter where they are physically. Nothing is exposed publicly; only devices that I’ve explicitly added can reach each other. Tailscale offers a very generous free tier, which I’ve been using for several months and have not yet exceeded. Your mileage may vary.
Aperture (by Tailscale) — an “AI gateway” that sits in front of one or more models on the tailnet. It authenticates every request by the caller’s Tailscale identity, so there are no API keys to distribute, and it logs all usage centrally.
If you follow this guide, your locally hosted models will be reachable only over your private network, and every request through the gateway will be identified and recorded. That’s genuinely private, secure, managed AI.
Concepts in one breath. A tailnet is your private device network. MagicDNS is Tailscale’s feature that lets you address devices by name (e.g. gateway) instead of IP. A provider in Aperture is an upstream model. A grant is a rule saying who may use which models. You’ll meet each below.
Stage 3 — Put the server on your private network
First, get the GB10 machine (or whatever machine you are using as the “AI server”) onto your tailnet.
3.1 Install and join Tailscale on the server
On the server, install Tailscale and bring it up:
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up
tailscale up prints a URL — open it in any browser and sign in (Google, GitHub, Microsoft, or email all work). That authenticates this machine and adds it to your tailnet. The account you sign in with defines your tailnet, so remember which one you use — every device must join the same account.
3.2 Note the server’s Tailscale IP
tailscale ip -4
You’ll get an address in the 100.x.x.x range — Tailscale’s private space. I’ll use 100.92.0.10 as a stand-in below; replace it with your own. This is the address Aperture will use to reach your model.
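As a quick sanity check, the address printed by tailscale ip -4 should fall inside 100.64.0.0/10, the shared address block that Tailscale assigns IPv4 addresses from. The sketch below uses only Python's standard library; the helper name is_tailscale_ipv4 is my own invention, and 100.92.0.10 is just the stand-in address used in this guide.

```python
import ipaddress

# Tailscale hands out IPv4 addresses from the carrier-grade NAT block 100.64.0.0/10.
TAILSCALE_V4_RANGE = ipaddress.ip_network("100.64.0.0/10")


def is_tailscale_ipv4(address: str) -> bool:
    """Return True if `address` is an IPv4 address inside 100.64.0.0/10."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Not a parseable IP address at all.
        return False
    return ip.version == 4 and ip in TAILSCALE_V4_RANGE


print(is_tailscale_ipv4("100.92.0.10"))   # True: the stand-in address from this guide
print(is_tailscale_ipv4("192.168.1.20"))  # False: a typical home-LAN address
```

If the check fails for the address you copied, you have probably picked up a LAN or public address rather than the Tailscale one.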
Because vLLM was launched with --network host back in Part 1, it’s already listening on this interface — no change needed. If you run a firewall like ufw, allow the port on the Tailscale interface: sudo ufw allow in on tailscale0 to any port 8000.
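Before adding the gateway, it can help to confirm from a second device on the tailnet that vLLM really answers on the Tailscale address. This is a minimal sketch, assuming the usual OpenAI-style /v1/models response (a data list whose entries each carry an id); the helper names are hypothetical, and 100.92.0.10 is again the stand-in IP.

```python
import json
import urllib.request

SERVER = "http://100.92.0.10:8000"  # stand-in Tailscale IP; replace with your own


def model_ids(models_response: dict) -> list[str]:
    """Extract model IDs from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in models_response.get("data", [])]


def fetch_model_ids(server: str = SERVER, timeout: float = 5.0) -> list[str]:
    """Ask the server which models it serves (only works from inside the tailnet)."""
    with urllib.request.urlopen(f"{server}/v1/models", timeout=timeout) as resp:
        return model_ids(json.load(resp))


# Offline illustration using a hand-written response of the assumed shape.
sample = {
    "object": "list",
    "data": [{"id": "Qwen/Qwen3.6-35B-A3B-FP8", "object": "model"}],
}
print(model_ids(sample))
```

From a machine that has already joined the tailnet, calling fetch_model_ids() should return the model ID. A timeout at this point usually suggests a firewall rule or a tailnet access rule rather than a problem with vLLM itself.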
Stage 4 — Put Aperture in front
Now we add the gateway. Aperture runs as its own node on your tailnet with its own web dashboard.
4.1 Provision Aperture
Go to aperture.tailscale.com and request access / sign up. During the beta it’s free with any Tailscale account. Once provisioned, Aperture appears as a machine on your tailnet with a hostname, and serves a dashboard at:
http://<aperture-hostname>/ui
I’ll use gateway as the stand-in hostname, so my dashboard is http://gateway/ui. Yours will have its own name — you’ll find it in the Aperture sign-up flow and in your Tailscale admin console’s list of Machines.
Two different dashboards; don’t confuse them. login.tailscale.com/admin is the Tailscale admin console (manages your network: devices, users, access rules). http://<aperture-hostname>/ui is the Aperture dashboard (manages models, providers, and usage). The model configuration below lives in the Aperture dashboard.
4.2 Add your model as a provider
In the Aperture dashboard, open Configuration, and edit the raw HuJSON configuration (Tailscale’s JSON-with-comments format) to define your self-hosted model as a provider. Look for the "providers": {...} block. You may see it already has a default list of third-party providers (e.g., Anthropic, Codex). I just added the following lines inside the "providers": {...} block, right before the first of those lines:
"sovereign": { "baseurl": "http://100.92.0.10:8000", "apikey": "local-no-auth", "models": ["Qwen/Qwen3.6-35B-A3B-FP8"] },
Three details that matter, each of which can cost you an afternoon:
baseurl has no /v1. Aperture appends the incoming request path (which already includes /v1/chat/completions) to your baseurl. If you add /v1 here too, you get a broken /v1/v1/... path. Use just the host and port, remembering to replace 100.92.0.10 with your server’s Tailscale IP from Stage 3.
apikey is a throwaway. Your vLLM server doesn’t require a key, but Aperture’s dashboard test button refuses to run without one. Any non-empty string works; vLLM ignores it.
models must match the exact model ID vLLM serves (check http://localhost:8000/v1/models on the server if unsure).
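The three details above can be turned into a small pre-save check. This is only a sketch: check_provider and upstream_url are hypothetical helpers, not part of Aperture, and the plain string concatenation merely models the path-appending behaviour described above rather than Aperture's actual implementation.

```python
def upstream_url(baseurl: str, request_path: str) -> str:
    """Model the behaviour described above: the request path is appended to baseurl."""
    return baseurl + request_path


def check_provider(provider: dict, served_model_ids: list[str]) -> list[str]:
    """Return human-readable problems with a provider entry; empty means none found."""
    problems = []
    if provider.get("baseurl", "").rstrip("/").endswith("/v1"):
        problems.append("baseurl ends in /v1; requests would hit /v1/v1/...")
    if not provider.get("apikey"):
        problems.append("apikey is empty; use any non-empty placeholder")
    for model in provider.get("models", []):
        if model not in served_model_ids:
            problems.append(f"model {model!r} is not served by the backend")
    return problems


good = {
    "baseurl": "http://100.92.0.10:8000",
    "apikey": "local-no-auth",
    "models": ["Qwen/Qwen3.6-35B-A3B-FP8"],
}
bad = {**good, "baseurl": "http://100.92.0.10:8000/v1"}  # the classic mistake
served = ["Qwen/Qwen3.6-35B-A3B-FP8"]  # as listed by the server's /v1/models

print(upstream_url(good["baseurl"], "/v1/chat/completions"))
print(upstream_url(bad["baseurl"], "/v1/chat/completions"))  # note the doubled /v1
print(check_provider(good, served))
print(check_provider(bad, served))
```

The second printed URL contains /v1/v1/, which is exactly the broken path the first detail warns about.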
4.3 Grant yourself access
Aperture is deny-by-default: even as an admin, you can’t call a model until a grant says so. So, in the same JSON config, find the "grants": [...] block (it follows the "providers": {...} block), and make sure it includes a { "models": "**" } capability to access your model via Aperture:
"grants": [ { "src": ["*"], "app": { "tailscale.com/cap/aperture": [ { "role": "admin" }, { "models": "**" } ] } } ]
This grants everyone on your tailnet ("*") the admin role and access to all models ("**") — fine for a personal setup. To restrict it to just yourself, replace "*" with your Tailscale login name. Save the config (Aperture treats warnings as errors on save, so it’ll tell you if anything’s off).
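To illustrate the restricted variant, the sketch below builds a grants entry with the same shape as the one above but with a single login name in src, then prints it as plain JSON (which is also valid HuJSON). The function name aperture_grant and the login alice@example.com are hypothetical; substitute your own Tailscale login name.

```python
import json


def aperture_grant(src: list[str]) -> dict:
    """Build one grants entry with the same shape as the example above."""
    return {
        "src": src,
        "app": {
            "tailscale.com/cap/aperture": [
                {"role": "admin"},
                {"models": "**"},
            ]
        },
    }


# "alice@example.com" is a hypothetical login name; use your own.
grants = [aperture_grant(["alice@example.com"])]
print(json.dumps({"grants": grants}, indent=2))
```

Only the "grants" key and its list belong in the Aperture configuration; the outer braces are there just to make the printed output valid JSON on its own.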
4.4 Test the route from the dashboard
Open the Models tab. You should see Qwen/Qwen3.6-35B-A3B-FP8 with a Play icon beside it. Click it — a green check means Aperture successfully reached your vLLM server through the tailnet. A red X means it couldn’t (usually a network access rule; see Troubleshooting).
Stage 5 — Connect a client computer
The final piece: reach the model from another device — a laptop, in my case a MacBook. Any OpenAI-compatible tool can use it, but first the client has to be on the tailnet too.
5.1 Install and join Tailscale on the client
macOS / Windows: install the Tailscale app from tailscale.com/download (or the Mac App Store), launch it, and sign in with the same account you used on the server.
Linux: curl -fsSL https://tailscale.com/install.sh | sh then sudo tailscale up, signing in with the same account.
The “same account” part is essential — it’s what puts the client on the same tailnet as the server and the gateway.
5.2 The end-to-end test
From a terminal on the client, first confirm it can see the gateway and resolve its name:
curl http://gateway/v1/models
If that returns JSON listing your model, MagicDNS is resolving and the tailnet path works. Then the moment of truth — a real request, from your laptop, through the gateway, to the model running on the GB10 box:
curl http://gateway/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"Qwen/Qwen3.6-35B-A3B-FP8","messages":[{"role":"user","content":"hello from my laptop"}]}'
No API key in the request — your Tailscale identity is the authentication. When the reply comes back, it has traveled: laptop → tailnet → Aperture → vLLM on the server → back. Check the Aperture dashboard’s usage log and you’ll see that request recorded with your identity and a token count.
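The same request can be sent from a script. Below is a minimal sketch using only Python's standard library; it assumes the stand-in hostname gateway and the usual OpenAI-style response shape (choices[0].message.content), and the helper names build_chat_request and ask are my own.

```python
import json
import urllib.request

GATEWAY = "http://gateway"  # stand-in hostname; use your Aperture node's name
MODEL = "Qwen/Qwen3.6-35B-A3B-FP8"


def build_chat_request(
    prompt: str, model: str = MODEL, gateway: str = GATEWAY
) -> urllib.request.Request:
    """Build the same POST that the curl command above sends."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{gateway}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # note: no API key header
        method="POST",
    )


def ask(prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the reply text (only works from inside the tailnet)."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]


# Building the request needs no network access, so it is safe to inspect anywhere.
request = build_chat_request("hello from my laptop")
print(request.get_method(), request.full_url)
```

On a device that has joined the tailnet, ask("hello from my laptop") should return the model's reply as a string.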
This is the whole setup working as required: a very capable private model, reachable from anywhere you and your devices are, authenticated by identity, logged centrally, and never exposed to the public internet.
What’s next
Congratulations! You now have a secure, private gateway to your own model. A natural next step is to point real applications at it — chat front-ends, notebook tools, coding assistants — which all reduce to the same three settings:
| Setting | Value |
| --- | --- |
| Base URL | http://gateway/v1 |
| API key | any non-empty placeholder |
| Model | Qwen/Qwen3.6-35B-A3B-FP8 |
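As one example of plugging in those three settings, here is a sketch using the official openai Python package (version 1 or later, installed separately with pip install openai). That package choice is my assumption, not something this guide depends on; any OpenAI-compatible client should accept the same three values. The hostname gateway is the stand-in from above.

```python
SETTINGS = {
    "base_url": "http://gateway/v1",      # stand-in hostname from this guide
    "api_key": "placeholder",             # any non-empty string
    "model": "Qwen/Qwen3.6-35B-A3B-FP8",
}


def chat(prompt: str, settings: dict = SETTINGS) -> str:
    """Send one user message through the gateway and return the reply text."""
    # Third-party package, imported lazily so this file loads without it.
    from openai import OpenAI

    client = OpenAI(base_url=settings["base_url"], api_key=settings["api_key"])
    response = client.chat.completions.create(
        model=settings["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content or ""
```

From a device on the tailnet, chat("hello from my laptop") should return the model's reply. Most graphical tools expose the same three fields in their settings screens under similar names.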
I’m planning to explore wiring up specific apps and different models, along with the per-app and per-model quirks that arise, in future posts. In the meantime, nothing is stopping you from trying out your sovereign AI setup. Drop me a line to share how you are enjoying it, what you are using it for, and any suggestions for other sovereign AI projects!
Troubleshooting recap (Part 2)
| Symptom | Cause | Fix |
| --- | --- | --- |
| Play-icon test shows a red X | A tailnet access rule (ACL) blocks the Aperture node from reaching the server’s port | Allow the Aperture node → server:8000 in your Tailscale admin console policy |
| /v1/v1/... errors or 404s through the gateway | /v1 mistakenly included in the provider baseurl | Use host:port only, no /v1 |
| “No API key configured” / Play icon greyed out | Self-hosted provider missing an apikey | Add any non-empty placeholder; vLLM ignores it |
| gateway won’t resolve on the client | MagicDNS off, or client on a different tailnet | Enable MagicDNS; confirm the client signed in with the same account. Fallback: use the Aperture node’s 100.x.x.x IP |
| 403 / access denied through the gateway | No grant covers your identity | Add/confirm a grants entry covering your user and the model |
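The client-side rows of this recap can be sketched as a tiny triage helper. It is illustrative only: the function names are hypothetical, the status-code mapping simply mirrors the symptoms listed above, and anything else falls through to a generic hint.

```python
import socket
from typing import Optional


def resolves(hostname: str) -> bool:
    """Return True if this machine can resolve the gateway's name."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def triage(name_resolves: bool, http_status: Optional[int]) -> str:
    """Map what the client observes to the matching fix from the recap above."""
    if not name_resolves:
        return "Enable MagicDNS and confirm the client is on the same tailnet"
    if http_status == 403:
        return "Add or confirm a grants entry covering your user and the model"
    if http_status == 404:
        return "Check that the provider baseurl has no /v1"
    if http_status == 200:
        return "No problem found"
    return "Not covered by the recap; check the Aperture dashboard and the server"


# Example: the name resolves, but the gateway answers 404.
print(triage(True, 404))
```

On a real client you would pass resolves("gateway") as the first argument and the HTTP status from a request to the gateway as the second.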