2026-06-22 19:58 UTC · In-site rewrite · 6 min read · Updated: 2026-06-22 20:05 UTC

Own Private AI, Part 2: Secure Access from Anywhere with Tailscale Aperture

A guide to securely accessing a self-hosted LLM from any device using Tailscale's private network and Aperture AI gateway, without exposing the model to the public internet.

Source: Hacker News AI · Author: anactofgod

10io — Actual Intelligence | Fractional CAIO / Chief AI Officer

Jun 20, 2026 · 8 min read

Your Own Private AI, Part 2: Secure Access from Anywhere with Tailscale Aperture

A beginner-friendly walkthrough for securely accessing your self-hosted LLM from anywhere — over a private Tailscale network fronted by the Aperture AI gateway, never exposed to the public internet.

Sovereign AI

Local LLM

NVIDIA DGX Spark

Tailscale

Aperture

AI Infrastructure

Self-Hosting

In Part 1, we worked through Stages 1 and 2 to get a Qwen3.6-35B-A3B-FP8 Mixture of Experts (MoE) model serving an OpenAI-compatible API on a “SparkStation”, a GB10 NVIDIA DGX Spark-class machine. However, that model is only accessible on the machine itself, at localhost:8000. Here in Part 2, we run through Stages 3 to 5 to make the model securely reachable from any other device you choose, without ever exposing it to the public internet. These instructions should still be helpful if you are serving a different local model, or serving it from something other than a SparkStation.

The problem, and the plan

I am constantly looking for a better way to operate and access sovereign AI solutions. I want self-hosted models running on private infrastructure that are as flexibly accessible as the solutions from OpenAI or Anthropic. But a model answering at localhost:8000 is only usable by the machine it runs on. The obvious way to make it accessible from anywhere is to forward a port through my router. However, this is also dangerous: it puts an unauthenticated AI endpoint on the open internet for anyone to find and abuse.

Instead, my current approach has two layers:

Tailscale — a private mesh network (“tailnet”) that encrypts direct connections between approved (“allowlisted”) devices, as if they were on the same LAN, no matter where they are physically. Nothing is exposed publicly; only devices that I’ve explicitly added can reach each other. Tailscale offers a very generous free tier, which I’ve been using for several months and have not yet exceeded. Your mileage may vary.

Aperture (by Tailscale) — an “AI gateway” that sits in front of one or more models on the tailnet. It authenticates every request by the caller’s Tailscale identity, so there are no API keys to distribute, and it logs all usage centrally.

If you follow this guide, your locally hosted models will be reachable only over your private network, and every request through the gateway will be identified and recorded. That’s genuinely private, secure, managed AI.

Concepts in one breath. A tailnet is your private device network. MagicDNS is Tailscale’s feature that lets you address devices by name (e.g. gateway) instead of IP. A provider in Aperture is an upstream model. A grant is a rule saying who may use which models. You’ll meet each below.

Stage 3 — Put the server on your private network

First, get the GB10 machine (or whatever machine you are using as the “AI server”) onto your tailnet.

3.1 Install and join Tailscale on the server

On the server, install Tailscale and bring it up:

curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

tailscale up prints a URL — open it in any browser and sign in (Google, GitHub, Microsoft, or email all work). That authenticates this machine and adds it to your tailnet. The account you sign in with defines your tailnet, so remember which one you use — every device must join the same account.

3.2 Note the server’s Tailscale IP

tailscale ip -4

You’ll get an address that starts with 100. (Tailscale assigns node addresses from the 100.64.0.0/10 range, its private space). I’ll use 100.92.0.10 as a stand-in below; replace it with your own. This is the address Aperture will use to reach your model.

Because vLLM was launched with --network host back in Part 1, it’s already listening on this interface — no change needed. If you run a firewall like ufw, allow the port on the Tailscale interface: sudo ufw allow in on tailscale0 to any port 8000.
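As a quick sanity check on the address from step 3.2, here is a minimal Python sketch that tests whether a string falls inside 100.64.0.0/10, the block Tailscale assigns IPv4 node addresses from. The function name is my own and purely illustrative; it is not part of any Tailscale tooling.

```python
import ipaddress

# Tailscale hands out IPv4 node addresses from the CGNAT block 100.64.0.0/10.
TAILSCALE_V4_RANGE = ipaddress.ip_network("100.64.0.0/10")


def is_tailscale_ipv4(address: str) -> bool:
    """Return True if `address` parses as an IPv4 address inside 100.64.0.0/10."""
    try:
        ip = ipaddress.ip_address(address.strip())
    except ValueError:
        # Not an IP address at all (typo, hostname, empty string, ...).
        return False
    return ip.version == 4 and ip in TAILSCALE_V4_RANGE


if __name__ == "__main__":
    # 100.92.0.10 is the article's stand-in address; the others are counterexamples.
    for candidate in ("100.92.0.10", "192.168.1.20", "100.128.0.1"):
        print(candidate, is_tailscale_ipv4(candidate))
```

If the value you are about to paste into the Aperture config fails this check, it is probably a LAN address rather than the output of tailscale ip -4.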

Stage 4 — Put Aperture in front

Now we add the gateway. Aperture runs as its own node on your tailnet with its own web dashboard.

4.1 Provision Aperture

Go to aperture.tailscale.com and request access / sign up. During the beta it’s free with any Tailscale account. Once provisioned, Aperture appears as a machine on your tailnet with a hostname, and serves a dashboard at:

http://<aperture-hostname>/ui

I’ll use gateway as the stand-in hostname, so my dashboard is http://gateway/ui. Yours will have its own name — you’ll find it in the Aperture sign-up flow and in your Tailscale admin console’s list of Machines.

Two different dashboards: don’t confuse them. login.tailscale.com/admin is the Tailscale admin console (manages your network: devices, users, access rules). http://<aperture-hostname>/ui is the Aperture dashboard (manages models, providers, and usage). The model configuration below lives in the Aperture dashboard.

4.2 Add your model as a provider

In the Aperture dashboard, open Configuration, and edit the raw HuJSON configuration (Tailscale’s JSON-with-comments format) to define your self-hosted model as a provider. Look for the "providers": {...} block. You may see it already has a default list of third-party providers (e.g., Anthropic, Codex). I just added the following lines inside the "providers": {...} block, right before the first of those lines:

"sovereign": {
  "baseurl": "http://100.92.0.10:8000",
  "apikey": "local-no-auth",
  "models": ["Qwen/Qwen3.6-35B-A3B-FP8"]
},

Three details that matter, each of which can cost you an afternoon:

baseurl has no /v1. Aperture appends the incoming request path (which already includes /v1/chat/completions) to your baseurl. If you add /v1 here too, you get a broken /v1/v1/... path. Use just the host and port, remembering to replace 100.92.0.10 with your server’s Tailscale IP from Stage 3.

apikey is a throwaway. Your vLLM server doesn’t require a key, but Aperture’s dashboard test button refuses to run without one. Any non-empty string works; vLLM ignores it.

models must match the exact model ID vLLM serves (check http://localhost:8000/v1/models on the server if unsure).
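The three details above can be sketched as a small pre-flight check. This is a minimal Python illustration, assuming the gateway simply concatenates baseurl and the incoming path as described; upstream_url and provider_problems are hypothetical helper names, not part of Aperture.

```python
from urllib.parse import urlsplit


def upstream_url(baseurl: str, request_path: str) -> str:
    """Join baseurl and the incoming request path by plain concatenation."""
    return baseurl.rstrip("/") + request_path


def provider_problems(provider: dict) -> list[str]:
    """Return human-readable problems found in one provider entry."""
    problems = []
    parts = urlsplit(provider.get("baseurl", ""))
    if parts.scheme not in ("http", "https") or not parts.netloc:
        problems.append("baseurl should look like http://host:port")
    if parts.path.rstrip("/") != "":
        # Any path here (such as /v1) would be doubled by the incoming path.
        problems.append("baseurl should not include a path such as /v1")
    if not provider.get("apikey"):
        problems.append("apikey must be a non-empty placeholder")
    if not provider.get("models"):
        problems.append("models must list at least one model ID")
    return problems


if __name__ == "__main__":
    entry = {
        "baseurl": "http://100.92.0.10:8000/v1",  # deliberately wrong
        "apikey": "local-no-auth",
        "models": ["Qwen/Qwen3.6-35B-A3B-FP8"],
    }
    print(provider_problems(entry))
    print(upstream_url(entry["baseurl"], "/v1/chat/completions"))
```

Run as written, the deliberately wrong entry reports the stray path and prints the doubled /v1/v1/chat/completions URL, which is the failure mode described in the first bullet.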

4.3 Grant yourself access

Aperture is deny-by-default: even as an admin, you can’t call a model until a grant says so. So, in the same JSON config, find the "grants": [...] block (it follows the "providers": {...} block), and make sure it includes a { "models": "**" } capability to access your model via Aperture:

"grants": [
  {
    "src": ["*"],
    "app": {
      "tailscale.com/cap/aperture": [
        { "role": "admin" },
        { "models": "**" }
      ]
    }
  }
]

This grants everyone on your tailnet ("*") the admin role and access to all models ("**") — fine for a personal setup. To restrict it to just yourself, replace "*" with your Tailscale login name. Save the config (Aperture treats warnings as errors on save, so it’ll tell you if anything’s off).

4.4 Test the route from the dashboard

Open the Models tab. You should see Qwen/Qwen3.6-35B-A3B-FP8 with a Play icon beside it. Click it — a green check means Aperture successfully reached your vLLM server through the tailnet. A red X means it couldn’t (usually a network access rule; see Troubleshooting).
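If the Play test fails, one way to narrow things down is a plain TCP probe of the model port from another device on the tailnet. The sketch below is not an Aperture feature, and can_connect is an illustrative name. It only shows whether the device you run it on can open a connection to the server, which may differ from what the Aperture node is allowed to do under your access rules.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call, using the article's stand-in Tailscale IP for the server:
#   can_connect("100.92.0.10", 8000)
```

A False result from a device that should have access points at the network layer (firewall or tailnet policy) rather than at the Aperture provider config.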

Stage 5 — Connect a client computer

The final piece: reach the model from another device — a laptop, in my case a MacBook. Any OpenAI-compatible tool can use it, but first the client has to be on the tailnet too.

5.1 Install and join Tailscale on the client

macOS / Windows: install the Tailscale app from tailscale.com/download (or the Mac App Store), launch it, and sign in with the same account you used on the server.

Linux: curl -fsSL https://tailscale.com/install.sh | sh then sudo tailscale up, signing in with the same account.

The “same account” part is essential — it’s what puts the client on the same tailnet as the server and the gateway.

5.2 The end-to-end test

From a terminal on the client, first confirm it can see the gateway and resolve its name:

curl http://gateway/v1/models

If that returns JSON listing your model, MagicDNS is resolving and the tailnet path works. Then the moment of truth — a real request, from your laptop, through the gateway, to the model running on the GB10 box:

curl http://gateway/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"Qwen/Qwen3.6-35B-A3B-FP8","messages":[{"role":"user","content":"hello from my laptop"}]}'

No API key in the request — your Tailscale identity is the authentication. When the reply comes back, it has traveled: laptop → tailnet → Aperture → vLLM on the server → back. Check the Aperture dashboard’s usage log and you’ll see that request recorded with your identity and a token count.
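The same request can be made from a script. Here is a minimal sketch using only the Python standard library, assuming the gateway hostname resolves as gateway (the stand-in used throughout) and that the response follows the usual OpenAI chat-completions shape. build_chat_request and ask are names I made up for illustration.

```python
import json
import urllib.request

GATEWAY_BASE_URL = "http://gateway/v1"  # "gateway" is the stand-in hostname
MODEL_ID = "Qwen/Qwen3.6-35B-A3B-FP8"


def build_chat_request(prompt: str,
                       base_url: str = GATEWAY_BASE_URL,
                       model: str = MODEL_ID) -> urllib.request.Request:
    """Build the POST request; note that no Authorization header is set."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt through the gateway and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as response:
        body = json.load(response)
    # Assumes the standard OpenAI-compatible response layout.
    return body["choices"][0]["message"]["content"]


# Example call from a device on the tailnet:
#   print(ask("hello from my laptop"))
```

Splitting request construction from sending keeps the part that matters here easy to inspect: the request carries a model name and messages, and nothing that identifies the caller.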

This is the whole setup working as required: a very capable private model, reachable from anywhere you and your devices are, authenticated by identity, logged centrally, and never exposed to the public internet.

What’s next

Congratulations! You now have a secure, private gateway to your own model. A natural next step is to point real applications at it — chat front-ends, notebook tools, coding assistants — which all reduce to the same three settings:

| Setting | Value |
| --- | --- |
| Base URL | http://gateway/v1 |
| API key | any non-empty placeholder |
| Model | Qwen/Qwen3.6-35B-A3B-FP8 |
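One way to carry those three settings around is as a small settings object that can also export environment variables. The sketch below is an assumption-laden illustration: GatewaySettings and apply_to_environment are hypothetical names, and OPENAI_BASE_URL and OPENAI_API_KEY are the variables read by the official OpenAI Python SDK. Other tools may use different names, and most want the model set in their own configuration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewaySettings:
    """The three values every OpenAI-compatible client needs for this setup."""
    base_url: str = "http://gateway/v1"       # "gateway" is the stand-in hostname
    api_key: str = "placeholder"              # any non-empty string
    model: str = "Qwen/Qwen3.6-35B-A3B-FP8"

    def as_env(self) -> dict[str, str]:
        # Variable names follow the OpenAI Python SDK; other tools may differ.
        return {"OPENAI_BASE_URL": self.base_url, "OPENAI_API_KEY": self.api_key}


def apply_to_environment(settings: GatewaySettings, environ=os.environ) -> None:
    """Write the settings into `environ` (the real process environment by default)."""
    environ.update(settings.as_env())


if __name__ == "__main__":
    apply_to_environment(GatewaySettings())
    print(os.environ["OPENAI_BASE_URL"])
```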

I’m planning to explore wiring up specific apps and different models, along with the per-app and per-model quirks that arise, in future posts. In the meantime, nothing is stopping you from trying out your sovereign AI setup. Drop me a line to share how you are enjoying it, what you are using it for, and suggestions for other sovereign AI projects!

Troubleshooting recap (Part 2)

| Symptom | Cause | Fix |
| --- | --- | --- |
| Play-icon test shows a red X | A tailnet access rule (ACL) blocks the Aperture node from reaching the server’s port | Allow the Aperture node → server:8000 in your Tailscale admin console policy |
| /v1/v1/... errors or 404s through the gateway | /v1 mistakenly included in the provider baseurl | Use host:port only, no /v1 |
| “No API key configured” / Play icon greyed out | Self-hosted provider missing an apikey | Add any non-empty placeholder; vLLM ignores it |
| gateway won’t resolve on the client | MagicDNS off, or client on a different tailnet | Enable MagicDNS; confirm the client signed in with the same account. Fallback: use the Aperture node’s 100.x.x.x IP |
| 403 / access denied through the gateway | No grant covers your identity | Add/confirm a grants entry covering your user and the model |
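For the client-side rows of this recap, a small probe can turn what you observe into a hint. This is a rough sketch that only encodes the table above. The status-to-cause mapping is a heuristic rather than documented Aperture behavior, and probe and likely_cause are illustrative names.

```python
import urllib.error
import urllib.request
from typing import Optional

# Hints taken from the troubleshooting recap; treat them as starting points.
HINTS = {
    200: "gateway reachable and request allowed",
    403: "no grant covers your identity: check the grants block",
    404: "possible /v1 in the provider baseurl: use host:port only",
}


def likely_cause(status: Optional[int]) -> str:
    """Map an HTTP status (or None for no connection) to a likely cause."""
    if status is None:
        return "could not connect: check MagicDNS and that the client is on the same tailnet"
    return HINTS.get(status, f"unexpected HTTP status {status}")


def probe(url: str = "http://gateway/v1/models", timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status for `url`, or None if no connection could be made."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the gateway answered, but with an error status
    except OSError:
        return None      # DNS failure, refused connection, or timeout


# Example call from the client machine:
#   print(likely_cause(probe()))
```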
