2026-04-27原文

How to build scalable web apps with OpenAI's Privacy Filter

This article demonstrates building three scalable web applications using OpenAI's newly released open-source Privacy Filter: Document Privacy Explorer, Image Anonymizer, and SmartRedact Paste. Each app showcases different capabilities of the model and leverages gradio.Server for efficient backend processing and custom frontends.

Article intelligence

EngineersAdvanced

Key points

OpenAI released Privacy Filter, an open-source PII detector supporting 128k context and eight categories.
Three example apps: Document Privacy Explorer, Image Anonymizer, SmartRedact Paste.
All apps built on gradio.Server, combining custom HTML/JS frontends with Gradio's queuing and ZeroGPU allocation.
Model: 1.5B parameters (50M active), state-of-the-art on PII-Masking-300k benchmark.

Why it matters

This matters because openAI released Privacy Filter, an open-source PII detector supporting 128k context and eight categories.

Technical impact

May affect model selection, inference cost, product capability, and evaluation benchmarks.

How to build scalable web apps with OpenAI's Privacy Filter

Hugging Face

Models

Datasets

Spaces

Buckets new

Docs

Enterprise

Pricing

Back to Articles

How to build scalable web apps with OpenAI's Privacy Filter

Published April 27, 2026

Update on GitHub

Upvote

yuvraj sharma

ysharma

Freddy Boulton

freddyaboulton

Abubakar Abid

abidlabs

The model

Document Privacy Explorer

Image Anonymizer

SmartRedact Paste

gradio.Server provides">What gradio.Server provides

Try them

Model call → queued endpoint. Hit from the browser via

client.predict("/create_paste", { text, ttl }).

@server.api(name="create_paste") def create_paste(text: str, ttl: str = "never") -> dict: source_text, spans = run_privacy_filter(text) redacted = redact(source_text, spans) # placeholders pid, reveal_token = secrets.token_urlsafe(6), secrets.token_urlsafe(22) PASTES[pid] = Paste(pid, reveal_token, source_text, redacted, spans, expires_at=_ttl(ttl)) # see app.py return { "view_path": f"/view/{pid}", "reveal_path": f"/view/{pid}?token={reveal_token}", }

View page → plain FastAPI GET. No model, no queue needed, and we

actually want the bespoke URL shape `/view/{pid}?token=...` that a

queued endpoint couldn't give us.

@server.get("/view/{pid}", response_class=HTMLResponse) async def view_paste(pid: str, token: str | None = None): p = _store_get(pid) # see app.py for store if p is None: return HTMLResponse(_not_found(), status_code=404) revealed = bool(token) and secrets.compare_digest(token, p.reveal_token) return HTMLResponse(_render_view(p, revealed))

A daemon thread evicts expired pastes every 30 seconds. The whole service, including storage, is about 200 lines of application code because everything lives in one process.

What gradio.Server provides

The split across all three apps is the same — anything that touches the model goes through @server.api, everything else stays on plain FastAPI routes:

App Queued compute (@server.api) Plain FastAPI routes

Document Privacy Explorer analyze_document — extract, detect, stats GET / serves the custom reader view

Image Anonymizer anonymize_screenshot — OCR, detect, spans → pixel boxes GET / + GET /examples/* serve the canvas UI and preloaded examples

SmartRedact Paste create_paste — detect, redact, mint IDs GET / compose page, GET /view/{pid}?token=... public + token-gated views, GET /api/paste/{pid} JSON lookup

@server.api gives you Gradio's queue (serialized requests, correct @spaces.GPU composition on ZeroGPU, progress events) and it's what the browser hits through @gradio/client. The same endpoint is also what gradio_client users hit from Python — one function, two SDKs, no duplicated code. Plain @server.get/@server.post are reserved for the static surfaces: HTML pages, file lookups, cheap dict reads. That's the rule of thumb from the gradio.Server intro post, and it's what makes these three apps feel consistent even though their UIs are very different.

Try them

Document Privacy Explorer

Image Anonymizer

SmartRedact Paste

Drop in a resume, a screenshot of a Slack thread, a log line with a token in it. The fun part is seeing what Privacy Filter catches (and occasionally misses) on text you actually care about.

How to build scalable web apps with OpenAI's Privacy Filter

Article intelligence

Key points

Why it matters

Technical impact

Model call → queued endpoint. Hit from the browser via

client.predict("/create_paste", { text, ttl }).

View page → plain FastAPI GET. No model, no queue needed, and we

actually want the bespoke URL shape /view/{pid}?token=... that a

queued endpoint couldn't give us.

actually want the bespoke URL shape `/view/{pid}?token=...` that a