2026-06-26 12:50 UTC · Updated: 2026-06-26 13:17 UTC

Show HN: Jargo – a Golang port of Pipecat for conversational-AI apps

Jargo is an early-stage Go port of Pipecat: a real-time voice agent framework with WebRTC audio I/O, a streaming STT → LLM → TTS pipeline, turn-taking, and barge-in. It aims to be a self-hosted alternative to Python-based solutions that deploys as a single binary.

Source: Hacker News AI · Author: fallais

Article intelligence

Engineers · Intermediate

Key points

Jargo brings Pipecat's architecture to Go, enabling static binary deployment with low memory footprint and fast startup.
It uses plain WebRTC via Pion, avoiding proprietary transports like Daily, and supports RTVI data channels for interoperability.
Features include streaming voice pipeline, local VAD-based turn-taking, pluggable STT/LLM/TTS services, and concurrent processing.
Currently in early development; requires cgo and native libraries like libsoxr and optionally libopus for better speech quality.

jargo builds real-time voice agents in Go: audio in over WebRTC, a streaming transcription → reasoning → speech pipeline with turn-taking and barge-in, and audio back out — over RTVI so existing clients interoperate.

Status: early work in progress. APIs are unstable and will change.

Why?

Pipecat is great, and jargo is a port of it — the architecture and many design decisions are Pipecat's.

Python might not be the way

This port exists for one reason: I'd rather not run a voice agent on Python.

Python is the right tool when you need the AI/data-science ecosystem. A real-time voice server doesn't: the models run as services or as ONNX, and what's left is plumbing — audio framing, WebRTC, concurrency, and shipping a binary. For that, Go is a better fit: one static binary to deploy, low and predictable memory, fast startup, and real concurrency for many simultaneous sessions without a GIL. The heavy numerics stay where they belong (the ONNX Runtime, the remote services), so giving up Python costs little here. See the benchmarks for the honest performance picture.

No Daily, no lock-in

jargo stays on plain, standard WebRTC via Pion — no Daily, no hosted transport, no proprietary SDK or cloud to sign up for. You ship one binary, the browser connects with vanilla WebRTC, and RTVI rides the data channel. Keeping the transport open and self-hosted is a deliberate goal, not an afterthought.
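To make "RTVI rides the data channel" concrete, here is a minimal sketch of the kind of JSON envelope an RTVI client and server exchange. It assumes the envelope shape from the published RTVI format (`label`, `type`, `id`, `data`, with the label `rtvi-ai`). The `Envelope` type, the `Encode` and `Decode` helpers, and the example payload are hypothetical names for illustration and are not taken from jargo's rtvi package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rtviLabel is the label RTVI messages are assumed to carry.
const rtviLabel = "rtvi-ai"

// Envelope is a hypothetical Go view of one RTVI message.
// Field names are an assumption based on the public RTVI format.
type Envelope struct {
	Label string          `json:"label"`
	Type  string          `json:"type"`
	ID    string          `json:"id,omitempty"`
	Data  json.RawMessage `json:"data,omitempty"`
}

// Encode wraps a payload in an envelope, ready to be written to a data channel.
func Encode(msgType, id string, payload any) ([]byte, error) {
	data, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	return json.Marshal(Envelope{Label: rtviLabel, Type: msgType, ID: id, Data: data})
}

// Decode parses a data-channel message and rejects anything with a foreign label.
func Decode(raw []byte) (Envelope, error) {
	var env Envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return Envelope{}, err
	}
	if env.Label != rtviLabel {
		return Envelope{}, fmt.Errorf("unexpected label %q", env.Label)
	}
	return env, nil
}

func main() {
	// The payload here is illustrative only.
	raw, err := Encode("client-ready", "1", map[string]string{"version": "1.0.0"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))

	env, err := Decode(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(env.Type)
}
```

Keeping `Data` as `json.RawMessage` lets a server route on `Type` first and decode the payload only once it knows which message it is handling.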

Features

WebRTC, pure Go (Pion) — audio in and out of the browser.

Opus: not pure Go yet; waiting for pion/opus to be ready.

Streaming voice pipeline: STT → LLM → TTS, with prompt caching.

Turn-taking & barge-in: Silero VAD + Smart Turn v3, local ONNX.

RTVI data channel — works with existing RTVI clients.

Pluggable services: swap any STT/LLM/TTS behind a small interface.

Concurrent by design: independent processors; interruptions are frames.
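The last two features can be pictured with a small sketch: each stage sits behind a tiny interface and runs in its own goroutine, and an interruption travels down the same channel as any other frame. This is not jargo's actual API. `Frame`, `Kind`, `Service`, `runProcessor`, and `Run` are hypothetical names, and the two toy services stand in for real STT/LLM/TTS stages.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Kind distinguishes payload frames from control frames.
type Kind int

const (
	KindText Kind = iota
	KindInterrupt
)

// Frame is a hypothetical unit flowing between processors.
type Frame struct {
	Kind Kind
	Text string
}

// Service is a hypothetical "small interface" that a stage hides behind.
type Service interface {
	Process(text string) string
}

type upper struct{}

func (upper) Process(s string) string { return strings.ToUpper(s) }

type exclaim struct{}

func (exclaim) Process(s string) string { return s + "!" }

// runProcessor runs one service in its own goroutine. Text frames are
// transformed; control frames are forwarded untouched, in order.
func runProcessor(svc Service, in <-chan Frame, wg *sync.WaitGroup) <-chan Frame {
	out := make(chan Frame)
	wg.Add(1)
	go func() {
		defer wg.Done()
		defer close(out)
		for f := range in {
			if f.Kind == KindText {
				f.Text = svc.Process(f.Text)
			}
			out <- f
		}
	}()
	return out
}

// Run chains the services and plays the role of the output transport:
// text queued before an interrupt frame is dropped, mimicking barge-in.
func Run(frames []Frame, services ...Service) []string {
	var wg sync.WaitGroup
	src := make(chan Frame)
	var ch <-chan Frame = src
	for _, s := range services {
		ch = runProcessor(s, ch, &wg)
	}
	go func() {
		defer close(src)
		for _, f := range frames {
			src <- f
		}
	}()
	var queued []string
	for f := range ch {
		if f.Kind == KindInterrupt {
			queued = nil // drop output queued before the interrupt
			continue
		}
		queued = append(queued, f.Text)
	}
	wg.Wait()
	return queued
}

func main() {
	frames := []Frame{
		{Kind: KindText, Text: "hello"},
		{Kind: KindInterrupt},
		{Kind: KindText, Text: "hi again"},
	}
	fmt.Println(Run(frames, upper{}, exclaim{}))
}
```

Because the interrupt is an ordinary value on the channel, no stage needs shared state or locks to learn about it. Swapping a stage means passing a different `Service` to `Run`.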

Dependencies

jargo uses cgo (CGO_ENABLED=0 is not supported) and a few native libraries:

libsoxr — audio resampling, linked at build time (libsoxr-dev).

libopus — optional C Opus encoder, selected with -tags libopus (libopus-dev); the default build ships a pure-Go encoder, but libopus sounds noticeably better on speech.

ONNX Runtime — loaded at run time for VAD + end-of-turn detection.

The container image bundles all of them.
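For readers new to VAD-based turn-taking, the sketch below shows the general idea once a model has produced a speech probability per audio frame: a little hysteresis turns the noisy probabilities into clean speech-start and speech-end events. It does not call ONNX Runtime, Silero VAD, or Smart Turn, and it is not jargo's implementation. The `Detector` type, its fields, and the threshold values are assumptions chosen for illustration.

```go
package main

import "fmt"

// Event marks a change in whether the user is speaking.
type Event struct {
	Frame int
	Kind  string // "speech-start" or "speech-end"
}

// Detector is a hypothetical hysteresis filter over per-frame speech
// probabilities, such as those a VAD model would emit.
type Detector struct {
	Threshold   float64 // probability at or above which a frame counts as speech
	StartFrames int     // consecutive speech frames needed to open a turn
	StopFrames  int     // consecutive silent frames needed to close it

	speaking bool
	run      int // consecutive frames that disagree with the current state
}

// Push feeds one frame and reports an event when the state flips.
func (d *Detector) Push(i int, p float64) (Event, bool) {
	voiced := p >= d.Threshold
	if voiced == d.speaking {
		d.run = 0 // frame agrees with the current state
		return Event{}, false
	}
	d.run++
	need, kind := d.StartFrames, "speech-start"
	if d.speaking {
		need, kind = d.StopFrames, "speech-end"
	}
	if d.run < need {
		return Event{}, false
	}
	d.speaking, d.run = !d.speaking, 0
	return Event{Frame: i, Kind: kind}, true
}

// Detect runs the detector over a whole sequence of probabilities.
func Detect(probs []float64, d *Detector) []Event {
	var events []Event
	for i, p := range probs {
		if ev, ok := d.Push(i, p); ok {
			events = append(events, ev)
		}
	}
	return events
}

func main() {
	d := &Detector{Threshold: 0.5, StartFrames: 2, StopFrames: 3}
	probs := []float64{0.1, 0.9, 0.8, 0.9, 0.2, 0.9, 0.1, 0.1, 0.1}
	for _, ev := range Detect(probs, d) {
		fmt.Println(ev.Frame, ev.Kind)
	}
}
```

In a voice agent, a speech-start event that arrives while the bot is still talking is what would trigger barge-in. The single dip at frame 4 in the example does not end the turn because `StopFrames` requires three silent frames in a row.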

Usage

go get github.com/gojargo/jargo

Locally — install the native deps, then build with cgo:

Debian/Ubuntu: apt-get install -y libsoxr-dev libopus-dev

CGO_ENABLED=1 go run ./examples/echo                    # open http://localhost:8080
CGO_ENABLED=1 go run -tags libopus ./examples/voicebot  # libopus speech encoder

With Docker — the image bundles every native dependency, so there's no host setup:

docker build -t jargo-voicebot .
docker run --rm -p 8080:8080 \
  -e DEEPGRAM_API_KEY=… -e ANTHROPIC_API_KEY=… -e ELEVENLABS_API_KEY=… \
  jargo-voicebot

See the Quickstart for the full setup.

Examples

Two runnable bots live in examples/: an echo bot (no API keys) and a full voice bot (STT → LLM → TTS). The fastest way to try either — locally or with Docker — is the Quickstart.

go run ./examples/echo # then open http://localhost:8080

Documentation

See docs/index.md for the full documentation.

License & attribution

jargo is a Go port of Pipecat, distributed under the same BSD 2-Clause License. The upstream copyright — Copyright (c) 2024–2026, Daily — is preserved verbatim in LICENSE; see NOTICE for details. jargo is an independent project, not affiliated with or endorsed by Daily.

About

A WebRTC-native, audio-first conversational-AI framework for Go.
