AI audio translator with speech-to-text, LLM translation, and text-to-speech
A Python-based open-source AI audio translator that uses Telnyx APIs for speech-to-text, LLM-based translation, and text-to-speech, allowing users to upload audio and receive translated audio with aligned transcript.
Uh oh!
There was an error while loading. Please reload this page.
Notifications You must be signed in to change notification settings
Fork 3
Star 140
Copy path
More options
More options
More options
More options
Latest commit
History
History
History
Copy path
Folders and files
NameName
Last commit message
Last commit date
parent directory
..
.env.example
.env.example
API.md
API.md
GUIDE.md
GUIDE.md
README.md
README.md
app.py
app.py
requirements.txt
requirements.txt
Media Streaming
integrations
channel
voice
api
AI Content Translator
Upload any audio (podcast, meeting, lecture), STT transcribes in source language, AI Inference translates, TTS generates audio in target language. Returns translated audio + aligned transcript.
Telnyx API Endpoints Used
STT Transcribe: POST /v2/ai/transcribe -- ref
AI Inference: POST /v2/ai/chat/completions -- ref
TTS Generate: POST /v2/ai/generate -- ref
Architecture
API Request │ ▼ ┌──────────────────┐ │ Answer + Greet │ ── TTS welcome message └────────┬─────────┘ │ ▼ ┌──────────────────┐ │ Gather Speech │ ── STT transcription └────────┬─────────┘ │ ▼ ┌──────────────────┐ │ AI Inference │ │ • Translation │ └────────┬─────────┘ │ ◄──── conversation loop │ ▼ JSON response
How It Works
Sends conversation to Telnyx AI Inference for processing
Converts response to speech via Telnyx TTS
Why Telnyx
Telnyx is an AI Communications Infrastructure platform - voice, messaging, SIP, AI, and IoT on one private, global network.
Co-located inference - LLM runs on the same network as voice traffic. Sub-200ms round trips.
Environment Variables
Copy .env.example to .env and fill in:
Variable Type Example Required Description Where to get it
TELNYX_API_KEY string KEY0123456789ABCDEF yes Telnyx API v2 key Portal
AI_MODEL string moonshotai/Kimi-K2.6 no AI Inference model Docs
TTS_MODEL string telnyx/tts no TTS model name Docs
STT_MODEL string telnyx/asr no STT model name Docs
Setup
git clone https://github.com/team-telnyx/telnyx-code-examples.git cd telnyx-code-examples/ai-content-translator-python cp .env.example .env pip install -r requirements.txt python app.py
Webhook Configuration
ngrok http 5000
Set webhook URL in Telnyx Portal:
Call Control Application -> https://.ngrok.io/webhooks/voice
API Reference
POST /translate
Upload as multipart form:
curl -X POST http://localhost:5000/translate \ -F [email protected] \ -F source=en \ -F target=ja
Response:
{"job_id": "tr-a1b2c3d4", "status": "complete", "source": "en (English)", "target": "ja (Japanese)", "original_length": 1847, "translated_length": 923}
GET /health
curl http://localhost:5000/health
{"status": "ok"}
Troubleshooting
Connection refused on port 5000: App isn't running. Run python app.py and check no other process uses port 5000.
401 Unauthorized: Your TELNYX_API_KEY is invalid. Generate a new one at portal.telnyx.com/api-keys.
AI response slow/empty: Verify model name. See available models at developers.telnyx.com.
Related Examples
run-llm-inference-python - Standalone inference
build-voice-ai-agent-python - Voice AI agent
Resources
AI Inference Guide
Call Control Guide
Telnyx Developer Docs
Telnyx Portal