Show HN: E3d-pod2vid – AI pipeline that turns podcasts into YouTube-ready videos
E3d-pod2vid is an open-source AI pipeline that automatically converts podcasts into YouTube-ready videos. It uses GPT-4o-mini for semantic B-roll selection, generates burned-in subtitles, supports optional OpenAI TTS voice replacement, and can upload to YouTube and post to multiple social platforms.
Notifications You must be signed in to change notification settings
Fork 0
Star 1
BranchesTags
Open more actions menu
Folders and files
NameName
Last commit message
Last commit date
Latest commit
History
5 Commits
5 Commits
.env.example
.env.example
.gitignore
.gitignore
README.md
README.md
announce.js
announce.js
linkedin_auth.js
linkedin_auth.js
make_thumbnail.py
make_thumbnail.py
package.json
package.json
pod2vid.py
pod2vid.py
requirements.txt
requirements.txt
tts_replace.py
tts_replace.py
yt_auth.js
yt_auth.js
yt_update.js
yt_update.js
yt_upload.js
yt_upload.js
Repository files navigation
AI-powered podcast-to-video pipeline. Converts a diarized audio file (NotebookLM, podcast, interview) into a YouTube-ready MP4 with:
Semantically matched Pexels B-roll per utterance (GPT-4o-mini picks the clip)
Burned-in subtitles (no ffmpeg libass required — pure Pillow)
Optional OpenAI TTS voice replacement (swap out NotebookLM / AI voices)
YouTube upload + description/thumbnail update
One-shot multi-platform social posting (Discord, Telegram, X, Moltbook, LinkedIn)
Quick Start
git clone https://github.com/spacepacket1/e3d-pod2vid.git cd e3d-pod2vid
Python deps
pip install -r requirements.txt
Node deps (YouTube + social posting only)
npm install
Copy and fill in your API keys
cp .env.example .env $EDITOR .env
Workflow
- Convert audio to video
python3 pod2vid.py episode.m4a output/episode.mp4
This single command:
Uploads audio to AssemblyAI for speaker diarization
Asks GPT-4o-mini for a specific Pexels search query per utterance
Downloads matching B-roll clips (cached per query)
Renders each segment with burned-in subtitles
Concatenates into a final MP4 + SRT subtitle file
Caches diarization and queries as JSON so re-runs are fast.
- (Optional) Replace voices with OpenAI TTS
If you want custom voices instead of the original audio (e.g. replace NotebookLM voices):
Synthesize with OpenAI TTS voices
python3 tts_replace.py output/episode-diarization.json episode-tts
Render video using TTS audio
python3 pod2vid.py output/episode-tts.mp3 output/episode-tts.mp4
Default voices: onyx (Speaker A) and nova (Speaker B). Override with VOICE_A / VOICE_B.
Available voices: alloy, echo, fable, onyx, nova, shimmer
- Generate a thumbnail
python3 make_thumbnail.py "Predictive GPS for Autonomous AI Agents" thumbnail.png /path/to/logo.png
Outputs a 1280×720 PNG with title, accent stripe, and optional logo overlay. Pure Pillow — no browser or design tool required.
- Upload to YouTube
First time: authorize your account
node yt_auth.js
The script prints a URL. Open it on any device (phone, browser — the machine running the script doesn't need a browser). After approving, paste the redirect URL back into the terminal. Tokens are saved to youtube-tokens.json.
Upload the video
node yt_upload.js output/episode-tts.mp4 "My Episode Title"
Prints the video URL and ID when done.
Update description and thumbnail
YT_DESCRIPTION="Check out maps.e3d.ai — AI-powered GPS for autonomous vehicles.
Follow us: • X: @e3dmaps • Discord: https://discord.gg/your-server" \ node yt_update.js VIDEO_ID thumbnail.png
- Announce on social media
node announce.js https://www.youtube.com/watch?v=VIDEO_ID "New episode: Predictive GPS for Autonomous AI Agents"
Posts simultaneously to all configured platforms. Platforms with no credentials are silently skipped.
Platform Credential(s) needed
Discord DISCORD_BOT_TOKEN + DISCORD_CHANNEL_ID
Telegram TELEGRAM_BOT_TOKEN + TELEGRAM_CHAT_ID
X (Twitter) X_ACCESS_TOKEN
Moltbook MOLTBOOK_API_KEY
LinkedIn linkedin-tokens.json with person_urn (run node linkedin_auth.js)
- (Optional) LinkedIn setup
LinkedIn's API requires a few one-time setup steps before announce.js can post there.
Step 1 — Create a LinkedIn app
Go to linkedin.com/developers/apps and create an app. Under the Auth tab, add this as an authorized redirect URL:
https://www.linkedin.com/developers/tools/oauth/redirect
Step 2 — Add required products
Under the Products tab, request access to both:
Share on LinkedIn — grants w_member_social scope (post on behalf of user)
Sign In with LinkedIn using OpenID Connect — grants openid profile scopes (needed to resolve your person URN)
Both are typically approved instantly for personal apps.
Step 3 — Verify company association (if prompted)
LinkedIn may ask you to verify a company page association. Open the verification URL while logged in as a Page Admin and approve it.
Step 4 — Authorize and get tokens
Add your app credentials to .env:
LINKEDIN_CLIENT_ID=your_client_id LINKEDIN_CLIENT_SECRET=your_client_secret
Then run:
node linkedin_auth.js
Open the printed URL on any device. After approving, paste the redirect URL back. Tokens are saved to linkedin-tokens.json.
Step 5 — Add your person URN
LinkedIn's API requires your encoded person ID (not your numeric member ID). To find it:
Go to your LinkedIn profile in a browser
View Page Source (Cmd+U / Ctrl+U) and search for urn:li:member:
Note the numeric ID (e.g. 4435724)
Make a test API call — the error response will reveal your encoded person URN (e.g. urn:li:person:2KqUAyg4oY)
Or run this one-liner after getting a token:
node -e " const https = require('https'); const t = JSON.parse(require('fs').readFileSync('linkedin-tokens.json')); // Replace MEMBER_ID with your numeric ID from page source const body = JSON.stringify({author:'urn:li:member:MEMBER_ID',commentary:'test',visibility:'PUBLIC',distribution:{feedDistribution:'MAIN_FEED',targetEntities:[],thirdPartyDistributionChannels:[]},lifecycleState:'PUBLISHED',isReshareDisabledByAuthor:false}); const u = require('url').parse('https://api.linkedin.com/rest/posts'); const r = https.request(Object.assign(u,{method:'POST',headers:{'Authorization':'Bearer '+t.access_token,'Content-Type':'application/json','Content-Length':Buffer.byteLength(body),'LinkedIn-Version':'202506','X-Restli-Protocol-Version':'2.0.0'}}),res=>{let d='';res.on('data',c=>d+=c);res.on('end',()=>console.log(d.slice(0,300)));}); r.write(body);r.end(); "
The error message will contain your encoded URN. Save it:
node -e " const fs = require('fs'); const t = JSON.parse(fs.readFileSync('linkedin-tokens.json')); t.person_urn = 'urn:li:person:YOUR_ENCODED_ID'; fs.writeFileSync('linkedin-tokens.json', JSON.stringify(t, null, 2)); "
Once linkedin-tokens.json contains person_urn, announce.js will post to LinkedIn automatically.
Configuration
Copy .env.example to .env and fill in the keys you need.
Variable Required for Notes
ASSEMBLYAI_API_KEY pod2vid.py assemblyai.com
OPENAI_API_KEY pod2vid.py, tts_replace.py GPT-4o-mini + TTS
PEXELS_API_KEY pod2vid.py pexels.com/api — free
DISCORD_BOT_TOKEN announce.js Optional
DISCORD_CHANNEL_ID announce.js Optional
TELEGRAM_BOT_TOKEN announce.js Optional
TELEGRAM_CHAT_ID announce.js Optional
X_ACCESS_TOKEN announce.js OAuth2 bearer token
MOLTBOOK_API_KEY announce.js Optional
MOLTBOOK_SUBMOLT announce.js Submolt name (default: agentfinance)
LINKEDIN_CLIENT_ID linkedin_auth.js From LinkedIn Developer Portal
LINKEDIN_CLIENT_SECRET linkedin_auth.js From LinkedIn Developer Portal
LINKEDIN_TOKEN_FILE announce.js Default: linkedin-tokens.json — must contain person_urn
VOICE_A tts_replace.py Default: onyx
VOICE_B tts_replace.py Default: nova
SPEAKER_A_NAME pod2vid.py Subtitle label (default: Host)
SPEAKER_B_NAME pod2vid.py Subtitle label (default: Guest)
YT_PRIVACY yt_upload.js public / unlisted / private
YT_DESCRIPTION yt_update.js Full video description text
How semantic B-roll works
Instead of rotating through a fixed clip library, this pipeline asks GPT-4o-mini to generate a specific Pexels search query for each utterance:
"EZPass saved us 90 seconds at every toll plaza" → "toll booth highway payment"
"the dual-witness problem" → "courtroom judge testimony"
"machine learning position predictions" → "machine learning data training loop"
Queries are cached so re-runs or TTS voice swaps don't re-spend API credits. ~82 unique clips across a 90-segment episode is typical.
Requirements
Python 3.8+
Pillow >= 10.0
python-dotenv >= 1.0
ffmpeg (any version — subtitle rendering does not require libfreetype/libass)
Node.js 18+
dotenv
External APIs
AssemblyAI (diarization)
OpenAI (GPT-4o-mini + TTS)
Pexels (B-roll clips, free tier fine for personal use)
YouTube Data API v3 (via Google Cloud Console)
LinkedIn API (via LinkedIn Developer Portal) — optional, for posting
Output files
output/ episode.mp4 final video episode.srt subtitle file for YouTube CC episode-diarization.json cached AssemblyAI result episode-queries.json cached GPT Pexels queries broll/ cached B-roll clips (one per unique query) tts-cache/ cached TTS utterances (per voice+text hash)
Credits
Built by E3D Maps — AI-powered navigation for autonomous vehicles.
License
MIT
About
AI-powered podcast-to-video pipeline. Semantic B-roll, voice synthesis, burned subtitles, YouTube publishing.
Resources
Readme
Uh oh!
There was an error while loading. Please reload this page.
Activity
Stars
1 star
Watchers
0 watching
Forks
0 forks
Report repository
Releases
No releases published
Packages 0
Uh oh!
There was an error while loading. Please reload this page.
Contributors
Uh oh!
There was an error while loading. Please reload this page.
Languages
Python 50.1%
JavaScript 49.9%