Built an AI that explains math visually instead of just answering
Claw Learn is an AI-powered visual math tutor that combines the ElevenLabs Speech Engine with a custom canvas renderer to turn math questions into live animated explanations with synchronized narration. Users can ask questions by voice or text and watch the animation generate in real-time.
Talk to it. Watch it teach.
Claw Learn is an AI-powered visual math tutor with a real-time voice interface — powered by the ElevenLabs Speech Engine. Ask any math or physics question by voice or text, and watch a synchronized animated explanation generate live in the browser.
Live Demo · Report a Bug · Request a Feature
What is Claw Learn?
Claw Learn combines the ElevenLabs Speech Engine with an AI scene planner and a custom canvas renderer to turn math questions into live animated explanations with synchronized narration.
The Speech Engine is the core of the experience — it handles both voice input and audio output over WebRTC, so you can speak your question, interrupt mid-explanation, and ask follow-ups without ever touching a keyboard. When the Speech Engine isn't configured, the app falls back to REST TTS and browser-based speech recognition.
No slides. No textbooks. No pre-recorded videos. Every explanation is generated fresh for your exact question.
You: "Why does the derivative represent slope?"
App:
→ ElevenLabs Speech Engine captures your voice over WebRTC
→ AI generates a 10-scene visual teaching plan
→ Canvas renders: axes, parabola, tangent line, slope formula
→ Speech Engine narrates each scene in sync with the animation
→ Interrupt at any time to ask a follow-up: just speak
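The flow above can be sketched as a small scene sequencer. This is a minimal illustration, not the code in hooks/useTutor.ts: the `Scene` shape, `playScenes`, and the `narrate`/`render`/`isInterrupted` callbacks are all hypothetical names chosen for this example. It assumes a scene counts as finished only when both its animation and its narration have completed.

```typescript
// Hypothetical scene shape; the real types live in types/scene.ts and may differ.
interface Scene {
  narration: string;
  elements: { type: string }[];
}

type Narrate = (text: string) => Promise<void>;
type Render = (scene: Scene) => Promise<void>;

// Plays scenes in order. Each scene starts its animation and narration together
// and waits for both, so the picture and the voice stay in step.
// Returns how many scenes were played before an interruption (or the end).
async function playScenes(
  scenes: Scene[],
  narrate: Narrate,
  render: Render,
  isInterrupted: () => boolean = () => false,
): Promise<number> {
  let played = 0;
  for (const scene of scenes) {
    // Checked between scenes: a spoken follow-up stops the current plan.
    if (isInterrupted()) break;
    await Promise.all([render(scene), narrate(scene.narration)]);
    played += 1;
  }
  return played;
}
```

In this sketch an interruption takes effect at the next scene boundary; stopping audio mid-sentence would need cancellation support inside `narrate`, which is left out here.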
Demo
Add a GIF or screenshot here
Try these questions:
"How does matrix multiplication work?"
"Explain the Fourier transform visually"
"What is integration and why does it find area?"
"Show me Euler's formula e^(iπ) + 1 = 0"
"How does gravity create orbits?"
Tech Stack
| Layer | Technology |
|---|---|
| Framework | Next.js 16 (App Router, Turbopack) |
| UI | React 19, Tailwind CSS v4, Framer Motion |
| AI | Any OpenAI-compatible API (Gemini, OpenAI, Ollama, etc.) |
| Voice I/O | ElevenLabs Speech Engine (WebRTC) |
| TTS fallback | ElevenLabs REST API |
| STT fallback | Web Speech API |
| Animations | Custom 2D Canvas renderer |
| Language | TypeScript 5 |
| Deployment | Vercel |
Getting Started
Prerequisites
Node.js 20.9+ (Next.js 16 no longer supports Node.js 18)
An OpenAI-compatible API key — Gemini (free at aistudio.google.com), OpenAI, or any compatible provider
Google OAuth credentials — required for login (console.cloud.google.com)
Upstash Redis — recommended for rate limiting (console.upstash.com, free tier)
ElevenLabs — optional, free tier at elevenlabs.io
- Clone and install
git clone https://github.com/arzumanabbasov/claw-learn.git
cd claw-learn
npm install
- Configure environment variables
cp .env.local.example .env.local
Open .env.local and fill in your keys:
# ── AI Provider (required) ────────────────────────────────────────────────────
OPENAI_API_KEY=your_api_key_here
OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai
OPENAI_MODEL=gemini-2.5-flash

# ── Auth (required) ───────────────────────────────────────────────────────────
# Generate a secret: openssl rand -base64 32
AUTH_SECRET=your_auth_secret_here

# Google OAuth: https://console.cloud.google.com/
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret

# ── Rate limiting: Upstash Redis (recommended) ────────────────────────────────
# Without these, rate limiting falls back to in-memory (resets on server restart)
# Create a free Redis DB at https://console.upstash.com/
UPSTASH_REDIS_REST_URL=https://your-db.upstash.io
UPSTASH_REDIS_REST_TOKEN=your_token_here

# ── ElevenLabs Voice (optional) ───────────────────────────────────────────────
ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
ELEVENLABS_VOICE_ID=pNInz6obpgDQGcFmaJgB

# Speech Engine: full WebRTC voice I/O
# Create an agent at https://elevenlabs.io/app/conversational-ai
ELEVENLABS_SPEECH_ENGINE_ID=agent_xxxxxxxxxxxxxxxxxxxx

# ── Security ──────────────────────────────────────────────────────────────────
ALLOWED_ORIGIN=https://your-domain.com
- Run
npm run dev
Open http://localhost:3000.
Authentication
Claw Learn uses NextAuth.js v5 with Google OAuth. All routes require a valid session — unauthenticated users are redirected to /login.
Setup:
Go to console.cloud.google.com → APIs & Services → Credentials
Create an OAuth 2.0 Client ID (Web application)
Add your domain to Authorized JavaScript origins and https://your-domain.com/api/auth/callback/google to Authorized redirect URIs
Copy the Client ID and Secret into your env vars
Generate AUTH_SECRET with openssl rand -base64 32
For local dev, add http://localhost:3000 as an authorized origin and http://localhost:3000/api/auth/callback/google as a redirect URI.
Rate Limiting
Each authenticated user gets 3 questions per day, tracked by their Google user ID and reset at UTC midnight.
Rate limiting uses Upstash Redis in production — an atomic INCR with a TTL set to the end of the current UTC day. This is serverless-safe and works across all Vercel edge instances.
Without Upstash credentials, the app falls back to an in-memory store that resets whenever the server restarts (fine for local dev, not suitable for production).
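The daily-quota logic described above can be sketched with an in-memory counter. This is an illustration under stated assumptions, not the repository's implementation: `secondsUntilUtcMidnight`, `DailyLimiter`, and `consume` are hypothetical names, and the clock is passed in so the behaviour can be checked deterministically. With Redis, the same idea would be an INCR on a per-user key followed by an EXPIRE set to the number of seconds computed here.

```typescript
// Seconds from `now` until the next UTC midnight (between 1 and 86400).
function secondsUntilUtcMidnight(now: Date): number {
  // Date.UTC normalises day overflow, so "day + 1" also works at month end.
  const next = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1);
  return Math.max(1, Math.ceil((next - now.getTime()) / 1000));
}

const DAILY_LIMIT = 3; // from the README: 3 questions per user per day

interface Counter {
  count: number;
  expiresAt: number; // epoch milliseconds of the next UTC midnight
}

// In-memory stand-in for the Redis counter (hypothetical; resets on restart).
class DailyLimiter {
  private counters = new Map<string, Counter>();

  consume(userId: string, now: Date = new Date()): { allowed: boolean; remaining: number } {
    const nowMs = now.getTime();
    let entry = this.counters.get(userId);
    // A missing or expired entry starts a fresh day, like a Redis key whose TTL ran out.
    if (!entry || entry.expiresAt <= nowMs) {
      entry = { count: 0, expiresAt: nowMs + secondsUntilUtcMidnight(now) * 1000 };
      this.counters.set(userId, entry);
    }
    entry.count += 1; // mirrors INCR: count first, then compare against the limit
    return {
      allowed: entry.count <= DAILY_LIMIT,
      remaining: Math.max(0, DAILY_LIMIT - entry.count),
    };
  }
}
```

Counting before comparing keeps the check to a single step per request, which is what makes the Redis version safe when several serverless instances handle the same user at once.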
Setup:
Create a free Redis database at console.upstash.com
Copy the REST URL and token into UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN
The remaining question count is shown in the top bar as a live badge and resets automatically each day.
AI Providers
Claw Learn uses the OpenAI-compatible API format, so it works with any provider that supports it.
Gemini (default)
OPENAI_API_KEY=your_gemini_api_key
OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai
OPENAI_MODEL=gemini-2.5-flash
OpenAI
OPENAI_API_KEY=your_openai_api_key
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_MODEL=gpt-4o
Ollama (local)
OPENAI_API_KEY=ollama
OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_MODEL=llama3.1
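To show why swapping three variables is enough to change providers, here is a minimal sketch of how a chat completion request could be assembled from them. It is not the code in lib/openai.ts: `ProviderConfig`, `buildChatRequest`, and the placeholder system prompt are assumptions made for this example. The only provider-specific parts are the base URL, the key, and the model name; the `/chat/completions` path and the bearer header are the same for every OpenAI-compatible endpoint.

```typescript
// Mirrors the three environment variables above (hypothetical type name).
interface ProviderConfig {
  apiKey: string;  // OPENAI_API_KEY
  baseUrl: string; // OPENAI_BASE_URL
  model: string;   // OPENAI_MODEL
}

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Builds the HTTP request for an OpenAI-compatible chat completion.
// The system prompt is a placeholder; the app's real prompt is not shown here.
function buildChatRequest(cfg: ProviderConfig, question: string): ChatRequest {
  const base = cfg.baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${base}/chat/completions`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${cfg.apiKey}`,
    },
    body: JSON.stringify({
      model: cfg.model,
      messages: [
        { role: "system", content: "Return a scene plan as JSON." },
        { role: "user", content: question },
      ],
    }),
  };
}
```

The result can be passed to `fetch(req.url, { method: "POST", headers: req.headers, body: req.body })` on the server, which keeps the key out of the browser.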
Voice Modes
Mode 1 — ElevenLabs Speech Engine (recommended)
The Speech Engine connects via WebRTC to an ElevenLabs Conversational AI agent and is the primary voice interface for Claw Learn. It handles both input and output in a single low-latency connection:
Voice input — speak your questions naturally, no typing needed
Streaming TTS — audio streams directly from ElevenLabs as each scene plays
Interruption — speak mid-explanation to redirect or ask a follow-up
Lower latency — WebRTC is significantly faster than the REST fallback
Setup:
Go to elevenlabs.io/app/conversational-ai
Create a new agent
Set the system prompt to: "You are a math narration voice. Read exactly what the user sends you as clear, natural narration."
Copy the Agent ID and set it as ELEVENLABS_SPEECH_ENGINE_ID in your .env.local
The Voice button in the top bar connects and disconnects the Speech Engine. When connected, a pulsing green indicator shows the session is live.
Mode 2 — REST TTS fallback
When ELEVENLABS_API_KEY is set but no Speech Engine is configured, each scene's narration is sent to the ElevenLabs REST API and played back as audio. No voice input in this mode.
Mode 3 — No voice
The app works fully without any ElevenLabs configuration — text input and silent animations only.
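The three modes amount to a simple precedence rule over two environment variables, sketched below. `VoiceMode`, `VoiceEnv`, and `resolveVoiceMode` are hypothetical names, and the sketch assumes the agent ID alone selects Mode 1; the real app may also require ELEVENLABS_API_KEY on the server to issue the WebRTC conversation token.

```typescript
type VoiceMode = "speech-engine" | "rest-tts" | "none";

// Only the two variables that decide the mode (hypothetical type name).
interface VoiceEnv {
  ELEVENLABS_API_KEY?: string;
  ELEVENLABS_SPEECH_ENGINE_ID?: string;
}

// Picks the richest voice mode the configuration allows.
// Empty strings count as "not set", as they would in a .env file.
function resolveVoiceMode(env: VoiceEnv): VoiceMode {
  if (env.ELEVENLABS_SPEECH_ENGINE_ID) return "speech-engine"; // Mode 1: WebRTC voice I/O
  if (env.ELEVENLABS_API_KEY) return "rest-tts";               // Mode 2: narration only
  return "none";                                               // Mode 3: text and silent animation
}
```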
Deployment
Vercel (recommended)
npx vercel
Set these environment variables in the Vercel dashboard under Settings → Environment Variables:
| Variable | Required | Description |
|---|---|---|
| OPENAI_API_KEY | ✅ | API key for your AI provider |
| OPENAI_BASE_URL | ✅ | Base URL of the OpenAI-compatible endpoint |
| OPENAI_MODEL | ✅ | Model name to use |
| AUTH_SECRET | ✅ | NextAuth secret (openssl rand -base64 32) |
| GOOGLE_CLIENT_ID | ✅ | Google OAuth client ID |
| GOOGLE_CLIENT_SECRET | ✅ | Google OAuth client secret |
| UPSTASH_REDIS_REST_URL | Recommended | Upstash Redis URL for persistent rate limiting |
| UPSTASH_REDIS_REST_TOKEN | Recommended | Upstash Redis token |
| ELEVENLABS_API_KEY | Optional | ElevenLabs REST TTS fallback |
| ELEVENLABS_VOICE_ID | Optional | Override default voice |
| ELEVENLABS_SPEECH_ENGINE_ID | Recommended | WebRTC voice agent ID |
| ALLOWED_ORIGIN | Recommended | Your production domain for CORS |
The vercel.json in the repo is pre-configured.
Self-hosted
npm run build
npm start
Requires Node.js 20.9+ (the minimum supported by Next.js 16) and the environment variables above.
Visual Element Reference
The canvas renderer supports 30+ element types:
| Type | Description |
|---|---|
| axes | Coordinate axes with grid and tick labels |
| graph | Function curve (JS math expression) |
| tangent | Tangent line to a curve at a point |
| secant | Secant line between two points |
| shaded_area | Filled area under a curve |
| point | Dot with optional label |
| vector | Arrow with label |
| matrix | Matrix grid with brackets and highlights |
| formula | Math text in a pill box |
| histogram | Bar chart for distributions |
| pie_chart | Proportions and compositions |
| bar_chart | Categorical comparisons |
| line_chart | Discrete data series |
| scatter_plot | Correlation with optional regression line |
| wave | Propagating sine/cosine wave |
| axes_3d | Isometric 3D axes |
| complex_plane | Re/Im axes with unit circle |
| riemann_sum | Rectangles approximating an integral |
| slope_field | Directional arrows for dy/dx |
| parametric_curve | x(t), y(t) traced as t varies |
| polygon | Arbitrary shape from vertices |
| angle_arc | Label an angle between two rays |
| spring | Physics spring between two points |
| brace | Curly brace annotation |
| table | Data table with headers |
| highlight_region | Shaded overlay |
Coordinate system: origin at center, x right, y up. Typical visible range: x ∈ [-6, 6], y ∈ [-4, 4].
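Because canvas pixels put the origin at the top-left with y pointing down, every element position has to be converted from the math coordinates above. The sketch below shows one way to do that; `Viewport` and `toCanvas` are hypothetical names, and it assumes the visible range is symmetric about the origin and mapped to fill the whole canvas, which may not match lib/animationEngine.ts.

```typescript
// Canvas size in pixels plus the half-width and half-height of the visible math range.
interface Viewport {
  width: number;
  height: number;
  xMax: number; // visible x spans [-xMax, xMax], e.g. 6
  yMax: number; // visible y spans [-yMax, yMax], e.g. 4
}

// Maps a math-space point (origin at center, y up) to canvas pixels
// (origin at top-left, y down).
function toCanvas(x: number, y: number, vp: Viewport): { px: number; py: number } {
  const px = vp.width / 2 + (x / vp.xMax) * (vp.width / 2);
  const py = vp.height / 2 - (y / vp.yMax) * (vp.height / 2); // minus flips the y axis
  return { px, py };
}
```

On a 1200 by 800 canvas with the typical range, both axes work out to 100 pixels per unit, so circles stay round; other aspect ratios would need a single shared scale factor instead of the two independent ones used here.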
Project Structure
clawlearn/
├── app/
│   ├── api/
│   │   ├── explain/route.ts          # POST: AI scene plan generation
│   │   ├── narrate/route.ts          # POST: ElevenLabs REST TTS
│   │   └── speech-engine/token/      # GET: WebRTC conversation token
│   ├── page.tsx                      # Root: landing ↔ tutor router
│   ├── layout.tsx                    # Fonts, metadata, global CSS
│   └── globals.css                   # Design tokens, animations
│
├── components/
│   ├── LandingPage.tsx               # Marketing page
│   ├── TutorApp.tsx                  # App shell
│   ├── AnimationCanvas.tsx           # Canvas + scene sequencer
│   ├── ConversationPanel.tsx         # Chat history
│   ├── QuestionInput.tsx             # Input bar
│   └── NarrationSubtitle.tsx         # Subtitle below canvas
│
├── hooks/
│   ├── useTutor.ts                   # Core orchestration
│   ├── useSpeechEngine.ts            # ElevenLabs Speech Engine (WebRTC)
│   └── useVoice.ts                   # Web Speech API fallback
│
├── lib/
│   ├── openai.ts                     # OpenAI-compatible client + system prompt
│   ├── animationEngine.ts            # Canvas renderer (30+ elements)
│   ├── elevenlabs.ts                 # ElevenLabs REST helpers
│   └── voiceRecognition.ts           # Web Speech API wrapper
│
├── types/
│   └── scene.ts                      # Scene plan TypeScript types
│
├── .env.local.example                # Environment variable template
├── CONTRIBUTING.md                   # Contribution guide
├── LICENSE                           # MIT
└── vercel.json                       # Vercel deployment config
Security
API keys are server-side only — never exposed to the browser
Input is length-limited and validated on every API route
CORS is locked to ALLOWED_ORIGIN in production
The canvas renderer uses a safe recursive-descent math parser — no eval or new Function
Security headers (X-Frame-Options, X-Content-Type-Options, Referrer-Policy, Permissions-Policy) are set on all responses
See SECURITY.md for the full security policy and how to report vulnerabilities.
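To illustrate the parser approach mentioned above, here is a minimal recursive-descent evaluator for expressions such as `x^2` or `sin(pi / 2)`. It is a sketch, not the repository's parser: the grammar, the `evaluate` function, and the whitelists of functions and constants are assumptions made for this example. The point it demonstrates is that only explicitly listed names can ever be called, so AI-generated expressions cannot reach arbitrary JavaScript.

```typescript
// Whitelists: anything not listed here is rejected. Maps avoid prototype keys
// such as "constructor" that a plain object lookup would expose.
const FUNCTIONS = new Map<string, (v: number) => number>([
  ["sin", Math.sin], ["cos", Math.cos], ["tan", Math.tan],
  ["sqrt", Math.sqrt], ["abs", Math.abs], ["exp", Math.exp], ["ln", Math.log],
]);
const CONSTANTS = new Map<string, number>([["pi", Math.PI], ["e", Math.E]]);

// Grammar (lowest to highest precedence):
//   expr    := term (("+" | "-") term)*
//   term    := unary (("*" | "/") unary)*
//   unary   := "-" unary | power
//   power   := primary ("^" unary)?        right-associative
//   primary := number | name "(" expr ")" | name | "(" expr ")"
function evaluate(source: string, vars: Record<string, number> = {}): number {
  const tokens: string[] = source.match(/\d+(?:\.\d+)?|[a-z_]+|[-+*\/^()]/gi) ?? [];
  // If any character was skipped by the tokenizer, the input is not valid.
  if (tokens.join("") !== source.replace(/\s+/g, "")) {
    throw new Error("Unexpected character in expression");
  }
  let pos = 0;
  const peek = (): string | undefined => tokens[pos];
  const expect = (tok: string): void => {
    if (tokens[pos] !== tok) throw new Error(`Expected "${tok}"`);
    pos += 1;
  };

  function expr(): number {
    let value = term();
    while (peek() === "+" || peek() === "-") {
      const op = tokens[pos++];
      const rhs = term();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }
  function term(): number {
    let value = unary();
    while (peek() === "*" || peek() === "/") {
      const op = tokens[pos++];
      const rhs = unary();
      value = op === "*" ? value * rhs : value / rhs;
    }
    return value;
  }
  function unary(): number {
    if (peek() === "-") { pos += 1; return -unary(); }
    return power();
  }
  function power(): number {
    const base = primary();
    if (peek() === "^") { pos += 1; return Math.pow(base, unary()); }
    return base;
  }
  function primary(): number {
    const tok = tokens[pos++];
    if (tok === undefined) throw new Error("Unexpected end of expression");
    if (tok === "(") { const inner = expr(); expect(")"); return inner; }
    if (/^\d/.test(tok)) return parseFloat(tok);
    const fn = FUNCTIONS.get(tok);
    if (fn && peek() === "(") { pos += 1; const arg = expr(); expect(")"); return fn(arg); }
    if (Object.prototype.hasOwnProperty.call(vars, tok)) return vars[tok];
    const constant = CONSTANTS.get(tok);
    if (constant !== undefined) return constant;
    throw new Error(`Unknown identifier "${tok}"`);
  }

  const result = expr();
  if (pos !== tokens.length) throw new Error("Unexpected trailing input");
  return result;
}
```

A renderer could call `evaluate("x^2", { x })` for each sampled x to plot a curve. Unary minus binds looser than `^` in this grammar, so `-x^2` means `-(x^2)`, matching ordinary math notation.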
Known Limitations
No persistence — conversation history is in-memory, cleared on pag
[truncated for AI cost control]