Browser Integration
Set up realtime transcription in a web browser using the getRealtimeToken callback and the Web Audio API.
This guide walks through a complete browser-side realtime transcription integration. The SDK manages token lifecycle automatically — you just provide a callback that fetches tokens from your server.
Architecture
Step 1: Server — Token Endpoint
Create an endpoint on your server that generates temporary tokens for authenticated users.
⚠️ Warning: Always authenticate your own users before issuing Scribeberry tokens. Don't expose this endpoint without your own auth layer.
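
As a rough illustration of the rule above, here is a framework-agnostic sketch of the endpoint logic. Everything named here is an assumption made for illustration: AppUser, MintToken, HttpResult, and handleTokenRequest are hypothetical, and mintToken stands in for whatever server-side call actually creates a temporary Scribeberry token. The response shape (token plus an ISO-8601 expiresAt string) is also assumed.

```typescript
// Hypothetical types: none of these names come from the Scribeberry SDK.
interface AppUser {
  id: string;
}

interface RealtimeToken {
  token: string; // temporary token handed to the browser
  expiresAt: string; // assumed to be an ISO-8601 timestamp
}

// Stand-in for the real server-side call that creates a temporary token.
type MintToken = (userId: string) => Promise<RealtimeToken>;

interface HttpResult {
  status: number;
  body: unknown;
}

// Core endpoint logic: refuse unauthenticated callers, then mint a token.
// `user` is whatever your own auth layer resolved for this request (null if none).
async function handleTokenRequest(
  user: AppUser | null,
  mintToken: MintToken,
): Promise<HttpResult> {
  if (user === null) {
    return { status: 401, body: { error: "Not authenticated" } };
  }
  try {
    const token = await mintToken(user.id);
    return { status: 200, body: token };
  } catch {
    // Do not leak upstream error details to the browser.
    return { status: 502, body: { error: "Could not create realtime token" } };
  }
}
```

In Express, Next.js, or Fastify, a route handler would resolve the user from its own session or cookie, call handleTokenRequest, and write the returned status and body to the response.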
Step 2: Browser — Initialize SDK with Token Callback
Instead of manually managing tokens, provide a getRealtimeToken callback. The SDK calls it automatically when it needs a token (on connect and before expiry).
The SDK will:
- Call your callback on first connect
- Cache the token until it's close to expiry
- Auto-refresh ~60 seconds before expiry (no interruption to active sessions)
- Retry if refresh fails
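
The callback itself can be small. The sketch below shows one possible shape, under several assumptions: the endpoint path /api/realtime-token, the POST method, and the { token, expiresAt } return shape are all hypothetical, and createGetRealtimeToken is a helper invented for this example. The fetch function is injected so the logic can be exercised outside a browser.

```typescript
// Minimal subset of the Fetch API this sketch relies on.
interface MinimalResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (
  url: string,
  init?: { method?: string; credentials?: "include" | "same-origin" | "omit" },
) => Promise<MinimalResponse>;

// Assumed token shape; check the SDK reference for the real one.
interface RealtimeToken {
  token: string;
  expiresAt: string;
}

// Builds a token callback that asks your own server for a token.
function createGetRealtimeToken(
  fetchFn: FetchLike,
  url: string = "/api/realtime-token", // hypothetical endpoint path
): () => Promise<RealtimeToken> {
  return async function getRealtimeToken(): Promise<RealtimeToken> {
    // "include" sends your app's session cookie so the server can authenticate the user.
    const res = await fetchFn(url, { method: "POST", credentials: "include" });
    if (!res.ok) {
      throw new Error(`Token request failed with status ${res.status}`);
    }
    const data = (await res.json()) as Partial<RealtimeToken>;
    if (typeof data.token !== "string" || typeof data.expiresAt !== "string") {
      throw new Error("Token response is missing token or expiresAt");
    }
    return { token: data.token, expiresAt: data.expiresAt };
  };
}
```

In the browser, the result of createGetRealtimeToken((url, init) => fetch(url, init)) would be passed as the getRealtimeToken option when initializing the SDK. Throwing on a failed request lets the SDK's retry behavior take over rather than handing it a malformed token.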
Step 3: Browser — Start Session
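
One detail of this step is worth sketching: audio should only be sent once the session has emitted its started event (see Troubleshooting). The helper below buffers early chunks and flushes them when the session starts. It is a sketch under assumptions: SessionLike and createAudioGate are hypothetical, and only the names started and sendAudio() come from this guide. Their exact signatures are assumed here.

```typescript
// Hypothetical view of a realtime session; signatures are assumed.
interface SessionLike {
  on(event: "started", listener: () => void): void;
  sendAudio(chunk: ArrayBuffer): void;
}

// Returns a push() function that is safe to call before the session has started.
function createAudioGate(
  session: SessionLike,
  maxQueuedChunks: number = 50,
): (chunk: ArrayBuffer) => void {
  let started = false;
  const queue: ArrayBuffer[] = [];

  session.on("started", () => {
    started = true;
    // Flush everything captured while the session was still connecting.
    for (const chunk of queue) {
      session.sendAudio(chunk);
    }
    queue.length = 0;
  });

  return function push(chunk: ArrayBuffer): void {
    if (started) {
      session.sendAudio(chunk);
      return;
    }
    // Bound memory use: drop the oldest chunk once the queue is full.
    if (queue.length >= maxQueuedChunks) {
      queue.shift();
    }
    queue.push(chunk);
  };
}
```

Dropping the oldest chunks keeps memory bounded if the connection takes a long time. An alternative is to start microphone capture only after started fires, which avoids buffering altogether.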
Step 4: Browser — Capture Microphone Audio
Scribeberry expects PCM 16-bit signed little-endian, 16kHz, mono audio. Here's how to capture and convert it from the browser's Web Audio API:
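
The Web Audio API delivers samples as Float32 values in the range -1 to 1, often at 44.1 or 48 kHz, so two conversions are usually needed. The following is a minimal sketch of both, not the SDK's own implementation: downsample and floatToPcm16 are hypothetical helper names, and the downsampler uses simple block averaging rather than a proper low-pass filter.

```typescript
const TARGET_SAMPLE_RATE = 16000;

// Reduces the sample rate by averaging each block of input samples.
// Averaging is a crude anti-aliasing step; a real resampler would filter more carefully.
function downsample(
  input: Float32Array,
  inputRate: number,
  targetRate: number = TARGET_SAMPLE_RATE,
): Float32Array {
  if (inputRate === targetRate) return input;
  if (inputRate < targetRate) {
    throw new Error("Upsampling is not supported in this sketch");
  }
  const ratio = inputRate / targetRate;
  const outLength = Math.floor(input.length / ratio);
  const output = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(Math.floor((i + 1) * ratio), input.length);
    let sum = 0;
    for (let j = start; j < end; j++) {
      sum += input[j];
    }
    output[i] = sum / (end - start);
  }
  return output;
}

// Converts Float32 samples in [-1, 1] to 16-bit signed little-endian PCM.
function floatToPcm16(samples: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range samples saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // The final `true` selects little-endian byte order.
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```

In an audio callback, the Float32Array for channel 0 would go through downsample (using the AudioContext's actual sampleRate) and then floatToPcm16 before being handed to sendAudio(). Whether sendAudio() accepts an ArrayBuffer directly is an assumption to verify against the SDK reference.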
Step 5: Putting It All Together
Complete React Example
Token Lifecycle
When using the getRealtimeToken callback, the SDK handles token lifecycle automatically:
- First connect: the SDK calls your callback to get a token
- During session: the SDK tracks the expiresAt timestamp
- Before expiry: the SDK calls your callback again ~60s before the token expires
- Seamless refresh: active sessions continue uninterrupted
You don't need to write any token refresh logic. If you need manual control, you can still use the static apiKey: 'sb_rt_...' pattern instead.
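
To make the lifecycle concrete, here is a small model of the caching behavior described above. It is an illustration only and not the SDK's internal code: createTokenCache is a hypothetical helper, the expiresAt format (ISO-8601 string) is assumed, and the retry policy (three immediate attempts) is invented for the example.

```typescript
interface RealtimeToken {
  token: string;
  expiresAt: string; // assumed to be an ISO-8601 timestamp
}

type GetRealtimeToken = () => Promise<RealtimeToken>;

// Returns a function that yields a cached token, refreshing it when it is
// within `marginMs` of expiry. `now` is injectable so the logic can be tested.
function createTokenCache(
  getToken: GetRealtimeToken,
  now: () => number = Date.now,
  marginMs: number = 60000,
  maxAttempts: number = 3,
): () => Promise<RealtimeToken> {
  let cached: RealtimeToken | null = null;

  const needsRefresh = (t: RealtimeToken): boolean =>
    Date.parse(t.expiresAt) - now() <= marginMs;

  async function fetchWithRetry(): Promise<RealtimeToken> {
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return await getToken();
      } catch (err) {
        lastError = err; // remember the failure and try again
      }
    }
    throw lastError;
  }

  return async function get(): Promise<RealtimeToken> {
    if (cached !== null && !needsRefresh(cached)) {
      return cached; // still comfortably valid
    }
    cached = await fetchWithRetry();
    return cached;
  };
}
```

With a token valid for two minutes, this model reuses the cached token for the first minute and asks the callback for a new one once less than 60 seconds remain.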
Troubleshooting
| Issue | Solution |
|---|---|
| No audio data received | Check that getUserMedia permissions are granted and the AudioContext sample rate is 16000 |
| WebSocket closes immediately | Verify the token is valid and hasn't expired. Check CORS configuration. |
| Transcript is garbled | Ensure audio is PCM 16-bit, 16kHz, mono. Float32 audio will not work. |
| "Session not active" errors | Wait for the started event before calling sendAudio() |
| High latency | Use a buffer size of 4096 or smaller in createScriptProcessor() |
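
The latency advice in the last row follows from simple arithmetic: each audio callback delivers one buffer, so the buffer size divided by the sample rate is the minimum delay added before a chunk can be sent. The sketch below computes that for the buffer sizes the Web Audio API allows for createScriptProcessor(); bufferLatencyMs is a hypothetical helper name.

```typescript
// Buffer sizes accepted by createScriptProcessor() (powers of two, in sample frames).
const SCRIPT_PROCESSOR_BUFFER_SIZES = [256, 512, 1024, 2048, 4096, 8192, 16384];

// Time covered by one buffer, in milliseconds.
function bufferLatencyMs(bufferSize: number, sampleRate: number): number {
  return (bufferSize * 1000) / sampleRate;
}

for (const size of SCRIPT_PROCESSOR_BUFFER_SIZES) {
  console.log(`${size} frames at 16 kHz = ${bufferLatencyMs(size, 16000)} ms per chunk`);
}
```

At 16 kHz, a 4096-frame buffer holds 256 ms of audio. Smaller buffers reduce that delay but fire the callback more often, so very small sizes can cause audio glitches on slower devices.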