Realtime Transcription
Stream live audio to Scribeberry and receive instant transcript segments via WebSocket.
Realtime transcription lets you stream audio from a microphone (or any audio source) and receive transcript segments as speech is recognized — with sub-second latency.
How It Works
Realtime transcription uses a WebSocket connection between your application and the Scribeberry API. The flow is:
- Connect: open a WebSocket to wss://api.scribeberry.com/ws/realtime
- Start: send a start command with your session configuration
- Stream: send raw audio chunks as binary WebSocket frames
- Receive: get partial (interim) and final (confirmed) transcript events
- Stop: send a stop command and receive the final transcript + optional note
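The Start, Stream, and Stop steps can be sketched as plain functions over any object with a send() method, such as an open WebSocket. This is a minimal sketch, not the official client. The JSON shape of the commands ({ type: "start" }, { type: "stop" }) and the helper names buildStartCommand, buildStopCommand, and streamSession are assumptions made for illustration; this page names the commands but does not show their wire format.

```javascript
// ASSUMPTION: commands are sent as JSON text frames with a "type" field.
// The wire format is not specified on this page.
function buildStartCommand(config) {
  // "type" is written last so a stray config.type cannot override it.
  return JSON.stringify({ ...config, type: "start" });
}

function buildStopCommand() {
  return JSON.stringify({ type: "stop" });
}

// `socket` is anything with a send() method, for example a WebSocket
// already connected to wss://api.scribeberry.com/ws/realtime.
function streamSession(socket, config, audioChunks) {
  socket.send(buildStartCommand(config)); // Start
  for (const chunk of audioChunks) {
    socket.send(chunk); // Stream: raw audio as binary frames
  }
  socket.send(buildStopCommand()); // Stop
}
```

A real client would probably wait for the server's acknowledgment (the started event) before sending audio, and would listen for partial and final events while streaming. The sketch leaves both out to keep the order of the commands visible.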
Quick Example (Node.js)
Session Lifecycle
| State | Description |
|---|---|
| idle | Session created, not yet connected |
| connecting | WebSocket open, waiting for server acknowledgment |
| active | Streaming audio, receiving transcripts |
| paused | Audio paused, connection alive |
| stopping | Stop requested, waiting for server to flush |
| stopped | Session complete, final results available |
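One way to reason about these states is as a small state machine. The sketch below is illustrative only. The transition table is an assumption inferred from the state descriptions (the page lists the states but not the allowed moves between them), and it covers the happy path without error transitions. The names TRANSITIONS, canTransition, and transition are hypothetical.

```javascript
// ASSUMPTION: allowed transitions are inferred from the state descriptions,
// not taken from the API. Error paths are intentionally left out.
const TRANSITIONS = {
  idle: ["connecting"],
  connecting: ["active"],
  active: ["paused", "stopping"],
  paused: ["active", "stopping"],
  stopping: ["stopped"],
  stopped: [],
};

function canTransition(from, to) {
  const allowed = TRANSITIONS[from];
  return Array.isArray(allowed) && allowed.includes(to);
}

// Returns the new state, or throws if the move is not in the table.
function transition(from, to) {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid session transition: ${from} -> ${to}`);
  }
  return to;
}
```

A guard like this can help a UI decide which controls to enable, for example offering resume only while the session is paused.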
Events
partial — Interim Transcript
Fired rapidly as speech is recognized. Each partial replaces the previous one. Use this for live display of what the user is saying.
final — Confirmed Segment
Fired when a segment of speech is fully recognized. This text is stable — it won't change. Accumulate final segments to build the complete transcript.
endpoint — Utterance Boundary
Fired when a natural pause in speech is detected. Use this to insert paragraph breaks or punctuation.
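The interplay of partial, final, and endpoint can be sketched as a small accumulator: each partial replaces the last, each final is appended, and an endpoint starts a new paragraph. This is a minimal sketch under assumptions. The event payload shape ({ type, text }) is not specified on this page, clearing the interim text when a final arrives is a guess about how the two relate, and createTranscriptView is a hypothetical helper.

```javascript
// ASSUMPTION: events look like { type: "partial" | "final" | "endpoint", text }.
function createTranscriptView() {
  const paragraphs = [[]]; // confirmed segments, grouped by utterance
  let partial = ""; // latest interim text, replaced on every partial

  function lastParagraph() {
    return paragraphs[paragraphs.length - 1];
  }

  function confirmedText() {
    return paragraphs
      .filter((p) => p.length > 0)
      .map((p) => p.join(" "))
      .join("\n\n");
  }

  function displayText() {
    const confirmed = confirmedText();
    if (!partial) return confirmed;
    if (!confirmed) return partial;
    // After an endpoint, the interim text opens a new paragraph.
    const separator = lastParagraph().length === 0 ? "\n\n" : " ";
    return confirmed + separator + partial;
  }

  function handle(event) {
    if (event.type === "partial") {
      partial = event.text; // replace, never append
    } else if (event.type === "final") {
      lastParagraph().push(event.text); // stable text, safe to keep
      partial = ""; // assumed: the final supersedes the interim text
    } else if (event.type === "endpoint") {
      // Start a new paragraph, but never stack empty ones.
      if (lastParagraph().length > 0) paragraphs.push([]);
    }
  }

  return { handle, confirmedText, displayText };
}
```

Render displayText() for the live view and keep confirmedText() as the stable record, since only final segments are guaranteed not to change.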
started — Session Ready
stopped — Session Complete
note — Note Generated
Fired only if you provided a templateId in the session config. The server generates a note from the accumulated transcript after you stop the session.
error — Error Occurred
Session Methods
| Method | Description |
|---|---|
| sendAudio(data) | Send a binary audio chunk |
| sendStream(iterable) | Stream from an async iterable |
| getTranscript() | Get accumulated transcript text so far |
| getSegments() | Get all confirmed segments so far |
| pause() | Pause audio (connection stays alive) |
| resume() | Resume after pause |
| finalize() | Force-flush pending audio |
| stop() | Stop the session and get final results |
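As an illustration of sendAudio(data), the sketch below splits an in-memory audio buffer into fixed-size chunks and sends each one. It assumes only that the session object exposes sendAudio as listed in the table above. The helper name sendInChunks and the 3200-byte default are hypothetical, and the chunk size is not a documented requirement.

```javascript
// ASSUMPTION: `session` has the sendAudio(data) method from the table above.
// The default chunk size is illustrative, not a documented requirement.
function sendInChunks(session, audioBytes, chunkBytes = 3200) {
  if (!Number.isInteger(chunkBytes) || chunkBytes <= 0) {
    throw new RangeError("chunkBytes must be a positive integer");
  }
  let sent = 0;
  for (let offset = 0; offset < audioBytes.length; offset += chunkBytes) {
    // subarray() returns a view, so no audio data is copied.
    // The end index is clamped, so the last chunk may be shorter.
    session.sendAudio(audioBytes.subarray(offset, offset + chunkBytes));
    sent += 1;
  }
  return sent; // number of chunks sent
}
```

For live sources such as a microphone, sendStream(iterable) is likely the better fit, since the audio arrives over time rather than as one buffer.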
Configuration
Auto Note Generation
If you pass a templateId, the server automatically generates a note from the accumulated transcript when you stop the session:
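Because the note is generated after the session stops, a client that passed a templateId may want to wait for both the stopped and note events before treating the session as finished. The sketch below tracks that condition without relying on the order of the two events. The event shape ({ type: "note", ... }), the placeholder template ID, and the helper createCompletionTracker are assumptions for illustration.

```javascript
// ASSUMPTION: events are objects with a "type" field, and the session
// config is a plain object with an optional templateId.
function createCompletionTracker(config) {
  const expectsNote = Boolean(config.templateId);
  let stopped = false;
  let note = null;

  return {
    handle(event) {
      if (event.type === "stopped") stopped = true;
      if (event.type === "note") note = event;
    },
    // Done once the session stopped and, if a template was given,
    // the note has also arrived (in either order).
    isComplete() {
      return stopped && (!expectsNote || note !== null);
    },
    getNote() {
      return note;
    },
  };
}

// Usage: "tmpl_example" is a placeholder, not a real template ID.
const tracker = createCompletionTracker({ templateId: "tmpl_example" });
tracker.handle({ type: "stopped" });
console.log(tracker.isComplete()); // false: still waiting for the note
```

Without a templateId, the tracker reports completion as soon as stopped arrives, matching the rule that the note event fires only when a template was provided.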
Next Steps
- Browser Integration: Set up realtime transcription in a web browser with temporary tokens.
- Audio Format: Detailed audio format requirements for realtime streaming.