Medera Speech to Text runs on a hybrid Whisper + Deepgram nova-2-medical stack, tuned for behavioral-health language and validated screener administration.

Capabilities

Capability                       Surface
Real-time conversational STT     /streams (WebSocket)
Real-time stateless dictation    /transcribe (WebSocket)
Batch transcription              /recordings (REST)
Diarization                      /streams, /recordings
Smart punctuation                All
Dictation commands               /transcribe
Interim results                  All
Speaker enrollment               /streams
Language detection               All
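
The capability matrix above can be encoded as a small lookup so a client can check which surface to call before opening a connection. This is only an illustrative sketch: the names `CAPABILITIES`, `surfaces_for`, and `supports` are hypothetical and not part of any Medera SDK, and "All" is assumed to mean the three surfaces listed in the table.

```python
# Surface paths as they appear in the capability table.
# Assumption: "All" means exactly these three surfaces.
ALL_SURFACES = ("/streams", "/transcribe", "/recordings")

# Hypothetical lookup mirroring the table; keys are lowercased capability names.
CAPABILITIES = {
    "real-time conversational stt": ("/streams",),
    "real-time stateless dictation": ("/transcribe",),
    "batch transcription": ("/recordings",),
    "diarization": ("/streams", "/recordings"),
    "smart punctuation": ALL_SURFACES,
    "dictation commands": ("/transcribe",),
    "interim results": ALL_SURFACES,
    "speaker enrollment": ("/streams",),
    "language detection": ALL_SURFACES,
}


def surfaces_for(capability: str) -> tuple:
    """Return the surfaces that offer a capability (case-insensitive)."""
    key = capability.strip().lower()
    if key not in CAPABILITIES:
        raise KeyError(f"unknown capability: {capability!r}")
    return CAPABILITIES[key]


def supports(surface: str, capability: str) -> bool:
    """True if the given surface path offers the capability."""
    return surface in surfaces_for(capability)
```

For example, `supports("/recordings", "Speaker enrollment")` is false under this encoding, because the table lists enrollment only on /streams.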

Languages

See Languages for the full matrix.

Latency

  • Interim transcript: < 350 ms P95
  • Final transcript: < 900 ms P95
  • Diarization label: < 1.2 s P95 after the final transcript
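
To compare your own measurements against these targets, a P95 can be computed from client-side latency samples. The sketch below uses the nearest-rank percentile definition, which is an assumption: the page does not say how Medera computes P95. The names `TARGETS_MS`, `p95`, and `meets_target` are hypothetical.

```python
# P95 targets from the list above, in milliseconds.
TARGETS_MS = {
    "interim": 350.0,       # interim transcript
    "final": 900.0,         # final transcript
    "diarization": 1200.0,  # diarization label, measured after the final transcript
}


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # 1-based rank = ceil(0.95 * n), done in integer arithmetic
    # to avoid floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_target(kind, samples_ms):
    """True if the measured P95 is strictly below the documented target."""
    return p95(samples_ms) < TARGETS_MS[kind]
```

Measure latency from the moment the audio chunk is sent to the moment the corresponding message arrives, and collect enough samples that the 95th percentile is meaningful; a handful of requests will not give a stable figure.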

Quality

Audio recommendations:
  • 16 kHz 16-bit PCM
  • Mono
  • SNR > 20 dB
  • Beamforming or lavalier microphone
See Recording Best Practices.
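
The format recommendations (16 kHz, 16-bit, mono) can be checked locally before uploading a recording. The following is a minimal sketch using Python's standard `wave` module; it assumes the audio is an uncompressed WAV file, and the function name `check_wav` is hypothetical. SNR and microphone type cannot be read from the file header, so they are not checked here.

```python
import wave


def check_wav(path):
    """Return a list of deviations from the recommended audio format.

    An empty list means the WAV header matches 16 kHz, 16-bit, mono.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width_bytes = wav.getsampwidth()
        channels = wav.getnchannels()

    if rate != 16000:
        problems.append(f"sample rate is {rate} Hz, recommended 16000 Hz")
    if width_bytes != 2:
        problems.append(f"sample width is {8 * width_bytes}-bit, recommended 16-bit")
    if channels != 1:
        problems.append(f"{channels} channels, recommended mono")
    return problems
```

A call such as `check_wav("session.wav")` returns one message per mismatch, which makes it easy to log or display what needs resampling or downmixing.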