Medera diarizes up to 4 speakers in conversational speech-to-text (STT), assigning speaker labels with sub-second latency.

How it works

  • Speaker embeddings are extracted for each voiced window
  • Windows are grouped by online clustering with a minimum-cluster-duration heuristic
  • Labels are stabilized with Viterbi smoothing
  • Known clinicians can optionally be enrolled in advance
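
The steps above can be illustrated with a small, self-contained sketch. It is not Medera's implementation: the greedy cosine-similarity clustering, the `threshold` and `switch_penalty` parameters, and the function names are all assumptions chosen for illustration. The minimum-cluster-duration heuristic is left out because its exact rule is not described here.

```python
import math

MAX_SPEAKERS = 4  # from the text: up to 4 speakers


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def online_cluster(embeddings, threshold=0.7):
    """Greedy online clustering (illustrative, hypothetical threshold).

    Each window joins its most similar centroid; a new cluster is opened
    only if no centroid is similar enough and the speaker cap allows it.
    """
    centroids, counts, labels = [], [], []
    for emb in embeddings:
        sims = [cosine(emb, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__) if sims else -1
        can_open = len(centroids) < MAX_SPEAKERS
        if best == -1 or (sims[best] < threshold and can_open):
            centroids.append(list(emb))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            n = counts[best]
            # Running mean keeps the centroid up to date without storing history.
            centroids[best] = [(c * n + x) / (n + 1)
                               for c, x in zip(centroids[best], emb)]
            counts[best] = n + 1
            labels.append(best)
    return labels


def viterbi_smooth(scores, switch_penalty=1.0):
    """Viterbi decoding over per-window speaker scores.

    scores[t][k] is how well window t matches speaker k (higher is better).
    Returns the label path that maximizes the summed score minus
    switch_penalty for every change of speaker between adjacent windows.
    """
    if not scores:
        return []
    k = len(scores[0])
    best = list(scores[0])
    back = []
    for row in scores[1:]:
        top = max(range(k), key=best.__getitem__)  # best predecessor overall
        new_best, pointers = [], []
        for j in range(k):
            stay = best[j]
            switch = best[top] - switch_penalty
            if stay >= switch:
                new_best.append(stay + row[j])
                pointers.append(j)
            else:
                new_best.append(switch + row[j])
                pointers.append(top)
        best = new_best
        back.append(pointers)
    state = max(range(k), key=best.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With a non-zero `switch_penalty`, a single window that weakly favors another speaker is absorbed into the surrounding label instead of causing a brief flip, which is the kind of stability the smoothing step is meant to provide.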

Enrollment

POST /api/providers/{id}/voice-enroll
Content-Type: audio/wav

Upload 30 s of clean clinician speech to register a speaker embedding. Enrolled speakers are labelled consistently across sessions.
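
A minimal client-side sketch of the enrollment call, using only the Python standard library. The endpoint path and `Content-Type` come from the request above; the host in `BASE_URL`, the helper names, and the local duration check are assumptions for illustration. Authentication is not described here, so none is shown.

```python
import io
import urllib.request
import wave

BASE_URL = "https://api.example.invalid"  # hypothetical host, replace with yours
MIN_SECONDS = 30.0  # assumption: treat the documented 30 s as a minimum


def wav_duration_seconds(wav_bytes):
    """Return the duration of an in-memory WAV file in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def build_enroll_request(provider_id, wav_bytes, base_url=BASE_URL):
    """Build (but do not send) the POST request for voice enrollment."""
    duration = wav_duration_seconds(wav_bytes)
    if duration < MIN_SECONDS:
        raise ValueError(f"need at least {MIN_SECONDS:.0f} s of audio, got {duration:.1f} s")
    url = f"{base_url}/api/providers/{provider_id}/voice-enroll"
    return urllib.request.Request(
        url,
        data=wav_bytes,
        method="POST",
        headers={"Content-Type": "audio/wav"},
    )
```

To send the request, pass the result to `urllib.request.urlopen(...)` and add whatever authentication headers your deployment requires. Checking the duration locally avoids uploading a clip that is too short to enroll.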