The Co-Therapy Agent is a silent multimodal partner: it never speaks to the patient. It captures audio, video, and language during the session and fuses the signal into 15 RDoC construct activations on a private clinician channel.
Architecture
Multimodal channels
| Channel | Engine | Outputs |
|---|---|---|
| Audio | Vocal Acoustic Engine | F0, jitter, shimmer, HNR, MFCC, prosodic flatness, depression/anxiety/distress indices |
| Video | Facial Physiological Engine | HR, BP, HRV (SDNN/RMSSD/LF-HF), respiration rate, stress index, affect |
| Language | Linguistic Content Expert | Topic, valence, certainty, pronoun shift |
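For readers unfamiliar with the HRV outputs in the Video row, the sketch below computes SDNN and RMSSD from a list of beat-to-beat (RR) intervals using their textbook definitions. It only illustrates what the two metrics mean; how the Facial Physiological Engine derives them from video is not described here, and the function names and sample intervals are made up for this example.

```python
import math
import statistics


def sdnn(rr_ms):
    """SDNN: sample standard deviation of the RR intervals, in milliseconds."""
    return statistics.stdev(rr_ms)


def rmssd(rr_ms):
    """RMSSD: root mean square of successive RR-interval differences, in milliseconds."""
    diffs = [later - earlier for earlier, later in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Illustrative RR intervals in milliseconds (not real patient data).
rr_intervals_ms = [800, 810, 790, 805, 795]

print(f"SDNN:  {sdnn(rr_intervals_ms):.2f} ms")
print(f"RMSSD: {rmssd(rr_intervals_ms):.2f} ms")
```

SDNN reflects overall variability across the window, while RMSSD emphasizes short-term, beat-to-beat changes.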
Steps
Capture consent
Recording requires explicit patient consent per session. Configure your consent flow in the Console.
Start the session
POST /api/therapy-sessions/start with the participants and modality. Returns a session_id and a multimodal WebSocket URL.
Stream audio + video
Stream PCM audio at 16 kHz and video frames (MediaPipe 468-point landmarks) over the multimodal WebSocket.
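The session start and streaming steps above can be sketched as follows. This is a minimal illustration under several assumptions that the text does not confirm: the host name, the JSON field names (participants, modality), 16-bit mono little-endian samples, a 20 ms audio frame size, and landmarks serialized as (x, y, z) float32 triples. Authentication and the actual WebSocket send are left out; check the API reference for the real wire format.

```python
import json
import struct
import urllib.request

BASE_URL = "https://api.example.invalid"  # hypothetical host, replace with your own

SAMPLE_RATE_HZ = 16_000   # from the docs: PCM audio at 16 kHz
FRAME_MS = 20             # assumption: 20 ms per audio frame
SAMPLES_PER_FRAME = SAMPLE_RATE_HZ * FRAME_MS // 1000
LANDMARK_COUNT = 468      # from the docs: MediaPipe 468-point landmarks


def build_start_request(participants, modality):
    """Build (but do not send) the POST request that starts a session."""
    body = json.dumps({"participants": participants, "modality": modality})
    return urllib.request.Request(
        BASE_URL + "/api/therapy-sessions/start",
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def pcm_frames(samples):
    """Yield complete frames of 16-bit little-endian PCM; a trailing partial frame is dropped."""
    last_start = len(samples) - SAMPLES_PER_FRAME
    for start in range(0, last_start + 1, SAMPLES_PER_FRAME):
        chunk = samples[start:start + SAMPLES_PER_FRAME]
        yield struct.pack("<%dh" % len(chunk), *chunk)


def pack_landmarks(landmarks):
    """Serialize one video frame of 468 (x, y, z) landmarks as float32 values."""
    if len(landmarks) != LANDMARK_COUNT:
        raise ValueError("expected %d landmarks, got %d" % (LANDMARK_COUNT, len(landmarks)))
    flat = [coord for point in landmarks for coord in point]
    return struct.pack("<%df" % len(flat), *flat)


request = build_start_request(["clinician-1", "patient-1"], "video")
one_second_of_silence = [0] * SAMPLE_RATE_HZ
audio_frames = list(pcm_frames(one_second_of_silence))
video_frame = pack_landmarks([(0.0, 0.0, 0.0)] * LANDMARK_COUNT)
```

In a real client, each element of audio_frames and each packed video frame would be sent as a binary message over the WebSocket URL returned by the start call.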
Receive construct activations
The clinician’s private channel emits rdoc.activation events with feature contributions and confidence.
Related
Co-Therapy Agent
Agent-level documentation.
Facial Engine
Physiological signals from video.
Vocal Engine
Acoustic features from audio.
RDoC Constructs
15 documented constructs.
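As a companion to the "Receive construct activations" step, the sketch below shows one way a clinician-side client might filter and summarize rdoc.activation events. The JSON shape is an assumption: the field names type, construct, confidence, and contributions, the placeholder construct name, and the sample values are all hypothetical. The documentation only states that events carry feature contributions and a confidence.

```python
import json


def parse_activation(message, top_n=3):
    """Return a summary dict for an rdoc.activation event, or None for other messages."""
    event = json.loads(message)
    if event.get("type") != "rdoc.activation":  # hypothetical discriminator field
        return None
    contributions = event.get("contributions", {})
    # Rank features by the magnitude of their contribution, largest first.
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return {
        "construct": event["construct"],
        "confidence": float(event["confidence"]),
        "top_features": ranked[:top_n],
    }


# Illustrative payload only; values and names are invented for this example.
sample_message = json.dumps({
    "type": "rdoc.activation",
    "construct": "construct_a",
    "confidence": 0.82,
    "contributions": {"jitter": 0.10, "prosodic_flatness": -0.40, "hrv_rmssd": 0.25},
})

summary = parse_activation(sample_message)
print(summary)
```

Returning None for unrecognized message types lets the same handler sit on a channel that may also carry other events.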