Category: Clinical Operations  ·  Signal channels: Vocal acoustics · Facial physiology · Linguistic content · Assessments
The Co-Therapy Agent is a silent multimodal partner — it never speaks to the patient. It runs the Vocal Acoustic Engine and Facial Physiological Engine, fuses outputs through the RDoC Construct Computer into 15 named construct activations, and surfaces the resulting payload to the clinician on a private channel.

Endpoints

Method  Path                                          Purpose
POST    /api/co-therapy-sessions                      Create session (backend)
POST    /api/co-therapy-sessions/{id}/start           Start session + invitations
POST    /api/co-therapy-sessions/{id}/join            Record participant joining
POST    /api/co-therapy-sessions/{id}/leave           Record participant leaving
POST    /api/co-therapy-sessions/{id}/transition      Manually transition session state
POST    /api/multimodal-therapy/analyze-session       Full multimodal analysis
POST    /api/multimodal-therapy/analyze-audio-only    Audio-only analysis
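
As a rough sketch of how the session lifecycle endpoints fit together, the helper below only builds the method and path for each step; it sends nothing over the network. The paths and action names come from the endpoint list, while `session_path`, `lifecycle`, and the ordering of steps are illustrative assumptions. Request bodies and authentication are not described on this page and are left out.

```python
from urllib.parse import quote

BASE = "/api/co-therapy-sessions"
# Actions taken from the endpoint list on this page.
ACTIONS = ("start", "join", "leave", "transition")


def session_path(session_id: str, action: str) -> str:
    """Return the path for one lifecycle action on an existing session."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # Percent-encode the id so it cannot inject extra path segments.
    return f"{BASE}/{quote(session_id, safe='')}/{action}"


def lifecycle(session_id: str) -> list[tuple[str, str]]:
    """List (method, path) pairs for an assumed create, start, join, leave flow."""
    steps = [("POST", BASE)]  # create has no id in its path
    steps += [("POST", session_path(session_id, a)) for a in ("start", "join", "leave")]
    return steps
```

In practice the id would come from the create response; here it is passed in so the path-building logic can be checked in isolation.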

Pipeline

audio + video + transcript + assessments
  └─→ FacialPhysiologicalEngine.extract_features()  → PhysiologicalSignals
  └─→ VocalAcousticEngine.extract_features()        → VocalFeatures
  └─→ RDoCConstructComputer.compute_all_constructs() → RDoCActivationProfile
  └─→ AdvancedClinicalRAG (optional)                → CitedResponse
                          │
                          ▼
        MultimodalTherapyAnalyzer canonical payload
                          │
                          ▼
         Clinician private channel + session note
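
The stage ordering in the pipeline can be sketched as a plain function that takes each engine as a callable. This is a minimal illustration, not the analyzer's implementation: `Payload`, `run_pipeline`, the dictionary-based feature types, and the assumption that the construct computer consumes both feature sets are all hypothetical, since the real `PhysiologicalSignals`, `VocalFeatures`, and `RDoCActivationProfile` shapes are not shown on this page.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Hypothetical stand-in for the real feature and profile types.
Features = dict[str, Any]


@dataclass
class Payload:
    physiological: Features
    vocal: Features
    rdoc_profile: Features
    cited_response: Optional[str] = None  # set only when the RAG stage runs


def run_pipeline(
    inputs: Features,
    facial: Callable[[Features], Features],
    vocal: Callable[[Features], Features],
    rdoc: Callable[[Features, Features], Features],
    rag: Optional[Callable[[Features], str]] = None,
) -> Payload:
    """Run the stages in diagram order and bundle the results."""
    phys = facial(inputs)      # stands in for FacialPhysiologicalEngine.extract_features()
    voc = vocal(inputs)        # stands in for VocalAcousticEngine.extract_features()
    profile = rdoc(phys, voc)  # stands in for RDoCConstructComputer.compute_all_constructs()
    cited = rag(profile) if rag is not None else None  # optional stage
    return Payload(phys, voc, profile, cited)
```

Passing the engines in as callables keeps the optional RAG stage explicit: omitting `rag` leaves `cited_response` as `None` rather than an empty placeholder.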

Configuration

Field                         Description
audio_enabled, video_enabled  Channel toggles
rdoc_constructs               Optional subset of the 15 named constructs
clinical_context              Chief complaint, active diagnoses, recent screener scores
consent_recorded              Per-session consent flag (required)
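
One way to picture these fields is as a small validated object. The sketch below is an assumption-laden illustration: the field names come from the configuration list, but the `SessionConfig` class, the default values, the value types, and the exact validation rules are hypothetical and may not match the service's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionConfig:
    consent_recorded: bool                        # required per session
    audio_enabled: bool = True                    # default is an assumption
    video_enabled: bool = True                    # default is an assumption
    rdoc_constructs: Optional[list[str]] = None   # None = no subset requested
    clinical_context: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Consent is documented as required, so reject anything but True.
        if self.consent_recorded is not True:
            raise ValueError("consent_recorded must be true for every session")
        if self.rdoc_constructs is not None:
            names = self.rdoc_constructs
            # A subset of 15 named constructs has no duplicates and at most 15 entries.
            if len(set(names)) != len(names) or len(names) > 15:
                raise ValueError("rdoc_constructs must be a subset of the 15 constructs")
```

For example, `SessionConfig(consent_recorded=True, video_enabled=False)` describes an audio-only session, while `SessionConfig(consent_recorded=False)` raises `ValueError`.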

Composable experts

Expert                            Role
Vocal Prosody Expert              F0, jitter, shimmer, HNR, speaking rate, prosodic flatness
Affect Dynamics Expert            Facial affect intensity, valence, congruence
Neurobehavioral Construct Expert  RDoC activation, severity, contributors
Linguistic Content Expert         Topic, valence, certainty, pronoun shift

Quality gates

  • Audio: ≥ 1.0 s, valid voiced_fraction
  • Video: ≥ 30 s for HR, ≥ 60 s for HRV / BP, SNR > 3.0
  • Analyzer: ≥ 20 words of transcript and ≥ 30 s of session signal
On quality failure, the engine returns confidence: 0.0 and the analyzer falls back to a GPT-4o transcript-inferred analysis marked requires_human_review: true. Values are never fabricated.
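
The thresholds above can be expressed as simple predicates. This is a sketch under stated assumptions rather than the engines' actual checks: the function names are hypothetical, a "valid" voiced_fraction is assumed to mean a value within [0, 1], and the SNR threshold is assumed to apply to every video-derived metric.

```python
def audio_ok(duration_s: float, voiced_fraction: float) -> bool:
    # Assumption: "valid" means a fraction within [0, 1]; NaN fails both comparisons.
    return duration_s >= 1.0 and 0.0 <= voiced_fraction <= 1.0


def video_metrics(duration_s: float, snr: float) -> set[str]:
    """Return which physiological metrics the clip is long and clean enough for."""
    if snr <= 3.0:  # the gate requires SNR strictly greater than 3.0
        return set()
    metrics: set[str] = set()
    if duration_s >= 30:
        metrics.add("HR")
    if duration_s >= 60:
        metrics.update({"HRV", "BP"})
    return metrics


def analyzer_ok(transcript: str, signal_s: float) -> bool:
    # Word count is approximated by whitespace splitting.
    return len(transcript.split()) >= 20 and signal_s >= 30
```

A 45-second clip with adequate SNR would qualify for heart rate only, since HRV and blood pressure need at least 60 seconds.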

Compliance guarantees

  • The Co-Therapy Agent never addresses or interacts with the patient
  • Construct activations report contributors[] so clinicians can audit feature-level influence
  • Recording policy is explicit per session (consent_recorded: true required)

Quickstart

Run a multimodal session.

Multimodal Overview

Engine deep dive.

RDoC Constructs

15 named constructs.

Multimodal API

REST surface.