Medera’s Multimodal Sensing layer fuses three real-time clinical signal channels (vocal acoustics, facial physiology, and assessment context) into 15 Research Domain Criteria (RDoC) construct activations. The pipeline runs in ai-services under src/multimodal/engines/ and src/rdoc/.

The three engines

Vocal Acoustic Engine

VocalAcousticEngine — Librosa + Parselmouth (Praat). F0 statistics, jitter, shimmer, HNR, MFCC (13), prosodic features, and clinical markers.
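
To make two of the listed features concrete, here is a minimal pure-Python sketch of the standard "local" jitter and shimmer definitions (mean absolute difference between consecutive cycles, divided by the mean). It is an illustration only: the engine itself relies on Parselmouth (Praat), and the function names below are hypothetical, not part of the Medera codebase.

```python
from statistics import mean
from typing import Sequence


def local_perturbation(values: Sequence[float]) -> float:
    """Mean absolute difference between consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two glottal cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def jitter_local(periods_s: Sequence[float]) -> float:
    """Cycle-to-cycle variation of the pitch period (hypothetical helper name)."""
    return local_perturbation(periods_s)


def shimmer_local(peak_amplitudes: Sequence[float]) -> float:
    """Cycle-to-cycle variation of the peak amplitude (hypothetical helper name)."""
    return local_perturbation(peak_amplitudes)
```

Both functions return a ratio; multiply by 100 to express it as a percentage. Extracting the per-cycle periods and amplitudes from raw audio is the part a tool such as Praat handles and is not shown here.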

Facial Physiological Engine

FacialPhysiologicalEngine — OpenCV + HeartPy + SciPy. rPPG-derived HR, BP, HRV (SDNN / RMSSD / LF-HF / SD1 / SD2), respiration, stress index.
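
As an illustration of two of the time-domain HRV metrics named above, the following sketch computes SDNN and RMSSD from a list of inter-beat (RR) intervals in milliseconds, using their textbook definitions. It assumes the intervals have already been recovered from the rPPG signal; the function names are hypothetical and this is not the engine's own code, which uses HeartPy and SciPy.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def sdnn(rr_ms: Sequence[float]) -> float:
    """Standard deviation of the RR intervals (sample standard deviation)."""
    return stdev(rr_ms)


def rmssd(rr_ms: Sequence[float]) -> float:
    """Root mean square of successive RR-interval differences."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    squared_diffs = [(b - a) ** 2 for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(mean(squared_diffs))
```

The frequency-domain (LF-HF) and Poincaré (SD1 / SD2) metrics need longer recordings and additional processing, so they are left out of this sketch.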

RDoC Construct Computer

RDoCConstructComputer — fuses facial features, vocal features, and assessment context into 15 named constructs across 5 domains.

Architecture

   AUDIO  →  VocalAcousticEngine.extract_features(audio) → VocalFeatures
   VIDEO  →  FacialPhysiologicalEngine.extract_features(frames) → PhysiologicalSignals
   ASSESS →  PHQ-9 / GAD-7 / C-SSRS scores
                  ↓
   RDoCConstructComputer.compute_all_constructs()
                  ↓
   RDoCActivationProfile (15 constructs)
                  ↓
   MultimodalTherapyAnalyzer → canonical clinical payload
                  ↓
   AdvancedClinicalRAG (optional) → CitedResponse
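
The first three stages of the flow above can be sketched as plain function wiring. The class and method names extract_features and compute_all_constructs come from the diagram; everything else is an assumption made for illustration, including the Protocol types, the helper name build_activation_profile, and the argument order passed to compute_all_constructs. The real signatures in ai-services may differ.

```python
from typing import Any, Mapping, Protocol


class VocalEngine(Protocol):
    def extract_features(self, audio: Any) -> Any: ...


class FacialEngine(Protocol):
    def extract_features(self, frames: Any) -> Any: ...


class ConstructComputer(Protocol):
    # Argument order (facial, vocal, assessment) is assumed, not documented.
    def compute_all_constructs(
        self, facial: Any, vocal: Any, assessment: Mapping[str, float]
    ) -> Any: ...


def build_activation_profile(
    audio: Any,
    frames: Any,
    assessment: Mapping[str, float],
    vocal_engine: VocalEngine,
    facial_engine: FacialEngine,
    computer: ConstructComputer,
) -> Any:
    """Run both sensing engines, then fuse their outputs with the assessment scores."""
    vocal_features = vocal_engine.extract_features(audio)
    physiological_signals = facial_engine.extract_features(frames)
    return computer.compute_all_constructs(
        physiological_signals, vocal_features, assessment
    )
```

Passing the engines in as arguments keeps the sketch testable with stand-ins; the analyzer and optional RAG stages would consume the returned profile.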

Endpoints

The multimodal pipeline is exposed through two API surfaces.

| Surface | Endpoint | Use |
| --- | --- | --- |
| Backend (3001) | POST /api/multimodal-therapy/analyze-session | Full multimodal analysis + biomarker persistence + metered credit |
| Backend (3001) | POST /api/multimodal-therapy/analyze-audio-only | Audio-only proxy to AI services |
| AI Services (8000) | POST /api/multimodal-therapy/analyze-session | Source-of-truth analyzer |
| AI Services (8000) | POST /api/multimodal-therapy/analyze-audio-only | Vocal-only analysis |
| AI Services (8000) | GET /api/multimodal-therapy/health | Engine availability + dep status |

The backend proxy enriches the payload with the runtime agent config and persists the resulting biomarker envelope; the AI services router runs the engines.
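
A small sketch of how a client might address the two surfaces, using only the Python standard library. The ports and paths come from the endpoint list above; the localhost host, the helper name build_request, and the idea of passing the JSON body as a caller-supplied dict are assumptions. The request body schema and any authentication are not described on this page, so they are deliberately left to the caller.

```python
import json
from typing import Any, Mapping, Optional
from urllib.request import Request

PORTS = {"backend": 3001, "ai_services": 8000}
BASE_PATH = "/api/multimodal-therapy"


def build_request(
    surface: str,
    endpoint: str,
    payload: Optional[Mapping[str, Any]] = None,
    host: str = "localhost",  # assumed; substitute the real deployment host
) -> Request:
    """Build a GET (no payload) or JSON POST (with payload) request without sending it."""
    url = f"http://{host}:{PORTS[surface]}{BASE_PATH}/{endpoint}"
    if payload is None:
        return Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


health_check = build_request("ai_services", "health")
```

Sending is then a matter of passing the object to urllib.request.urlopen. Checking the health endpoint first is a reasonable way to confirm engine availability before posting a session.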

Quality and confidence

Every output reports confidence in ConfidenceLevel bands (defined in advanced_clinical_rag.py):

| Band | Range |
| --- | --- |
| VERY_HIGH | ≥ 0.95 |
| HIGH | ≥ 0.85 |
| MODERATE | ≥ 0.70 |
| LOW | ≥ 0.50 |
| VERY_LOW | < 0.50 |

Results below MODERATE (confidence under 0.70) are delivered with requires_human_review: true. If signal quality fails (SNR < 3.0 on the facial channel, or too low a voiced fraction on the vocal channel), the engine returns the metric with confidence: 0.0, and the analyzer falls back to a GPT-4o transcript-inferred analysis rather than fabricating values.
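
The banding rule can be sketched as a lookup over the lower bounds listed above. This is a stand-in, not the definition in advanced_clinical_rag.py: the enum values, the helper names band_for and requires_human_review, and the assumption that confidence lies in [0.0, 1.0] are all illustrative.

```python
from enum import Enum


class ConfidenceLevel(Enum):
    # Member names follow the documented bands; the string values are assumed.
    VERY_HIGH = "very_high"
    HIGH = "high"
    MODERATE = "moderate"
    LOW = "low"
    VERY_LOW = "very_low"


# Lower bounds in descending order, taken from the band list.
_LOWER_BOUNDS = [
    (0.95, ConfidenceLevel.VERY_HIGH),
    (0.85, ConfidenceLevel.HIGH),
    (0.70, ConfidenceLevel.MODERATE),
    (0.50, ConfidenceLevel.LOW),
]


def band_for(confidence: float) -> ConfidenceLevel:
    """Map a confidence score to the first band whose lower bound it meets."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence is assumed to lie in [0.0, 1.0]")
    for lower_bound, band in _LOWER_BOUNDS:
        if confidence >= lower_bound:
            return band
    return ConfidenceLevel.VERY_LOW


def requires_human_review(confidence: float) -> bool:
    """True for any score below the MODERATE threshold of 0.70."""
    return confidence < 0.70
```

A score of 0.0, as returned when signal quality fails, lands in VERY_LOW and is therefore always flagged for review.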

What’s next

Architecture

Pipeline deep dive.

Quickstart

Stream your first multimodal session.

RDoC Constructs

15 named constructs.

Co-Therapy Agent

Agent-level documentation.