Overview of Medera Multimodal Sensing

Medera Multimodal Sensing is designed for behavioral and mental health. The engines below provide real-time access to clinical signal channels; which engine to use depends on your use case.

Review the Languages page to see language support per surface and the Architecture page for engine deep dives.

Medera Multimodal engines

Vocal Acoustic Engine

F0, jitter, shimmer, HNR, MFCC, prosodic features, and clinical markers from speech.

Facial Physiological Engine

rPPG-derived HR, BP, HRV (SDNN, RMSSD, LF/HF, SD1, SD2), respiration, stress.

Neurobehavioral Construct Computer

15 named neurobehavioral constructs across 5 domains, fused from facial, vocal, and assessment context.

Architecture

Engine functionality

	Vocal Acoustic	Facial Physiological	Neurobehavioral Construct
Input	16 kHz mono PCM	Video frames	Vocal + Facial features + Assessments
Minimum duration	1.0 s	30 s (HR) / 60 s (HRV, BP)	—
Library	audio analysis library + speech analysis library	OpenCV + rPPG library + SciPy	NumPy + Pandas
Outputs	F0, jitter, shimmer, HNR, MFCC, clinical markers	HR, BP, HRV, RR, stress, PNS/SNS	15 named constructs
Confidence calibration	0–1 per family	0–1 + SNR > 3.0 gate	`VERY_HIGH` → `VERY_LOW`
Failure mode	`unavailable` status	`confidence: 0.0`	`requires_human_review: true`
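
The minimum-duration and SNR rows above can be read as an input gate in front of each engine. The following is a minimal sketch of that reading, not Medera's implementation: the function names are hypothetical, and it assumes the minimum durations are inclusive and that the facial SNR gate applies to every facial metric.

```python
# Thresholds copied from the engine functionality table.
VOCAL_MIN_SECONDS = 1.0
FACIAL_MIN_SECONDS = {"HR": 30.0, "HRV": 60.0, "BP": 60.0}
FACIAL_SNR_GATE = 3.0  # the table states the gate as SNR > 3.0


def vocal_ready(duration_s: float) -> bool:
    """True if the audio clip meets the vocal minimum (assumed inclusive)."""
    return duration_s >= VOCAL_MIN_SECONDS


def facial_metrics_available(duration_s: float, snr: float) -> list[str]:
    """List the facial metrics whose duration and SNR requirements are met.

    Hypothetical helper: returns an empty list when the SNR gate fails,
    mirroring the table's `confidence: 0.0` failure mode.
    """
    if snr <= FACIAL_SNR_GATE:
        return []
    return [
        metric
        for metric, min_seconds in FACIAL_MIN_SECONDS.items()
        if duration_s >= min_seconds
    ]
```

Under these assumptions, a 45-second video with adequate SNR qualifies for HR only, while HRV and BP need at least 60 seconds.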

Endpoints

Method	Path	Use
`POST`	`/api/multimodal-therapy/analyze-session`	Full multimodal analysis + biomarker persistence
`POST`	`/api/multimodal-therapy/analyze-audio-only`	Audio-only multimodal analysis
`GET`	`/api/multimodal-therapy/health`	Engine availability + dependency status
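
As an illustration of the endpoint table, the sketch below builds a health-check request with Python's standard library. The base URL is a placeholder, and authentication headers and the response schema are not described on this page, so none are assumed here.

```python
import urllib.request

# Path taken from the endpoint table; the base URL is a hypothetical placeholder.
HEALTH_PATH = "/api/multimodal-therapy/health"


def build_health_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the engine health endpoint."""
    url = base_url.rstrip("/") + HEALTH_PATH
    return urllib.request.Request(url, method="GET")


# To send it, pass the request to urllib.request.urlopen(); any credentials
# your deployment requires would need to be added as headers first.
request = build_health_request("https://api.example.com")
```

Checking engine availability before calling the `analyze-session` or `analyze-audio-only` endpoints is one way to avoid submitting a session when a dependency is down.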

Confidence bands

Band	Range	Behavior
`VERY_HIGH`	≥ 0.95	Auto-pass through to clinician
`HIGH`	≥ 0.85	Auto-pass through
`MODERATE`	≥ 0.70	Surfaced normally
`LOW`	≥ 0.50	`requires_human_review: true`
`VERY_LOW`	< 0.50	`requires_human_review: true` + flagged for review queue

If signal quality fails (SNR ≤ 3.0 on facial, voiced fraction too low on vocal), the engine returns `confidence: 0.0` and the analyzer falls back to a transcript-inferred analysis from Medera reasoning. It never fabricates values.
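
The band thresholds in the table above can be expressed as a simple lookup. This is a minimal sketch for client-side interpretation only: the function names are hypothetical, and it assumes each band's lower bound is inclusive, as the `≥` signs in the table suggest.

```python
# (band name, inclusive lower bound), ordered from highest to lowest.
BANDS = [
    ("VERY_HIGH", 0.95),
    ("HIGH", 0.85),
    ("MODERATE", 0.70),
    ("LOW", 0.50),
]


def confidence_band(confidence: float) -> str:
    """Map a 0-1 confidence score to its band name."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    for name, lower_bound in BANDS:
        if confidence >= lower_bound:
            return name
    return "VERY_LOW"


def requires_human_review(confidence: float) -> bool:
    """LOW and VERY_LOW results are marked for human review."""
    return confidence_band(confidence) in {"LOW", "VERY_LOW"}
```

With this mapping, a score of `0.0` (the value returned when signal quality fails) lands in `VERY_LOW` and is therefore marked for human review.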

What’s next

Architecture

Pipeline deep dive from raw signal to canonical clinical payload.

Quickstart

Stream your first multimodal session end-to-end.

Neurobehavioral Constructs

15 named constructs across 5 domains.

Co-Therapy Agent

Agent-level documentation for in-session multimodal.