VocalAcousticEngine extracts ~70 acoustic features using Librosa and Parselmouth (Praat).

Constructor

engine = VocalAcousticEngine(
    sample_rate=16000,
    frame_length_ms=25,
    hop_length_ms=10,
    gender=None,  # F0 range: 'male' (85–180 Hz), 'female' (165–255 Hz), or None (auto, 75–300 Hz)
)
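
The two millisecond parameters translate into sample counts through the sample rate. The sketch below shows that arithmetic for the values above. The helper name `ms_to_samples` is hypothetical, and rounding to the nearest sample is an assumption about how the engine does the conversion.

```python
def ms_to_samples(duration_ms: float, sample_rate: int) -> int:
    """Convert a duration in milliseconds to a whole number of samples."""
    # Assumption: round to the nearest sample; the engine may truncate instead.
    return int(round(duration_ms * sample_rate / 1000))


frame_length = ms_to_samples(25, 16000)  # 400 samples per analysis frame
hop_length = ms_to_samples(10, 16000)    # 160 samples between frame starts
overlap = frame_length - hop_length      # 240 samples shared by adjacent frames
```

With these settings, consecutive frames overlap by 60 % of their length.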

Primary method

engine.extract_features(audio: np.ndarray, sr: Optional[int]) -> VocalFeatures

VocalFeatures dataclass

F0 (fundamental frequency)

| Field | Unit |
|---|---|
| f0_mean | Hz |
| f0_std | Hz |
| f0_min, f0_max, f0_range | Hz |
| f0_confidence | 0–1 |
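
To make the F0 fields concrete, here is a minimal sketch of how such summary statistics can be derived from a frame-wise pitch track. The function `summarize_f0` is hypothetical. Marking unvoiced frames as 0 or NaN and using the population standard deviation are assumptions, not confirmed engine behaviour.

```python
import statistics
from typing import Dict, Sequence


def summarize_f0(f0_hz: Sequence[float]) -> Dict[str, float]:
    """Summarize a frame-wise pitch track, ignoring unvoiced frames."""
    # Assumption: unvoiced frames are 0 or NaN. `f > 0` is False for both,
    # because any comparison with NaN evaluates to False.
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        raise ValueError("no voiced frames in pitch track")
    f0_min, f0_max = min(voiced), max(voiced)
    return {
        "f0_mean": statistics.fmean(voiced),
        "f0_std": statistics.pstdev(voiced),  # population std (assumed)
        "f0_min": f0_min,
        "f0_max": f0_max,
        "f0_range": f0_max - f0_min,
    }


# Illustrative track: two unvoiced frames around three voiced ones.
stats = summarize_f0([0.0, 110.0, 120.0, 130.0, float("nan")])
```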

Voice quality

| Field | Unit | Normal range |
|---|---|---|
| jitter | % | < 1.04 % |
| shimmer | % | < 3.81 % |
| hnr | dB | > 20 dB |
| voice_quality_confidence | 0–1 | |
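
The 1.04 % and 3.81 % limits match the thresholds the Praat manual quotes for local jitter and local shimmer, which suggests (but the page does not confirm) that the engine reports the local variants. Under that assumption, both measures share one formula: the mean absolute difference between consecutive cycles, divided by the mean value. The helper `local_perturbation_percent` and the sample values are illustrative only.

```python
from statistics import fmean
from typing import Sequence


def local_perturbation_percent(values: Sequence[float]) -> float:
    """Mean absolute cycle-to-cycle difference relative to the mean, in percent.

    Applied to glottal periods this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * fmean(diffs) / fmean(values)


periods_s = [0.0100, 0.01005, 0.0100, 0.00995]  # illustrative glottal periods
amplitudes = [0.50, 0.51, 0.50, 0.49]           # illustrative peak amplitudes

jitter = local_perturbation_percent(periods_s)   # about 0.5 %
shimmer = local_perturbation_percent(amplitudes)  # about 2.0 %
within_normal = jitter < 1.04 and shimmer < 3.81
```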

Prosody

| Field | Unit |
|---|---|
| speaking_rate | syllables / sec |
| articulation_rate | syllables / sec (excluding pauses) |
| pause_count | int |
| pause_duration_mean, pause_duration_total | seconds |
| pitch_variability | coefficient of variation |
| intensity_mean, intensity_std | dB |
| prosody_confidence | 0–1 |
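
The difference between the two rates is easiest to see numerically. The sketch below assumes that speaking_rate divides the syllable count by the full duration, that articulation_rate divides it by the duration minus total pause time, and that pitch_variability is the F0 standard deviation over the F0 mean. The function names are hypothetical, and how the engine counts syllables is not covered here.

```python
from typing import Tuple


def prosody_rates(
    syllable_count: int, audio_duration: float, pause_duration_total: float
) -> Tuple[float, float]:
    """Return (speaking_rate, articulation_rate) in syllables per second."""
    phonation_time = audio_duration - pause_duration_total
    if audio_duration <= 0 or phonation_time <= 0:
        raise ValueError("durations must leave some speech time")
    return syllable_count / audio_duration, syllable_count / phonation_time


def coefficient_of_variation(std: float, mean: float) -> float:
    """Unitless spread: standard deviation divided by the mean."""
    return std / mean


# 40 syllables in a 10 s clip that contains 2 s of pauses.
speaking_rate, articulation_rate = prosody_rates(40, 10.0, 2.0)
# F0 standard deviation of 24 Hz around a 120 Hz mean.
pitch_variability = coefficient_of_variation(24.0, 120.0)
```

Because pauses are removed from the denominator, articulation_rate is never lower than speaking_rate.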

Spectral

| Field | Unit |
|---|---|
| mfcc | 13-element list |
| spectral_centroid | Hz |
| spectral_bandwidth | Hz |
| spectral_rolloff | Hz |
| spectral_flatness | 0–1 |
| spectral_confidence | 0–1 |
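
Two of these fields have compact textbook definitions, sketched here for a single frame using only the standard library. Spectral centroid is the magnitude-weighted mean frequency, and spectral flatness is the geometric mean of the power spectrum divided by its arithmetic mean. The function names and the `floor` parameter are illustrative; the engine presumably obtains these values from Librosa, whose defaults may differ.

```python
from statistics import fmean, geometric_mean
from typing import Sequence


def spectral_centroid(freqs_hz: Sequence[float], magnitudes: Sequence[float]) -> float:
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    total = sum(magnitudes)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    return sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total


def spectral_flatness(power_spectrum: Sequence[float], floor: float = 1e-10) -> float:
    """Geometric mean over arithmetic mean of one frame's power spectrum."""
    # The floor keeps every bin positive, which geometric_mean requires.
    bins = [max(p, floor) for p in power_spectrum]
    return geometric_mean(bins) / fmean(bins)


centroid = spectral_centroid([0.0, 1000.0, 2000.0, 3000.0], [0.0, 1.0, 1.0, 0.0])
noise_like = spectral_flatness([1.0, 1.0, 1.0, 1.0])      # flat spectrum, near 1
tone_like = spectral_flatness([100.0, 0.01, 0.01, 0.01])  # one dominant bin, near 0
```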

Clinical markers

| Field | Range |
|---|---|
| vocal_depression_index | 0–1 |
| vocal_anxiety_index | 0–1 |
| vocal_distress_index | 0–1 |
| prosodic_flatness | 0–1 (anhedonia marker) |
| clinical_confidence | 0–1 |

Quality

| Field | Notes |
|---|---|
| overall_quality | 0–100 |
| voiced_fraction | 0–1 |
| snr | dB |
| audio_duration | seconds |
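
As a rough illustration of two quality fields, the sketch below computes a voiced fraction from per-frame voicing decisions and a signal-to-noise ratio in decibels from two power estimates. Both helpers are hypothetical. How the engine decides voicing and how it estimates signal and noise power are not documented here, so the inputs are assumed to be given.

```python
import math
from typing import Sequence


def voiced_fraction(voiced_flags: Sequence[bool]) -> float:
    """Share of analysis frames classified as voiced, between 0 and 1."""
    if not voiced_flags:
        raise ValueError("no frames")
    return sum(voiced_flags) / len(voiced_flags)


def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in dB from mean power estimates."""
    return 10.0 * math.log10(signal_power / noise_power)


fraction = voiced_fraction([True, True, False, True])  # 3 of 4 frames voiced
snr = snr_db(signal_power=1.0, noise_power=0.01)       # power ratio of 100
```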

Library dependencies

  • Librosa
  • Parselmouth (Praat)
  • SciPy (scipy.signal, scipy.stats)
  • NumPy