Machine learning could be a complementary decision support tool for depression assessment
Background and Purpose: Depression affects an estimated 18 million Americans each year, yet depression screening occurs rarely in the outpatient setting. This study evaluated an AI-based voice biomarker tool that uses speech patterns to detect moderate to severe depression, with the aim of improving access to screening in primary care.
Study Approach: The study analyzed over 14,000 speech samples from U.S. and Canadian adults. Participants answered the question “How was your day?” with at least 25 seconds of free-form speech. The tool analyzed vocal biomarkers associated with depression, including speech cadence, hesitations, pauses and other acoustic features. Its outputs were compared with the results of the Patient Health Questionnaire-9 (PHQ-9), a standard depression screening tool. A PHQ-9 score of 10 or higher indicated moderate to severe depression. The AI tool provided one of three outputs: signs of depression detected, signs of depression not detected, or further assessment recommended (for uncertain cases).
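To make the reference standard concrete, here is a minimal sketch of how a PHQ-9 total could be turned into the yes/no label the tool was compared against. The cutoff of 10 comes from the study description above; the function name and the range check (PHQ-9 totals run from 0 to 27) are illustrative assumptions, not the study's code.

```python
PHQ9_CUTOFF = 10  # from the article: 10 or higher indicates moderate to severe depression


def phq9_reference_label(total_score: int) -> bool:
    """Return True if a PHQ-9 total meets the moderate-to-severe cutoff.

    Hypothetical helper for illustration; not taken from the study.
    """
    # The PHQ-9 has nine items scored 0-3, so valid totals are 0..27.
    if not 0 <= total_score <= 27:
        raise ValueError("PHQ-9 total must be between 0 and 27")
    return total_score >= PHQ9_CUTOFF


if __name__ == "__main__":
    for score in (4, 9, 10, 21):
        print(score, phq9_reference_label(score))
```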
Key Results: The dataset used to train the AI model consisted of 10,442 samples, while an additional 4,456 samples were used in a validation set to assess accuracy.
- Sensitivity was 71%: the tool correctly identified depression in 71% of people who had it.
- Specificity was 74%: the tool correctly ruled out depression in 74% of people who didn't have it.
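The two figures above follow from standard definitions: sensitivity is true positives divided by everyone who has the condition, and specificity is true negatives divided by everyone who does not. The sketch below shows the arithmetic. The counts are made-up round numbers chosen only to reproduce the reported percentages; they are not the study's data, and the sketch does not model how cases flagged for further assessment were handled.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people with the condition whom the test flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people without the condition whom the test clears."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical counts: 100 people with depression, 100 without.
    tp, fn = 71, 29  # with depression: flagged / missed
    tn, fp = 74, 26  # without depression: cleared / wrongly flagged

    print(f"sensitivity = {sensitivity(tp, fn):.0%}")
    print(f"specificity = {specificity(tn, fp):.0%}")
```

Read the other way around, these hypothetical counts also show what the percentages leave out: 29 of 100 people with depression would be missed, and 26 of 100 without it would be flagged.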
Why it matters: The study's findings suggest that machine learning technology could serve as a complementary decision support tool for depression assessment.
Sources:
Mazur, A., et al. (2025) Evaluation of an AI-Based Voice Biomarker Tool to Detect Signals Consistent With Moderate to Severe Depression. The Annals of Family Medicine. doi.org/10.1370/afm.240091.