Prospective prediction of first onset of major depressive disorder in midlife using machine learning.

Bibliographic Details
Title: Prospective prediction of first onset of major depressive disorder in midlife using machine learning.
Authors: Massell, Johannes (AUTHOR), Preisig, Martin (AUTHOR), Miché, Marcel (AUTHOR), Strippoli, Marie-Pierre F. (AUTHOR), Pistis, Giorgio (AUTHOR), Lieb, Roselind (AUTHOR)
Source: Social Psychiatry & Psychiatric Epidemiology. Oct 2025, Vol. 60, Issue 10, p2387-2400. 14p.
Subjects: Receiver operating characteristic curves, Mental depression, Decision making, Random forest algorithms, Medical sciences
Abstract: Purpose: In this paper we leverage machine learning (ML) models to prospectively predict the first onset of Major Depressive Disorder (MDD), one of the most common and disabling mental health conditions. While such prediction models hold potential for enabling early interventions, few studies have applied ML approaches to this task, and those that have are heterogeneous in nature. Moreover, the clinical utility of these predictive models remains largely unexamined. Methods: Data stemmed from CoLaus|PsyCoLaus, a population-based cohort study. In total, 1350 participants aged 35–66 years without lifetime MDD at baseline participated in the physical and psychiatric baseline evaluations and at least one psychiatric follow-up evaluation. Models based on logistic regression, elastic net, random forests, and XGBoost were trained using an extensive array of psychosocial, environmental, biological, and genetic predictors. Discriminative performance, calibration, clinical utility, and individual predictor contributions were assessed using nested cross-validation. Results: Discriminative performance was comparable between models (areas under the precision-recall curve between 0.36 and 0.38; areas under the receiver operating characteristic curve between 0.65 and 0.68). Decision curve analysis suggested clinical utility of logistic regression, elastic net, and random forests for threshold probabilities between 10% and 40%. Across all models, neuroticism, sex, and age were the most important predictors. Conclusions: Although the prediction models achieved discriminative performance levels above chance, further refinement is necessary. The addition of biological and genetic predictors did not elevate performance markedly. Additional research seems warranted given the limited number and heterogeneous nature of existing studies, the burden associated with MDD, and the potential to improve overall outcomes for people at risk for MDD. [ABSTRACT FROM AUTHOR]
Copyright of Social Psychiatry & Psychiatric Epidemiology is the property of Springer Nature and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Psychology and Behavioral Sciences Collection
ISSN: 0933-7954
DOI: 10.1007/s00127-025-02942-z