Abstract
BACKGROUND: Depression is a major cause of disability worldwide. Recent data suggest that, in industrialised countries, the prevalence of depression peaks in middle age. Identifying factors predictive of future depressive episodes is crucial for developing prevention strategies for this age group.
AIMS: We aimed to predict future depression in middle-aged adults with no previous psychiatric history.
METHOD: We used a data-driven machine-learning methodology to predict a diagnosis of depression 1 year or more after a comprehensive baseline assessment. Our dataset comprised middle-aged UK Biobank participants (N = 245 036) with no psychiatric history.
RESULTS: Overall, 2.18% of the study population developed a depressive episode at least 1 year following baseline. Predictions based on a single mental health questionnaire yielded an area under the receiver operating characteristic curve of 0.66, and a predictive model combining the results of 100 UK Biobank questionnaires and measurements improved this to 0.79. Our findings were robust to demographic variations (place of birth, gender) and variations in methods of depression assessment. Thus, machine-learning-based models predict diagnoses of depression best when they can draw on multiple features.
CONCLUSIONS: Machine-learning approaches show potential for identifying clinically relevant predictors of depression. Specifically, using a relatively small number of features, we can identify, with moderate success, people with no recorded psychiatric history who are at risk of depression. More work is required to improve these models and evaluate their cost-effectiveness before integrating them into the clinical workflow.