Abstract
BACKGROUND: Many metabolomics studies of depression have been performed, but their small scale has limited them. A comprehensive in silico analysis of global metabolite levels in large populations could provide robust insights into the pathological mechanisms underlying depression and identify candidate clinical biomarkers.
METHODS: Depression-associated metabolomics was studied in 2 datasets from the UK Biobank database: participants with lifetime depression (N = 123,459) and participants with current depression (N = 94,921). The Whitehall II cohort (N = 4744) was used for external validation. CatBoost machine learning was used for modeling, and Shapley additive explanations were used to interpret the model. Model performance was evaluated with fivefold cross-validation: the model was trained on 3 of the 5 folds, with 1 of the remaining folds used for validation and the other for testing. Diagnostic performance was assessed using the area under the receiver operating characteristic curve.
RESULTS: Across the lifetime depression and current depression datasets and the sex-specific analyses, 24 metabolic biomarkers significantly associated with depression were identified, 12 of which overlapped between the 2 datasets. Adding metabolic features slightly improved the performance of a diagnostic model based on traditional (nonmetabolomics) risk factors alone (lifetime depression: area under the curve 0.655 vs. 0.658 with metabolomics; current depression: area under the curve 0.711 vs. 0.716 with metabolomics).
CONCLUSIONS: The machine learning model identified 24 metabolic biomarkers associated with depression. If validated, metabolic biomarkers may have future clinical applications as supplementary information to guide early and population-based depression detection.
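
ILLUSTRATIVE SKETCH: The evaluation scheme described in METHODS (fivefold partition with 3 folds for training, 1 for validation, and 1 for testing, scored by area under the receiver operating characteristic curve) can be outlined as follows. This is a minimal sketch under stated assumptions, not the study's code: the function names `rotating_splits` and `roc_auc` are hypothetical, the choice of the next fold in rotation as the validation fold is an assumption (the abstract does not say how folds were assigned), and the CatBoost model and Shapley analysis are omitted.

```python
import random


def rotating_splits(n_samples, n_folds=5, seed=0):
    """Yield (train, val, test) index lists for each of n_folds rounds.

    Assumption: fold k is the test fold and fold k+1 (wrapping around)
    is the validation fold; the remaining folds form the training set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into n_folds roughly equal folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        val_k = (k + 1) % n_folds
        train = [i for j, fold in enumerate(folds)
                 if j not in (k, val_k) for i in fold]
        yield train, folds[val_k], folds[k]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5).
    Quadratic in sample size, so suitable for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In each round, a model would be fitted on `train`, tuned or early-stopped on `val`, and scored with `roc_auc` on `test`; averaging the 5 test scores gives the cross-validated estimate.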