Major depressive disorder (MDD) has been the subject of many neuroimaging case-control classification studies. Although some studies report accuracies ≥80%, most have investigated relatively small samples of clinically ascertained, currently symptomatic cases, and did not attempt replication in larger samples. Here, we first aimed to replicate previously reported classification accuracies in a small, well-phenotyped community-based group of current MDD cases with clinical interview-based diagnoses (from the STratifying Resilience and Depression Longitudinally cohort, STRADL). We performed a set of exploratory predictive classification analyses with measures related to brain morphometry and white matter integrity. We applied three classifier types (SVM, penalised logistic regression, or decision tree), each with or without optimisation and with or without feature selection. We then determined whether similar accuracies could be replicated in a larger independent population-based sample with self-reported current depression (UK Biobank cohort). Additional analyses extended to lifetime MDD diagnoses: remitted MDD in STRADL and lifetime-experienced MDD in UK Biobank. The highest cross-validation accuracy (75%) was achieved in the initial current MDD sample with a decision tree classifier and cortical surface area features. The most frequently selected decision tree split variables included surface areas of bilateral caudal anterior cingulate, left lingual gyrus, left superior frontal, right precentral and paracentral regions. High accuracy was not achieved in the larger samples with self-reported current depression (53.73%), with remitted MDD (57.48%), or with lifetime-experienced MDD (52.68–60.29%). Our results indicate that high predictive classification accuracies may not immediately translate to larger samples with broader criteria for depression, and may not be robust across different classification approaches.
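To make the notion of cross-validation accuracy concrete, the following is a minimal sketch of how a held-out accuracy estimate for a case-control classifier can be computed. It is not the pipeline used in the study: a one-split decision tree (a stump) stands in for the decision tree classifier, the data are synthetic, and all function names (`fit_stump`, `predict_stump`, `stratified_folds`, `cross_val_accuracy`) are hypothetical and introduced only for illustration.

```python
import random


def fit_stump(X, y):
    """Fit a one-split decision tree (stump) on binary labels 0/1.

    Tries every feature and every midpoint between consecutive observed
    values, and keeps the split with the fewest training errors.
    Assumes at least one feature takes more than one value.
    """
    best = None  # (errors, feature index, threshold, label predicted above threshold)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
        for t in thresholds:
            for label_above in (0, 1):
                errors = sum(
                    (label_above if row[j] > t else 1 - label_above) != target
                    for row, target in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, label_above)
    return best[1:]


def predict_stump(stump, row):
    """Predict 0 or 1 for a single feature vector."""
    j, t, label_above = stump
    return label_above if row[j] > t else 1 - label_above


def stratified_folds(y, k, rng):
    """Split sample indices into k folds with a similar case/control ratio."""
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def cross_val_accuracy(X, y, k=5, seed=0):
    """Pooled held-out accuracy: each sample is predicted by a model
    that was fitted without it."""
    rng = random.Random(seed)
    correct = 0
    for test_idx in stratified_folds(y, k, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        stump = fit_stump([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct += sum(predict_stump(stump, X[i]) == y[i] for i in test_idx)
    return correct / len(y)


if __name__ == "__main__":
    # Synthetic example: 30 controls and 30 cases, one weakly informative
    # feature followed by three pure-noise features.
    rng = random.Random(42)
    y = [0] * 30 + [1] * 30
    X = [[rng.gauss(0.8 * label, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(3)]
         for label in y]
    print(f"cross-validated accuracy: {cross_val_accuracy(X, y):.2%}")
```

Because the stump is always fitted on the training folds only, the reported figure reflects performance on unseen participants within the same sample. It says nothing by itself about performance in an independent cohort, which is the comparison the replication analyses address.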