Abstract
There is increasing interest in using data-driven unsupervised methods to identify structural underpinnings of common mental illnesses, including major depressive disorder (MDD) and associated traits such as cognition. However, studies are often limited to severe clinical cases with small sample sizes, and most do not include replication. Here, we examine two relatively large samples with structural magnetic resonance imaging (MRI), measures of lifetime MDD and cognitive variables: Generation Scotland (GS subsample, N = 980) and UK Biobank (UKB, N = 8,900), for discovery and replication, using an exploratory approach. Regional measures of FreeSurfer-derived cortical thickness (CT), cortical surface area (CSA), cortical volume (CV) and subcortical volume (subCV) were input into a clustering process, controlling for common covariates. The main analysis steps involved constructing participant K-nearest neighbour graphs and partitioning these graphs with Markov stability to determine the optimal clustering of participants. The resulting clusters were (1) checked for replication in an independent cohort and (2) tested for associations with depression status and cognitive measures. Participants separated into two clusters based on structural brain measurements in the GS subsample, with large Cohen's d effect sizes between clusters in higher-order cortical regions commonly associated with executive function and decision making. Clustering was replicated in the UKB sample, with high correlations of cluster effect sizes for CT, CSA, CV and subCV between cohorts across regions. The identified clusters were not significantly different with respect to MDD case-control status in either cohort (GS subsample: pFDR = .2239-.6585; UKB: pFDR = .2003-.7690). Significant differences in general cognitive ability were, however, found between the clusters in both datasets for CSA, CV and subCV (GS subsample: d = 0.2529-0.3490, pFDR < .005; UKB: d = 0.0868-0.1070, pFDR < .005).
Our results suggest that there are replicable natural groupings of participants based on cortical and subcortical brain measures, which may be related to differences in cognitive performance, but not to MDD case-control status.
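The analysis steps summarised above (a participant K-nearest neighbour graph, a graph partition, and Cohen's d between the resulting clusters) can be illustrated with a minimal NumPy sketch. It is not the study's pipeline: the function names are hypothetical, Euclidean distance and an unweighted graph are assumptions, and a simple spectral bipartition is swapped in as a stand-in for Markov stability, which is a different and more general method.

```python
import numpy as np


def knn_graph(X, k):
    """Symmetric, unweighted k-nearest-neighbour adjacency matrix.

    X has one row per participant and one column per regional measure.
    Euclidean distance is an assumption of this sketch.
    """
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)          # pairwise squared distances
    np.fill_diagonal(d2, np.inf)           # a participant is not its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :k]   # indices of the k closest participants
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)              # keep an edge if either end selected it


def spectral_bipartition(A):
    """Two-way split by the sign of the Fiedler vector.

    Stand-in for Markov stability (NOT the same algorithm).
    Assumes the graph is connected and every node has at least one edge.
    """
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalised Laplacian: I - D^{-1/2} A D^{-1/2}
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)            # eigenvalues in ascending order
    return (vecs[:, 1] > 0).astype(int)    # second-smallest eigenvector


def cohens_d(a, b):
    """Cohen's d between two groups, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)
```

Given a covariate-adjusted feature matrix `X`, one would call `labels = spectral_bipartition(knn_graph(X, k))` and then `cohens_d(X[labels == 0, j], X[labels == 1, j])` for each regional measure `j`. The choice of `k` is left open here, and which cluster receives label 0 or 1 is arbitrary.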