Abstract
Liu et al. (2025) analyzed UK Biobank data, using Principal Component Analysis (PCA) to identify lipid patterns associated with depression and bipolar disorder. They reported that the first principal component (PC1), reflecting Apolipoprotein B (ApoB), cholesterol, and low-density lipoprotein cholesterol (LDL-C), showed a protective effect against depression. However, their methodological approach warrants discussion. PCA is a linear dimensionality reduction technique, yet the authors themselves noted nonlinear relationships between lipid profiles and mood disorder risk, which conflicts with PCA's inherent linearity assumption. Applying a linear method such as PCA to nonlinear data can introduce distortion, systematic bias, and underfitting, so the true complexity of the data is not captured. By forcing distinct biological features into a single linear combination, PC1 may have diluted important signals and obscured genuine associations. Future research could obtain more robust results by complementing PCA with unsupervised learning techniques such as Feature Agglomeration (FA) and Highly Variable Gene Selection (HVGS). In addition, rank-based nonparametric measures such as Spearman's rho or Kendall's tau would be beneficial. These measures detect monotonic relationships without assuming linearity, so they can capture nonlinear associations that remain monotonic and can enhance interpretability in translational biomarker research.
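The distinction between linear and monotonic association can be illustrated with a minimal sketch. It uses synthetic values only, not UK Biobank data, and the helper names (`pearson`, `average_ranks`, `spearman`, `kendall_tau_a`) are hypothetical, written here in plain Python for illustration. It contrasts a monotonic but nonlinear relationship with a U-shaped one, under the assumption that Kendall's tau is computed in its simple tau-a form, without tie correction.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs, no tie correction."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)

# Synthetic illustration only (not real lipid measurements).
x = list(range(1, 11))
y_monotonic = [math.exp(v) for v in x]    # nonlinear but strictly increasing
y_u_shaped = [(v - 5.5) ** 2 for v in x]  # nonlinear and non-monotonic

for label, y in [("monotonic", y_monotonic), ("U-shaped", y_u_shaped)]:
    print(label, round(pearson(x, y), 3), round(spearman(x, y), 3),
          round(kendall_tau_a(x, y), 3))
```

In the monotonic case, both rank-based measures reach their maximum while the Pearson coefficient stays well below 1. In the U-shaped case, all three coefficients are approximately zero. This suggests that rank correlations relax the linearity assumption but still presuppose monotonicity, so a non-monotonic lipid-risk relationship would need other tools.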