Abstract
Brain age is an estimate of chronological age obtained from T1-weighted magnetic resonance images (T1w MRI), and it serves as a straightforward diagnostic biomarker of brain aging and associated diseases. While the current best accuracy of brain age predictions on T1w MRIs of healthy subjects ranges from two to three years, comparing results across studies is challenging due to differences in the datasets, T1w preprocessing pipelines, and evaluation protocols used. This paper investigates the impact of T1w image preprocessing on the performance of four deep learning brain age models from recent literature. Four preprocessing pipelines, which differed in registration transform, grayscale correction, and software implementation, were evaluated. The results showed that the choice of software or preprocessing steps could significantly affect the prediction error, with a maximum increase of 0.75 years in mean absolute error (MAE) for the same model and dataset. While grayscale correction had no significant impact on MAE, using affine rather than rigid registration to a brain atlas yielded a statistically significant improvement in MAE. Models trained on 3D images with isotropic 1 mm³ resolution were less sensitive to variations in T1w preprocessing than 2D models or models trained on downsampled 3D images. Our findings indicate that extensive T1w preprocessing improves MAE, especially when predicting on a new dataset. This runs counter to the prevailing literature, which suggests that models trained on minimally preprocessed T1w scans are better suited for age predictions on MRIs from unseen scanners. We demonstrate that, irrespective of the model or T1w preprocessing used during training, some form of offset correction is essential for the model's performance to generalize to datasets from unseen sites, whether or not those datasets have undergone the same T1w preprocessing as the training set.</p>
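As an illustration of the two quantities the abstract relies on, the following minimal sketch computes MAE and applies a simple offset correction. It assumes the offset is the mean signed prediction error estimated on a small set of subjects with known ages from the new site; the abstract does not specify which form of offset correction was used, so this is only one plausible variant. The function names and the toy ages are hypothetical.

```python
def mean_absolute_error(age_true, age_pred):
    """Mean absolute difference between predicted and chronological age (years)."""
    return sum(abs(p - t) for t, p in zip(age_true, age_pred)) / len(age_true)


def estimate_offset(age_true, age_pred):
    """Mean signed error (predicted minus true) on a calibration subset.

    Assumption: a single additive offset per site; other corrections
    (e.g. age-dependent linear ones) are not covered by this sketch.
    """
    return sum(p - t for t, p in zip(age_true, age_pred)) / len(age_true)


def apply_offset(age_pred, offset):
    """Subtract the estimated site offset from every prediction."""
    return [p - offset for p in age_pred]


# Toy data for a hypothetical unseen site: the model overestimates age.
calib_true, calib_pred = [30.0, 45.0], [34.0, 50.0]
test_true, test_pred = [60.0, 72.0], [63.0, 76.0]

offset = estimate_offset(calib_true, calib_pred)
corrected = apply_offset(test_pred, offset)

mae_before = mean_absolute_error(test_true, test_pred)
mae_after = mean_absolute_error(test_true, corrected)
print(f"offset={offset:.2f}  MAE before={mae_before:.2f}  MAE after={mae_after:.2f}")
```

In this toy case the systematic overestimation is largely removed, which is the kind of effect a site-level offset correction is meant to capture; real data would also contain non-systematic error that no constant offset can remove.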