Abstract
Polygenic risk scoring (PRS) holds promise for improving disease prediction and medical treatment by evaluating an individual's genetic susceptibility across multiple genetic variants. However, current PRS calculation methods often excel only for specific diseases and populations, with no single approach consistently outperforming the others across all contexts. Furthermore, these methods frequently overlook non-genetic factors, such as lifestyle, that also affect disease risk. We introduce an unsupervised Deep Belief Network (DBN) to aggregate PRS generated by various methods, achieving performance comparable to the Super Learner method, a supervised ensemble approach that combines predictions from multiple methods to improve outcomes. Unlike supervised methods, the DBN does not require training data and can directly ensemble the available PRS. Remarkably, on small-scale datasets, the DBN outperforms the Super Learner. Additionally, we present the DBNX model, which integrates PRS with non-genetic factors using a combination of DBN and XGBoost. DBNX produces a Composite Risk Score (CRS) that incorporates information from both PRS and non-genetic factors. In our experiments using the U.K. Biobank (UKBB) dataset across four diseases, DBNX demonstrated superior performance compared to other commonly used ensemble methods.</p>