Notes
Motivation: Transcriptome-wide association studies (TWAS) have successfully facilitated the discovery of novel genetic risk loci for many complex traits, including late-onset Alzheimer's disease (AD). However, most existing TWAS methods rely only on gene expression and ignore epigenetic modification (i.e., DNA methylation) and functional regulatory information (i.e., enhancer-promoter interactions), both of which contribute significantly to the genetic basis of AD.
Results: This motivates us to develop a novel gene-level association testing method that integrates genetically regulated DNA methylation and enhancer-target gene pairs with genome-wide association study (GWAS) summary results. Through simulations, we show that our approach, referred to as the CMO (cross methylome omnibus) test, yielded well-controlled type I error rates and achieved much higher statistical power than competing methods under a wide range of scenarios. Furthermore, compared with TWAS, CMO identified an average of 124% more associations when analyzing several brain imaging-related GWAS results. By analyzing the largest AD GWAS to date, with 71,880 cases and 383,378 controls, CMO identified six novel loci for AD that competing methods had missed.
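The abstract does not say how the omnibus step pools evidence across models. As a rough illustration only, the sketch below assumes the per-model p-values for one gene are merged with a Cauchy combination rule, a common choice for omnibus tests. The function name `cauchy_combine` and the example p-values are hypothetical and are not taken from the CMO software.

```python
import math

def cauchy_combine(p_values, weights=None):
    """Merge p-values with a Cauchy combination rule.

    Assumption: this is a generic omnibus rule chosen for illustration;
    the source does not state which rule the CMO test uses.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        # Equal weights when nothing else is known about the models.
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")

    # Map each p-value to a standard Cauchy variate and take the weighted mean.
    total = sum(weights)
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, p_values)) / total

    # Convert the statistic back to a p-value via the Cauchy upper tail.
    return 0.5 - math.atan(stat) / math.pi

# Hypothetical p-values for one gene from three methylation-based models.
gene_p = cauchy_combine([0.002, 0.04, 0.6])
```

With a single input the rule returns that p-value unchanged, and one very small p-value dominates the combined result, which is the behavior usually wanted when only one of several models captures the true signal.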
Application 48240
Integrative analysis of UK Biobank and other genetic and genomic datasets for complex disease detection and prevention
Scientific rationale: Although understanding how DNA sequences affect disease risk is a central problem in medicine, knowledge of the genetic basis of complex diseases remains limited. Integrative analysis of multiple genetic and genomic datasets has proven to be a useful way to gain new insights into the genetic mechanisms of complex traits. While appealing, the techniques used for integrative analysis (especially those tailored to UK Biobank data) are still primitive, and new statistical methods are urgently needed.
Aims: In this proposal, we will develop methods that integrate UK Biobank data with other genetic and genomic datasets. Specifically, we will pursue three related sub-aims. First, we will develop a deep learning/machine learning framework to improve disease risk prediction accuracy. Second, we will propose a new method to detect how genes interact with environmental factors such as smoking status. Third, we will propose new methods to identify and prioritize putative causal genes that have a direct effect on complex diseases. Finally, we will release public-domain software and online manuals.
Project duration: The project will run for at most 36 months.
Public health impact: The proposed research may identify putative causal genes for complex diseases, significant gene-environment interactions, and a new way to predict the risk of complex diseases. These findings will help us gain insights into the genetic mechanisms of complex diseases and develop new prevention and diagnosis methods. The proposed research is in line with UK Biobank's strong interest in improving the prevention and diagnosis of complex diseases, including depression, schizophrenia, and Alzheimer's disease.
Lead investigator: Professor Chong Wu
Lead institution: University of Texas (MD Anderson)