We have developed an improved version of the BOLT-LMM software, which identifies genetic variants associated with health-related traits (such as disease status). This software provides large gains in statistical power compared to standard techniques based on linear regression, allowing researchers to discover many more genetic associations. We have released the software along with association test results for a broad set of heritable traits.
The association analyses were performed using BOLT-LMM v2.3, with age, age^2, sex, genotyping array, assessment center, and 20 principal components included as covariates. Genomic control was not applied to the P-values or standard errors.
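Because genomic control was not applied, readers of the released statistics may want to check the inflation factor themselves. The following is a minimal sketch, not part of the BOLT-LMM software, of how the genomic control factor (lambda_GC) is conventionally computed from a list of P-values, assuming 1-degree-of-freedom chi-square test statistics. The function names `chi2_from_pvalue` and `lambda_gc` are hypothetical and chosen for illustration only.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom,
# i.e. the squared 75th-percentile z-score (about 0.455).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def chi2_from_pvalue(p):
    """Convert a two-sided P-value in (0, 1] to a 1-df chi-square statistic."""
    # Using the lower tail (p / 2) keeps precision for very small P-values.
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    return z * z


def lambda_gc(pvalues):
    """Genomic control factor: median observed chi-square / expected median."""
    return median(chi2_from_pvalue(p) for p in pvalues) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Hypothetical P-values, evenly spread as expected under the null.
    null_like = [(i + 0.5) / 1001 for i in range(1001)]
    print(lambda_gc(null_like))
```

A value near 1 indicates no inflation of the median test statistic. For highly polygenic traits in large samples, values above 1 can reflect true signal rather than confounding, so lambda_GC alone does not show that a correction is needed.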
Po-Ru Loh, Gleb Kichaev, Steven Gazal, Armin P. Schoech, Alkes L. Price. Mixed model association for biobank-scale data sets. bioRxiv 194944.
Application of fast mixed model association and principal component analysis methods
We aim to identify genetic loci associated with specific health-related outcomes. More precisely, we will apply a new, more powerful statistical method (BOLT-LMM) to outcomes shown to be heritable in previous genome-wide association studies, including direct health outcomes (disease status) as well as heritable quantitative measurements, such as height, BMI, and lipid levels, that are associated with some health outcomes. We will investigate only genetic effects and will use environmental exposure data only as covariates in our analyses. This project is restricted to self-reported outcomes and traits measured at baseline. Discovering associated loci that existing methods could not detect may lead to actionable drug targets and is in the public interest.
We will analyze each outcome independently: for each disease code, we will compute association statistics between all genetic markers and that disease code, independent of other outcomes. BOLT-LMM applies a linear mixed model that analyzes all genetic markers simultaneously, which gives a more powerful analysis than has previously been available and is expected to detect associations that other methods miss. In addition to the BOLT-LMM analysis, we will compute association statistics using other standard methods for comparison.
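To make the comparison baseline concrete, here is a minimal sketch of a standard per-marker linear regression association test, the kind of method the mixed model is compared against. It is not BOLT-LMM and is not the project's actual pipeline: it omits covariates, tests each marker on its own, and uses a normal approximation for the P-value, which is reasonable only for large samples. The function names and the toy data are hypothetical.

```python
import math
from statistics import fmean


def marginal_assoc(genotypes, phenotype):
    """Simple linear regression of a phenotype on one marker.

    Returns (beta, standard error, two-sided P-value).
    """
    n = len(phenotype)
    g_mean, y_mean = fmean(genotypes), fmean(phenotype)
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    sxy = sum((g - g_mean) * (y - y_mean) for g, y in zip(genotypes, phenotype))
    beta = sxy / sxx
    # Residual sum of squares after fitting intercept and slope.
    rss = sum(
        (y - y_mean - beta * (g - g_mean)) ** 2
        for g, y in zip(genotypes, phenotype)
    )
    se = math.sqrt(rss / (n - 2) / sxx)
    z = beta / se
    # Two-sided P-value under a standard normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


def scan(markers, phenotype):
    """Test every marker independently against one outcome."""
    return {name: marginal_assoc(g, phenotype) for name, g in markers.items()}


if __name__ == "__main__":
    # Hypothetical toy data: allele counts (0/1/2) for two markers.
    markers = {
        "snp_a": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
        "snp_b": [0, 0, 0, 1, 1, 1, 2, 2, 2, 1, 1, 1],
    }
    phenotype = [0.2, 0.4, 1.1, -0.2, 0.6, 0.9, 0.1, 0.7, 0.8, -0.1, 0.4, 1.2]
    for name, (beta, se, p) in scan(markers, phenotype).items():
        print(name, round(beta, 4), round(se, 4), p)
```

The mixed model differs from this baseline by modeling the effects of all markers jointly while testing each one, which is the source of the power gain described above.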
Lead investigator: Alkes Price
Lead institution: Harvard School of Public Health