Abstract
The contribution of gene-by-environment (GxE) interactions to many human traits and diseases is poorly characterized. We propose a Bayesian whole-genome regression model for joint modeling of main genetic effects and GxE interactions in large-scale datasets, such as the UK Biobank, where many environmental variables have been measured. The method, called LEMMA (Linear Environment Mixed Model Analysis), estimates a linear combination of environmental variables, called an environmental score (ES), that interacts with genetic markers throughout the genome. The ES provides a readily interpretable way to examine the combined effect of many environmental variables. It can be used both to estimate the proportion of phenotypic variance attributable to GxE effects and to test for GxE effects at genetic variants across the genome. GxE effects can induce heteroskedasticity in quantitative traits, and LEMMA accounts for this by using robust standard error estimates when testing for GxE effects. Applying LEMMA to body mass index, systolic blood pressure, diastolic blood pressure, and pulse pressure in the UK Biobank, we estimate that 9.3%, 3.9%, 1.6%, and 12.5% of phenotypic variance, respectively, is explained by GxE interactions and that low-frequency variants explain most of this variance. We also identify three loci that interact with the estimated environmental scores (-log10 p > 7.3).
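
To make the role of the environmental score concrete, the following is a minimal simulation sketch, not the LEMMA implementation. It assumes a phenotype of the form y = G·beta + ES × (G·gamma) + noise, where ES = E·w is a linear combination of environmental variables. The names `G`, `E`, `w`, `beta`, `gamma`, the sample sizes, and the effect-size scales are all illustrative assumptions, and random normal draws stand in for standardized genotypes.

```python
import numpy as np

def environmental_score(E, w):
    """Return one score per individual: a linear combination of the
    environmental variables in E, weighted by w (assumed form)."""
    return E @ w

def simulate_phenotype(G, E, w, beta, gamma, noise_sd, rng):
    """Simulate y = G beta + ES * (G gamma) + noise (assumed model form).

    Returns the phenotype plus its main-effect and GxE components.
    """
    es = environmental_score(E, w)
    main = G @ beta
    # Each marker's interaction effect is scaled by the individual's ES.
    gxe = es * (G @ gamma)
    noise = rng.normal(0.0, noise_sd, size=G.shape[0])
    return main + gxe + noise, main, gxe

rng = np.random.default_rng(0)
n, m, n_env = 2000, 50, 4            # individuals, markers, env. variables
G = rng.standard_normal((n, m))      # stand-in for standardized genotypes
E = rng.standard_normal((n, n_env))  # standardized environmental variables
w = np.array([0.5, -0.3, 0.2, 0.0])  # hypothetical ES weights
beta = rng.normal(0.0, 0.1, m)       # hypothetical main effects
gamma = rng.normal(0.0, 0.05, m)     # hypothetical interaction effects

y, main, gxe = simulate_phenotype(G, E, w, beta, gamma, 1.0, rng)

# Rough share of phenotypic variance carried by the GxE component;
# this ignores sample covariance between the components.
prop_gxe = gxe.var() / y.var()
print(f"approximate GxE variance share: {prop_gxe:.3f}")
```

In this sketch the weights `w` are fixed by hand. In the method described above they are unknown and estimated jointly with the genetic effects, which is what lets a single score summarize many environmental variables.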