About
Our research using the UK Biobank survey, health record, and genomic data will focus on two broad areas: analysis of genotype-environment interactions, and statistical approaches to the sub-classification of disease. The Biobank will allow us to evaluate how the effects of genetic factors differ across diverse lifestyles. For example, high caffeine consumption may interact with genetic risk of obesity to raise the likelihood of heart attack. In the second area, thousands of people with coronary disease may be subdivided into smaller groups whose shared features would not be detectable without the scale of the Biobank.
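As a rough illustration of what a genotype-environment interaction means statistically, the sketch below compares the odds ratio for a genetic risk factor among people with high and low exposure. The ratio of those two odds ratios equals the multiplicative interaction term of a logistic model with a binary genotype and a binary exposure. All counts, names, and the binary coding are invented for illustration; they are not UK Biobank data and are not necessarily the approach this project will take.

```python
from math import exp, log, sqrt


def odds_ratio(cases_carrier, controls_carrier, cases_noncarrier, controls_noncarrier):
    """Odds ratio for disease, comparing risk-genotype carriers with non-carriers."""
    return (cases_carrier * controls_noncarrier) / (controls_carrier * cases_noncarrier)


def interaction_odds_ratio(counts):
    """Ratio of genotype odds ratios between exposure strata.

    counts[(g, e)] = (cases, controls), with g = 1 for carriers of the risk
    genotype and e = 1 for the high-exposure group. A ratio near 1 suggests
    no multiplicative interaction.
    """
    or_high = odds_ratio(*counts[(1, 1)], *counts[(0, 1)])
    or_low = odds_ratio(*counts[(1, 0)], *counts[(0, 0)])
    ratio = or_high / or_low
    # Woolf-type standard error of the log ratio: all eight cells contribute.
    se = sqrt(sum(1.0 / n for cell in counts.values() for n in cell))
    ci = (exp(log(ratio) - 1.96 * se), exp(log(ratio) + 1.96 * se))
    return or_high, or_low, ratio, ci


# Hypothetical counts (invented): (cases, controls) of heart attack per group.
toy_counts = {
    (0, 0): (100, 900),  # low genetic risk, low caffeine
    (1, 0): (120, 880),  # high genetic risk, low caffeine
    (0, 1): (110, 890),  # low genetic risk, high caffeine
    (1, 1): (200, 800),  # high genetic risk, high caffeine
}

or_high, or_low, ratio, ci = interaction_odds_ratio(toy_counts)
print(f"OR (high caffeine): {or_high:.2f}")
print(f"OR (low caffeine):  {or_low:.2f}")
print(f"Interaction ratio:  {ratio:.2f}, approx. 95% CI {ci[0]:.2f} to {ci[1]:.2f}")
```

With only a thousand people per group the confidence interval on the interaction ratio is wide, which is one reason very large cohorts are needed to estimate such effects reliably.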
We will also test whether subsets of samples clustered on clinical information are enriched for different genetic ancestries.
People vary in their risk of developing a wide range of diseases because of the joint influence of genetic variation and environmental differences, including lifestyle choices. Our research therefore has implications for the discovery of genetic factors, for predicting the course of disease in individuals, and for informing public policy decisions. Sub-classifying groups of patients who share etiological factors has the potential to determine which treatments are most effective for which patients, and to identify high-risk groups of healthy adults for whom simple interventions can prevent illness.
Several advanced statistical approaches will be used to detect interactions between genetic and environmental factors as sources of disease risk. The underlying idea is that combinations of genes and behaviors can reinforce or cancel one another, and very large datasets are needed to evaluate whether such effects are repeatable. In addition, we will use tensor factorization, a technique for decomposing multi-dimensional data arrays, to combine health record and genotype data and discover novel combinations of variables shared by small sets of patients.
The full cohort will be included in our studies.
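The tensor factorization idea mentioned above can be sketched in a few lines. The summary does not say which factorization will be used, so the example below is only an assumption: a rank-1 CP (canonical polyadic) approximation fitted by alternating power iteration on a tiny invented tensor indexed by patient, clinical feature, and genetic variant. The function names and the toy values are hypothetical; a real analysis would use many components, far larger data, and most likely a dedicated library.

```python
from math import sqrt


def normalize(vector):
    """Return the unit-length version of a vector together with its norm."""
    norm = sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector], norm


def rank_one_cp(tensor, n_iter=50):
    """Rank-1 CP approximation of a 3-way tensor by alternating power iteration.

    tensor[i][j][k] is the value for patient i, clinical feature j, variant k.
    Assumes a non-zero, non-negative tensor so that no update collapses to zero.
    Returns (weight, patient_loadings, feature_loadings, variant_loadings).
    """
    n_i, n_j, n_k = len(tensor), len(tensor[0]), len(tensor[0][0])
    b, c = [1.0] * n_j, [1.0] * n_k
    a, weight = [1.0] * n_i, 0.0
    for _step in range(n_iter):
        # Update each mode in turn by contracting the tensor with the other two.
        a, _ = normalize([
            sum(tensor[i][j][k] * b[j] * c[k] for j in range(n_j) for k in range(n_k))
            for i in range(n_i)
        ])
        b, _ = normalize([
            sum(tensor[i][j][k] * a[i] * c[k] for i in range(n_i) for k in range(n_k))
            for j in range(n_j)
        ])
        c, weight = normalize([
            sum(tensor[i][j][k] * a[i] * b[j] for i in range(n_i) for j in range(n_j))
            for k in range(n_k)
        ])
    return weight, a, b, c


# Invented toy data: patients 0 and 1 share a pattern involving clinical
# features 0 and 2 and genetic variant 1; patients 2 and 3 do not.
patients = [1.0, 1.0, 0.0, 0.0]
features = [2.0, 0.0, 1.0]
variants = [0.0, 3.0]
toy_tensor = [[[p * f * v for v in variants] for f in features] for p in patients]

weight, patient_load, feature_load, variant_load = rank_one_cp(toy_tensor)
print("patient loadings:", [round(x, 3) for x in patient_load])
print("feature loadings:", [round(x, 3) for x in feature_load])
print("variant loadings:", [round(x, 3) for x in variant_load])
```

The non-zero patient loadings pick out the small set of patients who share the pattern, while the feature and variant loadings show which clinical and genetic variables define it.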