Return 3462

Application:	15326, Compressed Sensing and high-dimensional statistical methods in complex trait genomics
Title:	Accurate Genomic Prediction of Human Height
Size:	2.0 MB
Cost Tier:	1
Archived:	26 May 2021
Stability:	Complete
Personal:	No individual-level data

Notes

We construct genomic predictors for heritable but extremely complex human quantitative traits (height, heel bone density, and educational attainment) using modern methods in high-dimensional statistics (i.e., machine learning). The constructed predictors explain, respectively, ~40%, ~20%, and ~9% of total variance for the three traits, in data not used for training. For example, predicted heights correlate ~0.65 with actual heights; the actual heights of most individuals in the validation samples are within a few centimeters of the prediction. The proportion of variance explained for height is comparable to the estimated common SNP heritability from genome-wide complex trait analysis (GCTA), and seems to be close to its asymptotic value (i.e., as sample size goes to infinity), suggesting that we have captured most of the heritability attributable to common SNPs. Thus, our results close the gap between prediction R-squared and common SNP heritability. The ~20k activated SNPs in our height predictor reveal the genetic architecture of human height, at least for common variants. Our primary dataset is the UK Biobank cohort, comprising almost 500k genotyped individuals with multiple phenotypes. We also use other datasets and SNPs found in earlier genome-wide association studies (GWAS) for out-of-sample validation of our results.
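The two height figures quoted above (a correlation of ~0.65 and ~40% of variance explained) can be checked against each other with a little arithmetic. The sketch below is illustrative only: it assumes that "variance explained" can be approximated by the squared correlation between predicted and actual values, it uses synthetic Gaussian data rather than any UK Biobank data, and the function names (`pearson_r`, `variance_explained`, `simulate_correlation`) are hypothetical helpers introduced here, not part of the authors' method.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def variance_explained(r):
    """Assumption: variance explained is approximated by r squared."""
    return r ** 2


def simulate_correlation(target_r2, n, seed=0):
    """Synthetic check: build 'actual' values whose variance is a
    fraction target_r2 signal (the prediction) and the rest noise,
    then return the observed prediction/actual correlation."""
    rng = random.Random(seed)
    # Noise standard deviation chosen so that signal variance divided
    # by total variance equals target_r2 (signal variance is 1).
    noise_sd = math.sqrt((1.0 - target_r2) / target_r2)
    predicted = [rng.gauss(0.0, 1.0) for _ in range(n)]
    actual = [p + rng.gauss(0.0, noise_sd) for p in predicted]
    return pearson_r(predicted, actual)


# Correlation reported in the notes above for height.
r_reported = 0.65
print(f"r = {r_reported} implies variance explained of about "
      f"{variance_explained(r_reported):.2f}")

# Synthetic data only: a predictor capturing that share of variance
# should correlate with the outcome at roughly the reported level.
r_simulated = simulate_correlation(variance_explained(r_reported), n=20000)
print(f"simulated correlation: {r_simulated:.3f}")
```

Under that assumption, a correlation of 0.65 corresponds to roughly 42% of variance, which agrees with the ~40% quoted for height.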

Application 15326

Compressed Sensing and high-dimensional statistical methods in complex trait genomics

Our goal is to test new computational methods for determining the genetic architecture of complex traits, including highly heritable conditions such as Type 1 Diabetes, Alzheimer's disease, and others. The techniques we plan to use have been the subject of intense recent activity in fields such as optimization, signal processing, and machine learning, but have only just begun to be applied in genomics. The research will produce improved predictive models which, based on individual genomic data, identify individuals at high risk for certain diseases. It will also identify the many alleles associated with this risk. Early intervention with high-risk individuals may decrease rates of incidence and reduce health care costs. Elaboration of the underlying genetic architecture is important basic science and may lead to improved treatments (e.g., drug development). We wish to obtain access to genomic data and phenotype data relevant to highly heritable disease conditions (e.g., Type 1 Diabetes) as well as complex traits such as height, BMI, and cognitive ability. Advanced computational algorithms will be used to study the genetic architecture of these traits. Analysis will be performed on high-performance computing clusters. We would like access to the full cohort (SNP genotypes) and several relevant phenotypes.

Lead investigator:	Professor Stephen Hsu
Lead institution:	Michigan State University

3 related Returns

Return ID	App ID	Description	Archive Date
3460	15326	Genetic architecture of complex traits and disease risk predictors	26 May 2021
3459	15326	Genomic Prediction of 16 Complex Disease Risks Including Heart Attack, Diabetes, Breast and Prostate Cancer	26 May 2021
3461	15326	Sibling validation of polygenic risk scores and complex trait prediction	26 May 2021

1 Publication

Pub ID	Title	Author(s)	Year	Journal
3463	Accurate Genomic Prediction of Human Height	Louis Lello (+5)	2018	Genetics
