Application 60755

Title:	High dimensional modelling of complex traits in the presence of corrupted predictor variables
Lead Institution:	Jewish General Hospital
Principal investigator:	Dr Celia Greenwood

About

We are developing improved statistical methods for working with large datasets such as the UK Biobank. One goal of personalized medicine is to predict the likelihood that an individual will develop a disease, or to predict another trait such as bone mineral density (BMD). Challenges include missing data for some participants, inaccurate measurements, and biases in self-reported questionnaires. In this study we aim to build these complex models more reliably by explicitly accounting for missing data and for variables that are measured inaccurately. For example, exercise and diet are known to affect BMD, but these variables are often inaccurate because they are self-reported. To improve prediction, we will use equations and algorithms designed to adjust results for variables that are measured inaccurately, and we plan to develop and implement these methods so that they scale to datasets the size of the UK Biobank.

The project has three aims. (1) We will develop a new method that copes with groups of variables whose errors are of different types, and write software to perform predictions with this method; this first aim is restricted to continuous traits such as BMD. (2) We will extend this work to binary traits such as the presence or absence of a disease. (3) We will explore more complex models with interactions. The project is expected to last 2 years. The public health impact is indirect: these methods will make it possible to improve predictions for many different traits and diseases.

5 Publications

Pub ID	Title	Author(s)	Year	Journal
4365	Block coordinate descent algorithm improves variable selection and estimation in error-in-variables regression	Célia Escribe (+8)	2021	Genetic Epidemiology
7413	Capturing additional genetic risk from family history for improved polygenic risk prediction	Tianyuan Lu (+3)	2022	Communications Biology
11637	Development of risk prediction models for depression combining genetic and early life risk factors	Tianyuan Lu (+2)	2023	Frontiers in Neuroscience
10600	Genetic determinants of polygenic prediction accuracy within a population	Tianyuan Lu (+3)	2022	Genetics
9251	Identifying Rare Genetic Determinants for Improved Polygenic Risk Prediction of Bone Mineral Density and Fracture Risk	Tianyuan Lu (+4)	2023	Journal of Bone and Mineral Research
