About
In this proposal we focus on estimating the risk of developing a disease (e.g. breast cancer) over time, given a set of risk factors. Survival analysis is a branch of statistics for analysing the time until a pre-specified event, such as death or disease onset, occurs. It therefore provides a useful tool for this kind of prediction while accounting for competing risks (i.e., a person who dies before developing the disease can no longer develop it) and left truncation (i.e., only individuals who survived to at least age 40 are included in the UK Biobank dataset). Heritability summarises the proportion of the variance of the trait under study (e.g. disease status) that is due to genetic factors. The predictive performance of a model depends strongly on the heritability of the trait: for any given sample size, more accurate prediction is possible for more heritable traits, such as Crohn's disease and type 1 diabetes, than for less heritable traits, such as prostate cancer. Improving heritability estimation for various traits could provide insight into several open questions about missing heritability and lead to practical solutions for improving risk-prediction models.
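To make the competing-risk and left-truncation adjustments mentioned above concrete, the following is a minimal sketch of a nonparametric cumulative incidence estimate (in the style of the Aalen-Johansen estimator) with delayed entry. It is an illustration only, not the method proposed here: the `Record` type, the event coding (0 = censored, 1 = disease, 2 = death from another cause) and the toy ages are all assumptions made for the example. The resulting estimate is conditional on being disease-free and alive at the earliest entry age.

```python
from collections import namedtuple

# Hypothetical record layout: age at study entry, age at exit, and event code
# (0 = censored, 1 = disease of interest, 2 = competing event such as death).
Record = namedtuple("Record", ["entry", "exit", "event"])


def cumulative_incidence(records, cause=1):
    """Cumulative incidence of `cause` with left truncation and competing risks.

    Returns a list of (age, cumulative incidence) pairs, one per event age.
    """
    event_ages = sorted({r.exit for r in records if r.event != 0})
    event_free = 1.0   # probability of having had no event of any type so far
    incidence = 0.0    # cumulative incidence of the cause of interest
    curve = []
    for age in event_ages:
        # Left truncation: a person is at risk only after entering the study.
        risk_set = [r for r in records if r.entry < age <= r.exit]
        if not risk_set:
            continue
        n_at_risk = len(risk_set)
        d_cause = sum(1 for r in risk_set if r.exit == age and r.event == cause)
        d_any = sum(1 for r in risk_set if r.exit == age and r.event != 0)
        # Competing risks: the disease can only occur in people who are still
        # event-free, so the cause-specific hazard is weighted by event_free.
        incidence += event_free * d_cause / n_at_risk
        event_free *= 1.0 - d_any / n_at_risk
        curve.append((age, incidence))
    return curve


# Toy data (made up): the last person enters at age 52 and therefore does not
# count towards the risk set at age 50.
toy = [
    Record(entry=40, exit=50, event=1),
    Record(entry=40, exit=55, event=2),
    Record(entry=45, exit=60, event=1),
    Record(entry=52, exit=70, event=0),
]

for age, risk in cumulative_incidence(toy):
    print(f"age {age}: estimated cumulative incidence {risk:.3f}")
```

Ignoring delayed entry would place the fourth person in the risk set at age 50 and understate the hazard at that age; treating the competing death as ordinary censoring would instead overstate the disease risk at later ages.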
Finally, we will study complex, high-dimensional traits. Current methods follow a two-step approach: they first extract the relevant features from the high-dimensional trait data and then perform genome-wide association analysis on these features. In contrast, we will apply a joint modelling approach in which the most heritable features are extracted by correlating them with the genotype data. This will enable us to discover new heritable traits and may lead to changes in how relevant traits are defined from high-throughput digital trait data.