About
Recent work has shown that disease-associated variants are enriched in regulatory regions predicted by histone modifications and chromatin accessibility. This enrichment enables discovery of new disease genes and fine-mapping of causal variants, which can lead to novel therapeutic targets. However, current methods make strong assumptions and do not scale to large numbers of samples or high-resolution genotype data. We will develop new methods to perform large-scale sparse regression while relaxing these unrealistic assumptions, identifying novel biological pathways, target genes, upstream transcription factors, and regulatory mechanisms. These general methods will interrogate the genetic causes of common diseases and other complex traits, and we will apply them to all of the disease phenotypes available in UK Biobank, making use of the full cohort. Our methods will predict causal mechanisms that can be taken to experimental follow-up and eventually therapeutic development. To this end, we will develop a novel statistical model and computational methods to identify genetic variants associated with disease. The key insight of our method is that epigenomic data are associated with functional non-coding elements. We can use these elements to understand the transcriptional regulatory network connecting genetic variants, upstream regulators, and downstream target genes, and so connect disease-associated variants to specific biological mechanisms of action. We can then use the network to guide discovery of yet-uncharacterized genetic variants associated with disease and to predict the mechanisms by which they impact disease.