Publication 16688

Title:	Analytical and computational solution for the estimation of SNP-heritability in biobank-scale and distributed datasets
Journal:	PLOS Computational Biology
Published:	21 Oct 2025
Pubmed:	https://pubmed.ncbi.nlm.nih.gov/41118419/
DOI:	https://doi.org/10.1371/journal.pcbi.1013568

Abstract

For a complex trait, heritability (h²) measures the proportion of its variation that is genetically determined. Given the emergence of biobank-scale data, a more powerful method is needed to estimate h². Based on the framework of Haseman-Elston regression (RHE-reg), we integrate a fast randomization algorithm to estimate h², so that RHE-reg can handle biobank-scale data, such as UK Biobank (UKB), very efficiently. Furthermore, we present an analytical solution that balances the computational cost and the precision of the estimation, a property that is important when dealing with biobank-scale data. We investigated the performance of RHE-reg in simulated data and also applied it to 81 UKB quantitative traits; tested on UKB data of nearly 300,000 unrelated individuals, an estimation took about 4.5 hours on average when using 10 CPUs. We also extended RHE-reg to distributed datasets without compromising privacy. As shown in UKB and simulated data, RHE-reg was accurate in estimating h². The software for estimating SNP-heritability in biobank-scale data has been released.

8 Keywords

Algorithms
Biological Specimen Banks
Computational Biology
Computer Simulation
Humans
Models, Genetic
Polymorphism, Single Nucleotide
Quantitative Trait, Heritable

9 Authors

Guo-An Qi
Qi-Xin Zhang
Jingyu Kang
Tianyuan Li
Xiyun Xu
Zhe Zhang
Zhe Fan
Siyang Liu
Guo-Bo Chen

1 Application

Application ID	Title
41376	Statistical genetics methods for complex traits using large-scale genomic data
