Abstract
The UK Biobank genotyped about 500k participants using Applied Biosystems Axiom microarrays. Participants were subsequently sequenced by the UK Biobank Exome Sequencing Consortium. Axiom genotyping was highly accurate in comparison to sequencing results, for almost 100,000 variants both directly genotyped on the UK Biobank Axiom array and via whole exome sequencing. However, in a study using the exome sequencing results of the first 50k individuals as reference (truth), it was observed that the positive predictive value (PPV) decreased along with the number of heterozygous array calls per variant. We developed a novel addition to the genotyping algorithm, Rare Heterozygous Adjusted (RHA), to significantly improve PPV in variants with minor allele frequency below 0.01%. The improvement in PPV was roughly equal when comparing to the exome sequencing of 50k individuals, or to the more recent ~200k individuals. Sensitivity was higher in the 200k data. The improved calling algorithm, along with enhanced quality control of array probesets, significantly improved the positive predictive value and the sensitivity of array data, making it suitable for the detection of ultra-rare variants.</p>