Name: Targets within the initial 50k FE data release affected by non-alt-aware mapping (BED file)
The CRAMs in the initial UK Biobank release of 50k WES FE data were mapped to a version of the GRCh38 reference that did not include the reference alt file required for alt-aware mapping. As a result, reads in this release that map to both the primary assembly and an alternative contig have deflated, often zero, mapping qualities. Variant callers generally ignore reads with low mapping quality, so this error can lead to undercalling of variants in the affected regions. All UKB 50k FE WES data (CRAMs, gVCFs, and PLINK files) are therefore affected.

Until a corrected dataset is made available, this BED file describes the WES target regions impacted by the lack of alt-aware mapping, to facilitate continued analysis of the existing FE WES UKB data. The capture design for the UKB WES data comprises 204,829 targets (39.20 Mbp), of which 6,784 (1.27 Mbp) overlap the alt-derived regions described in the reference alt file. A further 770 targets (0.26 Mbp) showed changes in either the number of reads mapped to the target or the average read-mapping quality when test samples were compared between alt-aware and non-alt-aware mappings. The annotated BED file (xgen_plus_spikein_b38_alt_affected.bed) contains all 7,554 of these affected targets.

We recommend that these regions be excluded from any analysis of the FE data and that all researchers consider how this mapping error might affect any results derived from the FE data.
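As an illustration of the recommended exclusion, the following is a minimal sketch of how variant positions might be checked against the affected targets. It assumes the file follows the standard BED layout (chromosome, 0-based start, exclusive end, then optional annotation columns); the exact annotation columns of this file are not described here, so they are ignored. The function names `load_bed` and `is_affected` and the toy intervals are hypothetical and are not part of any UK Biobank tooling.

```python
import bisect
from collections import defaultdict


def load_bed(lines):
    """Build {chrom: (starts, ends)} from BED lines, merging overlapping intervals.

    Assumes standard BED columns: chrom, 0-based start, exclusive end.
    Any further (annotation) columns are ignored.
    """
    raw = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t") if "\t" in line else line.split()
        raw[fields[0]].append((int(fields[1]), int(fields[2])))

    index = {}
    for chrom, intervals in raw.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Overlapping or adjacent: extend the previous interval.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def is_affected(index, chrom, pos):
    """Return True if a 1-based position (VCF-style) lies in an affected target."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    pos0 = pos - 1  # convert to the 0-based coordinates used by BED
    i = bisect.bisect_right(starts, pos0) - 1
    return i >= 0 and pos0 < ends[i]


# Toy intervals for illustration only; these are NOT taken from the real file.
toy_bed = [
    "chr1\t100\t200\texample_annotation",
    "chr1\t150\t250\texample_annotation",
    "chr2\t10\t20\texample_annotation",
]
toy_index = load_bed(toy_bed)
```

To apply the same check to the real file, the lines could be read with `open("xgen_plus_spikein_b38_alt_affected.bed")` and passed to `load_bed`; variants for which `is_affected` returns True would then be dropped before analysis. The chromosome naming in the file (for example, whether names carry a `chr` prefix) should be confirmed against the variant data before filtering.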
This resource is not suitable for display in a web browser.
It can be downloaded or viewed using the link: xgen_plus_spikein_b38_alt_affected.bed
If you have wget available (typically on Linux systems), you can also obtain a copy using the command:
wget -nd biobank.ndph.ox.ac.uk/ukb/ukb/auxdata/xgen_plus_spikein_b38_alt_affected.bed