Name: Genotype data mapping
This file defines the order of the loci presented in per-chromosome downloadable genotype data. The file has 9 comma-separated columns:
- Chromosome: 1-22, 23=X, 24=Y, 25=Mitochondrial;
- Order within chromosome: a zero-based index of position in the dataset;
- Affymetrix ID: the unique name Affymetrix uses for the locus;
- RS ID: the standard RS ID where available (0 otherwise);
- The type of locus - see below;
- The start position, in base-pairs;
- Sequence for Allele A;
- Sequence for Allele B;
- Which allele is the reference: either A, B or X (if neither).
The type of locus is given by one of the following codes:
- 1 : Insertion on the subject sequence. A single nucleotide on the flanking sequence is replaced with a sequence of more than one nucleotide.
- 2 : True SNP. Exactly one nucleotide on the flanking sequence is replaced with exactly one nucleotide on the subject sequence.
- 3 : Deletion on the subject sequence. Part of the flanking sequence, which includes the SNP site, is deleted from the contig.
- 4 : Range insertion. A shorter part of the flanking sequence, containing the SNP site, is replaced with a longer subsequence on the subject. Neighbors are not adjacent to the declared SNP site.
- 5 : Range substitution. Part of the flanking sequence is replaced with another sequence of nucleotides having exactly the same length. Left and right flanking neighbors are not adjacent to the declared SNP site.
- 6 : Range deletion. Part of the flanking sequence, containing the SNP site, is replaced with a shorter subsequence on the subject. Neighbors are not adjacent to the declared SNP site.
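The type codes listed above can be held in a small lookup table when processing the file. The following is a minimal Python sketch, not part of the UK Biobank tooling; the names LOCUS_TYPES and locus_type_name are hypothetical and chosen only for illustration.

```python
# Lookup table for the locus type codes described above.
# LOCUS_TYPES and locus_type_name are illustrative names, not an official API.
LOCUS_TYPES = {
    1: "Insertion",
    2: "True SNP",
    3: "Deletion",
    4: "Range insertion",
    5: "Range substitution",
    6: "Range deletion",
}


def locus_type_name(code):
    """Return the short name for a locus type code (int or numeric string)."""
    try:
        return LOCUS_TYPES[int(code)]
    except KeyError:
        raise ValueError(f"unknown locus type code: {code!r}") from None


print(locus_type_name(2))
```

Codes outside 1-6 raise ValueError in this sketch, so an unexpected value in the type column is reported rather than silently ignored.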
For example, the line
1,6,14143477,12562034,2,768448,A,G,B
means that in the downloaded data for Chromosome 1, the 7th value corresponds to Affymetrix ID 14143477, which is also RS ID rs12562034, and is a True SNP wherein exactly one nucleotide on the flanking sequence is replaced with exactly one nucleotide on the subject sequence. The sequence begins at position 768448 and has allele variants "A" and "G", with the "G" variant being the reference.
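As an illustration of the column layout, the sketch below parses rows of this file into named fields using Python's standard csv module. It assumes the file has no header row and exactly 9 fields per line, as in the example; the Locus class, its field names, and the functions parse_row and read_genotype_map are hypothetical and not part of gconv or any UK Biobank utility. Prefixing the RS ID with "rs" and mapping 0 to None are also choices made here for convenience.

```python
import csv
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Sequence


@dataclass(frozen=True)
class Locus:
    """One row of the genotype map (field names are illustrative)."""
    chromosome: int        # 1-22, 23=X, 24=Y, 25=Mitochondrial
    order: int             # zero-based position within the chromosome dataset
    affymetrix_id: str
    rs_id: Optional[str]   # None when the file gives 0 (no RS ID available)
    locus_type: int        # type code 1-6
    start: int             # start position in base-pairs
    allele_a: str
    allele_b: str
    reference: str         # "A", "B" or "X" (neither)

    def reference_sequence(self) -> Optional[str]:
        """Return the sequence of the reference allele, or None for X."""
        return {"A": self.allele_a, "B": self.allele_b}.get(self.reference)


def parse_row(row: Sequence[str]) -> Locus:
    """Convert one 9-field row into a Locus."""
    if len(row) != 9:
        raise ValueError(f"expected 9 columns, got {len(row)}")
    chrom, order, affy, rs, ltype, start, a, b, ref = (f.strip() for f in row)
    return Locus(
        chromosome=int(chrom),
        order=int(order),
        affymetrix_id=affy,
        rs_id=None if rs == "0" else f"rs{rs}",
        locus_type=int(ltype),
        start=int(start),
        allele_a=a,
        allele_b=b,
        reference=ref,
    )


def read_genotype_map(lines: Iterable[str]) -> Iterator[Locus]:
    """Yield a Locus per non-empty line (assumes no header row)."""
    for row in csv.reader(lines):
        if row:
            yield parse_row(row)


locus = next(read_genotype_map(["1,6,14143477,12562034,2,768448,A,G,B"]))
# The order column is zero-based, so add 1 for the ordinal position.
print(locus.rs_id, locus.order + 1, locus.reference_sequence())
# prints: rs12562034 7 G
```

To read a downloaded copy, an open file object can be passed in place of the list, for example `read_genotype_map(open("genotype_map.csv", newline=""))`.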
The conversions offered by the gconv utility program require this file as an input.
This resource can be downloaded or viewed using the link: genotype_map.csv
If you have wget available (typically on Linux systems), then you can also obtain a copy using the command:
wget -nd biobank.ndph.ox.ac.uk/ukb/ukb/util/genotype_map.csv