Abstract
An important issue in the analysis of rare variant association studies is how to annotate nonsynonymous variants according to how likely they are to affect protein function. AlphaMissense was recently released to address this and was shown to perform well on benchmarks based on variants causing severe disease and on functional assays. Here, we assess the performance of AlphaMissense across 18 genes for which association between rare coding variants and hyperlipidaemia, hypertension or type 2 diabetes had previously been demonstrated. The strength of evidence in favour of association, expressed as the signed log p value (SLP), was compared between AlphaMissense and 43 other annotation methods. The results demonstrated marked variability between genes in the extent to which nonsynonymous variants contributed to the evidence for association, and also between the performance of different methods of annotating the nonsynonymous variants. Although AlphaMissense produced the highest SLP on average across genes, it produced the maximum SLP for only 4 genes. For some genes, other methods produced a considerably higher SLP, and there were genes for which AlphaMissense produced no evidence for association while another method performed well. This marked inconsistency across genes makes it difficult to choose an optimal method for analysing sequence data. The fact that different methods perform well for different genes suggests that, if sequence data were to be used for individual risk prediction, gene-specific annotation methods should be used.
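
The comparison described above can be illustrated with a minimal sketch. It assumes that the SLP is the negative base-10 logarithm of the p value, given the sign of the estimated effect; the abstract does not state the formula, so this is an assumption. The function names, gene labels, method labels and all numbers below are hypothetical and are not results from this study. The sketch only shows how a method can have the highest mean SLP across genes while giving the maximum SLP for few of them.

```python
import math


def signed_log_p(p_value: float, effect: float) -> float:
    """Assumed SLP definition: -log10(p), carrying the sign of the effect."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    magnitude = -math.log10(p_value)
    if magnitude == 0.0:  # p == 1 gives no evidence in either direction
        return 0.0
    return math.copysign(magnitude, effect)


def best_method_per_gene(slps: dict[str, dict[str, float]]) -> dict[str, str]:
    """For each gene, return the method with the highest SLP."""
    return {gene: max(by_method, key=by_method.get)
            for gene, by_method in slps.items()}


def mean_slp_per_method(slps: dict[str, dict[str, float]]) -> dict[str, float]:
    """Average each method's SLP across all genes."""
    totals: dict[str, float] = {}
    for by_method in slps.values():
        for method, slp in by_method.items():
            totals[method] = totals.get(method, 0.0) + slp
    return {method: total / len(slps) for method, total in totals.items()}


# Hypothetical SLPs (invented for illustration only).
example_slps = {
    "GENE_A": {"method_X": 7.0, "method_Y": 2.0},
    "GENE_B": {"method_X": 1.0, "method_Y": 3.0},
    "GENE_C": {"method_X": 0.5, "method_Y": 2.5},
}

best = best_method_per_gene(example_slps)   # method_X wins only for GENE_A
means = mean_slp_per_method(example_slps)   # yet method_X has the higher mean
```

In this invented example a single large SLP drives the average for `method_X`, which is the kind of pattern that makes an average across genes a poor guide to the best method for any individual gene.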