Abstract
INTRODUCTION: Genetic associations for variants identified through genome-wide association studies (GWASs) tend to be overestimated in the original discovery data set, because a variant whose association was underestimated might not have reached the detection threshold. This bias, known as winner's curse, can affect Mendelian randomization estimates, but its severity and potential impact are unclear.
METHODS: We performed an empirical investigation to assess the potential bias from winner's curse in practice. We considered Mendelian randomization estimates for the effect of body mass index (BMI) on coronary artery disease risk. We randomly divided a UK Biobank data set 100 times into three equally sized subsets. The first subset was treated as the 'discovery GWAS'. We compared genetic associations estimated in the discovery GWAS to those estimated in the other subsets for each of the 100 iterations.
RESULTS: For variants associated with BMI at P < 5 × 10⁻⁸ in at least one iteration, genetic associations with BMI were up to 5-fold greater in iterations in which the variant was associated with BMI at P < 5 × 10⁻⁸ compared with its mean association across all iterations. If the minimum P-value for association with BMI was P = 10⁻¹³ or lower, then this inflation was <25%. Mendelian randomization estimates were affected by winner's curse bias. However, bias did not materially affect results; all analyses indicated a deleterious effect of BMI on coronary artery disease risk.
CONCLUSIONS: Winner's curse can bias Mendelian randomization estimates, although its practical impact may not be substantial. If avoiding sample overlap is infeasible, analysts should consider performing a sensitivity analysis based on variants strongly associated with the exposure.
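
ILLUSTRATION: The split-sample design described under METHODS can be sketched on simulated data. The following is a minimal sketch, not the authors' analysis code. It assumes a single hypothetical variant, a continuous simulated phenotype, and a lenient selection threshold (|z| ≥ 1.96 rather than P < 5 × 10⁻⁸) so that the example runs quickly on a small sample. It also pools the two non-discovery subsets into one replication set, which is an assumption. All function names, parameter names and default values are illustrative.

```python
import math
import random


def ols_slope(g, y):
    """Return (slope, standard error) from a simple linear regression of y on g."""
    n = len(g)
    mean_g = sum(g) / n
    mean_y = sum(y) / n
    sxx = sum((gi - mean_g) ** 2 for gi in g)
    sxy = sum((gi - mean_g) * (yi - mean_y) for gi, yi in zip(g, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_g
    rss = sum((yi - intercept - slope * gi) ** 2 for gi, yi in zip(g, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se


def split_sample_winners_curse(n=3000, maf=0.3, beta=0.1, z_threshold=1.96,
                               iterations=100, seed=1):
    """Repeatedly split one simulated data set and compare discovery with
    replication estimates in the iterations where the discovery subset
    passes the selection threshold. All defaults are hypothetical."""
    rng = random.Random(seed)
    # Genotype = number of effect alleles (0, 1 or 2) at the assumed allele frequency.
    genotypes = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
    phenotypes = [beta * g + rng.gauss(0, 1) for g in genotypes]

    indices = list(range(n))
    third = n // 3
    discovery_hits, replication_hits = [], []
    for _ in range(iterations):
        rng.shuffle(indices)
        disc, rep = indices[:third], indices[third:]
        b_disc, se_disc = ols_slope([genotypes[i] for i in disc],
                                    [phenotypes[i] for i in disc])
        # Keep only iterations in which the variant is "discovered".
        if abs(b_disc / se_disc) >= z_threshold:
            b_rep, _ = ols_slope([genotypes[i] for i in rep],
                                 [phenotypes[i] for i in rep])
            discovery_hits.append(b_disc)
            replication_hits.append(b_rep)

    def mean(values):
        return sum(values) / len(values) if values else None

    return {
        "n_selected": len(discovery_hits),
        "mean_discovery": mean(discovery_hits),
        "mean_replication": mean(replication_hits),
        "full_sample": ols_slope(genotypes, phenotypes)[0],
    }


if __name__ == "__main__":
    print(split_sample_winners_curse())
```

In iterations where the variant passes the threshold, the discovery estimate is expected to exceed the replication estimate on average, because selection keeps only the splits in which the discovery subset happened to show a stronger association. Raising `beta` in the sketch makes the variant pass the threshold in nearly every split, which shrinks the gap. This is in line with the smaller inflation reported above for variants strongly associated with the exposure.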