Abstract
The complexity and volume of data associated with population-based cohorts mean that generating health-related outcomes can be challenging. Using one such cohort, the UK Biobank (a major open-access resource), we present a protocol to efficiently integrate the main dataset and record-level data files and to harmonize and process the data using an R package named "ukbpheno". We describe how to use the package to generate binary phenotypes in a standardized and machine-actionable manner. For complete details on the use and execution of this protocol, please refer to Yeung et al. (2022).