Abstract
Atrial fibrillation (AF) and ventricular arrhythmia (Arr) are among the most common and fatal cardiac arrhythmias worldwide. Electrocardiogram (ECG) data collected as part of the UK Biobank offer an opportunity to analyse and classify these two diseases in the UK. The main objective of our study is to investigate a two-stage model for classifying individuals with AF and Arr in the UK Biobank dataset. Heart arrhythmia classification has been addressed extensively in the literature; however, the data used by most researchers lack enough instances of these common diseases. Moreover, by separating normal from abnormal cases in a two-stage model, we improve the performance of the classifiers in detecting each specific disease. In the first stage, the features of each ECG are classified into two main classes: normal and abnormal. In the second stage, ECGs classified as abnormal are further classified into the two diseases, AF and Arr. A diverse set of ECG features, such as QRS duration, PR interval and RR interval, together with covariates such as sex, BMI, age and other factors, is used in the modelling process. For both stages, we use the XGBoost classifier. The healthy population has been undersampled to tackle the class imbalance present in the data. The approach has been applied to and evaluated on the resting ECG dataset from the UK Biobank repository. Classification performance has been measured using F1 score, sensitivity (recall) and specificity (precision). The proposed system achieves an average F1 score of 87.22%, an average sensitivity of 88.55% and an average specificity of 85.95%.
Contribution and significance: This level of performance indicates that automatic detection of AF and Arr among UK Biobank participants is more precise and efficient when carried out in two stages. Automatic detection and classification of individuals with AF and Arr in this way could enable early diagnosis and help prevent more serious consequences later in their lives.
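
The two-stage scheme with undersampling described in this abstract can be illustrated with a minimal sketch. It is not the study's implementation: the names `undersample`, `TwoStageClassifier` and `NearestCentroid` are hypothetical, the equal normal-to-abnormal sampling ratio is an assumption, and the toy `NearestCentroid` model only stands in for the XGBoost classifier. Any model with scikit-learn-style `fit` and `predict` methods could be passed in instead, possibly after converting the inputs to arrays.

```python
import random

NORMAL, AF, ARR = "normal", "AF", "Arr"


def undersample(X, y, majority=NORMAL, seed=0):
    """Randomly drop majority-class rows until they match the number of
    non-majority rows (assumed 1:1 ratio; the abstract does not state one)."""
    rng = random.Random(seed)
    major = [i for i, label in enumerate(y) if label == majority]
    minor = [i for i, label in enumerate(y) if label != majority]
    keep = rng.sample(major, min(len(major), len(minor)))
    idx = sorted(keep + minor)
    return [X[i] for i in idx], [y[i] for i in idx]


class NearestCentroid:
    """Toy stand-in model (NOT XGBoost): predicts the class whose mean
    feature vector is closest to the input."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for j, value in enumerate(x):
                acc[j] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda label: dist(x, self.centroids[label]))
            for x in X
        ]


class TwoStageClassifier:
    """Stage 1: normal vs abnormal. Stage 2: AF vs Arr, abnormal cases only."""

    def __init__(self, make_model):
        # make_model is any zero-argument factory returning a fit/predict model.
        self.stage1 = make_model()
        self.stage2 = make_model()

    def fit(self, X, y):
        # Stage 1 is trained on all rows with binary targets (0 = normal).
        self.stage1.fit(X, [0 if label == NORMAL else 1 for label in y])
        # Stage 2 is trained only on the abnormal rows (0 = AF, 1 = Arr).
        abnormal = [(x, label) for x, label in zip(X, y) if label != NORMAL]
        self.stage2.fit(
            [x for x, _ in abnormal],
            [0 if label == AF else 1 for _, label in abnormal],
        )
        return self

    def predict(self, X):
        stage1_pred = list(self.stage1.predict(X))
        flagged = [i for i, p in enumerate(stage1_pred) if p == 1]
        out = [NORMAL] * len(X)
        if flagged:
            # Only rows flagged as abnormal are passed to the second stage.
            stage2_pred = self.stage2.predict([X[i] for i in flagged])
            for i, p in zip(flagged, stage2_pred):
                out[i] = AF if p == 0 else ARR
        return out
```

In use, the training set would first be balanced with `undersample(X, y)` and the result passed to `TwoStageClassifier(NearestCentroid).fit(...)`; swapping `NearestCentroid` for a gradient-boosted model changes only the factory argument.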