Abstract
Atrial fibrillation (AF) is classically categorised by arrhythmia duration, but these subtypes have limitations in capturing mechanistic and prognostic diversity. A variational autoencoder trained on >1.1M ECGs was used to extract representative features, which were then filtered to an AF cohort of 20,291 unique patients. These features were input into an unsupervised tree-based clustering method to map AF heterogeneity as a tree structure and identify phenogroups. Five phenogroups stratified by future disease risk were identified: (1) higher-risk AF; (2) highest-risk AF with heart failure (HF); (3) average paroxysmal AF; (4) lower-risk paroxysmal AF; and (5) higher-risk paroxysmal AF. The tree trajectory positioned individuals based on shared traits, emphasising explainability. Paroxysmal phenogroups 4 and 5 differed in risk and ventricular structure, with phenogroup 5 exhibiting more adverse features. Mixed AF phenogroup 2 reflected advanced AF with greater HF burden and mortality risk. This AI-ECG framework augments AF subtypes with a risk-based dimension, supporting personalised care.