Machine learning to risk stratify individuals for undiagnosed atrial fibrillation at scale using population-wide electronic health records.

Rokach, L.; Benita, TR.; Nadarajah, R.; Raveendra, K.; Haris, M.; Gale, CP.; Wu, J.; Haim, M.; Zahger, D.; Arbel, R.

All Authors

Rokach, L.

Benita, TR.

Nadarajah, R.

Raveendra, K.

Haris, M.

Gale, CP.

Wu, J.

Haim, M.

Zahger, D.

Arbel, R.

LTHT Author

Nadarajah, Ramesh
Gale, Christopher

LTHT Department

Cardio-Respiratory
Cardiology

Publication Date

2026

Item Type

Journal Article

Subject

ATRIAL FIBRILLATION, HEALTH AND CARE RECORDS, ARTIFICIAL INTELLIGENCE, FORECASTING

Abstract

AIMS: Electronic health records (EHR) can be used to target atrial fibrillation (AF) screening. We evaluated the performance of risk prediction models scalable across nationwide EHRs.

METHODS: Retrospective cohort study of individuals aged >=30 years without diagnosed AF in the Clalit Health Services (Israel) EHR dataset between January 1, 2019 and June 30, 2019. The primary outcome was a diagnosis of AF or atrial flutter (AFl) within 6 months. The FIND-AF, CHA2DS2-VASc and C2HEST scores were evaluated, with prediction performance assessed overall and by sex. The optimum threshold to apply in prospective screening was determined with a lift analysis.

RESULTS: Of 2,166,795 individuals in the cohort, 4,275 developed AF within 6 months. Prediction performance was strongest for FIND-AF (AUROC 0.871, 95% CI 0.864-0.877; calibration slope 0.73, 95% CI 0.67-0.79) compared with CHA2DS2-VASc (AUROC 0.838, 95% CI 0.831-0.845; calibration slope 0.63, 95% CI 0.60-0.67) and C2HEST (AUROC 0.834, 95% CI 0.823-0.844; calibration slope 0.62, 95% CI 0.58-0.65), including in women (FIND-AF AUROC 0.883, 95% CI 0.876-0.889; CHA2DS2-VASc AUROC 0.865, 95% CI 0.858-0.872; C2HEST AUROC 0.853, 95% CI 0.846-0.861) and men (FIND-AF AUROC 0.857, 95% CI 0.850-0.864; CHA2DS2-VASc AUROC 0.835, 95% CI 0.828-0.843; C2HEST AUROC 0.814, 95% CI 0.806-0.822). Lift analysis suggested that screening the top 15% of individuals by FIND-AF risk would identify 72% of AF diagnoses, compared with 63% when screening by age.

CONCLUSION: The FIND-AF machine learning algorithm was scalable in routine EHR data with good discrimination for incident AF. Prospective studies are now required to evaluate risk-guided AF screening.
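
For readers unfamiliar with lift analysis, the following is a minimal Python sketch of the general idea: rank individuals by predicted risk, screen the top fraction, and count the share of all diagnoses that falls in that group. It is not the authors' code. The function names and the toy data are hypothetical, and the assumption that the age-based comparison screens the same 15% of the cohort is ours, as the abstract does not state it.

```python
def capture_at_top_fraction(scores, outcomes, fraction):
    """Share of all positive outcomes found in the top `fraction` of scores."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    n_screened = max(1, round(len(scores) * fraction))
    # Rank individuals from highest to lowest predicted risk.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    captured = sum(outcome for _, outcome in ranked[:n_screened])
    total = sum(outcomes)
    return captured / total if total else 0.0


def lift(capture, fraction):
    """Enrichment relative to screening the same fraction at random."""
    return capture / fraction


# Hypothetical toy data: ten individuals, three of whom are later diagnosed.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_outcomes = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
toy_capture = capture_at_top_fraction(toy_scores, toy_outcomes, 0.3)

# Figures reported in the abstract.
cohort_size = 2_166_795
af_cases = 4_275
baseline_incidence = af_cases / cohort_size  # roughly 0.2% over 6 months

# Lift implied by the reported capture rates, assuming (our assumption)
# that both strategies screen the same 15% of the cohort.
find_af_lift = lift(0.72, 0.15)  # about 4.8
age_lift = lift(0.63, 0.15)      # about 4.2

if __name__ == "__main__":
    print(f"toy capture: {toy_capture:.2f}")
    print(f"baseline incidence: {baseline_incidence:.4%}")
    print(f"FIND-AF lift: {find_af_lift:.1f}, age lift: {age_lift:.1f}")
```

Under these assumptions, a lift of about 4.8 means the screened group contains AF diagnoses at nearly five times the rate expected from screening the same number of people at random.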