Predicting Cardiovascular Diseases Risk in Thai Population by Machine Learning

Veerakul, Gumpanart, Pasupa, Kitsuchart, Pinroj, Yod, Chaothawee, Lertlak, Somprakit, Pradit, Kietdumrongwong, Pongtorn, Tonphu, Somkiat, Chaiwong, Warut and Yasri, Saowaluck (2025) Predicting Cardiovascular Diseases Risk in Thai Population by Machine Learning. The Bangkok Medical Journal, 21 (2), 133-144.

Abstract

OBJECTIVES: To establish a clinical data lake for artificial intelligence (AI) and develop a machine learning model to predict cardiovascular disease (CVD) risk. MATERIALS AND METHODS: Following IRB approval, de-identified clinical data from 2.9 million patients (2010–2019) across eight Bangkok Dusit Medical Services (BDMS) hospitals were collected in compliance with the Personal Data Protection Act (PDPA). Two datasets were constructed: BDMS-CVD-Large (n = 9,072), comprising 3-year clinical records with 20 SHAP-selected features plus age and sex; and BDMS-CVD-Small (n = 107), incorporating coronary artery calcium scores (CACS) and time-from-test. XGBoost models were trained using 5-fold cross-validation and grid search, with training repeated across 10 random splits. RESULTS: The BDMS-CVD-Large model achieved strong performance (F1-score: Macro 0.93 ± 0.008; Weighted 0.97 ± 0.003), with age, HDL, and LDL as key predictors. Including CACS improved the F1-score (0.92 ± 0.032 vs. 0.87 ± 0.031), confirming its value. Limitations included potential occult CVD, exclusion of over 40% of cases due to incomplete data, and missing longitudinal data in many patients. CONCLUSION: This study demonstrates the feasibility of machine learning (ML)-based CVD prediction using large-scale clinical data under PDPA compliance. Prospective validation over 5–10 years is warranted, and integrating CACS may enhance future predictive accuracy.
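The abstract reports both a macro and a weighted F1-score, and the two can differ noticeably when one class is much rarer than the other. The following is a minimal sketch of how the two averages are computed, written in plain Python for illustration only. It is not the authors' code; the function names and the toy labels are assumptions invented for this example and are not drawn from the BDMS datasets.

```python
from collections import Counter
from statistics import mean


def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} for every label in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        scores[label] = (f1, support[label])
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    return mean(f1 for f1, _ in per_class_f1(y_true, y_pred).values())


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's true sample count."""
    scores = per_class_f1(y_true, y_pred)
    total = sum(n for _, n in scores.values())
    return sum(f1 * n for f1, n in scores.values()) / total


# Hypothetical toy labels (0 = no CVD, 1 = CVD), imbalanced on purpose.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [1, 0]  # one of the two positive cases is missed

print(f"macro F1:    {macro_f1(y_true, y_pred):.3f}")     # 0.804
print(f"weighted F1: {weighted_f1(y_true, y_pred):.3f}")  # 0.886
```

In this toy case the missed minority-class sample pulls the macro average down more than the weighted one, which is one plausible reason a weighted F1-score can exceed the macro F1-score on an imbalanced dataset.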

Item Type:

Article

Identification Number (DOI):

Subjects:

Subjects > Computer Science > Machine Learning

Subjects > Statistics > Applications

Deposited by:

Kitsuchart Pasupa

Date Deposited:

2026-01-06 00:29:43

Last Modified:

2026-01-06 00:38:40
