Background: Cardiovascular diseases (CVDs) remain the leading cause of mortality globally. Traditional risk scores such as the Framingham Risk Score (FRS) and the ASCVD estimator have been widely used to predict cardiovascular events. However, advances in artificial intelligence (AI) offer the potential for enhanced prediction accuracy by integrating large datasets and identifying complex patterns. This study aimed to compare the predictive performance of AI-based diagnostic models with conventional risk scoring methods in forecasting cardiovascular events.
Materials and Methods: A retrospective cohort of 2,000 patients aged 30–75 years, with no prior history of cardiovascular events, was selected from a tertiary care database. Demographic, clinical, and biochemical data were collected. Three AI models were developed: random forest (RF), support vector machine (SVM), and deep neural network (DNN). The models were trained on 70% of the dataset and tested on the remaining 30%. Performance was compared against the FRS and ASCVD scores. Metrics evaluated included sensitivity, specificity, accuracy, and the area under the receiver operating characteristic curve (AUC).
Results: The DNN model demonstrated the highest predictive performance, with an AUC of 0.91, sensitivity of 88.5%, and specificity of 85.2%. The RF model achieved an AUC of 0.87, while the SVM reached 0.84. In comparison, the FRS and ASCVD scores yielded AUCs of 0.76 and 0.74, respectively. AI models consistently outperformed traditional scores in correctly identifying high-risk individuals who experienced cardiovascular events over a five-year follow-up period.
Conclusion: AI-driven diagnostic models, particularly deep learning algorithms, significantly surpass traditional risk scores in predicting cardiovascular events. These findings support the integration of AI tools into clinical decision-making to enhance early risk identification and preventive strategies.
Cardiovascular diseases (CVDs) continue to be the leading cause of morbidity and mortality worldwide, accounting for nearly 17.9 million deaths annually, representing 32% of all global deaths (1). Early identification of individuals at high risk of cardiovascular events is critical for timely intervention and reduction in disease burden. Traditionally, risk prediction has relied on statistical models such as the Framingham Risk Score (FRS) and the Atherosclerotic Cardiovascular Disease (ASCVD) calculator, which are based on fixed clinical parameters including age, gender, blood pressure, lipid profile, smoking status, and diabetes (2,3). While these models have contributed significantly to preventive cardiology, they are limited by population-specific derivation and linear assumptions that may not capture the complexity of individual patient profiles (4).
With the advent of big data and computational advancements, artificial intelligence (AI) has emerged as a powerful tool in medical diagnostics and prognostics. AI, particularly machine learning (ML) and deep learning (DL) algorithms, can analyze vast amounts of heterogeneous data to identify subtle, non-linear patterns that traditional models may overlook (5,6). Several studies have demonstrated the potential of AI-based systems to improve risk stratification and clinical decision-making in cardiology (7,8). However, comparative studies evaluating the predictive accuracy of AI models against conventional risk scores in real-world populations remain limited.
This study aims to assess and compare the performance of selected AI-driven diagnostic models with established traditional risk scoring systems in predicting cardiovascular events. By evaluating predictive accuracy, sensitivity, specificity, and area under the curve (AUC), we seek to explore the potential clinical utility of integrating AI tools into routine cardiovascular risk assessment.
Study Design and Population
A retrospective observational study was conducted using anonymized patient data obtained from the electronic medical records of a tertiary care hospital. The study included adult patients aged between 30 and 75 years, who had no prior history of cardiovascular events at baseline. Patients with incomplete data or pre-existing cardiovascular conditions were excluded.
Data Collection
A total of 2,000 patient records were randomly selected for analysis. The dataset included demographic variables (age, gender), clinical parameters (blood pressure, body mass index, smoking status, history of diabetes and hypertension), and biochemical values (total cholesterol, LDL-C, HDL-C, triglycerides, and fasting blood glucose). The outcome of interest was the occurrence of a major cardiovascular event (myocardial infarction, stroke, or cardiovascular death) within a five-year follow-up period.
Risk Assessment Models
Traditional risk was assessed using the Framingham Risk Score (FRS) and the ASCVD risk estimator based on standard guidelines. For AI-based prediction, three machine learning models were developed: Random Forest (RF), Support Vector Machine (SVM), and Deep Neural Network (DNN). These models were selected due to their established utility in clinical prediction modeling.
Model Development and Validation
The dataset was split into training (70%) and testing (30%) sets using stratified random sampling. Data preprocessing involved normalization and handling of missing values through multiple imputation. Feature selection was performed using recursive feature elimination for machine learning models.
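The stratified 70/30 split can be sketched as follows. This is a minimal illustration written with the Python standard library only, not the study's actual code; the function name `stratified_split`, the seed, and the synthetic labels are assumptions. The labels merely mirror the cohort's reported event count (322 of 2,000).

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.30, seed=42):
    """Return (train_idx, test_idx) so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                      # random order within each class
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])         # first part goes to the test set
        train_idx.extend(indices[n_test:])        # remainder goes to the training set
    return sorted(train_idx), sorted(test_idx)

# Synthetic labels (assumption): 1 = event, 0 = no event.
labels = [1] * 322 + [0] * 1678
train_idx, test_idx = stratified_split(labels)
train_events = sum(labels[i] for i in train_idx)
test_events = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx))
print(train_events, test_events)
```

Scikit-learn's `train_test_split` with its `stratify` argument provides the same behavior and would be the natural choice in a Scikit-learn pipeline.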
Each model was trained on the training set and optimized using 10-fold cross-validation to limit overfitting. Model performance was then evaluated on the held-out testing set in terms of sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUC).
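The evaluation metrics can be illustrated on a toy example. The sketch below is an assumption-laden illustration, not the study's code: the function names, the 0.5 decision threshold, and the eight toy cases are all hypothetical. AUC is computed here as the fraction of (event, non-event) pairs in which the event case receives the higher score, which is equivalent to the area under the ROC curve.

```python
def auc_score(y_true, y_score):
    """AUC as the probability that a random event case outranks a random non-event case."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # tied scores count as half a win
    return wins / (len(pos) * len(neg))

def threshold_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity, specificity, and accuracy after thresholding the scores."""
    tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold)
    return {
        "sensitivity": tp / (tp + fn),   # share of event cases flagged
        "specificity": tn / (tn + fp),   # share of non-event cases cleared
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Toy data (assumption): four event cases and four non-event cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = auc_score(y_true, y_score)
metrics = threshold_metrics(y_true, y_score)
print(auc)       # 15 of 16 pairs are ranked correctly -> 0.9375
print(metrics)   # each metric is 0.75 at the 0.5 threshold
```

Unlike sensitivity and specificity, AUC does not depend on the chosen threshold, which is why it is commonly used to compare models that output scores on different scales.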
Statistical Analysis
Comparative performance of AI models and traditional risk scores was analyzed using the DeLong test for AUC comparisons. A p-value of less than 0.05 was considered statistically significant. All analyses were performed using Python (version 3.9) with libraries including Scikit-learn, TensorFlow, and Pandas.
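For readers unfamiliar with the DeLong test, the following is a minimal sketch of the two-model version for correlated AUCs, where both models are scored on the same patients. It uses only the standard library, and the helper names and toy scores are assumptions. It is an illustration of the technique rather than the implementation used in the study, which the source does not specify.

```python
import math

def _split_scores(y_true, scores):
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    return pos, neg

def _placements(pos, neg):
    """Per-case placement values: V10 for event cases, V01 for non-event cases."""
    def psi(x, y):
        return 1.0 if x > y else (0.5 if x == y else 0.0)
    v10 = [sum(psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01

def _cov(a, b):
    """Sample covariance of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def delong_test(y_true, scores_a, scores_b):
    """Return (auc_a, auc_b, z, two-sided p) for two models scored on the same cases."""
    v10_a, v01_a = _placements(*_split_scores(y_true, scores_a))
    v10_b, v01_b = _placements(*_split_scores(y_true, scores_b))
    m, n = len(v10_a), len(v01_a)            # number of event / non-event cases
    auc_a, auc_b = sum(v10_a) / m, sum(v10_b) / m
    # Variance of the AUC difference, including the covariance between the two models.
    var = ((_cov(v10_a, v10_a) + _cov(v10_b, v10_b) - 2 * _cov(v10_a, v10_b)) / m
           + (_cov(v01_a, v01_a) + _cov(v01_b, v01_b) - 2 * _cov(v01_a, v01_b)) / n)
    if var <= 0.0:
        raise ValueError("zero variance: the AUC difference cannot be tested")
    z = (auc_a - auc_b) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return auc_a, auc_b, z, p_value

# Toy data (assumption): the same eight cases scored by two hypothetical models.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
scores_b = [0.6, 0.3, 0.8, 0.5, 0.7, 0.4, 0.2, 0.1]

auc_a, auc_b, z, p_value = delong_test(y_true, scores_a, scores_b)
print(auc_a, auc_b, z, p_value)
```

Because both models are evaluated on the same patients, their AUC estimates are correlated. The covariance terms account for this, which is the reason a paired test such as DeLong's is preferred over comparing two independent confidence intervals.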
A total of 2,000 patient records were analyzed, comprising 1,180 males (59%) and 820 females (41%), with a mean age of 54.3 ± 9.2 years. Cardiovascular events were reported in 322 patients (16.1%) during the five-year follow-up period.
Model Performance Comparison
The predictive performance of the artificial intelligence models and traditional risk scores is summarized in Table 1. Among all models tested, the Deep Neural Network (DNN) showed the highest overall accuracy (89.3%), followed by Random Forest (RF) at 85.6%, and Support Vector Machine (SVM) at 83.1%. In contrast, traditional risk scores demonstrated lower predictive power, with the Framingham Risk Score (FRS) achieving 75.4% accuracy and the ASCVD estimator 73.6%.
The DNN model outperformed all other methods in terms of Area Under the Curve (AUC = 0.91), sensitivity (88.5%), and specificity (85.2%). The FRS and ASCVD models showed significantly lower AUCs of 0.76 and 0.74, respectively (Table 1).
Table 1: Comparison of Predictive Performance between AI Models and Traditional Risk Scores
| Model | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC |
| --- | --- | --- | --- | --- |
| Deep Neural Network (DNN) | 89.3 | 88.5 | 85.2 | 0.91 |
| Random Forest (RF) | 85.6 | 83.7 | 82.4 | 0.87 |
| Support Vector Machine (SVM) | 83.1 | 81.2 | 78.9 | 0.84 |
| Framingham Risk Score (FRS) | 75.4 | 69.8 | 72.3 | 0.76 |
| ASCVD Risk Score | 73.6 | 67.1 | 71.4 | 0.74 |
As shown in Table 1, AI models demonstrated notably superior classification performance when compared to traditional scores.
Confusion Matrix Outcomes
To evaluate classification accuracy in more detail, confusion matrices were generated for each model. The DNN model correctly classified 275 of 322 patients who developed cardiovascular events and accurately identified 1,438 of 1,678 non-event cases (Table 2). In contrast, the FRS model misclassified 110 event cases as low risk.
Table 2: Confusion Matrix Summary for All Models
| Model | True Positives | True Negatives | False Positives | False Negatives |
| --- | --- | --- | --- | --- |
| DNN | 275 | 1,438 | 240 | 47 |
| RF | 263 | 1,400 | 278 | 59 |
| SVM | 257 | 1,370 | 308 | 65 |
| FRS | 212 | 1,320 | 358 | 110 |
| ASCVD | 205 | 1,299 | 379 | 117 |
The deep learning model consistently demonstrated better discrimination and fewer false negatives than the other models, further supporting its utility in high-risk patient identification (Table 2).
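Sensitivity, specificity, and accuracy follow directly from confusion-matrix counts, so the Table 2 figures can be turned into rates with a few lines of Python. The sketch below uses the counts exactly as tabulated; the variable and function names are assumptions. Rates derived this way from Table 2 do not exactly reproduce the percentages in Table 1 (for the DNN, 275/322 gives a sensitivity of about 85.4%, against the 88.5% reported). The source does not state which subset of records each table summarizes.

```python
# Counts copied from Table 2: (true positives, true negatives, false positives, false negatives).
confusion = {
    "DNN": (275, 1438, 240, 47),
    "RF": (263, 1400, 278, 59),
    "SVM": (257, 1370, 308, 65),
    "FRS": (212, 1320, 358, 110),
    "ASCVD": (205, 1299, 379, 117),
}

def summarize(tp, tn, fp, fn):
    """Derive the standard rates from one row of confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),               # event cases correctly flagged
        "specificity": tn / (tn + fp),               # non-event cases correctly cleared
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

derived = {name: summarize(*counts) for name, counts in confusion.items()}

for name, rates in derived.items():
    print(f"{name:6s} "
          f"sens={rates['sensitivity']:.3f} "
          f"spec={rates['specificity']:.3f} "
          f"acc={rates['accuracy']:.3f}")

# Model with the fewest missed event cases (false negatives).
fewest_missed = min(confusion, key=lambda name: confusion[name][3])
print(fewest_missed)
```

Each row of Table 2 sums to 2,000 records, with 322 event cases and 1,678 non-event cases, which matches the cohort totals reported at the start of the results.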
This study evaluated and compared the predictive accuracy of artificial intelligence (AI)-based diagnostic models with conventional cardiovascular risk scoring tools, namely the Framingham Risk Score (FRS) and ASCVD risk calculator. Our findings reveal that AI models, particularly deep neural networks (DNN), outperform traditional methods in predicting cardiovascular events, consistent with prior studies emphasizing the potential of AI in clinical risk prediction (1,2).
The superior performance of AI models, as evidenced by higher AUC, sensitivity, and specificity, suggests their ability to capture complex and nonlinear relationships among multiple variables that traditional models often overlook (3,4). Deep learning architectures can leverage hidden patterns across diverse datasets, which may explain the enhanced prediction accuracy in our cohort (5).
Traditional scores like FRS and ASCVD, although widely used, were developed based on relatively homogeneous populations and assume linear associations among risk factors (6). This limits their generalizability to heterogeneous populations with varying comorbidities, lifestyle factors, and genetic backgrounds (7,8). Several studies have raised concerns regarding the underestimation or overestimation of risk when these tools are applied across different ethnic or demographic groups (9,10).
Our results align with previous investigations that demonstrated the efficacy of AI models in improving cardiovascular risk stratification. Weng et al. reported that machine learning algorithms achieved better discrimination and calibration than traditional models in a primary care setting (11). Similarly, Krittanawong et al. emphasized the value of deep learning in identifying subtle clinical signals predictive of cardiovascular events (12).
Importantly, the high sensitivity of the DNN model in this study indicates its potential for minimizing missed high-risk cases, which is critical in preventive cardiology. At the same time, its acceptable specificity reduces the likelihood of unnecessary interventions in low-risk patients (13). However, it is essential to acknowledge that the interpretability of AI models remains a challenge in clinical practice. Unlike conventional scoring systems, which provide transparent criteria, AI models often function as "black boxes," making it difficult for clinicians to explain individual predictions (14).
Additionally, the success of AI models relies heavily on the quality and quantity of input data. In settings where electronic health records are incomplete or inconsistently updated, the performance of AI tools may decline. Therefore, standardization of data collection and integration across healthcare systems is a necessary step for widespread implementation (15).
Limitations
While the current study provides strong evidence favoring AI-based risk prediction, certain limitations should be acknowledged. The retrospective design may introduce selection bias. Moreover, the models were trained on a single-center dataset, limiting external validity. Future prospective, multicenter studies are needed to confirm these findings and evaluate real-world applicability.
Artificial intelligence-based diagnostic models, particularly deep learning algorithms, demonstrated superior predictive accuracy for cardiovascular events compared to traditional risk scores. Their ability to process complex, multi-dimensional data highlights their potential role in enhancing early risk stratification and guiding preventive cardiology. Integration of AI into clinical practice may offer a transformative approach to personalized cardiovascular care.