Predicting Heart Attacks Using Machine Learning Models: A Comprehensive Approach
Last Updated on October 31, 2024 by Editorial Team
Author(s): Farhaan Nazirkhan
Originally published on Towards AI.
By Farhaan Nazirkhan & Sarwin Rajiah
Abstract
This study implements two supervised machine learning models, Decision Tree and Multilayer Perceptron (MLP), to predict heart attack likelihood using a labeled dataset of 1,888 rows and 14 features. Leveraging significant features identified in prior research, our optimized models achieved accuracy and F1-score of 92.33%, evaluated through metrics like precision, recall, and specificity. Compared to similar studies, the models showed enhanced performance due to the larger dataset and hyperparameter tuning. This research demonstrates the potential of machine learning for early heart disease diagnosis, aiming for future real-time clinical applications.
Introduction
Cardiovascular diseases (CVDs) are a major global health concern, responsible for a substantial proportion of worldwide mortality. According to the World Health Organization (2021), approximately 17.9 million people died from CVDs in 2019, representing 32% of all deaths globally. Of these, 85% were due to a heart attack and stroke. While major progress has been made in medical diagnostics, early and accurate prediction of heart attack risk remains critical in reducing mortality rates.
In recent years, machine learning has emerged as a promising tool for predictive healthcare analytics, offering the potential to enhance early diagnosis by identifying patterns in complex datasets that may not be apparent to traditional medical analyses. In this project, we employed two machine learning modelsβββDecision Tree and Multi-Layer Perceptron (MLP) neural networkβββtrained on a large dataset to predict the likelihood of heart attacks. We specifically chose features proven to be significant in heart attack prediction based on a review of more than five research papers.
The dataset used in this study was generated by combining five publicly available datasets, creating a comprehensive dataset of 1888 rows and 14 attributes after dropping missing data. Hyperparameter tuning and performance evaluation were conducted for both models. Additionally, we calculated feature importance to understand which factors played a critical role in the modelβs predictions. However, while feature importance was analyzed, it was not directly used in model tuning.
The following sections will detail the methodology, hyperparameter tuning processes, and the resulting performance of both models. We will also compare our findings with existing research to highlight the advancements made in heart attack prediction using machine learning.
You can access the dataset we used and uploaded on Kaggle here, and the code for the model implementation can also be found on my Kaggle notebook here. Additionally, the full source code is available on GitHub.
You can also check out the video presentation of the system down below.
Research Gap
Despite several studies on heart attack prediction using machine learning have been conducted, there still lies several gaps:
- Limited Dataset Size: Many studies have relied on small datasets, limiting the generalizability of their models. By merging four public datasets, we sought to address this gap and provide a more robust dataset to train our models (Alshraideh, et al., 2024).
- Exclusion of Critical Features: Age, sex, cp, restecg, thalach, exang, oldpeak, slope, ca and thal are found as most relevant attributes in predicting heart diseases (Chellammal & Sharmila, 2019). However, some research models exclude these critical features (Hossain, et al., 2023). Our dataset incorporates the critical attributes for predicting heart diseases.
The findings from this study aim to fill these gaps by using a larger, more diverse dataset and by incorporating critical health features that are often overlooked in prior research.
Data Collection & Pre-processing
We compiled a comprehensive dataset by merging five public heart disease datasets from Kaggle and one from Figshare. This larger dataset provides a richer set of patient data, which will enhance the training and testing of the machine leaning models. The datasets used are detailed in below table.
Key Features in the Dataset:
1. age: The age of the patient
2. sex: Gender (1 = male, 0 = female)
3. cp: Chest pain type (four categories)
4. trestbps: Resting blood pressure (in mm Hg)
5. chol: Serum cholesterol in mg/dl
6. fbs: Fasting blood sugar (1 = >120 mg/dl, 0 = otherwise)
7. restecg: Resting electrocardiographic results (three categories)
8. thalach: Maximum heart rate achieved
9. exang: Exercise-induced angina (1 = yes, 0 = no)
10. oldpeak: ST depression induced by exercise
11. slope: Slope of the peak exercise ST segment
12. ca: Number of major vessels colored by fluoroscopy
13. thal: Thalassemia (four categories)
14. target: Risk of heart attack (1 = high, 0 = low)
Preprocessing
Data Cleaning
The initial combined dataset contained 2,181 rows and fourteen columns. Upon inspection, 293 rows were found to contain missing data across key features. The most significant missing data was in the ca (291 missing values), thal (266 missing values), and slope (190 missing values) columns. Rather than imputing the missing values, we chose to delete these rows, leaving 1,888 rows for training and testing.
The decision to delete rows instead of imputing was driven by:
- High Proportion of Missing Data: Features like ca and thal had sizable portions of missing values. Imputing such extensive missing data could introduce bias and reduce the modelβs reliability.
- Maintaining Data Integrity: Deleting incomplete rows ensured the datasetβs consistency, reducing the risk of introducing unreliable or biased data through imputation.
Standardization of Feature Names
The datasets used different naming conventions for features such as trestbps, exang, ca, thal, target, and slope. To ensure consistency across the combined dataset, we standardized all feature names to maintain uniformity. This step was essential for proper feature alignment during the merging process and subsequent model training and evaluation.
Feature Selection
All 14 features were retained based on their proven significance in predicting heart disease risk, as highlighted in multiple research studies. These features include age, cholesterol levels, resting blood pressure, and ECG outcomes. By retaining all features, we ensured the models had access to sufficient information to accurately predict heart attack risk.
Machine Learning Models
1. Decision Tree Classifier
Decision Trees (DTs) are a non-parametric supervised learning method used for classification and regression. The objective is to build a model that, by utilizing basic decision rules deduced from the data features, predicts the value of a target variable (scikit-learn, n.d.).
Best Parameters:
- Criterion: Gini impurity
- Splitter: Best
- Max Depth: 5
- Type of Pruning: ccp_alpha
- Random State: 8412 (yielded the best accuracy)
Key Metrics for Decision Tree Classifier:
- Accuracy: 92.33%
- Precision: 92.33%
- Recall: 92.33%
- F1-Score: 92.33%
The Decision Tree modelβs performance exceeded expectations, achieving a near-perfect classification of high and low heart attack risks.
2. Multilayer Perceptron (MLP)
The Multilayer Perceptron (MLP) is a type of artificial neural network (ANN) that consists of multiple layers of neurons, including an input layer, hidden layers, and an output layer (Chan, et al., 2023).
Best Hyperparameters:
- Hidden Layers: 2 layers with 50 neurons each (after hyperparameter tuning.
- Activation Function: Logistic (best-performing activation function)
- Batch Size: 200 (best-performing batch size)
- Learning Rate: Constant (best-performing learning rate)
- Epochs: 1000 (optimal number of epochs)
Key Metrics for MLP Classifier:
- Accuracy: 92.33%
- Precision: 92.39%
- Recall: 92.33%
- F1-Score: 92.33%
We performed extensive hyperparameter tuning to find the best set of parameters that yielded the highest accuracy. The MLP classifier algorithm was adjusted to loop through various hyperparameters, including the number of neurons, hidden layers, activation functions, and batch sizes. The best-performing configuration consisted of 2 hidden layers with 50 neurons each, a batch size of 200, a constant learning rate, logistic activation function, and 1000 epochs.
Visualizing Model Performance
Below are visual representations of our modelβs performance metrics, which include confusion matrices, F1 scores, and accuracy-precision-recall charts for both the Decision Tree and MLP models:
1. Multilayer Perceptron Classifier
Performance Metrics
The MLP model achieved an accuracy of 92.33%, an error rate of 7.68%, a specificity of 90.53%, and precision, recall, and F1-score of 0.923. These scores indicate that the MLP model excels at identifying both heart attack and non-heart attack cases with very few false positives or negatives.
Confusion Matrix
The above confusion matrix diagram gives insight into the classification performance:
- The model correctly predicted 172 non-heart attack cases and 177 heart attack cases.
- There were 18 false positives (nonheart attack cases predicted as heart attack) and 11 false negatives (heart attack cases predicted as non-heart attack).
These results indicate that while the neural network is performing well, it still misclassifies some cases.
Neural Network Structure
The above diagram illustrates the structure of the Neural Network, showing the input, hidden, and output layers. Each connection represents the learned weights that are used to make predictions based on the input data.
Learning Curve
The learning curve diagram above shows the modelβs performance as more training data is added. The training score (red line) starts at 1, indicating that the model fits the training data well. This combined with the large gap between the training score and cross-validation score (green line) suggests that the model is potentially overfitting. However, the gap reduces as more data is used, and both the training and cross-validation scores stabilize. The training score decreases slightly, indicating less overfitting, while the cross-validation score improves, showing better generalization.
Feature Importance Analysis
The MLpβs permutation importance reveals the critical features contributing to the modelβs accuracy, with cp, ca, and thal being among the most prominent features.
2. Decision Tree Classifier
Performance Metrics
After 40,000 executions with different random states, the decision tree classifier achieved an accuracy of 92.33%, an error rate of 7.68%, a specificity of 90.53%, and precision, recall, and F1-score of 0.923, when a random state value of 8412 was used. These scores indicate that the Decision Tree model excels at identifying both heart attack and non-heart attack cases with very few false positives or negatives.
Confusion Matrix
The confusion matrix highlights the modelβs performance:
- The model correctly predicted 166 non-heart attack cases and 183 heart attack cases.
- Only 16 non-heart attack cases were incorrectly predicted as heart attacks (false positives), and just 13 heart attack cases were missed (false negative).
These results indicate that while the decision tree is performing well, it still misclassifies some cases, specifically under-predicting heart attacks, which is a critical area to address in future iterations of the model.
Tree Diagram
The decision tree diagram shows the branching decisions made by the model, which are based on feature values. This interpretability is a major strength of Decision Trees, as clinicians can trace back predictions to specific patient features.
Learning Curve
Similarly to the previous learning curve, the one above shows the modelβs performance as more training data is added. The training score (red line) starts at 1, indicating that the model fits the training data well. This combined with the large gap between the training score and cross-validation score (green line) suggests that the model is potentially overfitting. However, the gap reduces as more data is used, and both the training and cross-validation scores stabilize. The training score decreases slightly, indicating less overfitting, while the cross-validation score improves, showing better generalization.
Feature Importance Analysis
The Decision Treeβs feature importance plot reveals that features like cp, thal, and ca contributed the most to model predictions while fbs had no impact on result.
Results & Comparison with Other Research
Our machine learning models demonstrate outstanding performance in predicting heart attack risks compared to several recent research studies. It is important to note that our dataset was significantly larger than those used in most of these studies. This larger dataset likely contributed to the superior performance of our models due to the higher volume of data available for training and testing.
1. Decision Tree Performance:
- Our Model: Achieved an accuracy of 92.33%, with precision, recall, and F1-score of 0.923.
- Comparison 1: A study that applied the Jellyfish Optimization Algorithm to a Decision Tree model reported an accuracy of 97.55% (Ahmad & Polat, 2023). Their higher accuracy could suggest that the Jellyfish Optimization Algorithm would be a better fit for this scenario.
- Comparison 2: Another study using the Particle Swarm Optimization (PSO) technique reported an accuracy of 85.71% with a Decision Tree (Alshraideh, et al., 2024). Our modelβs 92.33% accuracy significantly exceeds this by 6.62%, showing the substantial impact of using a larger dataset and the importance of carefully tuning model parameters.
2. Neural Network (ANN) Performance:
- Our Model: Achieved an accuracy of 92.32%, with precision, recall, and F1-score all around 0.923%.
- Comparison 1: In another study, an ANN-based model reported an accuracy of 73.33% using the same dataset attributes (Rabbi, et al., 2018). The 19.00% increase in accuracy for our model demonstrates the critical role that dataset size and hyperparameter tuning play in model performance. Our larger dataset, combined with optimized ANN configurations, allowed for significantly better results.
- Comparison 2: A CNN-based heart disease prediction model achieved an accuracy of 91.71% (Arooj, et al., 2022). While CNNs are known for their power in image and structured data classification, our simpler ANN model slightly outperformed this with 92.32% accuracy. This further highlights the effectiveness of dataset size and optimization in achieving competitive results even with a relatively simpler model architecture.
These comparisons illustrate that while various optimization techniques and different algorithms can enhance model performance, the size and quality of the dataset play a critical role in determining overall model success. Our models not only benefited from fine-tuned hyperparameters but also from the larger volume of data, which contributed to more reliable and accurate predictions.
Future Work
Future iterations of this study can explore the following:
- Real-Time Applications: While the current study focused on training and evaluating models offline, future work will involve real-time data integration, enabling the model to make predictions on live data streams in clinical settings.
- Applying Optimization Algorithms: As seen above from existing research, different algorithms such as Jellyfish (JSO) and Particle Swarm Optimization (PSO) have been proven to increase the performance and accuracy of the base model.
Conclusion
In this study, we developed two machine learning models, a Decision Tree classifier, and a Multilayer Perceptron neural network, to predict heart attack risk. Both models showed high accuracy. Our findings highlight the potential of machine learning for heart disease prediction, particularly when applied to larger datasets with comprehensive feature sets.
Thank you for reading!
References
Ahmad, A. & Polat, H., 2023. Prediction of Heart Disease Based on Machine Learning Using Jellyfish Optimization Algorithm. Diagnostics, 13(14), pp. 2392β2392.
Alshraideh, M. et al., 2024. Enhancing Heart Attack Prediction with Machine Learning: A Study at Jordan University Hospital. Applied Computational Intelligence and Soft Computing.
Anand, N., 2018. Heart Attack Prediction. [Online] Available at: https://www.kaggle.com/datasets/imnikhilanand/heart-attack-prediction [Accessed 10 September 2024].
Arooj, S. et al., 2022. A Deep Convolutional Neural Network for the Early Detection of Heart Disease. Biomedicines.
Chan, K. Y. et al., 2023. Deep neural networks in the cloud: Review, applications, challenges and research directions. Neurocomputing.
Chellammal, S. & Sharmila, R., 2019. Recommendation of Attributes for Heart Disease Prediction using Correlation Measure. International Journal of Recent Technology and Engineering, 8(2S3), pp. 870β875.
Damarla, R., 2020. Heart Disease Prediction. [Online] Available at: https://www.kaggle.com/datasets/rishidamarla/heart-disease-prediction/data [Accessed 10 September 2024].
Hossain, M. I. et al., 2023. Heart disease prediction using distinct artificial intelligence techniques: performance analysis and comparison.
Lapp, D., 2019. Heart Disease Dataset. [Online] Available at: https://www.kaggle.com/datasets/johnsmith88/heart-disease-dataset [Accessed 10 September 2024].
Nandal, N., 2022. heart.csv. [Online] Available at: https://figshare.com/articles/dataset/heart_csv/20236848?file=36169122 [Accessed 10 September 2024].
Rabbi, M. F. et al., 2018. Performance Evaluation of Data Mining Classification Techniques for Heart Disease Prediction. Journal of Engineering Research.
Rahman, R., 2021. Heart Attack Analysis & Prediction Dataset. [Online] Available at: https://www.kaggle.com/datasets/rashikrahmanpritom/heart-attack-analysis-prediction-dataset/data [Accessed 10 September 2024].
scikit-learn, n.d. [Online] Available at: https://scikit-learn.org/stable/modules/tree.html [Accessed 14 September 2024].
World Health Organization, 2021. Cardiovascular diseases (CVDs). [Online] Available at: https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds) [Accessed 10 September 2024].
Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming aΒ sponsor.
Published via Towards AI