Heart Disease Prediction Using Hybrid Model

The document summarizes a study that aims to predict heart disease using a hybrid machine learning model. The study uses the UCI heart disease dataset containing 303 samples with 14 clinical attributes. Three algorithms are applied to the data - Random Forest, Decision Tree, and a hybrid model combining Random Forest and Decision Tree. The experimental results show that the hybrid model achieves the highest prediction accuracy of 85.61%, demonstrating that hybrid models perform better than individual classification models for this task.


e-ISSN: 2582-5208

International Research Journal of Modernization in Engineering Technology and Science


( Peer-Reviewed, Open Access, Fully Refereed International Journal )
Volume:04/Issue:10/October-2022 Impact Factor- 6.752 www.irjmets.com

HEART DISEASE PREDICTION USING HYBRID MODEL


Shivam Garg*1, Rohan Gupta*2
*1 Research Student, Department of Computer Science, Delhi Technological University, Delhi, India
*2 Research Student, Department of Computer Science, Delhi Technological University, Delhi, India

ABSTRACT
Heart disease has emerged as a serious health concern worldwide because of its high mortality rate. Detecting cardiovascular disorders such as heart attacks and coronary artery disease through routine clinical data analysis is a critical task, and early detection may save many lives. The application of machine learning techniques in the medical sector has advanced significantly. Many researchers have tried to predict heart disease using standard classification algorithms with feature selection; others have employed hyperparameter optimization using GridSearchCV or ensemble voting techniques. This work proposes a hybrid machine learning approach to predict heart disease. The UCI heart disease dataset is used, and data mining techniques including regression and classification are applied. Three machine learning algorithms are implemented with hyperparameter tuning: Random Forest (RF), Decision Tree (DT), and a Hybrid Model (a hybrid of RF and DT). According to the experimental findings, the hybrid model predicts heart disease with an accuracy of 85.61%, and we conclude that hybrid models perform better than the standard classification models.
Keywords: UCI Heart Disease Dataset, Decision Trees, Random Forest, Hybrid algorithm, Machine learning,
Hyperparameter optimization.
I. INTRODUCTION
Data mining helps in studying and understanding large volumes of data. It is used to extract information and to support decisions about further applications. Clustering, association rule mining, and classification are the most commonly used data mining methods, and each can be implemented with a wide variety of algorithms.
In medical diagnostics, where computer analysis may reduce manual error and increase accuracy, the use of machine learning is expanding rapidly. Machine learning techniques are used to predict diseases such as heart disease, liver disease, diabetes, and tumors. Algorithms such as Random Forest, lasso, and logistic regression have been employed in the medical sector.
According to survey results, cardiovascular diseases (CVD) account for close to 17 million deaths annually. Early identification of the disease, combined with patients taking their prescribed medications on schedule, can reduce mortality and save many lives. This study uses an automated medical diagnosis method to predict heart disease using machine learning.

Figure 1: Block diagram of heart disease prediction


www.irjmets.com @International Research Journal of Modernization in Engineering, Technology and Science
We choose the hybrid model as the classification technique for predicting heart disease because it feeds the probabilities obtained from one machine learning model as input to the other model. This model, which is the one implemented here, gives better-optimized results by drawing on both machine learning techniques. In this paper, we use diagnostic records of heart conditions from the heart disease database. The database contains mixed data types, with both numerical and categorical attributes. Before further processing, the entries are filtered and cleaned to remove extraneous data.
The objective of the study is binary classification: each record is labelled 0 (absence of heart disease) or 1 (presence of heart disease). Patients may seek treatment based on the outcomes of the proposed model, and the application aids in planning patient care.
II. LITERATURE SURVEY

Author: Palaniappan, Sellappan, and Rafiah Awang
Year:
Aim:
Dataset Used:
Methodology: Combination of DT, NB, and NN data mining approaches.
Result: Each technique has a special advantage in achieving the specified mining goals.
Limitation:
Researchers have produced a wide variety of recent work on heart disease analysis and prediction. A few examples are described below.
Palaniappan, Sellappan, and Rafiah Awang [1] created a prototype Intelligent Heart Disease Prediction System (IHDPS) by combining DT, NB, and NN data mining approaches. Their results show that each technique has a particular advantage in achieving the specified mining goals.
Hashi, E.K. and Zaman [4] suggested a cognitive strategy for heart disease prediction. Five machine learning algorithms are considered for prediction in their work, and each is evaluated. To improve prediction outcomes, a logistic model tree is applied.
Dr. M. Kavitha, G. Gnaneswar, R. Dinesh, Y. Rohith Sai, and R. Sai Suraj [6] developed a hybrid model that combines decision tree and random forest and outperforms the individual models.
A. Lakshmanarao, A. Srisaila, and T. Srinivasa Ravi Kiran [7] proposed an ensemble classifier model for heart disease prediction. Various classifiers are applied together with sampling techniques, and a good detection rate is achieved with the ensemble classifier.
III. METHODOLOGY
Data Sources
This paper uses the heart disease data from the UCI Machine Learning Repository, a dataset frequently used by machine learning researchers. It contains 303 examples, 164 of which have heart disease and 139 of which are healthy, described by 14 clinical attributes.

Figure 2: Data Visualization of heart Disease
Data description
The attributes in the dataset fall into three categories: input, key, and predictable attributes.
Table 1: Description of 14 input attributes

Sr.no  Attribute  Description                                       Values
1.     Age        Age (in years)                                    Continuous
2.     Sex        Male or female                                    1 = male; 0 = female
3.     Cp         Chest pain type                                   1 = typical angina; 2 = atypical angina; 3 = non-anginal pain; 4 = asymptomatic
4.     Trestbps   Resting blood pressure                            Continuous value in mm Hg
5.     Chol       Serum cholesterol                                 Continuous value in mg/dl
6.     Restecg    Resting electrocardiographic results              0 = normal; 1 = ST-T wave abnormality; 2 = left ventricular hypertrophy
7.     Fbs        Fasting blood sugar                               1 = > 120 mg/dl; 0 = ≤ 120 mg/dl
8.     Thalach    Maximum heart rate achieved                       Continuous value
9.     Exang      Exercise-induced angina                           0 = no; 1 = yes
10.    Oldpeak    Exercise-induced ST depression compared to rest   Continuous value
11.    Slope      Slope of the peak exercise ST segment             1 = upsloping; 2 = flat; 3 = downsloping
12.    Ca         Number of large vessels colored by fluoroscopy    0-3 value
13.    Thal       Defect type                                       3 = normal; 6 = fixed defect; 7 = reversible defect
14.    Target     Heart disease diagnosis                           1 = heart disease; 0 = no heart disease

Data cleaning: Data cleaning is the first and most crucial step in the project's methods and data models. An organised dataset is built from the gathered data: the fields are identified, duplicates are removed, missing values are filled in, and the data are then coded according to each attribute's value domain.
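The cleaning steps described above can be sketched with pandas as follows. This is a minimal illustration under stated assumptions: the records are invented toy values (not UCI data), and the imputation choices (median for continuous fields, mode for categorical fields) are hypothetical, since the paper does not state which rules were used.

```python
import numpy as np
import pandas as pd

# Hypothetical toy records using a few Table 1 column names;
# the values are invented for illustration, not taken from the UCI data.
raw = pd.DataFrame({
    "age":    [63, 63, 41, 57, 52],
    "sex":    [1, 1, 0, 1, 0],
    "chol":   [233.0, 233.0, np.nan, 354.0, 199.0],
    "ca":     [0.0, 0.0, 1.0, np.nan, 2.0],
    "target": [1, 1, 0, 1, 0],
})

# Step 1: remove exact duplicate rows (the second record repeats the first).
clean = raw.drop_duplicates().reset_index(drop=True)

# Step 2: fill missing values (assumed rule: median for continuous
# attributes, most frequent value for categorical attributes).
clean["chol"] = clean["chol"].fillna(clean["chol"].median())
clean["ca"] = clean["ca"].fillna(clean["ca"].mode().iloc[0])

# Step 3: code the categorical attribute as integers within its value domain.
clean["ca"] = clean["ca"].astype(int)

print(clean)
```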
Hyperparameter Optimization: Hyperparameter optimization, or tuning, is the selection of the best set of hyperparameters for a learning algorithm. It seeks the tuple of hyperparameters that yields the model minimizing a predetermined loss function on given independent data.
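As an illustration of such a search, the sketch below runs scikit-learn's GridSearchCV over a decision tree. The data are a synthetic stand-in with the same shape as the dataset (303 samples, 13 input attributes), and the grid values are hypothetical, since the paper does not report the values it searched. Maximizing cross-validated accuracy here plays the role of minimizing the 0-1 loss.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 303 x 13 feature matrix (not the real UCI data).
X, y = make_classification(
    n_samples=303, n_features=13, n_informative=6, random_state=0
)

# Hypothetical search grid: every combination of these values is tried.
param_grid = {
    "max_depth": [2, 3, 5, None],
    "min_samples_leaf": [1, 5, 10],
    "criterion": ["gini", "entropy"],
}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,                # 5-fold cross-validation on the supplied data
    scoring="accuracy",
)
search.fit(X, y)

best_params = search.best_params_  # the selected tuple of hyperparameters
best_score = search.best_score_    # its mean cross-validated accuracy
print(best_params, round(best_score, 3))
```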
Classification algorithm:
The design of the suggested system for predicting heart disease using machine learning algorithm models is
depicted in Figure 3 and is briefly explained below.


Figure 3: heart disease prediction system architecture


Decision Tree:
The decision tree is one of the learning models used to solve classification problems. The method recursively splits the dataset into two or more subsets. Each internal node of a decision tree represents a test on a feature, each branch represents an outcome of the test, and each leaf holds the resulting decision.
Random Forest:
The random forest method combines the decisions of numerous trees into one. It builds decision trees from different samples of the data and uses the majority vote of the trees for classification and the average of their outputs for regression. Random forests can handle both categorical and continuous variables.
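The two aggregation rules can be shown with a short sketch. The per-tree predictions below are invented, and the function names are hypothetical helpers rather than part of any library. Note that library implementations may differ in detail; scikit-learn's random forest classifier, for instance, averages the trees' class probabilities instead of counting hard votes.

```python
from collections import Counter
from statistics import mean

def forest_classify(tree_votes):
    """Return the class label predicted by the majority of the trees."""
    return Counter(tree_votes).most_common(1)[0][0]

def forest_regress(tree_outputs):
    """Return the average of the numeric outputs of the trees."""
    return mean(tree_outputs)

# Invented predictions from five trees for a single patient record.
votes = [1, 0, 1, 1, 0]
outputs = [0.9, 0.2, 0.7, 0.8, 0.4]

print(forest_classify(votes))   # 1 (three of the five trees vote for class 1)
print(forest_regress(outputs))
```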
Hybrid Model:
We created a hybrid model from the decision tree and random forest algorithms. The combined model operates on the random forest probabilities: the decision tree algorithm receives the training data together with the probabilities produced by the random forest. In a similar manner, the test data are augmented with the corresponding probabilities, and the final values are then predicted.
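One way to read this description is sketched below with scikit-learn. It is an interpretation under stated assumptions, not the exact implementation used in this study: the data are synthetic, the hyperparameter values are hypothetical, and the random forest probability is assumed to be appended as one extra feature to both the training and the test data before the decision tree is fitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (not the UCI records), split 70% / 30%.
X, y = make_classification(
    n_samples=303, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Stage 1: the random forest estimates P(class 1) for every sample.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
p_train = rf.predict_proba(X_train)[:, 1]
p_test = rf.predict_proba(X_test)[:, 1]

# Stage 2: the decision tree receives the original features plus
# the random forest probability as an additional column.
X_train_aug = np.column_stack([X_train, p_train])
X_test_aug = np.column_stack([X_test, p_test])
dt = DecisionTreeClassifier(max_depth=3, random_state=0)  # assumed depth
dt.fit(X_train_aug, y_train)

hybrid_accuracy = dt.score(X_test_aug, y_test)
print(round(hybrid_accuracy, 3))
```

The training-set probabilities in this sketch come from a forest that has already seen those labels, so they are optimistic. A more cautious variant would generate them out of fold, for example with `sklearn.model_selection.cross_val_predict(..., method="predict_proba")`.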
The proposed workflow is executed as follows:
A. The dataset is gathered from uci.edu.
B. Data visualization, preprocessing, and cleaning are performed.
C. The data are separated into training and test sets.
D. Decision Tree (DT) and Random Forest (RF) models with hyperparameter tuning are first trained and tested, and the results are analysed.
E. The hybrid model is built by combining Decision Tree (DT) and Random Forest (RF): the decision tree algorithm receives the training data together with the probabilities from the random forest.
F. The model receives a single input from the user and predicts heart disease using the hybrid model. These results are then compared against the standard classification models.

The dataset comes from UCI and is divided into training and test sets. We used 70% of the dataset as training data to fit the models and the remaining 30% as test data for predicting heart disease. The DT, RF, and Hybrid models are used to predict heart disease on the 30% test set, and the predicted values are plotted and compared for accuracy.
IV. RESULTS AND DISCUSSION
To add novelty to the work, we implemented a hybrid model of Decision Tree and Random Forest. The results show that heart disease detection is effective using the hybrid model: the hyper-tuned Decision Tree achieves around 75% accuracy, the hyper-tuned Random Forest achieves 82.4%, and the Hybrid model achieves 85.6%.
Table 2: Classification report of the models

Sr.no  Model                     Accuracy  Precision  F1-score  Recall
1.     Decision Tree Classifier  75%       74%        73%       74%
2.     Random Forest Classifier  82.4%     84%        83%       82%
3.     Hybrid Model (DT+RF)      85.6%     84.5%      84.5%     85%

Figure 4: Classification report of Decision Tree

Figure 5: Classification report of Random Forest

Figure 6: Classification report of Hybrid Model (DT+RF)

The hyper-tuned Decision Tree achieves around 74% precision, 73% F1-score, and 74% recall; the hyper-tuned Random Forest achieves 84% precision, 83% F1-score, and 82% recall; and the Hybrid model achieves 84.5% precision, 84.5% F1-score, and 85% recall. The Hybrid model therefore leads on every metric compared with the DT and RF models.
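For reference, the four reported metrics can be computed from true and predicted labels as in the sketch below. The labels are invented for illustration and do not reproduce the figures in Table 2; the function name is a hypothetical helper, and the metrics are computed for the positive class (label 1) only, whereas a classification report may also show per-class or averaged values.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1-score for the positive class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Invented labels: 1 = heart disease, 0 = no heart disease.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```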
V. CONCLUSION
Heart disease is one of the potentially fatal diseases prevalent around the world. The risk increases as a result of changing lifestyles and a lack of physical activity. The medical sector offers a variety of diagnostic procedures; however, machine learning is thought to be the best option in terms of accuracy. The suggested approach employs a hybrid model that combines Decision Tree and Random Forest for the prediction of heart disease. For this investigation, the heart disease data from the UCI Machine Learning Repository are used. We obtained a higher accuracy, 85.61%, using the hybrid model than with the individual models. This study used a small dataset of 303 entries and only two standard classification models, RF and DT. In future, the model can be improved using a larger dataset, and other models can be employed and compared to find the most accurate one.
VI. REFERENCES
[1] “A Knowledge-Based Clinical Decision Support System Utilizing an Intelligent Ensemble Voting Scheme for Improved Cardiovascular Disease Prediction.” IEEE Journals & Magazine, IEEE Xplore, ieeexplore.ieee.org/document/9530429. Accessed 15 Nov. 2022.
[2] “Efficient Medical Diagnosis of Human Heart Diseases Using Machine Learning Techniques With and Without GridSearchCV.” IEEE Journals & Magazine, IEEE Xplore, ieeexplore.ieee.org/abstract/document/9751602. Accessed 15 Nov. 2022.

[3] S. Mohan, C. Thirumalai and G. Srivastava, "Effective Heart Disease Prediction Using Hybrid Machine
Learning Techniques," in IEEE Access, vol. 7, pp. 81542-81554, 2019, doi: 10.1109/ACCESS.2019.2923707.
[4] Hashi, Emrana Kabir, and Md. Shahid Uz Zaman. “Developing a Hyperparameter Tuning Based Machine Learning Approach of Heart Disease Prediction.” Academia.edu, www.academia.edu/53358757/Developing_a_Hyperparameter_Tuning_Based_Machine_Learning_Approach_of_Heart_Disease_Prediction. Accessed 15 Nov. 2022.
[5] T., Mythili, et al. “A Heart Disease Prediction Model Using SVM-Decision Trees-Logistic Regression (SDL).” Academia.edu, www.academia.edu/56949651/A_Heart_Disease_Prediction_Model_using_SVM_Decision_Trees_Logistic_Regression_SDL_. Accessed 15 Nov. 2022.
[6] Modepalli, Kavitha & Gnaneswar, G. & Dinesh, R. & Sai, Y. & Suraj, R. (2021). Heart Disease Prediction using Hybrid machine Learning Model. pp. 1329-1333, doi: 10.1109/ICICT50816.2021.9358597.
[7] “Heart Disease Prediction Using Feature Selection and Ensemble Learning Techniques.” IEEE Conference Publication, IEEE Xplore, ieeexplore.ieee.org/document/9388482. Accessed 15 Nov. 2022.
[8] “HDPF: Heart Disease Prediction Framework Based on Hybrid Classifiers and Genetic Algorithm.” IEEE Journals & Magazine, IEEE Xplore, ieeexplore.ieee.org/document/9585496. Accessed 15 Nov. 2022.
[9] “An Effective Heart Disease Detection and Severity Level Classification Model Using Machine Learning and Hyperparameter Optimization Methods.” IEEE Journals & Magazine, IEEE Xplore, ieeexplore.ieee.org/document/9831786. Accessed 15 Nov. 2022.
[10] “Heart Disease Identification Method Using Machine Learning Classification in E-Healthcare.” IEEE Journals & Magazine, IEEE Xplore, ieeexplore.ieee.org/document/9112202. Accessed 15 Nov. 2022.
[11] “Comparative Study of Optimum Medical Diagnosis of Human Heart Disease Using Machine Learning Technique With and Without Sequential Feature Selection.” IEEE Journals & Magazine, IEEE Xplore, ieeexplore.ieee.org/document/9718089. Accessed 15 Nov. 2022.
[12] “An Integrated Machine Learning Framework for Effective Prediction of Cardiovascular Diseases.” IEEE Journals & Magazine, IEEE Xplore, ieeexplore.ieee.org/document/9491140. Accessed 15 Nov. 2022.
[13] Cardiovascular Disease dataset (Cleveland), https://archive.ics.uci.edu/ml/machine-learning-databases/heart-disease/.
