Research Article | Volume 14 Issue 6 (Nov - Dec, 2024) | Pages 391 - 399
Risk of Diabetes Disease Prediction Using Machine Learning Approach
1
Department Of Statistics, B. N. College, Patna University, Patna, Bihar. India
2
Department Of Psychology, D.D.U. Gorakhpur University Gorakhpur, UP, India
Under a Creative Commons license
Open Access
DOI : 10.5083/ejcm
Received
Oct. 9, 2024
Revised
Oct. 26, 2024
Accepted
Nov. 18, 2024
Published
Dec. 2, 2024
Abstract

Machine learning is a standard and still-evolving approach that offers efficient algorithms for classification and recognition through recursive learning. We argue that machine learning makes it possible to build and verify a classification system that, at a human level, could be called 'intelligent'. In disease forecasting, machine learning has performed remarkably well, provided that suitable training and test cases are available. This study introduces a novel approach to predicting diabetes using machine learning classification, based on factors that contribute to an individual's diabetes risk. The dataset contains 768 instances and 9 attributes, including common risk factors such as age, glucose and BMI. Six methods were used: logistic regression, random forest, KNN, support vector, decision tree and Naïve Bayes. The accuracy of these algorithms on the training data set was 77%, 100%, 81%, 81%, 100% and 74%, respectively.

INTRODUCTION

Diabetes is a medical condition characterized by an excessive buildup of glucose (sugar) in the blood. Over time, excessive blood glucose can harm various body tissues. For example, microangiopathy can progress to macroangiopathy, with a high risk of coronary heart disease and stroke and of damage to the kidneys, eyes, gums, feet and nerves. People with diabetes experience symptoms of heart disease earlier than those without it, and they have nearly twice the risk of heart disease or stroke compared with people without diabetes.

 

Diabetes is an illness that one can live with once treatment has been administered and the disease is under control. According to the WHO (2019), diabetes is ‘a disease characterized by hyperglycemia due to insulin deficiency or its insufficient effect’. This deficiency may be either genetic or acquired. The resulting hyperglycemia damages many systems within the body, most commonly the blood vessels and nerves. Chronic diseases such as diabetes are rarely cured; therefore, an integrated approach is needed to reduce the risk of complications by managing and controlling the disease over a long period. Behavioral changes, pharmacotherapy and continuous follow-up are necessary to keep blood glucose at optimal targets and to reduce other related health risks.

 

The American Diabetes Association (ADA; Cahn et al., 2014) further explains that diabetes is a group of diseases of metabolic dysfunction, characterized by an increased blood sugar level and by an abnormality in the production or utilization of carbohydrates, or both. Diabetes has many long-term complications, but the most significant is chronic high blood sugar and its progressive effects, which cause long-term damage to the eyes, heart, blood vessels, nerves and kidneys. In view of this, it is clear that the need to manage type 2 diabetes cannot be completely eliminated at present. The ADA has addressed this issue, stating that management and thorough care are essential to avoid or postpone the onset of these complications, which makes diabetes a complex ailment that should be handled with the utmost caution. Chichwa (2009) states that diabetes is a long-lasting ailment in which the health of those afflicted is worsened by increased blood sugar, which can produce a wide range of clinical manifestations if it is neglected or poorly managed. It arises from insufficient production of insulin, the key hormone controlling blood glucose, by the pancreas, or from ineffective action of the insulin that the body produces. Once the cause is identified, further management may entail lifestyle changes that can be a great inconvenience in patients' ordinary lives.

 

Diabetic patients depend on treatments and other strategies to control their disease, protect their health and limit the risk of further complications, and this poses an enormous challenge in their everyday lives. Living with diabetes imposes a considerable burden, since patients need to keep checking their sugar levels, eat properly, exercise and take medication whenever necessary. This includes controlling the amount of carbohydrates eaten, checking glucose levels 3 to 7 times a day and adjusting insulin doses. Diabetic patients also need to see their physician for periodic examinations to revise treatment schedules, rule out risk factors for potential organ or tissue damage (kidneys, eyes, heart) and manage common accompanying conditions (high blood pressure, raised cholesterol levels, nerve damage). At this stage of diabetes management, foot ulcers, inflammatory diseases or impaired brain function may appear, making therapy more complicated. Addressing all the diseases a patient has, and giving equal attention to every health condition, is therefore of the utmost importance. On the other hand, many patients with diabetes lead active lives despite their glycemia, thanks to diet, medication and devices such as insulin pumps and continuous glucose monitors.

MATERIALS AND METHODS

In the present study, several machine learning classification algorithms are used. We applied logistic regression, the k-nearest neighbor (KNN) algorithm, decision tree, random forest and the Naïve Bayes algorithm in Python to the data set to obtain predictions, accuracy, recall and precision. Fitting the algorithm to the training set: the training data is the largest subset of the original dataset and is used to train, or fit, the machine learning model. In our model, 80% of the data was used for the training set and 20% for the test set.
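The 80/20 split described above can be sketched in plain Python. This is a minimal illustration and not the authors' code: the helper name `split_indices`, the seed value and the choice to split row indices rather than a loaded data frame are all assumptions made for the example.

```python
import math
import random


def split_indices(n_rows, test_fraction=0.2, seed=42):
    """Shuffle row indices and split them into train and test lists.

    Hypothetical helper for illustration; the seed is arbitrary.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_test = math.ceil(n_rows * test_fraction)  # round the test share up
    return indices[n_test:], indices[:n_test]


# 768 instances, as stated in the abstract.
train_idx, test_idx = split_indices(768)
print(len(train_idx), len(test_idx))
```

With 768 rows this gives 614 training rows and 154 test rows. In practice a library routine such as scikit-learn's `train_test_split` would normally replace the hand-written helper.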

 

2.1 Confusion Matrix in Machine Learning

A confusion matrix (or error matrix) is a visualization method for the results of a classifier algorithm. More specifically, it is a table that breaks down the number of ground-truth instances of a specific class against the number of predicted instances of that class. Confusion matrices are one of several evaluation tools for measuring the performance of a classification model. They can be used to calculate a number of other model performance metrics, such as precision and recall, among others.

 

Confusion matrices can be used with any classifier algorithm, such as Naïve Bayes, logistic regression models, decision trees, and so on. Because of their wide applicability in data science and machine learning, many packages and libraries come preloaded with functions for creating confusion matrices, such as scikit-learn’s sklearn.
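To make the relationship between a confusion matrix and the derived metrics concrete, the following sketch counts the four cells of a binary confusion matrix and computes accuracy, precision and recall from them. The function names and the toy labels are hypothetical and are not taken from the study's data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Derive accuracy, precision and recall from the four counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy labels (hypothetical, not from the diabetes data set).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)
print(metrics(*counts))
```

Precision answers "of the cases predicted positive, how many were truly positive", while recall answers "of the truly positive cases, how many were found".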

 

2.2 Histogram of the Data Set

 

2.6 Correlations among the Data Set

 

 

CLASSIFICATION REPORTS

 

3.1 Confusion Matrix for the Different Classification Algorithms

     

3.2 ROC Curve for the Classification Algorithms

 

 

 

 

RESULTS

| Algorithm | Training accuracy (%) | Test accuracy (%) |
|---|---|---|
| Logistic Regression | 77.36156351791531 | 77.27272727272727 |
| KNN | 81.10749185667753 | 74.67532467532467 |
| Support Vector | 81.92182410423453 | 83.1168831168831 |
| Decision tree | 100.0 | 79.87012987012987 |
| Random Forest | 100.0 | 81.16883116883116 |
| Naïve Bayes | 74.2671009771987 | 74.02597402597402 |
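The two accuracy values reported for each algorithm can be compared directly. The sketch below assumes that the first value is the accuracy on the training set (as the abstract indicates) and that the second is the accuracy on the 20% test split; that reading of the second value is an assumption, since the source does not label it. The variable names are illustrative.

```python
# Accuracy values (percent) copied from the results above:
# (first value, second value) per algorithm. The first is assumed to be
# training accuracy and the second test accuracy.
accuracy = {
    "Logistic Regression": (77.36156351791531, 77.27272727272727),
    "KNN": (81.10749185667753, 74.67532467532467),
    "Support Vector": (81.92182410423453, 83.1168831168831),
    "Decision tree": (100.0, 79.87012987012987),
    "Random Forest": (100.0, 81.16883116883116),
    "Naïve Bayes": (74.2671009771987, 74.02597402597402),
}

# Gap between the two values, in percentage points.
gaps = {name: train - test for name, (train, test) in accuracy.items()}

# List the algorithms from largest to smallest gap.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:20s} gap = {gap:6.2f}")

# Algorithm with the highest second-column (assumed test) accuracy.
best_test = max(accuracy, key=lambda name: accuracy[name][1])
print("Highest test accuracy:", best_test)
```

Under that assumption, a large positive gap (as for the decision tree and random forest, which reach 100% on the training set) is often read as a sign that a model fits the training data more closely than it generalizes to unseen data.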

 

3.4 Accuracy Comparison of the Models

| Algorithm | Precision Score (%) | Recall Score (%) | ROC AUC Score |
|---|---|---|---|
| Logistic Regression | 75.0000 | 57.8947 | 0.7327 |
| KNN | 68.75 | 57.89473 | 0.7121 |
| Support Vector | 86.04651 | 64.9122 | 0.79363 |
| Decision tree | 69.4915 | 71.9298 | 0.7668 |
| Random Forest | 72.5490 | 64.9122 | 0.7523 |
| Naïve Bayes | 65.4544 | 63.1578 | 0.7178 |
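For readers unfamiliar with the ROC AUC score, it can be understood as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by direct pair counting. It is an illustration only: the function name `roc_auc` and the toy scores are hypothetical, and the study itself presumably relied on a library routine.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for six patients.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
print(roc_auc(y_true, scores))
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes. When hard 0/1 predictions are passed instead of probabilities, the same calculation reduces to the average of the true positive rate and the true negative rate.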

CONCLUSION

Detecting diabetes at an early stage is an important real-world medical problem. In this study, a systematic effort was made to design a system that predicts diabetes. Six machine learning classification algorithms were studied and evaluated on various measures: logistic regression, random forest, KNN, support vector, decision tree and Naïve Bayes. We find that decision tree, random forest and support vector give better accuracy than the others.
