Empirical Evaluation of Different Algorithms to Assess The Probability of Diabetes in its Early Stages

Rania Ashraf*; Roz Nisha; Fahad Shamim; Shahzad Nasim; Sarmad Shams



Volume 5 Issue 2 2024
Author(s):	Rania Ashraf*, Liaquat University of Medical and Health Sciences, Jamshoro, Pakistan, raniya844@yahoo.com; Roz Nisha, Liaquat University of Medical and Health Sciences, Jamshoro, Pakistan, rosenisha734@gmail.com; Fahad Shamim, Liaquat University of Medical and Health Sciences, Jamshoro, Pakistan, fahad.shamim@lumhs.edu.pk; Shahzad Nasim, The Begum Nusrat Bhutto Women University, Sukkur, Pakistan, shahzadnasim@live.com; Sarmad Shams, Liaquat University of Medical and Health Sciences, Jamshoro, Pakistan, sarmad.shams@lumhs.edu.pk
Abstract	High blood sugar is a symptom of diabetes, an incurable and fatal metabolic disorder. The primary cause of the disease is a hormone imbalance that impairs insulin function. Insulin is the hormone that regulates the uptake of sugar from the blood. In diabetes, the body either cannot make sufficient insulin or cannot adequately use the insulin it produces. Almost 1.6 million people die each year from the disease. Early diagnosis can help reduce its severity and extend life expectancy. Because the medical data of diabetic individuals display a recognizable pattern, diabetes can be predicted in its early stages using machine learning algorithms, which offers a route to early diagnosis that does not rely on a glucose screening test. This paper predicts early-stage diabetes using machine learning. Eight machine learning algorithms were evaluated individually on a dataset of 521 instances with 17 features. Each model is assessed not only by accuracy and its confusion matrix but also by AUC, F-score, recall, precision, true positive rate (TPR), and false positive rate (FPR), to give a fuller picture of its performance. The results are validated using 5-fold cross-validation. The AdaBoost classifier has the lowest accuracy, at 82.89%. Random Forest (RF) and Support Vector Machine (SVM) share the highest accuracy, at 93.4%. However, SVM achieves a higher F-score and recall than RF, making it the best-fitting classifier in this study. The remaining classifiers also performed well, each with an accuracy above 80%. The findings indicate that the SVM classifier is the most effective of the techniques evaluated for this binary classification task and can be used to predict early-stage diabetes.
Keywords	Diabetes, Decision Tree, Naïve Bayes, Random Forest.
Year	2024
Volume	5
Issue	2
Type	Research paper, manuscript, article
Recognized by	Higher Education Commission of Pakistan, HEC
Category
Journal Name	ILMA Journal of Technology & Software Management
Publisher Name	ILMA University
Jel Classification	--
DOI	-
ISSN no (E, Electronic)	2790-590X
ISSN no (P, Print)	2709-2240
Country	Pakistan
City	Karachi
Institution Type	University
Journal Type	Open Access
Manuscript Processing	Blind Peer Reviewed
Format	PDF
Paper Link	https://ijtsm.ilmauniversity.edu.pk/arc/Vol5/i2/pdf1.pdf
Page	1-10
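
The abstract notes that two classifiers can share the same accuracy and still differ in recall and F-score. A minimal Python sketch of how those metrics follow from the four cells of a binary confusion matrix is given below. It is an illustration, not the authors' code: the names `binary_metrics`, `BinaryMetrics`, `model_a`, and `model_b` are hypothetical, and the counts are invented for the example rather than taken from the paper. AUC is left out because it needs the classifier's scores, not just the confusion matrix.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float   # identical to the true positive rate (TPR)
    fpr: float      # false positive rate
    f_score: float  # F1: harmonic mean of precision and recall


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> BinaryMetrics:
    """Derive the metrics from the four confusion-matrix cells."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, tp + fp + fn + tn),
        precision=precision,
        recall=recall,
        fpr=_ratio(fp, fp + tn),
        f_score=_ratio(2 * precision * recall, precision + recall),
    )


# Hypothetical counts (NOT from the paper): both models classify
# 92 of 100 cases correctly, but they make different kinds of errors.
model_a = binary_metrics(tp=60, fp=6, fn=2, tn=32)  # fewer missed positives
model_b = binary_metrics(tp=56, fp=2, fn=6, tn=36)  # fewer false alarms

for name, m in (("model_a", model_a), ("model_b", model_b)):
    print(f"{name}: acc={m.accuracy:.3f} recall={m.recall:.3f} "
          f"precision={m.precision:.3f} fpr={m.fpr:.3f} f={m.f_score:.3f}")
```

In this made-up case both models have the same accuracy, but `model_a` misses fewer positive cases and so has the higher recall and F-score. For a screening task, where a missed case costs more than a false alarm, that is a reason to prefer such a model despite the tie on accuracy.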

