Empirical Evaluation of Different Algorithms to Assess The Probability of Diabetes in its Early Stages
Volume 5 Issue 2 2024 | |
---|---|
Author(s) | Rania Ashraf*, Liaquat University of Medical and Health Sciences, Jamshoro, Pakistan, raniya844@yahoo.com; Roz Nisha, Liaquat University of Medical and Health Sciences, Jamshoro, Pakistan, rosenisha734@gmail.com; Fahad Shamim, Liaquat University of Medical and Health Sciences, Jamshoro, Pakistan, fahad.shamim@lumhs.edu.pk; Shahzad Nasim, The Begum Nusrat Bhutto Women University, Sukkur, Pakistan, shahzadnasim@live.com; Sarmad Shams, Liaquat University of Medical and Health Sciences, Jamshoro, Pakistan, sarmad.shams@lumhs.edu.pk |
Abstract | High blood sugar is a symptom of diabetes, an incurable and fatal metabolic disorder. The primary cause of the disease is a hormone imbalance that impairs insulin function. Insulin is the hormone that regulates the uptake of sugar from the blood. As a result, the body either cannot make sufficient insulin or cannot adequately use the insulin it produces. Almost 1.6 million people die each year from this disease. Early diagnosis can help reduce its severity and increase life expectancy. Since the medical data of diabetic individuals display a recognizable pattern, diabetes can be predicted in its early stages using machine learning algorithms. This offers another route to early diagnosis without a glucose screening test. This paper predicts early-stage diabetes using machine learning. The study evaluated eight machine learning algorithms individually on a dataset of 521 instances with 17 features. Each model is assessed not only with accuracy and the confusion matrix; AUC, F-score, recall, precision, true positive rate (TPR), and false positive rate (FPR) are also examined with a view to improving the algorithms' performance. The results are validated using 5-fold cross-validation. The AdaBoost classifier has the lowest accuracy, 82.89%. Random Forest (RF) achieves the best score, 93.4, and Support Vector Machine (SVM) matches it at 93.4. However, SVM has a higher F-score and recall than RF, making it the best-fit classifier for this study. The remaining classifiers also performed well, each with an accuracy of more than 80%. The findings indicate that the SVM classifier is the most effective machine learning technique for binary classification datasets and can be used to predict early-stage diabetes. |
Keywords | Diabetes, Decision Tree, Naïve Bayes, Random Forest. |
Year | 2024 |
Volume | 5 |
Issue | 2 |
Type | Research paper, manuscript, article |
Recognized by | Higher Education Commission of Pakistan, HEC |
Category | |
Journal Name | ILMA Journal of Technology & Software Management |
Publisher Name | ILMA University |
JEL Classification | -- |
DOI | - |
ISSN no (E, Electronic) | 2790-590X |
ISSN no (P, Print) | 2709-2240 |
Country | Pakistan |
City | Karachi |
Institution Type | University |
Journal Type | Open Access |
Manuscript Processing | Blind Peer Reviewed |
Format | |
Paper Link | https://ijtsm.ilmauniversity.edu.pk/arc/Vol5/i2/pdf1.pdf |
Page | 1-10 |
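
The abstract ranks classifiers by accuracy but separates RF from SVM by F-score and recall. As a minimal sketch of how those metrics relate, the Python below derives accuracy, precision, recall (TPR), FPR, and F-score from the four cells of a binary confusion matrix. The class name, function name, and the example counts are hypothetical illustrations and are not taken from the paper; the positive class is assumed to mean "diabetic".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Cell counts for a binary classifier (positive = diabetic, an assumed labelling)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(cm: ConfusionMatrix) -> dict:
    """Compute the metrics named in the abstract from confusion-matrix counts."""
    precision = _ratio(cm.tp, cm.tp + cm.fp)
    recall = _ratio(cm.tp, cm.tp + cm.fn)  # recall is the true positive rate
    return {
        "accuracy": _ratio(cm.tp + cm.tn, cm.tp + cm.fp + cm.tn + cm.fn),
        "precision": precision,
        "recall": recall,
        "tpr": recall,
        "fpr": _ratio(cm.fp, cm.fp + cm.tn),
        # F-score here is F1, the harmonic mean of precision and recall.
        "f_score": _ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts for one held-out fold; not results from the study.
example = ConfusionMatrix(tp=60, fp=5, tn=35, fn=4)
metrics = binary_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Two classifiers can tie on accuracy while differing in how their errors split between false positives and false negatives, which is why recall and F-score can serve as tie-breakers. In a screening setting, a false negative is a missed case, so higher recall is often preferred.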