PERFORMANCE COMPARISON OF DATASET BALANCING TECHNIQUES IN STROKE DISEASE PREDICTION MODELS

Eugene Vincent Arends

Abstract

Stroke is a dangerous neurological disease in which blood vessels in the brain become blocked or rupture. Accurate stroke prediction can support treatment and improve care. Prediction models can be built by applying machine learning techniques to patient data from hospitals. During the pre-processing stage, records with empty values are deleted, and the dataset is balanced using Random Oversampling (ROS), Random Undersampling (RUS), and Synthetic Minority Oversampling Technique (SMOTE). The machine learning models used in the experiment are Gaussian Naïve Bayes (GNB), Logistic Regression (LR), and K-Nearest Neighbors (K-NN). Sensitivity, the proportion of actual stroke cases that the model correctly identifies, is the evaluation metric emphasized in this research. Before balancing, the sensitivity score was 33% for GNB and LR and 1% for K-NN. After balancing, the highest sensitivity score was 76% for GNB and LR and 73% for K-NN. It can be concluded that dataset balancing techniques play a major role in improving model performance, as measured by sensitivity, on highly imbalanced datasets such as the one used in this research.
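
The three balancing techniques and the sensitivity metric named in the abstract can be sketched in plain Python as follows. This is a minimal illustration, not the author's implementation: the function names, the 0/1 label encoding (1 = stroke, assumed to be the minority class), and the neighbour count k=5 are assumptions made for the example, and the SMOTE function is a simplified version of the published algorithm. It also assumes the majority class is at least as large as the minority class and that there are at least two minority samples.

```python
import random
from collections import Counter


def sensitivity(y_true, y_pred, positive=1):
    """True positive rate: TP / (TP + FN) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def _split(X, y):
    """Separate feature rows into minority (label 1) and majority (label 0)."""
    minority = [x for x, label in zip(X, y) if label == 1]
    majority = [x for x, label in zip(X, y) if label == 0]
    return minority, majority


def random_oversample(X, y, rng):
    """ROS: duplicate randomly chosen minority rows until classes are equal."""
    minority, majority = _split(X, y)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    X_bal = majority + minority + extra
    y_bal = [0] * len(majority) + [1] * (len(minority) + len(extra))
    return X_bal, y_bal


def random_undersample(X, y, rng):
    """RUS: keep a random subset of majority rows the size of the minority."""
    minority, majority = _split(X, y)
    kept = rng.sample(majority, len(minority))
    return kept + minority, [0] * len(kept) + [1] * len(minority)


def smote(X, y, rng, k=5):
    """Simplified SMOTE: interpolate between a minority row and one of its
    k nearest minority neighbours to create each synthetic row."""
    minority, majority = _split(X, y)
    synthetic = []
    for _ in range(len(majority) - len(minority)):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    X_bal = majority + minority + synthetic
    y_bal = [0] * len(majority) + [1] * (len(minority) + len(synthetic))
    return X_bal, y_bal


if __name__ == "__main__":
    rng = random.Random(42)
    # Toy data (made up): 95 majority rows and 5 minority rows.
    X = [[float(i), float(i % 7)] for i in range(100)]
    y = [1 if i < 5 else 0 for i in range(100)]
    for name, balance in [("ROS", random_oversample),
                          ("RUS", random_undersample),
                          ("SMOTE", smote)]:
        _, y_bal = balance(X, y, rng)
        print(name, dict(Counter(y_bal)))
```

In practice the balancing step would be applied only to the training split, so that the sensitivity reported on the test split reflects the original class distribution.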
