CLASSIFICATION OF BREAST CANCER SAFETY LEVELS USING THE RANDOM FOREST METHOD

Wirya Aditya

Abstract

This study classifies breast cancer safety levels using three methods: Random Forest, Decision Tree, and K-Nearest Neighbors (KNN). It compares their experimental results using evaluation metrics, confusion matrices, training and test accuracy, and cross-validation. Random Forest consistently outperforms Decision Tree and KNN in precision and accuracy, although the margins are small; the advantage is attributed to its robustness to data variability, which comes from combining multiple independent decision trees. Cross-validation is used to check that the models generalize to unseen data, and comparing training and test accuracy helps detect overfitting. Under cross-validation, Random Forest achieves the highest accuracy, 91.49%, compared with 85.49% for Decision Tree and 90.42% for KNN. These findings suggest that Random Forest is the best choice for breast cancer safety classification on this specific dataset. Because the most suitable method depends on a dataset's characteristics, the choice of classifier should be re-evaluated for other data.
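The evaluation procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the study's actual code: it assumes scikit-learn, uses the library's bundled `load_breast_cancer` dataset as a stand-in for the study's data, and picks hyperparameters (100 trees, `k = 5`, 5 folds, an 80/20 split) purely for demonstration, so the numbers it prints will not match those reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: a public stand-in dataset, not the data used in the study.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Assumption: hyperparameters chosen for illustration only.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    # KNN is distance-based, so features are standardized first.
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

results = {}
for name, model in models.items():
    # 5-fold cross-validated accuracy, computed on the training split only.
    cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
    # Refit on the full training split to measure train and test accuracy.
    model.fit(X_train, y_train)
    results[name] = {
        "cv_accuracy": cv_scores.mean(),
        "train_accuracy": model.score(X_train, y_train),
        "test_accuracy": model.score(X_test, y_test),
        "confusion_matrix": confusion_matrix(y_test, model.predict(X_test)),
    }

for name, r in results.items():
    print(
        f"{name}: cv={r['cv_accuracy']:.4f} "
        f"train={r['train_accuracy']:.4f} test={r['test_accuracy']:.4f}"
    )
    print(r["confusion_matrix"])
```

A large gap between `train_accuracy` and `test_accuracy` for a model is the overfitting signal the abstract refers to, while `cv_accuracy` gives a less split-dependent estimate for ranking the three classifiers.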
