Performance Comparison of Classification Methods for Predicting Student Dropout and Academic Success
Abstract
Student dropout and academic success are two key concerns in education. This study compares the performance of classification methods for predicting student dropout and academic success. The classifiers evaluated are Random Forest Classifier, AdaBoost, Decision Tree, Logistic Regression, and XGBoost. The dataset comes from a higher-education institution and contains 4,424 samples with 36 features and 3 classes. The results show that Random Forest Classifier performs best with 76% accuracy, followed by XGBoost at 76%, AdaBoost at 74%, Logistic Regression at 74%, and Decision Tree at 71%. The Random Forest Classifier can therefore be used to predict student dropout and academic success more accurately. However, it should be noted that although all classifiers in this study improved after applying the ADASYN technique and parameter tuning, they still struggle to accurately identify cases in one of the minority classes. Future work should therefore tune parameters more carefully and consider other approaches that could further improve model performance, such as incorporating additional information that may be available in the dataset.
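To make the comparison concrete, the sketch below shows one way the pipeline described in the abstract could be set up with scikit-learn, imbalanced-learn, and XGBoost: ADASYN oversampling of the training split, followed by the five classifiers evaluated by accuracy on a held-out test set. This is a minimal illustration and not the authors' code; the file name students.csv, the Target column, the 0..2 label encoding, and the split settings are assumptions.

# Minimal sketch (not the authors' code) of the comparison described in the abstract:
# ADASYN oversampling on the training split, then five classifiers scored by accuracy.
import pandas as pd
from imblearn.over_sampling import ADASYN
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

# Hypothetical dataset: 4,424 rows, 36 features, and a 3-class "Target" column
# (dropout / enrolled / graduate) encoded as integers 0..2.
df = pd.read_csv("students.csv")
X, y = df.drop(columns=["Target"]), df["Target"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Oversample only the training data so the test set stays untouched.
X_res, y_res = ADASYN(random_state=42).fit_resample(X_train, y_train)

models = {
    "Random Forest": RandomForestClassifier(random_state=42),
    "AdaBoost": AdaBoostClassifier(random_state=42),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "XGBoost": XGBClassifier(eval_metric="mlogloss", random_state=42),
}

for name, model in models.items():
    model.fit(X_res, y_res)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy = {acc:.2f}")

In practice each model's hyperparameters would also be tuned (e.g. with a grid or randomized search), which is the parameter tuning step the abstract refers to.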
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.