CLUSTERING SCHOOL DROPOUT DATA WITH THE K-MEANS AND DBSCAN ALGORITHMS
Main Article Content
Abstract
Human resource development and a country's progress depend on basic education. Although student participation in primary school has risen rapidly, dropping out at this level remains a major problem in Indonesia. This study uses the K-Means and DBSCAN clustering algorithms to group data on the number of students who dropped out of school in each city and regency in Indonesia. The dataset comes from the Ministry of Education and Culture (Kemendikbud) and was published in 2023; its variables are the city/regency and the number of dropouts at the elementary (SD), junior high (SMP), senior high (SMA), and vocational high (SMK) school levels. The best-performing method was K-Means with K = 2, which achieved a silhouette score of 0.722. Overall, the clustering places 40 regions in the group with a high number of dropouts, a small number compared with the low-dropout group, which contains as many as 474 regions. The wide gap between the two groups may indicate educational inequality between regions. This study is expected to provide important information for stakeholders.
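To make the evaluation metric concrete, the silhouette score used to compare clusterings can be sketched as follows. This is a minimal standard-library illustration, not the study's implementation: the function name `silhouette_score`, the use of Euclidean distance, and the six rows of dropout counts are assumptions made for this example and are not taken from the Kemendikbud dataset.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    # Group the points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so it does not affect the sum).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            mean(dist(point, q) for q in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


# Hypothetical dropout counts per region: (SD, SMP, SMA, SMK).
regions = [
    (10, 8, 5, 4),
    (12, 9, 6, 5),
    (9, 7, 4, 3),
    (11, 10, 5, 6),
    (210, 180, 150, 120),
    (230, 170, 160, 130),
]

# Two candidate labelings with two clusters each.
low_vs_high = [0, 0, 0, 0, 1, 1]   # separates low- and high-dropout regions
alternating = [0, 1, 0, 1, 0, 1]   # mixes the two kinds of region

score_low_vs_high = silhouette_score(regions, low_vs_high)
score_alternating = silhouette_score(regions, alternating)
```

A score close to 1 means points sit much closer to their own cluster than to the nearest other cluster, so the labeling that separates low- and high-dropout regions scores higher than the alternating one. In practice a library routine, such as the silhouette function in scikit-learn, would normally be used instead of a hand-written version.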
Article Details

This article is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Open access.