COMPARISON OF THE K-MEANS, GAUSSIAN MIXTURE MODEL, AND DBSCAN CLUSTERING ALGORITHMS ON AIR POLLUTION STANDARD INDEX (ISPU) DATA IN DKI JAKARTA PROVINCE

Apriyanto Chandra

Abstract

This study compares the performance of three clustering algorithms, K-Means, Gaussian Mixture Model (GMM), and DBSCAN, on Air Pollution Standard Index (ISPU) data for DKI Jakarta Province. The air quality parameters used include nitrogen dioxide, ozone, sulfur dioxide, carbon monoxide, and PM10. The research process includes a preprocessing stage in which missing values are filled by mean imputation and the data are normalized with Standard Scaler. K-Means achieved the best results on all three clustering metrics, with a Silhouette Score of 0.2784, a Calinski-Harabasz Score of 403.111, and a Davies-Bouldin Index of 1.2481, while GMM and DBSCAN scored worse on all three. These results indicate that K-Means clusters the Jakarta ISPU data better than the other two algorithms, producing more compact and better-separated clusters. Further work could evaluate additional clustering algorithms and analyze the data in more depth.
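
The workflow summarized in the abstract (mean imputation, standard scaling, three clustering algorithms, three internal metrics) can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions, not the author's code: the data are synthetic stand-ins for the five pollutant columns, and the cluster count, `eps`, `min_samples`, and the decision to drop DBSCAN noise points before scoring are all hypothetical choices that the abstract does not specify.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.impute import SimpleImputer
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for five pollutant columns (assumption: NOT the ISPU data).
rng = np.random.default_rng(0)
centers = np.array(
    [[30, 20, 10, 40, 15], [60, 35, 25, 80, 30], [90, 50, 40, 120, 45]],
    dtype=float,
)
X = np.vstack([c + rng.normal(scale=5.0, size=(100, 5)) for c in centers])
X[rng.random(X.shape) < 0.05] = np.nan  # simulate about 5% missing readings

# Preprocessing: mean imputation, then zero-mean / unit-variance scaling.
X_imputed = SimpleImputer(strategy="mean").fit_transform(X)
X_scaled = StandardScaler().fit_transform(X_imputed)


def internal_scores(data, labels):
    """Return the three internal metrics, or None if fewer than 2 clusters."""
    keep = labels != -1  # assumption: DBSCAN noise points are excluded
    if np.unique(labels[keep]).size < 2:
        return None
    return {
        "silhouette": silhouette_score(data[keep], labels[keep]),
        "calinski_harabasz": calinski_harabasz_score(data[keep], labels[keep]),
        "davies_bouldin": davies_bouldin_score(data[keep], labels[keep]),
    }


# Hyperparameters below are illustrative assumptions, not values from the study.
labelings = {
    "K-Means": KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled),
    "GMM": GaussianMixture(n_components=3, random_state=0).fit_predict(X_scaled),
    "DBSCAN": DBSCAN(eps=0.6, min_samples=5).fit_predict(X_scaled),
}
results = {name: internal_scores(X_scaled, labels) for name, labels in labelings.items()}

for name, scores in results.items():
    print(name, scores)
```

Higher Silhouette and Calinski-Harabasz values and a lower Davies-Bouldin value indicate more compact, better-separated clusters, which is the sense in which the abstract ranks K-Means first.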
