KLASIFIKASI TOKSISITAS KOMENTAR DENGAN ALGORITMA NAIVE BAYES DAN DECISION TREE (Comment Toxicity Classification Using the Naive Bayes and Decision Tree Algorithms)
Abstract
This study develops a comment toxicity classification model for online environments using the Naive Bayes and Decision Tree algorithms. The dataset consists of online comments, which are preprocessed through text cleaning, normalization, and feature extraction with methods such as TF-IDF. Both classifiers are trained on this dataset and evaluated with standard metrics: accuracy, precision, recall, and F1-score. A comparative analysis of the two algorithms then provides insight into how effectively each identifies toxicity in online comments. The findings serve as a foundation for content moderation solutions that can adapt to the dynamic nature of online interactions, and they have implications for building more efficient and effective moderation systems. By concentrating on the online context, the study contributes to understanding how classification algorithms perform on toxic online interactions, which can in turn help improve user safety and comfort.
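As a rough illustration of the workflow the abstract describes, the sketch below trains a multinomial Naive Bayes classifier on a few comments and computes accuracy, precision, recall, and F1-score. It is not the authors' implementation: the toy comments, the labels `toxic` and `clean`, and all function names are invented for illustration. It uses raw word counts rather than TF-IDF, and it omits the Decision Tree model so that it runs on the Python standard library alone.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (a stand-in for text cleaning)."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(texts, labels):
    """Collect the counts needed by a multinomial Naive Bayes classifier."""
    class_counts = Counter(labels)        # number of documents per class
    word_counts = defaultdict(Counter)    # token counts per class
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior, using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    # Sorted order makes tie-breaking deterministic (first label alphabetically wins).
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore tokens never seen during training
            smoothed = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def binary_metrics(y_true, y_pred, positive):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data, invented for illustration only.
train_texts = [
    "you are an idiot",
    "what a stupid idea",
    "thank you for the help",
    "great idea, well done",
]
train_labels = ["toxic", "toxic", "clean", "clean"]
model = train_naive_bayes(train_texts, train_labels)

test_texts = ["stupid idiot", "thank you"]
test_labels = ["toxic", "clean"]
predictions = [predict(model, t) for t in test_texts]
print(predictions)
print(binary_metrics(test_labels, predictions, positive="toxic"))
```

In practice a library such as scikit-learn would typically replace the hand-written pieces, and a held-out test set far larger than two comments would be needed before the metrics mean anything.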
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.