Classification of Hate Speech in TikTok Social Media Comments Using Naive Bayes Algorithm and TF-IDF Weighting

Utami, Putri Febi; Krisbiantoro, Dwi; Santiko, Irfan; Riyanto, Andi Dwi

doi:10.35671/jmtt.v4i3.102

📅 27 December 2025

Journal of Multimedia Trend and Technology

Universitas Amikom Purwokerto

📄 Abstract

This research focuses on the classification of hate speech in Indonesian TikTok comments. TikTok, as a social media platform with high interaction intensity, generates a large volume of comments with diverse linguistic characteristics, mixing formal and informal language. This variation complicates content moderation, particularly the automatic identification of hate speech. The dataset is secondary data that combines public datasets with scraped TikTok comments, for an initial total of 5,698 comments. The comments come from general users and vary between formal and informal language. To improve data quality, the comments were pre-processed through text cleaning, tokenization, normalization, stop-word removal, and stemming, which left 4,542 comments suitable for modeling. Experimental results show that a Multinomial Naïve Bayes model with TF-IDF weighting classifies hate speech with high accuracy: 93% before parameter optimization and 95% after hyperparameter tuning, with a smoothing parameter alpha of 0.5. The confusion matrix shows a relatively low misclassification rate, although the class distribution in the dataset remains imbalanced. These findings indicate that the Multinomial Naïve Bayes approach is effective at recognizing linguistic patterns of hate speech in Indonesian TikTok comments, including informal text.
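The pipeline summarized in the abstract (pre-processing, TF-IDF weighting, Multinomial Naïve Bayes with alpha = 0.5) can be illustrated with a minimal from-scratch sketch. This is not the authors' implementation. The toy comments, the labels `hate` and `neutral`, the tiny stop-word list, and all function names are assumptions made for illustration. Normalization of informal spellings and stemming are omitted, and the TF-IDF weights are left unnormalized.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, tiny stop-word list (illustration only, not the paper's list).
STOP_WORDS = {"yang", "dan", "di", "ini", "itu"}

def preprocess(text):
    """Lowercase, replace non-letters with spaces, tokenize, drop stop words."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOP_WORDS]

def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}

def tfidf(doc, idf):
    """Raw term count times IDF; tokens unseen during fitting are ignored."""
    counts = Counter(tok for tok in doc if tok in idf)
    return {tok: c * idf[tok] for tok, c in counts.items()}

def train_mnb(vectors, labels, vocab, alpha=0.5):
    """Multinomial Naive Bayes on TF-IDF weights with additive smoothing."""
    class_counts = Counter(labels)
    sums = {c: defaultdict(float) for c in class_counts}
    for vec, label in zip(vectors, labels):
        for tok, weight in vec.items():
            sums[label][tok] += weight
    log_prior = {c: math.log(k / len(labels)) for c, k in class_counts.items()}
    log_like = {}
    for c in class_counts:
        total = sum(sums[c].values()) + alpha * len(vocab)
        log_like[c] = {t: math.log((sums[c][t] + alpha) / total) for t in vocab}
    return log_prior, log_like

def predict(vec, model):
    """Return the class with the highest log posterior score."""
    log_prior, log_like = model
    scores = {
        c: log_prior[c] + sum(w * log_like[c][t] for t, w in vec.items())
        for c in log_prior
    }
    return max(scores, key=scores.get)

# Invented toy comments, not taken from the paper's dataset.
train = [
    ("kamu bodoh sekali", "hate"),
    ("dasar bodoh dan jelek", "hate"),
    ("videonya bagus sekali", "neutral"),
    ("lagu ini bagus dan keren", "neutral"),
]
docs = [preprocess(text) for text, _ in train]
labels = [label for _, label in train]
idf = fit_idf(docs)
model = train_mnb([tfidf(d, idf) for d in docs], labels, set(idf), alpha=0.5)

def classify(text):
    """Run the full pipeline on one raw comment."""
    return predict(tfidf(preprocess(text), idf), model)

print(classify("dasar jelek"))   # hate
print(classify("keren bagus!"))  # neutral
```

In practice the same pipeline is usually assembled from library components, for example scikit-learn's `TfidfVectorizer` and `MultinomialNB(alpha=0.5)`. The abstract does not say which tooling the authors used.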

🔖 Keywords

Hate speech; TikTok; Multinomial Naive Bayes; TF-IDF; Text classification

ℹ️ Publication Information

Publication Date

27 December 2025

Volume / Number / Year

Volume 4, Number 3, 2025

📝 HOW TO CITE

Utami, Putri Febi; Krisbiantoro, Dwi; Santiko, Irfan; Riyanto, Andi Dwi, "Classification of Hate Speech in TikTok Social Media Comments Using Naive Bayes Algorithm and TF-IDF Weighting," Journal of Multimedia Trend and Technology, vol. 4, no. 3, Dec. 2025.
