24 May 2026
DOI: 10.26877/asset.v8i3.2862

Comparative Evaluation of Automatic Labeling and Modeling Strategies for Indonesian Sentiment Analysis: Methodology and Performance Evaluation

Advance Sustainable Science, Engineering and Technology (ASSET)
Universitas Persatuan Guru Republik Indonesia Semarang

Abstract

Sentiment analysis is vital for understanding consumer perception, yet Indonesian sentiment classification is hampered by scarce labeled data and computational constraints. This study advances automatic labeling techniques and establishes performance benchmarks for Indonesian text. It compares two labeling approaches, the InSet Lexicon and an IndoBERT-based Hugging Face pipeline, on 8,447 Tapera-related opinions. The InSet Lexicon produced a highly skewed label distribution (89.66% neutral), while the IndoBERT pipeline produced a more balanced one (47.66% neutral, 38.43% positive, 13.91% negative). An evaluation of several modeling strategies showed that combining InSet Lexicon labels and TF-IDF features with Naïve Bayes or Random Forest achieved scores above 85%. RNN-LSTM reached more than 90% accuracy but required significant computational resources. Notably, fine-tuning IndoBERT with optimal hyperparameters yielded the most robust performance, achieving 80–90% accuracy with a low validation loss of 0.1. The study concludes that, for small datasets (fewer than 12,000 samples), the most effective strategies for Indonesian sentiment analysis are either the InSet Lexicon paired with traditional machine learning or automatic labeling with pre-trained models followed by rigorous fine-tuning.
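
To make the lexicon-based labeling step concrete, the following is a minimal sketch in Python. It assumes a simple rule (sum the polarity weights of known words, then label by the sign of the total); the abstract does not state the study's exact scoring rule, and the lexicon entries below are hypothetical toy values, not entries from the actual InSet Lexicon. The sketch also shows one plausible way a lexicon approach can drift toward neutral labels: any text whose words are all missing from the lexicon scores zero.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> polarity weight. These entries are
# illustrative only and are NOT taken from the actual InSet Lexicon.
TOY_LEXICON = {
    "bagus": 3,         # "good"
    "membantu": 2,      # "helpful"
    "buruk": -3,        # "bad"
    "memberatkan": -4,  # "burdensome"
}


def label_with_lexicon(text, lexicon=TOY_LEXICON):
    """Sum the weights of known words and map the total to a label."""
    # Assumed rule: whitespace tokenization, unknown words contribute 0.
    score = sum(lexicon.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def label_distribution(texts, lexicon=TOY_LEXICON):
    """Return the percentage of texts assigned to each label."""
    if not texts:
        return {}
    counts = Counter(label_with_lexicon(t, lexicon) for t in texts)
    total = len(texts)
    return {label: 100 * n / total for label, n in counts.items()}
```

Calling `label_distribution` on a list of opinions gives the per-label percentages, which is the kind of summary the abstract reports for each labeling approach.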

Keywords

Low-resource NLP; sentiment analysis; automatic labeling; vectorization; POS tagging

Publication Information

Publication Date
24 May 2026
Volume / Issue / Year
Year 2026

HOW TO CITE

Latifa, Khoiriya; Agung Handayanto; Nur Latifah Dwi M.S; Rahul Bhandari; Ton Nguyen Trong Hien; Doston Pirnazarov, "Comparative Evaluation of Automatic Labeling and Modeling Strategies for Indonesian Sentiment Analysis: Methodology and Performance Evaluation," Advance Sustainable Science, Engineering and Technology (ASSET), May 2026.
