Toward a Modular, Low-Latency Architecture with BERT-based Big Media Data Analysis

Telematika
Universitas Amikom Purwokerto

📄 Abstract

The rapid growth of digital and social media platforms has produced massive streams of unstructured media data. Current big data approaches, however, are not tailored to the volume and velocity of media data, which consists of unstructured, lengthy full-text messages. This study proposes a modular, stream-oriented big data architecture for media data. The architecture comprises data crawlers, a message broker, machine learning modules, persistent storage, and analytical dashboards, connected through a publish-subscribe communication pattern that enables asynchronous, decoupled data processing. The system integrates IndoBERT, a transformer-based model fine-tuned for the Indonesian language, to perform real-time semantic tagging within the streaming pipeline. The solution has been implemented as a prototype using open-source technologies on an on-premise cluster. The primary novelty is the integration and operationalization of a large transformer-based language model (IndoBERT) within a low-latency streaming pipeline. The experimental results indicate that scalable, vendor-neutral media analytics platforms are feasible for institutions that are highly sensitive to privacy and cost. Architectural quality is evaluated quantitatively with Martin's Instability Metric and Coupling Between Objects (CBO), which confirm high modularity across components. The system achieves an end-to-end latency of 3.121 seconds and a deep learning latency of 2.333 seconds, and it processes 32,102 messages per day; the 2.333-second inference cost is an explicit trade-off accepted in exchange for advanced semantic depth. This study presents a reference architecture for scalable, intelligent, real-time media analytics systems in public sector and academic deployments that require data privacy and control over infrastructure.
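The figures quoted in the abstract can be put in perspective with a short calculation. The sketch below is illustrative only: it uses the standard definition of Martin's Instability Metric, I = Ce / (Ca + Ce), and it assumes that the deep learning latency is contained within the end-to-end latency, which the abstract does not state explicitly. The coupling counts passed to `instability` are hypothetical and are not taken from the paper.

```python
def instability(efferent: int, afferent: int) -> float:
    """Martin's Instability I = Ce / (Ca + Ce).

    0.0 means maximally stable (nothing outgoing), 1.0 maximally unstable.
    """
    total = efferent + afferent
    if total == 0:
        # Convention assumed here: an isolated component is treated as stable.
        return 0.0
    return efferent / total


# Figures reported in the abstract.
END_TO_END_S = 3.121
DEEP_LEARNING_S = 2.333
MESSAGES_PER_DAY = 32_102

# Assumption: inference time is part of the end-to-end figure.
non_inference_s = END_TO_END_S - DEEP_LEARNING_S
inference_share = DEEP_LEARNING_S / END_TO_END_S

# Average arrival rate if messages were spread evenly over 24 hours.
mean_rate_per_s = MESSAGES_PER_DAY / 86_400

if __name__ == "__main__":
    # Hypothetical module: 3 outgoing dependencies, 1 incoming dependency.
    print(f"instability (hypothetical module): {instability(3, 1):.2f}")
    print(f"non-inference latency: {non_inference_s:.3f} s")
    print(f"inference share of end-to-end latency: {inference_share:.1%}")
    print(f"mean message rate: {mean_rate_per_s:.3f} msg/s")
```

Under these assumptions, inference accounts for roughly three quarters of the end-to-end latency, and the daily volume corresponds to a mean rate well below one message per second.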

🔖 Keywords

big media data; modular architecture; latency; BERT

ℹ️ Publication Information

Publication Date
25 August 2025
Volume / Number / Year
Volume 18, Number 2, 2025

📝 HOW TO CITE

Widyawan, Widyawan (Universitas Gadjah Mada); Murti, Handoko Wisnu (Semesta Data Digital); Putra, Guntur Dharma (Universitas Gadjah Mada); Nurmanto, Eddy (Semesta Data Digital); Affandi, Achmad (Institut Teknologi Sepuluh Nopember), "Toward a Modular, Low-Latency Architecture with BERT-based Big Media Data Analysis," Telematika, vol. 18, no. 2, Aug. 2025.
