📄 Abstract

As projects grow in complexity and scale, handling software defects becomes more challenging. One solution is machine learning-based software defect prediction, for example with the K-Nearest Neighbors (KNN) algorithm. However, KNN's performance can be hindered by its majority-vote mechanism and by the choice of distance or similarity metric, especially on imbalanced datasets. This research compares the effect of the Euclidean, Hamming, Cosine, and Canberra distance metrics on KNN performance, both before and after applying SMOTE (Synthetic Minority Over-sampling Technique). Results show significant improvements in AUC and F1-measure across the datasets after SMOTE is applied. With SMOTE, Euclidean distance produced an AUC of 0.7752 and an F1 of 0.7311 on the EQ dataset, and Canberra distance produced an AUC of 0.7707 and an F1 of 0.6342 on the JDT dataset. The LC dataset improved to an AUC of 0.6752 and an F1 of 0.3733, while the ML dataset climbed to 0.6845 and 0.4261 with Canberra distance. Lastly, the PDE dataset improved to 0.6580 and 0.3957 with Canberra distance after SMOTE. The findings confirm that SMOTE, combined with a suitable distance metric, significantly boosts KNN's predictive performance (p-value of 0.0001).
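
To make the compared components concrete, the following is a minimal pure-Python sketch of the four distance metrics, a majority-vote KNN classifier, and SMOTE-style interpolation. It is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the zero-denominator handling in the Cosine and Canberra functions is a convention chosen here, and the toy data is invented for demonstration only.

```python
import math
import random
from collections import Counter

def euclidean(a, b):
    # Straight-line distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hamming(a, b):
    # Fraction of positions whose values differ (most natural for categorical features).
    return sum(x != y for x, y in zip(a, b)) / len(a)

def cosine(a, b):
    # Cosine distance = 1 - cosine similarity.
    # Assumption: return 1.0 when either vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def canberra(a, b):
    # Sum of |x - y| / (|x| + |y|).
    # Assumption: terms with a zero denominator contribute 0.
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom:
            total += abs(x - y) / denom
    return total

def knn_predict(train_X, train_y, query, k, dist):
    # Take the k training points closest to the query, then majority-vote their labels.
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def smote(minority, n_new, k, rng):
    # SMOTE-style oversampling: each synthetic point lies on the segment between
    # a random minority sample and one of its k nearest minority neighbours.
    # Requires at least two minority samples.
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: euclidean(p, base))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

if __name__ == "__main__":
    # Invented, imbalanced toy data: label 1 ("defective") is the minority class.
    X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
    y = [0, 0, 0, 0, 1, 1]
    minority = [x for x, label in zip(X, y) if label == 1]
    extra = smote(minority, n_new=2, k=1, rng=random.Random(0))
    X_bal, y_bal = X + extra, y + [1] * len(extra)
    for name, dist in [("euclidean", euclidean), ("canberra", canberra)]:
        print(name, knn_predict(X_bal, y_bal, [5.0, 6.0], k=3, dist=dist))
```

Passing the metric as a function argument keeps the classifier unchanged while the metric varies, which mirrors the comparison described in the abstract. Oversampling should be applied to the training split only, so that synthetic samples never leak into evaluation data.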

🔖 Keywords

Software Defect Prediction; SMOTE; KNN Algorithm; Distance Metrics

ℹ️ Publication Information

Publication Date
27 February 2025
Volume / Number / Year
Volume 18, Number 1, Year 2025

📝 HOW TO CITE

Maulidha, Khusnul Rahmi; Lambung Mangkurat University; Faisal, Mohammad Reza; Lambung Mangkurat University; Saputro, Setyo Wahyu; Lambung Mangkurat University; Abadi, Friska; Lambung Mangkurat University; Nugrahadi, Dodon Turianto; Lambung Mangkurat University; Adi, Puput Dani Prasetyo; National Research and Innovation Agency (Badan Riset dan Inovasi Nasional); Hariyady, Hariyady; Universiti Malaysia Sabah, "Comparative Analysis of Distance Metrics in KNN and SMOTE Algorithms for Software Defect Prediction," Telematika, vol. 18, no. 1, Feb. 2025.

🔗 Related Articles from the Same Journal

A Systematic Analysis of the Impact of Non-Academic Factors on Student Academic Performance Prediction Using Data Mining

Ningsih, Gabriella Caroline Prihayu; Universitas Sebelas Maret; Liantoni, Febri; Sebelas Maret University; Sujana, Yudianto; Sebelas Maret University;

02 Apr 2026

Architecture and Field Evaluation of an IoT-Integrated Village Information System for Public Service

Hartono, Susilo; Universitas Muhammadiyah Pringsewu; Sutikno, Tole; Ahmad Dahlan University; Yudhana, Anton; Ahmad Dahlan University;

09 Mar 2026

Development of a Lightweight CNN Architecture for Multiclass Brain Tumor Detection Based on RGB Images

Fauzi, Ahmad; Pamulang University; Yunial, Agus Heri; Pamulang University;

09 Mar 2026

Portfolio Risk Assessment Using VaR and CVaR: A Comparative Study of Variance–Covariance Method and Monte Carlo Simulation

Supandi, Epha Diana; Oktavia, Atika; Sunan Kalijaga State Islamic University Yogyakarta;

05 Mar 2026

Fairness Auditing and Bias Mitigation in Aspect-Based Sentiment Models for Indonesian Public Services

Jondien, Muhammad Shihab Fathurrahman; Magister of Computer Science, Amikom Purwokerto University, Indonesia; Hariguna, Taqwa; Magister of Computer Science, Amikom Purwokerto University, Indonesia; Saputra, Dhanar Intan Surya; Magister of Computer Science, Amikom Purwokerto University, Indonesia;

05 Mar 2026

Performance Analysis of the Fuzzing Method in Detecting API Vulnerabilities in Mobile Healthcare Application X Based on OWASP API Security Top 10

Hakim, Muhammad Ikhwanul; Nugroho, Radityo Adi; Nugrahadi, Dodon Turianto; Herteno, Rudy; Saputro, Setyo Wahyu;

19 Feb 2026
