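PENERAPAN ALGORITMA NAIVE BAYES DENGAN CHI-SQUARE UNTUK KLASIFIKASI SPAM EMAIL BERBASIS KATA DAN FREKUENSI (Application of the Naive Bayes Algorithm with Chi-Square for Spam Email Classification Based on Words and Frequency)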

Authors

  • Faiz Agil Firmansyah, Universitas Singaperbangsa Karawang
  • Ultach Enri, Universitas Singaperbangsa Karawang
  • Iqbal Maulana, Universitas Singaperbangsa Karawang

DOI:

https://doi.org/10.23960/jitet.v13i1.5506

Abstract

Email has become an essential communication tool in everyday life. However, its ease of use is also exploited by irresponsible parties to spread spam. This research implements the Naive Bayes algorithm with Chi-square to classify spam emails based on words and their frequency. The dataset consists of 153 records, which were processed with Naive Bayes and Chi-square following the Knowledge Discovery in Databases (KDD) process. Naive Bayes with Chi-square achieved an accuracy of 81.00%, a precision of 100%, a recall of 65%, and an F1-score of 79%. Evaluation with the ROC curve gave an AUC of 0.91, which is categorized as very good. These results indicate that the Naive Bayes algorithm with Chi-square can successfully classify spam emails based on words and frequency.
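The reported F1-score can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The following is a minimal sketch of that arithmetic only; the function name `f1_from_precision_recall` is hypothetical and is not taken from the paper, and the snippet does not reproduce the authors' classifier or data.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1-score, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Convention: F1 is 0 when both precision and recall are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract (precision 100%, recall 65%).
reported_precision = 1.00
reported_recall = 0.65

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}")  # prints "F1 = 0.79"
```

The result rounds to 79%, which agrees with the F1-score stated in the abstract. A precision of 100% together with a recall of 65% means that no legitimate email was flagged as spam in the evaluation, while roughly a third of the spam emails were missed.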

Published

2025-01-20

How to Cite

Firmansyah, F. A., Enri, U., & Maulana, I. (2025). PENERAPAN ALGORITMA NAIVE BAYES DENGAN CHI-SQUARE UNTUK KLASIFIKASI SPAM EMAIL BERBASIS KATA DAN FREKUENSI. Jurnal Informatika Dan Teknik Elektro Terapan, 13(1). https://doi.org/10.23960/jitet.v13i1.5506

Section

Articles