APPLICATION OF TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY WEIGHTING AND THE K-NEAREST NEIGHBOR ALGORITHM FOR ANALYZING HOTEL REVIEWS ON TRIPADVISOR

Khairul Huda, Sry Dhina Pohan, Youfih Herlina

Abstract


This study is motivated by a problem in evaluating products and services: traditional methods such as surveys, questionnaires, and interviews often produce inconsistent and inaccurate analyses. One approach to this problem is to apply Term Frequency-Inverse Document Frequency (TF-IDF) weighting and the K-Nearest Neighbor algorithm to hotel customer reviews from TripAdvisor, using text mining to classify them into three sentiment classes: neutral, negative, and positive. The K-Nearest Neighbor algorithm was chosen because it is computationally efficient, adapts easily to a variety of large datasets, and has relatively low algorithmic complexity. The results show that the system classifies hotel reviews with 76% accuracy on the training data at K=31, and that accuracy rises to 84% after random over-sampling is applied to address the imbalanced dataset.
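As a rough illustration of the pipeline described in the abstract (TF-IDF weighting, random over-sampling, K-Nearest Neighbor voting), the following is a minimal Python sketch that uses only the standard library. The toy reviews, the whitespace tokenizer, the smoothed IDF formula, cosine similarity as the neighbor metric, and k=3 are all assumptions made for illustration. The abstract does not specify the authors' preprocessing, distance measure, or implementation, and the study itself reports K=31.

```python
import math
import random
from collections import Counter

def tokenize(text):
    # Assumption: lowercase + whitespace split; a real pipeline would also
    # strip punctuation, remove stop words, and stem.
    return text.lower().split()

def fit_idf(docs):
    # Assumed smoothed IDF: idf(t) = ln((1 + N) / (1 + df(t))) + 1
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf_vector(doc, idf):
    # Term frequency times IDF, L2-normalised; terms unseen in training are ignored.
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def random_oversample(docs, labels, seed=0):
    # Duplicate randomly chosen examples of each smaller class until every
    # class has as many examples as the largest class.
    rng = random.Random(seed)
    by_class = {}
    for doc, label in zip(docs, labels):
        by_class.setdefault(label, []).append(doc)
    target = max(len(items) for items in by_class.values())
    out_docs, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_docs.extend(items + extra)
        out_labels.extend([label] * target)
    return out_docs, out_labels

def knn_predict(query, train_vecs, train_labels, idf, k=3):
    # Rank training reviews by similarity to the query, then take a majority
    # vote among the k most similar ones.
    q = tfidf_vector(query, idf)
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(q, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, deliberately imbalanced toy data (not from the study).
docs = [
    "clean room friendly staff great breakfast",
    "great location friendly staff",
    "clean room great view",
    "lovely pool great breakfast",
    "dirty room rude staff",
    "noisy room broken shower",
    "average room average price",
]
labels = ["positive"] * 4 + ["negative"] * 2 + ["neutral"]

bal_docs, bal_labels = random_oversample(docs, labels)
idf = fit_idf(bal_docs)
train_vecs = [tfidf_vector(d, idf) for d in bal_docs]

print(Counter(bal_labels))
print(knn_predict("dirty noisy shower", train_vecs, bal_labels, idf, k=3))
```

In this sketch the over-sampling step runs before the IDF weights are fitted, so duplicated minority reviews also influence the term weights. Whether the authors ordered the steps this way is not stated in the abstract. In a real evaluation, over-sampling would normally be applied to the training split only, so that duplicated reviews do not leak into the test data.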

Pages: 2536-2546

DOI: http://dx.doi.org/10.23960/jitet.v12i3.4800


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Publisher
Jurusan Teknik Elektro, Fakultas Teknik, Universitas Lampung
Jl. Prof. Soemantri Brojonegoro No. 1 Bandar Lampung 35145
Email: jitet@eng.unila.ac.id
Website: https://journal.eng.unila.ac.id/index.php/jitet

Copyright (c) Jurnal Informatika dan Teknik Elektro Terapan (JITET)
pISSN: 2303-0577   eISSN: 2830-7062