Implementation of Public Sentiment Analysis of Fuel Price Increases in YouTube Comments Using the Gaussian Naïve Bayes Method

Syamsul Mujahidin; Bagus Prasetio; Muchammad Chandra Cahyo Utomo

Abstract

YouTube is the largest video platform in the world, with 1.5 billion users in 2018. It has become a source of information on many topics, including the rise in world crude oil prices to above US$100 per barrel. Motivated by this issue, the authors conducted a sentiment analysis of YouTube user comments on the increase in fuel (BBM) prices using the Gaussian naïve Bayes method. The experiments used a dataset of 3,053 comments, labeled with a lexicon and split 8:2 into training and test data. Words were vectorized with Fasttext word embeddings, with bag-of-words (BoW) used as a baseline for comparing accuracy. The experiments varied the dimension size used when building the Fasttext language model. The highest accuracy, 74%, was obtained on the dataset without stopword filtering using a Fasttext model of size 100. Based on the evaluation results, the system can automatically classify public sentiment or opinion as positive or negative.

Keywords: Fuel, Fasttext, Lexicon, Gaussian naïve Bayes, Word embedding
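
The pipeline summarized in the abstract (lexicon labeling, an 8:2 split, vectorization, and Gaussian naïve Bayes classification) can be illustrated with a minimal sketch. This is not the authors' code. The lexicon, the English toy comments, the tie-breaking rule, and the smoothing constant are all hypothetical, and only the bag-of-words baseline is shown, not the Fasttext embeddings. The classifier is written in plain Python so that the per-class means and variances are visible.

```python
import math

# Hypothetical toy lexicon; the paper's actual lexicon is not reproduced here.
POSITIVE = {"good", "fair", "support"}
NEGATIVE = {"bad", "expensive", "protest"}

def lexicon_label(text):
    """Return 1 (positive) or 0 (negative) from counts of lexicon hits."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return 1 if score >= 0 else 0  # ties go to positive: an assumption

def build_vocab(texts):
    """Map each distinct token to a column index."""
    words = sorted({t for text in texts for t in text.lower().split()})
    return {w: i for i, w in enumerate(words)}

def bow_vector(text, vocab):
    """Bag-of-words count vector; tokens outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for t in text.lower().split():
        if t in vocab:
            vec[vocab[t]] += 1.0
    return vec

class GaussianNB:
    """Gaussian naive Bayes: one independent normal per feature per class."""

    def __init__(self, var_smoothing=1e-3):  # smoothing value is an assumption
        self.var_smoothing = var_smoothing

    def fit(self, X, y):
        self.stats = {}
        for c in set(y):
            rows = [x for x, label in zip(X, y) if label == c]
            n = len(rows)
            means = [sum(col) / n for col in zip(*rows)]
            variances = [sum((v - m) ** 2 for v in col) / n + self.var_smoothing
                         for col, m in zip(zip(*rows), means)]
            self.stats[c] = (math.log(n / len(X)), means, variances)
        return self

    def _log_posterior(self, x, c):
        log_prior, means, variances = self.stats[c]
        log_lik = sum(-0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                      for v, m, var in zip(x, means, variances))
        return log_prior + log_lik

    def predict(self, X):
        return [max(self.stats, key=lambda c: self._log_posterior(x, c)) for x in X]

# Hypothetical comments standing in for the 3,053 YouTube comments.
comments = [
    "good policy support", "fair price good",
    "support fair decision", "good fair support",
    "bad expensive fuel", "expensive protest now",
    "bad policy protest", "protest expensive bad",
    "good support", "bad expensive",  # the last 20% is held out for testing
]
labels = [lexicon_label(c) for c in comments]

split = int(0.8 * len(comments))  # 8:2 train/test split, no shuffling here
vocab = build_vocab(comments[:split])
X_train = [bow_vector(c, vocab) for c in comments[:split]]
X_test = [bow_vector(c, vocab) for c in comments[split:]]

model = GaussianNB().fit(X_train, labels[:split])
predictions = model.predict(X_test)
accuracy = sum(p == t for p, t in zip(predictions, labels[split:])) / len(predictions)
print(predictions, accuracy)
```

To approximate the Fasttext variant, one option would be to replace `bow_vector` with the average of the word vectors of a comment's tokens, so that each comment becomes a dense vector of the chosen size (for example 100). In practice a library implementation such as scikit-learn's `GaussianNB` would likely be used instead of the hand-written class.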