Classification and Semantic Analysis of Cyberbullying on Social Media X: Integrating Web Scraping and Natural Language Processing (NLP)

Authors

  • Syifa Aulia Azzahra, Pendidikan Sistem Teknologi Informasi, Universitas Pendidikan Indonesia
  • Nuur Wachid Abdul Majid, Pendidikan Sistem Teknologi Informasi, Universitas Pendidikan Indonesia

DOI:

https://doi.org/10.31949/educatio.v11i2.12725

Abstract

Cyberbullying on social media, particularly on X, has become a critical issue with significant psychological impact. This study uses a semantic approach to analyze the extent to which cyberbullying still occurs on X. Data were collected by web scraping with Selenium, using specific categories and keywords such as “gendut” (fat) and “bodoh” (stupid), during December 2024. After deduplication, 700 records remained, a sample that satisfies the Slovin criterion (margin of error 3.77%). The analysis applied Natural Language Processing (NLP): text cleaning, lowercasing, normalization, tokenization, and stopword removal, followed by classification with a fine-tuned BERT model to determine whether each comment constitutes cyberbullying, and mapping of keywords to 8 categories, such as “rasisme” (racism) and “sara” (SARA: ethnicity, religion, race, and inter-group relations). The results show that 55.4% of the data contain indications of cyberbullying; the sexual category is the most dominant at 26.6%, and the keyword “anjing” (dog) appears 99 times. Certain negative words show a fluctuating temporal pattern, with cyberbullying intensity peaking in Week 4 at the highest percentage (79.1%). These findings confirm that cyberbullying remains a significant phenomenon on X; stricter content moderation policies and the development of automated, machine learning-based detection systems are therefore needed to mitigate it more effectively.
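
The deduplication, preprocessing, and keyword-mapping steps described in the abstract can be sketched roughly as follows. This is a minimal, standard-library illustration and not the authors' implementation: the normalization table, stopword list, category labels, and keyword-to-category assignments are hypothetical placeholders (only the keywords “gendut”, “bodoh”, and “anjing” come from the abstract), and the Selenium scraping and the fine-tuned BERT classifier are omitted. The helper slovin_margin rearranges Slovin's formula n = N / (1 + N e^2) into e = sqrt(1/n - 1/N); the population size N is not stated in the abstract, so any value passed for it is an assumption.

```python
import math
import re
from collections import Counter

# Hypothetical slang-normalization table (the study's dictionary is not given here).
NORMALIZATION = {"bgt": "banget", "lu": "kamu", "gpp": "tidak apa apa"}

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"yang", "dan", "di", "itu", "kamu", "banget"}

# Hypothetical keyword -> category lexicon. The keywords appear in the abstract;
# the category labels and assignments are assumptions, not the paper's 8 categories.
KEYWORD_CATEGORY = {
    "gendut": "physical",
    "bodoh": "intelligence",
    "anjing": "animal_insult",
}


def deduplicate(posts):
    """Drop exact duplicate posts while keeping first-seen order."""
    return list(dict.fromkeys(p.strip() for p in posts))


def preprocess(text):
    """Clean, lowercase, normalize, tokenize, and remove stopwords."""
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)  # cleaning: URLs, mentions, hashtags
    text = re.sub(r"[^A-Za-z\s]", " ", text)           # cleaning: digits, punctuation, emoji
    text = text.lower()                                # lowercasing
    # Normalization: replace known slang word by word (may expand to several words).
    text = " ".join(NORMALIZATION.get(w, w) for w in text.split())
    tokens = text.split()                              # tokenization on whitespace
    return [t for t in tokens if t not in STOPWORDS]   # stopword removal


def map_categories(tokens):
    """Count how many tokens fall into each lexicon category."""
    return Counter(KEYWORD_CATEGORY[t] for t in tokens if t in KEYWORD_CATEGORY)


def slovin_margin(n, population):
    """Margin of error e implied by Slovin's formula n = N / (1 + N * e**2)."""
    return math.sqrt(1 / n - 1 / population)


if __name__ == "__main__":
    posts = [
        "Dasar GENDUT bgt lu @user https://t.co/x",
        "Dasar GENDUT bgt lu @user https://t.co/x",  # exact duplicate, dropped
        "kamu itu bodoh dan anjing!!!",
    ]
    for post in deduplicate(posts):
        tokens = preprocess(post)
        print(tokens, dict(map_categories(tokens)))
```

For a very large population, the margin for n = 700 approaches 1/sqrt(700), about 3.78%, which is in line with the reported 3.77%. In a fuller pipeline, the token lists would feed the classifier, and the category counts could be grouped by posting week to reproduce the kind of temporal comparison the abstract reports.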

Keywords:

Cyberbullying, Semantic Analysis, Web Scraping, NLP, Social Media X

Published

2025-04-17

How to Cite

Azzahra, S. A., & Majid, N. W. A. (2025). Klasifikasi dan Analisis Semantik Cyberbullying Sosial Media X: Integrasi Web Scraping dan Natural Language Processing (NLP). Jurnal Educatio FKIP UNMA, 11(2). https://doi.org/10.31949/educatio.v11i2.12725

Section

Articles
