Semi-meta-supervised hate speech detection

Cendra Devayana Putra, Hei Chia Wang

Research output: Article, peer-reviewed

1 citation (Scopus)

Abstract

On social media, hate speech occurs daily and has both physical and psychological consequences. Deep learning is one strategy for combating it. Deep learning techniques may require massive datasets to generate accurate models, but hate speech samples (such as misogyny and cyber samples) are frequently insufficient and diverse. We offer methods for leveraging these diverse datasets and enhancing deep learning models through knowledge sharing. First, we analyzed the existing Bidirectional Encoder Representations from Transformers (BERT) technique and built a BERT-3CNN method to generate a single-task classifier that optimally absorbs the target dataset's features. Second, we proposed a shared BERT layer to gain a general understanding of hate speech. Third, we proposed a method for adapting another dataset to the desired dataset. We conducted several quantitative experiments on five datasets (Hatebase, Supremacist, Cybertroll, TRAC, and TRAC 2020) and assessed performance using the accuracy and F1 metrics. The first experiment demonstrated that our BERT-3CNN model improved the average accuracy by 5% and the F1 score by 18%. The second experiment demonstrated that BERT-SP improved the average accuracy by 0.2% and the F1 score by 2%. With Semi BERT-SP, TRAC, Supremacist, Hatebase, and Cybertroll all showed improvements, with accuracy rising by 6% and the F1 score by 5%, while TRAC 2020 showed improvements of 10% and 9%, respectively.
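The abstract reports both accuracy and F1 for every experiment. A minimal sketch of how the two metrics are computed, and why they can diverge on imbalanced data, might look like the following. The function name `accuracy_and_f1`, the toy labels, and the choice of binary F1 on the positive (hateful) class are assumptions made for illustration; the abstract does not state which F1 averaging the authors used.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels.

    F1 is computed for the `positive` class only (an assumption;
    macro or weighted averaging would give different numbers).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy data: 10 posts, 2 of them hateful (label 1).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # one hateful post is missed

acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")  # accuracy=0.90, F1=0.67
```

In this toy case a single missed hateful post barely moves accuracy but lowers F1 noticeably, which is one reason the two metrics are usually reported side by side for sparse classes such as hate speech.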

Original language: English
Article number: 111386
Journal: Knowledge-Based Systems
Volume: 287
Publication status: Published - 5 March 2024

All Science Journal Classification (ASJC) codes

  • Software
  • Management Information Systems
  • Information Systems and Management
  • Artificial Intelligence
