HateThaiSent: Sentiment-Aided Hate Speech Detection in Thai Language

Maity, Krishanu, Poornash, A. S., Bhattacharya, Shaubhik, Phosit, Salisa, Kongsamlit, Sawarod, Saha, Sriparna and Pasupa, Kitsuchart (2024) HateThaiSent: Sentiment-Aided Hate Speech Detection in Thai Language. IEEE Transactions on Computational Social Systems. ISSN 2329-924X (In Press)

Abstract

Social media platforms are a double-edged sword: on the one hand, they enable the dissemination of information, but on the other hand, they also provide an avenue for spreading online abuse and harassment, such as hate speech. While significant research efforts are being devoted to detecting online hate speech in the English language, little attention has been paid to the Thai language. In this study, we created a benchmark dataset, called HateThaiSent, which labels each post with both hate speech and sentiment information. To detect hate speech, we created a multi-task model that uses a dual-channel deep learning approach based on FastText and BERT embeddings, with an added capsule network. One channel utilizes pre-trained FastText embeddings while the other uses embeddings from the BERT language model. We aimed to answer two research questions: (Q1) Does incorporating sentiment information improve the performance of hate speech detection in the Thai language? (Q2) What is the comparative effectiveness of two different approaches for sentiment-aware hate speech detection in the Thai language: feature engineering versus multi-tasking? Our proposed approach outperformed other baselines and state-of-the-art models on the HateThaiSent dataset, with overall accuracy/macro-F1 values of 89.67%/89.79%, and 80.92%/80.97% for hate speech and sentiment detection tasks, respectively. We concluded that multi-tasking is more effective than feature engineering in enhancing the performance of the main task (hate speech detection).
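
The abstract reports results as accuracy and macro-F1 pairs. As a reading aid, the sketch below shows how these two metrics are conventionally computed for a classification task. It is not the authors' evaluation code: the function names `accuracy` and `macro_f1` and the toy labels are hypothetical, and the labels are not drawn from HateThaiSent.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # gold class t was missed
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only (hypothetical, not from the dataset).
gold = ["hate", "hate", "non-hate", "non-hate", "non-hate"]
pred = ["hate", "non-hate", "non-hate", "non-hate", "hate"]
print(f"accuracy = {accuracy(gold, pred):.4f}")
print(f"macro-F1 = {macro_f1(gold, pred):.4f}")
```

Because macro-F1 averages the per-class scores without weighting by class frequency, it penalizes poor performance on a minority class more than accuracy does, which is why the two figures are often reported together for imbalanced tasks such as hate speech detection.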

Item Type:

Article

Identification Number (DOI):

Subjects:

Subjects > Computer Science > Artificial Intelligence

Subjects > Computer Science > Machine Learning

Subjects > Computer Science > Computation and Language (Computational Linguistics and Natural Language and Speech Processing)

Deposited by:

Kitsuchart Pasupa

Date Deposited:

2024-04-06 17:32:06

Last Modified:

2024-04-24 13:05:17
