Maity, Krishanu, Poornash, A. S., Saha, Sriparna and Pasupa, Kitsuchart (2024) ToxVI: a Multimodal LLM-based Framework for Generating Intervention in Toxic Code-Mixed Videos. In: The 33rd ACM International Conference on Information and Knowledge Management (CIKM 2024), 21-25 October 2024, Boise, Idaho, USA.
While considerable research has addressed the detection of toxic content in text, video content, particularly in languages other than English, has received less attention. Prior studies have focused primarily on automated tools for identifying online toxic speech, but have often overlooked the crucial next steps of mitigating its impact and discouraging future use. Automatically generated interventions that explain why certain content is inappropriate can discourage social media users from sharing such material. To bridge this research gap, we propose a new task: generating interventions for toxic videos in code-mixed languages, going beyond existing methods that focus on text and images to combat online toxicity. We introduce a Toxic Code-Mixed Intervention Video benchmark dataset (ToxCMI), comprising 1697 code-mixed toxic video utterances sourced from YouTube. Each utterance is annotated for toxicity and severity and is accompanied by an intervention written in Hindi-English code-mixed language. We have developed a multimodal framework, ToxVI, designed for generating interventions appropriate to toxic videos. It leverages Large Language Models (LLMs) and comprises three modules: a Modality module, a Cross-Modal Synchronization module, and a Generation module. Our experiments demonstrate that integrating multiple modalities from the videos significantly enhances performance on the proposed task, and that ToxVI outperforms all the baselines by a significant margin.
Item Type:
Conference or Workshop Item (Paper)
Identification Number (DOI):
Subjects:
Subjects > Computer Science > Artificial Intelligence
Subjects > Computer Science > Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
Subjects > Computer Science > Machine Learning
Deposited by:
Kitsuchart Pasupa
Date Deposited:
2026-01-06 22:30:46
Last Modified:
2026-01-07 15:45:02