Keyword-Text Graph Representation for Short Text Classification

Chuangkrud, Piyawat, Leelanupab, Teerapong, Damrongrat, Chaianun and Kanungsukkasem, Nont (2021) Keyword-Text Graph Representation for Short Text Classification. In: The 13th International Conference on Information Technology and Electrical Engineering (ICITEE 2021), 14-15 October 2021, Chiang Mai, Thailand.

Abstract

Short text classification is an essential task in Natural Language Processing, with applications such as spam filtering, question answering, conversational agents, sentiment analysis, and review mining. Short texts are challenging to classify because they are sparse and provide little contextual information. In this paper, we introduce Keyword-Text Graph Convolutional Networks (KwTGCN) for short-text classification. We also propose a method that identifies keywords by estimating the distribution of words over categories. These category keywords are then used to build a keyword-text graph of the short-text corpus. We apply a Graph Convolutional Network (GCN) to this keyword-text graph to generate representations of the corpus based on document-keyword and document-word relations as well as word co-occurrence. The resulting document, word, and keyword representations are then used as input features for the next layer, which performs short-text classification. Experimental results on multiple benchmark datasets show that our proposed model outperforms state-of-the-art models for short-text classification in multiple attempts.
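
The pipeline described in the abstract (category keywords, a keyword-text graph, then graph convolution) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the keyword estimator or the edge weights, so the following are assumptions made only for illustration: a word counts as a category keyword when at least `min_share` of its occurrences fall in one category, keywords are modelled as word nodes whose document edges carry a hypothetical `keyword_weight`, and co-occurrence means two words appearing in the same short text. The function names `category_keywords`, `build_graph`, and `gcn_layer` are hypothetical. The propagation step is the standard symmetric-normalized GCN layer.

```python
from collections import Counter, defaultdict

import numpy as np


def category_keywords(docs, labels, min_count=2, min_share=0.8):
    """Pick words whose occurrences concentrate in a single category.

    Assumption: the score is the fraction of a word's occurrences that
    fall in its most frequent category; the paper's estimator may differ.
    """
    counts = defaultdict(Counter)  # word -> Counter(category -> count)
    for text, label in zip(docs, labels):
        for word in text.lower().split():
            counts[word][label] += 1
    keywords = {}
    for word, per_cat in counts.items():
        total = sum(per_cat.values())
        cat, top = per_cat.most_common(1)[0]
        if total >= min_count and top / total >= min_share:
            keywords[word] = cat
    return keywords


def build_graph(docs, keywords, keyword_weight=2.0):
    """Build a symmetric adjacency matrix over [document nodes | word nodes].

    Assumption: document-keyword edges get a heavier weight than plain
    document-word edges; word-word edges mark co-occurrence in one text.
    """
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: len(docs) + i for i, w in enumerate(vocab)}
    n = len(docs) + len(vocab)
    adj = np.zeros((n, n))
    for d, text in enumerate(docs):
        words = text.lower().split()
        for w in words:
            weight = keyword_weight if w in keywords else 1.0
            adj[d, index[w]] = adj[index[w], d] = weight
        for a in words:
            for b in words:
                if a != b:
                    adj[index[a], index[b]] = 1.0
    return adj, vocab


def gcn_layer(adj, features, weights):
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ features @ weights, 0.0)


# Toy corpus, invented for illustration only.
docs = ["win free prize now", "free prize claim",
        "meeting at noon", "lunch meeting today"]
labels = ["spam", "spam", "ham", "ham"]

keywords = category_keywords(docs, labels)
adj, vocab = build_graph(docs, keywords)
rng = np.random.default_rng(0)
hidden = gcn_layer(adj, np.eye(adj.shape[0]), rng.normal(size=(adj.shape[0], 4)))
```

In this sketch the first `len(docs)` rows of `hidden` are document representations and the remaining rows are word and keyword representations, which a following layer could use as input features for classification.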

Item Type:

Conference or Workshop Item (Paper)

Identification Number (DOI):

Subjects:

Subjects > Computer Science > Computation and Language (Computational Linguistics and Natural Language and Speech Processing)

Subjects > Computer Science > Information Retrieval

Deposited by:

Teerapong Leelanupab

Date Deposited:

2021-10-20 21:05:16

Last Modified:

2022-08-14 20:16:50
