Graph-based Dependency Parser Building for Myanmar Language

Hlaing, Zar Zar, Ye, Kyaw Thu, Supnithi, Thepchai and Netisopakul, Ponrudee (2022) Graph-based Dependency Parser Building for Myanmar Language. In: 2022 17th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP), 05-07 November 2022, Chiang Mai, Thailand.

Abstract

Dependency parsing (DP) examines the relationships between words in a sentence to determine its grammatical structure. The sentence is broken down into components on the premise that every linguistic unit in it is directly related to another; these relationships are called dependencies. Dependency parsing is a key step in natural language processing (NLP) for several text mining approaches. In recent years, Universal Dependencies (UD) has emerged as the dominant formalism for dependency parsing. Various UD corpora and dependency parsers are publicly accessible for resource-rich languages. However, no dependency parsing resources are publicly available for Myanmar, a low-resource language. We therefore manually extended the existing small Myanmar UD corpus (i.e., the myPOS UD corpus) into the myPOS version 3.0 UD corpus, to publish it as a publicly available resource. To evaluate the effect of the extended UD corpus against the original UD corpus, we used two graph-based neural dependency parsing models, jPTDP (joint POS tagging and dependency parsing) and UniParse (universal graph-based parsing), and measured performance with the unlabeled attachment score (UAS) and the labeled attachment score (LAS). We compared the accuracy of these graph-based neural models on the original and extended UD corpora. The experimental results showed that, compared to the original myPOS UD corpus, the extended myPOS version 3.0 UD corpus improved the accuracy of the dependency parsing models.
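
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. UAS is the fraction of tokens whose predicted head matches the gold head; LAS additionally requires the dependency label to match. The function name `attachment_scores` and the (head, label) token representation are assumptions made for illustration only; they are not taken from the paper, jPTDP, or UniParse, and details such as punctuation handling, which some evaluation scripts treat specially, are ignored here.

```python
from typing import List, Tuple

# Hypothetical representation: each token is (head_index, dependency_label),
# where head_index 0 stands for the artificial root.
Token = Tuple[int, str]


def attachment_scores(
    gold: List[List[Token]], pred: List[List[Token]]
) -> Tuple[float, float]:
    """Return (UAS, LAS) computed over all tokens of all sentences."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted corpora differ in sentence count")

    total = correct_heads = correct_labeled = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences differ in length")
        for (g_head, g_label), (p_head, p_label) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                correct_heads += 1  # counts toward UAS
                if g_label == p_label:
                    correct_labeled += 1  # counts toward LAS as well

    if total == 0:
        raise ValueError("no tokens to score")
    return correct_heads / total, correct_labeled / total


if __name__ == "__main__":
    # Toy three-token sentence; the labels are ordinary UD relation names.
    gold = [[(2, "nsubj"), (0, "root"), (2, "obj")]]
    pred = [[(2, "nsubj"), (0, "root"), (2, "obl")]]
    uas, las = attachment_scores(gold, pred)
    print(f"UAS = {uas:.3f}, LAS = {las:.3f}")
```

In the toy example every head is correct but one label is wrong, so LAS is lower than UAS. LAS can never exceed UAS, because a token counts toward LAS only when its head is already correct.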

Item Type:

Conference or Workshop Item (Paper)

Identification Number (DOI):

Subjects:

Subjects > Computer Science > Artificial Intelligence

Subjects > Computer Science > Computation and Language (Computational Linguistics and Natural Language and Speech Processing)

Subjects > Computer Science > Computers and Society

Subjects > Statistics > Computation

Deposited by:

Ponrudee Netisopakul

Date Deposited:

2023-11-28 18:26:22

Last Modified:

2023-11-29 12:45:59
