Increasing SMT and NMT Performance by Corpus Extension with Free Online Machine Translation Services

Hlaing, Zar Zar, Thu, Ye Kyaw, Supnithi, Thepchai and Netisopakul, Ponrudee (2020) Increasing SMT and NMT Performance by Corpus Extension with Free Online Machine Translation Services. In: 2020 International Conference on Advanced Information Technologies (ICAIT), 2020-11-04, Yangon, Myanmar.

Abstract

In machine translation, parallel corpora for a source-target language pair are essential to translation performance. However, the existing parallel corpora for low-resource languages are not sufficient to improve translation quality. In this paper, we explore the role of corpus extension using three freely available online machine translation services, “Google Translate”, “SYSTRAN Translate” and “Yandex Translate”, for the English-Thai language pair. We compare the performance of three statistical and neural machine translation systems trained on the original ASEAN-MT corpus and on its extended versions, which double the size of the original ASEAN-MT corpus. The results show that, for the SMT models, the extended Thai corpus improves th-en translation performance by up to 2.6%, and the extended English corpus improves en-th translation performance significantly, by up to 4.2%. For the NMT model, the extended Thai corpus improves translation performance by up to 5.5%.
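The idea of doubling a parallel corpus with machine-translated sentences can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that each synthetic pair keeps one original side of a sentence pair and replaces the other side with a machine translation of it, and the names `extend_corpus` and `toy_translate` are hypothetical. `toy_translate` is a placeholder that only tags its input; it does not call any online translation service.

```python
from typing import Callable, List, Tuple

# A sentence pair: (source sentence, target sentence).
Pair = Tuple[str, str]


def extend_corpus(pairs: List[Pair], translate: Callable[[str], str]) -> List[Pair]:
    """Return the original pairs followed by one synthetic pair per original.

    Assumption: a synthetic pair keeps the original target sentence and uses
    a machine translation of that target as the new source sentence.
    """
    synthetic = [(translate(target), target) for _, target in pairs]
    return pairs + synthetic


def toy_translate(sentence: str) -> str:
    # Hypothetical stand-in for a real MT service; it only marks the text.
    return f"<mt> {sentence}"


original = [("สวัสดี", "hello"), ("ขอบคุณ", "thank you")]
extended = extend_corpus(original, toy_translate)

# The extended corpus is twice the size of the original one.
print(len(original), len(extended))
```

In practice `translate` would wrap whichever translation service is used, and the extended list would be written back to the source and target training files.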

Item Type:

Conference or Workshop Item (Paper)

Deposited by:

Automated System

Date Deposited:

2021-09-09 23:53:48

Last Modified:

2021-09-19 05:50:12
