Statistical machine translation between related languages

Pushpak Bhattacharyya; Mitesh M. Khapra

NAACL-HLT 2016

Conference paper

12 Jun 2016

Abstract

Language-independent Statistical Machine Translation (SMT) has proven to be very challenging. The diversity of languages makes high accuracy difficult to achieve and requires substantial parallel corpora as well as linguistic resources (parsers, morphological analyzers, etc.). An interesting observation is that a large share of machine translation (MT) requirements involve related languages: translation is either (i) between related languages, or (ii) between a lingua franca (like English) and a set of related languages. For instance, India, the European Union and Southeast Asia have such translation requirements due to government, business and sociocultural communication needs.
