Publication
NAACL-HLT 2016
Conference paper
Statistical machine translation between related languages
Abstract
Language-independent Statistical Machine Translation (SMT) has proven to be very challenging. The diversity of languages makes high accuracy difficult to achieve and requires substantial parallel corpora as well as linguistic resources (parsers, morphological analyzers, etc.). An interesting observation is that a large share of machine translation (MT) requirements involves related languages. These requirements are either (i) between related languages, or (ii) between a lingua franca (like English) and a set of related languages. For instance, India, the European Union and Southeast Asia have such translation requirements due to government, business and sociocultural communication needs.