The University of Edinburgh's Submissions to the WMT19 News Translation Task
Rachel Bawden, Nikolay Bogoychev, Ulrich Germann, Roman Grundkiewicz, Faheem Kirefu, Antonio Valerio Miceli Barone, Alexandra Birch
School of Informatics, University of Edinburgh, Scotland
arXiv: 1907.05854v1
Abstract
The University of Edinburgh participated in the WMT19 Shared Task on News Translation in six
language directions: English-to-Gujarati, Gujarati-to-English, English-to-Chinese, Chinese-to-English,
German-to-English, and English-to-Czech. For all translation directions, we created or used
back-translations of monolingual data in the target language as additional synthetic training data. For
English-Gujarati, we also explored semi-supervised MT with cross-lingual language model pre-training,
and translation pivoting through Hindi. For translation to and from Chinese, we investigated
character-based tokenisation vs. sub-word segmentation of Chinese text. For German-to-English, we
studied the impact of vast amounts of back-translated training data on translation quality, gaining a few
additional insights over Edunov et al. (2018). For English-to-Czech, we compared different
pre-processing and tokenisation regimes.
1 Introduction

The University of Edinburgh participated in the WMT19 Shared Task on News Translation in six language directions: English-Gujarati (EN↔GU), English-Chinese (EN↔ZH), German-English (DE→EN) and English-Czech (EN→CS). All our systems are neural machine translation (NMT) systems trained in constrained data conditions with the Marian toolkit (https://marian-nmt.github.io; Junczys-Dowmunt et al., 2018). The different language pairs pose very different challenges, due to the characteristics of the languages involved and, arguably more importantly, due to the amount of training data available.
Pre-processing  For EN↔ZH, we investigate character-level pre-processing for Chinese compared with subword segmentation. For EN→CS, we show that it is possible in high-resource settings to simplify pre-processing by removing steps.
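The character-level option can be illustrated with a small, self-contained sketch: it splits Chinese text into individual characters while keeping Latin words and digits whole, so the result can be compared against a subword segmentation. This is an illustrative stand-in, not the pre-processing pipeline used for the submissions, and the CJK character range it matches is an assumption.

```python
import re

# Match a single CJK ideograph, or a run of Latin letters/digits, or any
# other non-space character (punctuation). The range \u4e00-\u9fff is the
# common CJK Unified Ideographs block; real pipelines may cover more.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+|[^\s]")

def char_tokenise(text: str) -> list[str]:
    """Character-level segmentation of Chinese text (illustrative only)."""
    return _TOKEN.findall(text)

if __name__ == "__main__":
    sentence = "爱丁堡大学参加了WMT19新闻翻译任务。"
    print(char_tokenise(sentence))
    # ['爱', '丁', '堡', '大', '学', '参', '加', '了', 'WMT19',
    #  '新', '闻', '翻', '译', '任', '务', '。']
```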
Exploiting non-parallel resources  For all language directions, we create additional, synthetic parallel training data. For the high-resource language pairs, we look at ways of effectively using large quantities of back-translated data. For example, for DE→EN, we investigated the most effective way of combining genuine parallel data with larger quantities of synthetic parallel data, and for CS→EN, we filter back-translated data by re-scoring translations using the MT model for the opposite direction. The challenge for our low-resource pair, EN↔GU, is producing sufficiently good models for back-translation, which we achieve by training semi-supervised MT models with cross-lingual language model pre-training (Lample and Conneau, 2019). We use the same technique to translate additional data from a related language, Hindi.
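To make the re-scoring step concrete, here is a minimal sketch of filtering back-translated pairs: each synthetic pair is scored with a model for the opposite translation direction and only the best-scoring fraction is kept. The `score_with_reverse_model` callable, the per-token length normalisation, and the `keep_ratio` cut-off are illustrative assumptions, not the settings used for the submission.

```python
from typing import Callable, Iterable

# (synthetic source sentence, genuine monolingual target sentence)
SentencePair = tuple[str, str]

def filter_backtranslations(
    pairs: Iterable[SentencePair],
    score_with_reverse_model: Callable[[str, str], float],
    keep_ratio: float = 0.8,
) -> list[SentencePair]:
    """Keep the best-scoring fraction of back-translated sentence pairs.

    Each pair is re-scored with the MT model for the opposite translation
    direction; scores are length-normalised and the lowest-scoring pairs
    are discarded.
    """
    scored = [
        (score_with_reverse_model(src, tgt) / max(len(tgt.split()), 1), (src, tgt))
        for src, tgt in pairs
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    n_keep = int(len(scored) * keep_ratio)
    return [pair for _, pair in scored[:n_keep]]

if __name__ == "__main__":
    def dummy_score(src: str, tgt: str) -> float:
        # Stand-in for a real reverse-model score (illustration only).
        return float(len(src))

    demo = [("ahoj světe", "hello world"), ("xx", "completely unrelated sentence")]
    print(filter_backtranslations(demo, dummy_score, keep_ratio=0.5))
```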
NMT Training settings  In all experiments, we test state-of-the-art training techniques, including using ultra-large mini-batches for DE→EN and EN↔ZH, implemented as optimiser delay.
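Optimiser delay simulates an ultra-large mini-batch by accumulating gradients over several smaller batches and applying a single optimiser update afterwards. The PyTorch snippet below is a generic sketch of that idea with a toy model, random data and an arbitrary delay factor; it is not Marian's implementation.

```python
import torch
from torch import nn

# Toy stand-ins for a real NMT model and data loader.
model = nn.Linear(16, 4)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
batches = [(torch.randn(32, 16), torch.randint(0, 4, (32,))) for _ in range(64)]

DELAY = 8  # accumulate gradients over 8 mini-batches -> effective batch of 8 x 32

optimiser.zero_grad()
for step, (x, y) in enumerate(batches, start=1):
    loss = loss_fn(model(x), y) / DELAY  # scale so the update matches one big batch
    loss.backward()                      # gradients accumulate in .grad
    if step % DELAY == 0:
        optimiser.step()                 # one update per DELAY mini-batches
        optimiser.zero_grad()
```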
Results summary  Official automatic evaluation results for all final systems on the WMT19 test set are summarised in Table 1. Throughout the paper, BLEU is calculated using SacreBLEU (Post, 2018) unless otherwise