Summary

Summary English essay

Rating
-
Sold
-
Pages
7
Uploaded on
27-04-2025
Written in
2024/2025

The vast majority of the popular English named entity recognition (NER) datasets contain American or British English data, despite the existence of many global varieties of English. As such, it is unclear whether they generalize for analyzing use of English globally.

Institution
Freshman / 9th Grade
Course
English language and composition










Written for

Institution
Freshman / 9th grade
Course
English language and composition
School year
1

Document information

Uploaded on
April 27, 2025
Number of pages
7
Written in
2024/2025
Type
Summary

Content preview

An Empirical Investigation of Multi-bridge Multilingual NMT Models

Anoop Kunchukuttan

arXiv (arXiv: 2110.07304v1)

Generated on April 27, 2025


Abstract
In this paper, we present an extensive investigation of multi-bridge, many-to-many multilingual NMT
models (MB-M2M), i.e., models trained on non-English language pairs in addition to English-centric
language pairs. In addition to validating previous work which shows that MB-M2M models can
overcome zero-shot translation problems, our analysis reveals the following results about multi-bridge
models: (1) it is possible to extract a reasonable amount of parallel corpora between non-English
languages for low-resource languages, (2) with limited non-English-centric data, MB-M2M models are
competitive with or outperform pivot models, and (3) MB-M2M models can outperform English-Any models
and perform at par with Any-English models, so a single multilingual NMT system can serve all
translation directions.
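
Result (1) concerns obtaining parallel data between non-English languages. One simple way this could be done is to pair sentences from two English-centric corpora that share the same English side. The sketch below illustrates that idea only; the function name `extract_bridge_pairs`, the exact-match criterion, and the toy romanized data are assumptions for illustration, not the procedure used in the paper.

```python
from collections import defaultdict


def extract_bridge_pairs(en_to_x, en_to_y):
    """Pair X and Y sentences whose English sides match exactly.

    Hypothetical helper: each argument is a list of
    (english_sentence, foreign_sentence) tuples.
    """
    # Index the first corpus by its (whitespace-stripped) English sentence.
    by_english = defaultdict(list)
    for en, x in en_to_x:
        by_english[en.strip()].append(x)

    # For every English sentence in the second corpus that also occurs
    # in the first, emit the direct X-Y sentence pair.
    pairs = []
    for en, y in en_to_y:
        for x in by_english.get(en.strip(), []):
            pairs.append((x, y))
    return pairs


# Toy English-Hindi and English-Tamil corpora (romanized placeholders).
en_hi = [("Good morning.", "suprabhat"), ("Thank you.", "dhanyavaad")]
en_ta = [("Thank you.", "nandri"), ("See you soon.", "viraivil sandhippom")]

bridge = extract_bridge_pairs(en_hi, en_ta)
print(bridge)
```

Only the sentence that appears on the English side of both corpora yields a Hindi-Tamil pair. Exact matching is the most conservative choice; it keeps the extracted pairs clean at the cost of missing near-duplicate English sentences.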

arXiv:2110.07304v1 [cs.CL] 14 Oct 2021
Anoop Kunchukuttan, Microsoft India, Hyderabad
Copyright © 2022, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Introduction

Neural Machine Translation has led to significant advances in MT quality in recent times (Bahdanau, Cho, and Bengio 2015; Wu et al. 2016; Sennrich, Haddow, and Birch 2016b,a; Vaswani et al. 2017). MT research has seen significant efforts in translation between English and other languages, driven in significant measure by the availability of English-centric parallel corpora. In particular, multilingual NMT models using English-centric parallel corpora have shown significant improvements for translation between English and low-resource languages (Firat, Cho, and Bengio 2016; Johnson et al. 2017).

Translation between non-English languages has received less attention, with the default approach being pivot translation (Lakew et al. 2017). Pivot translation is a strong baseline, but needs multiple decoding steps, resulting in increased latency and cascading errors. Zero-shot translation using English-centric many-to-many multilingual models (EC-M2M) (Johnson et al. 2017) is promising, but is plagued by problems of spurious correlation between input and output language (Gu et al. 2019; Arivazhagan et al. 2019). Hence, vanilla zero-shot translation quality significantly lags behind pivot translation. Various methods have been proposed to address these limitations by aligning encoder representations (Arivazhagan et al. 2019) or using a pseudo-parallel corpus between non-English languages during training (Lakew et al. 2017).

Recently, there has been interest in multi-bridge many-to-many multilingual models (MB-M2M, referred to as multi-bridge models henceforth). These models are trained on direct parallel corpora between non-English languages in addition to English-centric corpora (Rios, Müller, and Sennrich 2020; Freitag and Firat 2020; Fan et al. 2020). Such corpora can either be mined from monolingual corpora (Fan et al. 2020) using bitext mining approaches like LASER (Artetxe and Schwenk 2019) and LABSE (Feng et al. 2020) or extracted from English-centric parallel corpora (Rios, Müller, and Sennrich 2020; Freitag and Firat 2020). These works show that multi-bridge models can overcome zero-shot translation problems and perform at par with or better than pivot approaches. In addition, models using sep-