Published by Wilfrid Allen. Modified over 9 years ago.
1
MonoTrans2: A New Human Computation System to Support Monolingual Translation
Chang Hu, Benjamin B. Bederson, Philip Resnik and Yakov Kronrod
Translating with people who speak only one language
2
Too Much to Translate
International Children’s Digital Library (www.childrenslibrary.org)
– 4,386 books
– 54 languages
– 100K unique visitors/month
– 1,500 volunteer translators
English and Spanish? Croatian and Japanese?
3
Uncommon Languages
Fanm gen tranche pou fe` yon pitit nan Delmas 31
Undergoing children delivery Delmas 31
Munro, Robert. 2010. Crowdsourced translation for emergency response and beyond. NSF Workshop on crowdsourcing and translation, University of Maryland.
4
Bilingual Translators are Hard to Find
5
Machine Translation?
– Large volume, cheap, fast
– Unreliable quality
6
Translate with the Monolingual Crowd
Translation with bilingual translators
Wikipedia: 900 translators vs. 1,200,000 contributors
Chang Hu. Collaborative Translation by Monolingual Users, CHI '09
Chang Hu, Benjamin B. Bederson, Philip Resnik. Translation by Iterative Collaboration between Monolingual Users (MonoTrans), GI '10
7
Monolingual Crowds Fixing Machine Translation Together
8
Original: Estoy bien.
Translation: I am fine.
Target side:
1. Vote on candidates
Source side:
1. Vote on back translation
9
Original: Estoy bien.
Translation: Am fine.
Target side:
1. Vote on candidates
2. Target-side editing ("Am fine." is edited to "I am fine.")
Source side:
1. Vote on back translation
10
Original: Estoy bien.
Translation: I am been.
Target side:
1. Vote on candidates
2. Target-side editing
3. Identify translation errors ("been" is marked in "I am been.")
Source side:
1. Vote on back translation
2. Explain phrase ("bien", the source of the marked span, in "Estoy bien.")
11
Original: Estoy bien.
Translation: I am been.
Target side:
1. Vote on candidates
2. Target-side editing
3. Identify translation errors ("been" is marked in "I am been.")
Source side:
1. Vote on back translation
2. Explain phrase ("bien", the source of the marked span, in "Estoy bien.")
3. Paraphrase source sentence ("Yo estoy bien.")
Repeat steps 1, 2, 3 …
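The task cycle on these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MonoTrans2 implementation: the `TASKS` table reflects one reading of which crowd performs which step, and the `Sentence` class, its methods, and the simple majority-vote rule are hypothetical names and choices introduced here.

```python
from dataclasses import dataclass, field

# Which crowd performs which task, as read from the slides (an assumption,
# not a specification of the real system).
TASKS = {
    "target": ["vote on candidates", "edit translation", "identify translation errors"],
    "source": ["vote on back translation", "explain phrase", "paraphrase source sentence"],
}


@dataclass
class Sentence:
    """One source sentence plus its competing target-language candidates."""
    source: str
    votes: dict = field(default_factory=dict)  # candidate text -> vote count

    def add_candidate(self, text: str) -> None:
        # Machine translation output and human edits enter the same pool.
        self.votes.setdefault(text, 0)

    def vote(self, text: str) -> None:
        # A vote for an unseen text also registers it as a candidate.
        self.votes[text] = self.votes.get(text, 0) + 1

    def best(self) -> str:
        # Highest vote count wins; ties go to the earliest candidate.
        return max(self.votes, key=self.votes.get)


sentence = Sentence("Estoy bien.")
sentence.add_candidate("I am been.")   # hypothetical machine translation output
sentence.add_candidate("I am fine.")   # target-side edit by a monolingual user
for choice in ["I am fine.", "I am been.", "I am fine."]:
    sentence.vote(choice)
print(sentence.best())  # prints "I am fine."
```

Keeping machine output and human edits in one candidate pool means a single voting step can rank them against each other, which matches the "Vote on candidates" step that appears on every slide of the cycle.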
12
UI
22
Experiments
23
Experiment 1 – Children’s Books
– 60 Spanish / 22 German speakers
– ICDL volunteers
– Worked on:
  – 4 Spanish books => German
  – 1 German book => Spanish
– Machine translation engine: Google Translate
24
Evaluation of MonoTrans2 Output
– 2 German-Spanish bilingual evaluators (not part of MonoTrans2!)
– Fluency and accuracy, each scored on a 5-point scale
– How much improvement over Google Translate?
– Original: Estoy muy bien.
– Fluent, not accurate: The weather is good.
– Accurate, not fluent: Me is very good.
25
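Results - Fluency (chart; scale runs from Worst to Best)
26
Results - Fluency (chart; scale runs from Worst to Best)
27
Results - Accuracy (chart; scale runs from Worst to Best)
28
Results - Accuracy (chart; scale runs from Worst to Best)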
29
Ready for ICDL?
– Ready: both bilingual evaluators agree score = 5
– Machine translation (Google) only: 10% of sentences ready
– MonoTrans2: 68% of sentences ready
30
Experiment 2 – Haitian Earthquake SMS
– 4 Haitian Creole speakers
– 5 English-speaking students
– 21 other English speakers
– Worked on 408 text messages
– Machine translation (Google) only: 25% of sentences
– MonoTrans2: 38% of sentences ready
– Difficulty: text messages >> children’s books
31
Sample Results
– Haitian Creole: Enfòmasyon sou tranblemen de tè
– Ground Truth: Information on the earthquake
– Google: Information tranblemen ground
– MonoTrans2: Information on the earthquake
32
Sample Results
– Haitian Creole: Bonjou. Mwen ta renmen konnen si imigrasyon ouvè SVP. Mèsi.
– Ground Truth: Hello. I would like to know if immigration is open please. Thank you.
– Google: Hello. I would like to know if open immigration SVP. Thank you.
– MonoTrans2: Hello. I would like to know if immigration is open please. Thank you.
33
Recap
MonoTrans2
– No human bilingual knowledge
– Dramatic improvement over machine translation
translatetheworld.org
34
Take-Away Message
– People + machine > people or machine
– Combining two crowds with different skills
translatetheworld.org
35
MonoTrans2 User Actions
– Source: vote 119, candidate 59, answer 45, approve 77
– Target: vote 1012, candidate 202, error span 162
36
Backup Slides
37
International Children’s Digital Library [previously funded by NSF ITR] www.childrenslibrary.org
38
Translation speed
– Professional translators: 2000 words per day
– MonoTrans2: 800 words per day
– Translation firm on the four German/Spanish books: 4 days
– MonoTrans2: 4 days
– Haitian SMS experiment: 284.75 words per minute
39
UI
40
Target Side - Identify Errors
41
Target Side - Edit Translations
42
Source Side
43
Source Side – Explain Errors
44
Ready for ICDL?
                                 Google    MonoTrans2
Sentences with fluency = 5         21         112
Sentences with adequacy = 5        17         118
Sentences where BOTH = 5           17         110
Sentences for which both bilingual evaluators agree score = 5 (N=162 sentences worked on in the experiment)
– Machine translation only: 10% of sentences ready
– MonoTrans2: 68% of sentences ready
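The percentages on this slide follow from the counts of sentences where both fluency and adequacy scored 5 (17 for Google, 110 for MonoTrans2) out of N=162. The sketch below reproduces that arithmetic. The `is_ready` helper is a hypothetical reading of the readiness criterion (every bilingual evaluator gives 5 on both measures), and the function names are introduced here for illustration only.

```python
def is_ready(ratings):
    """Return True if every evaluator gave 5 for both fluency and adequacy.

    ratings: one (fluency, adequacy) pair per bilingual evaluator, 1-5 scale.
    An empty list is treated as not ready (an assumption of this sketch).
    """
    return bool(ratings) and all(
        fluency == 5 and adequacy == 5 for fluency, adequacy in ratings
    )


def percent_ready(ready_count, total):
    # Rounded to the nearest whole percent, as on the slide.
    return round(100 * ready_count / total)


N = 162  # sentences worked on in the experiment
READY = {"Google": 17, "MonoTrans2": 110}  # sentences where BOTH scores = 5

for system, count in READY.items():
    print(system, percent_ready(count, N))
# prints "Google 10" then "MonoTrans2 68"
```

Requiring agreement on the top score from every evaluator is a strict bar: a single rating of 4 from one evaluator on one measure is enough to mark a sentence as not ready.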
45
Experiment 3
An alternative use case for crowdsourced translation…
– My family in Carrefour, 24 Cote Plage, 41A needs food and water
– People trapped in Sacred Heart Church, PauP
– General Hospital has less than 24 hrs. supplies
– Undergoing children delivery Delmas 31
Munro, Robert. 2010. Crowdsourced translation for emergency response and beyond. NSF Workshop on crowdsourcing and translation, University of Maryland.
46
MonoTrans2 now available at: www.translatetheworld.org
48
Fluency Distribution
49
Adequacy Distribution
50
Punchline (provisional)
                                 Google      MonoTrans2
Sentences with fluency = 5        1 (1%)      22 (30%)
Sentences with adequacy = 5      11 (14%)     29 (38%)
Sentences where BOTH = 5          0 (0%)      14 (18%)
Sentences for which three bilingual evaluators agree score = 5 (N=76 sentences completed)
– Straight MT: 0% of sentences preserve all the meaning
– MonoTrans2: 38% of sentences preserve all the meaning