1
Knowledge of provenance and its effects on translation performance (in an integrated TM/MT environment)
Carlos S. C. Teixeira
Intercultural Studies Group, Universitat Rovira i Virgili (Tarragona, Spain)
carlostx@linguanativa.com.br
NLPCS 2011: 8th International Workshop on Natural Language Processing and Cognitive Science
Special Issue: Human-Machine Interaction in Translation
20-21 August 2011, Copenhagen, Denmark
2
Carlos S. C. Teixeira © 2011 Universitat Rovira i Virgili
4
Speed: Will you translate faster?
Effort: Will you feel more tired?
Quality: Will you translate better?
Reason: Does provenance play a role?
5
V = Visual environment; B = Blind environment
Speed: V is faster than B
H1: The translation speed is higher in V than in B
Effort: V requires less editing than B
H2: The amount of editing is smaller in V than in B
Quality: V and B produce similar quality
H4: There is no significant difference in quality between V and B
6
English text, Spanish text
Translation Memory (Alignment)
Source text 1, Source text 2
Exact matches
90-99% fuzzy
80-89% fuzzy
70-79% fuzzy
No matches (MT)
TM 1, TM 2
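The match grid on this slide can be read as a simple classification of each segment by its TM similarity score. The following is a minimal sketch under stated assumptions: the function name `match_band` is hypothetical, and treating any segment below 70% (or with no TM hit at all) as an MT-fed "no match" is an assumption, since the slide names the bands but not the cut-off rule.

```python
from typing import Optional


def match_band(similarity: Optional[int]) -> str:
    """Map a TM similarity percentage to the bands named on the slide.

    Assumption: anything below 70%, or no TM hit at all (None),
    is treated as a segment fed by machine translation.
    """
    if similarity is not None and not 0 <= similarity <= 100:
        raise ValueError("similarity must be between 0 and 100")
    if similarity is None or similarity < 70:
        return "No match (MT)"
    if similarity == 100:
        return "Exact match"
    if similarity >= 90:
        return "90-99% fuzzy"
    if similarity >= 80:
        return "80-89% fuzzy"
    return "70-79% fuzzy"


if __name__ == "__main__":
    for score in (100, 95, 86, 72, None):
        print(score, "->", match_band(score))
```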
7
◦ Same type of text
◦ Same types of matches
◦ Same machine-translation engine
(ecological validity)
So what is different?
Provenance information
9
BBFlashBack
◦ Screen activity
◦ Keystrokes
◦ Mouse movements and clicks
◦ Translator’s face
◦ Sound (voices, keyboard, etc.)
Retrospective interviews
Quality assessment
10
Data treatment

SEG  MATCH      1st RENDERING                TYPING  2nd RENDERING                TYPING  NOTES
                start     end       duration         start     end       duration
1    FUZZY 75%  00:00,00  00:40,33  40,33    11      18:38,44  18:43,89  05,45    0
2    FUZZY 86%  00:40,56  01:37,78  57,22    39      18:44,56  18:46,22  01,66    0
3    NO MATCH   02:30,33  04:31,67  121,34   17      18:46,89  19:19,33  32,44    19      1st rendering: Asks a question to researcher
4    NO MATCH   04:31,67  04:38,22  06,55            19:20,00  19:28,22  08,22    0
                05:05,78  05:09,56  03,78
                06:28,56  06:51,44  22,88
                11:51,33  13:06,33  75,00    60
                14:04,33  14:35,11  30,78    34
5    NO MATCH   14:35,67  15:57,22  81,55    22      19:28,78  19:43,67  14,89    8
6    NO MATCH   15:57,89  17:17,89  80,00    48      19:44,33  19:59,89  15,56    0
7    FUZZY 87%  17:18,56  19:14,44  115,88   101     20:00,44  20:16,56  16,12    5
8    EXACT      19:14,44  20:49,11  94,67    119     20:17,22  20:34,33  17,11    9
                22:47,22  23:03,78  16,56    1
9    NO MATCH   23:04,33  23:24,44  20,11    -       20:35,00  20:59,67  42,45    26
                26:08,44  26:52,00  43,56    68
                27:53,56  28:15,22  21,66    10
10   FUZZY 95%  28:16,00  30:35,11  139,11   26      21:00,44  21:11,67  11,23    0
                31:03,56  31:39,78  36,22    0
                31:51,33  32:48,67  57,34    20
                33:23,67  34:28,00  64,33    48
11   FUZZY 99%  34:28,56  34:51,67  23,11    40      21:12,22  21:13,33  01,11    0
12   FUZZY 74%  34:52,33  35:14,56  22,23    0       21:14,11  21:25,11  11,00    0
                35:41,11  37:17,89  96,78    90
                37:46,44  38:04,33  17,89    2
                43:11,56  43:19,56  08,00    0
                55:10,67  55:51,89  41,22    38
13   EXACT      55:52,44  57:12,22  79,78    50      21:25,78  21:39,56  13,78    0
14   EXACT      57:12,78  57:23,22  10,44    3       21:40,00  21:44,33  04,33    0       1st rendering: Researcher interrupts subject to tell he has to leave the room for a while
                57:35,89  58:25,22  49,33    35
                59:10,22  59:31,89  21,67
15   NO MATCH   59:32,56  00:44,89  72,33    36      21:44,89  21:58,78  13,89    0
16   EXACT      00:45,56  01:27,44  41,88            21:59,44  22:04,11  04,67    0
                02:28,78  02:48,22  19,44
                05:22,44  05:33,56  11,12
                05:37,78  06:25,33  47,55
                06:54,00  07:17,89  23,89
                09:11,78  09:31,00  19,22
                11:05,67  11:24,22  18,55    10
                12:07,89  12:28,11  20,22
                13:15,67  13:32,11  16,44
                13:44,22  14:24,11  39,89    75
                total               258,20
17   FUZZY 86%  14:24,67  15:01,44  36,77    21      22:04,67  22:17,11  12,44    0
18   NO MATCH   15:02,00  15:14,00  12,00    11      22:17,67  22:18,89  01,22    0
19   FUZZY 93%  15:14,67  15:35,22  20,55    31      22:19,44  23:00,44  41,00    23      2nd rendering: Check sound here!
                15:56,00  16:24,00  28,00    44
20   FUZZY 72%  16:24,56  17:11,56  47,00    39      23:01,11  23:04,33  03,22    0
21   EXACT      17:12,22  17:47,11  34,89    9       23:04,89  23:14,44  09,55    0
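The log above records each visit to a segment as a start and end timestamp, and a segment that the translator revisited has several such visits. As a minimal sketch of how the per-segment time could be derived, the code below parses the timestamps and sums the visits. It assumes the format is minutes:seconds with a decimal comma and that the recording clock wraps after 60 minutes (as the jump from 59:32,56 to 00:44,89 in segment 15 suggests); the function names are illustrative and not taken from the presentation.

```python
def to_seconds(stamp: str) -> float:
    """Convert a 'mm:ss,cc' timestamp (decimal comma) to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds.replace(",", "."))


def visit_duration(start: str, end: str) -> float:
    """Duration of one visit in seconds, allowing for a wrap past 60 minutes."""
    duration = to_seconds(end) - to_seconds(start)
    if duration < 0:  # assumption: the minute counter restarted at 00
        duration += 3600
    return round(duration, 2)


# First-rendering visits to segment 4 (NO MATCH), copied from the log.
segment_4_visits = [
    ("04:31,67", "04:38,22"),
    ("05:05,78", "05:09,56"),
    ("06:28,56", "06:51,44"),
    ("11:51,33", "13:06,33"),
    ("14:04,33", "14:35,11"),
]

segment_4_total = round(
    sum(visit_duration(start, end) for start, end in segment_4_visits), 2
)

if __name__ == "__main__":
    print(segment_4_total)                         # 138.99
    print(visit_duration("59:32,56", "00:44,89"))  # 72.33
```

The total of 138,99 seconds for segment 4 agrees with the time reported for the second no-match segment in the per-segment summary on the following slide.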
11
Data treatment
TRANSLATION BLIND (Text12)

              SOURCE  TIME (sec)     SPEED (words/h)  TIME (sec)    SPEED (words/h)  TARGET  TYPED CHARS    AMOUNT OF EDITING  TYPED CHARS  AMOUNT OF EDITING
              WORDS   1st rendition  1st rendition    Proofreading  Combined         CHARS   1st rendition  1st rendition      2nd rend     Combined
EXACT (100%) MATCHES
SEGMENT #1    30      111,23         971              17,11         842              152     120            78,95%             9            84,87%
SEGMENT #2    30      79,78          1354             13,78         1154             178     50             28,09%             0
SEGMENT #3    25      81,44          1105             4,33          1049             150     38             25,33%             0
SEGMENT #4    18      258,2          251              4,67          247              156     85             54,49%             0
SEGMENT #5    25      34,89          2580             9,55          2025             156     9              5,77%              0
TOTAL         128     565,54         815              49,44         749              792     302            38,13%             9            39,27%
90-99% MATCHES
SEGMENT #1    38      297            461              11,23         444              308     94             30,52%             0
SEGMENT #2    7       23,11          1090             1,11          1040             41      40             97,56%             0
SEGMENT #3    20      48,55          1483             41            804              119     75             63,03%             23           82,35%
TOTAL         65      368,66         635              53,34         555              468     209            44,66%             23           49,57%
80-89% MATCHES
SEGMENT #1    27      57,22          1699             1,66          1651             108     39             36,11%             0
SEGMENT #2    24      115,88         746              16,12         655              160     101            63,13%             5            66,25%
SEGMENT #3    26      36,77          2546             12,44         1902             148     21             14,19%             0
TOTAL         77      209,87         1321             30,22         1155             416     161            38,70%             5            39,90%
70-79% MATCHES
SEGMENT #1    16      40,33          1428             5,45          1258             108     11             10,19%             0
SEGMENT #2    44      186,12         851              11            804              216     130            60,19%             0
SEGMENT #3    17      47             1302             3,22          1219             94      39             41,49%             0
TOTAL         77      273,45         1014             19,67         946              418     180            43,06%             0
NO MATCHES (MT FEEDS)
SEGMENT #1    31      121,34         920              32,44         726              219     17             7,76%              19           16,44%
SEGMENT #2    30      138,99         777              8,22          734              162     94             58,02%             0
SEGMENT #3    26      81,55          1148             14,89         971              153     22             14,38%             8            19,61%
SEGMENT #4    28      80             1260             15,56         1055             223     48             21,52%             0
SEGMENT #5    15      85,33          633              42,45         423              95      78             82,11%             26           109,47%
SEGMENT #6    29      72,33          1443             13,89         1211             184     36             19,57%             0
SEGMENT #7    6       12             1800             1,22          1634             33      11             33,33%             0
TOTAL         165     591,54         1004             128,67        825              1069    306            28,62%             53           33,58%
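The slides do not spell out how the speed and editing columns are calculated. The sketch below shows one reading that reproduces the tabulated figures for the first exact-match segment of the Blind text: speed as source words per hour, combined speed over first-rendition plus proofreading time, and amount of editing as typed characters divided by target characters. These formulas are inferred from the numbers, not stated by the author, and the function names are illustrative.

```python
def words_per_hour(source_words: int, seconds: float) -> float:
    """Translation speed in source words per hour."""
    return source_words / seconds * 3600


def amount_of_editing(typed_chars: int, target_chars: int) -> float:
    """Typed characters as a fraction of the final target length."""
    return typed_chars / target_chars


# Exact-match segment #1 of the Blind text, values copied from the table.
source_words = 30
first_rendition_sec = 111.23
proofreading_sec = 17.11
target_chars = 152
typed_first = 120
typed_second = 9

speed_first = words_per_hour(source_words, first_rendition_sec)
speed_combined = words_per_hour(
    source_words, first_rendition_sec + proofreading_sec
)
editing_first = amount_of_editing(typed_first, target_chars)
editing_combined = amount_of_editing(typed_first + typed_second, target_chars)

if __name__ == "__main__":
    print(round(speed_first))              # 971
    print(round(speed_combined))           # 842
    print(f"{editing_first:.2%}")          # 78.95%
    print(f"{editing_combined:.2%}")       # 84.87%
```

Under this reading an editing value above 100%, as in the fifth no-match segment, simply means that more characters were typed than the final target contains.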
12
Preliminary results

                       SOURCE  TIME (sec)     SPEED (words/h)  TIME (sec)     SPEED (words/h)  TARGET  TYPED CHARS    AMOUNT OF EDITING  TYPED CHARS  AMOUNT OF EDITING
                       WORDS   1st rendition  1st rendition    2nd rendition  Combined         CHARS   1st rendition  1st rendition      2nd rend     Combined
COPY                   135     207,89         2337,77478                                       752     806            107,18%
TRANSL W/O CAT         79      380,89         746,672268                                       602     703            116,78%
VISUAL
EXACT (100%) MATCHES   131     155            3036             94             1895             792
90-99% MATCHES         91      234            1397             101            977              464
80-89% MATCHES         51      153            1197             27             1019             376
70-79% MATCHES         87      454            689              88             577              457
NO MATCHES (MT FEEDS)  150     783            690              132            591              1018
TOTAL                  510     1780           1031             441            827              3107
BLIND
EXACT (100%) MATCHES   128     566            815              49             749              792     302            38,13%             9            39,27%
90-99% MATCHES         65      369            635              53             555              468     209            44,66%             23           49,57%
80-89% MATCHES         77      210            1321             30             1155             416     161            38,70%             5            39,90%
70-79% MATCHES         77      273            1014             20             946              418     180            43,06%             0
NO MATCHES (MT FEEDS)  165     592            1004             129            825              1069    306            28,62%             53           33,58%
TOTAL                  512     2009           917              281            805              3163
102,71% (unlabelled)
13
Preliminary results
Subject 1: Translation speed (words/hour)
14
Preliminary results
Subject 1: Translation speed (words/hour)
15
Preliminary results
Subject 1: Translation speed (words/hour)
16
Preliminary results
Subject 2: Translation speed (words/hour)
17
Preliminary results
Subject 2: Translation speed (words/hour)
18
Preliminary results
Subject 2: Translation speed (words/hour)
19
Preliminary results
Quality
20
Preliminary results
Conclusions:
The test of the first hypothesis (speed) is inconclusive when the whole texts are taken as the reference: Subject 1 was slightly faster (5.2 percent) in environment V, while Subject 2 was slightly faster (5.6 percent) in environment B.
Overall speed depends on how the different types of translation suggestions are distributed in the texts, as well as on differences between individual translators.
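The dependence of overall speed on the mix of suggestion types can be illustrated with the rounded first-rendition figures from the Blind rows of the summary table. In this sketch, which assumes speed is computed as words divided by time, the whole-text speed is total words over total time, so categories that take up more of the text weigh more heavily. It therefore differs from a plain average of the per-category speeds. Variable names are illustrative.

```python
# (source words, first-rendition seconds) per suggestion type,
# Blind environment, copied from the rounded summary table.
blind = {
    "exact": (128, 566),
    "90-99%": (65, 369),
    "80-89%": (77, 210),
    "70-79%": (77, 273),
    "no match (MT)": (165, 592),
}


def words_per_hour(words: int, seconds: float) -> float:
    return words / seconds * 3600


total_words = sum(words for words, _ in blind.values())
total_seconds = sum(seconds for _, seconds in blind.values())

# Whole-text speed: every second counts once, whatever its category.
overall_speed = words_per_hour(total_words, total_seconds)

# Plain average of category speeds: ignores how much text each category covers.
category_speeds = [words_per_hour(w, s) for w, s in blind.values()]
unweighted_mean = sum(category_speeds) / len(category_speeds)

if __name__ == "__main__":
    print(round(overall_speed))    # 917, the total reported on the slide
    print(round(unweighted_mean))  # noticeably higher than the overall speed
```

A text with a larger share of slow categories would pull the overall figure down even if the translator worked at the same pace within each category, which is why whole-text comparisons between V and B are hard to interpret.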
21
Small number of subjects
Small number of segments
Irregular segments
Terminology
Segment identification
Experience increases over time
Subject variability
Quality assessment?
22
Conclusions
Generalisations?
Specific type of text
Particular subject
Given fuzzy match grid
A particular MT engine
23
Quality assessments
Retrospective interviews
Statistical analysis
MT trust scores?
Eye-tracking?
Translog?
Implications/Applications of findings
25
Thank you!
carlostx@linguanativa.com.br
Carlos S. C. Teixeira
Intercultural Studies Group
Universitat Rovira i Virgili (Tarragona, Spain)