1
Computer-Mediated Multilingual Communication
Case: Pangaea
Ishida & Matsubara Laboratory – Ari Hautasaari
@ COCON Karasuma, 27.3.2009

Outline:
– Target system introduction
– Quantitative analysis of the statistics
– Research problems
– Solution discussion
– Future research
2
Target System Introduction
3
Pangaea Multilingual BBS

“Pangaea is the non-profit organization headquartered in Tokyo, Japan. The people who participate in a project of pangaea by the various shapes from each country are called “pangaean”. This site is the place with which pangaean in all over the world can communicate seamlessly.”

The target system for the case study is the Pangaea organization's multilingual bulletin board system (BBS), used for intra-organization communication and CSCW.
Pangaea works in 5 countries: Japan, Korea, Austria, Kenya and, most recently, Malaysia.
The organization has approximately 240 volunteers around the world: 130 in Japan and 110 in other countries.
Pangaea has introduced its multilingual and multicultural services to over 3000 children all over the world.
4
Multilingual BBS architecture
Basic BBS design in which all messages are saved under BBS Topics and BBS Messages.
The BBS is accessible only to users with an assigned user ID and password.
Users choose the interface language and the default message language when they log in to the system.
Four languages are available: Japanese, Korean, English and German. More languages may be added in the future.
The interface language is independent of the language used in messages, so German-speaking users can use the German interface and post messages in English.
5
Flowchart for posting messages
– Topic starters post their messages in their mother tongue.
– The plain message is translated into 3 target languages (in practice four language versions exist, since the “translation” into the source language is the source text itself).
– Users can read the messages and post answers in their native or preferred language, depending on the language they choose to use in the BBS.
– Users can correct the machine-translated messages by hand through the BBS.

Japanese is the main source language. German is supported but not frequently used.

Messages by source language
Language   Posts
Japanese   408
Korean      25
English     32
German       0
Total      465
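The posting flow above can be sketched roughly as follows. This is a minimal illustration, not the Pangaea implementation: the `Message` class, the function names, the language codes and the stand-in `fake_translate` function are all assumptions made for the example, and no real MT service is called.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# The four BBS languages named on the slide; the codes are an assumption.
LANGUAGES = ("ja", "ko", "en", "de")


@dataclass
class Message:
    source_lang: str
    texts: Dict[str, str] = field(default_factory=dict)         # language -> current text
    corrected_by: Dict[str, str] = field(default_factory=dict)  # language -> user id


def post_message(text: str, source_lang: str,
                 translate: Callable[[str, str, str], str]) -> Message:
    """Store the source text and machine-translate it into the other 3 languages."""
    if source_lang not in LANGUAGES:
        raise ValueError(f"unsupported language: {source_lang}")
    msg = Message(source_lang)
    for lang in LANGUAGES:
        # The "translation" into the source language is the source text itself.
        msg.texts[lang] = text if lang == source_lang else translate(text, source_lang, lang)
    return msg


def correct_translation(msg: Message, lang: str, new_text: str, user_id: str) -> None:
    """Let a user overwrite one machine-translated version by hand."""
    if lang == msg.source_lang:
        raise ValueError("the source text is not a translation")
    msg.texts[lang] = new_text
    msg.corrected_by[lang] = user_id


def fake_translate(text: str, src: str, dst: str) -> str:
    # Hypothetical stand-in for the MT service: it only tags the text.
    return f"[{src}->{dst}] {text}"


msg = post_message("こんにちは", "ja", fake_translate)
correct_translation(msg, "en", "Hello", user_id="13")
print(msg.texts)
```

Recording the corrector per language, as in `corrected_by`, is one way to keep the information the later slides report as missing from the database.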
6
Quantitative Analysis of the Statistics
7
The quantitative data on the Pangaea BBS was collected from the late-2008 version of the service and extracted from the contents of the Pangaea SQL server.
Because of privacy issues and decentralized storage of some data (log-in information, personal information, etc.), some statistics were not available for this study.
Since some essential variables are not stored in the database, some of the data (translation corrections) had to be extracted and examined by hand.
Topics include system messages and user instructions as well as intra-organizational communication in the form of reports. Topics are divided by activity sites and activities.
The number of actual users was not available, so topic starters are used to represent the number of active users.

Basic statistics
Topics                               339
Messages (replies to topics)         126
Posts in total (topics + replies)    465
Answer rate (replies per topic)      37%
Topic starters                        61
Categories                            16
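The basic statistics are consistent with each other if the answer rate is read as replies per topic and the 465 posts as topics plus replies. That reading is an assumption, since the slide does not give the formula. A short check:

```python
topics = 339   # topics started
replies = 126  # reply messages posted to topics

total_posts = topics + replies   # matches the 465 messages by source language
answer_rate = replies / topics   # replies per topic

print(total_posts, f"{answer_rate:.0%}")
```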
8
The number of topics greatly exceeds the number of reply messages in the system. Topics in this case represent the individual reports, messages and announcements.
In the chart, the number of topics appears as a negatively skewed distribution: the number of users increases to the right along the horizontal axis, and the number of topics started increases upward along the vertical axis.
[Figure: Topics distribution. Horizontal axis: Users. Vertical axis: Topics started per user.]
9
Percentage of posts by Top-5 posters: 48%
Percentage of Top-5 posters in the user base: 8%
Average number of topics started per user (mean): 5.6
Median of topics started per user: 3
Mode of topics started per user: 1
Because the distribution of topics started per user is strongly skewed (the mean of 5.6 lies well above the median of 3), we use the median as the typical number of topics per user. The most common case is a user who posts 1 topic (mode).
[Figure: Topics distribution. Horizontal axis: Users. Vertical axis: Topics started per user. Data labels in the chart include 68, 37, 22, 19 and 17.]
10
Topics per user
Users   Topics started
18       1
10       2
11       3
 3       4
 5       5
 3       6
 3       7
 1       8
 0       9
 1      10
 1      11
 1      17
 1      19
 1      22
 1      37
 1      68
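The summary figures reported for the topic distribution can be recomputed from the topics-per-user table. The sketch below uses only the counts from that table; the variable names are chosen for the example.

```python
from statistics import mean, median, mode

# topics started -> number of users who started that many topics
users_by_topic_count = {
    1: 18, 2: 10, 3: 11, 4: 3, 5: 5, 6: 3, 7: 3, 8: 1,
    9: 0, 10: 1, 11: 1, 17: 1, 19: 1, 22: 1, 37: 1, 68: 1,
}

# Expand the table into one sorted value per user.
samples = sorted(
    topics for topics, users in users_by_topic_count.items() for _ in range(users)
)

n_users = len(samples)
n_topics = sum(samples)
mean_topics = mean(samples)
median_topics = median(samples)
mode_topics = mode(samples)

top5_topics = sum(samples[-5:])           # topics started by the five most active users
top5_share_of_topics = top5_topics / n_topics
top5_share_of_users = 5 / n_users

print(n_users, n_topics, round(mean_topics, 1), median_topics, mode_topics)
print(f"{top5_share_of_topics:.0%} of topics from {top5_share_of_users:.0%} of users")
```

The mean sits well above the median because a handful of very active users pull it upward, which is why the median is the more representative figure here.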
11
Translation corrections were collected by hand, by comparing the update tag with the creation tag in the database and comparing a new MT with the existing one.
Total MT correction rate by users (Japanese): 14%
Total correctors: 6
Percentage of user base: ~10%
Total topics started by correctors: 28
Percentage of total topics: 8%
Average (mean) of topics started by correctors: 4.67

Top-5 topic posters
User ID   Topics started
7          68
6          37
8          22
21         19
9          17
Total     163

Translation corrections by users
User ID   Entries corrected   Topics started
13        3                    7
27        1                    5
56        1                    5
62        1                    5
79        1                    2
80        1                    4
Total     8                   28
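The manual procedure described above (compare the update tag with the creation tag, then compare a new MT with the stored text) could be automated along these lines. This is only a sketch: the `TranslationRow` fields are hypothetical and do not reflect the actual Pangaea database schema. The second part reproduces the 14% figure under the assumption that it is the 8 corrected entries divided by the 57 messages with Korean or English as source language; the slides do not state the formula.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TranslationRow:
    message_id: int
    created_at: datetime   # creation tag (hypothetical field name)
    updated_at: datetime   # update tag (hypothetical field name)
    stored_text: str       # translation currently stored in the BBS
    fresh_mt: str          # machine translation regenerated for comparison


def is_user_corrected(row: TranslationRow) -> bool:
    """Flag a row that was updated after creation and no longer matches the MT output."""
    return row.updated_at > row.created_at and row.stored_text != row.fresh_mt


rows = [
    TranslationRow(1, datetime(2008, 10, 1), datetime(2008, 10, 1), "A", "A"),
    TranslationRow(2, datetime(2008, 10, 2), datetime(2008, 10, 5), "B fixed", "B"),
    TranslationRow(3, datetime(2008, 10, 3), datetime(2008, 10, 4), "C", "C"),
]
corrected_ids = [row.message_id for row in rows if is_user_corrected(row)]
print(corrected_ids)

# Assumed reading of the correction rate: corrected entries over messages
# that were machine-translated into Japanese (Korean + English sources).
corrected_entries = 8
translated_into_japanese = 25 + 32
correction_rate = corrected_entries / translated_into_japanese
print(f"{correction_rate:.0%}")
```

Row 3 shows why both checks are needed: an update timestamp alone does not prove that the text was changed by a user.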
12
Poster demographic is highly skewed, with almost half of the posts made by only 5 users.
Answer rate to topics is quite low.
– No interaction between people through the BBS.
The most common case is a user who posts only one topic.
– No incentive to use the BBS as a main communication medium for an average user.
Translation correction rate is low in Japanese. No translation corrections were found in the other supported languages.
– Translation corrections were done by 6 people, 3 of them by one person.
– None of the correctors were among the Top-5 posters.
– Only messages translated from English or Korean into Japanese were modified by users.
Even though the people within the organization are distributed roughly 50 / 50 between Japan and the rest of the world, a clear majority of messages are in Japanese.
– Posting rate for a Japanese user: 3.14
– Posting rate for other users: 0.52
The answer rate for topics in a control forum with approximately the same number of users is 5.7, whereas the answer rate in Pangaea is 0.37.
The Top-5 posters in the control forum account for 19.11% of all posts, whereas in Pangaea the Top-5 posters account for 48%.
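The two posting rates can be reproduced from figures given earlier in the deck, assuming that Japanese-language posts are attributed to the 130 volunteers in Japan and all other posts to the 110 volunteers elsewhere. That attribution is an assumption; the slides do not say how the rates were derived.

```python
posts_by_source_language = {"Japanese": 408, "Korean": 25, "English": 32, "German": 0}
volunteers_in_japan = 130
volunteers_elsewhere = 110

japanese_posts = posts_by_source_language["Japanese"]
other_posts = sum(posts_by_source_language.values()) - japanese_posts

rate_japan = japanese_posts / volunteers_in_japan   # posts per volunteer in Japan
rate_other = other_posts / volunteers_elsewhere     # posts per volunteer elsewhere

print(round(rate_japan, 2), round(rate_other, 2))
```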
13
Research Problems
14
The Pangaea BBS tries to tackle the problem of language in the formation of social capital (communication, common ground, social networks).
There are a few setbacks in the system regarding the MT.
– No method to verify that the translated message was understood.
– Users assume that the translation is understood.
– No reason for translation correction by default.
Context sharing in a multilingual environment
– Interaction patterns and awareness of relation.
– How does MT affect how users perceive the context?
Grounding incrementally
– Providing evidence that the message is understood.
– How to indicate that the message is understood?
How do users see the communication medium?
– Are the users communicating with an MT system or a human?
– What effect does it have when users are aware that the text is machine translated?
[Diagram labels: Social Dynamics, Language]
15
No entries with German as the source language.
– English is most likely used even in a German-speaking environment.
How are the users affected by the MT quality in terms of lexical entrainment?
– Does the environment affect the language used?
– Do people adjust the language they use according to their expectation of the system's performance?
MT quality is mediocre at best.
No incentive for users to correct machine-translated sentences.
– Bad translations are accepted with high frequency.
16
Machine translated sentences correction example

Correcting bad translation:
しかし、ゆっくりと、彼らはじめに活動をし始めた、そして、終わりまでには。。。
しかし、アクティビティがはじまると、彼らは徐々にアクティビティをまじめに取り組みはじめて。。。

Changing a term:
常に一生懸命に。。。
常に熱心に。。。

Changing the wording:
彼らが1ヵ月活動に来なくて […] していたからであるだろう。
彼らが先月が休みのため […] していたためである。

There is a great need for translation correction, but since the MT quality is not up to par, the workload is large.
No incentive to perform cumbersome translation correction:
– If the meaning is understood, why bother to correct the translation?
– If the MT quality is better in English, why use German?
17
Solution Discussion
18
Development of the database
Add new variables/tables to the database:
– Indicate whether the translated text was modified by a user, and by which user.
– Store the messages of all users in relation to the topic (not just the last poster).
– Store the original translated messages and the modified messages separately.
Development of the user interface
– Encourage users to communicate through the system.
– User-friendly UI for translation correction (correction in increments, collaborative translation view).
– Develop an incentive for translation correction through the UI (display the user name of the corrector).
– Add a method to show that the received message was understood.
– Make the machine translation as invisible as possible to the users.
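One possible shape for the proposed database changes is sketched below with an in-memory SQLite database. All table and column names are hypothetical and chosen for illustration; they are not the Pangaea schema. The sketch keeps every message with its author and topic, stores the original machine translation unchanged, and records each user correction separately together with the corrector.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE message (
    id          INTEGER PRIMARY KEY,
    topic_id    INTEGER NOT NULL,   -- every message keeps its topic
    author_id   INTEGER NOT NULL,   -- and its own author, not just the last poster
    source_lang TEXT NOT NULL,
    source_text TEXT NOT NULL
);
CREATE TABLE translation (
    id          INTEGER PRIMARY KEY,
    message_id  INTEGER NOT NULL REFERENCES message(id),
    target_lang TEXT NOT NULL,
    mt_text     TEXT NOT NULL,      -- original machine translation, never overwritten
    UNIQUE (message_id, target_lang)
);
CREATE TABLE correction (
    id             INTEGER PRIMARY KEY,
    translation_id INTEGER NOT NULL REFERENCES translation(id),
    corrector_id   INTEGER NOT NULL,  -- which user modified the translation
    corrected_text TEXT NOT NULL,
    corrected_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
""")

# Illustrative rows only; the texts and IDs are made up for the example.
conn.execute("INSERT INTO message VALUES (1, 1, 42, 'en', 'Hello')")
conn.execute("INSERT INTO translation VALUES (1, 1, 'ja', 'MT output')")
conn.execute(
    "INSERT INTO correction (translation_id, corrector_id, corrected_text) "
    "VALUES (1, 13, 'fixed text')"
)

# Original and corrected text side by side, with the corrector if there is one.
rows = conn.execute("""
    SELECT t.message_id, t.target_lang, t.mt_text, c.corrected_text, c.corrector_id
    FROM translation AS t
    LEFT JOIN correction AS c ON c.translation_id = t.id
""").fetchall()
print(rows)
```

With a layout like this, correction statistics would come from a query instead of a manual comparison of timestamps and texts.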
19
Future Research
20
Controlled experiments in collaboration with the target organization
– User questionnaire and interview on user preferences and experiences.
– Translation correction in a controlled environment.
– Participant observation of the interaction between the system and the users.
Develop a set of variables for the log data.
Develop a log-data viewer for the system.
Develop ideas for system and UI improvement.
– More user-friendly
– Incentives to do tasks with the system
– Incentive to use the system for actual communication
Categorize translation corrections when more data is available.
With more data on the users and the system, analyze the effects of machine translation on social capital building in CSCW and on intra-organizational social networks.