AUTOMATIC TRANSLATION UTILITY Fostering language diversity and participation Juan Dolio, DR, 11-14 November 2008 Stéphane Bruno, AHTIC/CONSORTIUM CARISNET.

LANGUAGE STATS

FACTS
- English is the dominant language in CIVIC discussions
- Members who are not fluent in English, or do not speak it at all, are reluctant to contribute
- Manual (human) translation of all e-mail and forum communications is impossible and far too costly
- Systematic human translation would also delay interactions

CIVIC APPROACH TO LANGUAGE DIVERSITY
- Three official languages: English, French, Spanish
- All documents and “official” communications are translated into all three languages (the original-language document being the legally binding one?)
- Simultaneous translation is provided for plenary sessions in face-to-face meetings when the size of the language group and its needs justify the cost
- Automatic translation of e-mails is provided to facilitate comprehension and contribution by all language groups

OBJECTIVES OF THE AUTOMATIC TRANSLATION
- Give all members the opportunity to get the essence of all communications in all three official CIVIC languages
- Make the translation non-disruptive, and as seamless and user-friendly as possible
- Allow the translation to improve over time
- Build a contextual terminology and linguistic environment for CIVIC in its field of intervention

HOW IT WORKS

THE TRANSLATION MECHANISMS
1. When a mail arrives, the software breaks the e-mail into paragraphs
2. The software tries to guess the language of each paragraph
3. If it cannot guess the language, it assumes the paragraph is in English
4. The software then preprocesses the paragraph through the knowledgebase
5. Each paragraph is sent to the translation service (Babelfish) and the result is retrieved for each language pair
6. The resulting paragraph is post-processed
7. The e-mail is then reconstructed and sent to the mailing list manager
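
The steps above can be sketched in Python as follows. This is a minimal illustration, not the utility's actual code: the stop-word lists, the function names and the `fake_translate` stand-in are all assumptions, since the slides do not describe how the language is guessed or how Babelfish is called.

```python
from typing import Callable, Dict, List

LANGUAGES = ("en", "fr", "es")  # the three official CIVIC languages

# Tiny illustrative stop-word lists (an assumption; a real guesser needs more evidence).
STOPWORDS: Dict[str, set] = {
    "en": {"the", "and", "is", "of", "to", "in", "we"},
    "fr": {"le", "la", "les", "et", "est", "de", "nous", "des"},
    "es": {"el", "la", "los", "y", "es", "de", "que", "en"},
}


def split_paragraphs(body: str) -> List[str]:
    """Break a plain-text message into paragraphs on blank lines."""
    return [p.strip() for p in body.split("\n\n") if p.strip()]


def guess_language(paragraph: str) -> str:
    """Count stop-word hits per language; assume English when nothing matches."""
    words = [w.strip(".,;:!?") for w in paragraph.lower().split()]
    scores = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "en"


def fake_translate(text: str, source: str, target: str) -> str:
    """Stand-in for the external translation service (hypothetical)."""
    return f"[{source}->{target}] {text}"


def translate_message(
    body: str,
    translate: Callable[[str, str, str], str] = fake_translate,
    preprocess: Callable[[str], str] = lambda p: p,
    postprocess: Callable[[str], str] = lambda p: p,
) -> Dict[str, str]:
    """Return one reconstructed message per official language."""
    output: Dict[str, List[str]] = {lang: [] for lang in LANGUAGES}
    for paragraph in split_paragraphs(body):
        source = guess_language(paragraph)
        prepared = preprocess(paragraph)
        for target in LANGUAGES:
            if target == source:
                # No translation needed for the paragraph's own language.
                output[target].append(paragraph)
            else:
                output[target].append(postprocess(translate(prepared, source, target)))
    return {lang: "\n\n".join(parts) for lang, parts in output.items()}
```

Calling `translate_message(body)` returns a dictionary keyed by language code; each value is the message rebuilt paragraph by paragraph, ready to hand to the mailing list manager.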

INPUT REQUIREMENTS
- Use simple language constructs
- Use complete sentences and correct grammar and syntax
- Avoid abbreviations, metaphors and idiomatic expressions
- Avoid proverbs and sayings
- Do not mix languages in the same paragraph (translation is done paragraph by paragraph, and the language is guessed)

OTHER FEATURES
- If you want some words not to be translated, enclose them in “*”, like *CIVIC*
- The knowledgebase is a database that records how specific words must be translated, overriding the output of the translation service; for example, ICT is translated as TIC in French and Spanish, and vice versa
- This makes it possible to build a lexicon or linguistic construct in the context of CIVIC and ICT4D
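
One possible way to implement both features is to swap protected words and knowledgebase terms for opaque placeholders before translation and restore them afterwards. The sketch below is an assumption about how this could work, not the utility's actual code: `KNOWLEDGE_BASE`, the `ZQX` placeholder format and the function names are hypothetical, and the approach assumes the translation service leaves the placeholder tokens untouched.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical knowledgebase: (source, target) -> {term: forced translation}.
KNOWLEDGE_BASE: Dict[Tuple[str, str], Dict[str, str]] = {
    ("en", "fr"): {"ICT": "TIC"},
    ("en", "es"): {"ICT": "TIC"},
    ("fr", "en"): {"TIC": "ICT"},
    ("es", "en"): {"TIC": "ICT"},
}

PLACEHOLDER = "ZQX{}ZQX"  # assumed to pass through the translation service unchanged


def preprocess(paragraph: str, source: str, target: str) -> Tuple[str, List[str]]:
    """Replace protected words and knowledgebase terms with numbered placeholders."""
    kept: List[str] = []

    def stash(replacement: str) -> str:
        kept.append(replacement)
        return PLACEHOLDER.format(len(kept) - 1)

    # Words enclosed in asterisks, like *CIVIC*, are kept verbatim.
    text = re.sub(r"\*([^*\s][^*]*?)\*", lambda m: stash(m.group(1)), paragraph)

    # Knowledgebase terms are replaced by their forced translation.
    for term, forced in KNOWLEDGE_BASE.get((source, target), {}).items():
        text = re.sub(rf"\b{re.escape(term)}\b", lambda m, f=forced: stash(f), text)
    return text, kept


def postprocess(translated: str, kept: List[str]) -> str:
    """Put the stashed words back in place of their placeholders."""
    return re.sub(r"ZQX(\d+)ZQX", lambda m: kept[int(m.group(1))], translated)
```

In use, `preprocess` runs on each paragraph before it goes to the translation service, and `postprocess` runs on the result with the same `kept` list.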

LIMITATIONS
- The shorter a paragraph is, the less accurate the language guess. Introductory paragraphs such as greetings or openings, and single-word texts, will usually be translated wrongly or not at all
- The current version works only with plain-text messages. The final version will try to convert HTML-formatted e-mails to plain text before processing them
- The utility relies on Babelfish without a formal agreement (since the service is free) and uses it in a way it was not designed for, so it is vulnerable to the slightest change on the Babelfish web site
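
The HTML-to-plain-text conversion planned for the final version could look roughly like the sketch below, which uses only Python's standard `html.parser` module. It is an illustration under assumptions: the class and function names and the choice of block-level tags are hypothetical, and real HTML e-mails may need more careful handling.

```python
import re
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collect visible text, turning block-level tags into paragraph breaks."""

    BLOCK_TAGS = {"p", "div", "li", "tr", "h1", "h2", "h3", "blockquote"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: List[str] = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "br":
            self.parts.append("\n")  # line break inside a paragraph
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return the text of an HTML body as paragraphs separated by blank lines."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse whitespace inside each paragraph and drop empty paragraphs.
    paragraphs = [" ".join(p.split()) for p in re.split(r"\n\s*\n", text)]
    return "\n\n".join(p for p in paragraphs if p)
```

The output keeps one blank line between paragraphs, which matches the paragraph-by-paragraph processing the utility already applies to plain-text messages.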

THINGS TO RESOLVE
- The character encoding issues
- Who will manage the knowledgebase?
- How are words entered into the database?
- How is that decided?