Balancing Expressiveness and Simplicity in an Interlingua for Task-Based Dialogue
Lori Levin, Donna Gates, Dorcas Wallace, Kay Peterson, Alon Lavie, Fabio Pianesi, Emanuele Pianta, Roldano Cattoni, Nadia Mana
Carnegie Mellon University and ITC-irst

Outline
Overview of the Interchange Format (IF)
Proposals for Evaluating Interlinguas
– Measuring coverage
– Measuring reliability
– Measuring scalability

Multilingual Translation with an Interlingua
Analyzers map input sentences from each language (English, French, German, Italian, Japanese, Korean, Arabic, Chinese, Spanish, Catalan) into a single interlingua; generators map the interlingua back out into each language.
Example:
– Chinese input: San1 tian1 qian2, wo3 kai1 shi3 jue2 de2 tong4
– Interlingua: give-information+onset+body-state (body-state-spec=pain, time=(interval=3d, relative=before))
– English output: The pain started three days ago.
– Chinese paraphrase: wo3 yi3 jin1 tong4 le4 san1 tian1
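To make the architecture concrete, here is a toy sketch in Python. The rule tables and the translate function are invented for illustration; the real analyzers and generators are full grammar-based modules, not lookup tables.

```python
# A minimal sketch (not the actual system) of why an interlingua gives
# all-ways translation: each language contributes one analyzer and one
# generator, and any analyzer can be paired with any generator.

# Hypothetical toy rules covering the single example utterance above.
ANALYZERS = {
    "chinese": {
        "san1 tian1 qian2, wo3 kai1 shi3 jue2 de2 tong4":
            "give-information+onset+body-state "
            "(body-state-spec=pain, time=(interval=3d, relative=before))",
    },
}

GENERATORS = {
    "english": {
        "give-information+onset+body-state "
        "(body-state-spec=pain, time=(interval=3d, relative=before))":
            "The pain started three days ago.",
    },
}

def translate(utterance: str, source: str, target: str) -> str:
    """Analyze into the interlingua, then generate in the target language."""
    interlingua = ANALYZERS[source][utterance.lower()]
    return GENERATORS[target][interlingua]

print(translate("San1 tian1 qian2, wo3 kai1 shi3 jue2 de2 tong4",
                "chinese", "english"))
# -> The pain started three days ago.
```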

Advantages of Interlingua
– A new language can be added easily: adding one grammar for analysis and one grammar for generation gives all-ways translation to and from all previous languages (n languages need 2n grammars rather than n(n-1) direct transfer pairs).
– Monolingual development teams.
– Paraphrase: generate a new source-language sentence from the interlingua so that the user can confirm the meaning.

Disadvantages of Interlingua
– "Meaning" is arbitrarily deep: at what level of detail do you stop? If the interlingua is too simple, meaning is lost in translation; if it is too complex, analysis and generation become too difficult.
– It should be applicable to all languages.
– Human development time.

Speech Acts: Speaker Intention vs. Literal Meaning
"Can you pass the salt?"
– Literal meaning: the speaker asks for information about the hearer's ability.
– Speaker intention: the speaker requests the hearer to perform an action.

Domain Actions: Extended, Domain-Specific Speech Acts
– give-information+existence+body-state: "It hurts."
– give-information+onset+body-object: "The rash started three days ago."
– request-information+personal-data: "What is your name?"
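Structurally, a domain-action name is a speech act followed by zero or more concepts joined with "+". A minimal sketch of that decomposition (the helper function is hypothetical, not part of the projects' tooling):

```python
# A minimal sketch of the shape of a domain-action name: a speech act plus
# zero or more attached concepts, joined with "+".

def split_domain_action(da):
    speech_act, *concepts = da.split("+")
    return speech_act, concepts

print(split_domain_action("give-information+onset+body-object"))
# -> ('give-information', ['onset', 'body-object'])
print(split_domain_action("request-information+personal-data"))
# -> ('request-information', ['personal-data'])
```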

Domain Actions: Extended, Domain-Specific Speech Acts
In domain:
– I sprained my ankle yesterday.
– When did the headache start?
Out of domain:
– Yesterday I slipped in the driveway on my way to the garage.
– The headache started after my boss noticed that I deleted the file.

Formulaic Utterances
"Good night."
tisbaH cala xEr (gloss: "waking up on good")
(Romanization of Arabic from CallHome Egypt.)

Same Intention, Different Syntax
– rigly bitiwgacny ("my leg hurts")
– candy wagac fE rigly ("I have pain in my leg")
– rigly bitiClimny ("my leg hurts")
– fE wagac fE rigly ("there is pain in my leg")
– rigly bitinqaH calya ("my leg bothers on me")
(Romanization of Arabic from CallHome Egypt.)

Outline
Overview of the Interchange Format (IF)
→ Proposals for Evaluating Interlinguas
– Measuring coverage
– Measuring reliability
– Measuring scalability

Comparison of Two Interlinguas
"I would like to make a reservation for the fourth through the seventh of July."
IF-1 (C-STAR II):
c:request-action+reservation+temporal+hotel (time=(start-time=md4, end-time=(md7, july)))
IF-2 (NESPOLE):
c:give-information+disposition+reservation+accommodation (disposition=(who=I, desire), reservation-spec=(reservation, identifiability=no), accommodation-spec=hotel, object-time=(start-time=(md=4), end-time=(md=7, month=7, incl-excl=inclusive)))
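The argument notation in both examples is a nested, comma-separated list of atoms and key=value pairs. The following rough sketch of a recursive parser is my own reconstruction for illustration, not the projects' actual tools:

```python
# A rough sketch of a parser for the argument notation above: each item is
# a bare atom or key=value, and a value is either an atom or a nested
# parenthesized list.

def parse_if(text: str):
    """Split 'speaker:domain-action (args)' into (name, parsed arguments)."""
    name, paren, rest = text.partition("(")
    if not paren:
        return text.strip(), []
    args, _ = parse_args(rest, 0)
    return name.strip(), args

def parse_args(s: str, pos: int):
    """Parse items until the matching ')'; return (items, next position)."""
    items, token = [], ""
    while pos < len(s):
        ch = s[pos]
        pos += 1
        if ch == "(":
            # the token before '(' is the key of a nested value, e.g. 'time='
            nested, pos = parse_args(s, pos)
            items.append((token.strip().rstrip("="), nested))
            token = ""
        elif ch in ",)":
            if token.strip():
                key, eq, val = token.partition("=")
                items.append((key.strip(), val.strip()) if eq else token.strip())
            token = ""
            if ch == ")":
                return items, pos
        else:
            token += ch
    return items, pos

name, args = parse_if(
    "c:request-action+reservation+temporal+hotel "
    "(time=(start-time=md4, end-time=(md7, july)))")
print(name)  # c:request-action+reservation+temporal+hotel
print(args)  # [('time', [('start-time', 'md4'), ('end-time', ['md7', 'july'])])]
```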

The Interchange Format Database
Example record:
– olang I, lang I, Prv IRST: "telefono per prenotare delle stanze per quattro colleghi"
– olang I, lang E, Prv IRST: "I'm calling to book some rooms for four colleagues"
– IF, Prv IRST: c:request-action+reservation+features+room (for-whom=(associate, quantity=4))
– comments: dial-oo5-spkB-roca0-02-3
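One way to picture a record in this database is as a small structured object. The field names below are guesses from the slide's labels (olang, lang, Prv, comments), so treat the schema as hypothetical:

```python
# A hypothetical schema for one rendering in the IF database; field names
# are inferred from the slide, not taken from the actual database format.
from dataclasses import dataclass

@dataclass
class IFRecord:
    olang: str     # original language of the utterance, e.g. "I" for Italian
    lang: str      # language of this rendering, e.g. "E" for English
    provider: str  # site that produced the entry, e.g. "IRST"
    text: str      # the utterance or its translation
    if_repr: str   # the Interchange Format annotation
    comment: str   # e.g. dialogue/speaker/turn identifier

example = IFRecord(
    olang="I", lang="I", provider="IRST",
    text="telefono per prenotare delle stanze per quattro colleghi",
    if_repr="c:request-action+reservation+features+room "
            "(for-whom=(associate, quantity=4))",
    comment="dial-oo5-spkB-roca0-02-3",
)
print(example.if_repr)
```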

Comparison of Four Databases (travel domain, role playing, spontaneous speech)
– DB-1: C-STAR II English database tagged with IF-1 (2278 sentences)
– DB-2: C-STAR II English database tagged with IF-2 (2564 sentences)
– DB-3: NESPOLE English database tagged with IF-2 (1446 sentences); only about 50% of the vocabulary overlaps with the C-STAR database
– DB-4: combined database tagged with IF-2 (4010 sentences)
DB-1 and DB-2 contain the same data tagged with different interlinguas; DB-3 and DB-4 cover a significantly larger domain.

Outline
Overview of the Interchange Format (IF)
Proposals for Evaluating Interlinguas
→ Measuring coverage
– Measuring reliability
– Measuring scalability

Measuring Coverage
No-tag rate: can a human expert assign an interlingua representation to each sentence?
– C-STAR II no-tag rate: 7.3%
– NESPOLE no-tag rate: 2.4%
– About 300 more sentences of the C-STAR English database were covered when retagged with IF-2.
End-to-end translation performance measures recognizer, analyzer, and generator performance in combination with interlingua coverage.
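The no-tag rate itself is a simple ratio. A minimal sketch, assuming untaggable sentences are stored with a None annotation (the storage convention is hypothetical):

```python
# A minimal sketch of the no-tag rate over a tagged database, where a None
# annotation marks a sentence no expert could assign a representation to.

def no_tag_rate(annotations):
    """Percent of sentences with no interlingua representation assigned."""
    untagged = sum(1 for a in annotations if a is None)
    return 100.0 * untagged / len(annotations)

# Toy database: three tagged sentences, one untaggable one.
db = ["greeting", "give-information+price", None, "request-action+reservation"]
print(no_tag_rate(db))  # 25.0
```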

Outline
Overview of the Interchange Format (IF)
Proposals for Evaluating Interlinguas
– Measuring coverage
→ Measuring reliability
– Measuring scalability

Example of a Reliability Failure
Input: 3:00, right?
Interlingua: verify (time=3:00)
Poor choice of speech act name: does "verify" mean that the speaker is confirming the time or requesting verification from the user?
Output: 3:00 is right.

Measuring Reliability: Cross-Site Evaluations
Compare the performance of the analyzer → interlingua → generator pipeline:
– when the analyzer and generator are built at the same site (or by the same person)
– when the analyzer and generator are built at different sites (or by different people who may not know each other)
C-STAR II interlingua: comparable end-to-end performance within and across sites, with around 60% acceptable translations from speech recognizer output.
NESPOLE interlingua: cross-site end-to-end performance is lower.

Intercoder Agreement: Average Pairwise Percent Agreement

                                                  Speech act   Domain action   Arguments
IF-1: Site 1 and Site 2 (experts)                     82%           66%           86%
IF-2: Site 1 and Site 2 (4 experts)                   92%           75%           87%
IF-2: Within Site 1 (3 experts)                       94%           88%           90%
IF-2: Site 1 vs. Site 2 (3 experts and 1 expert)      89%           62%           83%
IF-2: Site 1 and Site 2 (experts and novices)         88%           63%           86%
IF-2: Within Site 2 (expert and novices)              89%           64%           87%
IF-2: Within Site 2 (novices)                         91%           61%           83%
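The metric in the table is average pairwise percent agreement: for each pair of coders, the fraction of items receiving the same label, averaged over all pairs. A small sketch with invented labels:

```python
# A sketch of average pairwise percent agreement. The coders and labels
# below are invented for illustration.
from itertools import combinations

def pairwise_agreement(codings):
    """codings[c][i] is coder c's label for item i."""
    pairs = list(combinations(codings, 2))
    per_pair = [sum(a == b for a, b in zip(x, y)) / len(x) for x, y in pairs]
    return 100.0 * sum(per_pair) / len(pairs)

# Three coders labeling four utterances with speech acts:
coders = [
    ["give-information", "request-information", "verify", "greeting"],
    ["give-information", "request-information", "give-information", "greeting"],
    ["give-information", "acknowledge", "verify", "greeting"],
]
print(round(pairwise_agreement(coders), 1))  # 66.7
```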

Workshop on Interlingua Reliability (SIG-IL)
Association for Machine Translation in the Americas
October 8, 2002, Tiburon, California
Submissions by July 21:
– ...-word abstract (to ...)
– intent to participate in the coding experiment

Outline
Overview of the Interchange Format (IF)
Proposals for Evaluating Interlinguas
– Measuring coverage
– Measuring reliability
→ Measuring scalability

Comparison of Four Databases (travel domain, role playing, spontaneous speech)
– DB-1: C-STAR II English database tagged with IF-1 (2278 sentences)
– DB-2: C-STAR II English database tagged with IF-2 (2564 sentences)
– DB-3: NESPOLE English database tagged with IF-2 (1446 sentences); only about 50% of the vocabulary overlaps with the C-STAR database
– DB-4: combined database tagged with IF-2 (4010 sentences)
DB-1 and DB-2 contain the same data tagged with different interlinguas; DB-3 and DB-4 cover a significantly larger domain.

Measuring Scalability: Coverage Rate
What percent of the database is covered by the top n most frequent domain actions?

Coverage of the 50 most frequent domain actions:
C-STAR client     66.7%
NESPOLE client    66.5%
Combined client   62.9%
C-STAR agent      67.3%
NESPOLE agent     71.4%
Combined agent    64.0%
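A sketch of the coverage-rate computation, using an invented toy corpus of one domain-action label per sentence:

```python
# A sketch of the coverage rate: the percent of all tagged sentences that
# fall under the n most frequent domain actions.
from collections import Counter

def coverage_rate(domain_actions, n=50):
    """Percent of sentences covered by the n most frequent domain actions."""
    counts = Counter(domain_actions)
    covered = sum(c for _, c in counts.most_common(n))
    return 100.0 * covered / len(domain_actions)

tags = (["give-information+price"] * 6
        + ["request-information+price"] * 3
        + ["greeting"])
print(coverage_rate(tags, n=2))  # 90.0
```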

Measuring Scalability: Number of Domain Actions as a Function of Database Size
– Sample sizes from 100 to 3000 sentences, in increments of 25.
– Average number of unique domain actions over ten random samples for each sample size.
– Each sample includes a random selection of frequent and infrequent domain actions.
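A sketch of this sampling experiment, with an invented toy corpus standing in for the real tagged databases:

```python
# A sketch of the experiment above: for each sample size, draw several
# random samples of tagged sentences and average the number of distinct
# domain actions observed.
import random

def unique_da_curve(tags, sizes, runs=10):
    """Average number of distinct domain actions seen per sample size."""
    curve = []
    for n in sizes:
        counts = [len(set(random.sample(tags, n))) for _ in range(runs)]
        curve.append((n, sum(counts) / runs))
    return curve

# Toy corpus: 4000 sentences over 400 hypothetical domain actions, skewed
# so that some domain actions are frequent and others rare.
corpus = [f"da-{min(random.randint(1, 400), random.randint(1, 400))}"
          for _ in range(4000)]
for n, avg in unique_da_curve(corpus, range(100, 3001, 725)):
    print(n, round(avg, 1))
```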

[Figure: average number of unique domain actions plotted against sample size]

Conclusions
An interlingua based on domain actions is suitable for task-oriented dialogue:
– reliable
– good coverage
– scalable without an explosion of domain actions
It is possible to evaluate an interlingua for:
– reliability
– expressivity
– scalability