SIG IL 2000 Evaluation of a Practical Interlingua for Task-Oriented Dialogue Lori Levin, Donna Gates, Alon Lavie, Fabio Pianesi, Dorcas Wallace, Taro Watanabe, Monika Woszczyna
Interchange Format Design: The CSTAR II Interchange Format was designed and developed by all of the CSTAR II partners: CMU, IRST, ETRI, UKA, CLIPS++ and ATR.
Expressivity vs Simplicity: If it is not expressive enough, components of meaning will be lost. If it is not simple enough, it can’t be used reliably across sites. If it is not simple enough, it will not be quickly portable to new domains.
Task-Oriented Sentences: Perform an action in the domain. Are not descriptive. Contain fixed expressions that cannot be translated literally.
Domain Actions: Extended, Domain-Specific Speech Acts
Examples:
    c:request-information+availability+room
    a:give-information+personal-data
    c:give-information+temporal+arrival
Components of the Interchange Format
Example: a:give-information+availability+room (room-type=(single & double), time=md12)
    speaker: a: (agent)
    speech act: give-information
    concept*: +availability+room
    argument*: (room-type=(single & double), time=md12)
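The four components listed above can be pulled apart mechanically. The following is a minimal sketch, not the CSTAR II tooling: the names `InterchangeFormat` and `parse_if` are hypothetical, the speaker is assumed to be `a` or `c` as in the slide examples, and the argument list is kept as raw text rather than parsed.

```python
import re
from typing import List, NamedTuple


class InterchangeFormat(NamedTuple):
    speaker: str         # "a" (agent) or "c", as in the slide examples
    speech_act: str      # e.g. "give-information"
    concepts: List[str]  # e.g. ["availability", "room"]; may be empty
    arguments: str       # raw argument text, left unparsed in this sketch


# speaker ":" dialogue-act, optionally followed by a parenthesised argument list
_IF_PATTERN = re.compile(r"^\s*([ac]):([^\s(]+)\s*(\(.*\))?\s*$")


def parse_if(text: str) -> InterchangeFormat:
    """Split one IF expression into speaker, speech act, concepts, arguments."""
    match = _IF_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a recognisable IF expression: {text!r}")
    speaker, dialogue_act, arguments = match.groups()
    # The first "+"-separated item is the speech act; the rest are concepts.
    speech_act, *concepts = dialogue_act.split("+")
    return InterchangeFormat(speaker, speech_act, concepts, arguments or "")
```

Calling `parse_if` on the example expression yields the speaker `a`, the speech act `give-information`, the concepts `availability` and `room`, and the argument text unchanged.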
Examples
no that’s not necessary
    c:negate
yes I am
    c:affirm
and I was wondering what you have in the way of rooms available during that time
    c:request-information+availability+room
my name is alex waibel
    c:give-information+personal-data (person-name=(given-name=alex, family-name=waibel))
and how will you be paying for this
    a:request-information+payment (method=question)
I have a mastercard
    c:give-information+payment (method=mastercard)
Not Covered or Not Represented in IF: relative clauses; comparatives (in general); tense; number (but quantity is represented).
Scope of the IF (May 1999): speech acts 54; concepts 84; arguments 118.
Expressivity: Coverage Experiment. Development data was tagged with interlingua representations by human experts. Sentences that are not intended to be covered by the interlingua (as judged by human experts) were given the tag “no-tag.” Test data was tagged by human experts.
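As a rough illustration of how such tagged data could be scored, the sketch below assumes that coverage means the share of units whose tag is anything other than “no-tag”. The slides do not spell out the exact formula, so both that definition and the function name `coverage` are assumptions.

```python
def coverage(tags):
    """Return the percentage of tagged units whose tag is not "no-tag".

    Assumption: a unit counts as covered whenever the human expert
    assigned it any interlingua representation at all.
    """
    tags = list(tags)
    if not tags:
        raise ValueError("no tagged units to score")
    covered = sum(1 for tag in tags if tag != "no-tag")
    return 100.0 * covered / len(tags)


# Illustrative tags only, not data from the experiment.
sample = ["c:affirm", "no-tag", "c:negate", "a:request-information+payment"]
sample_coverage = coverage(sample)
```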
Coverage Experiment: Development and Test Data
The Interchange Format Database
Example:
    olang I lang I Prv IRST “telefono per prenotare delle stanze per quattro colleghi”
    olang I lang E Prv IRST “I’m calling to book some rooms for four colleagues”
    IF Prv IRST c:request-action+reservation+features+room (for-whom= (associate, quantity=4))
    comments: dial-oo5-spkB-roca
Record template:
    d.u.sdu olang X lang Y Prv Z “sdu-in-language-Y on one line”
    d.u.sdu olang X lang E Prv Z “sdu-in-English on one line”
    d.u.sdu IF Prv Z dialogue-act-on-one-line
    d.u.asdu comments: your comments
    d.u.asdu comments: go here
Coverage of Top 10 Dialogue Acts in Development Data
Coverage of Top 10 Speech Acts in Development Data
Coverage of Top 10 Dialogue Acts in Test Data
Coverage of Top 10 Speech Acts in Test Data
Simplicity: Consistency of Use Across Sites
Successful international demo. After testing English-Italian and English-Korean, Italian-Korean worked without extra effort.
Inter-coder agreement experiment.
Cross-site evaluation experiment.
Inter-Coder Agreement Experiment: 84 DA units from Japanese-English data. Some dialogue fragments and some isolated sentences. Coded at CMU and IRST. Results reported in percent agreement.
Inter-Coder Agreement Results: Speech Act, Concept List, Dialogue Act, Argument List 85.79
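Percent agreement between two coders can be computed at more than one level of the IF. The sketch below compares whole dialogue acts and speech acts only. The helper names `percent_agreement` and `speech_act` are hypothetical, and the labels are illustrative rather than taken from the 84 units in the experiment.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of units on which two coders assigned identical labels."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must label the same units")
    if not coder_a:
        raise ValueError("no units to compare")
    matches = sum(1 for x, y in zip(coder_a, coder_b) if x == y)
    return 100.0 * matches / len(coder_a)


def speech_act(dialogue_act):
    """Drop the speaker prefix and the concepts, keeping only the speech act."""
    return dialogue_act.split(":", 1)[-1].split("+", 1)[0]


# Illustrative labels only (assumed, not the experimental data).
site_1 = ["c:affirm", "a:offer-search+availability", "c:give-information+payment"]
site_2 = ["c:affirm", "a:give-information+search+availability",
          "c:give-information+personal-data"]

dialogue_act_agreement = percent_agreement(site_1, site_2)
speech_act_agreement = percent_agreement(
    [speech_act(d) for d in site_1],
    [speech_act(d) for d in site_2],
)
```

In this toy data the coders agree on the speech act more often than on the full dialogue act, because a concept-list disagreement alone is enough to make two dialogue acts differ.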
Inter-Coder Agreement: Error Analysis of 33 Sentences
6 are equivalent due to ambiguity in the IF specification.
16 are similar enough to produce output with equivalent meaning. Example: offer-search+availability (“Let me check the availability”) versus give-information+search+availability (“I will check the availability”).
4 contain differences where the input sentence was ambiguous and the taggers chose different meanings. Example: “6 o’clock” could be 6:00 or 18:00.
5 contain errors by one or more taggers and would produce outputs with different meanings.
Cross-Site Evaluation: Analysis and generation grammars were written at different sites (CMU and IRST). Analysis at CMU produces IF. IF is sent to IRST. Generation at IRST produces Italian sentences.
Intra-Site Evaluation: Analysis and generation are both performed at CMU by researchers in constant contact with each other. English-IF-English, English-German, and English-Japanese.
Cross-Site Evaluation Data: 130 utterances from a user study performed at CMU. Speech input. “Traveller” is a second-time user. “Agent” is a system developer. Traveller and agent cannot see or hear each other. All communication is through English-IF-English paraphrase.
Evaluation Scoring
OK: meaning is preserved.
Perfect: meaning is preserved and the output is fluent.
Bad: meaning is not preserved.
Acceptable: sum of Perfect and OK.
English-German was graded at CMU, IRST and CLIPS. English-IF-English was graded at CMU and CLIPS. English-Japanese was graded at CMU. English-Italian was graded at IRST. English-French was graded at CLIPS.
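The scoring scheme above reduces to a tally in which Acceptable is derived rather than graded directly. A minimal sketch, assuming each utterance receives exactly one of the grades Perfect, OK, or Bad from a grader; the function name `summarize` and the sample grades are hypothetical.

```python
from collections import Counter

GRADES = ("perfect", "ok", "bad")


def summarize(grades):
    """Return the percentage of each grade plus the derived "acceptable" score."""
    counts = Counter(grades)
    unknown = set(counts) - set(GRADES)
    if unknown:
        raise ValueError(f"unknown grades: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no grades to summarize")
    percentages = {g: 100.0 * counts[g] / total for g in GRADES}
    # Acceptable is defined as the sum of Perfect and OK.
    percentages["acceptable"] = percentages["perfect"] + percentages["ok"]
    return percentages


# Illustrative grades only, not results from the evaluation.
summary = summarize(["perfect", "ok", "bad", "perfect"])
```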
End-to-End Evaluation Results
End-to-End Evaluation Results
Conclusions: Coverage is surprisingly good for a certain type of data: role playing for flight reservations, hotel reservations, greetings, and payment. Cross-site evaluation is about as good as intra-site evaluation. Inter-coder agreement could be improved, but not all errors affect translation quality.
Current Work: Integrating the task-oriented interlingua with a more traditional frame-based interlingua for descriptive sentences. The NESPOLE! Consortium: