The AMITIÉS Corpus up to the minute report


The GE English corpus
Around 716 English dialogues have been received so far from GE Leeds, of which 642 are “good ones”.
The GE transcribers use the Transcriber tool to deliver *.TRS documents based on an XML syntax.

Good things
Because the TRS documents are XML based, they are well suited to automatic processing and to conversion into the format we are interested in (a DAMSL-like format, for example).
The transcribers successfully applied the AMITIES guidelines for transcribing.
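To illustrate why an XML-based format lends itself to automatic processing, here is a minimal sketch that reads a small TRS-style document and prints one "speaker: text" line per Turn. The sample document is hypothetical and heavily reduced; the element and attribute names (Speaker, Section, Turn, Sync) follow the usual Transcriber layout, but real GE files may differ, and the function name `turns_as_lines` is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced TRS-style document (not taken from the corpus).
SAMPLE_TRS = """<Trans>
  <Speakers>
    <Speaker id="spk1" name="A"/>
    <Speaker id="spk2" name="C"/>
  </Speakers>
  <Episode>
    <Section type="report" startTime="0" endTime="5">
      <Turn speaker="spk1" startTime="0" endTime="3">
        <Sync time="0"/>
        And your telephone number please?
      </Turn>
      <Turn speaker="spk2" startTime="3" endTime="5">
        <Sync time="3"/>
        111
      </Turn>
    </Section>
  </Episode>
</Trans>"""


def turns_as_lines(trs_xml):
    """Return one 'speaker: text' string per Turn element, in document order."""
    root = ET.fromstring(trs_xml)
    # Map speaker ids (spk1, spk2, ...) to their display names.
    names = {s.get("id"): s.get("name") for s in root.iter("Speaker")}
    lines = []
    for turn in root.iter("Turn"):
        # Collect all text inside the Turn and normalise whitespace.
        text = " ".join("".join(turn.itertext()).split())
        speaker_id = turn.get("speaker")
        lines.append(f"{names.get(speaker_id, speaker_id)}: {text}")
    return lines


if __name__ == "__main__":
    for line in turns_as_lines(SAMPLE_TRS):
        print(line)
```

A real converter to a DAMSL-like format would also need to handle overlap markers and the other metadata that this sketch ignores.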

Issues
The transcribers started by transcribing the audio files using the Turn and Utterance levels of annotation provided by the Transcriber tool.
We noticed that some unusual situations, such as overlapping, acknowledging and completion, were not represented correctly in the received TRS documents.

Solution and examples
The solution makes use of the third logical level of annotation provided by Transcriber, called Section.
The transcribers were asked to create a new Section level called “exception” and to use it to encapsulate all the Turns containing one of the situations described previously.
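Once exceptional Turns are wrapped in their own Section, a processing script can separate them from ordinary Turns with a simple test on the Section. The sketch below assumes, purely for illustration, that an “exception” Section is recognisable by a `topic="exception"` attribute; how the GE files actually encode the label is not stated here, and the sample document and the function name `split_turns` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one "exception" Section followed by a default Section.
SAMPLE_TRS = """<Trans>
  <Episode>
    <Section type="report" topic="exception" startTime="0" endTime="2">
      <Turn speaker="A" startTime="0" endTime="1">That's</Turn>
      <Turn speaker="C" startTime="1" endTime="2">Hello</Turn>
    </Section>
    <Section type="report" startTime="2" endTime="6">
      <Turn speaker="A" startTime="2" endTime="6">my name's Louise</Turn>
    </Section>
  </Episode>
</Trans>"""


def split_turns(trs_xml, marker="exception"):
    """Split Turns into (exception_turns, regular_turns) by enclosing Section.

    Assumption: an exception Section carries topic=<marker>.
    """
    root = ET.fromstring(trs_xml)
    exception_turns, regular_turns = [], []
    for section in root.iter("Section"):
        is_exception = section.get("topic") == marker
        bucket = exception_turns if is_exception else regular_turns
        for turn in section.iter("Turn"):
            text = " ".join("".join(turn.itertext()).split())
            bucket.append((turn.get("speaker"), text))
    return exception_turns, regular_turns


if __name__ == "__main__":
    exceptions, regular = split_turns(SAMPLE_TRS)
    print("exception turns:", exceptions)
    print("regular turns:", regular)
```

Keeping the test in one place (the `marker` comparison) means the script only needs a one-line change if the label turns out to be stored differently.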

Example of overlapping
BEFORE using the “exception” section / AFTER using the “exception” section

DAMSL-like annotation:
A: That’s [lovely](1) my name’s Louise Mr Smith and you want to change address?
C: [Hello](1)

DAMSL-like annotation:
A: That’s
A: [lovely](1)
C: [Hello](1)
A: my name’s Louise Mr Smith and you want to change address?

Example of acknowledging (similar to completion)
BEFORE using the “exception” section / AFTER using the “exception” section

DAMSL-like annotation:
A: And your telephone number please?
C:
A: Uh hmmm
C: 111
A: Uh hmmm
C:

DAMSL-like annotation:
A: And your telephone number please?
C: [](1) 111 [](2)
A: [Uh hmmm](1) [Uh hmmm](2)

Additional facts
The Turns that were not considered to be exceptions were encapsulated by the default Section.
We trained the transcribers to use this logical level, and the last 100 dialogues received are annotated with the “exception” level.
542 dialogues are not annotated with this level.

A rough classification of the corpus
English Amities corpus: 716
    moreThanTwoPartiesDlgs: 40
        oneDlgPerFile: 35
            Annot With Exception: 11
            Annot Without Exc: 24
        multipleDlgsPerFile: 5
            Annot With Exception: 1
            Annot Without Exc: 4
    twoPartiesDlgs: 673
        oneDlgPerFile: 642
            Annot With Exception: 100
            Annot Without Exc: 542
        multipleDlgsPerFile: 31
            Annot With Exception: 6
            Annot Without Exc: 25
    noPartiesDlgs: 3
        oneDlgPerFile: 3
            Annot Without Exc: 3
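The counts in this classification can be checked for internal consistency with a few lines of code. The sketch below stores the leaf counts from the slide in a nested dictionary and sums them upward; the variable and function names are chosen for this example only.

```python
# Leaf counts copied from the classification; every intermediate total is derived.
CORPUS = {
    "moreThanTwoPartiesDlgs": {
        "oneDlgPerFile": {"Annot With Exception": 11, "Annot Without Exc": 24},
        "multipleDlgsPerFile": {"Annot With Exception": 1, "Annot Without Exc": 4},
    },
    "twoPartiesDlgs": {
        "oneDlgPerFile": {"Annot With Exception": 100, "Annot Without Exc": 542},
        "multipleDlgsPerFile": {"Annot With Exception": 6, "Annot Without Exc": 25},
    },
    "noPartiesDlgs": {
        "oneDlgPerFile": {"Annot Without Exc": 3},
    },
}


def total(node):
    """Sum all leaf counts below a node of the classification tree."""
    if isinstance(node, int):
        return node
    return sum(total(child) for child in node.values())


if __name__ == "__main__":
    print("whole corpus:", total(CORPUS))
    for party_class, subtree in CORPUS.items():
        print(party_class, total(subtree))
```

The derived totals match the figures on the slide: 40, 673 and 3 for the three party classes, and 716 for the whole corpus.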

Task distribution within the 100 exception-annotated dialogues

Thank you.