LREC 2016, Portoroz, 25-27 May The DialogBank Harry Bunt1, Volha Petukhova2, Andrei Malchanau2, Alex Chengyu Fang3 and Kars Wijnhoven1 1Tilburg.

Slides:



Advertisements
Similar presentations
Status on the Mapping of Metadata Standards
Advertisements

The semantics of dialogue acts Harry Bunt Oxford, IWCS 2011.
Controlled Vocabularies in TELPlus Antoine ISAAC Vrije Universiteit Amsterdam EDLProject Workshop November 2007.
Polarity Analysis of Texts using Discourse Structure CIKM 2011 Bas Heerschop Erasmus University Rotterdam Frank Goossen Erasmus.
Proceedings of the Conference on Intelligent Text Processing and Computational Linguistics (CICLing-2007) Learning for Semantic Parsing Advisor: Hsin-His.
Exploiting Discourse Structure for Sentiment Analysis of Text OR 2013 Alexander Hogenboom In collaboration with Flavius Frasincar, Uzay Kaymak, and Franciska.
Click to edit the title text format Basics of Authoring TuTalk Dialogues Pamela Jordan University of Pittsburgh Learning Research and Development Center.
INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING NLP-AI IIIT-Hyderabad CIIL, Mysore ICON DECEMBER, 2003.
Multidimensional dialogue act annotation using ISO Harry Bunt Tilburg University ISO Project Leader IJCNLP 2011 tutorial, November 8, Chiang.
Foundations This chapter lays down the fundamental ideas and choices on which our approach is based. First, it identifies the needs of architects in the.
What Linguists Want (we think) Helen Aristar Dry & Anthony Aristar LINGUIST List & E-MELD.
Amarnath Gupta Univ. of California San Diego. An Abstract Question There is no concrete answer …but …
Chapter 2 Syntax A language that is simple to parse for the compiler is also simple to parse for the human programmer. N. Wirth.
Semantic annotation framework Part 2: Dialogue acts ISO/TC37/SC4 N442 rev00 Harry Bunt Tilburg University ISO TC 37/SC 4 meeting Marrakech, May 25, 2008.
OPeNDAP and the Data Access Protocol (DAP) Original version by Dave Fulker.
Towards an integrated scheme for semantic annotation of multimodal dialogue data Volha Petukhova and Harry Bunt.
Some Thoughts on HPC in Natural Language Engineering Steven Bird University of Melbourne & University of Pennsylvania.
ISO Project Semantic Annotation Framework, Part 2: Dialogue Acts Editorial Group first meeting Pisa, September 2008 TC 37/SC 4/WG 2 Kiyong.
CODA – CATCHPlus Open Document Annotation Hennie Brugman OAC II Project Review meeting Chicago – July 26-27, 2012.
Towards multimodal meaning representation Harry Bunt & Laurent Romary LREC Workshop on standards for language resources Las Palmas, May 2002.
Chapter 7 Relational Algebra. Topics in this Chapter Closure Revisited The Original Algebra: Syntax and Semantics What is the Algebra For? Further Points.
Collaborative Annotation of the AMI Meeting Corpus Jean Carletta University of Edinburgh.
LANGUAGE MODELS FOR RELEVANCE FEEDBACK Lee Won Hee.
TimeML compliant text analysis for Temporal Reasoning Branimir Boguraev and Rie Kubota Ando.
Object Oriented Analysis and Design Class and Object Diagrams.
Jette Viethen 20 April 2007NLGeval07 Automatic Evaluation of Referring Expression Generation is Possible.
The AMITIÉS Corpus up to the minute report. The GE English corpus Around 716 English dialogues were received so far from GE Leeds of which 642 are “good.
Domain Act Classification using a Maximum Entropy model Lee, Kim, Seo (AAAI unpublished) Yorick Wilks Oxford Internet Institute and University of Sheffield.
TypeCraft Software Evaluation 21/02/ :45 Powered by None Complete: 10 On, Partial: 0 Off, Excluded: 0 Off Country: All, Region:
SemAF – Basics: Semantic annotation framework Harry Bunt Tilburg University isa -6 Joint ISO - ACL/SIGSEM workshop Oxford, January 2011 TC 37/SC.
C H A P T E R T W O Linking Syntax And Semantics Programming Languages – Principles and Paradigms by Allen Tucker, Robert Noonan.
Lexical, Prosodic, and Syntactics Cues for Dialog Acts.
CCT 333: Imagining the Audience in a Wired World Class 9: Scenarios and Requirements.
For Monday Read chapter 26 Homework: –Chapter 23, exercises 8 and 9.
Maximum Entropy techniques for exploiting syntactic, semantic and collocational dependencies in Language Modeling Sanjeev Khudanpur, Jun Wu Center for.
Agent-Based Dialogue Management Discourse & Dialogue CMSC November 10, 2006.
Chapter 04 Semantic Web Application Architecture 23 November 2015 A Team 오혜성, 조형헌, 권윤, 신동준, 이인용.
FASTFAST All rights reserved © MEP Make programming fun again.
Chapter 3 – Describing Syntax CSCE 343. Syntax vs. Semantics Syntax: The form or structure of the expressions, statements, and program units. Semantics:
Introduction to DBMS Purpose of Database Systems View of Data
Chapter 3 – Describing Syntax
UNIFIED MEDICAL LANGUAGE SYSTEMS (UMLS)
OPERATING SYSTEMS CS 3502 Fall 2017
Dependency Management
Contents 1 Focus 30 minutes Question 30 minutes 2 3
Chapter 1: Introduction
Security Issues Formalization
Structure of Homework Assignments
Databases Chapter 16.
Grounding by nodding GESPIN 2009, Poznan, Poland
Chapter 3 – Describing Syntax
Welcome to English 3 AP Language & Composition
Introduction to C Topics Compilation Using the gcc Compiler
Lesson 9 Sharing Documents
Learning to Program in Python
Tomás Murillo-Morales and Klaus Miesenberger
Introduction to DBMS Purpose of Database Systems View of Data
Session 2: Metadata and Catalogues
Language and Communication
Open Archival Information System
Semantic Markup for Semantic Web Tools:
Chapter 11 user support.
Probabilistic Databases
Language and Communication
Design and Implementation of Software for the Web
Airport Parking Space Navigation
Query Processing.
Language and Communication
Data Analysis, Interpretation, and Presentation
Конференција на МАКС, 17 јуни
Presentation transcript:

LREC 2016, Portoroz, 25-27 May The DialogBank Harry Bunt1, Volha Petukhova2, Andrei Malchanau2, Alex Chengyu Fang3 and Kars Wijnhoven1 1Tilburg University (NL), 2Universität des Saarlandes (D), 3City University of Hong Kong The DialogBank is a collection of dialogues annotated according to the ISO 24617-2 standard for dialogue act annotation. It contains newly annotated and re-annotated dialogues from various corpora, including Switchboard-DA, HCRC Map Task, TRAINS, DBOX, DIAMOND (in Dutch), OVIS (in Dutch), and Schiphol (in Dutch). https://dialogbank.uvt.nl ISO 24617-2 standard for dialogue act annotation: Multidimensional annotation: multiple communicative functions may be assigned to ‘functional segments’ including. In addition, annotations in DiAML include: Dimensions, or categories of semantic content. Nine dimensions are distinguished on empirical and theoretical grounds (inherited from DIT++). Qualifiers. for expressing that a dialogue act is performed conditionally, with uncertainty, or with a particular sentiment. Dependence relations for expressing semantic relations between dialogue acts, e.g. for indicating which question is answered by a certain answer act. Rhetorical relations between dialogue acts. DiAML abstract syntax (triples, n-tuples of concepts) supports alternative concrete representation formats: (‘ideal’ = complete and unambiguous) Example: core part of ISO 24617-2 annotation in DiAML-XML format: 1. G: go south and you’ll pass some cliffs on your right 2. F: uhm… 3. G: and some adobe huts on your left 4. F: oh okay <diaml xmlns=”http://www.iso.org/diaml”> <dialogueAct xml:id=”da1” target=”#fs1” sender=”#g” addressee=”#f” dimension=”task” communicativeFunction=”instruct”/> <dialogueAct xml:id=”da2” target=”#fs2” sender=”#f” addressee=”#f” dimension=”turnManagement” communicativeFunction=”turnTake”/> <dialogueAct xml:id=”da3” target=”#fs2” sender=”#f” addressee=”#g” dimension=”timeManagement” communicativeFunction=”stalling”/> <dialogueAct xml:id=”da4” target=”#fs3” sender=”#g” addressee=”#f” dimension=”task” communicativeFunction=”inform”/> <rhetoricalLink dact=”#da4” rhetoAntecedent=”#da1” rhetoRel=”elaborate”/> <dialogueAct xml:id=”da5” target=”#fs4” sender=”#f” addressee=”#g” dimension=”autoFeedback” communicativeFunction=”autoPositive” feedbackDependence=”#da1” ”#da4”/> </diaml> Full XML representation is very hard to read, inspect or correct. Equivalent, better human-readable tabular representation formats: DiAML-TabSW and DiAML-MultiTab. DiAML-TabSW format, derived from Switchboard-DA format: Func.segment ID DA-ID Dialogue acts Sp Functional segment text Turn transcript sw01-0105-fs.1 da1 Ta:setQuestion A Jimmy, so how do you get most of your news? Jimmy, {D so } how do you get most of your news?/ B {D Well, } [ I kind of, + {F uh, } I ] watch the, national news every day, for one. / I also read one or two papers a day / {C and } [ I’m a, + I’m pretty much a ] news junkie /{C and } I tune in to CNN a lot. / sw01-0105-fs.2 da2 da3 TiM:stalling TuM:turnTake Well, sw01-0105-fs.3 da4 OCM:selfCorrection I kind of, I sw01-0105-fs.4 da5 TiM;stalling uh sw01-0105-fs.5 da6 Ta:answer (da1) I watch the national news every day, for one sw01-0105-fs.6 da7 sw01-0105-fs.7 da8 Ta:answer (da1) (Expansion da6) I also read on or two papers a day sw01-0105-fs.8 da9 TuM:turnKeep and sw01-0105-fs.9 da10 Ta:inform I’m pretty much a news junkie sw01-0105-fs.10 da11 I’m a, I’m pretty much a sw01-0105-fs.11 da12 sw01-0105-fs.12 da13 Ta:answer (da1) (Expansion da6, da8] I tune in to CNN a lot sw01-0105-fs.13 da14 AuF:autoPositive Oh, wow. Switchboard-DAMSL annotation: Slash unit ID Function Transcript sw01-0105-0001-A001-01 qw A.1 utt1: Jimmy, {D so } how do you get most of your news? / sw01-0105-0002-B002-01 sd B.1 utt1: {D Well, [ I kind of, + fF uh, } I ] watch the {F uh, } national news every day, for one / sw01-0105-0003-B002-02 B.2 utt1 I also read one or two papers a day / sw01-0105-0004-B002-03 B.3 utt1: {C and } [ I’m a, + I’m pretty much a ] news junkie / sw01-0105-0005-B002-04 B.4 utt1: {C and } I tune in to CNN a lot. / sw01-0105-0006-A003-01 ba A.3 utt1: {F Oh, } wow. /  DiAML-MultiTab representation format: Func. segment ID Sp Functional segment text Turn transcript Task AutoFeedback Turn Management Time Management Discourse Structuring Social Obligation Management hello can I help you TR1-fs.1 s hello da1:Initial Greeting TR1-fs.2 can I help you da2:Offer uhm, yes hello, maybe, I’d like to take a tanker with orange juice from... TR1-fs.3 u uhm da3:Turn Take da4:Stalling TR1-fs.4 yes hello da5: Positive(da1) da6: Return Greeting (da1) TR1-fs.5 yes maybe da7: Accept Offer(da2) [uncertain] TR1-fs.6 I’d like to take a tanker with …. da8: Inform DialogBank contents: Origin Language Original representation Original annotation DiAML format of ISO 24617-2 annotation HCRC MapTask English NITE XML HCRC MapTask communicative functions DiAML-XML Switchboard 3-column tabular SWBD-DAMSL communicative functions DiAML-TabSW TRAINS 13-column tabular DAMSL communicative functions DiAML-MultiTab DBOX ISO 24617-2 annotations Dutch MapTask Dutch plain text transcript no dialogue act annotation DIAMOND DIT++ communicative functions and dimensions OVIS Schiphol Airport Ongoing and future work: Addition of more dialogues with gold standard ISO 24617-2 annotations in various DiAML formats Experimental investigation of usability of alternative DiAML representation formats Integration of ISO 24617-2 annotation with discourse relation annotation according to ISO 24617-8 Implementation of convertors between alternative DiAML formats, allowing users to view annotations in optimally convenient ways https://dialogbank.uvt.nl