The Role and Identification of Dialog Acts in Online Chat AAAI-11 Workshop on Analyzing Microtext August 8, 2011 Tamitha Carpenter, Emi Fujioka Stottler.

Presentation transcript:

The Role and Identification of Dialog Acts in Online Chat
AAAI-11 Workshop on Analyzing Microtext, August 8, 2011
Tamitha Carpenter, Emi Fujioka
Stottler Henke Associates Inc
NE 45th St., Suite 310, Seattle, WA FAX:

Overview  Problem: Analyze task-supporting chat to enable situation awareness processing  Domain: Software development  Corpus 1111 messages, collected from an IRC chat room over a 6 week period  Approach Chat-IE – Context-aware, event driven, collection of experts Includes tokenizer, POS tagger, dialog act type identifiers, and dialog pattern matcher

[Figure: Chat-IE processing flow.
 Input: chat messages from the software development team, for example:
    "I've finished one task (in review now) and one review"
    "what defect is it?"
    "meeting tomorrow at noon to discuss ideas on how to do this."
    "so how do you know how to read the value if the file hasn't changes?" followed by the correction "changed"
 Processing stages: Domain Term Recognition, Shallow Parsing, Historical Phrase Matching, Dialog Act Splitting/Merging, Fragment Tagger
 Output: the same messages with the correction merged in ("... if the file hasn't changed?") and dialog act labels (Directive, Action, Wh-question)
 Context sources: Source Code, Bug Tracking, Wiki Pages]

Dialog Act Types, most common first

statement non opinion          statement opinion
action description             yes no question
action directive               commit
agree accept                   other
wh question                    thanking
affirmative answer             completion
declarative y/n question       hmm
response acknowledge           apology
appreciation                   negative answer
offer                          correction
hedge                          maybe accept part
open question                  reject
hold before agreement          other answer
summarize restate              rhetorical question
conventional closing           quotation
downplayer                     option or clause
self talk                      abandoned
ack backchannel (mm hmm)       attention
backchannel question           conventional opening
declarative wh question        repeat phrase
signal non understanding       tag question

Slide callouts:
• Most commonly self-completion. Example:
    Speaker1: I’m working on defect 567
    Speaker1: I meant 568
• For messages directed at specific person
• Describe ongoing and completed activities

Uses  Triage – Identify critical events mid-conversation  Threading – Use patterns of dialogs to detangle multiple conversations  Filtering – Direct topically relevant conversations to interested users  Extraction – Use sequences of dialog act types to structure IE rules

Dialog Act Identification (1)
• Historical Phrase Matching
    Identify dialog act types based on past messages
      – Raw text
      – Text tagged with parts of speech
    Uses a variation of a String B-tree for fast matching over a large corpus
    Obtained about 60% accuracy on common dialog act types
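A minimal sketch of matching a new message against phrases seen in past labeled messages. The slides describe a variation of a String B-tree; this sketch swaps in a plain dictionary keyed on leading token n-grams, which is enough to show the lookup idea but not the indexing performance. The class name `PhraseHistory`, the `max_len` parameter, and the sample labeled messages are assumptions made for illustration.

```python
from collections import Counter, defaultdict


class PhraseHistory:
    """Maps leading token n-grams of past messages to dialog act counts.

    Illustrative stand-in for the String B-tree variant named on the slide.
    """

    def __init__(self, max_len=4):
        self.max_len = max_len              # longest leading phrase to index (assumed)
        self.counts = defaultdict(Counter)  # phrase tuple -> Counter of dialog acts

    def add(self, message, act):
        """Index every leading phrase of a past message under its label."""
        tokens = message.lower().split()
        for n in range(1, min(self.max_len, len(tokens)) + 1):
            self.counts[tuple(tokens[:n])][act] += 1

    def predict(self, message):
        """Return the most frequent act for the longest known leading phrase."""
        tokens = message.lower().split()
        for n in range(min(self.max_len, len(tokens)), 0, -1):
            seen = self.counts.get(tuple(tokens[:n]))
            if seen:
                return seen.most_common(1)[0][0]
        return None  # no past message starts the same way


# Hypothetical labeled history.
history = PhraseHistory()
history.add("what defect is it?", "wh-question")
history.add("what branch are you on?", "wh-question")
history.add("i'll take a look after lunch", "commit")
history.add("thanks for the review", "thanking")
```

The same structure could be keyed on part-of-speech tags instead of raw tokens, matching the second variant listed on the slide.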

Dialog Act Identification (2)
• Boosted performance to near 90% accuracy
    Example rules:
      – Wh-questions – Messages starting with wh-words (what, which, why, etc.).
      – Statement-opinion – Messages containing one of: “might”, “maybe”, “should”, “seems”, “i think”, “looks like”, “look like”, “probably”, or “i'm sure”.
      – Action-directive – Messages starting with infinitive verbs.
      – Action-description – Messages starting with “i”, “i just”, “i have”, “i’m”, etc., followed by a past tense or “-ing” verb.
      – Commit – Messages starting with “i will”, “i’ll”, “i’m going to”, or “i am going to”. Also, messages starting with “will” followed by an infinitive verb.
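The example rules above can be sketched as a small ordered rule list. This is an illustration under stated assumptions, not the Chat-IE implementation: the slides do not give a rule order, the full wh-word list, or how verbs are detected. Here the extra wh-words, the tiny `BASE_VERBS` set (a stand-in for a POS tagger), and the "-ed"/"-ing" suffix check for past-tense and progressive verbs are all hypothetical simplifications.

```python
import re

# "what, which, why" come from the slide; the rest fill in its "etc." (assumed).
WH_WORDS = ("what", "which", "why", "who", "whom", "whose", "when", "where", "how")
OPINION_CUES = ("might", "maybe", "should", "seems", "i think", "looks like",
                "look like", "probably", "i'm sure")
COMMIT_STARTS = ("i will", "i'll", "i'm going to", "i am going to")
# Hypothetical stand-in for a POS tagger's "infinitive verb" decision.
BASE_VERBS = {"check", "run", "fix", "update", "send", "look", "try", "add"}


def classify(message):
    """Return a dialog act label for a message, or None if no rule fires."""
    text = message.lower().replace("\u2019", "'").strip()
    words = re.findall(r"[a-z']+", text)
    if not words:
        return None
    if words[0] in WH_WORDS:
        return "wh-question"
    if text.startswith(COMMIT_STARTS):
        return "commit"
    if words[0] == "will" and len(words) > 1 and words[1] in BASE_VERBS:
        return "commit"
    # Crude suffix check: misses irregular past tenses, over-matches words like "need".
    if re.match(r"i(?: just| have|'m|'ve| am)? \w+(?:ed|ing)\b", text):
        return "action-description"
    if any(re.search(r"\b" + re.escape(cue) + r"\b", text) for cue in OPINION_CUES):
        return "statement-opinion"
    if words[0] in BASE_VERBS:
        return "action-directive"
    return None
```

For example, `classify("what defect is it?")` yields `"wh-question"` and `classify("check the log file")` yields `"action-directive"`. A message such as "so how do you know ...?" falls through every rule because it starts with "so", which hints at why surface rules alone are not enough for every message.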

Dialog Patterns  Status updates – An action-directive or wh-question, followed by any number of action-descriptions.  Directed request with acknowledge – An attention followed by any number of utterances, followed by a response-acknowledge by the person mentioned in the first utterance.  Confirmed expertise (1) – An action-description followed by a thanking or a response-acknowledge (preferably mentioning the initial speaker). (First speaker demonstrated expertise.)  Confirmed expertise (2) – A yes-no-question or wh- question followed by a describe-other. (Second speaker demonstrated expertise.)

Lessons Learned  Users have very specific needs for chat analysis. Filter chat dialogs and messages/threads into topics or “bins”. Monitor chat rooms for triggering events.  Everything hinges on the tokenizer. Users combine characters in novel ways (e.g., ?!?!,, :-), etc.) Domains may have special tokens (e.g., “/usr/bin/chatLogs”, “65.4N”).  Partial dialogs may need to be retired without being “finished”.
