SIGDIAL 2007, Antwerpen. Measuring Adaptation Between Dialogs. Svetlana Stoyancheva, Amanda Stent. SUNY, Stony Brook.

Presentation transcript:

Adaptation in Dialog
Change in the communication pattern over time:
- Shortening of referential expressions
- Prosody
- Accent
- Hand gestures
- Convergence on lexical and syntactic choices
“Lexical choice variability is high between conversations while it is relatively low within a conversation” (Brennan 1996)

Examples of Lexical Variation
- “Teacher”, “Instructor”, “Professor”, “Lecturer”
- “Dog”, “Irish Setter”, “Red Irish Setter”, “Creature”

Examples of Syntactic Variation
Dative/benefactive:
- “He gave the book to Mary”
- “He gave Mary the book”

Evidence of Adaptation in Dialog
Evidence from controlled experiments:
- “Lexical choice variability is high between conversations while it is relatively low within a conversation”
- Referring expressions
- Syntactic choices
(Bortfeld and Brennan 1997; Brennan and Clark 1996; Garrod and Anderson 1987)

Causes of Adaptation
- Recency (Brown and Dell, 1987; Pickering and Garrod, 2004; Chartrand and Bargh, 1999)
- Partner adaptation (Brennan and Clark, 1996; Horton and Gerrig, 2002)
These theories compete but are not necessarily contradictory.

Recency
- Words are activated during language production
- Also called: convergence, priming, alignment
Output/input coordination principle (Garrod and Anderson 1987): “people formulate their current utterance according to the same model and semantic rules used to interpret their partner's most recent utterance”

Partner Adaptation
- Based on the model of a partner
- Also called: entrainment, audience design
Conceptual pact (Brennan): “a temporary agreement about how the referent is to be conceptualized”
New addressee:
- new conceptual pacts
- may not be the same as with previous addressees

Corpus Studies on Recency Adaptation
- [Church 2000] measured lexical adaptation “within document” in corpora of written news
- [Dubey et al. 2006] applied this measure to study syntactic adaptation in dialogs and written text
- [Reitter et al. 2006] studied short-term priming effects in Maptask using logistic regression
- In our work we identify and compare partner-specific and recency adaptation

Setup
3 speakers: A, B, and C
1. A -> B: B is primed by A
2. B -> C: B may show recency effect
3. B -> A: B may show partner effect
- Compare B in 2 to A in 1
- Compare B in 3 to A in 1

Maptask Corpus Structure
Dlg# | giver | follower | set1    | set2
1    | a1    | b1       | prime   | -
2    | b2    | a2       | -       | prime
3    | a2    | a1       | -       | recency
4    | b1    | b2       | recency | -
5    | a2    | b2       | -       | partner
6    | b1    | a1       | partner | -
7    | a1    | a2       | -       | -
8    | b2    | b1       | -       | -
Hypothesis:
- recency adaptation happens between (1-4) and (2-3)
- partner adaptation happens between (1-6) and (2-5)
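The pairing stated in the hypothesis can be checked mechanically. The sketch below is only an illustration: the names `Dialog`, `DIALOGS`, and `describe_pair` are hypothetical, and the giver/follower assignments are read off the corpus structure table. It confirms that in every (prime, target) pair the giver of the target dialog was the follower in the prime dialog, and that the partner pairs are exactly those where that speaker talks back to the original giver.

```python
from typing import NamedTuple


class Dialog(NamedTuple):
    giver: str
    follower: str


# Dialogs 1-6 as listed in the Maptask corpus structure table.
DIALOGS = {
    1: Dialog("a1", "b1"),
    2: Dialog("b2", "a2"),
    3: Dialog("a2", "a1"),
    4: Dialog("b1", "b2"),
    5: Dialog("a2", "b2"),
    6: Dialog("b1", "a1"),
}

# (prime, target) dialog numbers taken from the stated hypothesis.
RECENCY_PAIRS = [(1, 4), (2, 3)]
PARTNER_PAIRS = [(1, 6), (2, 5)]


def describe_pair(prime_no: int, target_no: int) -> tuple[str, bool]:
    """Return (adapting speaker, whether the target addressee is the prime giver)."""
    prime, target = DIALOGS[prime_no], DIALOGS[target_no]
    # The speaker whose adaptation is measured gives instructions in the target
    # dialog and must have been the follower (the primed speaker) in the prime.
    if target.giver != prime.follower:
        raise ValueError("target giver was not primed in the prime dialog")
    return target.giver, target.follower == prime.giver
```

For example, `describe_pair(1, 6)` reports that b1 is the adapting speaker and is talking to the same partner (a1) who primed them.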

Church’s measure for adaptation
- With small datasets, the values fluctuate randomly; the measure is reliable only for “large” datasets
- High-probability features (“the”, “a”) occur in almost all documents; if a feature occurs in all N documents, then Prior = Positive Adapt = 1
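The formula on this slide is not preserved in the transcript. As a rough sketch only, the code below assumes the usual document-frequency reading of Church's measure: the prior is the fraction of pairs whose target contains the word, and positive adaptation is the same fraction restricted to pairs whose prime also contains it. The function name `church_adaptation` and the set-of-words input format are hypothetical.

```python
def church_adaptation(pairs, word):
    """Estimate (prior, positive_adapt) for `word`.

    `pairs` is a list of (prime_words, target_words) tuples, where each
    element is a set of the words occurring in that half.
    Assumed definitions (not taken from the slide):
      prior          = Pr(word in target)
      positive_adapt = Pr(word in target | word in prime)
    """
    n = len(pairs)
    in_prime = sum(1 for prime, _ in pairs if word in prime)
    in_target = sum(1 for _, target in pairs if word in target)
    in_both = sum(1 for prime, target in pairs if word in prime and word in target)

    prior = in_target / n
    # Undefined when the word never occurs in any prime.
    positive_adapt = in_both / in_prime if in_prime else None
    return prior, positive_adapt


example_pairs = [
    ({"the", "noriega"}, {"the", "noriega"}),
    ({"the", "noriega"}, {"the", "noriega"}),
    ({"the"}, {"the"}),
    ({"the"}, {"the"}),
]
the_prior, the_adapt = church_adaptation(example_pairs, "the")
```

With a word such as "the" that occurs in every prime and every target, both values come out as 1, which is the saturation problem the slide describes: the measure cannot show adaptation for very frequent features.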

Proposed Adaptation Measure
1. Adaptation ratio
   - Measures adaptation prevalence
   - Allows comparison of adaptation between features and between dialog pairs
   - Applicable to small datasets
2. Distance measure
   - Investigates how frequency in the prime affects the frequency in the target
Define: a feature is ‘adapted’ if its adaptation ratio > 1, i.e. if the feature is more likely to occur frequently after it was ‘primed’ than without priming

Proposed Adaptation Measures
1. Adaptation ratio
2. Distance measure

Terminology
- f ∈ D: a shortcut for “frequency of feature f in document D is greater than the baseline”
- Baseline for feature f: average frequency of feature f in all documents
- Feature f is primed if it occurs in the prime dialog with frequency greater than the baseline
- Document: a Maptask dialog
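The baseline and the "greater than the baseline" test can be sketched as follows. This is a minimal illustration, assuming that "frequency" means a raw count of the feature in a dialog (the slides do not say whether counts are normalized); the function names and the token-list representation of a dialog are hypothetical.

```python
def frequency(feature, document):
    """Raw count of `feature` in a document given as a list of tokens (assumed)."""
    return document.count(feature)


def baseline(feature, documents):
    """Average frequency of `feature` over all documents."""
    return sum(frequency(feature, doc) for doc in documents) / len(documents)


def exceeds_baseline(feature, document, b):
    """True when the feature's frequency in `document` is greater than baseline b.

    Applied to a prime dialog, this is the 'primed' condition from the slide.
    """
    return frequency(feature, document) > b


dialogs = [
    ["left", "left", "go"],
    ["go"],
    ["left"],
]
b_left = baseline("left", dialogs)  # (2 + 0 + 1) / 3
```

Here "left" counts as primed only in the first dialog, because the comparison is strict: a frequency equal to the baseline does not qualify.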

Adaptation Ratio
Adaptation Ratio = +adapt / chance
- Positive Adaptation (+adapt): [formula not preserved]
- Chance: probability that f co-occurs in prime and target by chance [formula not preserved]
Where:
- N: total number of (prime, target) dialog pairs
- P: number of prime dialogs where freq of f > b
- T: number of target dialogs where freq of f > b
Venn diagram: within all N dialog pairs, set P (f ∈ Prime) and set T (f ∈ Target) overlap in P∩T (f ∈ Prime and f ∈ Target).

Adaptation Ratio
- Measures adaptation prevalence
- Allows comparison of adaptation between different features
- Applicable to features of various frequencies
Define: a feature is ‘adapted’ if its adaptation ratio > 1, i.e. if the feature is more likely to occur frequently after it was ‘primed’ than without priming
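The formulas for +adapt and chance are not preserved in this transcript, so the following is a sketch under stated assumptions rather than the authors' definition. It assumes chance = (P/N) × (T/N), the probability of co-occurrence if prime and target were independent, and +adapt = |P∩T| / N, the observed probability of co-occurrence. The function name `adaptation_ratio` and the input format are hypothetical.

```python
def adaptation_ratio(pairs, b):
    """Adaptation ratio of one feature over (prime, target) dialog pairs.

    `pairs` is a list of (prime_freq, target_freq) for the feature;
    `b` is the feature's baseline frequency.
    Assumed formulas (not taken from the slides):
      chance = (P / N) * (T / N)
      +adapt = |P ∩ T| / N
    """
    n = len(pairs)
    p = sum(1 for prime_freq, _ in pairs if prime_freq > b)
    t = sum(1 for _, target_freq in pairs if target_freq > b)
    both = sum(1 for prime_freq, target_freq in pairs
               if prime_freq > b and target_freq > b)

    if p == 0 or t == 0:
        return None  # chance is zero, so the ratio is undefined

    chance = (p / n) * (t / n)
    positive_adapt = both / n
    return positive_adapt / chance


# Feature above baseline in prime and target of the same two pairs.
adapted = adaptation_ratio([(3, 4), (3, 4), (0, 0), (0, 0)], b=1)
# Feature above baseline in prime and target independently of each other.
not_adapted = adaptation_ratio([(3, 4), (3, 0), (0, 3), (0, 0)], b=1)
```

Under these assumptions the ratio simplifies to N·|P∩T| / (P·T), which is the same value obtained by dividing the conditional probability |P∩T| / P by the prior T / N, so the sketch does not depend on which of the two readings of +adapt was intended.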

Distance Measure
Investigate how priming affects the frequency in the target: if a feature is primed and adapted, what is its expected frequency in the target?

Distance Measure
For a feature f in a dialog pair (Prime, Target):
- t: frequency of feature f in target dialog
- p: frequency of feature f in prime dialog
- b: average frequency of f
Feature f is “adapted” in a pair of dialogs if distance > 0.
Strength of adaptation is proportional to the distance.

Experimental Questions
1. Identify features that exhibit partner and recency adaptation. Can they be clustered?
2. Which type of adaptation is more prevalent: partner or recency?
3. Does the feature frequency in the prime dialog affect the feature frequency in the target?

Features
- Words, stemmed and with POS tags (to help distinguish between senses)
- Bigrams, stemmed and with POS tags
- Syntactic features (from Maptask annotations)

Word-Stems with Adaptation Ratio > 1 and Significant
POS  | partner                      | recency
ADJ  | right-hand                   | bottom, right-hand
ADV  | when, diagonal               | right, well, about
CONJ | if                           | till, that, so
DET  | you, across, on, what, that  | my, i, just, that
INTJ | Sorri, err                   | uh
NOUN | bottom                       | map
PREP | across, through, along, from | from, by, to
VERB | know, got, take, pass        | say
Only features occurring in > 30% of prime dialogs with freq > baseline.
Highlighted groups: relative direction; contains ‘you’; contains ‘I’ or ‘my’.

Bigrams with Adaptation Ratio > 1 and Significant
Partner: your left, right-hand side, come to, you come, about the, when you, go round, and round, you got, if you, up toward, a wee, you just, round the, right you, just abov, abov the
Recency: no no, my map, okay and, you just, on my, down about, yeah i, you got, down to, have a, i mean, ’til you, just below, just to, now you, no you
Only features occurring in > 30% of prime dialogs with freq > baseline.
Highlighted groups: relative direction; contains ‘you’; contains ‘I’ or ‘my’.

Comparing Partner and Recency Adaptation
Adaptation ratio and adaptation strength are averaged over all features for each feature type.
Adaptation Ratio = +adapt / chance
Distance measure ~ Adaptation strength
* indicates significance (p < .05)

Adaptation Ratio for Selected Syntactic Features
- The features examined by Dubey et al. 2006 on the Switchboard corpus
- Found adaptation between speakers

Adaptation for Selected Syntactic Features
NP -> NN PP rule:
- Adaptation ratio is stronger for recency: if primed, a speaker is more likely to use this rule in the very next conversation
- Adaptation strength is higher for the partner scenario: the “adapting speaker” in the partner scenario will use this rule with higher frequency than the “adapting speaker” in the recency scenario
Adaptation Ratio = +adapt / chance
Distance measure ~ Adaptation strength

Dubey’s Results

Effect of Frequency in the Prime
Chart panels: Adaptation Strength; % of features adapted
- Compute adaptation ratio for all features
- Consider the feature to be “adapted” if adaptation ratio > 1
- Frequency of the feature in the prime does not affect the chance of adaptation, but affects the strength of adaptation

Conclusions
- Identified types of features affecting adaptation; they are difficult to cluster
- Found evidence for partner and for recency adaptation between dialogs
- Adaptation ratio is stronger in the recency scenario
- Adaptation ratio and adaptation strength are not always proportional

Implications
Evidence for partner adaptation suggests benefits of:
- Tuning parsing models (rule and vocabulary probabilities) of a dialog system to a particular user
- Sharing information between parsing and language generation modules

Future Work
- We are open to suggestions for other measures that would help differentiate between recency and partner adaptation.

Future Work
- Do the analysis with low-frequency words only
- Consider adaptation of the ‘taker’
- Measure within-dialog adaptation
- Consider a setup with no interleaving dialog for partner scenario:
  partner: 1) A->B, 2) B->A
  recency: 1) A->B, 2) B->C
- Take into account whether the conversation partners know each other
- Take into account the eye gaze condition

Questions?
Svetlana Stoyancheva
Thank you