Computational Models of Discourse Analysis Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute
Warm-Up Read the posts and be ready to discuss what you see as the takeaways for computationalization of discourse analysis from today’s readings… What are the computational implications of the debate between DA and CA? Note: These preparatory activities were rated as least beneficial to students, so… We will start the lecture/discussion at exactly 12:05pm. Please be on time and ready to discuss!
Early Course Evaluation Good news: everyone rated the lectures/discussions as valuable and engaging Things to improve: Decrease preparation time Change: only one discussion thread per week, but continue to use it throughout the week, include different options for response that require different amounts of time Change: frontload readings for Monday, further divide readings into required, extra, and supplemental More focus on fewer concepts for the remainder of the semester
This week: Sections 7.2-7.7 are most important * Not required!!!
Chicken and Egg… Main issue for this week: Exploring sequencing and linking between speech acts in conversation Operationalization Computationalization * Where do the ordering constraints come from? Is it the language? Or is it what is behind the language (e.g., intentions, task structure)? If the latter, how do we computationalize that?
Reminder from last time RE Constraint from Ordering: Inform is the most common class (37.4%). Next most frequent is Assess (18.5%). With bigrams, if we look for conditional probabilities above 25%, the only case where the most likely next class is not Inform is Elicit-Assessment, which is followed by Assessment 36% of the time and by Inform 33% of the time. Elicit-Assessment itself only occurs about 1% of the time. Trigrams might be better, but this makes ordering information look pretty useless.
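The bigram figures above are conditional probabilities of the form P(next class | current class), estimated from counts of adjacent labels. A minimal sketch of that computation follows, assuming each conversation is available as a list of speech act labels. The function names and the toy sequences are invented for illustration; they are not the course data and will not reproduce the percentages on the slide.

```python
from collections import Counter, defaultdict

def bigram_conditional_probs(sequences):
    """Estimate P(next_label | current_label) from label sequences."""
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        # Count each adjacent (current, next) pair within a conversation.
        for cur, nxt in zip(seq, seq[1:]):
            pair_counts[cur][nxt] += 1
    probs = {}
    for cur, nxt_counts in pair_counts.items():
        total = sum(nxt_counts.values())
        probs[cur] = {nxt: c / total for nxt, c in nxt_counts.items()}
    return probs

def most_likely_next(probs, threshold=0.25):
    """Keep, per current label, the top next label if it clears the threshold."""
    result = {}
    for cur, dist in probs.items():
        nxt, p = max(dist.items(), key=lambda kv: kv[1])
        if p > threshold:
            result[cur] = (nxt, p)
    return result

# Toy label sequences (hypothetical, for illustration only).
toy = [
    ["Inform", "Inform", "Elicit-Assessment", "Assess", "Inform"],
    ["Inform", "Assess", "Inform", "Inform"],
]
probs = bigram_conditional_probs(toy)
top = most_likely_next(probs)
for cur, (nxt, p) in sorted(top.items()):
    print(f"P({nxt} | {cur}) = {p:.2f}")
```

When one class dominates the marginal distribution, as Inform does here, it tends to be the most likely successor of almost every class, which is one way to read the slide's conclusion that bigram ordering adds little.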
More on what was least valuable (student quotes) The forum prompt mini-assignments seem unbalanced in proportion to the homework - by the time the "real" homework came along, I felt I had done ten times more work on my posts already. The Homework Nice job on the homeworks!!! I saw SO much improvement over the several posts and finally the assignment.
Assignment 2 (not due until Feb 23) Look at the Maptask dataset and Negotiation coding that is provided Think about what distinguishes the codes at a linguistic level Do an error analysis on the dataset using a simple unigram baseline, and from that propose one or a few new types of features motivated by your linguistic understanding of the Negotiation framework Due on Week 7 lecture 2 Turn in data, your feature extractors (documented code), and a formal write-up of your experimentation Have a 5-minute PowerPoint presentation ready for class on Week 7 lecture 2
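One way to approach the error analysis step is sketched below, assuming the data can be loaded as (utterance text, code) pairs. The classifier is a small multinomial Naive Bayes over unigram counts with add-one smoothing, chosen here only because it fits in a few lines; the assignment does not prescribe a learner or a toolkit. The labels CODE_X and CODE_Y are placeholders rather than actual Negotiation codes, and the utterances are invented.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Whitespace unigrams; a real feature extractor would likely do more.
    return text.lower().split()

def train_unigram_nb(examples):
    """Collect label and per-label unigram counts from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"labels": label_counts, "words": word_counts, "vocab": vocab}

def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    n = sum(model["labels"].values())
    v = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, count in model["labels"].items():
        total = sum(model["words"][label].values())
        score = math.log(count / n)
        for tok in tokenize(text):
            if tok in model["vocab"]:  # ignore words unseen in training
                score += math.log((model["words"][label][tok] + 1) / (total + v))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def error_analysis(model, examples):
    """Count (gold, predicted) confusions and collect misclassified items."""
    confusions = Counter()
    errors = []
    for text, gold in examples:
        pred = predict(model, text)
        if pred != gold:
            confusions[(gold, pred)] += 1
            errors.append((text, gold, pred))
    return confusions, errors

# Hypothetical toy data; CODE_X / CODE_Y are placeholder labels.
train = [
    ("go left past the mill", "CODE_X"),
    ("turn left at the bridge", "CODE_X"),
    ("where is the mill", "CODE_Y"),
    ("where do i turn", "CODE_Y"),
]
dev = [
    ("go left", "CODE_X"),
    ("where is the bridge", "CODE_Y"),
    ("turn at the mill", "CODE_Y"),
]
model = train_unigram_nb(train)
confusions, errors = error_analysis(model, dev)
for text, gold, pred in errors:
    print(f"gold={gold} pred={pred} text={text!r}")
```

Reading through the misclassified utterances grouped by confusion pair is what suggests new feature types: in this toy case the error comes from words that appear under both labels, so unigrams alone cannot separate them.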
Interesting Observation! Responses can address either illocutions or perlocutions Perlocutions are much less constrained Accounts for some of the difficulty in imposing ordering constraints Argues in favor of thinking about conversation as organized around intentions and tasks rather than linguistic categories Wednesday’s readings will argue just the opposite!! Are illocutions just the wrong categories??
Discourse Analysis vs Conversation Analysis (according to Levinson) Rules, formulas, more typical of linguistics and philosophers Categories, contingencies, grammars Use of a small but strategic amount of data Accused of “premature” theory construction Martin & Rose, Levinson More rigorously empirical and inductive Focus on what is found in data, not on what is expected to be found or would sound odd Hesitant to make generalizations/ Accused of being atheoretical Questions about whether the rules “work” on real data Do you see a connection with semisupervised learning? * Is it a question about the nature of language (is there a fundamental segmentation difference between utterances and acts?), or is it a question about research methodology? Are these linked?
Qualitative observations, anthropology style Rules, like speech acts The nature of what we are modeling What we can know about it and how certain we can be How we learn what we know
… And now for Elijah’s SIDE presentation
Questions?