Published by Aldous Allen. Modified over 6 years ago.
1
Lecture 29 Word Sense Disambiguation-Revision Dialogues
October 19, 2005 11/29/2018
2
Word-Sense Disambiguation
Word sense disambiguation is the process of selecting the right sense for a word from among the senses that the word is known to have.
Semantic selection restrictions can be used to disambiguate:
- Ambiguous arguments to unambiguous predicates
- Ambiguous predicates with unambiguous arguments
- Ambiguity all around
3
Word-Sense Disambiguation
We can use selectional restrictions for disambiguation:
- He cooked simple dishes.
- He broke the dishes.
But sometimes selectional restrictions are not enough to disambiguate:
- What kind of dishes do you recommend? (We cannot tell which sense is used.)
There can be two or more lexemes with multiple senses:
- They serve vegetarian dishes.
Selectional restrictions may also block every reading:
- If you want to kill Turkey, eat its banks.
Such sentences leave the system with no possible meaning, and they can indicate a metaphor.
4
WSD and Selection Restrictions
Ambiguous arguments:
- Prepare a dish
- Wash a dish
Ambiguous predicates:
- Serve Denver
- Serve breakfast
Both:
- Serves vegetarian dishes
5
WSD and Selection Restrictions
This approach is complementary to the compositional analysis approach.
You need a parse tree and some form of predicate-argument analysis derived from:
- The tree and its attachments
- All the word senses coming up from the lexemes at the leaves of the tree
Ill-formed analyses are eliminated by noting any selection restriction violations.
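The filtering step can be sketched in a few lines of Python. This is only a toy illustration: the lexicon, the semantic type labels (FOOD, PHYSICAL_OBJECT), and the function name are assumptions made up for the example, not part of the lecture or of any real resource.

```python
# Hypothetical toy lexicon: each noun sense is tagged with a semantic type.
NOUN_SENSES = {
    "dish": {"dish/food": "FOOD", "dish/plate": "PHYSICAL_OBJECT"},
}

# Hypothetical selection restrictions: the semantic type that each verb
# requires of its direct object.
VERB_OBJECT_TYPE = {
    "cook": "FOOD",
    "prepare": "FOOD",
    "break": "PHYSICAL_OBJECT",
    "wash": "PHYSICAL_OBJECT",
}


def disambiguate_object(verb, noun):
    """Return the senses of `noun` that do not violate the verb's restriction."""
    required = VERB_OBJECT_TYPE.get(verb)
    senses = NOUN_SENSES.get(noun, {})
    if required is None:
        # No restriction is known for this verb, so nothing can be ruled out.
        return sorted(senses)
    return sorted(s for s, sem_type in senses.items() if sem_type == required)


print(disambiguate_object("cook", "dish"))       # ['dish/food']
print(disambiguate_object("break", "dish"))      # ['dish/plate']
print(disambiguate_object("recommend", "dish"))  # both senses survive
```

The last call mirrors the "What kind of dishes do you recommend?" case: when the predicate imposes no usable restriction, both senses remain.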
6
Problems
Selection restrictions are violated all the time. This doesn't mean that the sentences are ill-formed or less preferred than others. This approach needs some way of categorizing and dealing with the various ways that restrictions can be violated.
7
WSD Tags
What's a tag? A dictionary sense?
For example, in WordNet an instance of "bass" in a text has 8 possible tags or labels (bass1 through bass8).
8
WordNet Bass
The noun "bass" has 8 senses in WordNet:
1. bass - (the lowest part of the musical range)
2. bass, bass part - (the lowest part in polyphonic music)
3. bass, basso - (an adult male singer with the lowest voice)
4. sea bass, bass - (flesh of lean-fleshed saltwater fish of the family Serranidae)
5. freshwater bass, bass - (any of various North American lean-fleshed freshwater fishes especially of the genus Micropterus)
6. bass, bass voice, basso - (the lowest adult male singing voice)
7. bass - (the member with the lowest range of a family of musical instruments)
8. bass - (nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes)
9
Representations
Most supervised ML approaches require a very simple representation for the input training data:
- Vectors of sets of feature/value pairs
- I.e., files of comma-separated values
So our first task is to extract training data from a corpus with respect to a particular instance of a target word. This typically consists of a characterization of the window of text surrounding the target.
10
Representations
This is where ML and NLP intersect:
- If you stick to trivial surface features that are easy to extract from a text, then most of the work is in the ML system.
- If you decide to use features that require more analysis (say, parse trees), then the ML part may be doing relatively less work, provided these features are truly informative.
11
Surface Representations
Collocational and co-occurrence information
Collocational:
- Encode features about the words that appear in specific positions to the right and left of the target word
- Often limited to the words themselves as well as their part of speech
Co-occurrence:
- Features characterizing the words that occur anywhere in the window regardless of position
- Typically limited to frequency counts
12
Collocational
Position-specific information about the words in the window:
guitar and bass player stand
[guitar, NN, and, CJC, player, NN, stand, VVB]
In other words, a vector consisting of [position n word, position n part-of-speech, ...]
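A minimal sketch of how such a vector might be extracted, assuming the sentence is already tokenized and POS-tagged. The function name, the "<pad>" filler for positions outside the sentence, and the tag given to the target word itself are assumptions for illustration.

```python
def collocational_features(tagged, target_index, window=2):
    """Return [word, POS, word, POS, ...] for the positions around the target.

    `tagged` is a list of (word, pos) pairs; the target word itself is skipped.
    """
    features = []
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # the target word is not part of its own context
        i = target_index + offset
        if 0 <= i < len(tagged):
            word, pos = tagged[i]
        else:
            word, pos = "<pad>", "<pad>"  # position falls outside the sentence
        features.extend([word, pos])
    return features


tagged = [("guitar", "NN"), ("and", "CJC"), ("bass", "NN"),
          ("player", "NN"), ("stand", "VVB")]
print(collocational_features(tagged, target_index=2))
# ['guitar', 'NN', 'and', 'CJC', 'player', 'NN', 'stand', 'VVB']
```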
13
Co-occurrence
Information about the words that occur within the window.
- First derive a set of terms to place in the vector.
- Then note how often each of those terms occurs in a given window.
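The two steps can be sketched as follows. The vocabulary below is a made-up term set chosen only for illustration; in practice it would be derived from a corpus.

```python
from collections import Counter


def cooccurrence_vector(window_words, vocabulary):
    """Count how often each vocabulary term occurs in the window, ignoring position."""
    counts = Counter(window_words)
    return [counts[term] for term in vocabulary]


# Hypothetical term set for the target word "bass".
vocabulary = ["fishing", "guitar", "player", "river", "band"]
window = "guitar and bass player stand".split()
print(cooccurrence_vector(window, vocabulary))  # [0, 1, 1, 0, 0]
```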
14
Classifiers
Once we cast the WSD problem as a classification problem, all sorts of techniques are possible:
- Naïve Bayes (the right thing to try first)
- Decision lists
- Decision trees
- Neural nets
- Support vector machines
- Nearest neighbor methods...
15
Classifiers
The choice of technique depends, in part, on the set of features that have been used:
- Some techniques work better/worse with features with numerical values
- Some techniques work better/worse with features that have large numbers of possible values
For example, the feature "the word to the left" has a fairly large number of possible values.
16
Statistical Word-Sense Disambiguation
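Choose the most probable sense given the input:
$$\hat{s} = \arg\max_{s \in S} P(s \mid V)$$
where $S$ is the set of candidate senses and $V$ is the vector representation of the input.
By Bayes' rule:
$$\hat{s} = \arg\max_{s \in S} \frac{P(V \mid s)\,P(s)}{P(V)}$$
By making an independence assumption over the features $v_1, \dots, v_n$ of $V$:
$$P(V \mid s) \approx \prod_{j=1}^{n} P(v_j \mid s)$$
This means that the probability of the vector given a sense is the product of the probabilities of its individual features given that sense.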
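A small Naïve Bayes sense classifier along these lines might look like the sketch below. The training examples are invented, and the add-one smoothing is an assumption added so that unseen features do not zero out a sense; the slide does not specify a smoothing method.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (sense, feature_list) pairs."""
    sense_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for sense, feats in examples:
        sense_counts[sense] += 1
        for f in feats:
            feature_counts[sense][f] += 1
            vocab.add(f)
    return sense_counts, feature_counts, vocab


def classify(features, model):
    """Return argmax over senses of log P(s) + sum_j log P(v_j | s)."""
    sense_counts, feature_counts, vocab = model
    total = sum(sense_counts.values())
    best, best_score = None, -math.inf
    for sense in sorted(sense_counts):
        score = math.log(sense_counts[sense] / total)  # log prior P(s)
        denom = sum(feature_counts[sense].values()) + len(vocab)
        for f in features:
            # Add-one smoothing (an assumption of this sketch).
            score += math.log((feature_counts[sense][f] + 1) / denom)
        if score > best_score:
            best, best_score = sense, score
    return best


# Invented toy training data for the target word "bass".
model = train([
    ("fish", ["river", "caught", "fishing"]),
    ("fish", ["lake", "fishing", "rod"]),
    ("music", ["guitar", "player", "band"]),
    ("music", ["play", "guitar", "amp"]),
])
print(classify(["guitar", "player"], model))  # music
print(classify(["fishing", "lake"], model))   # fish
```

P(V) is dropped because it is the same for every sense and so does not affect the argmax; log probabilities are summed to avoid numerical underflow.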
17
Problems
Given these general ML approaches, how many classifiers do I need to perform WSD robustly?
- One for each ambiguous word in the language
How do you decide what set of tags/labels/senses to use for a given word?
- Depends on the application
18
Lecture 3: October 5, 2004 Dan Jurafsky
LING 138/238 SYMBSYS 138: Intro to Computer Speech and Language Processing
19
Week 2: Dialogue and Conversational Agents
- Examples of spoken language systems
- Components of a dialogue system, with a focus on these three: ASR, NLU, dialogue management
- VoiceXML
- Grounding and confirmation
20
Conversational Agents
AKA:
- Spoken Language Systems
- Dialogue Systems
- Speech Dialogue Systems
Applications:
- Travel arrangements (Amtrak, United Airlines)
- Telephone call routing
- Tutoring
- Communicating with robots
- Anything with limited screen/keyboard
21
A travel dialog: Communicator
22
Call routing: ATT HMIHY
23
A tutorial dialogue: ITSPOKE
24
Dialogue System Architecture
Simplest possible architecture: ELIZA
- Read, search/replace, print loop
We'll need something with more sophisticated dialogue control, and speech.
25
Dialogue System Architecture
26
ASR engine
ASR = Automatic Speech Recognition
The job of an ASR system is to go from speech (telephone or microphone) to words. We will be studying this in a few weeks.
27
ASR Overview (pic from Yook 2003)
28
ASR in Dialogue Systems
ASR systems work better if they can constrain what words the speaker is likely to say. A dialogue system often has these constraints:
System: What city are you departing from?
The system can expect sentences of the form:
- I want to (leave|depart) from [CITYNAME]
- From [CITYNAME]
- [CITYNAME]
- etc.
29
ASR in Dialogue Systems
The system can also adapt to the speaker.
But ASR is error-prone, so unlike ELIZA we can't count on the words being correct. As we will see, this fact about error plays a huge role in dialogue system design.
30
Natural Language Understanding
Also called NLU. We will discuss this later in the quarter.
There are many ways to represent the meaning of sentences. For speech dialogue systems, perhaps the most common is a simple one called "frame and slot semantics". (Semantics = meaning.)
31
An example of a frame
Show me morning flights from Boston to SF on Tuesday.
SHOW:
  FLIGHTS:
    ORIGIN:
      CITY: Boston
      DATE: Tuesday
      TIME: morning
    DEST:
      CITY: San Francisco
32
How to generate this semantics?
Many methods. Simplest: semantic grammars
LIST -> show me | I want | can I see | …
DEPARTTIME -> (after|around|before) HOUR | morning | afternoon | evening
HOUR -> one|two|three…|twelve (am|pm)
FLIGHTS -> (a) flight | flights
ORIGIN -> from CITY
DESTINATION -> to CITY
CITY -> Boston | San Francisco | Denver | Washington
33
Semantics for a sentence
LIST: Show me
FLIGHTS: flights
ORIGIN: from Boston
DESTINATION: to San Francisco
DEPARTDATE: on Tuesday
DEPARTTIME: morning
34
Frame-filling
We use a parser to take these rules and apply them to the sentence, resulting in a semantics for the sentence. We can then write some simple code that takes the semantically labeled sentence and fills in the frame.
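As a rough sketch of the idea, the rules can be approximated with regular expressions instead of a real parser. That substitution is an assumption of this example, as are the weekday list for DEPARTDATE (the slide grammar does not define that rule) and the function name.

```python
import re

# CITY -> Boston | San Francisco | Denver | Washington
CITY = r"(?:Boston|San Francisco|Denver|Washington)"

# One pattern per slot; each captures the slot filler in a group named "value".
RULES = {
    "ORIGIN": re.compile(rf"\bfrom (?P<value>{CITY})\b"),
    "DESTINATION": re.compile(rf"\bto (?P<value>{CITY})\b"),
    # Assumed rule: the slides do not spell out DEPARTDATE.
    "DEPARTDATE": re.compile(
        r"\bon (?P<value>Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\b"
    ),
    "DEPARTTIME": re.compile(r"\b(?P<value>morning|afternoon|evening)\b"),
}


def fill_frame(sentence):
    """Return a dict mapping each slot found in the sentence to its filler."""
    frame = {}
    for slot, pattern in RULES.items():
        match = pattern.search(sentence)
        if match:
            frame[slot] = match.group("value")
    return frame


print(fill_frame("Show me morning flights from Boston to San Francisco on Tuesday"))
```

For the example sentence the frame contains ORIGIN=Boston, DESTINATION=San Francisco, DEPARTDATE=Tuesday, and DEPARTTIME=morning. Slots that the user did not mention are simply absent.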
35
Other NLU Approaches
- Cascade of finite-state transducers: instead of a parser, we could use FSTs, which are very fast, to create the semantics.
- Or we could use "syntactic rules with semantic attachments". The latter is what is done in VoiceXML, so we will see that today.
36
Generation and TTS
Generation: two main approaches
- Simple templates (prescripted sentences)
- Unification: use similar grammar rules as for parsing, but run them backwards!
37
Generation: the generation component of a conversational agent
- chooses the concept to express to the user,
- plans how to express the concepts in words,
- assigns necessary prosody to the words.
TTS:
- takes these words and their prosodic annotations,
- synthesizes a waveform.
38
Generation
Content planner: what to say
Language generation: how to say it
- Template-based generation:
  What time do you want to leave CITY_ORIG?
  Will you return to CITY_ORIG from CITY_DEST?
- Natural language generation:
  Sentence planner
  Surface realizer
  Prosody assigner
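Template-based generation amounts to string substitution. In this sketch the template keys (`ask_depart_time`, `ask_return`) and the `generate` helper are hypothetical names; the template text is taken from the slide.

```python
# Prescripted sentences with named slots to fill in.
TEMPLATES = {
    "ask_depart_time": "What time do you want to leave {CITY_ORIG}?",
    "ask_return": "Will you return to {CITY_ORIG} from {CITY_DEST}?",
}


def generate(template_name, **slots):
    """Fill the named template with the given slot values."""
    return TEMPLATES[template_name].format(**slots)


print(generate("ask_depart_time", CITY_ORIG="Boston"))
# What time do you want to leave Boston?
print(generate("ask_return", CITY_ORIG="Boston", CITY_DEST="Denver"))
# Will you return to Boston from Denver?
```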
39
Dialogue Manager
Controls the architecture and the structure of the dialogue:
- Takes input from the ASR/NLU components
- Maintains some sort of state
- Interfaces with the task manager
- Passes output to the NLG/TTS modules
40
Dialogue Manager
ELIZA was the simplest dialogue manager:
- Read, search/replace, print loop
- No state was kept; the system did the same thing on every sentence
A real dialogue manager needs to keep state:
- Ability to model structures of dialogue above the level of a single response
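A stateless read, search/replace, print step in the ELIZA style can be sketched as below. The two rules and the default reply are invented for illustration and are not ELIZA's actual script.

```python
import re

# Hypothetical pattern/response rules; {0} is replaced by the captured text.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
]
DEFAULT = "Please go on."


def respond(utterance):
    """Apply the first matching rule; no state is kept between calls."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(match.group(1).rstrip(".!?"))
    return DEFAULT


print(respond("I need a vacation."))  # Why do you need a vacation?
print(respond("The weather is bad"))  # Please go on.
```

Because `respond` looks only at the current utterance, it cannot remember earlier answers, which is exactly the limitation the slide points out.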
41
Architectures for dialogue management
- Finite state
- Frame-based
- Information-state based, including probabilistic
- Planning agents
42
Finite State Dialogue Manager
43
Finite-state dialogue managers
The system completely controls the conversation with the user. It asks the user a series of questions, ignoring (or misinterpreting) anything the user says that is not a direct answer to the system's questions.
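A minimal sketch of such a manager, assuming a fixed linear sequence of states; the state list and function name are illustrative only, and speech input is replaced by a list of strings.

```python
# Each state is (slot to fill, question to ask); order is fixed.
STATES = [
    ("origin", "What city are you leaving from?"),
    ("dest", "Where are you going?"),
    ("date", "What day would you like to leave?"),
]


def run_dialogue(user_turns):
    """Ask every question in order and store each reply verbatim."""
    answers = {}
    turns = iter(user_turns)
    for slot, question in STATES:
        print("S:", question)
        reply = next(turns, "")
        print("U:", reply)
        # The whole reply is taken as the answer to the current question,
        # even if the user said something else or gave extra information.
        answers[slot] = reply
    return answers


print(run_dialogue(["Seattle", "Boston", "Tuesday"]))
# {'origin': 'Seattle', 'dest': 'Boston', 'date': 'Tuesday'}
```

If the first reply were "Seattle to Boston on Tuesday", this manager would store the whole string as the origin and still ask the remaining two questions.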
44
Dialogue Initiative
"Initiative" means who has control of the conversation at any point.
- Single initiative: system or user
- Mixed initiative
45
System Initiative
Systems which completely control the conversation at all times are called system initiative.
Advantages:
- Simple to build
- User always knows what they can say next
- System always knows what user can say next
  - Known words: better performance from ASR
  - Known topic: better performance from NLU
Disadvantage:
- Too limited
46
User Initiative
The user directs the system.
- Generally, the user asks a single question and the system answers
- The system can't ask questions back or engage in clarification or confirmation dialogue
- Used for simple database queries: the user asks a question, the system gives an answer
- Web search is user-initiative dialogue
47
Problems with System Initiative
Real dialogue involves give and take! In travel planning, users might want to say something that is not the direct answer to the question, for example answering more than one question in a sentence:
- Hi, I'd like to fly from Seattle Tuesday morning
- I want a flight from Milwaukee to Orlando one way leaving after 5 p.m. on Wednesday.
48
Single initiative + universals
We can give users a little more flexibility by adding universal commands.
Universals: commands you can say anywhere, as if we augmented every state of the FSA with these:
- Help
- Correct
This describes many implemented systems, but it still doesn't deal with mixed initiative.
49
Mixed Initiative
Conversational initiative can shift between system and user.
Simplest kind of mixed initiative: use the structure of the frame itself to guide the dialogue.
Slot        Question
ORIGIN      What city are you leaving from?
DEST        Where are you going?
DEPT DATE   What day would you like to leave?
DEPT TIME   What time would you like to leave?
AIRLINE     What is your preferred airline?
50
Frames are mixed-initiative
The user can answer multiple questions at once.
- The system asks questions of the user, filling any slots that the user specifies
- When the frame is filled, do the database query
- If the user answers 3 questions at once, the system has to fill those slots and not ask these questions again!
In this way we avoid the strict constraints on order imposed by the finite-state architecture.
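The control loop this describes can be sketched as follows. The regular-expression slot patterns stand in for a real NLU component and only handle single-word cities; they, the slot name DEPT_DATE, and the function names are assumptions made for the example.

```python
import re

# Slot -> question, in the order the system would ask them.
QUESTIONS = {
    "ORIGIN": "What city are you leaving from?",
    "DEST": "Where are you going?",
    "DEPT_DATE": "What day would you like to leave?",
}

# Crude stand-in for NLU: one pattern per slot (single-word fillers only).
PATTERNS = {
    "ORIGIN": re.compile(r"\bfrom (\w+)"),
    "DEST": re.compile(r"\bto (\w+)"),
    "DEPT_DATE": re.compile(r"\bon (\w+)"),
}


def update_frame(frame, utterance):
    """Fill every still-empty slot that the utterance mentions."""
    for slot, pattern in PATTERNS.items():
        match = pattern.search(utterance)
        if match and slot not in frame:
            frame[slot] = match.group(1)
    return frame


def next_question(frame):
    """Return the question for the first unfilled slot, or None when full."""
    for slot, question in QUESTIONS.items():
        if slot not in frame:
            return question
    return None  # frame is full: time to run the database query


frame = update_frame({}, "I want a flight from Milwaukee to Orlando")
print(frame)                 # {'ORIGIN': 'Milwaukee', 'DEST': 'Orlando'}
print(next_question(frame))  # What day would you like to leave?
```

Because the next question is chosen from whatever slots remain empty, the two slots answered in one utterance are never asked about again.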
51
Multiple frames
Flights, hotels, rental cars
- Flight legs: each flight can have multiple legs, which might need to be discussed separately
- Presenting the flights (if there are multiple flights meeting the user's constraints): this frame has slots like 1ST_FLIGHT or 2ND_FLIGHT, so the user can ask "how much is the second one?"
- General route information: Which airlines fly from Boston to San Francisco?
- Airfare practices: Do I have to stay over Saturday to get a decent airfare?
52
Multiple Frames
We need to be able to switch from frame to frame based on what the user says: disambiguate which slot of which frame an input is supposed to fill, then switch dialogue control to that frame.
53
VoiceXML
Voice eXtensible Markup Language
- An XML-based dialogue design language
- Makes use of ASR and TTS
- Deals well with simple, frame-based mixed-initiative dialogue
- Most common in the commercial world (too limited for research systems)
- But useful for getting a handle on the concepts
54
Voice XML
Each dialogue is a <form>. (Form is the VoiceXML word for frame.)
Each <form> generally consists of a sequence of <field>s, with other commands.
55
Sample vxml doc
<form>
  <field name="transporttype">
    <prompt> Please choose airline, hotel, or rental car. </prompt>
    <grammar type="application/x-nuance-gsl">
      [airline hotel (rental car)]
    </grammar>
  </field>
  <block>
    <prompt> You have chosen <value expr="transporttype"/>. </prompt>
  </block>
</form>
56
VoiceXML interpreter
Walks through a VXML form in document order, iteratively selecting each item.
- If there are multiple fields, it visits each one in order.
- There are special commands for events.
57
Another vxml doc (1)
<noinput>
  I'm sorry, I didn't hear you. <reprompt/>
</noinput>
<nomatch>
  I'm sorry, I didn't understand that. <reprompt/>
</nomatch>
58
Another vxml doc (2)
<form>
  <block> Welcome to the air travel consultant. </block>
  <field name="origin">
    <prompt> Which city do you want to leave from? </prompt>
    <grammar type="application/x-nuance-gsl">
      [(san francisco) denver (new york) barcelona]
    </grammar>
    <filled>
      <prompt> OK, from <value expr="origin"/> </prompt>
    </filled>
  </field>
59
Another vxml doc (3)
  <field name="destination">
    <prompt> And which city do you want to go to? </prompt>
    <grammar type="application/x-nuance-gsl">
      [(san francisco) denver (new york) barcelona]
    </grammar>
    <filled>
      <prompt> OK, to <value expr="destination"/> </prompt>
    </filled>
  </field>
  <field name="departdate" type="date">
    <prompt> And what date do you want to leave? </prompt>
    <filled>
      <prompt> OK, on <value expr="departdate"/> </prompt>
    </filled>
  </field>
60
Another vxml doc (4)
  <block>
    <prompt> OK, I have you are departing from <value expr="origin"/> to <value expr="destination"/> on <value expr="departdate"/> </prompt>
    send the info to book a flight...
  </block>
</form>
61
A mixed initiative VXML doc
Mixed initiative: the user might answer a different question.
- So the VoiceXML interpreter can't just evaluate each field of the form in order
- The user might answer field2 when the system asked field1
- So we need a grammar which can handle all sorts of input: field1; field2; field1 and field2; etc.
62
VXML Nuance-style grammars
Rewrite rules:
  Wantsentence -> I want to (fly|go)
Nuance VXML format is: () for concatenation, [] for disjunction.
Each rule has a name:
  Wantsentence (I want to [fly go])
  Airports [(san francisco) denver]
63
Mixed-init VXML example (3)
<noinput>
  I'm sorry, I didn't hear you. <reprompt/>
</noinput>
<nomatch>
  I'm sorry, I didn't understand that. <reprompt/>
</nomatch>
<form>
  <grammar type="application/x-nuance-gsl">
  <![CDATA[
64
Grammar
Flight ( ?[ (i [wanna (want to)] [fly go])
            (i'd like to [fly go])
            ([(i wanna)(i'd like a)] flight)
          ]
         [ ( [from leaving departing] City:x) {<origin $x>}
           ( [(?going to)(arriving in)] City:x) {<dest $x>}
           ( [from leaving departing] City:x
             [(?going to)(arriving in)] City:y) {<origin $x> <dest $y>}
         ]
         ?please
       )
65
Grammar
City [ [(san francisco) (s f o)] {return( "san francisco, california")}
       [(denver) (d e n)] {return( "denver, colorado")}
       [(seattle) (s e a)] {return( "seattle, washington")}
     ]
  ]]>
  </grammar>
66
Grammar
  <initial name="init">
    <prompt> Welcome to the air travel consultant. What are your travel plans? </prompt>
  </initial>
  <field name="origin">
    <prompt> Which city do you want to leave from? </prompt>
    <filled>
      <prompt> OK, from <value expr="origin"/> </prompt>
    </filled>
  </field>
67
Grammar
  <field name="dest">
    <prompt> And which city do you want to go to? </prompt>
    <filled>
      <prompt> OK, to <value expr="dest"/> </prompt>
    </filled>
  </field>
  <block>
    <prompt> OK, I have you are departing from <value expr="origin"/> to <value expr="dest"/>. </prompt>
    send the info to book a flight...
  </block>
</form>
68
Grounding and Confirmation
Dialogue is a collective act performed by speaker and hearer.
Common ground: the set of things mutually believed by both speaker and hearer.
To achieve common ground, the hearer must ground, or acknowledge, the speaker's utterance.
Clark (1996), principle of closure: agents performing an action require evidence, sufficient for current purposes, that they have succeeded in performing it.
69
Clark and Schaefer: Grounding
- Continued attention: B continues attending to A
- Relevant next contribution: B starts in on the next relevant contribution
- Acknowledgement: B nods or says a continuer like uh-huh, yeah, or an assessment (great!)
- Demonstration: B demonstrates understanding of A by paraphrasing or reformulating A's contribution, or by collaboratively completing A's utterance
- Display: B displays verbatim all or part of A's presentation
71
Grounding examples
Display:
  C: I need to travel in May
  A: And, what day in May did you want to travel?
Acknowledgement:
  C: He wants to fly from Boston
  A: mm-hmm
  C: to Baltimore Washington International
72
Grounding Examples (2)
Acknowledgement + next relevant contribution:
- And, what day in May did you want to travel?
- And you're flying into what city?
- And what time would you like to leave?
74
Grounding and Dialogue Systems
Grounding is not just a tidbit about humans; it is key to the design of a conversational agent. Why?
HCI researchers find that users of speech-based interfaces are confused when the system doesn't give them an explicit acknowledgement signal.
Experiment with this.
75
Confirmation
Another reason for grounding:
- Speech is a pretty error-prone channel; the hearer could misinterpret the speaker
- This is important in conversational agents, since we are using ASR, which is still really buggy
- So we need to do lots of grounding and confirmation
76
Explicit confirmation
S: Which city do you want to leave from?
U: Baltimore
S: Do you want to leave from Baltimore?
U: Yes
77
Explicit confirmation
U: I'd like to fly from Denver Colorado to New York City on September 21st in the morning on United Airlines
S: Let's see then. I have you going from Denver Colorado to New York on September 21st. Is that correct?
U: Yes
78
Implicit confirmation: display
U: I'd like to travel to Berlin
S: When do you want to travel to Berlin?

U: Hi I'd like to fly to Seattle Tuesday morning
S: Traveling to Seattle on Tuesday, August eleventh in the morning. Your name?
79
Implicit vs. Explicit
Complementary strengths:
- Explicit: easier for users to correct the system's mistakes (they can just say "no"), but cumbersome and long
- Implicit: much more natural, quicker, simpler (if the system guesses right)
80
Implicit and Explicit
Early systems: all-implicit or all-explicit
Modern systems: adaptive
How to decide?
- The ASR system can give a confidence metric, which expresses how convinced the system is of its transcription of the speech
- If high confidence, use implicit confirmation
- If low confidence, use explicit confirmation
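The adaptive policy reduces to a threshold test on the ASR confidence. In this sketch the 0.8 threshold, the function name, and the assumption that confidence is a number between 0 and 1 are all illustrative; a real system would tune the threshold on data.

```python
def confirmation_prompt(destination, confidence, threshold=0.8):
    """Choose implicit or explicit confirmation from the ASR confidence."""
    if confidence >= threshold:
        # Implicit: display the recognized value inside the next question.
        return f"When do you want to travel to {destination}?"
    # Explicit: ask the user to confirm before moving on.
    return f"Do you want to travel to {destination}?"


print(confirmation_prompt("Berlin", confidence=0.93))
# When do you want to travel to Berlin?
print(confirmation_prompt("Berlin", confidence=0.41))
# Do you want to travel to Berlin?
```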
81
- Dialogue acts
- More on VXML
- More on design of dialogue agents
- Evaluation of dialogue agents
82
Lecture 4: October 7, 2004 Dan Jurafsky
LING 138/238 SYMBSYS 138: Intro to Computer Speech and Language Processing
83
Week 2: Dialogue and Conversational Agents
- Speech acts and dialogue acts
- VoiceXML, continued
- More on design of dialogue agents
- Evaluation of dialogue agents
84
Review
- Finite-state dialogue management
- Frame-based dialogue management
- Semantic grammars
- ASR
- System, user, and mixed initiative
- VoiceXML
- Explicit and implicit confirmation
- Grounding
85
We want more complex dialogue
We saw finite-state and frame-based dialogue managers.
- They could only handle simple dialogues
- In particular, neither could handle unexpected questions from the user
- In fact, from what we've seen so far it is not even clear how to tell that the user has just asked us a question!
86
Speech Acts
Austin (1962): an utterance is a kind of action.
Clear case: performatives
- I name this ship the Titanic
- I second that motion
- I bet you five dollars it will snow tomorrow
Performative verbs (name, second)
Austin's idea: not just these verbs
87
Each utterance is 3 acts
- Locutionary act: the utterance of a sentence with a particular meaning
- Illocutionary act: the act of asking, answering, promising, etc., in uttering a sentence
- Perlocutionary act: the (often intentional) production of certain effects upon the thoughts, feelings, or actions of the addressee in uttering a sentence
88
Locutionary and illocutionary
"You can't do that!"
Illocutionary force:
- Protesting
Perlocutionary force:
- Intent to annoy the addressee
- Intent to stop the addressee from doing something
89
The 3 levels of act revisited
Utterance                                 Locutionary force
Can I have the rest of your sandwich?     Question
I want the rest of your sandwich          Declarative
Give me your sandwich!                    Imperative
Illocutionary force (all three): Request
Perlocutionary effect (all three): You give me sandwich
90
Illocutionary Acts
What are they?
91
5 classes of speech acts: Searle (1975)
- Assertives: committing the speaker to something's being the case (suggesting, putting forward, swearing, boasting, concluding)
- Directives: attempts by the speaker to get the addressee to do something (asking, ordering, requesting, inviting, advising, begging)
- Commissives: committing the speaker to some future course of action (promising, planning, vowing, betting, opposing)
- Expressives: expressing the psychological state of the speaker about a state of affairs (thanking, apologizing, welcoming, deploring)
- Declarations: bringing about a different state of the world via the utterance (I resign; You're fired)
92
Dialogue acts
An act with (internal) structure related specifically to its dialogue function.
- Incorporates ideas of grounding
- Incorporates other dialogue and conversational functions that Austin and Searle didn't seem interested in
93
Verbmobil Dialogue Acts
Tag               Example
THANK             Thanks
GREET             Hello Dan
INTRODUCE         It's me again
BYE               All right, bye
REQUEST-COMMENT   How does that look?
SUGGEST           June 13th through 17th
REJECT            No, Friday I'm booked all day
ACCEPT            Saturday sounds fine
REQUEST-SUGGEST   What is a good day of the week for you?
INIT              I wanted to make an appointment with you
GIVE_REASON       Because I have meetings all afternoon
FEEDBACK          Okay
DELIBERATE        Let me check my calendar here
CONFIRM           Okay, that would be wonderful
CLARIFY           Okay, do you mean Tuesday the 23rd?
94
Verbmobil Dialogue
95
DAMSL: forward looking func.
STATEMENT: a claim made by the speaker
INFO-REQUEST: a question by the speaker
  CHECK: a question for confirming information
INFLUENCE-ON-ADDRESSEE (= Searle's directives)
  OPEN-OPTION: a weak suggestion or listing of options
  ACTION-DIRECTIVE: an actual command
INFLUENCE-ON-SPEAKER (= Austin's commissives)
  OFFER: speaker offers to do something
  COMMIT: speaker is committed to doing something
CONVENTIONAL: other
  OPENING: greetings
  CLOSING: farewells
  THANKING: thanking and responding to thanks
96
DAMSL: backward looking func.
AGREEMENT: speaker's response to previous proposal
  ACCEPT: accepting the proposal
  ACCEPT-PART: accepting some part of the proposal
  MAYBE: neither accepting nor rejecting the proposal
  REJECT-PART: rejecting some part of the proposal
  REJECT: rejecting the proposal
  HOLD: putting off response, usually via subdialogue
ANSWER: answering a question
UNDERSTANDING: whether speaker understood previous
  SIGNAL-NON-UNDER.: speaker didn't understand
  SIGNAL-UNDER.: speaker did understand
    ACK: demonstrated via continuer or assessment
    REPEAT-REPHRASE: demonstrated via repetition or reformulation
    COMPLETION: demonstrated via collaborative completion
98
Automatic Interpretation of Dialogue Acts
How do we automatically identify dialogue acts?
Given an utterance, decide whether it is a QUESTION, STATEMENT, SUGGEST, or ACK.
Perhaps we can just look at the form of the utterance to decide?
99
Can we just use the surface syntactic form?
- YES-NO-Qs have auxiliary-before-subject syntax: Will breakfast be served on USAir 1557?
- STATEMENTs have declarative syntax: I don't care about lunch
- COMMANDs have imperative syntax: Show me flights from Milwaukee to Orlando on Thursday night
100
Surface form != speech act type
Utterance                                 Locutionary force
Can I have the rest of your sandwich?     Question
I want the rest of your sandwich          Declarative
Give me your sandwich!                    Imperative
Illocutionary force (all three): Request
101
Dialogue act disambiguation is hard!
"Who's on First": the Abbott and Costello routine
102
Dialogue act ambiguity
Who's on first?
INFO-REQUEST or STATEMENT?
103
Dialogue Act ambiguity
Can you give me a list of the flights from Atlanta to Boston?
- This looks like an INFO-REQUEST. If so, the answer is: YES.
- But really it's a DIRECTIVE or REQUEST, a polite form of: Please give me a list of the flights...
What looks like a QUESTION can be a REQUEST.
104
Dialogue Act ambiguity
Similarly, what looks like a STATEMENT can be a QUESTION:
Us  OPEN-OPTION  I was wanting to make some arrangements for a trip that I'm going to be taking uh to LA uh beginning of the week after next
Ag  HOLD         OK uh let me pull up your profile and I'll be right with you here. [pause]
Ag  CHECK        And you said you wanted to travel next week?
Us  ACCEPT       Uh yes.
105
Indirect speech acts
- Utterances which use a surface statement to ask a question
- Utterances which use a surface question to issue a request
106
DA interpretation as statistical classification
Lots of clues in each sentence can tell us which DA it is:
- Words and collocations:
  - "Please" or "would you": good cue for REQUEST
  - "Are you": good cue for INFO-REQUEST
- Prosody:
  - Rising pitch is a good cue for INFO-REQUEST
  - Loudness/stress can help distinguish yeah/AGREEMENT from yeah/BACKCHANNEL
- Conversational structure:
  - "Yeah" following a proposal is probably AGREEMENT; "yeah" following an INFORM is probably a BACKCHANNEL
107
Example: CHECKs
Tag questions:
  And it's gonna take us also an hour to load boxcars, right?
  Right
Declarative questions with rising intonation:
  And you said you want to travel next week?
Fragment questions:
  Um, curve round slightly to your right
  To my right?
  Yes
108
Building a “CHECK”-detector
Checks:
- Most often have declarative sentence structure
- Most likely to have rising intonation
- Often have a following question tag ("right?")
- Often are realized as fragments
- Often have the word "you"; often begin with "so" or "oh"
109
How to build a CHECK detector
First build detectors for various features:
- Parsers can tell you if the utterance has declarative structure or not
- Word or N-gram detectors for specific words/phrases
- Speech software for extracting frequency (pitch) and energy (loudness) for the utterance
Then either:
- Hand-written rules: "If it has three of the above 5 features, it's a CHECK", or
- Supervised machine learning:
  - Create a training set; label each sentence CHECK or NOT
  - Run "feature extraction" software as above
  - Train a classifier (regression, decision tree, Naïve Bayes, maximum entropy, SVM, etc.) to predict the class
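The hand-written-rule variant might be sketched as below. The declarative and rising-pitch flags are passed in as booleans because they would come from a parser and from pitch-tracking software, neither of which is modeled here; the four-word cutoff for "fragment" and the function names are assumptions of this sketch.

```python
import re


def check_features(text, declarative, rising_pitch):
    """Compute the five binary CHECK cues for one utterance."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    return {
        "declarative": declarative,    # assumed to come from a parser
        "rising_pitch": rising_pitch,  # assumed to come from pitch tracking
        "question_tag": bool(re.search(r",\s*right\?\s*$", lowered)),
        "fragment": len(words) <= 4,   # crude length-based stand-in
        "cue_word": "you" in words or (bool(words) and words[0] in ("so", "oh")),
    }


def is_check(text, declarative, rising_pitch):
    """Rule from the slide: at least three of the five features are present."""
    return sum(check_features(text, declarative, rising_pitch).values()) >= 3


print(is_check("And you said you want to travel next week?", True, True))  # True
print(is_check("Show me flights to Boston", False, False))                 # False
```

For the supervised alternative, the same feature dictionary could serve as the input vector to any of the classifiers listed above.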
110
Prosodic Decision Tree for making S/QY/QW/QD decision
111
Review: VoiceXML Voice eXtensible Markup Language
An XML-based dialogue design language Makes use of ASR and TTS Deals well with simple, frame-based mixed initiative dialogue. Most common in commercial world (too limited for research systems) But useful to get a handle on the concepts. 11/29/2018
112
Review: sample vxml doc
<form> <field name="transporttype"> <prompt> Please choose airline, hotel, or rental car. </prompt> <grammar type="application/x=nuance-gsl"> [airline hotel (rental car)] </grammar> </field> <block> You have chosen <value expr="transporttype">. </prompt> </block> </form> 11/29/2018
113
Review: a mixed initiative VXML doc
Mixed initiative: user might answer a different question So VoiceXML interpreter can’t just evaluate each field of form in order User might answer field2 when system asked field1 So need grammar which can handle all sorts of input: Field1 Field2 Field 1 and field 2 etc 11/29/2018
114
VXML Nuance-style grammars
Rewrite rules Wantsentence -> I want to (fly|go) Nuance VXML format is: () for concatenation, [] for disjunction Each rule has a name: Wantsentence (I want to [fly go]) Airports [(san francisco) denver] 11/29/2018
115
Mixed-init VXML example (3)
<noinput> I'm sorry, I didn't hear you. <reprompt/> </noinput> <nomatch> I'm sorry, I didn't understand that. <reprompt/> </nomatch> <form> <grammar type="application/x=nuance-gsl"> <![ CDATA[ 11/29/2018
116
Grammar Flight ( ?[ (i [wanna (want to)] [fly go])
(i'd like to [fly go]) ([(i wanna)(i'd like a)] flight) ] [ ( [from leaving departing] City:x) {<origin $x>} ( [(?going to)(arriving in)] City:x) {<dest $x>} ( [from leaving departing] City:x [(?going to)(arriving in)] City:y) {<origin $x> <dest $y>} ?please ) 11/29/2018
117
Grammar City [ [(san francisco) (s f o)] {return( "san francisco, california")} [(denver) (d e n)] {return( "denver, colorado")} [(seattle) (s t x)] {return( "seattle, washington")} ] ]]> </grammar> 11/29/2018
118
An example of a frame Show me morning flights from Boston to SF on Tuesday. SHOW: FLIGHTS: ORIGIN: CITY: Boston DATE: Tuesday TIME: morning DEST: CITY: San Francisco 11/29/2018
119
How to generate this semantics?
Many methods, as we will see in week 9 Simplest: semantic grammars LIST -> show me | I want | can I see|… DEPARTTIME -> (after|around|before) HOUR | morning | afternoon | evening HOUR -> one|two|three…|twelve (am|pm) FLIGHTS -> (a) flight|flights ORIGIN -> from CITY DESTINATION -> to CITY CITY -> Boston | San Francisco | Denver | Washington 11/29/2018
120
Semantics for a sentence
LIST FLIGHTS ORIGIN Show me flights from Boston DESTINATION DEPARTDATE to San Francisco on Tuesday DEPARTTIME morning 11/29/2018
121
Mixed Init dialogue (cont)
<initial name="init"> <prompt> Welcome to the air travel consultant. What are your travel plans? </prompt> </initial> <field name="origin"> <prompt> Which city do you want to leave from? </prompt> <filled> <prompt> OK, from <value expr="origin"/> </prompt> </filled> </field> 11/29/2018
122
Mixed init dialogue continued
<field name="dest"> <prompt> And which city do you want to go to? </prompt> <filled> <prompt> OK, to <value expr="dest"/> </prompt> </filled> </field> <block> <prompt> OK, I have you are departing from <value expr="origin"/> to <value expr="dest"/>. </prompt> send the info to book a flight... </block> </form> 11/29/2018
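The control flow behind this form can be sketched in Python: any slot the grammar returns is stored, whichever field was prompted, and the next prompt is for the first field still empty. This is a simplified illustration, not the VoiceXML form interpretation algorithm itself; `merge_slots` and `next_prompt` are hypothetical names.

```python
# Field prompts copied from the form above, in document order.
PROMPTS = {
    "origin": "Which city do you want to leave from?",
    "dest": "And which city do you want to go to?",
}

def merge_slots(form, slots):
    """Store every recognised slot, whichever field was being prompted."""
    updated = dict(form)
    updated.update({k: v for k, v in slots.items() if k in PROMPTS})
    return updated

def next_prompt(form):
    """Return the prompt of the first unfilled field, or None when done."""
    for name, prompt in PROMPTS.items():
        if name not in form:
            return prompt
    return None

if __name__ == "__main__":
    form = {}
    # The user answers the opening question with only a destination.
    form = merge_slots(form, {"dest": "denver, colorado"})
    print(next_prompt(form))   # asks for the origin, skipping dest
    form = merge_slots(form, {"origin": "san francisco, california"})
    print(next_prompt(form))   # nothing left to ask
```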
123
Dialogue system Evaluation
Whenever we design a new algorithm or build a new application, need to evaluate it How to evaluate a dialogue system? What constitutes success or failure for a dialogue system? 11/29/2018
124
Task Completion Success
% of subtasks completed Correctness of each question/answer/error message Correctness of total solution 11/29/2018
125
Task Completion Cost Completion time in turns/seconds
Number of queries Turn correction ratio: number of system or user turns used solely to correct errors, divided by total number of turns Inappropriateness (verbose, ambiguous) of system’s questions, answers, error messages 11/29/2018
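The turn correction measure is a simple proportion. A minimal Python sketch, assuming each turn has already been hand-labelled as correction-only or not (the labelling scheme and the function name are assumptions):

```python
def turn_correction_ratio(correction_flags):
    """Correction-only turns divided by total turns.

    correction_flags: one boolean per system or user turn, True when the
    turn was used solely to correct an error.
    """
    if not correction_flags:
        raise ValueError("dialogue has no turns")
    return sum(correction_flags) / len(correction_flags)

if __name__ == "__main__":
    # Eight turns, two of which only repaired a misrecognition.
    flags = [False, False, True, False, True, False, False, False]
    print(turn_correction_ratio(flags))
```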
126
User Satisfaction Were answers provided quickly enough?
Did the system understand your requests the first time? Do you think a person unfamiliar with computers could use the system easily? 11/29/2018
127
User-centered dialogue system design
Early focus on users and task: interviews, study of human-human task, etc. Build prototypes: Wizard of Oz systems Iterative Design: iterative design cycle with embedded user testing 11/29/2018
128
On the way to more powerful dialogue systems
Grounding Performing grounding Recognizing user’s grounding Dialogue Acts Using correct dialogue acts Recognizing user’s dialogue acts Intention Recognizing user’s intentions 11/29/2018
129
Conversational Implicature
A: And, what day in May did you want to travel? C: OK, uh, I need to be there for a meeting that’s from the 12th to the 15th. Note that client did not answer question. Meaning of client’s sentence: Meeting Start-of-meeting: 12th End-of-meeting: 15th Doesn’t say anything about flying!!!!! What is it that licenses agent to infer that client is mentioning this meeting so as to inform the agent of the travel dates? 11/29/2018
130
Conversational Implicature (2)
A: … there’s 3 non-stops today. This would still be true if there were 7 non-stops today. But no, the agent means: 3 and only 3. How can the client infer that the agent means only 3? 11/29/2018
131
Grice: conversational implicature
Implicature means a particular class of licensed inferences. Grice (1975) proposed that what enables hearers to draw correct inferences is: Cooperative Principle This is a tacit agreement by speakers and listeners to cooperate in communication 11/29/2018
132
4 Gricean Maxims Relevance: Be relevant
Quantity: Do not make your contribution more or less informative than required Quality: try to make your contribution one that is true (don’t say things that are false or for which you lack adequate evidence) Manner: Avoid ambiguity and obscurity; be brief and orderly 11/29/2018
133
Relevance A: Is Regina here? B: Her car is outside. Implication: yes
Hearer thinks: why would he mention the car? It must be relevant. How could it be relevant? It could be, since if her car is here she is probably here. Client: I need to be there for a meeting that’s from the 12th to the 15th Hearer thinks: Speaker is following maxims, would only have mentioned meeting if it was relevant. How could meeting be relevant? If client meant me to understand that he had to depart in time for the meeting. 11/29/2018
134
Quantity A: How much money do you have on you? B: I have 5 dollars
Implication: not 6 dollars Similarly, 3 non-stops can’t mean 7 non-stops (hearer thinks: if speaker meant 7 non-stops she would have said 7 non-stops) A: Did you do the reading for today’s class? B: I intended to Implication: No B’s answer would be true if B intended to do the reading AND did the reading, but would then violate the maxim of Quantity 11/29/2018
135
Planning-based Conversational Agents
How to do the kind of Gricean inference that could solve the problems we’ve discussed? Researchers who work on this use sophisticated AI models of planning and reasoning. Involves planning, plus various extensions to logic to create logic for Belief, Desire, Intention. These are called BDI models (belief, desire, intention) 11/29/2018
136
BDI Logic B(S,P) = “speaker S believes proposition P”
KNOW(S,P) = P and B(S,P) KNOWIF(S,P) =“S knows whether P” = KNOW(S,P) or KNOW(S,notP) W(S,P) “S wants P to be true”, where P is a state or the execution of some action W(S,ACT(H)) = S wants H to do ACT 11/29/2018
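The definitions of KNOW and KNOWIF can be checked mechanically in a toy model. The Python sketch below assumes a world given as a set of true propositions and a belief set per speaker, with negation encoded as a tagged tuple; all of these representation choices are assumptions for illustration, not part of the BDI formalism.

```python
def NOT(p):
    """Negation of a proposition, encoded as a tagged tuple (an assumption)."""
    return ("not", p)

def B(beliefs, s, p):
    """B(S,P): speaker s believes proposition p."""
    return p in beliefs.get(s, set())

def KNOW(world, beliefs, s, p):
    """KNOW(S,P) = P and B(S,P)."""
    return p in world and B(beliefs, s, p)

def KNOWIF(world, beliefs, s, p):
    """KNOWIF(S,P) = KNOW(S,P) or KNOW(S,notP)."""
    return KNOW(world, beliefs, s, p) or KNOW(world, beliefs, s, NOT(p))

# Toy model: flight F1 has seats and is not a non-stop.
world = {"has_seats(F1)", NOT("nonstop(F1)")}
beliefs = {
    "agent": {"has_seats(F1)", NOT("nonstop(F1)")},
    "client": {"nonstop(F1)"},   # a false belief
}

if __name__ == "__main__":
    print(KNOW(world, beliefs, "agent", "has_seats(F1)"))
    print(KNOW(world, beliefs, "client", "nonstop(F1)"))
    print(KNOWIF(world, beliefs, "agent", "nonstop(F1)"))
```

The client believes the flight is non-stop but does not know it, because the proposition is false in the world; the agent knows whether it is non-stop because the agent correctly believes the negation.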
137
How to represent actions
Preconditions: Conditions that must already be true in order to successfully perform the action Effects: conditions that become true as a result of successfully performing the action Body: A set of partially ordered goal states that must be achieved in performing the action 11/29/2018
138
How to represent the action of going to the beach
GOTOBEACH(P,B) Constraints: Person(P) & Beach(B) & Car(C) Precondition: Know(P,location(B)) & Have(P,C) & working(C) & Want(P,AtBeach(P,B)) &… Effect: AtBeach(P,B) Body: Drive(P,C) 11/29/2018
139
How to represent the action of booking a flight
BOOK-FLIGHT(A,C,F) Constraints: Agent(A) & Flight(F) & Client(C) Precondition: Know(A,dep-date(F)) & Know(A,dep-time(F)) & Know(A,origin(F)) & Has-Seats(F) & W(C,BOOK-FLIGHT(A,C,F)) & … Effect: Flight-Booked(A,C,F) Body: Make-Reservation(A,F,C) 11/29/2018
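An action schema of this kind maps naturally onto a small data structure. The Python sketch below is one possible encoding, with propositions as plain strings and a state as a set of them; the `ActionSchema` class and its methods are hypothetical. It lists only the preconditions the slide names and does not fill in the elided ones.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ActionSchema:
    name: str
    constraints: frozenset
    preconditions: frozenset
    effects: frozenset
    body: tuple

    def applicable(self, state):
        """True when every constraint and precondition holds in the state."""
        return (self.constraints | self.preconditions) <= state

    def apply(self, state):
        """Return the state after a successful execution (effects added)."""
        if not self.applicable(state):
            raise ValueError(f"{self.name}: preconditions not met")
        return frozenset(state) | self.effects

# BOOK-FLIGHT with the variables A, C, F left as literal symbols.
BOOK_FLIGHT = ActionSchema(
    name="BOOK-FLIGHT(A,C,F)",
    constraints=frozenset({"Agent(A)", "Flight(F)", "Client(C)"}),
    preconditions=frozenset({
        "Know(A,dep-date(F))",
        "Know(A,dep-time(F))",
        "Know(A,origin(F))",
        "Has-Seats(F)",
        "W(C,BOOK-FLIGHT(A,C,F))",
    }),
    effects=frozenset({"Flight-Booked(A,C,F)"}),
    body=("Make-Reservation(A,F,C)",),
)

if __name__ == "__main__":
    state = BOOK_FLIGHT.constraints | BOOK_FLIGHT.preconditions
    print("Flight-Booked(A,C,F)" in BOOK_FLIGHT.apply(state))
```

A planner can use `applicable` to find which preconditions are still missing; an unmet `Know(A,dep-date(F))`, for example, is what would prompt the agent to ask the client for a date.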
140
Speech acts INFORM(S,H,P)
Constraints: Speaker(S) & Hearer(H) & Proposition(P) Precondition: Know(S,P) & W(S,INFORM(S,H,P)) Effect: Know(H,P) Body: B(H,W(S,Know(H,P))) 11/29/2018
141
Speech acts REQUEST-INFORM(A,C,I) Constraints: Agent(A) & Client(C)
Precondition: Know(C,I) Effect: Know(A,I) Body: B(C,W(A,Know(A,I))) 11/29/2018
142
How a plan-based conversational agent works
While conversation is not finished
  If user has completed a turn
    Then interpret user’s utterance
  If system has obligations
    Then address obligations
  Else if system has turn
    Then if system has intended conversation acts
      Then call generator to produce utterances
    Else if some material is ungrounded
      Then address grounding situation
    Else if high-level goals are unsatisfied
      Then address goals
    Else release turn or attempt to end conversation
  Else if no one has turn or long pause
    Then take turn
11/29/2018
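The priority ordering in this loop can be sketched as a single decision function in Python. The `DialogueState` fields and the returned labels are assumptions made for illustration, and the final "wait" case (the user holds the turn and there is no long pause) is an addition the pseudocode leaves implicit.

```python
from dataclasses import dataclass, field

@dataclass
class DialogueState:
    obligations: list = field(default_factory=list)
    turn_holder: str = "system"          # "system", "user", or "none"
    intended_acts: list = field(default_factory=list)
    ungrounded: list = field(default_factory=list)
    unsatisfied_goals: list = field(default_factory=list)
    long_pause: bool = False

def choose_action(state):
    """One pass through the body of the loop, after interpretation."""
    if state.obligations:
        return "address obligations"
    if state.turn_holder == "system":
        if state.intended_acts:
            return "generate utterances"
        if state.ungrounded:
            return "address grounding"
        if state.unsatisfied_goals:
            return "address goals"
        return "release turn or end conversation"
    if state.turn_holder == "none" or state.long_pause:
        return "take turn"
    return "wait"  # assumption: nothing to do while the user holds the turn

if __name__ == "__main__":
    # A state with no obligations, the system holding the turn, one
    # unacknowledged user act, and two open discourse goals.
    state = DialogueState(
        ungrounded=["INFORM-1"],
        unsatisfied_goals=["get-travel-goal", "create-travel-plan"],
    )
    print(choose_action(state))
```

Obligations are checked before turn-taking, so the system answers a pending question even when it would otherwise pursue its own goals.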
143
Plan-based agent data Queue of conversation acts it needs to generate, based on: Grounding: need to ground previous utterance Dialogue obligations: answer questions, perform commands Goals: agent must reason about its own goals 11/29/2018
144
A made-up example C: I want to go to Pittsburgh in May
System current state: Discourse obligations: NONE Turn holder: system Intended speech acts: NONE Unacknowledged speech acts: INFORM-1 Discourse goals: get-travel-goal, create-travel-plan 11/29/2018
145
A made-up example System decides to add 2 conversation acts to queue:
Acknowledge user’s inform act Ask next travel-goal question of user How? Given goal “get-travel-goal”, the REQUEST-INFORM action schema tells the system that asking the user something is one way of finding out. 11/29/2018
146
A made-up example System current state:
Discourse obligations: NONE Turn holder: system Intended speech acts: REQUEST-INFORM-1, ACKNOWLEDGE-1 Unacknowledged speech acts: INFORM-1 Discourse goals: get-travel-goal, create-travel-plan These would be combined by a clever generator into: And, what day in May did you want to travel? 11/29/2018
147
A made-up example C .. I don’t think there’s many options for non-stop. Assume DA interpreter correctly interprets this as REQUEST-INFORM-3 Discourse obligations: address(REQUEST-INFORM-3) Turn holder: system Intended speech acts: NONE Unacknowledged speech acts: REQUEST-INFORM-3 Discourse goals: get-travel-goal, create-travel-plan Manager would address discourse goal by calling planner to find out how many non-stop flights there are. Also needs to ground. 11/29/2018
148
A made-up example C .. I don’t think there’s many options for non-stop. Since this was in the form of an indirect request, we can do an ACKNOWLEDGEMENT (if a direct request, we would do ANSWER-YES). Also need to answer the question: Right. There’s three non-stops today. 11/29/2018
149
Summary 3 kinds of conversational agents Dialogue Phenomena
Finite-state: VoiceXML Form-based: VoiceXML Planning: Only in the research lab Dialogue Phenomena Grounding Dialogue Acts Implicature 11/29/2018