Teaching Machines to Converse


1 Teaching Machines to Converse
Jiwei Li Computer Science Department Stanford University

2 Collaborators Bill Dolan Microsoft Research Dan Jurafsky Stanford
Alan Ritter Ohio State University Chris Brockett Microsoft Research Jason Weston Facebook AI Research Alexander Miller Facebook AI Research Sumit Chopra Facebook AI Research Marc'Aurelio Ranzato Facebook AI Research Michel Galley Microsoft Research Will Monroe Stanford Jianfeng Gao Microsoft Research

3

4 Borrowed From Bill MacCartney’s slides
Many people consider Siri a big breakthrough in AI. Borrowed From Bill MacCartney’s slides

5 Does Siri really understand language ?
Many of us have seen this. Borrowed From Bill MacCartney’s slides

6 Does Siri really understand language ?
Cole-bear Borrowed From Bill MacCartney’s slides

8 Does Siri really understand language ?
Many of us have seen this. Borrowed From Bill MacCartney’s slides


13 Does Siri really understand language ?
Many of us have seen this. Slide Borrowed From Bill MacCartney

14 Slide From Bill MacCartney
How well a machine can talk with humans has long been associated with the general success of AI. The attempt to develop a chatbot dates back to the early days of AI. Slide From Bill MacCartney

15 Why is building a chatbot hard ?
Computers need to understand what you ask.

16 Why is building a chatbot hard ?
Computers need to understand what you ask. Computers need to generate coherent, meaningful sequences in response to what you ask,

17 Why is building a chatbot hard ?
Computers need to understand what you ask. Computers need to generate coherent, meaningful sequences in response to what you ask, which requires domain knowledge, discourse knowledge, and world knowledge.

18 Background

19 Background Goal Oriented Tasks
They are expensive and hard to extend to open-domain scenarios (Ritter et al., 2010; Sordoni et al., 2015; Vinyals and Le, 2015)

20 Background Goal Oriented Tasks (Levin et al., 1997;
They are expensive and hard to extend to open-domain scenarios (Levin et al., 1997; Young et al., 2013; Walker, 2000) (Ritter et al., 2010; Sordoni et al., 2015; Vinyals and Le, 2015)

21 Outline Mutual Information for Response Generation. (Chitchat)
How to preserve Speaker Consistency (Chitchat) Reinforcement learning for Response Generation (Chitchat) Teaching a bot to ask questions (Goal-oriented)

22 Mutual Information for Response Generation.

23 Seq2Seq Models for Response Generation
(Sutskever et al., 2014; Jean et al., 2014; Luong et al., 2015) Source : Input Messages Target : Responses fine . I’m EOS We can adapt this framework to response generation, in which input messages are sources and output responses are targets Encoding Decoding how are you ? eos I’m fine .

24 Seq2Seq Models for Response Generation
how are you ?

25 Seq2Seq Models for Response Generation
Encoding how are you ?


29 Seq2Seq Models for Response Generation
Encoding Decoding how are you ?

30 Seq2Seq Models for Response Generation
I’m Encoding Decoding how are you ? eos

31 Seq2Seq Models for Response Generation
I’m fine Encoding Decoding how are you ? eos I’m

32 Seq2Seq Models for Response Generation
I’m fine . Encoding Decoding how are you ? eos I’m fine

33 Seq2Seq Models for Response Generation
I’m fine . EOS Encoding Decoding how are you ? eos I’m fine .

34 Seq2Seq Models as a Backbone
I’m fine . EOS Encoding Decoding how are you ? eos I’m fine .
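To make this concrete, here is a minimal sketch of such an encoder-decoder in PyTorch. The layer sizes, the single-layer LSTMs, and the class interface are illustrative assumptions, not the configuration used in these experiments.

import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    # Minimal encoder-decoder for response generation (illustrative sizes).
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src, tgt):
        # Encode the source message ("how are you ? eos") into a final state.
        _, state = self.encoder(self.embed(src))
        # Decode the response ("I'm fine . EOS") conditioned on that state.
        dec_out, _ = self.decoder(self.embed(tgt), state)
        return self.out(dec_out)  # per-step logits over the vocabulary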

35 Mutual Information for Response Generation.
Li et al., A Diversity-Promoting Objective Function for Neural Conversation Models (to appear, NAACL 2016). The “I don’t know” problem (Sordoni et al., 2015; Serban et al., 2015)

36 Mutual Information for Response Generation.
Li et al., A Diversity-Promoting Objective Function for Neural Conversation Models (to appear, NAACL 2016). The “I don’t know” problem (Sordoni et al., 2015; Serban et al., 2015). How old are you ? I don’t know .

37 I don’t know what you are talking about.
Mutual Information for Response Generation. Li et al., A Diversity-Promoting Objective Function for Neural Conversation Models (to appear, NAACL 2016). The “I don’t know” problem (Sordoni et al., 2015; Serban et al., 2015). How is life ? I don’t know what you are talking about. Roughly 30% of all generated responses show this behavior.

38 I don’t know what you are talking about.
Mutual Information for Response Generation. Li et al., A Diversity-Promoting Objective Function for Neural Conversation Models (to appear, NAACL 2016). The “I don’t know” problem (Sordoni et al., 2015; Serban et al., 2015). Do you love me ? I don’t know what you are talking about. Roughly 30% of all generated responses are dull responses of this kind.

39 Mutual Information for Response Generation.
def chatbot(string):
    if string.endswith("?"):
        return "I don't know"
    else:
        return "I don't know what you are talking about"
It is not unreasonable to generate such responses, but developing a chatbot is not just about generating reasonable responses.

40 Mutual Information for Response Generation.
Solution #1: Adding Rules

41 Mutual Information for Response Generation.
Solution #1: Adding Rules I don’t know . I don’t know .. I don’t know … ... I don’t know ! I don’t know ! ! I don’t know ! ! ! Regular expression matching

42 Mutual Information for Response Generation.
Solution #1: Adding Rules
I don’t know .
I don’t know ..
I don’t know … ...
I don’t know !
I don’t know ! !
I don’t know ! ! !
I have no idea .
I don’t have a clue .
I haven’t the faintest idea .
How should I know ?
I don’t have the foggiest idea what you are talking about .
I don’t have the slightest idea what you are talking about .
Unfortunately, deep learning models manage to cluster all phrases that are semantically related to “I don’t know”. The most comprehensive “I don’t know” list I have ever seen …

43 Mutual Information for Response Generation.
Solution #1: Adding Rules (the same list of “I don’t know” variants as above). Rules don’t work !!

44 Mutual Information for Response Generation.

46 Mutual Information for Response Generation.
“I don’t know” Whatever one asks

47 Mutual Information for Response Generation.
“I don’t know” What one asks

48 Mutual Information for Response Generation.
“I don’t know” What one asks The other way around “I don’t know” What one asks

49 Mutual Information for Response Generation.


52 Mutual Information for Response Generation.
Bayes’ Rule

53 Mutual Information for Response Generation.
Bayes’ Rule. Standard Seq2Seq model

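Written out, the objective from the NAACL 2016 paper (a reconstruction; \lambda is the tradeoff weight between the two terms):

\hat{T} = \arg\max_{T} \left\{ \log p(T \mid S) - \lambda \log p(T) \right\}
        = \arg\max_{T} \left\{ (1 - \lambda) \log p(T \mid S) + \lambda \log p(S \mid T) \right\}

The second form follows from Bayes' rule (the dropped \lambda \log p(S) term is constant in T). Requiring the response T to also predict the source S back penalizes generic responses like “I don’t know”, which are likely under p(T) but carry no information about S.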

57 Datasets and Evaluations
Datasets: Twitter Conversation Dataset (23M pairs); OpenSubtitles movie script dataset (80M pairs)

58 Datasets and Evaluations
Datasets: Twitter Conversation Dataset (23M pairs); OpenSubtitles movie script dataset (80M pairs). Evaluations: BLEU (Papineni et al., 2002); #Distinct tokens; Human Evaluation (1000 samples, each output is evaluated by 7 judges). Example:

59 Datasets and Evaluations
BLEU improvements (chart): +26.4%, +51.3%, +12.7%, +35.0%, +22.5%

60 Datasets and Evaluations
BLEU on Twitter Dataset (chart): +21.1%, +12.7%

61 Datasets and Evaluations
# Distinct tokens in generated targets (divided by total #) on the OpenSubtitles dataset (chart): +385%, +122%
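The distinct-token metric above is simple to compute. A sketch, assuming whitespace tokenization over the generated targets:

def distinct_tokens(responses):
    # Number of distinct tokens in the generated targets, divided by the total count.
    tokens = [tok for resp in responses for tok in resp.split()]
    return len(set(tokens)) / len(tokens)

print(distinct_tokens(["i don 't know .", "i 'm fine ."]))  # 7 distinct / 9 total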

62 Human Evaluation

64 Sampled Results: Standard Seq2Seq p(t|s) vs. Mutual Information

65 Outline Mutual Information for Response Generation.
Speaker Consistency Reinforcement learning for Response Generation Teaching a bot to ask questions

66 Speaker Consistency Li et al., A Persona-Based Neural Conversation Model (ACL 2016)

67 Speaker Consistency How old are you ? I’m 8 .

68 Speaker Consistency How old are you ? I’m 8 . What’s your age? 18

69 Speaker Consistency Where do you live now? I live in Los Angeles.

70 In which city do you live now?
Speaker Consistency Where do you live now? I live in Los Angeles. In which city do you live now? I live in Paris.

71 Speaker Consistency Where do you live now? I live in Los Angeles.
In which city do you live now? I live in Paris. In which country do you live now? England, you?

72 Speaker Consistency How old are you ? I’m 8.

73 How many kids do you have ?
Speaker Consistency How old are you ? I’m 8. How many kids do you have ? 4, you ?

74 Speaker Consistency When were you born ? In 1942.

75 When was your mother born ?
Speaker Consistency When were you born ? In 1942. When was your mother born ? In 1966.

76 How to represent users Persona embeddings (70k) Bob

77 How to represent users: Word embeddings (50k) and Persona embeddings (70k)
(Embedding-space illustration: word embeddings such as uk, london, sydney, great, good, stay, live, okay, monday, tuesday; a persona embedding for user Bob.)

78 Persona seq2seq model Encoding Decoding where do you live EOS

79 Persona seq2seq model Encoding Decoding Bob where do you live EOS Bob
Persona embeddings (70k) Bob

80 Persona seq2seq model Encoding Decoding Bob in where do you live EOS
Persona embeddings (70k) Bob

81 Persona seq2seq model Encoding Decoding Bob Bob in uk where do you
live EOS in Bob Persona embeddings (70k)

82 Persona seq2seq model Encoding Decoding Bob Bob Bob in uk . where do
you live EOS in uk Bob Persona embeddings (70k)

83 Persona seq2seq model Encoding Decoding Bob Bob Bob Bob in uk . EOS
where do you live EOS in uk . If you ask one user 100 questions, the 100 responses you generate are not independent, because the same user representation is incorporated into every response. Word embeddings (50k). Persona embeddings (70k): Bob
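A minimal sketch of how the persona embedding can be injected at every decoding step, in PyTorch. The dimensions and the concatenation scheme are illustrative assumptions.

import torch
import torch.nn as nn

class PersonaDecoder(nn.Module):
    def __init__(self, vocab_size, n_speakers, embed_dim=256, persona_dim=128, hidden_dim=512):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        self.persona = nn.Embedding(n_speakers, persona_dim)  # one vector per user (70k in the talk)
        self.lstm = nn.LSTM(embed_dim + persona_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt, speaker_id, enc_state):
        words = self.word_embed(tgt)                     # (B, T, embed_dim)
        persona = self.persona(speaker_id).unsqueeze(1)  # (B, 1, persona_dim)
        persona = persona.expand(-1, words.size(1), -1)  # repeated at every step
        out, _ = self.lstm(torch.cat([words, persona], dim=-1), enc_state)
        return self.out(out)

Because the same persona vector conditions every decoding step, the answers generated for a given user are tied together, which is what pushes the model toward speaker consistency.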

84 Interaction Seq2Seq model
Encoding where do you live. Models speaker-addressee interaction patterns within the conversation: add speaker and addressee embeddings.

85 Interaction Seq2Seq model
Encoding where do you live tanh(W1·vi + W2·vj)

86 Interaction Seq2Seq model
Encoding Decoding where do you live EOS tanh(W1·vi + W2·vj)

87 Interaction Seq2Seq model
uk Encoding Decoding where do you live EOS in tanh(W1·vi + W2·vj)

88 Interaction Seq2Seq model
uk . Encoding Decoding where do you live EOS in uk
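The interaction shown on these slides combines the speaker vector v_i with the addressee vector v_j (as in the accompanying paper):

V_{i,j} = \tanh\left( W_1 v_i + W_2 v_j \right)

V_{i,j} is then fed into the decoder at each step in place of a single persona vector, letting the model adapt its responses to who is speaking to whom.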

89 Datasets and Evaluations
Conversations from Twitter: 28M turns; 74,003 users, each with a minimum of 60 conversational turns. Evaluations: Perplexity; BLEU (4,000 examples, single reference); Human Evaluation

90 Quantitative Results (Seq2Seq vs. Speaker Model)
Perplexity: 47.2 vs. 42.2 (-10.6%)
BLEU (without MMI): 0.92 vs. 1.12 (+21.7%)
BLEU (with MMI): 1.41 vs. 1.66 (+11.7%)

91 Human Evaluation Question Pairs

92 Human Evaluation Question Pairs What city do you live in ?
What country do you live in ?

93 Human Evaluation Question Pairs What city do you live in ?
What country do you live in ? Are you vegan or vegetarian ? Do you eat beef ?

94 Human Evaluation Question Pairs What city do you live in ?
What country do you live in ? London/UK London/US

95 Human Evaluation Which Model produces more consistent answers ?
Each item is given to 5 judges. Ties are discarded. (Scoring sheet: Seq2Seq Model vs. Persona Model; Item1: +1; Item2: …)

96 Human Evaluation: Seq2Seq Model 0.84 vs. Persona Model 1.33 (+34.7%)

97 Results (No cherry-picking)


101 Issues How do we handle long-term dialogue success?

102 Outline Mutual Information for Response Generation.
Speaker Consistency Reinforcement learning for Response Generation Teaching a bot to ask questions

103 Issues How do we handle long-term dialogue success?
Problem 1: Dull and generic responses.

104 I don’t know what you are talking about.
Issues Problem 1: Dull and generic responses. The “I don’t know” problem (Sordoni et al., 2015; Serban et al., 2015). Do you love me ? I don’t know what you are talking about.

105 Issues How do we handle long-term dialogue success?
Problem 1: Dull and generic responses. Problem 2: Repetitive responses.

106 Problem 2: Repetitive responses.
Shut up !

107 Problem 2: Repetitive responses.
Shut up ! No, you shut up !

108 Problem 2: Repetitive responses.
Shut up ! No, you shut up ! No, you shut up !

109 Problem 2: Repetitive responses.
Shut up ! No, you shut up ! No, you shut up ! No, you shut up !

110 …… Problem 2: Repetitive responses. Shut up ! No, you shut up !

111 …… Problem 2: Repetitive responses. See you later ! See you later !

112 Issues How do we handle long-term dialogue success?
Problem 1: Dull and generic responses. Problem 2: Repetitive responses. Problem 3: Short-sighted conversation decisions.

113 Problem 3: Short-sighted conversation decisions.
How old are you ?

114 Problem 3: Short-sighted conversation decisions.
How old are you ? i 'm 16 .

115 Problem 3: Short-sighted conversation decisions.
How old are you ? i 'm 16 . 16 ?

116 Problem 3: Short-sighted conversation decisions.
How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about

118 Problem 3: Short-sighted conversation decisions.
How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about you don 't know what you 're saying

119 Problem 3: Short-sighted conversation decisions.
How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about you don 't know what you 're saying i don 't know what you 're talking about

120 Problem 3: Short-sighted conversation decisions.
How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about you don 't know what you 're saying i don 't know what you 're talking about you don 't know what you 're saying

121 Problem 3: Short-sighted conversation decisions.
Bad Action How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about you don 't know what you 're saying i don 't know what you 're talking about you don 't know what you 're saying

122 Problem 3: Short-sighted conversation decisions.
How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about you don 't know what you 're saying i don 't know what you 're talking about you don 't know what you 're saying Outcome

123 Can reinforcement learning handle this?
How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about you don 't know what you 're saying i don 't know what you 're talking about you don 't know what you 're saying Outcome does not emerge until a few turns later

124 Can reinforcement learning handle this?

125 Notations for Reinforcement Learning

126 Notations: State How old are you ? how old are you Encoding

127 Notations: Action How old are you ? i 'm 16 .

128 Notations: Reward How old are you ? i 'm 16 .

129 Notations: Reward 1. Ease of answering

131 Notations: Reward 1. Ease of answering
We propose to measure the ease of answering a generated turn by using the negative log likelihood of responding to that utterance with a dull response S: ”I don’t know what you are talking about”

132 Notations: Reward 2. Information Flow

133 Notations: Reward 2. Information Flow See you later ! See you later !

134 Notations: Reward 2. Information Flow See you later ! S1

135 Notations: Reward 3. Meaningfulness S1 How old are you ? S2 i 'm 16 .

136 Notations: Reward Easy to answer R1 Information Flow R2
Meaningfulness R3
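Spelled out (a reconstruction from the published version of this work; S is a hand-built set of dull responses such as “I don’t know what you are talking about”, h_{p_i} is the encoder representation of turn p_i, and a is the candidate response):

r_1 = -\frac{1}{N_{\mathbb{S}}} \sum_{s \in \mathbb{S}} \frac{1}{N_s} \log p_{\text{seq2seq}}(s \mid a)  (ease of answering: dull follow-ups should be unlikely)

r_2 = -\log \frac{h_{p_i} \cdot h_{p_{i+1}}}{\lVert h_{p_i} \rVert \, \lVert h_{p_{i+1}} \rVert}  (information flow: penalize semantic repetition across turns)

r_3 = \frac{1}{N_a} \log p_{\text{seq2seq}}(a \mid q_i, p_i) + \frac{1}{N_{q_i}} \log p^{\text{backward}}_{\text{seq2seq}}(q_i \mid a)  (meaningfulness: mutual information with the context)

r = \lambda_1 r_1 + \lambda_2 r_2 + \lambda_3 r_3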

137 A message from training set
Simulation A message from training set

138 A message from training set
Simulation Encode A message from training set

139 A message from training set
Simulation Encode A message from training set Decode r1

140 A message from training set
Simulation Encode A message from training set Encode Decode r1

141 A message from training set
Simulation Encode A message from training set Encode Decode r1 Decode r2

142 … Turn 1 Turn 2 Turn N Input Message Encode Decode Encode Decode
Sn

143 … Compute Accumulated Reward R(S1,S2,…,Sn) Turn 1 Turn 2 Turn N Input
Message Encode Decode Encode Decode Encode Decode S1 S2 Sn Compute Accumulated Reward R(S1,S2,…,Sn)
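A minimal sketch of this simulation loop; the agent interface and the reward hooks are assumed names, not the released code.

def simulate(agent, message, n_turns=5):
    # Roll out a dialogue: the bot talks with a copy of itself,
    # starting from a message sampled from the training set.
    turns = [message]
    for _ in range(n_turns):
        turns.append(agent.respond(turns))  # encode the history, decode turn S_i
    return turns

def accumulated_reward(turns, reward_fns, weights):
    # R(S1, S2, ..., Sn): weighted sum of the per-turn rewards.
    total = 0.0
    for i in range(1, len(turns)):
        for fn, w in zip(reward_fns, weights):
            total += w * fn(turns[i], turns[:i])  # reward of turn S_i given its history
    return total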

144 … Easy to answer Turn 1 Turn 2 Turn N Input Message Encode Decode
Sn Easy to answer

145 … Easy to answer R1 Information Flow R2 Turn 1 Turn 2 Turn N Input
Message Encode Decode Encode Decode Encode Decode S1 S2 S3 Easy to answer R1 Information Flow R2

146 … Easy to answer R1 Information Flow R2 Meaningfulness R3 Turn 1
Turn N Input Message Encode Decode Encode Decode Encode Decode S1 S2 S3 Easy to answer R1 Information Flow R2 Meaningfulness R3


149 … Easy to answer R1 Information Flow R2 Meaningfulness R3
Turn 1 Turn 2 Turn N Input Message Encode Decode Encode Decode Encode Decode S1 S2 S3 Easy to answer R1 Information Flow R2 Meaningfulness R3 Compute Accumulated Reward R(S1,S2,…,Sn)

150 … REINFORCE Algorithm (Williams, 1992) Turn 1 Turn 2 Turn N Input
Message Encode Decode Encode Decode Encode Decode S1 S2 Sn REINFORCE Algorithm (Williams, 1992)

156 … REINFORCE Algorithm (Williams, 1992) Turn 1 Turn 2 Turn N Input
Message Encode Decode Encode Decode Encode Decode S1 S2 Sn REINFORCE Algorithm (Williams, 1992) What we want to learn
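The gradient is the standard REINFORCE estimator: sample a full dialogue from the current policy p_\theta, score it with the accumulated reward, and reinforce the sampled turns in proportion to that score,

\nabla J(\theta) \approx \big( R(S_1, \dots, S_n) - b \big) \, \nabla_\theta \sum_{i} \log p_\theta(S_i \mid S_{<i})

where b is a baseline subtracted to reduce the variance of the estimate (a sketch; the exact baseline is not shown on the slides).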

157 Details 1. Initialize policy using a pre-trained Sequence to Sequence model

158 Details 1. Initialize policy using a pre-trained Sequence to Sequence model 2. Curriculum learning: gradually increase the number of simulated turns.

159 Evaluation Baselines: Vanilla Seq2seq model
Mutual information Seq2seq model

160 Evaluation 1. Number of Simulated Turns

161 Evaluation 2. Diversity

162 Evaluation


166 Results
Input | Mutual Information | The proposed model
How old are you ? | I’m 16. | I’m 16. why are you asking ?
what is your full name ? | i have no idea | what 's yours ?
I don 't want to go home tonight . | Really ? | Why ?
Do you have any feelings for me ? | I don’t know what you are talking about. | Would I see you if I didn 't ?


170 Simulation How old are you ?

171 Simulation How old are you ? i 'm 16, why are you asking ?

172 Simulation How old are you ? i 'm 16, why are you asking ?
I thought you were 12 .

173 Simulation How old are you ? i 'm 16, why are you asking ?
I thought you were 12 . What made you think so ?

174 I don’t know what you are talking about .
Simulation How old are you ? i 'm 16, why are you asking ? I thought you were 12 . What made you think so ? I don’t know what you are talking about .

175 Simulation How old are you ? i 'm 16, why are you asking ?
I thought you were 12 . What made you think so ? I don’t know what you are talking about . You don’t know what you are saying.

176 Simulation How old are you ? i 'm 16, why are you asking ?
I thought you were 12 . What made you think so ? I don’t know what you are talking about . You don’t know what you are saying. I don’t know what you are talking about .

177 Simulation Survives 4 turns !! How old are you ?
i 'm 16, why are you asking ? I thought you were 12 . What made you think so ? I don’t know what you are talking about . You don’t know what you are saying. I don’t know what you are talking about .

178 Ongoing Work Better automatic evaluation metrics. (BLEU ? Perplexity ?)

179 Ongoing Work Turing Test
Better automatic evaluation metrics. (BLEU ? Perplexity ?) Turing Test

180 Future Work Turing Test Generative Adversarial Nets
Better automatic evaluation metrics. (BLEU ? Perplexity ?) Turing Test Generative Adversarial Nets

181 Outline Mutual Information for Response Generation. (Chitchat)
How to preserve Speaker Consistency (Chitchat) Reinforcement learning for Response Generation (Chitchat) Teaching a bot to ask questions (Goal-oriented)

182 How do you like Hom Tanks?
Introduction How do you like Hom Tanks?

183 How do you like Hom Tanks?
Introduction Case 1 How do you like Hom Tanks? Who is Hom Tanks?

184 How do you like Hom Tanks?
Introduction Case 1 How do you like Hom Tanks? Who is Hom Tanks ? Do you mean Tom Hanks ?

185 Introduction Case 1 How do you like Hom Tanks? Who is Hom Tanks?
Hom Tanks is the leading actor in Forrest Gump.

186 Introduction Case 1 How do you like Hom Tanks? Who is Hom Tanks?
Hom Tanks is the leading actor in Forrest Gump. Oh. Yeah. I like him a lot.

187 How do you like Hom Tanks?
Introduction What will Current Chatbot Systems Do ? How do you like Hom Tanks?

188 Introduction What will Current Chatbot Systems Do ?
How do you like Hom Tanks? How do you like Hom Tanks? UNK

189 How do you like Hom Tanks?
Introduction What will Current Chatbot Systems Do ? How do you like Hom Tanks? How do you like UNK ?

190 How do you like Hom Tanks?
Introduction What will Current Chatbot Systems Do ? How do you like Hom Tanks? How do you like UNK ? Give an output anyway

191 How do you like Hom Tanks?
Introduction What will Current Chatbot Systems Do ? How do you like Hom Tanks? How do you like UNK ? Forward Backward softmax

192 Introduction What will Current Chatbot Systems Do ?
How do you like Hom Tanks? How do you like UNK ? output I hate him. He’s such a jerk. Forward Backward softmax

193 Introduction What will Current Chatbot Systems Do ?
How do you like Hom Tanks? Searching the Web for “how do you like Hom Tanks”

194 MovieQA Domain

195 MovieQA Domain Template

197 In what scenarios does a bot need to ask questions ?

198 In what scenarios does a bot need to ask questions ?
Case 1: Question Clarification


206 In what scenarios does a bot need to ask questions ?
Case 1: Question Clarification Task 1

207 In what scenarios does a bot need to ask questions ?
Case 1: Question Clarification Task 1 Task 2

208 In what scenarios does a bot need to ask questions ?
Question Templates Case 1: Question Clarification Task 1

209 In what scenarios does a bot need to ask questions ?
Question-Asking Templates Case 1: Question Clarification Task 1

210 In what scenarios does a bot need to ask questions ?
Case 2: Knowledge Operation.

211 Case 2: Knowledge Operation.
Question-Asking Templates

212 Case 2: Knowledge Operation.
Task 3 Task 4

213 In what scenarios does a bot need to ask questions ?
Case 3: Knowledge Acquisition .

214 In what scenarios does a bot need to ask questions ?
Case 3: Knowledge Acquisition . Not in the KB … Other questions/ Other answers

215 Settings

216 1. Off-line supervised settings

217 Training Input … Other questions/ Other answers

218 Training Input … Other questions/ Other answers The teacher’s question

219 Training Input … Other questions/ Other answers Dialogue History

220 Training Input … Other questions/ Other answers KB facts

221 Training Input … Other questions/ Other answers Output

222 Input Output … Other questions/ Other answers
End-to-End Memory Networks Output
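A single memory "hop" of such a model, sketched in PyTorch. End-to-End Memory Networks use separate input and output memory embeddings and multiple hops; this compressed one-hop version is only illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

def memory_hop(query, memory, out_layer):
    # query:  (B, d)    embedding of the teacher's question
    # memory: (B, M, d) embeddings of the dialogue history and KB facts
    scores = torch.bmm(memory, query.unsqueeze(-1)).squeeze(-1)  # match the question to each memory
    attn = F.softmax(scores, dim=-1)
    read = torch.bmm(attn.unsqueeze(1), memory).squeeze(1)       # attention-weighted memory readout
    return out_layer(query + read)                               # logits over candidate outputs

out_layer = nn.Linear(64, 1000)  # d = 64, 1000 candidate answers (illustrative sizes)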

223 Training Settings

224 Training Settings: 1. Never Asking Questions (TrainQA)
Each training setting corresponds to a way of generating a dataset.

227 Training Settings: 1. Never Asking Questions (TrainQA)
2. Always Asking Questions (TrainAQ)


229 Test Settings: 1. Never Asking Questions (TestQA)
2. Always Asking Questions (TestAQ)

230 Make Predictions Test Settings: 1. Never Asking Questions (TestQA)
2. Always Asking Questions (TestAQ) ????? Make Predictions ?????

231 Tasks 1-9: results grid over training settings (TrainQA, TrainAQ) and test settings (TestQA, TestAQ)

232 Results

233 Results Asking questions always helps at test time.

234 Results Asking questions always helps at test time. Only asking questions at training time does not help

235 Results Asking questions always helps at test time. Only asking questions at training time does not help. TrainAQ+TestAQ performs the best.

236 Setting2: Reinforcement Learning
Shall I ask a question ???

237 Setting2: Reinforcement Learning

238 Setting2: Reinforcement Learning
Ask a question or not …..

239 Setting2: Reinforcement Learning
Ask a question or not ….. If Yes

240 Setting2: Reinforcement Learning
Ask a question or not ….. If Yes Get Penalized by Cost(AQ)



244 Setting2: Reinforcement Learning
Ask a question or not ….. If Yes Get Penalized by Cost(AQ) +1

245 Setting2: Reinforcement Learning
Ask a question or not ….. If Yes Get Penalized by Cost(AQ) -1

246 Setting2: Reinforcement Learning
Ask a question or not ….. If Yes If No Get Penalized by Cost(AQ)


248 Setting2: Reinforcement Learning
Ask a question or not ….. If Yes If No +1 Get Penalized by Cost(AQ)

249 Setting2: Reinforcement Learning
Ask a question or not ….. If Yes If No -1 Get Penalized by Cost(AQ)


252 Setting2: Reinforcement Learning
Memory Network Ask a question or not ….. Get Penalized by r

253 Setting2: Reinforcement Learning
Memory Network Ask a question or not ….. Get Penalized by r Memory Network Memory Network


255 Policy Gradient Setting2: Reinforcement Learning Memory Network
Ask a question or not ….. Policy Gradient


257 Policy Gradient Setting2: Reinforcement Learning Baseline
Memory Network Ask a question or not ….. Policy Gradient Baseline
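The reward structure sketched on these slides, in code; the Cost(AQ) value and the function name are illustrative assumptions.

def episode_reward(asked, answered_correctly, cost_aq=0.2):
    # +1 for a correct final answer, -1 for a wrong one;
    # asking the teacher a clarifying question first incurs Cost(AQ).
    r = 1.0 if answered_correctly else -1.0
    if asked:
        r -= cost_aq
    return r

The binary ask-or-not decision is sampled from the memory network's policy and trained with the policy gradient, with a baseline subtracted from the reward to reduce variance.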

258 Setting2: Reinforcement Learning
Bad Student

259 Setting2: Reinforcement Learning

260 Setting2: Reinforcement Learning

261 Setting2: Reinforcement Learning
Conclusions: Asking questions helps improve performance.

262 Conclusion We explored multiple strategies to develop better chit-chat-style chatbots

263 Conclusion We explored multiple strategies to develop better chit-chat-style chatbots (mutual information, speaker consistency, reinforcement learning)

264 Conclusion We explored multiple strategies to develop better chit-chat-style chatbots (mutual information, speaker consistency, reinforcement learning). We explored how a bot can interact with users by asking questions to better complete a goal.

265 Q&A

266 Mutual Information for Response Generation.
Bayes’ Rule Anti-Language Model

267 Mutual Information for Response Generation.

268 Mutual Information for Response Generation.
Training P(T|S) and P(T) Decoding

269 Mutual Information for Response Generation.
Anti-language Model Training P(T|S) and P(T) Decoding

270 Mutual Information for Response Generation.
Ungrammatical Responses !

271 Mutual Information for Response Generation.

272 Mutual Information for Response Generation.
Solution 1


275 Mutual Information for Response Generation.
Solution 1: Only penalize the first few words

276 Mutual Information for Response Generation.
Direct Decoding is infeasible

277 Mutual Information for Response Generation.
Direct Decoding is infeasible Train P(T|S), P(S|T)

278 Mutual Information for Response Generation.
Direct Decoding is infeasible Train P(T|S), P(S|T) Generate N-best list using P(T|S)

279 Mutual Information for Response Generation.
Direct Decoding is infeasible Train P(T|S), P(S|T) Generate N-best list using P(T|S) Rerank the N-best list using P(S|T)
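A sketch of this reranking decoder; lambda and gamma are tunable weights, and the two scoring functions are assumed interfaces over the trained forward and backward models.

def mmi_rerank(src, nbest, log_p_fwd, log_p_bwd, lam=0.5, gamma=0.1):
    # nbest: candidate responses generated by beam search under p(T|S).
    # Score = log p(T|S) + lam * log p(S|T) + gamma * |T|  (length term).
    def score(t):
        return log_p_fwd(src, t) + lam * log_p_bwd(t, src) + gamma * len(t.split())
    return max(nbest, key=score)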

280 Results (No cherry-picking)

281 Results (No cherry-picking)

282 Persona seq2seq model: Tradeoff
(Two persona seq2seq diagrams, with the persona embedding Bob injected at each decoding step: where do you live EOS → in uk . EOS)

