Teaching Machines to Converse


Teaching Machines to Converse Jiwei Li Computer Science Department Stanford University

Collaborators Bill Dolan Microsoft Research Dan Jurafsky Stanford Alan Ritter Ohio State University Chris Brockett Microsoft Research Jason Weston Facebook AI Research Alexander Miller Facebook AI Research Sumit Chopra Facebook AI Research Marc'Aurelio Ranzato Facebook AI Research Michel Galley Microsoft Research Will Monroe Stanford Jianfeng Gao Microsoft Research

Many people consider Siri a big breakthrough in AI. (Slide borrowed from Bill MacCartney.)

Does Siri really understand language ? Many of us have seen this. (Slide borrowed from Bill MacCartney.)

Does Siri really understand language ? "Cole-bear." (Slide borrowed from Bill MacCartney.)

How well a machine can talk with humans has long been associated with the general success of AI. Attempts to develop chatbots date back to the early days of AI. (Slide from Bill MacCartney.)

Why is building a chatbot hard ? Computers need to understand what you ask.

Why is building a chatbot hard ? Computers need to understand what you ask. Computers need to generate coherent, meaningful sequences in response to what you ask.

Why is building a chatbot hard ? Computers need to understand what you ask. Computers need to generate coherent, meaningful sequences in response to what you ask, and that requires domain knowledge, discourse knowledge, and world knowledge.

Background

Background Goal-oriented tasks (Levin et al., 1997; Young et al., 2013; Walker, 2000): expensive to build and hard to extend to open-domain scenarios. Data-driven open-domain response generation (Ritter et al., 2010; Sordoni et al., 2015; Vinyals and Le, 2015).

Outline Mutual Information for Response Generation. (Chitchat) How to preserve Speaker Consistency (Chitchat) Reinforcement learning for Response Generation (Chitchat) Teaching a bot to ask questions (Goal-oriented)

Mutual Information for Response Generation.

Seq2Seq Models for Response Generation (Sutskever et al., 2014; Jean et al., 2014; Luong et al., 2015). Source: input messages. Target: responses. We can adapt this framework to response generation, in which input messages are sources and output responses are targets. Encoding: how are you ? eos. Decoding: I'm fine . EOS

Seq2Seq Models for Response Generation how are you ?

Seq2Seq Models for Response Generation Encoding how are you ?

Seq2Seq Models for Response Generation Encoding Decoding how are you ?

Seq2Seq Models for Response Generation I’m Encoding Decoding how are you ? eos

Seq2Seq Models for Response Generation I’m fine Encoding Decoding how are you ? eos I’m

Seq2Seq Models for Response Generation I’m fine . Encoding Decoding how are you ? eos I’m fine

Seq2Seq Models for Response Generation I’m fine . EOS Encoding Decoding how are you ? eos I’m fine .

Seq2Seq Models as a Backbone I’m fine . EOS Encoding Decoding how are you ? eos I’m fine .
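To make the backbone concrete, here is a minimal PyTorch sketch of such an encoder-decoder. This is not the speaker's actual implementation; the class name and hyperparameters are illustrative.

    import torch
    import torch.nn as nn

    class Seq2SeqResponder(nn.Module):
        def __init__(self, vocab_size, embed_size=256, hidden_size=512):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_size)
            self.encoder = nn.LSTM(embed_size, hidden_size, batch_first=True)
            self.decoder = nn.LSTM(embed_size, hidden_size, batch_first=True)
            self.out = nn.Linear(hidden_size, vocab_size)

        def forward(self, src_ids, tgt_ids):
            # Encode the source message ("how are you ? eos") into a final state.
            _, state = self.encoder(self.embed(src_ids))
            # Decode the response conditioned on that state, feeding the gold
            # previous token at each step (teacher forcing): eos I'm fine .
            dec_out, _ = self.decoder(self.embed(tgt_ids), state)
            return self.out(dec_out)  # per-step vocabulary logits

Training maximizes log p(T|S) with cross-entropy between these logits and the shifted target tokens.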

Mutual Information for Response Generation. Li et al., A Diversity-Promoting Objective Function for Neural Conversation Models (NAACL 2016). The "I don't know" problem (Sordoni et al., 2015; Serban et al., 2015).

The "I don't know" problem: How old are you ? -> I don't know .

The "I don't know" problem: How is life ? -> I don't know what you are talking about .

The "I don't know" problem: Do you love me ? -> I don't know what you are talking about . Roughly 30% of all generated responses are of this kind.

Mutual Information for Response Generation. These dull responses are not unreasonable to generate, but developing a chatbot is not just about generating reasonable responses. The degenerate chatbot, in effect:

    def chatbot(string):
        if string[-1] == "?":
            return "I don't know"
        else:
            return "I don't know what you are talking about"

Mutual Information for Response Generation. Solution #1: Adding Rules

Mutual Information for Response Generation. Solution #1: Adding Rules. Regular-expression matching: I don't know . / I don't know .. / I don't know ... / I don't know ! / I don't know ! ! / I don't know ! ! !

Mutual Information for Response Generation. Solution #1: Adding Rules. Unfortunately, the deep learning model manages to cluster every phrase that is semantically related to "I don't know": I have no idea . / I don't have a clue . / I don't have the foggiest idea what you are talking about . / I don't have the slightest idea what you are talking about . / I haven't the faintest idea . / How should I know ? The most comprehensive "I don't know" list I have ever seen... Rules don't work !!
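As an illustration of why Solution #1 fails, here is a hedged sketch of the rule-based filter: a regular expression catches surface variants of "I don't know" but not its paraphrases. The pattern is illustrative, not from the talk.

    import re

    DULL = re.compile(r"^i (don'?t|do not) know[\s.!]*$", re.IGNORECASE)

    for response in ["I don't know .", "I don't know ! ! !",
                     "I haven't the faintest idea"]:
        print(response, "->", "blocked" if DULL.match(response) else "passes")
    # The paraphrase passes the filter, which is exactly why rules don't work.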

Mutual Information for Response Generation. Intuition: "I don't know" is a likely response to whatever one asks, but the other way around, "I don't know" says almost nothing about what one asked. Mutual information captures this asymmetry.

Mutual Information for Response Generation. Bayes' rule lets us rewrite the mutual-information objective in terms of the standard Seq2Seq model p(T|S) and a reverse model p(S|T).
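The formulas on these slides were images. As a reconstruction from the cited NAACL 2016 paper, the decoding objective replaces likelihood with weighted mutual information, which Bayes' rule decomposes as:

    \hat{T} = \arg\max_{T} \{ \log p(T \mid S) - \lambda \log p(T) \}
            = \arg\max_{T} \{ (1-\lambda) \log p(T \mid S) + \lambda \log p(S \mid T) \}

The second equality uses \log p(T) = \log p(T \mid S) + \log p(S) - \log p(S \mid T) and drops the constant \log p(S); setting \lambda = 1 recovers pointwise mutual information.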

Datasets and Evaluations. Datasets: Twitter conversational dataset (23M pairs); OpenSubtitles movie-script dataset (80M pairs). Evaluations: BLEU (Papineni et al., 2002); # distinct tokens; human evaluation (1,000 samples, each output evaluated by 7 judges).

Datasets and Evaluations. BLEU improvements from MMI (chart): +26.4%, +51.3%, +12.7%, +35.0%, +22.5% across settings.

Datasets and Evaluations BLEU on Twitter Dataset +21.1% +12.7%

Datasets and Evaluations # Distinct tokens in generated targets (divided by total #) on the OpenSubtitles dataset: +385%, +122%
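A minimal sketch of this diversity metric, assuming whitespace-tokenized responses:

    def distinct_ratio(responses):
        # Number of distinct tokens divided by the total token count.
        tokens = [tok for r in responses for tok in r.split()]
        return len(set(tokens)) / max(len(tokens), 1)

    print(distinct_ratio(["i don't know .", "i don't know ."]))  # 0.5, low diversity
    print(distinct_ratio(["i'm 16 .", "why are you asking ?"]))  # 1.0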

Human Evaluation

Sampled Results: Standard Seq2Seq p(T|S) vs. Mutual Information.

Outline Mutual Information for Response Generation. Speaker Consistency. Reinforcement learning for Response Generation. Teaching a bot to ask questions.

Speaker Consistency Li et al., A Persona-Based Neural Conversation Model (ACL 2016).

Speaker Consistency How old are you ? I’m 8 .

Speaker Consistency How old are you ? I’m 8 . What’s your age? 18

Speaker Consistency Where do you live now? I live in Los Angeles.

Speaker Consistency Where do you live now? I live in Los Angeles. In which city do you live now? I live in Paris.

Speaker Consistency Where do you live now? I live in Los Angeles. In which city do you live now? I live in Paris. In which country do you live now? England, you?

Speaker Consistency How old are you ? I’m 8.

Speaker Consistency How old are you ? I'm 8. How many kids do you have ? 4, you ?

Speaker Consistency When were you born ? In 1942.

Speaker Consistency When were you born ? In 1942. When was your mother born ? In 1966.

How to represent users Persona embeddings (70k) Bob

How to represent users uk london sydney great Word embeddings (50k) good stay live okay monday tuesday Persona embeddings (70k) Bob

Persona seq2seq model Encoding Decoding where do you live EOS

Persona seq2seq model Encoding Decoding Bob where do you live EOS Bob Persona embeddings (70k) Bob

Persona seq2seq model Encoding Decoding Bob in where do you live EOS Persona embeddings (70k) Bob

Persona seq2seq model Encoding Decoding Bob Bob in uk where do you live EOS in Bob Persona embeddings (70k)

Persona seq2seq model Encoding Decoding Bob Bob Bob in uk . where do you live EOS in uk Bob Persona embeddings (70k)

Persona seq2seq model Encoding Decoding: where do you live EOS -> in uk . EOS, with the persona embedding for Bob fed at every decoding step. If you ask one user 100 questions, the 100 responses you generate are not independent, because the same user representation is incorporated each time. Word embeddings (50k): uk london sydney great good stay live okay monday tuesday. Persona embeddings (70k): Bob
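A hedged PyTorch sketch of this decoder: besides the usual word embedding, each decoding step also receives the speaker's persona embedding, so every response by the same user shares one vector. The table sizes follow the slides (50k words, 70k personas); everything else is illustrative.

    import torch
    import torch.nn as nn

    class PersonaDecoder(nn.Module):
        def __init__(self, vocab=50000, n_speakers=70000, embed=256, hidden=512):
            super().__init__()
            self.word_emb = nn.Embedding(vocab, embed)
            self.persona_emb = nn.Embedding(n_speakers, embed)
            self.rnn = nn.LSTM(2 * embed, hidden, batch_first=True)
            self.out = nn.Linear(hidden, vocab)

        def forward(self, tgt_ids, speaker_id, enc_state):
            words = self.word_emb(tgt_ids)                   # (B, T, E)
            persona = self.persona_emb(speaker_id)           # (B, E)
            persona = persona.unsqueeze(1).expand_as(words)  # repeat per step
            dec_out, _ = self.rnn(torch.cat([words, persona], -1), enc_state)
            return self.out(dec_out)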

Interaction Seq2Seq model: capture speaker-addressee interaction patterns within the conversation by adding both speaker and addressee embeddings, combined into an interaction vector tanh(W [v_speaker; v_addressee]).

Interaction Seq2Seq model: Encoding where do you live EOS -> Decoding in uk ., with the interaction vector fed at each decoding step.
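For the speaker-addressee variant, a small sketch of the interaction vector, assuming the tanh(W [v_speaker; v_addressee]) combination named above:

    import torch
    import torch.nn as nn

    class Interaction(nn.Module):
        def __init__(self, embed=256):
            super().__init__()
            self.W = nn.Linear(2 * embed, embed)

        def forward(self, v_speaker, v_addressee):
            # Combine the two persona vectors into one interaction vector,
            # which replaces the single persona embedding at each decode step.
            return torch.tanh(self.W(torch.cat([v_speaker, v_addressee], dim=-1)))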

Datasets and Evaluations. Conversations from Twitter: 28M turns, 74,003 users, each with a minimum of 60 conversational turns. Metrics: perplexity; BLEU (4,000 single-reference pairs); human evaluation.

Quantitative Results
                     Seq2Seq   Speaker Model
Perplexity           47.2      42.2 (-10.6%)
BLEU (without MMI)   0.92      1.12 (+21.7%)
BLEU (with MMI)      1.41      1.66 (+11.7%)

Human Evaluation Question Pairs

Human Evaluation Question Pairs: What city do you live in ? / What country do you live in ?

Human Evaluation Question Pairs: What city do you live in ? / What country do you live in ? Are you vegan or vegetarian ? / Do you eat beef ?

Human Evaluation Question Pairs: What city do you live in ? -> London. What country do you live in ? -> UK (consistent) vs. US (inconsistent).

Human Evaluation: Which model produces more consistent answers? Each item is given to 5 judges; ties are discarded.

Human Evaluation: Seq2Seq Model 0.84 vs. Persona Model 1.33 (+34.7%).

Results (No cherry-picking)


Issues How do we handle long-term dialogue success?

Outline Mutual Information for Response Generation. Speaker Consistency. Reinforcement learning for Response Generation. Teaching a bot to ask questions.

Issues How do we handle long-term dialogue success? Problem 1: Dull and generic responses.

Issues. Problem 1: Dull and generic responses. The "I don't know" problem (Sordoni et al., 2015; Serban et al., 2015): Do you love me ? -> I don't know what you are talking about.

Issues How do we handle long-term dialogue success? Problem 1: Dull and generic responses. Problem 2: Repetitive responses.

Problem 2: Repetitive responses. Shut up !

Problem 2: Repetitive responses. Shut up ! No, you shut up !

Problem 2: Repetitive responses. Shut up ! No, you shut up ! No, you shut up !

Problem 2: Repetitive responses. Shut up ! No, you shut up ! No, you shut up ! No, you shut up !

…… Problem 2: Repetitive responses. Shut up ! No, you shut up !

…… Problem 2: Repetitive responses. See you later ! See you later !

Issues How do we handle long-term dialogue success? Problem 1: Dull and generic responses. Problem 2: Repetitive responses. Problem 3: Short-sighted conversation decisions.

Problem 3: Short-sighted conversation decisions. How old are you ?

Problem 3: Short-sighted conversation decisions. How old are you ? i 'm 16 .

Problem 3: Short-sighted conversation decisions. How old are you ? i 'm 16 . 16 ?

Problem 3: Short-sighted conversation decisions. How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about

Problem 3: Short-sighted conversation decisions. How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about you don 't know what you 're saying

Problem 3: Short-sighted conversation decisions. How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about you don 't know what you 're saying i don 't know what you 're talking about

Problem 3: Short-sighted conversation decisions. How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about you don 't know what you 're saying i don 't know what you 're talking about you don 't know what you 're saying

Problem 3: Short-sighted conversation decisions. Bad Action How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about you don 't know what you 're saying i don 't know what you 're talking about you don 't know what you 're saying

Problem 3: Short-sighted conversation decisions. How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about you don 't know what you 're saying i don 't know what you 're talking about you don 't know what you 're saying Outcome

Can reinforcement learning handle this? How old are you ? i 'm 16 . 16 ? i don 't know what you 're talking about you don 't know what you 're saying i don 't know what you 're talking about you don 't know what you 're saying. The outcome does not emerge until a few turns later.

Can reinforcement learning handle this?

Notations for Reinforcement Learning

Notations: State How old are you ? how old are you Encoding

Notations: Action How old are you ? i 'm 16 .

Notations: Reward How old are you ? i 'm 16 .

Notations: Reward 1. Ease of answering. We measure the ease of answering a generated turn as the negative log likelihood of responding to that utterance with a dull response, e.g. S = "I don't know what you are talking about".

Notations: Reward 2. Information Flow

Notations: Reward 2. Information Flow See you later ! See you later !

Notations: Reward 2. Information Flow See you later ! S1

Notations: Reward 3. Meaningfulness S1 How old are you ? S2 i 'm 16 .

Notations: Reward Easy to answer R1 Information Flow R2 Meaningfulness R3
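A hedged sketch of the three rewards, following the definitions above. Here log_prob and log_prob_backward stand in for trained forward/backward Seq2Seq scorers, and the turn vectors for encoder states; all of these are assumed, not provided by the slides.

    import numpy as np

    DULL = ["i don't know what you are talking about", "i don't know"]

    def r1_ease_of_answering(action, log_prob):
        # The easier it is to answer `action` with a dull response, the lower
        # the reward: average length-normalized log-likelihood, negated.
        return -np.mean([log_prob(s, given=action) / len(s.split()) for s in DULL])

    def r2_information_flow(prev_vec, cur_vec):
        # Penalize consecutive turns by the same agent that say the same thing.
        cos = prev_vec @ cur_vec / (np.linalg.norm(prev_vec) * np.linalg.norm(cur_vec))
        return -np.log(max(cos, 1e-8))

    def r3_meaningfulness(q, a, log_prob, log_prob_backward):
        # Mutual-information-style coherence between a turn and its reply.
        return (log_prob(a, given=q) / len(a.split())
                + log_prob_backward(q, given=a) / len(q.split()))

    # Total reward: a weighted sum, e.g. r = l1*r1 + l2*r2 + l3*r3.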

Simulation: take a message from the training set; encode it and decode a response r1; encode r1 and decode a reply r2; and so on.

Turn 1, Turn 2, ..., Turn N: Input Message -> Encode -> Decode S1 -> Encode -> Decode S2 -> ... -> Sn. At each simulated turn, compute Easy to answer (R1), Information Flow (R2), and Meaningfulness (R3); then compute the accumulated reward R(S1, S2, ..., Sn).

REINFORCE Algorithm (Williams, 1992). Turn 1, Turn 2, ..., Turn N: Input Message -> Encode -> Decode S1 -> ... -> Sn. The decoding policy is what we want to learn.
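A minimal sketch of the REINFORCE update: the accumulated dialogue reward scales the gradient of the log-probability of every sampled token, and subtracting a baseline reduces variance. logps would come from the Seq2Seq policy; names are illustrative.

    import torch

    def reinforce_loss(logps, reward, baseline=0.0):
        # logps: log p(word_t | context) for each sampled token, shape (T,)
        # reward: scalar accumulated reward R(S1, ..., Sn) from the simulation
        return -(reward - baseline) * logps.sum()

    # Minimizing this loss pushes up the probability of dialogues whose
    # reward beats the baseline, and down otherwise.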

Details: 1. Initialize the policy with a pre-trained Seq2Seq model. 2. Curriculum learning: gradually increase the number of simulated turns.

Evaluation Baselines: Vanilla Seq2seq model Mutual information Seq2seq model

Evaluation 1. Number of Simulated Turns

Evaluation 2. Diversity


Results
Input: How old are you ?
  Mutual Information: I'm 16.
  Proposed model: I'm 16, why are you asking ?
Input: What is your full name ?
  Mutual Information: I have no idea .
  Proposed model: What 's yours ?
Input: I don 't want to go home tonight .
  Mutual Information: Really ?
  Proposed model: Why ?
Input: Do you have any feelings for me ?
  Mutual Information: I don't know what you are talking about.
  Proposed model: Would I see you if I didn 't ?

Simulation How old are you ?

Simulation How old are you ? i 'm 16, why are you asking ?

Simulation How old are you ? i 'm 16, why are you asking ? I thought you were 12 .

Simulation How old are you ? i 'm 16, why are you asking ? I thought you were 12 . What made you think so ?

Simulation How old are you ? i 'm 16, why are you asking ? I thought you were 12 . What made you think so ? I don't know what you are talking about .

Simulation How old are you ? i 'm 16, why are you asking ? I thought you were 12 . What made you think so ? I don’t know what you are talking about . You don’t know what you are saying.

Simulation How old are you ? i 'm 16, why are you asking ? I thought you were 12 . What made you think so ? I don’t know what you are talking about . You don’t know what you are saying. I don’t know what you are talking about .

Simulation Survive 4 turns !! How old are you ? i 'm 16, why are you asking ? I thought you were 12 . What made you think so ? I don’t know what you are talking about . You don’t know what you are saying. I don’t know what you are talking about .

Ongoing Work: Better automatic evaluation metrics (BLEU? Perplexity?). The Turing test.

Future Work: Better automatic evaluation metrics. The Turing test. Generative Adversarial Nets.

Outline Mutual Information for Response Generation. (Chitchat) How to preserve Speaker Consistency (Chitchat) Reinforcement learning for Response Generation (Chitchat) Teaching a bot to ask questions (Goal-oriented)

Introduction How do you like Hom Tanks?

Introduction Case 1: How do you like Hom Tanks? Who is Hom Tanks?

Introduction Case 1: How do you like Hom Tanks? Who is Hom Tanks? Do you mean Tom Hanks?

Introduction Case 1: How do you like Hom Tanks? Who is Hom Tanks? Hom Tanks is the leading actor in Forrest Gump.

Introduction Case 1: How do you like Hom Tanks? Who is Hom Tanks? Hom Tanks is the leading actor in Forrest Gump. Oh. Yeah. I like him a lot.

Introduction What will current chatbot systems do? How do you like Hom Tanks?

Introduction What will current chatbot systems do? The unknown word is mapped to UNK: How do you like UNK ?

Introduction What will current chatbot systems do? Give an output anyway: How do you like UNK ? -> (forward pass, backward pass, softmax) -> output: I hate him. He's such a jerk.

Introduction What will current chatbot systems do? Searching the Web for "how do you like Hom Tanks"

MovieQA Domain

MovieQA Domain Template

In what scenarios does a bot need to ask questions ?

In what scenarios does a bot need to ask questions ? Case 1: Question Clarification


In what scenarios does a bot need to ask questions ? Case 1: Question Clarification Task 1

In what scenarios does a bot need to ask questions ? Case 1: Question Clarification Task 1 Task 2

In what scenarios does a bot need to ask questions ? Case 1: Question Clarification. Task 1. Question-asking templates.

In what scenarios does a bot need to ask questions ? Case 2: Knowledge Operation.

Case 2: Knowledge Operation. Question-asking templates.

Case 2: Knowledge Operation. Task 3 Task 4

In what scenarios does a bot need to ask questions ? Case 3: Knowledge Acquisition.

In what scenarios does a bot need to ask questions ? Case 3: Knowledge Acquisition. The answer is not in the KB ... other questions / other answers.

Settings

1. Off-line supervised settings

Training Input: the teacher's question, the dialogue history, and KB facts (plus other questions / other answers).

Input -> End-to-End Memory Networks -> Output
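A hedged single-hop sketch of an End-to-End Memory Network student (Sukhbaatar et al., 2015): the question attends over memory slots holding the dialogue history and KB facts, and the read vector drives the answer prediction. The bag-of-words slot encoding and sizes are illustrative.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MemN2NHop(nn.Module):
        def __init__(self, vocab, embed=64):
            super().__init__()
            self.A = nn.Embedding(vocab, embed)   # memory (input) embedding
            self.C = nn.Embedding(vocab, embed)   # memory (output) embedding
            self.B = nn.Embedding(vocab, embed)   # question embedding
            self.W = nn.Linear(embed, vocab)      # answer prediction

        def forward(self, memories, question):
            # memories: (N, L) token ids per slot; question: (L,) token ids
            m = self.A(memories).sum(1)           # (N, E) bag-of-words per slot
            c = self.C(memories).sum(1)           # (N, E)
            u = self.B(question).sum(0)           # (E,)
            p = F.softmax(m @ u, dim=0)           # attention over memory slots
            o = p @ c                             # read vector
            return self.W(o + u)                  # answer logits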

Training Settings

Training Settings: 1. Never Asking Questions (TrainQA). 2. Always Asking Questions (TrainAQ). Each training setting corresponds to a way of generating a dataset.

Test Settings: 1. Never Asking Questions (TestQA). 2. Always Asking Questions (TestAQ). Make predictions under each setting.

Tasks 1-9: results for each combination of {TrainQA, TrainAQ} x {TestQA, TestAQ}.

Results: Asking questions always helps at test time. Only asking questions at training time does not help. TrainAQ + TestAQ performs best.

Setting 2: Reinforcement Learning. Shall I ask a question?

Setting 2: Reinforcement Learning. Ask a question or not? If yes: get penalized by Cost(AQ), then receive +1 for a correct final answer or -1 for an incorrect one. If no: receive +1 or -1 for the final answer directly.

Setting 2: Reinforcement Learning. A Memory Network decides whether to ask a question and receives the reward r.

Setting 2: Reinforcement Learning. The ask-or-not policy is trained with Policy Gradient, using a baseline to reduce variance.
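A hedged sketch of the training signal just described: the binary ask-or-not policy pays Cost(AQ) whenever it asks, then earns +1 or -1 for the final answer, and policy gradient with a baseline trains the choice. All names and the cost value are illustrative.

    def ask_or_not_loss(logp_choice, answered_correctly, asked,
                        cost_aq=0.2, baseline=0.0):
        # logp_choice: log-probability (e.g. a PyTorch scalar) that the
        # Memory Network policy assigned to the action it actually took.
        reward = 1.0 if answered_correctly else -1.0
        if asked:
            reward -= cost_aq  # penalized by Cost(AQ) for asking
        return -(reward - baseline) * logp_choice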

Setting 2: Reinforcement Learning. Bad Student.

Setting 2: Reinforcement Learning. Conclusion: asking questions helps improve performance.

Conclusion We explored multiple strategies for developing better chitchat-style chatbots (mutual information, speaker consistency, reinforcement learning). We also explored how a bot can interact with users by asking questions to better complete a goal.

Q&A

Mutual Information for Response Generation. Bayes' rule gives the anti-language-model (MMI-antiLM) formulation: train P(T|S) and P(T), and subtract the language-model score during decoding.

Mutual Information for Response Generation. Problem: the anti-language-model term yields ungrammatical responses!

Mutual Information for Response Generation. Solution 1: only penalize the first few words.
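Written out (a reconstruction; the slide's formula was an image), the MMI-antiLM score with the first-words-only penalty is:

    \text{score}(T \mid S) = \log p(T \mid S) - \lambda \sum_{k=1}^{|T|} g(k) \log p(t_k \mid t_1, \dots, t_{k-1}),
    \qquad g(k) = \begin{cases} 1 & \text{if } k \le \gamma \\ 0 & \text{otherwise} \end{cases}

so the anti-language-model penalty applies only to the first \gamma words, which keeps the later part of the response grammatical.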

Mutual Information for Response Generation. Direct decoding of the bidirectional formulation is infeasible. Instead: train P(T|S) and P(S|T); generate an N-best list using P(T|S); rerank the N-best list using P(S|T).
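A minimal sketch of the MMI-bidi procedure just listed. Here beam_search, forward_lp, and backward_lp are assumed stand-ins for the trained P(T|S) decoder and the two log-probability scorers.

    def mmi_bidi_respond(source, beam_search, forward_lp, backward_lp,
                         lam=0.5, n_best=200):
        # 1. Generate an N-best list using P(T|S).
        candidates = beam_search(source, n_best)
        # 2. Rerank it with the weighted combination of both directions.
        def score(t):
            return (1 - lam) * forward_lp(t, source) + lam * backward_lp(source, t)
        return max(candidates, key=score)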

Results (No cherry-picking)

Persona seq2seq model trade-off: Encoding where do you live EOS -> Decoding in uk . EOS, with the persona embedding for Bob at each step.