Generating Natural Answers by Incorporating Copying and Retrieving Mechanisms in Sequence-to-Sequence Learning
Shizhu He, Cao Liu, Kang Liu and Jun Zhao
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Outline
- Task Introduction
- Our Proposed Approach
- Experiments
Natural Question Answering
Question answering (QA) is devoted to providing exact answers, often in the form of phrases and entities. It focuses on analyzing questions, retrieving facts from text snippets or KBs, and predicting answers through ranking and reasoning.
In real-world environments, people prefer the correct answer delivered in a more natural way: Siri replies with the natural answer "Jet Li is 1.64m in height." to the question "How tall is Jet Li?", rather than answering with only the entity "1.64m".
Natural Question Answering
Natural-QA vs. Chatbot
Chatbots are able to generate natural replies, but they can only generate fluent responses to 'chatting-type' messages. They are often unable to answer factual questions, which requires interacting with external KBs, e.g. the question "Where was Tagore born?"
Natural Question Answering
The system first needs to recognize the topic entity, then extract multiple related facts, and finally generate the natural answer.
Outline
- Task Introduction
- Our Proposed Approach
- Experiments
Generating natural answers: engineering methods
Natural Language Processing (NLP) tools: POS tagging, parsing, etc.
Pattern engineering: "where was %entity from?" → "%entity was born in %birthplace, %pronoun is %nationality citizen."
Weaknesses:
- High cost of manually annotating training data and patterns.
- Low coverage: cannot flexibly handle the variable linguistic phenomena of different domains.
Generating natural answers: end-to-end manner
This paradigm tries to consider question answering in an end-to-end framework. The complicated QA process, including analyzing the question, retrieving relevant facts from a KB, and generating correct, coherent, natural answers, can be resolved jointly.
Challenges:
- The words in a natural answer may be generated in different ways; we even need to deal with morphological variants (e.g. "Singapore" in the KB but "Singaporean" in the answer).
- Knowledge-driven vs. data-driven: KBs are symbol systems, while end-to-end models are numerical systems.
Generating natural answers: end-to-end manner
Words in a natural answer may be generated in different ways:
- the common words are usually predicted with a (conditional) language model
- the major entities/phrases are copied from the source question
- the answering entities/phrases are retrieved from the corresponding KB
Related Work
GenQA (Yin et al., 2016)
- is able to retrieve facts from KBs with neural models
- cannot copy SUs from the question when generating answers
- cannot deal with complex questions that need to utilize multiple facts
Related Work
CopyNet (Gu et al., 2016)
- is able to copy words from the original source when generating the target
- cannot retrieve SUs from external memory (e.g. KBs, texts)
Seq2Seq: a vocabulary (e.g. 30,000 words), an encoding RNN and a decoding RNN.
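The Seq2Seq setup above can be sketched in a few lines of numpy. This is a minimal illustrative toy (tiny vocabulary, random untrained weights, simple tanh RNN cells), not the architecture used in the paper: the encoder compresses the source into a final state, and the decoder updates its state from the previous word and emits a softmax over the vocabulary.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 10, 8                          # toy vocabulary size and hidden size (illustrative)
E = rng.normal(size=(V, d))           # word embedding table
W_enc = rng.normal(size=(d, d)) * 0.1
U_enc = rng.normal(size=(d, d)) * 0.1
W_dec = rng.normal(size=(d, d)) * 0.1
U_dec = rng.normal(size=(d, d)) * 0.1
W_out = rng.normal(size=(d, V)) * 0.1

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def encode(tokens):
    """Run a simple tanh RNN over the source; return all states and the final one."""
    h = np.zeros(d)
    states = []
    for t in tokens:
        h = np.tanh(E[t] @ W_enc + h @ U_enc)
        states.append(h)
    return np.stack(states), h

def decode_step(h, y_prev):
    """One decoder step: update the state from the previous word, predict over V."""
    h = np.tanh(E[y_prev] @ W_dec + h @ U_dec)
    return h, softmax(h @ W_out)

states, q = encode([3, 1, 4])         # encode a 3-token toy question
h, p = decode_step(q, 0)              # first decoding step from the question vector
```

In a plain Seq2Seq model every output word must come from this fixed shortlist softmax, which is exactly the limitation COREQA removes with its copy and retrieve modes.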
Outline
- Task Introduction
- Our Proposed Approach
- Experiments
COREQA: Incorporating Copying and Retrieving Mechanisms in QA
To generate natural answers for information-inquiring questions:
- recognize the key topics in the question
- extract the related facts from the KB
- generate a natural answer from the question and the retrieved facts
- generate a coherent reply by fusing this instance-level knowledge with global-level "smooth" and "glue" words
COREQA Incorporating Copying and Retrieving Mechanisms in QA
Encoder
Question Encoding: Bi-RNN
$\{\overrightarrow{\mathbf{h}}_1,\dots,\overrightarrow{\mathbf{h}}_{L_X}\}$ and $\{\overleftarrow{\mathbf{h}}_{L_X},\dots,\overleftarrow{\mathbf{h}}_1\}$
$\mathbf{M}_Q = \{\mathbf{h}_t\}$, $\mathbf{h}_t = [\overrightarrow{\mathbf{h}}_t; \overleftarrow{\mathbf{h}}_{L_X-t+1}]$
$\mathbf{q} = [\overrightarrow{\mathbf{h}}_{L_X}; \overleftarrow{\mathbf{h}}_1]$
Knowledge Base Encoding: Memory Network
Fact: $(s, p, o)$, $\mathbf{f} = [\mathbf{s}; \mathbf{p}; \mathbf{o}]$
$\mathbf{M}_{KB} = \{\mathbf{f}_i\} = \{\mathbf{f}_1,\dots,\mathbf{f}_{L_F}\}$
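The two encoder memories can be sketched with numpy concatenations. This is an illustrative toy with random stand-in states (no actual RNN or learned embeddings): each question position pairs its forward and backward Bi-RNN states, and each KB fact concatenates its subject, predicate and object embeddings.

```python
import numpy as np

rng = np.random.default_rng(1)
L_X, d = 5, 4                        # question length and per-direction hidden size (toy)
fwd = rng.normal(size=(L_X, d))      # forward states, fwd[t] for position t+1
bwd = rng.normal(size=(L_X, d))      # backward states, bwd[t] for position t+1

# M_Q: h_t concatenates the forward and backward states aligned at position t
M_Q = np.concatenate([fwd, bwd], axis=1)       # shape (L_X, 2d)
# q: question vector from the two final states of each direction
q = np.concatenate([fwd[-1], bwd[0]])          # shape (2d,)

# KB side: each fact (s, p, o) becomes f = [s; p; o] from its embeddings
e_dim, L_F = 3, 2                    # embedding size and number of facts (toy)
s_emb, p_emb, o_emb = (rng.normal(size=(L_F, e_dim)) for _ in range(3))
M_KB = np.concatenate([s_emb, p_emb, o_emb], axis=1)   # shape (L_F, 3*e_dim)
```

The decoder later reads both memories attentively, so only their row layouts (one row per question position, one row per fact) matter here.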
Decoder
Answer word prediction: predicts SUs based on a mixed probabilistic model of three modes, namely the predict-mode, the copy-mode and the retrieve-mode.
State update: $\mathbf{s}_t$ from $\mathbf{s}_{t-1}$, $y_{t-1}$ and $\mathbf{c}_t$, where $y_{t-1}$ is represented as $[\mathbf{e}(y_{t-1}); \mathbf{r}^{q}_{t-1}; \mathbf{r}^{kb}_{t-1}]$
Reading the short-term memories $\mathbf{M}_Q$ and $\mathbf{M}_{KB}$: attentive models with accumulated attentive vectors
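The state update can be sketched as follows. This is a simplified toy (a plain tanh cell rather than the paper's gated RNN, random weights, bilinear attention scores as an assumption): the previous state attentively reads both memories, and the two read vectors are concatenated with the previous word's embedding to drive the update.

```python
import numpy as np

rng = np.random.default_rng(2)
d_s, d_q, d_f, d_e = 6, 8, 9, 5      # state, question-memory, fact-memory, embedding sizes (toy)
M_Q  = rng.normal(size=(4, d_q))     # encoded question, one row per position
M_KB = rng.normal(size=(3, d_f))     # encoded facts, one row per fact

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attend(state, M, W):
    """Attentive read: score each memory row against the state, return the weighted sum."""
    a = softmax(M @ (W @ state))
    return a @ M

W_q  = rng.normal(size=(d_q, d_s)) * 0.1
W_kb = rng.normal(size=(d_f, d_s)) * 0.1
W_s  = rng.normal(size=(d_s, d_e + d_q + d_f)) * 0.1
U_s  = rng.normal(size=(d_s, d_s)) * 0.1

def update_state(s_prev, e_y_prev):
    """s_t from s_{t-1} and the concatenated input [e(y_{t-1}); r^q_{t-1}; r^kb_{t-1}]."""
    r_q  = attend(s_prev, M_Q, W_q)      # attentive read of the question memory
    r_kb = attend(s_prev, M_KB, W_kb)    # attentive read of the KB memory
    x = np.concatenate([e_y_prev, r_q, r_kb])
    return np.tanh(W_s @ x + U_s @ s_prev)

s = update_state(np.zeros(d_s), rng.normal(size=d_e))
```

Feeding the attentive reads back into the state is what lets the decoder remember which question words and facts it has already copied or retrieved.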
Answer word prediction
Three correlative output layers: a shortlist prediction layer, a question-location copying layer and a candidate-facts-location retrieving layer.
We adopt an instance-specific vocabulary: $V \cup X_Q \cup X_{KB}$
The probabilistic function for generating any target SU $y_t$ is a "mixture" model:
$p(y_t \mid \mathbf{s}_t, y_{t-1}, \mathbf{M}_Q, \mathbf{M}_{KB}) = p_{pr}(y_t \mid \mathbf{s}_t, y_{t-1}, \mathbf{c}_t)\, p_m(pr \mid \mathbf{s}_t, y_{t-1}) + p_{co}(y_t \mid \mathbf{s}_t, y_{t-1}, \mathbf{M}_Q)\, p_m(co \mid \mathbf{s}_t, y_{t-1}) + p_{re}(y_t \mid \mathbf{s}_t, y_{t-1}, \mathbf{M}_{KB})\, p_m(re \mid \mathbf{s}_t, y_{t-1})$
Answer word prediction
Predict-mode: $p_{pr}(y_t \mid \cdot) = \frac{1}{Z}\, e^{\varphi_{pr}(y_t)}$
Copy-mode: $p_{co}(y_t \mid \cdot) = \frac{1}{Z} \sum_{j: Q_j = y_t} e^{\varphi_{co}(Q_j)}$
Retrieve-mode: $p_{re}(y_t \mid \cdot) = \frac{1}{Z} \sum_{j: f_j = y_t} e^{\varphi_{re}(f_j)}$
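The three-mode mixture can be sketched end to end. This is an illustrative toy with random scores standing in for the learned $\varphi$ functions, and it makes one simplifying assumption: each mode is normalized separately and then weighted by the mode probabilities, rather than sharing one global normalizer. Copy and retrieve scores live over locations (question positions, facts) and are summed onto each distinct word of the instance-specific vocabulary, exactly as in the $\sum_{j}$ terms above.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Toy instance-specific vocabulary: shortlist V plus words from the question and KB facts.
# The words themselves are hypothetical examples, not from the paper's dataset.
V = ["<eos>", "was", "born", "in", "is"]
question = ["where", "was", "Tagore", "born"]          # X_Q
fact_objects = ["Kolkata", "India"]                    # X_KB (retrievable SUs)
vocab = V + sorted(set(question) - set(V)) + fact_objects

rng = np.random.default_rng(3)
phi_pr = rng.normal(size=len(V))             # shortlist scores, one per word in V
phi_co = rng.normal(size=len(question))      # one copy score per question position Q_j
phi_re = rng.normal(size=len(fact_objects))  # one retrieve score per fact f_j
mode_p = softmax(rng.normal(size=3))         # p_m(pr|.), p_m(co|.), p_m(re|.)

def mode_dist(scores, words):
    """Normalize location scores, then sum the probability mass onto each distinct word."""
    p_loc = softmax(scores)
    dist = np.zeros(len(vocab))
    for j, w in enumerate(words):
        dist[vocab.index(w)] += p_loc[j]     # the sum over j with Q_j = y_t (resp. f_j = y_t)
    return dist

p_pr = mode_dist(phi_pr, V)
p_co = mode_dist(phi_co, question)
p_re = mode_dist(phi_re, fact_objects)
p_final = mode_p[0] * p_pr + mode_p[1] * p_co + mode_p[2] * p_re
```

Because every mode distribution sums to one and the mode weights do too, the mixture is a valid distribution over the union vocabulary, and a word like "born" that appears in both the shortlist and the question accumulates mass from both modes.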
Outline
- Task Introduction
- Our Proposed Approach
- Experiments
Natural QA in Restricted Domain
The QA systems need to answer questions involving 4 concrete properties: birthdate (including year, month and day) and gender. There are plenty of QA patterns that focus on different aspects of the birthdate, for example, "What year were you born?" touches on "year", but "When is your birthday?" touches on "month and day".
Experimental Results
For unseen entities
We construct 2,000 new person entities and their corresponding facts about the four known properties, then obtain Q-A pairs by matching the sampled patterns.
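The Q-A pair construction for new entities can be sketched as follows. The patterns, entity names and property values here are hypothetical placeholders (the real dataset uses its own pattern inventory over the birthdate/gender properties): each sampled question pattern is paired with an answer pattern and filled in from the entity's facts.

```python
import random

# Hypothetical question/answer pattern pairs over the birthdate properties.
patterns = [
    ("What year was %entity born?", "%entity was born in %year."),
    ("When is %entity's birthday?", "%entity's birthday is %month %day."),
]
# Hypothetical constructed entities with their property facts.
facts = {
    "Alice": {"year": "1990", "month": "March", "day": "3"},
    "Bob":   {"year": "1985", "month": "July",  "day": "21"},
}

def build_qa_pairs(seed=0):
    """Match each new entity's facts against a sampled pattern to obtain a Q-A pair."""
    random.seed(seed)
    pairs = []
    for entity, props in facts.items():
        q_pat, a_pat = random.choice(patterns)       # sample one QA pattern
        q = q_pat.replace("%entity", entity)
        a = a_pat.replace("%entity", entity)
        for prop, value in props.items():            # fill every property slot present
            a = a.replace("%" + prop, value)
        pairs.append((q, a))
    return pairs
```

Since the filled entities never appear in training, the resulting pairs directly test whether the model generalizes its copy and retrieve behavior to unseen symbols.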
Natural QA in Open Domain
Dataset: CQA question-answer pairs; grounding Q-A pairs with ILP (Yin et al., 2016)
Examples (translated from Chinese):
- Q: "Who starred in The Expendables?" A: "Jet Li, Stallone, Arnold Schwarzenegger." Grounded facts: (The Expendables, screenwriter, Stallone), (The Expendables, starring, Jet Li)
- Q: "Which league does Messi play in?" A: "The Spanish La Liga." Grounded facts: (Messi, occupation, football), (Messi, nationality, Spain)
- Q: "Whose son is Duan Yu?" A: "Duan Yanqing and Dao Baifeng." Grounded facts: (Duan Yu, biological father, Duan Yanqing), (Duan Yu, parents, Dao Baifeng)
Experimental Results Automatic evaluation (AE) Manual evaluation (ME)
Case Study
Conclusion
We propose an end-to-end system that generates natural answers by incorporating copying and retrieving mechanisms into sequence-to-sequence learning. The SUs in the generated answer may be predicted from the vocabulary, copied from the given question, or retrieved from the corresponding KB.
Thanks!
Email: shizhu.he@nlpr.ia.ac.cn