Generating Natural Answers by Incorporating Copying and Retrieving Mechanisms in Sequence-to-Sequence Learning. Shizhu He, Cao Liu, Kang Liu and Jun Zhao.

Presentation transcript:

Generating Natural Answers by Incorporating Copying and Retrieving Mechanisms in Sequence-to-Sequence Learning. Shizhu He, Cao Liu, Kang Liu and Jun Zhao. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences.

Outline: Task Introduction, Our Proposed Approach, Experiments

Natural Question Answering
Question answering (QA) is devoted to providing exact answers, often in the form of phrases and entities. It focuses on analyzing questions, retrieving facts from text snippets or KBs, and predicting the answers through ranking and reasoning.
In real-world environments, people prefer the correct answer to be delivered in a more natural way: for the question "How tall is Jet Li?", Siri will reply with the natural answer "Jet Li is 1.64m in height." rather than only answering the entity "1.64m".

Natural Question Answering

Natural-QA vs. Chatbot
Chatbots are able to generate natural replies, but they can only produce fluent responses for 'chatting-type' messages; they are often unable to answer factual questions, which requires interacting with external KBs. Consider, for example, the question "Where was Tagore born?"

Natural Question Answering
The system first needs to recognize the topic entity, then extract multiple related facts, and finally generate the natural answer, as sketched below.
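To make the three steps concrete, here is a toy, hedged sketch of the pipeline. The function names and the tiny KB are invented placeholders, and the template answer stands in for the neural generator introduced later in the talk.

```python
def recognize_topic_entity(question, kb):
    # Hypothetical step 1: pick the KB entity whose name appears in the question.
    return next(e for e in kb if e in question)

def retrieve_related_facts(kb, entity):
    # Hypothetical step 2: collect all (predicate, object) facts stored for the topic entity.
    return kb[entity]

def generate_natural_answer(entity, facts):
    # Hypothetical step 3: a trivial template stands in for the neural decoder described later.
    return f"{entity} was born in {dict(facts)['birthplace']}."

kb = {"Tagore": [("birthplace", "Calcutta"), ("occupation", "poet")]}
question = "Where was Tagore born?"
entity = recognize_topic_entity(question, kb)
print(generate_natural_answer(entity, retrieve_related_facts(kb, entity)))
# -> "Tagore was born in Calcutta."
```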

Outline: Task Introduction, Our Proposed Approach, Experiments

Generating natural answers: engineering methods
Natural Language Processing (NLP) tools: POS tagging, parsing, etc.
Pattern engineering: "where was %entity from?" is answered with "%entity was born in %birthplace, %pronoun is %nationality citizen."
Weaknesses: high costs of manually annotating training data and patterns, and low coverage that cannot flexibly deal with the variable linguistic phenomena of different domains.

Generating natural answers: end-to-end manner
This paradigm tries to handle question answering in an end-to-end framework: the complicated QA process, including analyzing the question, retrieving relevant facts from the KB, and generating a correct, coherent, natural answer, can be resolved jointly.
Challenges: the words in a natural answer may be generated in different ways, and we even need to deal with some morphological variants (e.g. "Singapore" in the KB but "Singaporean" in the answer).
Knowledge-driven vs. data-driven: KBs are symbolic systems, while end-to-end models are numerical systems.

Generating natural answers: end-to-end manner
Words in a natural answer may be generated in different ways: the common words are usually predicted with a (conditional) language model, the major entities/phrases are selected from the source question, and the answering entities/phrases are retrieved from the corresponding KB.

Related Work
GenQA (Yin et al., 2016) is able to retrieve facts from KBs with neural models, but it cannot copy SUs (semantic units) from the question when generating answers, and it cannot deal with complex questions that need to utilize multiple facts.

Related Work
CopyNet (Gu et al., 2016) is able to copy words from the original source when generating the target, but it cannot retrieve SUs from external memory (e.g. KBs, texts, etc.).

seq2seq: a vocabulary (e.g. 30,000 words), an encoding RNN and a decoding RNN.
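For reference, the vanilla encoder-decoder the slide alludes to can be written in a few lines of PyTorch. This is an illustrative sketch with invented sizes and names, not the authors' implementation.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal encoder-decoder over a fixed shortlist vocabulary (illustrative only)."""
    def __init__(self, vocab_size=30000, emb_dim=128, hid_dim=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)  # encoding RNN
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)  # decoding RNN
        self.out = nn.Linear(hid_dim, vocab_size)                  # predict next word from vocabulary

    def forward(self, src_ids, tgt_ids):
        # Encode the question into a summary state.
        _, enc_state = self.encoder(self.emb(src_ids))
        # Decode the answer conditioned on that state (teacher forcing).
        dec_out, _ = self.decoder(self.emb(tgt_ids), enc_state)
        return self.out(dec_out)  # logits over the 30,000-word shortlist

# Usage: logits = Seq2Seq()(torch.randint(0, 30000, (2, 7)), torch.randint(0, 30000, (2, 9)))
```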

Outline: Task Introduction, Our Proposed Approach, Experiments

COREQA: Incorporating Copying and Retrieving Mechanisms in QA
To generate natural answers for information-inquiring questions, the model must recognize the key topics in the question, extract related facts from the KB, and generate the natural answer from the question and the retrieved facts. To generate a coherent reply, it fuses this instance-level knowledge with some global-level "smooth" and "glue" words.

COREQA Incorporating Copying and Retrieving Mechanisms in QA

Encoder
Question encoding (Bi-RNN): the forward and backward passes yield $\{\overrightarrow{\mathbf{h}}_1, \ldots, \overrightarrow{\mathbf{h}}_{L_X}\}$ and $\{\overleftarrow{\mathbf{h}}_{L_X}, \ldots, \overleftarrow{\mathbf{h}}_1\}$; the question memory is $\mathbf{M}_Q = \{\mathbf{h}_t\}$ with $\mathbf{h}_t = [\overrightarrow{\mathbf{h}}_t, \overleftarrow{\mathbf{h}}_{L_X - t + 1}]$, and the question summary is $\mathbf{q} = [\overrightarrow{\mathbf{h}}_{L_X}, \overleftarrow{\mathbf{h}}_1]$.
Knowledge base encoding (memory network): each fact $(s, p, o)$ is encoded as $\mathbf{f} = [\mathbf{s}, \mathbf{p}, \mathbf{o}]$, giving $\mathbf{M}_{KB} = \{\mathbf{f}_i\} = \{\mathbf{f}_1, \ldots, \mathbf{f}_{L_F}\}$.
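A possible reading of this slide in code, with our own (assumed) module names and dimensions: a bidirectional GRU yields the question memory M_Q and summary q, and each fact (s, p, o) is embedded and concatenated into f = [s, p, o] to build M_KB. A sketch, not the released model.

```python
import torch
import torch.nn as nn

class COREQAEncoder(nn.Module):
    """Sketch of question + KB encoding (names and sizes are illustrative assumptions)."""
    def __init__(self, vocab_size, kb_vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.kb_emb = nn.Embedding(kb_vocab_size, emb_dim)  # embeds subjects/predicates/objects
        self.birnn = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)

    def encode_question(self, q_ids):
        # M_Q: one forward+backward state per question word; q: concatenated final states.
        states, h_n = self.birnn(self.word_emb(q_ids))  # states: (B, L_X, 2*hid)
        q = torch.cat([h_n[0], h_n[1]], dim=-1)         # q = [h_fwd_{L_X}, h_bwd_1]
        return states, q                                # (M_Q, q)

    def encode_kb(self, fact_ids):
        # fact_ids: (B, L_F, 3) holding (s, p, o) ids; f_i = [s, p, o].
        s, p, o = self.kb_emb(fact_ids).unbind(dim=2)   # each (B, L_F, emb_dim)
        return torch.cat([s, p, o], dim=-1)             # M_KB: (B, L_F, 3*emb_dim)
```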

Decoder
Answer word prediction: predicts SUs based on a mixed probabilistic model of three modes, namely the predict-mode, the copy-mode and the retrieve-mode.
State update: $\mathbf{s}_t = f(y_{t-1}, \mathbf{s}_{t-1}, \mathbf{c}_t)$, where $y_{t-1}$ is represented as $[\mathbf{e}(y_{t-1}); \mathbf{r}^{q}_{t-1}; \mathbf{r}^{kb}_{t-1}]$.
Reading the short-term memories $\mathbf{M}_Q$ and $\mathbf{M}_{KB}$ with attentive models and accumulated attentive vectors. A minimal sketch of one decoding step follows.
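One way to render the state update in code, a hedged sketch with assumed names that uses plain attentive reads rather than the accumulated attention the slide mentions: the previous word is represented as [e(y_{t-1}); r^q_{t-1}; r^{kb}_{t-1}] before the GRU update.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderStep(nn.Module):
    """One decoding step: attentive reads of M_Q / M_KB, then a GRU state update (sketch)."""
    def __init__(self, emb_dim, hid_dim, q_dim, kb_dim):
        super().__init__()
        self.attn_q = nn.Linear(hid_dim, q_dim)
        self.attn_kb = nn.Linear(hid_dim, kb_dim)
        self.cell = nn.GRUCell(emb_dim + q_dim + kb_dim, hid_dim)

    def read(self, s_prev, memory, proj):
        # Simple bilinear attention: score each memory slot against the decoder state.
        scores = torch.bmm(memory, proj(s_prev).unsqueeze(-1)).squeeze(-1)  # (B, L)
        alpha = F.softmax(scores, dim=-1)
        return torch.bmm(alpha.unsqueeze(1), memory).squeeze(1)             # weighted read vector

    def forward(self, s_prev, y_prev_emb, M_Q, M_KB):
        r_q = self.read(s_prev, M_Q, self.attn_q)        # r^q_{t-1}: attentive read of the question
        r_kb = self.read(s_prev, M_KB, self.attn_kb)     # r^kb_{t-1}: attentive read of the KB facts
        x_t = torch.cat([y_prev_emb, r_q, r_kb], dim=-1) # [e(y_{t-1}); r^q_{t-1}; r^kb_{t-1}]
        return self.cell(x_t, s_prev)                    # s_t = GRU(x_t, s_{t-1})
```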

Answer word prediction
There are three correlated output layers: the shortlist prediction layer, the question location copying layer and the candidate-facts location retrieving layer. We adopt the instance-specific vocabulary $V \cup X_Q \cup X_{KB}$. The probabilistic function for generating any target SU $y_t$ is a "mixture" model:
$p(y_t \mid \mathbf{s}_t, y_{t-1}, \mathbf{M}_Q, \mathbf{M}_{KB}) = p_{pr}(y_t \mid \mathbf{s}_t, y_{t-1}, \mathbf{c}_t) \cdot p_m(pr \mid \mathbf{s}_t, y_{t-1}) + p_{co}(y_t \mid \mathbf{s}_t, y_{t-1}, \mathbf{M}_Q) \cdot p_m(co \mid \mathbf{s}_t, y_{t-1}) + p_{re}(y_t \mid \mathbf{s}_t, y_{t-1}, \mathbf{M}_{KB}) \cdot p_m(re \mid \mathbf{s}_t, y_{t-1})$

Answer word prediction
Predict-mode: $p_{pr}(y_t \mid \cdot) = \frac{1}{Z}\, e^{\varphi_{pr}(y_t)}$
Copy-mode: $p_{co}(y_t \mid \cdot) = \frac{1}{Z} \sum_{j: Q_j = y_t} e^{\varphi_{co}(y_t)}$
Retrieve-mode: $p_{re}(y_t \mid \cdot) = \frac{1}{Z} \sum_{j: f_j = y_t} e^{\varphi_{re}(y_t)}$
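A toy numerical sketch of the mixture: each mode scores its own candidate set (shortlist, question words, KB objects), a gate distributes mass over {pr, co, re}, and probabilities of the same surface word are pooled. All scores below are invented, and each mode is normalized separately for brevity, which may differ from the shared 1/Z in the equations above.

```python
import numpy as np

def mixture_prob(phi_pr, phi_co, phi_re, mode_logits, shortlist, question, kb_objects):
    """p(y_t) = sum over modes of p_mode(y_t) * p_m(mode); scores for identical words are pooled."""
    p_mode = np.exp(mode_logits) / np.exp(mode_logits).sum()  # gate over {pr, co, re}
    scores = {}
    for words, phis, m in [(shortlist, phi_pr, 0), (question, phi_co, 1), (kb_objects, phi_re, 2)]:
        exp = np.exp(phis)
        dist = exp / exp.sum()                                 # normalize within the mode
        for w, p in zip(words, dist):
            scores[w] = scores.get(w, 0.0) + p_mode[m] * p     # mix the modes and pool duplicates
    return scores

# Toy example (all scores are invented): "1.64m" can only be retrieved from the KB,
# while "Jet" / "Li" can be copied from the question and glue words come from the shortlist.
probs = mixture_prob(
    phi_pr=[2.0, 1.0, 0.5], phi_co=[1.5, 1.2], phi_re=[3.0],
    mode_logits=[0.2, 0.1, 0.7],
    shortlist=["is", "in", "height"], question=["Jet", "Li"], kb_objects=["1.64m"],
)
print(max(probs, key=probs.get))
```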

Outline: Task Introduction, Our Proposed Approach, Experiments

Natural QA in a Restricted Domain
The QA systems need to answer questions involving four concrete properties: birthdate (including year, month and day) and gender. There are plenty of QA patterns which focus on different aspects of the birthdate; for example, "What year were you born?" touches on the year, while "When is your birthday?" touches on the month and day.

Experimental Results

For unseen entities
We construct 2,000 new person entities and their corresponding facts about the four known properties, and obtain Q-A pairs by matching the sampled patterns (a data-generation sketch follows).
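A hedged sketch of how such Q-A pairs could be produced by instantiating question/answer patterns with a person's four properties. The pattern strings, field names, and the example person below are invented for illustration, not the paper's actual pattern set.

```python
import random

# Hypothetical patterns in the style of the slide's "What year were you born?" examples.
PATTERNS = [
    ("What year was %name born?", "%name was born in %year."),
    ("When is %name's birthday?", "%name's birthday is %month %day."),
    ("Is %name male or female?", "%name is %gender."),
]

def make_qa_pairs(person, n=2):
    """Sample n patterns and fill in the person's properties (illustrative only)."""
    pairs = []
    for q_pat, a_pat in random.sample(PATTERNS, n):
        q, a = q_pat, a_pat
        for key, value in person.items():
            q = q.replace(f"%{key}", str(value))
            a = a.replace(f"%{key}", str(value))
        pairs.append((q, a))
    return pairs

person = {"name": "Alice Chen", "year": 1990, "month": "May", "day": 3, "gender": "female"}
print(make_qa_pairs(person))
```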

Natural QA in the Open Domain
Dataset: CQA question-answer pairs, with Q-A pairs grounded to KB facts via ILP (Yin et al., 2016). Example groundings (question, answer, grounded triples):
Q: 敢死队主演有谁? (Who are the leading actors of The Expendables?) A: 李连杰 史泰龙 阿诺施瓦辛格 (Jet Li, Sylvester Stallone, Arnold Schwarzenegger) Triples: (敢死队, 编剧, 史泰龙), (敢死队, 主演, 李连杰) [(The Expendables, screenwriter, Stallone), (The Expendables, starring, Jet Li)]
Q: 梅西在什么级联赛 (In which league does Messi play?) A: 西班牙足球甲级联赛 (La Liga, the Spanish top-flight football league) Triples: (梅西, 职业, 足球), (梅西, 国籍, 西班牙) [(Messi, occupation, football), (Messi, nationality, Spain)]
Q: 段誉是谁的儿子 (Whose son is Duan Yu?) A: 段延庆和刀白凤 (Duan Yanqing and Dao Baifeng) Triples: (段誉, 生父, 段延庆), (段誉, 父母, 刀白凤) [(Duan Yu, biological father, Duan Yanqing), (Duan Yu, parent, Dao Baifeng)]

Experimental Results Automatic evaluation (AE) Manual evaluation (ME)

Case Study

Conclusion
We propose an end-to-end system that generates natural answers by incorporating copying and retrieving mechanisms in sequence-to-sequence learning. The SUs in the generated answer may be predicted from the vocabulary, copied from the given question, or retrieved from the corresponding KB.

Thanks! Email: shizhu.he@nlpr.ia.ac.cn