ComplQA: Complex Question Answering over Knowledge Base
Yawei Sun, Lingling Zhang, Gong Cheng, and Yuzhong Qu
Outline: Introduction, Approach, Experiments, Conclusion
Introduction
Simple question: a single relation, e.g., Who is the wife of Barack Obama?
Complex question: multiple relations, e.g., What type of music featured on the album Epica was composed by Mozart?
Introduction - existing works and challenge
Siva (EMNLP17): transform the dependency tree into a logical form.
Abu (WWW17): recognize sub-questions from the dependency tree.
Zou (TKDE18): construct a query graph based on the dependency tree.
Challenge 1: parser error problem. All three start from a dependency parse of the question, so parser errors lead to ill-formed representations.
Introduction - existing works and challenge
Cheng (ACL17): structurally isomorphic.
Siva (EMNLP17): not isomorphic; two operations, contract and expand.
Zou (TKDE18): not isomorphic; spanning tree of a super graph.
Challenge 2: structure mismatch problem.
Introduction - our solution
To alleviate the structure mismatch problem: extend existing work with an ‘Insert Node’ operation.
To alleviate the parser error problem: propose the Span Tree.
Span tree example (figure), spans: What type was composed ? / by Mozart / of music / featured on the album Epica
Approach - pipeline
(1) Parsing: Question -> 1.1 Span Tree -> 1.2 Graph Generation -> Ungrounded Graph
(2) Grounding: Ungrounded Graph -> 2.1 Node Linking -> 2.2 Structure Mapping -> Grounded Graph
Example: What type of music featured on the album Epica was composed by Mozart?
Ungrounded Graph: node 1 (Class: what type of music), node 2 (Entity: Epica), node 3 (Entity: Mozart); edge 1-2 "featured album", edge 1-3 "composed".
Grounded Graph: node 1 (?music.genre), node 2 (m.010rcgyt), node 3 (m.082db); edge 1-2 music.album.genre, edge 1-3 music.artist.genre.
1. Parsing
p(ungrounded graph | question) = p(ungrounded graph | span tree) * p(span tree | question)
1.1 Span Tree: p(span tree | question)
1.2 Graph Generation: p(ungrounded graph | span tree)
1.1 Span Tree - p(span tree | question)
Definition: a skeleton with modifier spans (nesting allowed).
Algorithm 1. Seq2List
  Input: question; Output: span tree
  tokens = [token_1, token_2, ..., token_n]
  num_layer_list = [0, ..., 0]; epoch = 0
  While not finished do
    epoch += 1
    Predict redundancy_span
    Update(num_layer_list, redundancy_span, epoch)
    Shield redundancy_span in tokens
  Update(num_layer_list, skeleton, epoch + 1)
  Return List2Tree(num_layer_list, tokens)
Algorithm 2. List2Tree
  Input: num_layer_list, tokens; Output: span tree
  span_tree = tree()
  While i <= max(num_layer_list) do
    span_tree.add(current_span)
    span_tree.add(child_span)
    hd = find_headword(child_span, current_span)
    span_tree.addedge(current_span, child_span, hd)
  Return span_tree
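The Seq2List loop can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the learned span predictor is replaced by a hypothetical callable `predict_redundancy_span`, which is assumed to return a half-open `(start, end)` range over the still-visible tokens, or `None` when nothing is left to remove. `make_scripted_predictor` is likewise a made-up stand-in that replays a fixed list of phrases.

```python
def seq_to_list(tokens, predict_redundancy_span):
    """Assign each token the epoch in which it was removed; the skeleton gets epoch + 1."""
    num_layer_list = [0] * len(tokens)
    epoch = 0
    while True:
        # Tokens still at layer 0 have not been shielded yet.
        visible = [i for i, layer in enumerate(num_layer_list) if layer == 0]
        span = predict_redundancy_span([tokens[i] for i in visible])
        if span is None:
            break
        covered = visible[span[0]:span[1]]
        if not covered:  # guard against an empty or out-of-range prediction
            break
        epoch += 1
        for i in covered:
            num_layer_list[i] = epoch  # marking the layer also shields the token
    # Whatever is left is the skeleton.
    return [layer if layer else epoch + 1 for layer in num_layer_list]


def make_scripted_predictor(phrases):
    """Hypothetical stand-in for the learned model: replays the given phrases in order."""
    remaining = [phrase.split() for phrase in phrases]

    def predict(visible_tokens):
        if not remaining:
            return None
        phrase = remaining.pop(0)
        n = len(phrase)
        for start in range(len(visible_tokens) - n + 1):
            if visible_tokens[start:start + n] == phrase:
                return start, start + n
        return None

    return predict


tokens = "What type of music featured on the album Epica was composed by Mozart ?".split()
predictor = make_scripted_predictor(
    ["featured on the album Epica", "of music", "by Mozart"]
)
layers = seq_to_list(tokens, predictor)
print(layers)
```

Recomputing the visible tokens from the layer list on every pass is one simple way to realise the "Shield" step without mutating `tokens`.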
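1.1 Span Tree - example
Question: What type of music featured on the album Epica was composed by Mozart ?
Epoch 1 removes "featured on the album Epica"; epoch 2 removes "of music"; epoch 3 removes "by Mozart"; the remaining skeleton "What type was composed ?" gets layer 4.
num_layer_list = [4, 4, 2, 2, 1, 1, 1, 1, 1, 4, 4, 3, 3, 4]
Span tree spans: What type was composed ? / by Mozart / of music / featured on the album Epica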
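To make the layer list concrete, the sketch below groups the tokens by layer number and attaches each span to a parent span. The attachment rule is an assumption made for illustration (the slides do not state one): a span hangs off the nearest token to its left that has a higher layer number, falling back to the right and then to the skeleton. Head-word selection (`find_headword`) is left out, and `list_to_tree` is a hypothetical name.

```python
from collections import defaultdict


def list_to_tree(tokens, num_layer_list):
    """Return (spans, parent, root): span text per layer, parent layer per layer, skeleton layer."""
    if len(tokens) != len(num_layer_list):
        raise ValueError("tokens and num_layer_list must have the same length")

    # All tokens removed in the same epoch form one span.
    positions = defaultdict(list)
    for i, layer in enumerate(num_layer_list):
        positions[layer].append(i)
    spans = {layer: " ".join(tokens[i] for i in idxs) for layer, idxs in positions.items()}

    root = max(positions)  # the skeleton carries the highest layer number
    parent = {}
    for layer, idxs in positions.items():
        if layer == root:
            continue
        # Assumed rule: nearest higher-layer token to the left, else to the right, else the skeleton.
        left = next(
            (num_layer_list[j] for j in range(idxs[0] - 1, -1, -1) if num_layer_list[j] > layer),
            None,
        )
        right = next(
            (num_layer_list[j] for j in range(idxs[-1] + 1, len(tokens)) if num_layer_list[j] > layer),
            None,
        )
        parent[layer] = left if left is not None else (right if right is not None else root)
    return spans, parent, root


tokens = "What type of music featured on the album Epica was composed by Mozart ?".split()
num_layer_list = [4, 4, 2, 2, 1, 1, 1, 1, 1, 4, 4, 3, 3, 4]
spans, parent, root = list_to_tree(tokens, num_layer_list)
for layer in sorted(parent):
    print(f"{spans[layer]!r} -> {spans[parent[layer]]!r}")
```

Under this assumed rule, "of music" and "by Mozart" attach to the skeleton, and "featured on the album Epica" attaches to "of music".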
1.2 Graph Generation - p(ungrounded graph | span tree)
1.2.1 Node Recognition
1.2.2 Relation Extraction
Example: the span tree (What type was composed ? / by Mozart / of music / featured on the album Epica) yields the Ungrounded Graph with node 1 (Class: what type of music), node 2 (Entity: Epica), node 3 (Entity: Mozart), edge 1-2 "featured album" and edge 1-3 "composed".
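One way to hold the result of node recognition and relation extraction is a small graph container. The sketch below is illustrative only: the class and field names (`Node`, `UngroundedGraph`, `kind`, `mention`) are hypothetical, and only the node and edge labels come from the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    kind: str      # "class" or "entity"
    mention: str   # surface text recognised in the question


@dataclass
class UngroundedGraph:
    nodes: dict = field(default_factory=dict)  # node_id -> Node
    edges: list = field(default_factory=list)  # (source_id, target_id, relation phrase)

    def add_node(self, node_id, kind, mention):
        self.nodes[node_id] = Node(node_id, kind, mention)

    def add_edge(self, source_id, target_id, phrase):
        # Both endpoints must already exist, so edges never dangle.
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("add both endpoint nodes before adding the edge")
        self.edges.append((source_id, target_id, phrase))


graph = UngroundedGraph()
graph.add_node(1, "class", "what type of music")
graph.add_node(2, "entity", "Epica")
graph.add_node(3, "entity", "Mozart")
graph.add_edge(1, 2, "featured album")
graph.add_edge(1, 3, "composed")
print(graph.edges)
```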
2. Grounding
p(grounded graph | ungrounded graph)
2.1 Node Linking
2.2 Structure Mapping
2.1 Node Linking
Entity Linking: a dictionary-based method. E.g., Mozart -- m.082db
Class Linking: a word-embedding similarity method. E.g., what type of music -- music.genre
After linking: node 1 = ?music.genre, node 2 = m.010rcgyt, node 3 = m.082db
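The two linking methods can be illustrated with a toy sketch. Everything beyond the two entity ids and the class `music.genre` is an assumption: the dictionary contents, the stopword list, the candidate class list, and above all the 3-dimensional vectors, which are made-up numbers standing in for real word embeddings.

```python
import math

# Hypothetical surface-form dictionary; the two ids are the ones used in the example.
ENTITY_DICT = {"mozart": "m.082db", "epica": "m.010rcgyt"}

# Toy vectors invented for illustration; a real system would load trained embeddings.
TOY_VECTORS = {
    "type": [0.9, 0.1, 0.0],
    "music": [0.1, 0.9, 0.1],
    "genre": [0.8, 0.2, 0.1],
    "album": [0.1, 0.6, 0.7],
    "artist": [0.0, 0.3, 0.9],
}
STOPWORDS = {"what", "of", "the"}


def link_entity(mention):
    """Dictionary-based entity linking: exact lookup on the lowercased mention."""
    return ENTITY_DICT.get(mention.lower())


def embed(words):
    """Average the vectors of the known words; None if no word is known."""
    vectors = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vectors:
        return None
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_class(mention, candidates):
    """Pick the candidate class whose averaged word vector is closest to the mention."""
    query = embed([w for w in mention.lower().split() if w not in STOPWORDS])
    if query is None:
        return None

    def score(candidate):
        vector = embed(candidate.replace(".", " ").split())
        return cosine(query, vector) if vector else 0.0

    return max(candidates, key=score)


print(link_entity("Mozart"))
print(link_class("what type of music", ["music.genre", "music.album", "music.artist"]))
```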
2.2 Structure Mapping
2.2.1 Mapping: Method 1, Instance-Level; Method 2, Schema-Level
2.2.2 Path Match
2.2.3 Ranking: Maximum-entropy model
Example: ?music.genre (1) -- music.album.genre -- m.010rcgyt (2); ?music.genre (1) -- music.artist.genre -- m.082db (3)
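For the ranking step, a maximum-entropy model scores each candidate grounded graph with a weighted sum of features and normalises the scores with a softmax. The sketch below shows only that general form; the feature names (`path_match`, `link_score`), the weights, and the two candidates are hypothetical, since the slides do not list the features actually used.

```python
import math


def maxent_rank(candidates, weights):
    """Return [(name, probability)] sorted by probability, highest first.

    candidates: {name: {feature_name: value}}
    weights:    {feature_name: weight}
    """
    if not candidates:
        return []
    scores = {
        name: sum(weights.get(feature, 0.0) * value for feature, value in features.items())
        for name, features in candidates.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exp_scores = {name: math.exp(score - top) for name, score in scores.items()}
    z = sum(exp_scores.values())
    ranked = [(name, value / z) for name, value in exp_scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature values and weights, for illustration only.
candidates = {
    "candidate_a": {"path_match": 0.9, "link_score": 0.8},
    "candidate_b": {"path_match": 0.4, "link_score": 0.8},
}
weights = {"path_match": 2.0, "link_score": 1.0}
ranked = maxent_rank(candidates, weights)
print(ranked[0][0])
```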
2.2.1 Mapping
Two levels: Instance-Level and Schema-Level.
Insert Node (Mediator): m.010rcgyt --{}-- ?x, m.082db --{}-- ?x
Schema-Level Insert Node: m.082db --{}-- ?x
Figure nodes: ?music.genre (1), m.010rcgyt (2), m.082db (3)
2.2.2 Path Match
Experiments - GraphQuestions
Baselines       F1
SEMPRE 13       10.80
PARASEMPRE 14   12.79
JACANA 14       5.08
Siva 17         17.6
Cheng 17        17.02
Dong 17         20.4
ComplQA         26.16
Experiments - ComplexWebQuestions
Baselines                     Precision
SIMPQA + PRETRAINED           19.9
SPLITQA + PRETRAINED          25.9
MHQA-GRN                      30.1
SplitQA + data augmentation   34.2
Human                         63
ComplQA
Conclusion
Parsing - Challenge 1: parser error problem. Solution: Span Tree
Grounding - Challenge 2: structure mismatch problem. Solution: Insert Node
Thank you Q & A