Answer Mining by Combining Extraction Techniques with Abductive Reasoning Sanda Harabagiu, Dan Moldovan, Christine Clark, Mitchell Bowden, John Williams and Jeremy Bensley LCC TREC 2003 Question Answering Track
Abstract Information Extraction Technique: –Axiomatic knowledge derived from WordNet for justifying answers extracted from the AQUAINT text collection CICERO LITE: –Named entity recognizer –Precisely recognizes a large set of entities ranging over an extended set of semantic categories Theorem Prover: –Produces abductive justifications of the answers when it has access to the axiomatic transformations of the WordNet glosses
Introduction TREC-2003: Main task & Passage task Main task: –Factoids –Lists –Definitions Main_task_score = ½ * factoid_score + ¼ * list_score + ¼ * definition_score
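A minimal sketch of the weighted combination above; the component scores used in the example are arbitrary illustrative values, not results from the evaluation.

```python
# Weighted main-task score: 1/2 factoid + 1/4 list + 1/4 definition.
def main_task_score(factoid_score, list_score, definition_score):
    return 0.5 * factoid_score + 0.25 * list_score + 0.25 * definition_score

print(main_task_score(0.6, 0.4, 0.5))  # 0.525 (illustrative numbers only)
```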
Factoid questions: –Seek short, fact-based answers –Ex. ”What are pennies made of?”
List questions: –Requests a set of instances of specified types –Ex. “What grapes are used in making wine?” –Final answer set was created from the participants & assessors –IR = #instances judged correct and distinct / #answers in the final set –IP = #instances judged correct and distinct / #instances returned –F = (2 * IP * IR) / (IP + IR)
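A small sketch of the list-question metrics defined above; the counts in the example are illustrative only.

```python
# IR = instance recall, IP = instance precision, F = their harmonic mean.
def list_question_f(judged_correct_distinct, final_set_size, num_returned):
    ir = judged_correct_distinct / final_set_size   # instance recall
    ip = judged_correct_distinct / num_returned     # instance precision
    if ir + ip == 0:
        return 0.0
    return (2 * ip * ir) / (ip + ir)

print(list_question_f(judged_correct_distinct=4, final_set_size=8, num_returned=10))
# IR = 0.5, IP = 0.4, F ≈ 0.444
```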
Definition questions: –Assessor created a list of acceptable info nuggets, some of which are deemed essential –NR (Nugget Recall) = #essential nuggets returned in response / #essential nuggets –NP (Nugget Precision) Allowance = 100 * #essential and acceptable nuggets returned Length = total #non-white space characters in answer strings
Definition questions: –NP = 1, if length < allowance –NP = 1 – (length – allowance) / length, otherwise –F = (26 * NP * NR) / (25 * NP + NR)
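A sketch of the definition-question scoring defined in the two slides above: the allowance grants 100 non-whitespace characters per essential or acceptable nugget returned, and F weights recall five times as much as precision (hence the constants 26 and 25). The example values are illustrative, not taken from the evaluation.

```python
def definition_question_f(essential_returned, essential_total,
                          essential_and_acceptable_returned, answer_length):
    nr = essential_returned / essential_total                # nugget recall
    allowance = 100 * essential_and_acceptable_returned
    if answer_length < allowance:
        np_ = 1.0                                            # nugget precision
    else:
        np_ = 1.0 - (answer_length - allowance) / answer_length
    return (26 * np_ * nr) / (25 * np_ + nr)

print(definition_question_f(3, 5, 4, answer_length=500))
# NR = 0.6, allowance = 400, NP = 0.8, F ≈ 0.606
```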
TREC-2003: –Factoids: 413 –Lists: 37 –Definitions: 50
Answer type counts:
–Answers to Factoid questions: 383
–NIL answers to Factoid questions: 30
–Answer instances in the List final set: 549
–Essential nuggets for Definition questions: 207
–Total nuggets for Definition questions: 417
The Architecture of the QA System
Question Processing Factoid or List questions: –Identify the expected answer type, encoded either as a semantic class recognized by CICERO LITE or as a concept in a hierarchy of semantic concepts built from the WordNet hierarchies for verbs and nouns –Ex. “What American revolutionary general turned over West Point to the British?” The expected answer type is PERSON due to the noun general, found in the hierarchy of humans in WordNet
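An illustrative sketch of the WordNet-based check (not the paper's implementation, which relies on CICERO LITE and its own answer type hierarchy): the question term "general" is traced up the noun hierarchy to the person concept, mapping the question to the PERSON answer type. Assumes NLTK with the WordNet corpus installed.

```python
from nltk.corpus import wordnet as wn

def is_person_concept(noun):
    person = wn.synset('person.n.01')
    for synset in wn.synsets(noun, pos=wn.NOUN):
        for path in synset.hypernym_paths():
            if person in path:
                return True
    return False

print(is_person_concept('general'))  # True: the military-officer senses trace up to person
```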
Definition questions: –Parsed to detect the NPs and matched against a set of question patterns –Ex. “What is Iqra?” is matched against a question pattern that is associated with a corresponding answer pattern
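A hypothetical illustration of the pattern-matching idea; the actual question and answer patterns used by the system are not reproduced in these notes, so both patterns and the sample text below are made up for demonstration.

```python
import re

# Hypothetical question pattern: "What is <target>?"
QUESTION_PATTERN = re.compile(r"^What is (?P<target>[A-Z][\w .-]*)\?$")

# Hypothetical answer pattern: appositive of the form "<target>, a/an/the <definition>"
def answer_pattern(target):
    return re.compile(rf"{re.escape(target)},\s+(?:a|an|the)\s+(?P<definition>[^,.]+)")

question = "What is Iqra?"
target = QUESTION_PATTERN.match(question).group("target")
text = "Reports mentioned Iqra, a term defined here only for illustration."
match = answer_pattern(target).search(text)
if match:
    print(match.group("definition"))  # "term defined here only for illustration"
```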
Document Processing Retrieve relevant passages based on the keywords provided by question processing Factoid questions: –Ranks the candidate passages List questions: –Ranks passages with multiple occurrences of concepts of the expected answer type higher Definition questions: –Allows multiple matches of keywords
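A rough sketch of the ranking heuristics described above (an assumed simplification, not the system's actual ranking function): passages are scored by keyword overlap, and for list questions each additional mention of an expected-answer-type concept adds to the score.

```python
def score_passage(passage_tokens, keywords, answer_type_mentions, reward_repeats=False):
    score = sum(1 for kw in keywords if kw in passage_tokens)
    if reward_repeats:
        score += answer_type_mentions   # multiple instances favored for list questions
    elif answer_type_mentions > 0:
        score += 1                      # factoids only need one instance
    return score

passage = "cabernet sauvignon and merlot grapes are used in many red wines".split()
print(score_passage(passage, keywords={"grapes", "wines"}, answer_type_mentions=2,
                    reward_repeats=True))
```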
Answer Extraction Factoid: –Answers are first extracted based on the answer phrase provided by CICERO LITE –If the answer is not a named entity, it is justified abductively by using a theorem prover that makes use of axioms derived from WordNet –Ex. “What apostle was crucified?”
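Illustrative only: an off-the-shelf named entity recognizer (spaCy here) standing in for CICERO LITE to pull candidates of the expected answer type; answers not covered by the recognizer would instead go through the abductive justification step described above. Requires the en_core_web_sm model.

```python
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_candidates(passage, expected_label):
    doc = nlp(passage)
    return [ent.text for ent in doc.ents if ent.label_ == expected_label]

passage = "Greenland, which is a territory of Denmark, lies in the North Atlantic."
print(extract_candidates(passage, expected_label="GPE"))  # e.g. ['Greenland', 'Denmark']
```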
List: –Extracted by using the ranked set of extracted answers –Then determining a cutoff based on the semantic similarity of the answers
Definition –Relies on pattern matching
Extracting Answers for Factoid Questions 289 correct answers –234: identified by CICERO LITE or by recognizing the answer type in the Answer Type Hierarchy –65: due to the theorem prover reported in Moldovan et al. The role of the theorem prover is to boost precision by filtering out incorrect answers that are not supported by an abductive justification
Ex. “What country does Greenland belong to?” –Answered by “Greenland, which is a territory of Denmark” –The gloss of the synset {territory, dominion, province} is “a territorial possession controlled by a ruling state”
Ex. “What country does Greenland belong to?” –The logical transformation of this gloss: control:v#1(e,x1,x2) & country:n#1(x1) & ruling:a#1(x1) & possession:n#2(x2) & territorial:a#1(x2) –Refined expression: process:v#2(e,x1,x2) & COUNTRY:n#1(x1) & ruling:a#1(x1) & territory:n#2(x2)
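A highly simplified sketch of the justification step. The actual system runs a first-order theorem prover over full logical forms; this toy version only chains one hand-written, gloss-derived rule over propositional facts to connect the candidate answer with the COUNTRY answer type, so all predicate names here are assumptions made for illustration.

```python
FACTS = {("territory_of", "Greenland", "Denmark")}

# Gloss-derived rule (from "a territorial possession controlled by a ruling state",
# with "ruling state" refined to the COUNTRY answer type):
# territory_of(x, y) => country(y) & controls(y, x)
def apply_gloss_rule(facts):
    derived = set(facts)
    for (pred, x, y) in facts:
        if pred == "territory_of":
            derived.add(("country", y, None))
            derived.add(("controls", y, x))
    return derived

def justifies_country_answer(candidate):
    return ("country", candidate, None) in apply_gloss_rule(FACTS)

print(justifies_country_answer("Denmark"))  # True: the gloss axiom supports the answer
```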
Extracting Answers for Definition Questions 50 definition questions evaluated 207 essential nuggets 417 total nuggets 485 answers extracted by this system –Two runs: exact answers & corresponding sentence-type answers –Vital matches: 68 (exact) & 86 (sentence) from the 207 essential nuggets –110 (exact) & 114 (sentence) from the final set of 417 nuggets
38 patterns –23 patterns had at least one match for the tested questions
Extracting Answers for List Questions 37 list questions A threshold-based cutoff is applied to the extracted answers The threshold value is decided by using concept similarities between candidate answers
Given N list answers: –First compute the similarity between the first- and last-ranked answers –To compute the similarity of a pair of answers, consider a window of three noun or verb concepts to the left and to the right of each exact answer –Separate the concepts into nouns and verbs, obtaining the similarity formula:
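The exact similarity formula is not reproduced in these notes; the following sketch assumes a simple average of WordNet path similarities over the concepts found in the context windows of two candidate answers, using NLTK's WordNet interface.

```python
from itertools import product
from nltk.corpus import wordnet as wn

def window_similarity(context_words_a, context_words_b):
    # Take the first synset of each window word (an assumed simplification).
    synsets_a = [s for w in context_words_a for s in wn.synsets(w)[:1]]
    synsets_b = [s for w in context_words_b for s in wn.synsets(w)[:1]]
    scores = [sa.path_similarity(sb) or 0.0
              for sa, sb in product(synsets_a, synsets_b)]
    return sum(scores) / len(scores) if scores else 0.0

# Compare the context windows of the first- and last-ranked answers to pick a cutoff:
print(window_similarity(["grape", "wine", "vineyard"], ["grape", "harvest", "make"]))
```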
Performance Evaluation Two different runs: –Exact answers & whole sentence containing the answer
Conclusion The second submission scored slightly higher than the first Definition questions received a higher score: –Returning an entire sentence allowed more vital nuggets to be identified by the assessors Factoid questions in the main task scored slightly better than in the passage task –A passage might have contained multiple concepts similar to the answer, producing a vaguer evaluation context