Slide 1 of 20
Increasing the coverage of answer extraction by applying anaphora resolution
Tabu-dag June
Jori Mur
Humanities Computing, University of Groningen
Slide 2 of 20: Research Question
  To what extent could anaphora resolution help to improve the coverage of answer extraction?
Slide 3 of 20: Outline
  Background
    Question Answering (QA)
    Off-line answer extraction
  Anaphora resolution for answer extraction
    Anaphora resolution technique for definite nouns
    Anaphora resolution technique for pronouns
  Experiment and Results
  Conclusion
Slide 4 of 20: Question Answering (QA)
  Task: find an answer in a text collection to a question posed in natural language.
  Question: Hoe oud is Ivanisevic? (How old is Ivanisevic?)
  Answer: 23 jaar (23 years)
  Question: Hoe begon Diana Ross haar professionele carrière? (How did Diana Ross begin her professional career?)
  Answer: als voorzangeres van 'The Supremes' (as lead singer of 'The Supremes')
  Question: Wanneer werd Hillary Clinton geboren? (When was Hillary Clinton born?)
  Answer: 26 oktober 1947 (26 October 1947)
Slide 5 of 20: Techniques for QA
  Based on Information Retrieval
    Use keywords to find relevant paragraphs
    Use NLP techniques to extract and rank answers
  Off-line extraction before the QA process starts
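The keyword step of the retrieval-based approach can be sketched roughly as follows. This is a minimal illustration and not the method used in the talk: the overlap score, the `tokenize` and `rank_paragraphs` helpers, and the tiny stopword list are all assumptions made for the example.

```python
import re
from typing import List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def rank_paragraphs(question: str, paragraphs: List[str], stopwords: Set[str]) -> List[str]:
    """Return the paragraphs that share at least one keyword with the question, best first."""
    keywords = set(tokenize(question)) - stopwords
    # Score each paragraph by the number of question keywords it contains.
    scored = [(len(keywords & set(tokenize(p))), p) for p in paragraphs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored if score > 0]


# Hypothetical stopword list, hand-picked for this example only.
STOPWORDS = {"hoe", "is", "de", "van"}

paragraphs = [
    "Bowersox was piloot van de missie.",
    "De 23-jarige Ivanisevic...",
]
ranked = rank_paragraphs("Hoe oud is Ivanisevic?", paragraphs, STOPWORDS)
```

A real system would then apply NLP techniques to the retrieved paragraphs to extract and rank candidate answers; that step is not shown here.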
Slide 6 of 20: Off-line answer extraction
  Use a dependency parser to parse the corpus
  Define dependency patterns
    [Location Name] has [Number] inhabitants
  Match the dependency relations of each sentence in the text against the dependency patterns
  Extract and save the facts
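The pattern-matching step can be sketched as below, assuming a parsed sentence is available as a list of (head, relation, dependent) triples. The `Dep` type, the relation labels `subj`, `obj` and `num`, and the example sentence with its number are invented for illustration; they are not the output format of any particular parser.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Dep(NamedTuple):
    head: str  # head word of the relation
    rel: str   # relation label (hypothetical label set)
    dep: str   # dependent word


def extract_inhabitants(deps: List[Dep]) -> Optional[Tuple[str, str]]:
    """Match '[Location Name] has [Number] inhabitants' against one parsed sentence."""
    subject = next((d.dep for d in deps if d.head == "has" and d.rel == "subj"), None)
    has_object = any(
        d.head == "has" and d.rel == "obj" and d.dep == "inhabitants" for d in deps
    )
    number = next(
        (d.dep for d in deps if d.head == "inhabitants" and d.rel == "num"), None
    )
    if subject and has_object and number:
        return subject, number
    return None


# Hand-written parse of the invented sentence "Exampletown has 12345 inhabitants".
sentence = [
    Dep("has", "subj", "Exampletown"),
    Dep("has", "obj", "inhabitants"),
    Dep("inhabitants", "num", "12345"),
]

# Facts are extracted once, off-line, and stored for lookup at question time.
fact_table: Dict[str, str] = {}
fact = extract_inhabitants(sentence)
if fact is not None:
    location, count = fact
    fact_table[location] = count
```

At question time, a question of the Inhabitants type then becomes a lookup in `fact_table` rather than a search through the corpus.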
Slide 7 of 20: Example
  Question: Hoe oud is Ivanisevic? (How old is Ivanisevic?)
  Text (NH): De 23-jarige Ivanisevic... (The 23-year-old Ivanisevic...)
  Answer: 23 jaar (23 years)
Slide 8 of 20: Problem
  Question: Hoe oud is Ivanisevic? (How old is Ivanisevic?)
  Text (AD): Ivanisevic blesseerde zich in mei vorig jaar aan de rechterknie. [...] De knieproblemen bleven de 23-jarige Kroaat achtervolgen. (Ivanisevic injured his right knee in May last year. [...] The knee problems continued to plague the 23-year-old Croat.)
Slide 9 of 20: Anaphora resolution for definite nouns
  Modify patterns to match definite nouns
    [Definite noun] has [Number] inhabitants
  Create an instance list using predicate and apposition relations
  Select the first preceding name and check whether it occurs together with the noun in the instance list
  Fallback: select the first preceding name
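One possible reading of this procedure is sketched below. The slide leaves the exact search order open, so the backwards walk over all preceding names is an assumption, as are the function name, the data layout, and the second name "Jansen", which is invented for the example.

```python
from typing import List, Optional, Set, Tuple


def resolve_definite_noun(
    noun: str,
    preceding_names: List[str],
    instances: Set[Tuple[str, str]],
) -> Optional[str]:
    """Pick an antecedent name for a definite noun such as 'Kroaat'.

    preceding_names holds the names seen before the noun, in text order,
    so the nearest preceding name is the last element.
    """
    # Prefer the nearest preceding name that the instance list pairs with the noun.
    for name in reversed(preceding_names):
        if (name, noun) in instances:
            return name
    # Fallback: the nearest preceding name, if there is one.
    return preceding_names[-1] if preceding_names else None


# Instance list of (name, noun) pairs, as might be collected from
# predicate and apposition relations in the corpus.
instances = {("Ivanisevic", "Kroaat")}

antecedent = resolve_definite_noun("Kroaat", ["Ivanisevic", "Jansen"], instances)
fallback = resolve_definite_noun("Kroaat", ["Ivanisevic", "Jansen"], set())
```

With the antecedent resolved, "de 23-jarige Kroaat" can yield the same Age fact for Ivanisevic that "de 23-jarige Ivanisevic" would.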
Slide 10 of 20: Anaphora resolution for pronouns
  Modify patterns to match pronouns
    [Pronoun] has [Number] inhabitants
  Create lists of boys' and girls' names (from a baby-names site on the internet)
  Select the first preceding name and check that it does not occur on the list for the opposite sex of the pronoun
  Fallback: select the first preceding name
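The name-list check can be sketched as follows. This is an interpretation of the slide, not the original implementation: the pronoun table covers only singular "hij" and "zij" and ignores plural readings, the name lists are tiny stand-ins for the lists collected from the web, and "Jan Jansen" is an invented name.

```python
from typing import List, Optional, Set

# Simplified mapping from Dutch pronoun to sex (assumption: singular use only).
PRONOUN_SEX = {"hij": "male", "zij": "female"}

# Tiny stand-ins for the boys' and girls' name lists.
BOYS: Set[str] = {"Jan"}
GIRLS: Set[str] = {"Hillary", "Diana"}


def resolve_pronoun(
    pronoun: str,
    preceding_names: List[str],
    boys: Set[str],
    girls: Set[str],
) -> Optional[str]:
    """Pick an antecedent name for a pronoun.

    preceding_names holds the names seen before the pronoun, in text order,
    so the nearest preceding name is the last element.
    """
    sex = PRONOUN_SEX.get(pronoun.lower())
    if sex == "male":
        opposite = girls
    elif sex == "female":
        opposite = boys
    else:
        opposite = set()  # unknown pronoun: no name is ruled out
    # Prefer the nearest name whose first name is not on the opposite-sex list.
    for name in reversed(preceding_names):
        if name.split()[0] not in opposite:
            return name
    # Fallback: the nearest preceding name, if there is one.
    return preceding_names[-1] if preceding_names else None


names = ["Hillary Clinton", "Jan Jansen"]
she = resolve_pronoun("zij", names, BOYS, GIRLS)
he = resolve_pronoun("hij", names, BOYS, GIRLS)
```

Names that appear on neither list, such as bare surnames, pass the check, so the filter only rules out clear mismatches.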
Slide 11 of 20: Research Question
  To what extent could anaphora resolution help to improve the coverage of answer extraction?
Slide 12 of 20: Experiment
  12 question types
    Age
    Date of Birth
    Location of Birth
    Capital
    Date of Death
    Location of Death
    Manner/Cause of Death
    Age of Death
    Founded
    Function
    Inhabitants
    Winner
Slide 13 of 20: Experiment
  CLEF corpus for Dutch: two newspapers (Algemeen Dagblad and NRC Handelsblad), 1994 and 1995
  Basic predefined dependency patterns and patterns for anaphora resolution
  200 Dutch questions from CLEF-2005
  QA system: Joost
Slide 14 of 20: Results for extraction
  Around 10,900 additional fact types extracted
Slide 15 of 20: Results for QA
  200 questions from the CLEF-2005 data set
Slide 16 of 20: Discussion of Results
  Hypothesis 1: the selection of question types is too limited
  Hypothesis 2: answers to questions occur in a single sentence
Slide 18 of 20: Answer in one sentence
  Question 107: Wie was piloot van de missie die de astronomische satelliet, de Hubble Space Telescope, repareerde? (Who was the pilot of the mission that repaired the astronomical satellite, the Hubble Space Telescope?)
  Text (AD): Bowersox was piloot van de missie die de astronomische satelliet, de Hubble Space Telescope, repareerde. (Bowersox was the pilot of the mission that repaired the astronomical satellite, the Hubble Space Telescope.)
Slide 19 of 20: Conclusion
  Anaphora resolution is one way to improve the coverage of answer extraction
  Modest improvement in QA performance
  Still to be investigated: what happens when anaphora resolution is applied to a broader range of question types
Slide 20 of 20: Questions?