1
Automatic Detection of Causal Relations for Question Answering
Roxana Girju, Baylor University, 2003
2
Contents: Background, Main Task, Result Analysis, Application in QA
3
Background Causation relations expressed in English
Explicit (marked by a keyword: cause, lead to, kill, dry, ...) vs. implicit (no keyword). Previous work: knowledge-based inferences.
4
Main Task A classifier based on machine learning
Input: a sentence matching <NP1, verb, NP2> → vectorization → classifier → output: YES or NO
5
Vectorization Training example: (entity_NP1, psychological-feature_NP1, abstraction_NP1, state_NP1, event_NP1, act_NP1, group_NP1, possession_NP1, phenomenon_NP1; verb; entity_NP2, psychological-feature_NP2, abstraction_NP2, state_NP2, event_NP2, act_NP2, group_NP2, possession_NP2, phenomenon_NP2; target)
6
Sentence: Earthquake generates Tsunami
Vector: <f, f, f, f, f, f, f, f, t, generate, f, f, f, f, f, t, f, f, f> The complete training example: <f, f, f, f, f, f, f, f, t, generate, f, f, f, f, f, t, f, f, f, YES>
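A minimal sketch of how such a vector might be built with NLTK's WordNet interface. The function names (`semantic_features`, `vectorize`) are illustrative assumptions, not the paper's code; note also that the paper used WordNet 1.7, where the nine classes were distinct top-level noun hierarchies, while modern WordNet 3.0 roots all nouns under entity, so this sketch approximates by checking whether each class name appears anywhere along the hypernym path.

```python
# Sketch of the <NP1, verb, NP2> -> 19-feature vectorization described above.
# Assumes NLTK with the WordNet corpus installed (nltk.download('wordnet')).
from nltk.corpus import wordnet as wn

# The nine WordNet noun hierarchies used as semantic features.
CLASSES = ["entity", "psychological_feature", "abstraction", "state",
           "event", "act", "group", "possession", "phenomenon"]

def semantic_features(noun):
    """Return nine t/f flags: does each class appear among the noun's hypernyms?"""
    synsets = wn.synsets(noun, pos=wn.NOUN)
    if not synsets:
        return ["f"] * len(CLASSES)
    # Take the first (most frequent) sense and collect its transitive hypernyms.
    names = {s.name().split(".")[0]
             for s in synsets[0].closure(lambda s: s.hypernyms())}
    names.add(synsets[0].name().split(".")[0])
    return ["t" if c in names else "f" for c in CLASSES]

def vectorize(np1, verb, np2, target=None):
    """Build the 19-element feature vector, appending the target if given."""
    vec = semantic_features(np1) + [verb] + semantic_features(np2)
    return vec + [target] if target is not None else vec

print(vectorize("earthquake", "generate", "tsunami", "YES"))
```

With WordNet 3.0 the exact flags may differ from the slide's vector (e.g. entity is true for every noun there), which is why the version caveat above matters.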
7
How to build the training set?
Step 1: find the sentences (where does the data come from?) Step 2: select features (how to vectorize them?)
8
Find the Sentences Step 1: find NP pairs that stand in a causation relationship
WordNet 1.7 contains 429 such NP pairs, the most frequent domain being medicine (about 58.28%).
9
Step 2: for each pair of causation nouns determined above, search the Internet and retain only sentences containing the pair. From these sentences, automatically determine all the patterns <NP1 verb/verb_expression NP2>.
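A rough sketch of this pattern-extraction step, assuming a simple regular-expression approach over already-retrieved sentences; the helper `extract_patterns` and the word-gap matching are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative sketch of Step 2: given a causation noun pair, pull out the
# candidate verb/verb expression linking the nouns in each sentence.
import re

def extract_patterns(sentences, np1, np2, max_gap_words=4):
    """Return candidate <NP1 verb_expression NP2> patterns found in sentences."""
    # NP1, then a short word sequence (the candidate verb expression), then NP2.
    pattern = re.compile(
        rf"\b{re.escape(np1)}\b\s+((?:\w+\s+){{0,{max_gap_words}}}?\w+)\s+\b{re.escape(np2)}\b",
        re.IGNORECASE,
    )
    results = []
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            results.append((np1, match.group(1), np2))
    return results

sentences = [
    "The earthquake generated a huge tsunami near the coast.",
    "An earthquake can lead to a tsunami.",
]
print(extract_patterns(sentences, "earthquake", "tsunami"))
```

The extracted middle spans still contain determiners and modifiers; a real system would parse the sentence to isolate the verb expression proper.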
10
Step 3: search the text collection and retain 120 sentences per verb
60 verbs × 120 sentences = 7,200 sentences (corpus A). Step 4: extract 6,523 relationships of the type <NP1 verb NP2> from these sentences; 2,101 are causal relations, while 4,422 are not (manually annotated).
11
Select Features Both lexical and semantic features
Lexical feature: the verb/verb expression. Semantic features: the 9 noun hierarchies in WordNet: entity, psychological feature, abstraction, state, event, act, group, possession, and phenomenon.
12
Training Algorithm C4.5 decision tree
Inductive bias: a preference for shorter trees that place high-information-gain attributes closer to the root
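C4.5 itself is a standalone program, but a comparable information-gain tree can be sketched with scikit-learn. Note this is an approximation, not C4.5 proper (CART with the entropy criterion uses no gain ratio and no native categorical splits), and the toy vectors plus the integer encoding of the verb feature are assumptions for illustration.

```python
# Sketch of training an information-gain decision tree on vectors like the
# ones above; criterion="entropy" approximates C4.5's splitting criterion.
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

# Toy training rows: 9 NP1 flags, verb, 9 NP2 flags (t/f as 1/0), target.
rows = [
    ([0,0,0,0,0,0,0,0,1], "generate", [0,0,0,0,0,1,0,0,0], "YES"),
    ([1,0,0,0,0,0,0,0,0], "produce",  [0,0,1,0,0,0,0,0,0], "NO"),
    ([0,0,0,1,0,0,0,0,0], "cause",    [0,0,0,0,1,0,0,0,0], "YES"),
    ([0,0,1,0,0,0,0,0,0], "follow",   [1,0,0,0,0,0,0,0,0], "NO"),
]
# Integer-encode the verb, since this tree cannot split on raw strings.
verbs = OrdinalEncoder().fit_transform([[r[1]] for r in rows])
X = [np1 + [int(v[0])] + np2 for (np1, _, np2, _), v in zip(rows, verbs)]
y = [r[3] for r in rows]

clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print(clf.predict(X))  # YES/NO prediction for each <NP1, verb, NP2> vector
```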
13
Result Analysis 683 relationships of the type <NP1 verb NP2> in corpus B; 102/(115+38) = 66.67%
14
Reasons for Errors Mostly that the causal pattern is highly ambiguous; incorrect parsing of noun phrases; the use of rules with lower accuracy (e.g., 63%); the lack of named entities in WordNet
15
Application in QA On 50 test questions, the QA system achieved 61% precision with the causation module and 36% precision without it.
16
Thanks!