Automatic Detection of Causal Relations for Question Answering


1 Automatic Detection of Causal Relations for Question Answering
Roxana Girju, Baylor University, 2003

2 Contents
Background, Main Task, Result Analysis, Application in QA

3 Background
Causation relations expressed in English can be explicit (cause, lead to, kill, dry, …) or implicit (no keyword). Previous work relied on knowledge-based inferences.

4 Main Task
A machine-learning classifier. Input: a sentence containing a <NP1, verb, NP2> pattern, turned into a feature vector (vectorization). Output: YES or NO (causal or not).
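In sketch form (hypothetical names, not the paper's code; the vectorize helper and the trained model are filled in on the following slides):

def is_causal(np1, verb, np2, model):
    """Main task: map a <NP1, verb, NP2> triple to YES or NO."""
    features = vectorize(np1, verb, np2)   # feature format: see slide 5
    return "YES" if model.predict([features])[0] == "YES" else "NO"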

5 Vectorization
Training example format: (entity_NP1, psychological-feature_NP1, abstraction_NP1, state_NP1, event_NP1, act_NP1, group_NP1, possession_NP1, phenomenon_NP1; verb; entity_NP2, psychological-feature_NP2, abstraction_NP2, state_NP2, event_NP2, act_NP2, group_NP2, possession_NP2, phenomenon_NP2; target)

6 Sentence: Earthquake generates Tsunami
Vector: <f, f, f, f, f, f, f, f, t, generate, f, f, f, f, f, t, f, f, f>. The complete training example: <f, f, f, f, f, f, f, f, t, generate, f, f, f, f, f, t, f, f, f, YES>.
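A minimal sketch of this vectorization, assuming NLTK's WordNet interface (the paper used WordNet 1.7; NLTK ships a newer WordNet, so hierarchy membership may differ slightly from the slide's example):

from nltk.corpus import wordnet as wn

# The nine WordNet top noun hierarchies used as semantic features.
TOP_CLASSES = ["entity", "psychological_feature", "abstraction", "state",
               "event", "act", "group", "possession", "phenomenon"]

def semantic_features(noun):
    """Nine booleans: does the noun fall under each top hierarchy?"""
    hypernyms = set()
    for synset in wn.synsets(noun, pos=wn.NOUN):
        for path in synset.hypernym_paths():
            hypernyms.update(s.name().split(".")[0] for s in path)
    return [cls in hypernyms for cls in TOP_CLASSES]

def vectorize(np1, verb, np2):
    """Build the 19-element vector: 9 NP1 flags, the verb, 9 NP2 flags."""
    return semantic_features(np1) + [verb] + semantic_features(np2)

# The slide's example: "Earthquake generates tsunami."
print(vectorize("earthquake", "generate", "tsunami"))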

7 How to build the training set?
Step 1: find the sentences (where does the data come from?). Step 2: select the features (how to vectorize?).

8 Find the Sentences
Step 1: find NP pairs that stand in a causation relationship. WordNet 1.7 contains 429 such NP pairs, the most frequent domain being medicine (about 58.28%).

9 Step 2: For each pair of causation nouns determined above, search the Internet and retain only the sentences containing the pair. From these sentences, automatically determine all the patterns <NP1 verb/verb_expression NP2>.
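A rough sketch of this pattern extraction, assuming NLTK's POS tagger and a simple regular-expression NP chunker (the slide does not detail the actual extraction procedure):

import nltk

# Simple NP chunker: optional determiner, adjectives, then nouns.
chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")

def extract_triples(sentence):
    """Yield (NP1, verb, NP2) triples from an NP - verb - NP sequence."""
    tokens = nltk.pos_tag(nltk.word_tokenize(sentence))
    items = []
    for node in chunker.parse(tokens):
        if isinstance(node, nltk.Tree):        # an NP chunk
            items.append(("NP", " ".join(w for w, _ in node.leaves())))
        elif node[1].startswith("VB"):         # a verb token
            items.append(("V", node[0]))
    for (t1, np1), (t2, v), (t3, np2) in zip(items, items[1:], items[2:]):
        if (t1, t2, t3) == ("NP", "V", "NP"):
            yield np1, v, np2

print(list(extract_triples("The earthquake generated a huge tsunami.")))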

10 Step 3: Search the text collection, retaining 120 sentences per verb; with 60 verbs, 60 × 120 = 7200 sentences (corpus A).
Step 4: Extract 6523 relationships of the type <NP1 verb NP2> from these sentences; 2101 are causal relations and 4422 are not (manually annotated).

11 Select Features Both lexical and semantic features
Lexical features: the verb/verb_expression. Semantic features: the nine top noun hierarchies in WordNet: entity, psychological feature, abstraction, state, event, act, group, possession, and phenomenon.

12 Training Algorithm C4.5 decision tree
Inductive bias: a preference for shorter trees that place high-information-gain attributes closer to the root.
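A minimal training sketch on toy data. scikit-learn has no C4.5 implementation, so criterion="entropy" (information gain) on its CART trees serves as a stand-in for C4.5's gain-ratio splits; the verb is one-hot encoded by hand so all features are numeric:

from sklearn.tree import DecisionTreeClassifier

# Toy rows in the slide's format: 9 NP1 flags, verb, 9 NP2 flags; label.
rows = [
    ([0,0,0,0,0,0,0,0,1], "generate", [0,0,0,0,0,1,0,0,0], "YES"),
    ([1,0,0,0,0,0,0,0,0], "see",      [1,0,0,0,0,0,0,0,0], "NO"),
]
verbs = sorted({verb for _, verb, _, _ in rows})

def encode(np1_flags, verb, np2_flags):
    """Numeric feature vector: NP1 flags, one-hot verb, NP2 flags."""
    return np1_flags + [int(verb == v) for v in verbs] + np2_flags

X = [encode(a, v, b) for a, v, b, _ in rows]
y = [label for _, _, _, label in rows]

clf = DecisionTreeClassifier(criterion="entropy").fit(X, y)
print(clf.predict([encode([0,0,0,0,0,0,0,0,1], "generate",
                          [0,0,0,0,0,1,0,0,0])]))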

13 Result Analysis
Tested on 683 relationships of the type <NP1 verb NP2> in corpus B; 102 / (115 + 38) = 66.67%.

14 Reasons for Errors
Mostly the high ambiguity of the causal pattern; also incorrect parsing of noun phrases, the use of rules with lower accuracy (e.g., 63%), and the lack of named entities in WordNet.

15 Application in QA
On 50 test questions: 61% precision for the QA system with the causation module, versus 36% precision without the module.

16 Thanks!

