A Machine Learning Approach to Coreference Resolution of Noun Phrases


1 A Machine Learning Approach to Coreference Resolution of Noun Phrases
2/24/2019

2 Outline
The notion of Coreference
A machine learning approach
Extraction of Markables
Extracted Features
Training Data
Classifier Construction
Testing
Result analysis

3 The notion of Coreference Definition
The grammatical relation between two words that have a common referent (WordNet).
In linguistics, coreference is the phenomenon where two expressions in an utterance both refer to the same thing (Wikipedia).
A coreference resolution process outputs pairs of coreferring noun phrases (coreferences).

4 The notion of Coreference Usage
Information retrieval
Question answering
Shallow parsing
And more…

5 The notion of Coreference Example
(Eastern Air)a1 Proposes (Date For Talks on ((Pay)c1-Cut)d1 Plan)b1. (Eastern Airlines)a2 executives notified (union)e1 leaders that the carrier wishes to discuss selective ((wage)c2 reductions)d2 on (Feb. 3)b2. ((Union)e2 representatives who could be reached)f1 said (they)f2 hadn’t decided whether (they)f3 would respond. By proposing (a meeting date)b3, (Eastern)a3 moved one step closer toward reopening current high-cost contract agreements with ((its)a4 unions)e3.
Markables that share a letter (a1, a2, a3, a4) refer to the same entity.

7 Extraction of Markables Preprocessing

9 Extracted Features
12 suggested features for markable pairs (i, j):
Distance (how far apart the two markables are)
i is a pronoun / j is a pronoun (he, him, himself, his…)
String match (the base strings match)
j is a definite noun phrase (the)
j is a demonstrative noun phrase (this, that, these, those)
Number agreement (i and j are both singular or both plural)

10 Extracted Features cont.
12 suggested features for markable pairs (i, j):
Semantic class agreement (i and j are of the same WordNet class)
Gender agreement (i and j are of the same gender)
Both proper names (i and j are both proper names)
Alias (i and j are aliases of each other, e.g. two written forms of the same date, such as "1st jan")
Apposition (j is in apposition to i, e.g. "Mubarak, Egypt's president")
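A few of the pairwise features listed above can be sketched in code. This is a minimal illustration, not the original system: the `Markable` class, the word lists, and the `base_string` rule (lowercase, drop articles and demonstratives) are assumptions made for the example, and distance is assumed to be counted in sentences.

```python
from dataclasses import dataclass

# Small illustrative word lists (assumed, not taken from the slides).
PRONOUNS = {"he", "him", "himself", "his", "she", "her", "herself",
            "it", "its", "itself", "they", "them", "their", "themselves"}
DEMONSTRATIVES = {"this", "that", "these", "those"}
DETERMINERS = {"a", "an", "the"} | DEMONSTRATIVES


@dataclass
class Markable:
    text: str       # surface string of the noun phrase
    sentence: int   # index of the sentence containing it
    plural: bool    # grammatical number, assumed to be given


def base_string(text: str) -> str:
    """Lowercase the phrase and drop articles and demonstratives."""
    return " ".join(w for w in text.lower().split() if w not in DETERMINERS)


def pair_features(i: Markable, j: Markable) -> dict:
    """Compute a subset of the pairwise features for antecedent i and anaphor j."""
    first_word_j = j.text.lower().split()[0]
    return {
        "distance": j.sentence - i.sentence,
        "i_pronoun": i.text.lower() in PRONOUNS,
        "j_pronoun": j.text.lower() in PRONOUNS,
        "str_match": base_string(i.text) == base_string(j.text),
        "def_np": first_word_j == "the",
        "dem_np": first_word_j in DEMONSTRATIVES,
        "number": i.plural == j.plural,
    }
```

For instance, `pair_features(Markable("Ms. Washington", 0, False), Markable("She", 1, False))` marks j as a pronoun at distance 1 with matching number and no string match. The remaining features (semantic class, gender, alias, apposition) need lexical resources or parsing and are left out of this sketch.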

11 Extracted Features Example

13 Training Data
MUC-6/7 conference corpora
Creating positive examples
Creating negative examples
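The slide names the two steps without giving the rule, so the sketch below assumes one common scheme: a positive example is a markable paired with its closest preceding markable in the same annotated chain, and every markable lying between them is paired with the later one as a negative example. The function name and the data layout (markable ids in document order, chains as lists of ids) are hypothetical.

```python
def make_training_pairs(markables, chains):
    """Build (antecedent, anaphor) training pairs.

    markables: list of markable ids in document order.
    chains:    list of coreference chains, each a list of markable ids.
    Returns (positives, negatives) as lists of id pairs.
    """
    position = {m: k for k, m in enumerate(markables)}
    positives, negatives = [], []
    for chain in chains:
        ordered = sorted(chain, key=position.__getitem__)
        # Adjacent members of a chain form the positive pairs.
        for antecedent, anaphor in zip(ordered, ordered[1:]):
            positives.append((antecedent, anaphor))
            # Markables between the two form negative pairs with the anaphor.
            start, end = position[antecedent] + 1, position[anaphor]
            for between in markables[start:end]:
                negatives.append((between, anaphor))
    return positives, negatives
```

With markables `["a1", "b1", "c1", "a2"]` and the single chain `["a1", "a2"]`, this yields one positive pair `("a1", "a2")` and two negative pairs, `("b1", "a2")` and `("c1", "a2")`.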

15 Classifier Construction
Classifier types: neural network, SVM, KNN, decision tree (selected).
Decision tree structure: each node of the tree is a question about one of the features. The answer determines which branch to follow. When a leaf is reached, its label is returned.
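The traversal described above can be sketched as follows. The `Node` and `Leaf` classes and the toy tree are illustrative assumptions; the toy tree is hand-built over three boolean features and is not the tree learned in the presented work.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: bool  # True means "coreferent"


@dataclass
class Node:
    feature: str        # name of the boolean feature asked about at this node
    if_true: "Tree"     # subtree followed when the feature is true
    if_false: "Tree"    # subtree followed when the feature is false


Tree = Union[Leaf, Node]


def classify(tree: Tree, features: dict) -> bool:
    """Follow the answers from the root until a leaf is reached; return its label."""
    while isinstance(tree, Node):
        tree = tree.if_true if features[tree.feature] else tree.if_false
    return tree.label


# Hypothetical hand-built tree: string match first, then alias, then apposition.
TOY_TREE = Node("str_match", Leaf(True),
                Node("alias", Leaf(True),
                     Node("appositive", Leaf(True), Leaf(False))))
```

Calling `classify(TOY_TREE, {"str_match": False, "alias": True, "appositive": False})` takes the false branch at the root, the true branch at the alias node, and returns `True`.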

17 Testing
After a classifier is built, it is tested on a pre-annotated example, and the results are compared with the “true” annotation. The measures are recall (how many of the real coreferences were returned) and precision (how many of the returned coreferences are true ones).
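The two measures can be sketched over sets of coreferent pairs. This is a simplification under stated assumptions: it counts exact pair matches, whereas the official MUC evaluation uses its own link-based scoring, and the balanced F-measure formula is assumed to be the one behind the F-measure figures in the result slides.

```python
def precision_recall(predicted, gold):
    """Compare predicted coreferent pairs with the annotated ("true") pairs.

    Both arguments are iterables of (antecedent_id, anaphor_id) tuples.
    Returns (precision, recall); each is 0.0 when its denominator is empty.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def f_measure(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

If a system returns three pairs and two of them are among four annotated pairs, precision is 2/3 and recall is 2/4.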

18 Testing Example
(Ms. Washington)73's candidacy is being championed by (several powerful lawmakers)74 including ((her)76 boss)75, (Chairman John Dingell)77 (D., (Mich.)78) of (the House Energy and Commerce Committee)79. (She)80 currently is (a counsel)81 to (the committee)82. (Ms. Washington)83 and (Mr. Dingell)84 have been considered (allies)85 of (the (securities)87 exchanges)86, while (banks)88 and ((futures)90 exchanges)89 have often fought with (them)91.

19 Testing Example Classification

21 Result analysis Decision Tree

22 Result analysis Recall & Precision

23 Result analysis misconceptions
The decision tree shows that only 8 of the features are actually used.
With only 3 features (alias, apposition, string match), the F-measure scores were only 1-2.3% worse than with all of them, so only 3 features really contribute.

24 Result analysis misconceptions – cont.
66.3% of the positive results followed the path of the first tree node: string matching.
70% of the total precision problems are caused by string matching. For example:
Directors also approved the election of Allan Laufgraben, 54 years old, as president and (chief executive officer)1 and Peter A. Left, 43, as chief operating officer. Milton Petrie, 90-year-old chairman, president and (chief executive officer)2 since the company was founded in 1932, will continue as chairman.
Here the two marked phrases match as strings but describe different people (Laufgraben and Petrie).

25 Result analysis conclusions
According to the authors, the great achievement is that a learning method over “shallow features” achieves the same performance as state-of-the-art systems.
A HUGE majority of the results (and errors) is determined by 1-3 features.
Learning over such a small number of features isn’t really learning, so the achievement does not look like one. At least not to me.

