A Machine Learning Approach to Coreference Resolution of Noun Phrases

Presentation transcript:

A Machine Learning Approach to Coreference Resolution of Noun Phrases 2/24/2019

Outline
- The notion of Coreference
- A Machine learning approach
- Extraction of Markables
- Extracted Features
- Training Data
- Classifier Construction
- Testing
- Result analysis

The notion of Coreference: Definition
- The grammatical relation between two words that have a common referent (WordNet).
- In linguistics, coreference is the phenomenon where two expressions in an utterance both refer to the same thing (Wikipedia).
- A coreference resolution process outputs pairs of coreferring noun phrases (coreferences).

The notion of Coreference: Usage
- Information Retrieval
- Question answering
- Shallow parsing
- And more…

The notion of Coreference: Example
(Eastern Air)a1 Proposes (Date For Talks on ((Pay)c1-Cut)d1 Plan)b1. (Eastern Airlines)a2 executives notified (union)e1 leaders that the carrier wishes to discuss selective ((wage)c2 reductions)d2 on (Feb. 3)b2. ((Union)e2 representatives who could be reached)f1 said (they)f2 hadn’t decided whether (they)f3 would respond. By proposing (a meeting date)b3, (Eastern)a3 moved one step closer toward reopening current high-cost contract agreements with ((its)a4 unions)e3.

Extraction of Markables: Preprocessing

Extracted Features
12 suggested features for markable pairs:
- Distance (how far apart the two markables are, in sentences)
- i/j is a Pronoun (he, him, himself, his…)
- String match (the base strings match)
- j is a Definite noun phrase (the)
- j is a Demonstrative noun phrase (this, that, these, those)
- Number agreement (i and j are both plural or both singular)

Extracted Features cont.
12 suggested features for markable pairs:
- Semantic class agreement (i and j are of the same WordNet class)
- Gender agreement (i and j are of the same gender)
- Both proper names (i and j are both proper names)
- Alias (i and j refer to the same thing under different names, e.g. 1st jan and 01.01 for dates)
- Apposition (j is in apposition to i, e.g. Mubarak, Egypt's president)
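The features listed on the first features slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Markable` class, the word lists, the `plural` flag (assumed to come from preprocessing) and the sample noun phrases are all invented here, and only seven of the twelve features are covered.

```python
from dataclasses import dataclass

# Hypothetical word lists, for illustration only.
PRONOUNS = {"he", "him", "himself", "his", "she", "her", "herself",
            "it", "its", "they", "them", "their"}
DEMONSTRATIVES = ("this ", "that ", "these ", "those ")
DETERMINERS = {"a", "an", "the", "this", "that", "these", "those"}


@dataclass
class Markable:
    text: str      # surface string of the noun phrase
    sentence: int  # index of the sentence that contains it
    plural: bool   # assumed to be supplied by preprocessing


def base_string(m: Markable) -> str:
    # Lowercase and drop leading determiners before comparing strings.
    words = m.text.lower().split()
    while words and words[0] in DETERMINERS:
        words.pop(0)
    return " ".join(words)


def pair_features(i: Markable, j: Markable) -> dict:
    """Features for a candidate antecedent i and a later markable j."""
    j_lower = j.text.lower()
    return {
        "distance": j.sentence - i.sentence,
        "i_pronoun": i.text.lower() in PRONOUNS,
        "j_pronoun": j_lower in PRONOUNS,
        "string_match": base_string(i) == base_string(j),
        "definite_np": j_lower.startswith("the "),
        "demonstrative_np": j_lower.startswith(DEMONSTRATIVES),
        "number_agreement": i.plural == j.plural,
    }


# Invented example pair.
i = Markable("a meeting date", sentence=3, plural=False)
j = Markable("the meeting date", sentence=4, plural=False)
features = pair_features(i, j)
print(features)
```

The remaining features (semantic class, gender, proper name, alias, apposition) need lexical resources such as WordNet or a named-entity recognizer, so they are left out of this sketch.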

Extracted Features: Example

Training Data
- MUC-6/7 conference corpora
- Creating positive examples
- Creating negative examples
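The slide does not spell out how the examples are created. A minimal sketch, assuming the scheme described in the Soon, Ng and Lim paper: each markable paired with its closest preceding markable in the same chain is a positive example, and every markable lying between the two, paired with the later one, is a negative example. The function name, the markable ids and the chain labels below are made up for illustration.

```python
def training_pairs(markables, chain_of):
    """Build (antecedent, anaphor) training pairs.

    markables: markable ids in document order.
    chain_of:  maps a markable id to the label of its coreference chain.
    """
    positives, negatives = [], []
    for pos_j, j in enumerate(markables):
        chain = chain_of.get(j)
        if chain is None:
            continue
        # Walk backwards to the closest preceding markable in the same chain.
        for pos_i in range(pos_j - 1, -1, -1):
            if chain_of.get(markables[pos_i]) == chain:
                positives.append((markables[pos_i], j))
                # Every markable in between yields a negative pair with j.
                negatives.extend((m, j) for m in markables[pos_i + 1:pos_j])
                break
    return positives, negatives


# Toy document: the first letter of each id names its chain.
markables = ["a1", "b1", "a2", "e1", "b2"]
chain_of = {m: m[0] for m in markables}
positives, negatives = training_pairs(markables, chain_of)
print(positives)
print(negatives)
```

Markables with no earlier member of their chain (here a1, b1 and e1) produce no pairs of their own in this sketch.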

Classifier Construction
- Classifier types: neural network, SVM, KNN, Decision tree (selected)
- Decision tree structure: each node of the tree is a question about one of the features. The answer determines which path is followed. When a leaf is reached, its label is returned.
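The traversal described above can be sketched as follows. The tree here is hand-built for illustration and is not the tree learned in the paper; the `Node` and `Leaf` classes and the `classify` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: bool  # True means "coreferent"


@dataclass
class Node:
    feature: str                    # the feature this node asks about
    if_true: Union["Node", Leaf]    # branch taken when the feature holds
    if_false: Union["Node", Leaf]   # branch taken otherwise


def classify(tree, features):
    # Follow the answers from the root until a leaf is reached.
    node = tree
    while isinstance(node, Node):
        node = node.if_true if features[node.feature] else node.if_false
    return node.label


# Toy tree: any one of the three features is enough for "coreferent".
tree = Node("string_match", Leaf(True),
            Node("alias", Leaf(True),
                 Node("apposition", Leaf(True), Leaf(False))))

print(classify(tree, {"string_match": False, "alias": True, "apposition": False}))
print(classify(tree, {"string_match": False, "alias": False, "apposition": False}))
```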

Testing
After a classifier is built, it is tested against a pre-annotated example. The results are then compared with the “true” annotation. The measures are Recall (how many of the real coreferences were returned) and Precision (how many of the returned coreferences are true ones).
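The two measures can be sketched over sets of markable pairs. This is a simplification that follows the wording of the slide; it is not the official MUC scoring program, and the pair ids below are invented.

```python
def precision_recall(returned, gold):
    """Pairwise precision and recall over sets of coreference pairs."""
    returned, gold = set(returned), set(gold)
    correct = len(returned & gold)
    # Precision: share of returned pairs that are true coreferences.
    precision = correct / len(returned) if returned else 0.0
    # Recall: share of true coreferences that were returned.
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Made-up annotation and system output.
gold = {("m1", "m2"), ("m2", "m4"), ("m3", "m5"), ("m5", "m6")}
returned = {("m1", "m2"), ("m3", "m5"), ("m4", "m6")}
precision, recall = precision_recall(returned, gold)
print(precision, recall)
```

Here two of the three returned pairs are correct, and two of the four annotated pairs were found.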

Testing: Example
(Ms. Washington)73's candidacy is being championed by (several powerful lawmakers)74 including ((her)76 boss)75, (Chairman John Dingell)77 (D., (Mich.)78) of (the House Energy and Commerce Committee)79. (She)80 currently is (a counsel)81 to (the committee)82. (Ms. Washington)83 and (Mr. Dingell)84 have been considered (allies)85 of (the (securities)87 exchanges)86, while (banks)88 and ((futures)90 exchanges)89 have often fought with (them)91.

Testing: Example Classification

Result analysis: Decision Tree

Result analysis: Recall & Precision

Result analysis: misconceptions
- The decision tree shows that only 8 features are actually used.
- When trained with only 3 features (alias, apposition, string match), the scores (F-measure) were only 1-2.3% worse than with all of them, so only 3 features really contribute.
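The F-measure referred to above combines precision and recall into a single score. A minimal sketch, assuming the balanced form (the harmonic mean of the two); the input values are invented and are not results from the paper.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented values, for illustration only.
score = f_measure(0.75, 0.6)
print(round(score, 4))
```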

Result analysis: misconceptions cont.
- 66.3% of the positive results followed the path of the first tree node (string matching).
- 70% of the total precision problems are caused by string matching, as in this example, where the two marked phrases match as strings but refer to different people:
Directors also approved the election of Allan Laufgraben, 54 years old, as president and (chief executive officer)1 and Peter A. Left, 43, as chief operating officer. Milton Petrie, 90-year-old chairman, president and (chief executive officer)2 since the company was founded in 1932, will continue as chairman.

Result analysis: conclusions
- According to the authors, the great achievement is that a learning method over “shallow features” achieves performance comparable to that of state-of-the-art systems.
- A HUGE majority of the results (and errors) is determined by 1-3 features.
- Learning over such a small number of features isn’t really learning, so the achievement does not look like one, at least not to me.