Inducing Information Extraction Systems for New Languages via Cross-Language Projection Ellen Riloff University of Utah Charles Schafer, David Yarowsky.

Presentation transcript:

Inducing Information Extraction Systems for New Languages via Cross-Language Projection Ellen Riloff University of Utah Charles Schafer, David Yarowsky Johns Hopkins University Presented by Ben Wellington Right In Front of You

Outline Reasons for Research Information Extraction AutoSlog Projection Experiments/Results Conclusions

Current IE Systems Current Information Extraction (IE) systems are expensive –Parsing tools –Development texts –Specialized dictionaries Language and domain specific Not easily portable across languages

Resources Available Not all languages have equal resources. Annotated corpora and text analysis tools are easily available for English. For most of the world’s languages, these don’t exist.

Information Extraction Task Plane Crashes –Find victim, vehicle, location “AutoSlog-TS” –Gather statistics from a corpus of relevant texts (within the domain) and irrelevant ones (outside the domain) –Each pattern is a linguistic expression that can extract noun phrases from one of three syntactic positions: subject, direct object, object of a prepositional phrase. –“<subject> crashed”, “hijacked <direct object>”

AutoSlog Used AutoSlog-TS on AP news stories –About plane crashes –Not about plane crashes Creates a list of extraction patterns, ranked according to their association with the domain –A human reviews the list to decide which are useful –For new text, use “Sundance”, a shallow parser. –“<subject> crashed”, “hijacked <direct object>”

Cross-Language Projection Yarowsky 2001 developed cross-language –POS tagging –Base noun phrases –Named-entity tags –Morphological analysis. Atserias 1997 developed cross-language –Ontologies and WordNets

Mechanics of Projection Use Machine Translation tools to create an artificial parallel corpus –Adds error (-) –Frees cross-language projection research from existing bitexts, such as “Canadian Hansards” (+) –Can’t find “plane crash” bitexts readily. (+)

The Algorithm 1. Sentence-align the parallel corpus 2. Word-align the parallel corpus using the GIZA++ system 3. Transfer English IE annotations and noun-phrase boundaries 4. Train a stand-alone IE tagger on these projected annotations
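Step 3 of the algorithm above (transferring annotations across word alignments) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `project_tags`, the tag names, the example sentence pair, and the hand-written alignment are all hypothetical, and the tie-breaking rule (first non-outside tag wins) is an assumption, not the authors' method. In the real pipeline the alignment pairs would come from a word aligner such as GIZA++.

```python
from typing import List, Set, Tuple


def project_tags(
    src_tags: List[str],
    alignment: Set[Tuple[int, int]],
    tgt_len: int,
    outside: str = "O",
) -> List[str]:
    """Copy each source-token tag onto the target tokens aligned to it.

    alignment holds (source_index, target_index) pairs.
    Unaligned target tokens keep the `outside` tag.
    """
    projected = [outside] * tgt_len
    for src_i, tgt_j in sorted(alignment):
        tag = src_tags[src_i]
        # Assumption: the first non-outside tag to reach a target token wins.
        if tag != outside and projected[tgt_j] == outside:
            projected[tgt_j] = tag
    return projected


# Hypothetical English sentence with IE annotations, one tag per token.
english = ["The", "plane", "crashed", "in", "Peru"]
english_tags = ["O", "VEHICLE", "O", "O", "LOCATION"]

# Hypothetical French translation and hand-written word alignment.
french = ["L'", "avion", "s'est", "écrasé", "au", "Pérou"]
alignment = {(0, 0), (1, 1), (2, 2), (2, 3), (3, 4), (4, 5)}

french_tags = project_tags(english_tags, alignment, len(french))
print(list(zip(french, french_tags)))
```

The projected French tags would then serve as (noisy) training data for the stand-alone tagger in step 4.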

Transferred

Transformation Based Learning TBL was discussed earlier in the course. The tagger applies a number of transformations of the form ‘in context C, if a word is tagged A, change its tag to B’. The rule learner starts with the initial tagging (each word assigned its most common tag), tries all possible transformations, and selects the one which produces the greatest improvement (maximizes errors corrected - errors introduced).
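The greedy loop described above can be sketched in a few lines. This is a toy version under stated assumptions: it uses a single hypothetical rule template ("if the previous tag is P and the current tag is A, change it to B"), not the lexical, POS, subject-capture, or chaining templates from the slides, and the tag names and example data are made up.

```python
from typing import List, Optional, Tuple

Rule = Tuple[str, str, str]  # (previous tag, from tag, to tag)


def apply_rule(tags: List[str], rule: Rule) -> List[str]:
    """Apply one transformation; contexts are read from the input tagging."""
    prev, old, new = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i - 1] == prev and tags[i] == old:
            out[i] = new
    return out


def count_correct(tags: List[str], gold: List[str]) -> int:
    return sum(t == g for t, g in zip(tags, gold))


def best_rule(
    current: List[str], gold: List[str], tagset: List[str]
) -> Tuple[Optional[Rule], int]:
    """Try every rule the template allows and keep the one with the best net gain."""
    best, best_gain = None, 0
    before = count_correct(current, gold)
    for prev in tagset:
        for old in tagset:
            for new in tagset:
                if old == new:
                    continue
                rule = (prev, old, new)
                # Net gain = errors corrected - errors introduced.
                gain = count_correct(apply_rule(current, rule), gold) - before
                if gain > best_gain:
                    best, best_gain = rule, gain
    return best, best_gain


def learn(
    initial: List[str], gold: List[str], tagset: List[str], max_rules: int = 10
) -> Tuple[List[Rule], List[str]]:
    """Greedily learn an ordered rule list until no rule improves the tagging."""
    current, rules = list(initial), []
    for _ in range(max_rules):
        rule, _gain = best_rule(current, gold, tagset)
        if rule is None:
            break
        rules.append(rule)
        current = apply_rule(current, rule)
    return rules, current


tagset = ["DET", "NOUN", "VERB"]
gold = ["DET", "NOUN", "VERB", "DET", "NOUN"]
# Initial tagging: every ambiguous word gets its (hypothetical) most common tag.
initial = ["DET", "VERB", "VERB", "DET", "VERB"]

rules, final = learn(initial, gold, tagset)
print(rules, final)
```

At tagging time the learned rules are applied in the order they were learned, starting from the same kind of initial tagging.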

TBL Rules Lexical N-gram Rule Templates

TBL Rules Lexical +POS N-gram rule templates

TBL Rules Subject Capture Rule Templates (no parser to find subjects)

TBL Rules Chaining Rule Templates (no parser to find noun-phrase boundaries)

The Experiment. English and French AP news stories about plane crashes. The stories in the two languages are from different years. English (420,000 words), French (150,000 words). Hired 3 university students to mark location, vehicle and victim.

Now that we’re all in Agreement Annotator Agreement –Exact-word match –Exact-NP match –“Boeing 727” vs. “new Boeing 727” –16-31% for French, 24-27% for English –43-54% for French, 51-59% for English –(one French annotator marked 4.5 times as many locations as another)

Baseline, monolingual French and English.

TBL-Based IE Projection ~Equal to F-measure of monolingual English, ~9% lower than F-measure of monolingual French
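For reference, the F-measure used in these comparisons combines precision and recall. A minimal sketch of the standard computation is below; the slides do not say whether scoring was done per token or per noun phrase, or which weighting was used, so the balanced F1 default, the `(position, label)` representation, and the example annotations are all assumptions.

```python
from typing import Set, Tuple

Annotation = Tuple[int, str]  # (token position, role label); hypothetical format


def f_measure(
    predicted: Set[Annotation], gold: Set[Annotation], beta: float = 1.0
) -> Tuple[float, float, float]:
    """Return (precision, recall, F_beta) for two sets of annotations."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Made-up example: 4 gold annotations, 3 predictions, 2 of them correct.
gold_annotations = {(0, "VEHICLE"), (5, "LOCATION"), (9, "VICTIM"), (12, "VICTIM")}
system_annotations = {(0, "VEHICLE"), (5, "LOCATION"), (7, "VICTIM")}

precision, recall, f1 = f_measure(system_annotations, gold_annotations)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```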

Two’s Company… Two-step processes are better than three-step processes. –Mean F-measure drop was from 32.6 to 28.8 A collection of domain-specific English texts can be used to project and induce new IE systems, even without domain-specific texts in the foreign language

Strength in Numbers Used a voting system with the different techniques. Able to achieve F-measure of 48%, 3% higher than the strongest individual system. Only 4% lower than the French monolingual system.
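The slides do not describe how the voting worked. One simple possibility, shown here purely as an assumption, is a per-token strict majority vote over the tag sequences produced by the individual systems; the function name `vote` and the example outputs are hypothetical.

```python
from collections import Counter
from typing import List


def vote(predictions: List[List[str]], outside: str = "O") -> List[str]:
    """Combine several taggers' outputs token by token.

    A tag is kept only if a strict majority of systems propose it;
    otherwise the token falls back to the `outside` tag.
    """
    combined = []
    for token_tags in zip(*predictions):
        tag, count = Counter(token_tags).most_common(1)[0]
        combined.append(tag if count > len(token_tags) / 2 else outside)
    return combined


# Hypothetical outputs of three projection-based systems on one sentence.
system_a = ["O", "VEHICLE", "O", "LOCATION"]
system_b = ["O", "VEHICLE", "O", "O"]
system_c = ["VICTIM", "VEHICLE", "O", "LOCATION"]

combined_tags = vote([system_a, system_b, system_c])
print(combined_tags)
```

Requiring a strict majority trades some recall for precision; a weighted or plurality vote would be an equally plausible design.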

Conclusion Used IE systems for English to automatically derive IE systems for a second language. Very little human effort. French and English are relatively close though… Improve performance by improving English performance, or by minor specializations for the language.