Towards Parsing Unrestricted Text into PropBank Predicate-Argument Structures
ACL4 Project NCLT Seminar Presentation, 7th June 2006
Conor Cafferkey


Project Overview
Open research problem:
● Integrating syntactic parsing and semantic role labeling (SRL)
Approach:
● Retraining a history-based generative lexicalized parser (Bikel, 2002)
● Semantically-enriched training corpus (Penn Treebank + PropBank-derived semantic role annotations)

Treebank Syntactic Bracketing Style

Semantic Roles
● Relationship that a syntactic constituent has with a predicate
● Predicate-argument relations
● PropBank (Palmer et al., 2005)

PropBank Predicate-Argument Relations
Frameset: hate.01
ARG0: experiencer
ARG1: target

PropBank Argument Types
● ARG0 - ARG5: arguments associated with a verb predicate, defined in the PropBank Frames scheme.
● ARGM-XXX: adjunct-like arguments of various sorts, where XXX is the type of the adjunct. Types include locative (LOC), temporal (TMP), manner (MNR), etc.
● ARGA: causative agents.
● rel: the verb of the proposition.
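The four label families listed on this slide can be told apart with a couple of patterns. The following is a minimal sketch, not part of the original project; the function name `classify_label` and the family names it returns are hypothetical and chosen only for illustration.

```python
import re

# Patterns for the label families named on the slide.
_NUMBERED = re.compile(r"^ARG[0-5]$")        # ARG0 - ARG5
_ADJUNCT = re.compile(r"^ARGM-([A-Z]+)$")    # ARGM-XXX, e.g. ARGM-TMP


def classify_label(label):
    """Return (family, detail) for a PropBank label string.

    The family names are illustrative assumptions, not PropBank terminology.
    """
    if label == "rel":
        return ("rel", None)
    if label == "ARGA":
        return ("causative_agent", None)
    if _NUMBERED.match(label):
        return ("numbered", int(label[3:]))
    match = _ADJUNCT.match(label)
    if match:
        return ("adjunct", match.group(1))
    return ("unknown", label)
```

For instance, `classify_label("ARGM-TMP")` yields the adjunct family together with the adjunct type `TMP`.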

Current Approaches
● Semantic role labeling (SRL) task:
  – Identify, given a verb:
    ● which nodes of the syntactic tree are arguments of that verb, and
    ● what semantic role each such argument plays with regard to the verb.

Current Approaches
● “Pipelined” approach
● Parsing → Pruning → ML techniques → post-processing
● CoNLL-2005 (Carreras and Màrquez, 2005)
  – SVM, Random Fields, Random Forests, …
  – Various lexical parameters

An Integrated Approach to Semantic Parsing
● Integrate syntactic and semantic parsing
● Retrain parser using semantically-enriched corpus (Treebank + PropBank-derived semantic roles)
● Parser itself performs semantic role labeling (SRL)

Project Components
● “Off-the-shelf”:
  – Parser (Bikel, 2002) emulating Collins’ (1999) model 2
  – Penn Treebank Release 2 (Marcus et al., 1993)
  – PropBank 1.0 (Palmer et al., 2005)
● Written for project (mainly in Python):
  – Scripts to annotate Treebank with PropBank data
  – Script to generate new head-finding rules for Bikel’s parser
  – SRL evaluation scripts
  – Utility scripts (pre-processing, etc.)

Appending Semantic Roles to Treebank Syntactic Category Labels
wsj/15/wsj_1568.mrg 16 2 gold hate.01 vn--a 0:1-ARG0 2:0-rel 3:1-ARG1
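The annotation line on this slide can be split into its fields before the roles are appended to category labels. The sketch below is one possible reading, not the project's actual script: the field names (`tagger`, `inflection`, and so on) are assumptions about the column layout, and `parse_prop_line` and `augment_label` are hypothetical helpers.

```python
from typing import List, NamedTuple, Tuple


class Proposition(NamedTuple):
    filename: str
    sentence: int            # sentence index within the file (assumed meaning)
    predicate_terminal: int  # terminal index of the verb (assumed meaning)
    tagger: str
    roleset: str
    inflection: str
    args: List[Tuple[str, str]]  # (node pointer, role label) pairs


def parse_prop_line(line):
    """Split one whitespace-separated annotation line into a Proposition."""
    fields = line.split()
    filename, sentence, terminal, tagger, roleset, inflection = fields[:6]
    # A pointer such as "0:1" never contains "-", so splitting on the first
    # hyphen keeps labels like "ARGM-TMP" intact.
    args = [tuple(field.split("-", 1)) for field in fields[6:]]
    return Proposition(filename, int(sentence), int(terminal),
                       tagger, roleset, inflection, args)


def augment_label(category, role):
    """Append a semantic role to a syntactic category label, e.g. NP-ARG0."""
    return f"{category}-{role}"


prop = parse_prop_line(
    "wsj/15/wsj_1568.mrg 16 2 gold hate.01 vn--a 0:1-ARG0 2:0-rel 3:1-ARG1"
)
```

Here `prop.args` holds the three pointer and role pairs from the slide, and `augment_label("NP", "ARG0")` produces the augmented label `NP-ARG0`.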

Syntactic Bracketing Evaluation
Parseval measures (Black et al., 1992)

Syntactic Bracketing Evaluation
● Harmonic mean of precision and recall: F = 2PR / (P + R)
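Labeled bracketing precision, recall and their harmonic mean can be computed from two collections of brackets. This is a minimal sketch for a single sentence, assuming each bracket is a `(label, start, end)` triple; it is not a substitute for evalb, which applies further normalizations, and the function name `parseval` is hypothetical.

```python
from collections import Counter


def parseval(gold, test):
    """Return (precision, recall, f_score) for labeled brackets.

    gold and test are iterables of (label, start, end) triples.
    """
    gold_counts, test_counts = Counter(gold), Counter(test)
    # Multiset intersection: a bracket is matched at most as often as it
    # occurs in both the gold and the test analysis.
    matched = sum((gold_counts & test_counts).values())
    precision = matched / sum(test_counts.values()) if test_counts else 0.0
    recall = matched / sum(gold_counts.values()) if gold_counts else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy brackets, invented for illustration only.
gold = [("S", 0, 5), ("NP", 0, 1), ("VP", 1, 5), ("NP", 2, 5)]
test = [("S", 0, 5), ("NP", 0, 1), ("VP", 1, 5), ("NP", 2, 4), ("PP", 4, 5)]
p, r, f = parseval(gold, test)
```

With these toy brackets three of the five test brackets match three of the four gold brackets.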

Baseline Syntactic Bracketing Performance
Parsing Section 00, trained with sections of Penn Treebank (1918 sentences)
Parse Time: 114:41

Semantically-Augmented Treebanks
● N: augment node labels with ARGNs only
● N-C: augment node labels with conflated ARGNs only
● M: augment node labels with ARGMs only
● M-C: augment node labels with conflated ARGMs only
● NMR: augment node labels with ARGNs, ARGMs and rels
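The five augmentation schemes can be sketched as a single labeling function. The slides do not define "conflated", so this sketch assumes it means collapsing ARG0 to ARG5 into one `ARGN` suffix and every ARGM-XXX into one `ARGM` suffix; that reading, the treatment of ARGA as out of scope, and the function name `augment` are all assumptions.

```python
import re

SCHEMES = {"N", "N-C", "M", "M-C", "NMR"}


def augment(category, role, scheme):
    """Return the node label for `category` under one augmentation scheme.

    Roles that the scheme does not cover leave the label unchanged.
    """
    if scheme not in SCHEMES:
        raise ValueError(f"unknown scheme: {scheme}")
    is_argn = re.fullmatch(r"ARG[0-5]", role) is not None
    is_argm = role.startswith("ARGM-")
    is_rel = role == "rel"

    if scheme == "N" and is_argn:
        suffix = role
    elif scheme == "N-C" and is_argn:
        suffix = "ARGN"   # assumed conflation of ARG0..ARG5
    elif scheme == "M" and is_argm:
        suffix = role
    elif scheme == "M-C" and is_argm:
        suffix = "ARGM"   # assumed conflation of all adjunct types
    elif scheme == "NMR" and (is_argn or is_argm or is_rel):
        suffix = role
    else:
        return category
    return f"{category}-{suffix}"
```

Under this reading, the conflated schemes trade role detail for a smaller label inventory, which may matter for a lexicalized parser trained on a fixed amount of data.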

Syntactic Bracketing Evaluation
Parsing Section 00, trained with sections of Penn Treebank (1918 sentences)

Semantic Evaluation

● Evaluating by terminal number and height
● Evaluating by terminal span
● How strictly to evaluate?
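The two ways of identifying an argument node can be related by resolving a `terminal:height` pointer to the span of terminals it covers. The sketch below assumes height 0 denotes the part-of-speech node directly above the word and each further step moves one node up; the toy sentence, the tree and the helper names are invented for illustration and are not taken from the Treebank.

```python
def parse_tree(text):
    """Parse a Penn-style bracketing into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])   # a word
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return node()


def index_nodes(tree):
    """Return chains, where chains[t][h] describes terminal t at height h."""
    chains = []

    def walk(node, ancestors):
        label, children = node
        info = {"label": label, "start": len(chains), "end": None}
        path = [info] + ancestors
        if len(children) == 1 and isinstance(children[0], str):
            chains.append(path)       # preterminal: one new terminal
        else:
            for child in children:
                walk(child, path)
        info["end"] = len(chains)     # exclusive end of the covered span

    walk(tree, [])
    return chains


def resolve(chains, pointer):
    """Map a "terminal:height" pointer to (label, start, end)."""
    terminal, height = (int(part) for part in pointer.split(":"))
    info = chains[terminal][height]
    return info["label"], info["start"], info["end"]


toy = "(S (NP (PRP She)) (ADVP (RB really)) (VP (VBZ hates) (NP (NNS mornings))))"
chains = index_nodes(parse_tree(toy))
```

Span-based comparison only checks the resulting `(start, end)` pair, so two nodes in a unary chain that cover the same words would count as the same argument, whereas comparison by terminal number and height distinguishes them. That difference is one way to read the "how strictly" question on the slide.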

Semantic Role Labeling Evaluation
Parsing Section 00, trained with sections of Penn Treebank (1918 sentences)

Syntactic Nodes that Play Multiple Semantic Roles

Adding More Information
● Co-index the semantic role labels with governing predicate (verb)
● i.e. include the appropriate roleset name in each semantic label augmentation
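Co-indexing can be pictured as adding the roleset name to each augmented label. The slides do not show the exact label format, so the `category-role-roleset` layout below is an assumption, as are the helper names and the toy annotations (the second roleset is included purely for illustration).

```python
def coindex(category, role, roleset):
    """Build a co-indexed label such as NP-ARG0-hate.01 (assumed format)."""
    return f"{category}-{role}-{roleset}"


def label_inventory(annotations, coindexed):
    """Collect the distinct node labels produced for some annotations.

    annotations is an iterable of (category, role, roleset) triples.
    """
    return {
        coindex(cat, role, roleset) if coindexed else f"{cat}-{role}"
        for cat, role, roleset in annotations
    }


# Toy annotations for two rolesets.
annotations = [
    ("NP", "ARG0", "hate.01"),
    ("NP", "ARG1", "hate.01"),
    ("NP", "ARG0", "love.01"),
    ("NP", "ARG1", "love.01"),
]
plain = label_inventory(annotations, coindexed=False)
indexed = label_inventory(annotations, coindexed=True)
```

Even in this toy case the co-indexed inventory is twice the size of the plain one, since every roleset contributes its own copy of each role label.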

Co-indexing the Semantic Augmentations

Adding More Information
● Data sparseness
● Time efficiency
● Need to make some sort of generalizations
● “Syntacto-semantic” verb classes
● VerbNet (Kipper et al., 2002)

Co-indexing with VerbNet classes

Future Ideas
● Integrate the (un-co-indexed) output from the re-trained parser into a pipelined SRL system
● Syntactic parsing informed by semantic roles?
  – Recoding the parser to take better advantage of the semantic roles
  – Reranking n-best parser outputs based on semantic roles

Summary
● Retrained a history-based generative lexicalized parser with semantically-enriched corpus
  – Corpus annotation
  – Generating head-finding rules
● Evaluated parser’s performance
  – Syntactic parsing (evalb)
  – Semantic parsing (SRL)

References
● Bikel, Daniel M. 2002. Design of a Multi-lingual, Parallel-processing Statistical Parsing Engine. In Proceedings of HLT2002, San Diego, California.
● Black, Ezra, Frederick Jelinek, John D. Lafferty, David M. Magerman, Robert L. Mercer and Salim Roukos. 1992. Towards History-based Grammars: Using Richer Models for Probabilistic Parsing. In Proceedings DARPA Speech and Natural Language Workshop, Harriman, New York. Morgan Kaufmann.
● Carreras, Xavier and Lluís Màrquez. 2005. Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling. In Proceedings of CoNLL-2005.
● Collins, Michael John. 1999. Head-driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania, Philadelphia.
● Kipper, Karin, Hoa Trang Dang and Martha Palmer. Class-Based Construction of a Verb Lexicon. In Proceedings of Seventeenth National Conference on Artificial Intelligence, Austin, Texas.
● Marcus, Mitchell P., Beatrice Santorini and Mary Ann Marcinkiewicz. 1993. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19(2).
● Palmer, Martha, Daniel Gildea and Paul Kingsbury. 2005. The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics, 31(1).
● Yi, Szu-ting and Martha Palmer. 2005. The integration of syntactic parsing and semantic role labeling. In Proceedings of CoNLL-2005.