Extracting LTAGs from Treebanks Fei Xia 04/26/07

Q1: How does grammar extraction work?

Two types of elementary tree in LTAG
Initial tree (anchored by draft): (S NP↓ (VP (V draft) NP↓))
Auxiliary tree (anchored by still): (VP (ADVP (ADV still)) VP*)
Arguments and adjuncts are in different types of elementary trees: arguments appear as substitution sites (↓) in initial trees, while adjuncts sit in auxiliary trees, whose foot node (*) carries the same label as the root.
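
To make the two tree types concrete, here is a minimal Python sketch (not part of the original slides); the Node class and its field names are illustrative:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str                      # syntactic category, e.g. "VP"
    children: List["Node"] = field(default_factory=list)
    anchor: Optional[str] = None    # lexical anchor, e.g. "draft"
    subst: bool = False             # substitution site, written NP↓
    foot: bool = False              # foot node, written VP*

# Initial tree anchored by "draft": (S NP↓ (VP (V draft) NP↓))
draft_tree = Node("S", [
    Node("NP", subst=True),
    Node("VP", [Node("V", anchor="draft"), Node("NP", subst=True)]),
])

# Auxiliary tree anchored by "still": (VP (ADVP (ADV still)) VP*)
still_tree = Node("VP", [
    Node("ADVP", [Node("ADV", anchor="still")]),
    Node("VP", foot=True),
])
```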

Adjoining operation: an auxiliary tree with root Y and foot node Y* is spliced into a node labeled Y in another tree, and the subtree originally below that node is reattached under the foot node.
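
Adjoining itself can be sketched over the same Node class; the find_foot and adjoin helpers below are hypothetical, minimal versions of the operation:

```python
import copy

def find_foot(node: Node) -> Optional[Node]:
    """Return the foot node of an auxiliary tree, if any."""
    if node.foot:
        return node
    for child in node.children:
        hit = find_foot(child)
        if hit is not None:
            return hit
    return None

def adjoin(target: Node, aux: Node) -> None:
    """Adjoin auxiliary tree `aux` at `target`, in place. The labels
    must match (Y into Y), and whatever hung below `target` ends up
    under the foot node Y*."""
    assert target.label == aux.label
    aux = copy.deepcopy(aux)              # keep the grammar tree reusable
    foot = find_foot(aux)
    assert foot is not None
    # move the excised material under the foot (same label, so merge)
    foot.children, foot.anchor, foot.foot = target.children, target.anchor, False
    target.children, target.anchor = aux.children, None

# Adjoin "still" at the VP of the "draft" tree:
adjoin(draft_tree.children[1], still_tree)
# draft_tree is now (S NP↓ (VP (ADVP (ADV still)) (VP (V draft) NP↓)))
```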

Example sentence: "They still draft policies"

The treebank tree: (S (NP (PRP they)) (VP (ADVP (RB still)) (VBP draft) (NP (NNS policies))))

Step 1: Distinguish head/argument/adjunct. Each node in the treebank tree is marked as the head child, an argument, or an adjunct, using function tags (e.g., -SBJ) and head-percolation heuristics.
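
A minimal sketch of this marking step, with toy head and argument tables (the real extractor uses much fuller, PTB-specific tables plus function tags; everything below is illustrative):

```python
# Toy tables; a real extractor uses full head-percolation and
# argument tables keyed on the Penn Treebank tagset.
HEAD_TABLE = {                       # head-child labels, in priority order
    "S":  ["VP"],
    "VP": ["VBP", "VBZ", "VBD", "VB", "VP"],
    "NP": ["NNS", "NN", "PRP"],
}
ARG_LABELS = {"S": {"NP"}, "VP": {"NP", "SBAR", "S"}}

def find_head(parent: Node) -> Optional[Node]:
    """Pick the head child by label priority; default to the first child."""
    for label in HEAD_TABLE.get(parent.label, []):
        for child in parent.children:
            if child.label == label:
                return child
    return parent.children[0] if parent.children else None

def classify_children(parent: Node) -> List[str]:
    """Label each child of `parent` as 'head', 'argument', or 'adjunct'."""
    head = find_head(parent)
    roles = []
    for child in parent.children:
        if child is head:
            roles.append("head")
        elif child.label in ARG_LABELS.get(parent.label, set()):
            roles.append("argument")
        else:
            roles.append("adjunct")
    return roles
```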

Step 2: Insert additional nodes
Before: (S (NP (PRP they)) (VP (ADVP (RB still)) (VBP draft) (NP (NNS policies))))
After: (S (NP (PRP they)) (VP (ADVP (RB still)) (VP (VBP draft) (NP (NNS policies)))))
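
A simplified sketch of the level insertion, reusing classify_children; the new node gives each adjunct a same-label sister to adjoin to. The real algorithm preserves the surface order of adjuncts around the head, which is glossed over here:

```python
def insert_levels(node: Node) -> None:
    """Insert an extra node of the same label between a node's adjunct
    children and its head/argument core, in place."""
    if not node.children:
        return
    roles = classify_children(node)
    core = [c for c, r in zip(node.children, roles) if r != "adjunct"]
    adjuncts = [c for c, r in zip(node.children, roles) if r == "adjunct"]
    if adjuncts and core and len(core) < len(node.children):
        # simplification: all adjuncts end up to the left of the core
        node.children = adjuncts + [Node(node.label, core)]
    for child in node.children:
        insert_levels(child)
```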

Step 3: Build elementary trees. The annotated tree is decomposed into four elementary trees, #1 through #4, listed in the extracted grammar below.
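
The decomposition can be sketched as one recursion over the marked tree: head children stay in the current elementary tree, arguments are cut off behind substitution sites, and adjuncts are spun off as auxiliary trees. Again a toy version, not Xia's actual code:

```python
def build(node: Node, grammar: List[Node]) -> Node:
    """Decompose the annotated tree at `node`, appending spun-off
    elementary trees to `grammar`; return the part of `node` that
    stays in its parent's elementary tree."""
    if not node.children:                      # lexical anchor
        return Node(node.label, anchor=node.anchor)
    kept = []
    for child, role in zip(node.children, classify_children(node)):
        if role == "head":
            kept.append(build(child, grammar))
        elif role == "argument":               # cut off, leave NP↓ etc.
            grammar.append(build(child, grammar))
            kept.append(Node(child.label, subst=True))
        else:                                  # adjunct: spin off (X adj X*)
            # foot placed after the adjunct; real code tracks whether the
            # adjunct is to the left or right of the head
            grammar.append(Node(node.label,
                                [build(child, grammar),
                                 Node(node.label, foot=True)]))
    if len(kept) == 1 and kept[0].label == node.label:
        return kept[0]        # level inserted in Step 2 merges with its head
    return Node(node.label, kept)

# The tree from Step 2:
t = Node("S", [
    Node("NP", [Node("PRP", anchor="they")]),
    Node("VP", [
        Node("ADVP", [Node("RB", anchor="still")]),
        Node("VP", [Node("VBP", anchor="draft"),
                    Node("NP", [Node("NNS", anchor="policies")])]),
    ]),
])
grammar: List[Node] = []
grammar.append(build(t, grammar))
# grammar now holds the four trees of the extracted grammar below
```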

Extracted grammar
#1: (NP (PRP they))
#2: (VP (ADVP (RB still)) VP*)
#3: (NP (NNS policies))
#4: (S NP↓ (VP (VBP draft) NP↓))

Q2: What info was missing in the source treebank?
Head/argument/adjunct distinction
–Use function tags and heuristics
Raising verbs (e.g., seem, appear) vs. other verbs
–He seems to be late
–He wants to be late
→ Need a list of raising verbs in that language
Features and feature equations (e.g., agreement), …
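
The raising-verb case comes down to a lexicon lookup; a trivial sketch (the two-item verb set is a placeholder, not a real resource):

```python
RAISING_VERBS = {"seem", "appear"}   # placeholder; needs a full list per language

def is_raising(verb_lemma: str) -> bool:
    """Raising verbs do not theta-mark their subject, so 'He seems to be
    late' and 'He wants to be late' should get different elementary trees."""
    return verb_lemma in RAISING_VERBS
```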

Q3: What methodological lessons can be drawn?
The algorithm for extracting LTAGs from treebanks is straightforward.
Some missing information can be "recovered" with heuristics; other information cannot.
→ The extracted LTAGs are not as rich as the ones built by hand. Nevertheless, the extracted grammars have been shown to be useful for parsing, SuperTagging, etc.

Q4: What are the advantages of a PS or DS treebank?
The original extraction algorithm assumes the input is a PS treebank, but it can easily be extended to take a DS treebank:
–Extract tree segments from the DS
–Run a DS-to-PS conversion algorithm on the segments to get the elementary trees (a toy sketch follows)
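
As a toy illustration of the second bullet, hard-coded for a transitive verb (a real converter derives the projection from projection and attachment tables rather than hard-coding it):

```python
def verb_segment_to_ps(verb: str, pos: str, subj: str, obj: str) -> Node:
    """Project a transitive-verb dependency segment to phrase structure:
    the object attaches inside VP, the subject under S."""
    vp = Node("VP", [Node(pos, anchor=verb), Node(obj, subst=True)])
    return Node("S", [Node(subj, subst=True), vp])

# verb_segment_to_ps("draft", "VBP", "NP", "NP") yields tree #4 above
```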

Q5: Building a treebank for a formalism or building a general treebank?
I prefer the latter because:
–A general treebank can be used for different formalisms.
–Different grammars under the same formalism can be extracted.
–Annotating a general treebank is often easier.