How to perform tree surgery. Anna Rafferty, Marie-Catherine de Marneffe.



Tsurgeon (by Roger Levy)
What? Performs operations on a grammatical tree.
How? Based on Tregex syntax.
Where? JavaNLP: trees.tregex.tsurgeon

How? Tregex
A utility for identifying patterns in trees (like regular expressions for strings). A pattern combines node descriptions and relationships between nodes.
Example pattern: NP < /^NN/
[Figure: parse tree of the sentence "The firm stopped using crocidolite in its cigarette filters".]

Tsurgeon syntax
Define a pattern to be matched on the trees:
VBZ=vbz $ NP
Define one or several operation(s):
relabel vbz VBZ_TRANSITIVE

Delete
Tree: (ROOT (SBARQ (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat)))) (PUNCT ?)))
Pattern: PUNCT=punct > SBARQ
Operation: delete punct
delete …: deletes the node and everything below it.

Excise
Tree: (ROOT (SBARQ (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat))))))
Pattern: SBARQ=sbarq > ROOT
Operation: excise sbarq sbarq
Result: (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat)))))
Syntax: excise name1 name2. name1 is name2 or dominates name2. All children of name2 go into the parent of name1, where name1 was.
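The excise rule above can be sketched in plain Java. This is a toy illustration of the idea, not Tsurgeon's implementation: the `Node` class and the `parse`, `find` and `excise` helpers are hypothetical names made up for this example, and `find` stands in for a real Tregex match.

```java
import java.util.ArrayList;
import java.util.List;

final class ExciseSketch {
    /** Minimal labeled tree node (hypothetical stand-in for a parse-tree class). */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> kids = new ArrayList<>();
        Node(String label) { this.label = label; }
        void add(Node child) { child.parent = this; kids.add(child); }
        @Override public String toString() {
            if (kids.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node k : kids) sb.append(' ').append(k);
            return sb.append(')').toString();
        }
    }

    /** Reads a bracketed tree such as "(NP (NNS Cats))". */
    static Node parse(String s) {
        String[] toks = s.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
        return parse(toks, new int[] {0});
    }

    private static Node parse(String[] t, int[] pos) {
        String tok = t[pos[0]++];
        if (!tok.equals("(")) return new Node(tok);      // a leaf (word)
        Node n = new Node(t[pos[0]++]);                  // the label follows "("
        while (!t[pos[0]].equals(")")) n.add(parse(t, pos));
        pos[0]++;                                        // consume ")"
        return n;
    }

    /** First node with the given label, in pre-order; null if there is none. */
    static Node find(Node n, String label) {
        if (n.label.equals(label)) return n;
        for (Node k : n.kids) {
            Node r = find(k, label);
            if (r != null) return r;
        }
        return null;
    }

    /** name1 must be name2 or dominate it; name2's children take name1's place. */
    static void excise(Node name1, Node name2) {
        Node p = name1.parent;                 // assumes name1 is not the root
        int at = p.kids.indexOf(name1);
        p.kids.remove(at);
        name1.parent = null;
        List<Node> moved = new ArrayList<>(name2.kids);
        name2.kids.clear();
        for (Node k : moved) k.parent = p;
        p.kids.addAll(at, moved);
    }

    public static void main(String[] args) {
        Node t = parse("(ROOT (SBARQ (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat))))))");
        Node sbarq = find(t, "SBARQ");
        excise(sbarq, sbarq);
        System.out.println(t);   // (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat)))))
    }
}
```

Passing two different nodes, for example `excise(sbarq, sq)`, removes the whole chain from SBARQ down to SQ and attaches SQ's daughters directly under ROOT.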

Prune
prune …: differs from delete in that if, after the pruning, the parent has no children left, the parent is pruned too.
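The difference between delete and prune is easiest to see side by side. The following is a minimal sketch in plain Java, not Tsurgeon's own code: `Node`, `parse`, `find`, `delete` and `prune` are hypothetical helpers, the example tree is made up, and stopping the upward pruning at the root is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class DeleteVsPrune {
    /** Minimal labeled tree node (hypothetical stand-in for a parse-tree class). */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> kids = new ArrayList<>();
        Node(String label) { this.label = label; }
        void add(Node child) { child.parent = this; kids.add(child); }
        @Override public String toString() {
            if (kids.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node k : kids) sb.append(' ').append(k);
            return sb.append(')').toString();
        }
    }

    /** Reads a bracketed tree such as "(NP (NNS Cats))". */
    static Node parse(String s) {
        String[] toks = s.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
        return parse(toks, new int[] {0});
    }

    private static Node parse(String[] t, int[] pos) {
        String tok = t[pos[0]++];
        if (!tok.equals("(")) return new Node(tok);      // a leaf (word)
        Node n = new Node(t[pos[0]++]);                  // the label follows "("
        while (!t[pos[0]].equals(")")) n.add(parse(t, pos));
        pos[0]++;                                        // consume ")"
        return n;
    }

    /** First node with the given label, in pre-order; null if there is none. */
    static Node find(Node n, String label) {
        if (n.label.equals(label)) return n;
        for (Node k : n.kids) {
            Node r = find(k, label);
            if (r != null) return r;
        }
        return null;
    }

    /** delete: detach the node and everything below it (assumes a non-root node). */
    static void delete(Node n) {
        n.parent.kids.remove(n);
        n.parent = null;
    }

    /** prune: like delete, but a parent left without children is pruned in turn. */
    static void prune(Node n) {
        Node p = n.parent;
        delete(n);
        // Assumption of this sketch: the root itself is never removed.
        if (p.kids.isEmpty() && p.parent != null) prune(p);
    }

    public static void main(String[] args) {
        Node a = parse("(S (NP (NNS Cats)) (VP (VBP sleep)))");
        delete(find(a, "NNS"));
        System.out.println(a);   // (S NP (VP (VBP sleep)))  : NP is left childless

        Node b = parse("(S (NP (NNS Cats)) (VP (VBP sleep)))");
        prune(find(b, "NNS"));
        System.out.println(b);   // (S (VP (VBP sleep)))     : the empty NP is gone too
    }
}
```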

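Insert
Tree: (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat)))))
Pattern: SQ=sq > ROOT !<- /PUNCT/
Operation: insert (PUNCT .) >-1 sq
Result: (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat))) (PUNCT .)))
Caveat: cyclic application of rules. Here the !<- /PUNCT/ condition keeps the pattern from matching again once the punctuation has been inserted.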

Position for 'insert' and 'move'
A position is one of:
$+   the left sister of the named node
$-   the right sister of the named node
>i   the i-th daughter of the named node
>-i  the i-th daughter, counting from the right, of the named node

Move
Tree: (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat))) (PUNCT .)))
Pattern: VP < (/^WH/=wh $++ /^VB/=vb)
Operation: move vb $+ wh
Result: (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (VB eat) (WHNP what))) (PUNCT .)))
move moves the named node into the specified position.
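The four position specifiers used by insert and move can be modelled with one small helper. This is a sketch in plain Java under stated assumptions, not the Tsurgeon implementation: `Node`, `parse`, `find` and `place` are hypothetical names, and `find` replaces real Tregex matching.

```java
import java.util.ArrayList;
import java.util.List;

final class PositionSketch {
    /** Minimal labeled tree node (hypothetical stand-in for a parse-tree class). */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> kids = new ArrayList<>();
        Node(String label) { this.label = label; }
        void add(Node child) { child.parent = this; kids.add(child); }
        @Override public String toString() {
            if (kids.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node k : kids) sb.append(' ').append(k);
            return sb.append(')').toString();
        }
    }

    /** Reads a bracketed tree such as "(NP (NNS Cats))". */
    static Node parse(String s) {
        String[] toks = s.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
        return parse(toks, new int[] {0});
    }

    private static Node parse(String[] t, int[] pos) {
        String tok = t[pos[0]++];
        if (!tok.equals("(")) return new Node(tok);      // a leaf (word)
        Node n = new Node(t[pos[0]++]);                  // the label follows "("
        while (!t[pos[0]].equals(")")) n.add(parse(t, pos));
        pos[0]++;                                        // consume ")"
        return n;
    }

    /** First node with the given label, in pre-order; null if there is none. */
    static Node find(Node n, String label) {
        if (n.label.equals(label)) return n;
        for (Node k : n.kids) {
            Node r = find(k, label);
            if (r != null) return r;
        }
        return null;
    }

    /** Puts node at pos ("$+", "$-", ">i" or ">-i") relative to the named node. */
    static void place(Node node, String pos, Node named) {
        if (node.parent != null) {             // a move: detach from the old parent first
            node.parent.kids.remove(node);
            node.parent = null;
        }
        if (pos.equals("$+") || pos.equals("$-")) {
            Node p = named.parent;
            int i = p.kids.indexOf(named);
            node.parent = p;
            p.kids.add(pos.equals("$+") ? i : i + 1, node);   // left or right sister
        } else {
            int i = Integer.parseInt(pos.substring(1));       // 1-based; negative counts from the right
            int idx = i > 0 ? i - 1 : named.kids.size() + i + 1;
            node.parent = named;
            named.kids.add(idx, node);
        }
    }

    public static void main(String[] args) {
        Node t = parse("(ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat)))))");
        place(parse("(PUNCT .)"), ">-1", find(t, "SQ"));      // insert (PUNCT .) >-1 sq
        place(find(t, "VB"), "$+", find(t, "WHNP"));          // move vb $+ wh
        System.out.println(t);
    }
}
```

Computing the target index only after the node has been detached keeps the position correct when a node is moved within its own parent, as in the `move vb $+ wh` example.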

Adjoin
Tree: (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (VB eat) (WHNP what))) (PUNCT .)))
Pattern: VP=vp > SQ !> (__ << usually)
Operation: adjoin (VP (ADVP (ADV usually)) VP@) vp
Result: (ROOT (SQ (NP (NNS Cats)) (VP (ADVP (ADV usually)) (VP (VBP do) (VP (VB eat) (WHNP what)))) (PUNCT .)))

Adjoin syntax
adjoin: adjoins the specified auxiliary tree into the named node. The daughters of the target node will become the daughters of the foot of the auxiliary tree.
Example: adjoin (VP (ADVP (ADV usually)) VP@) vp
The foot of the auxiliary tree is the node marked with @ (here VP@).
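Adjunction as described above can be sketched in a few lines of plain Java. This is an illustration only, not Tsurgeon's code: `Node`, `parse`, `find`, `findFoot` and `adjoin` are hypothetical names, and the sketch assumes the foot of the auxiliary tree is the node whose label ends in `@`.

```java
import java.util.ArrayList;
import java.util.List;

final class AdjoinSketch {
    /** Minimal labeled tree node (hypothetical stand-in for a parse-tree class). */
    static final class Node {
        String label;
        Node parent;
        final List<Node> kids = new ArrayList<>();
        Node(String label) { this.label = label; }
        void add(Node child) { child.parent = this; kids.add(child); }
        @Override public String toString() {
            if (kids.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node k : kids) sb.append(' ').append(k);
            return sb.append(')').toString();
        }
    }

    /** Reads a bracketed tree such as "(NP (NNS Cats))". */
    static Node parse(String s) {
        String[] toks = s.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
        return parse(toks, new int[] {0});
    }

    private static Node parse(String[] t, int[] pos) {
        String tok = t[pos[0]++];
        if (!tok.equals("(")) return new Node(tok);      // a leaf (word or foot marker)
        Node n = new Node(t[pos[0]++]);                  // the label follows "("
        while (!t[pos[0]].equals(")")) n.add(parse(t, pos));
        pos[0]++;                                        // consume ")"
        return n;
    }

    /** First node with the given label, in pre-order; null if there is none. */
    static Node find(Node n, String label) {
        if (n.label.equals(label)) return n;
        for (Node k : n.kids) {
            Node r = find(k, label);
            if (r != null) return r;
        }
        return null;
    }

    /** The foot is assumed to be the node whose label ends in "@". */
    static Node findFoot(Node n) {
        if (n.label.endsWith("@")) return n;
        for (Node k : n.kids) {
            Node r = findFoot(k);
            if (r != null) return r;
        }
        return null;
    }

    /** Replaces target by the auxiliary tree; target's daughters go under the foot. */
    static void adjoin(Node aux, Node target) {
        Node foot = findFoot(aux);
        foot.label = foot.label.substring(0, foot.label.length() - 1);  // drop the "@"
        for (Node k : new ArrayList<>(target.kids)) foot.add(k);        // re-parent daughters
        target.kids.clear();
        Node p = target.parent;                 // assumes target is not the root
        p.kids.set(p.kids.indexOf(target), aux);
        aux.parent = p;
        target.parent = null;
    }

    public static void main(String[] args) {
        Node t = parse("(ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (VB eat) (WHNP what))) (PUNCT .)))");
        adjoin(parse("(VP (ADVP (ADV usually)) VP@)"), find(t, "VP"));
        System.out.println(t);
    }
}
```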

On the command line
java Tsurgeon -treeFile <aFile> [<operationFile>]*
aFile: a file containing the trees to be transformed
operationFile: a pattern (Tregex expression), then an empty line, then the operation(s), one per line

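How to use the Tsurgeon class
import java.util.*;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.tregex.TregexPattern;
import edu.stanford.nlp.trees.tregex.tsurgeon.*;
TregexPattern matchPattern = TregexPattern.compile("SQ=sq < (/^WH/ $++ VP)");  // SQ with a WH daughter followed by a sister VP
List<TsurgeonPattern> ps = new ArrayList<TsurgeonPattern>();                    // operations to apply to each match
TsurgeonPattern p = Tsurgeon.parseOperation("relabel sq S");
ps.add(p);
// lTrees is the collection of input trees, defined elsewhere.
Collection<Tree> result = Tsurgeon.processPatternOnTrees(matchPattern, Tsurgeon.collectOperations(ps), lTrees);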

To become a specialist
See Roger's README!
Practice tree surgery!