Statistical NLP Spring 2011


Statistical NLP Spring 2011 Lecture 18: Parsing IV Dan Klein – UC Berkeley

Parse Reranking. Assume the number of candidate parses is very small. We can represent each parse T as an arbitrary feature vector φ(T). Typically, all local rules are features; there are also non-local features, like how right-branching the overall tree is. [Charniak and Johnson 05] gives a rich set of features.
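With a small candidate list, reranking is just a dot product and an argmax. A minimal sketch, assuming a sparse dictionary encoding of φ(T); the feature names and weights below are hypothetical illustrations, not from the lecture:

```python
def score(features, weights):
    """Dot product of a sparse feature vector phi(T) with a weight vector."""
    return sum(count * weights.get(name, 0.0) for name, count in features.items())

def rerank(candidates, weights):
    """Pick the highest-scoring parse from a small candidate list."""
    return max(candidates, key=lambda c: score(c["features"], weights))

# Each candidate parse T is represented by its feature vector phi(T):
candidates = [
    {"tree": "(S (NP ...) (VP ...))",
     "features": {"S->NP VP": 1, "right_branching": 3}},
    {"tree": "(S (NP ...) (VP ... PP))",
     "features": {"S->NP VP": 1, "right_branching": 5}},
]
weights = {"S->NP VP": 0.5, "right_branching": 0.2}
best = rerank(candidates, weights)
```

Local-rule features and global shape features (like right-branching counts) live side by side in the same vector; the reranker does not care where a feature came from.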

Inside and Outside Scores
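The figure for this slide is not reproduced in the transcript. As a reminder, for a PCFG in Chomsky normal form over a span (i, j), the inside and outside scores are defined by the standard recursions (notation is ours, not from the slide):

```latex
\beta(X, i, j) = \sum_{X \to Y\,Z} \; \sum_{k=i+1}^{j-1}
    P(X \to Y\,Z)\, \beta(Y, i, k)\, \beta(Z, k, j)

\alpha(Y, i, k) = \sum_{X \to Y\,Z} \sum_{j > k}
    P(X \to Y\,Z)\, \alpha(X, i, j)\, \beta(Z, k, j)
  \; + \; \sum_{X \to Z\,Y} \sum_{j < i}
    P(X \to Z\,Y)\, \alpha(X, j, k)\, \beta(Z, j, i)
```

The product α(X, i, j) · β(X, i, j) is the total probability of all trees that contain an X over span (i, j), which is what k-best and reranking methods exploit.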

Inside and Outside Scores

K-Best Parsing [Huang and Chiang 05; Pauls, Klein, and Quirk 10]
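These algorithms build k-best lists bottom-up, lazily combining the sorted k-best lists of sub-derivations rather than enumerating all pairs. A sketch of the core operation, finding the k largest pairwise sums of two descending-sorted log-probability lists with a heap (the function name is ours, not from the cited papers):

```python
import heapq

def k_best_sums(a, b, k):
    """Given two descending-sorted lists of log-probabilities, return the k
    largest pairwise sums a[i] + b[j], exploring the frontier lazily."""
    if not a or not b:
        return []
    heap = [(-(a[0] + b[0]), 0, 0)]   # max-heap via negation
    seen = {(0, 0)}
    out = []
    while heap and len(out) < k:
        neg, i, j = heapq.heappop(heap)
        out.append(-neg)
        # Only the two successors of (i, j) can be the next-best candidates.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-(a[ni] + b[nj]), ni, nj))
    return out
```

Each pop does O(log k) work, so combining two k-best lists costs O(k log k) instead of the O(k²) of full enumeration.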

Dependency Parsing. Lexicalized parsers can be seen as producing dependency trees: each local binary tree corresponds to an attachment in the dependency graph (e.g., "the lawyer questioned the witness").
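That correspondence can be sketched directly: in a lexicalized binary tree, every node where the two children carry different head words yields one dependency arc. The tuple encoding and the example sentence below are illustrative assumptions, not a format from the slides:

```python
# A lexicalized node is (head_word, [children]); leaves have no children.

def extract_dependencies(tree, arcs=None):
    """Walk a lexicalized tree; each node whose child has a different head
    contributes one dependency arc (head -> dependent)."""
    if arcs is None:
        arcs = []
    head, children = tree
    for child in children:
        if child[0] != head:              # the non-head child attaches here
            arcs.append((head, child[0]))
        extract_dependencies(child, arcs)
    return arcs

# "the lawyer questioned the witness", head-annotated:
tree = ("questioned",
        [("lawyer", [("the", []), ("lawyer", [])]),
         ("questioned",
          [("questioned", []),
           ("witness", [("the", []), ("witness", [])])])])
arcs = extract_dependencies(tree)
# arcs: questioned->lawyer, lawyer->the, questioned->witness, witness->the
```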

Dependency Parsing. Pure dependency parsing is only cubic [Eisner 99]. There is also work on non-projective dependencies, which are common in, e.g., Czech; these can be handled with MST algorithms [McDonald and Pereira 05].
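The projective/non-projective distinction comes down to crossing arcs: a tree is projective iff no two arcs cross when drawn above the sentence. A minimal check, assuming the common head-array encoding (index 0 is an artificial root):

```python
def is_projective(heads):
    """heads[i] is the index of word i's head (0 = root, words 1..n).
    A dependency tree is projective iff no two arcs cross."""
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads[1:], start=1)]
    for (l1, r1) in arcs:
        for (l2, r2) in arcs:
            if l1 < l2 < r1 < r2:     # arcs interleave -> crossing
                return False
    return True
```

Eisner's cubic algorithm only searches the projective trees; when a treebank fails this check often (as Czech does), MST-based decoding becomes attractive because it searches all spanning trees.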

Shift-Reduce Parsers. Another way to derive a tree: shift words onto a stack and reduce stack items into constituents. There is no useful dynamic programming search, but we can still use beam search [Ratnaparkhi 97].
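A minimal sketch of the transition system, with the action sequence given explicitly. In practice a classifier scores actions and beam search explores action sequences, since no dynamic program applies; the encoding below is an illustrative assumption:

```python
def shift_reduce(words, actions):
    """actions: 'shift', or ('reduce', label) combining the top two stack items."""
    stack, buf = [], list(words)
    for action in actions:
        if action == "shift":
            stack.append(buf.pop(0))      # move next word onto the stack
        else:
            _, label = action
            right = stack.pop()           # combine top two items into one
            left = stack.pop()
            stack.append((label, left, right))
    assert len(stack) == 1 and not buf    # a complete parse consumes everything
    return stack[0]

tree = shift_reduce(
    ["the", "lawyer", "questioned", "the", "witness"],
    ["shift", "shift", ("reduce", "NP"),
     "shift", "shift", "shift", ("reduce", "NP"),
     ("reduce", "VP"), ("reduce", "S")])
```

Each action conditions on the whole stack and buffer, which is exactly why the search space does not decompose into reusable subproblems.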

Data-Oriented Parsing. Rewrites large (possibly lexicalized) subtrees in a single step. Formally, a tree-insertion grammar. Derivational ambiguity: the same tree can arise from subtrees generated atomically or compositionally. Finding the most probable parse is NP-complete.

TIG: Insertion

Tree-Adjoining Grammars. Start with local trees; structure can be inserted with adjunction operators. Mildly context-sensitive. Models long-distance dependencies naturally, as well as other phenomena that CFGs don’t capture well (e.g., cross-serial dependencies).

TAG: Long Distance

CCG Parsing. Combinatory Categorial Grammar: a fully (mono-)lexicalized grammar in which categories encode argument sequences. Very closely related to the lambda calculus (more later). Can have spurious ambiguities (why?).
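"Categories encode argument sequences" can be made concrete with the two application combinators: forward application X/Y Y => X and backward application Y X\Y => X. A toy sketch, assuming a string encoding of categories (the helper names are ours):

```python
def strip_parens(cat):
    """Remove outer parentheses only if they enclose the whole category."""
    if cat.startswith("(") and cat.endswith(")"):
        depth = 0
        for i, c in enumerate(cat):
            if c == "(":
                depth += 1
            elif c == ")":
                depth -= 1
                if depth == 0 and i < len(cat) - 1:
                    return cat            # parens close early; keep as is
        return cat[1:-1]
    return cat

def split_category(cat):
    """Split at the rightmost top-level slash; None if the category is atomic."""
    depth = 0
    for i in range(len(cat) - 1, -1, -1):
        c = cat[i]
        if c == ")":
            depth += 1
        elif c == "(":
            depth -= 1
        elif c in "/\\" and depth == 0:
            return strip_parens(cat[:i]), c, strip_parens(cat[i + 1:])
    return None

def apply_categories(left, right):
    """Forward application X/Y Y => X; backward application Y X\\Y => X."""
    f = split_category(left)
    if f and f[1] == "/" and f[2] == right:
        return f[0]
    b = split_category(right)
    if b and b[1] == "\\" and b[2] == left:
        return b[0]
    return None

# A transitive verb is (S\NP)/NP: it wants an NP to its right, then one to its left.
vp = apply_categories("(S\\NP)/NP", "NP")   # forward:  -> S\NP
s = apply_categories("NP", vp)              # backward: -> S
```

The verb's category spells out its entire argument sequence, which is what makes the lexicon carry nearly all of the grammar.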

Syntax-Based MT

Translation by Parsing

Translation by Parsing

Learning MT Grammars

Extracting Syntactic Rules. Extract rules as in (Galley et al. ’04, ’06).
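The heart of this style of extraction is deciding which tree nodes are valid rule boundaries ("frontier nodes"): a node qualifies iff the target span its words align to is not also touched by source words outside the node. A simplified sketch of that test, under our own span/alignment encoding (not the papers' exact formulation):

```python
def frontier_nodes(spans, alignment):
    """spans: list of (lo, hi) source spans, one per tree node (hi exclusive).
    alignment: set of (src_index, tgt_index) links.
    A node is a frontier node iff the closure of its target span contains
    no target word linked to a source word outside the node."""
    result = []
    for lo, hi in spans:
        inside = {t for s, t in alignment if lo <= s < hi}
        if not inside:
            continue                      # unaligned nodes yield no rule
        t_lo, t_hi = min(inside), max(inside) + 1
        outside = {t for s, t in alignment if not (lo <= s < hi)}
        if all(not (t_lo <= t < t_hi) for t in outside):
            result.append((lo, hi))
    return result

alignment = {(0, 1), (1, 0), (2, 2)}      # a clean 3-word alignment
spans = [(0, 3), (0, 2), (2, 3)]          # tree-node source spans
good = frontier_nodes(spans, alignment)              # all three qualify
bad = frontier_nodes(spans, alignment | {(0, 2)})    # one spurious link added
```

Note how the single extra link (0, 2) disqualifies both smaller nodes at once, leaving only the whole-sentence rule: this is exactly the failure mode the later slides describe.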

Rules can capture phrasal translation, reorder parts of the tree, traverse the tree without reordering, and insert (and delete) words.

Bad alignments make bad rules. This isn’t very good, but let’s look at a worse example...

Sometimes they’re really bad. One bad link makes a totally unusable rule!