Introduction to Syntactic Parsing Roxana Girju November 18, 2004 Some slides were provided by Michael Collins (MIT) and Dan Moldovan (UT Dallas)


Overview
- An introduction to the parsing problem
- Context-free grammars (CFGs)
- A brief(!) sketch of the syntax of English
- Examples of ambiguous structures
- PCFGs, their formal properties
- Weaknesses of PCFGs
- Heads in CFGs
- Chart parsing: algorithm and an example

Syntactic Parsing
Syntax: provides rules for combining words into the components of a sentence, and for combining those components into sentences.
Knowledge of syntax is useful for:
1. Parsing
2. Question answering (QA)
3. Information extraction (IE)
4. Translation, etc.
Grammar: the formal specification of the rules of a language.
Parsing (syntactic parsing): a method for performing syntactic analysis of a sentence.

Parsing (Syntactic Structure)
INPUT: Boeing is located in Seattle.
OUTPUT:
(S (NP (N Boeing))
   (VP (V is)
       (VP (V located)
           (PP (P in)
               (NP (N Seattle))))))
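As a rough illustration of the INPUT/OUTPUT pair above, the sketch below reads a bracketed tree into nested Python tuples and recovers the sentence from its leaves. The bracketed notation and the helper names `parse_brackets` and `leaves` are assumptions made for this example; they are not part of the original slides.

```python
def parse_brackets(text):
    """Read a bracketed tree like "(S (NP (N Boeing)) ...)" into nested tuples.

    Each node becomes (label, child, child, ...); each word stays a plain string.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                      # skip "("
            label = tokens[pos]
            pos += 1
            children = []
            while tokens[pos] != ")":
                children.append(read())
            pos += 1                      # skip ")"
            return (label, *children)
        word = tokens[pos]                # a leaf (word)
        pos += 1
        return word

    return read()


def leaves(tree):
    """Return the words of the tree in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1:] for word in leaves(child)]


tree = parse_brackets(
    "(S (NP (N Boeing)) (VP (V is) (VP (V located) (PP (P in) (NP (N Seattle))))))"
)
print(" ".join(leaves(tree)))
```

The nested-tuple form keeps the label as the first element of every node, which makes simple recursive traversals easy to write.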

Data for Parsing Experiments
Penn WSJ Treebank = 50,000 sentences with associated trees
Usual set-up: 40,000 training sentences, 2400 test sentences
An example tree (rooted at TOP, then S) for the sentence:
Canadian Utilities had 1988 revenue of C$ 1.16 billion, mainly from its natural gas and electric utility businesses in Alberta, where the company serves about 800,000 customers.
Part-of-speech tags in the tree:
Canadian/NNP Utilities/NNPS had/VBD 1988/CD revenue/NN of/IN C$/$ 1.16/CD billion/CD ,/PUNC, mainly/RB from/IN its/PRP$ natural/JJ gas/NN and/CC electric/JJ utility/NN businesses/NNS in/IN Alberta/NNP ,/PUNC, where/WRB the/DT company/NN serves/VBZ about/RB 800,000/CD customers/NNS ./PUNC.
Phrase labels in the tree: NP, QP, PP, ADVP, WHADVP, VP, SBAR, S

The Information Conveyed by Parse Trees
1) Part of speech (POS) for each word (N/NN = noun, V = verb, D/DT = determiner, P/IN = preposition)
(S (NP (D the) (N burglar))
   (VP (V robbed)
       (NP (D the) (N apartment))))

2) Phrases
(S (NP (DT the) (N burglar))
   (VP (V robbed)
       (NP (DT the) (N apartment))))
Noun Phrases (NP): “the burglar”, “the apartment”
Verb Phrases (VP): “robbed the apartment”
Sentences (S): “the burglar robbed the apartment”
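One way to read the phrases off a tree is to walk it and record the words under every NP, VP, and S node. The sketch below does this for the burglar example; the nested-tuple tree encoding and the function names `words` and `phrases` are illustrative assumptions.

```python
# The burglar tree as nested tuples: (label, child, ...); words are strings.
tree = ("S",
        ("NP", ("DT", "the"), ("N", "burglar")),
        ("VP", ("V", "robbed"),
               ("NP", ("DT", "the"), ("N", "apartment"))))


def words(node):
    """Words covered by a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]


def phrases(node, wanted=("S", "NP", "VP")):
    """List (label, text) for every node whose label is in `wanted`."""
    if isinstance(node, str):
        return []
    found = [(node[0], " ".join(words(node)))] if node[0] in wanted else []
    for child in node[1:]:
        found.extend(phrases(child, wanted))
    return found


for label, text in phrases(tree):
    print(label, "->", text)
```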

3) Useful Relationships
Pattern: (S (NP subject) (VP (V verb)))
(S (NP (DT the) (N burglar))
   (VP (V robbed)
       (NP (DT the) (N apartment))))
=> “the burglar” is the subject of “robbed”
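The subject pattern can be sketched as a small tree match: if an S node has an NP followed by a VP, the NP is taken as the subject of the V inside that VP. This is a simplified sketch that assumes the tuple tree encoding below and looks only for a V that is a direct child of the VP; `subject_of` is a hypothetical helper name.

```python
tree = ("S",
        ("NP", ("DT", "the"), ("N", "burglar")),
        ("VP", ("V", "robbed"),
               ("NP", ("DT", "the"), ("N", "apartment"))))


def words(node):
    """Words covered by a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]


def subject_of(s_node):
    """Match S -> NP VP and return (subject text, verb), or None if no match."""
    label, *children = s_node
    if label != "S" or len(children) != 2:
        return None
    np, vp = children
    if isinstance(np, str) or isinstance(vp, str):
        return None
    if np[0] != "NP" or vp[0] != "VP":
        return None
    for child in vp[1:]:
        # Only a V directly under the VP is considered in this sketch.
        if not isinstance(child, str) and child[0] == "V":
            return " ".join(words(np)), child[1]
    return None


print(subject_of(tree))
```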

An Example Application: Machine Translation
English word order is subject – verb – object
Japanese word order is subject – object – verb
English: IBM bought Lotus
Japanese: IBM Lotus bought
English: Sources said that IBM bought Lotus yesterday
Japanese: Sources yesterday IBM Lotus bought that said
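The first reordering above can be sketched as a transformation on the parse tree: inside every VP, move the verb after its other children. This is only a minimal sketch under assumed tree shapes and labels (`V`, `VP`); the longer "Sources said ..." example would need additional reordering rules that are not attempted here.

```python
def is_verb(node):
    """True for a (V, word) node in the assumed tuple encoding."""
    return isinstance(node, tuple) and node[0] == "V"


def to_sov(node):
    """Return a copy of the tree with the verb moved to the end of each VP."""
    if isinstance(node, str):
        return node
    label, *children = node
    children = [to_sov(child) for child in children]
    if label == "VP":
        verbs = [c for c in children if is_verb(c)]
        others = [c for c in children if not is_verb(c)]
        children = others + verbs
    return (label, *children)


def words(node):
    """Words covered by a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]


english = ("S", ("NP", "IBM"), ("VP", ("V", "bought"), ("NP", "Lotus")))
print(" ".join(words(to_sov(english))))
```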

NP => NN
NOTE: VI/VT = VB

DERIVATION        RULES USED
S
NP VP             S => NP VP
DT N VP           NP => DT N
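The derivation steps above can be reproduced mechanically: at each step, the leftmost non-terminal is rewritten using the next rule. The sketch below assumes a left-most derivation and a hand-written set of non-terminals; `derive` and `NONTERMINALS` are names introduced for this example.

```python
# Symbols treated as non-terminals in this sketch (an assumption).
NONTERMINALS = {"S", "NP", "VP", "DT", "N"}


def derive(start, rules):
    """Apply each (lhs, rhs) rule to the leftmost non-terminal in turn.

    Returns the list of sentential forms, starting with [start].
    """
    form = [start]
    steps = [list(form)]
    for lhs, rhs in rules:
        idx = next(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        if form[idx] != lhs:
            raise ValueError(f"leftmost non-terminal is {form[idx]}, not {lhs}")
        form[idx:idx + 1] = rhs           # replace lhs by the rule's right side
        steps.append(list(form))
    return steps


steps = derive("S", [("S", ["NP", "VP"]), ("NP", ["DT", "N"])])
for form in steps:
    print(" ".join(form))
```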

The Problem with Parsing: Ambiguity
INPUT: She announced a program to promote safety in trucks and vans
POSSIBLE OUTPUTS: [six alternative parse trees for this sentence, differing in where “to promote ...”, “in ...” and “and vans” attach]
And there are more...


Two parses of “drove down the street in the car”:
(VP (Vt drove) (PP down the street) (PP in the car))
(VP (Vt drove) (PP down (NP the (N street) (PP in the car))))
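How quickly such attachment ambiguity grows can be checked by counting parses with a small chart (CKY-style) procedure. The grammar and lexicon below are a toy invented for this sketch, not the grammar from the slides; under it, the PP "in the car" can attach either to the verb phrase or to the noun phrase "the street".

```python
from collections import defaultdict

# Toy binary grammar (an assumption for illustration): (parent, left, right).
BINARY = [
    ("VP", "Vt", "PP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("PP", "P", "NP"),
    ("NP", "D", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
]
LEXICON = {
    "drove": ["Vt"], "down": ["P"], "in": ["P"],
    "the": ["D"], "street": ["N"], "car": ["N"],
}


def count_parses(words, root):
    """Count the distinct parse trees with label `root` spanning all words."""
    n = len(words)
    chart = defaultdict(int)              # (start, end, label) -> tree count
    for i, word in enumerate(words):
        for tag in LEXICON[word]:
            chart[i, i + 1, tag] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):     # split point between the two children
                for parent, left, right in BINARY:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, root]


print(count_parses("drove down the street in the car".split(), "VP"))
```

Counting rather than enumerating keeps the chart small even when the number of trees grows quickly with each extra PP.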

[Two parse trees for the NP “the fast car mechanic under the pigeon in the box”. In the first, the PP “in the box” attaches to “pigeon”; in the second, it attaches higher, to “fast car mechanic under the pigeon”.]

Sources of Ambiguity: Noun Premodifiers
Noun premodifiers:
(NP (D the) (N (JJ fast) (N (N (NN car)) (N (NN mechanic)))))
(NP (D the) (N (N (JJ fast) (N (NN car))) (N (NN mechanic))))
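The two readings correspond to the two ways of bracketing the three premodifier words in pairs. A minimal sketch that enumerates all binary bracketings of a word sequence is given below; the function name `bracketings` is an assumption, and labels such as N or JJ are left out for brevity.

```python
def bracketings(words):
    """Return every binary bracketing of `words` as nested pairs."""
    if len(words) == 1:
        return [words[0]]
    results = []
    for split in range(1, len(words)):    # where the left part ends
        for left in bracketings(words[:split]):
            for right in bracketings(words[split:]):
                results.append((left, right))
    return results


for reading in bracketings(["fast", "car", "mechanic"]):
    print(reading)
```

The number of bracketings follows the Catalan numbers, so longer premodifier sequences become ambiguous very quickly.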

A Funny Thing about the Penn Treebank
Leaves NP premodifier structure flat, or underspecified:
(NP (DT the) (JJ fast) (NN car) (NN mechanic))
(NP (NP (DT the) (JJ fast) (NN car) (NN mechanic))
    (PP (IN under)
        (NP (DT the) (NN pigeon))))