1 I256 Applied Natural Language Processing Fall 2009 Sentence Structure Barbara Rosario

2 Resources for IR Excellent resources for IR: –Course syllabus of the Stanford course Information Retrieval and Web Search (CS 276 / LING 286) –Book: Introduction to Information Retrieval

3 Outline Sentence Structure Constituency Syntactic Ambiguities Context Free Grammars (CFG) Probabilistic CFG (PCFG) Main issues –Designing grammars –Learning grammars –Inference (automatic parsing) Lexicalized Trees Review Acknowledgments: Some slides are adapted and/or taken from Klein’s CS 288 course

4 Analyzing Sentence Structure The key motivation is natural language understanding. –How much more of the meaning of a text can we access when we can reliably recognize the linguistic structures it contains? –With the help of sentence structure, can we answer simple questions about "what happened" or "who did what to whom"?

5 Phrase Structure Parsing Phrase structure parsing organizes syntax into constituents or brackets. Example parse tree (with nodes S, NP, VP, N’, PP): new art critics write reviews with computers

6 Example Parse Hurricane Emily howled toward Mexico 's Caribbean coast on Sunday packing 135 mph winds and torrential rain and causing panic in Cancun, where frightened tourists squeezed into musty shelters.

7 Analyzing Sentence Structure How can we use a formal grammar to describe the structure of an unlimited set of sentences? How can we “discover” / design such a grammar?

8 Constituency Tests Words combine with other words to form units. How do we know what nodes go in the tree? –What is the evidence for being a unit? Classic constituency tests: –Substitution –Question answers –Semantic grounds (coherence, reference, idioms) –Dislocation –Conjunction

9 Constituent structure: Substitution Substitutability: a sequence of words in a well-formed sentence can be replaced by a shorter sequence without rendering the sentence ill-formed. –The little bear saw the fine fat trout in the brook. –For instance, substituting He for The little bear and it for the fine fat trout still yields a well-formed sentence: He saw it in the brook.

10 Constituent structure

11 Constituent structure Each node in this tree (including the words) is called a constituent. –The immediate constituents of S are NP and VP.

12 Conflicting Tests Constituency isn’t always clear –Units of transfer: think about ~ penser à talk about ~ hablar de –Phonological reduction: I will go → I’ll go I want to go → I wanna go –Coordination He went to and came from the store. La vélocité des ondes sismiques (“the velocity of seismic waves”)

13 PP Attachment –I cleaned the dishes from dinner –I cleaned the dishes with detergent –I cleaned the dishes in my pajamas –I cleaned the dishes in the sink

14 PP Attachment

15 Syntactic Ambiguities –Prepositional phrases: They cooked the beans in the pot on the stove with handles. –Particle vs. preposition: The puppy tore up the staircase. –Gerund vs. participial adjective: Visiting relatives can be boring. Changing schedules frequently confused passengers. –Modifier scope within NPs: impractical design requirements plastic cup holder –Coordination scope: Small rats and mice can squeeze into holes or cracks in the wall. –And others…

16 Context Free Grammar (CFG) Write symbolic or logical rules: Grammar (CFG): ROOT → S, S → NP VP, NP → DT NN, NP → NN NNS, NP → NP PP, VP → VBP NP, VP → VBP NP PP, PP → IN NP Lexicon: NN → interest, NNS → raises, VBP → interest, VBZ → raises, …

17 Context Free Grammar (CFG) In NLTK, context-free grammars are defined in the nltk.grammar module. You can define a grammar of your own and parse with it.
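For example, a minimal sketch in NLTK (assuming NLTK 3’s CFG.fromstring API; the toy rules echo the previous slide, and the extra NP → NN production is added here so a bare noun can serve as an object):

```python
import nltk

# Toy grammar echoing the rules on the previous slide (plus NP -> NN);
# real broad-coverage grammars are vastly larger.
grammar = nltk.CFG.fromstring("""
    S   -> NP VP
    NP  -> DT NN | NN NNS | NP PP | NN
    VP  -> VBP NP | VBP NP PP
    PP  -> IN NP
    DT  -> 'the'
    NN  -> 'fed' | 'interest'
    NNS -> 'raises'
    VBP -> 'raises'
    IN  -> 'in'
""")

# A chart parser enumerates all trees the grammar admits for the sentence.
parser = nltk.ChartParser(grammar)
for tree in parser.parse("the fed raises interest".split()):
    print(tree)
```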

18 CFG: formal definition A context-free grammar is a tuple (N, T, S, R): –N: the set of non-terminals Phrasal categories: S, NP, VP, ADJP, etc. Parts-of-speech (pre-terminals): NN, JJ, DT, VB –T: the set of terminals (the words) –S: the start symbol Often written as ROOT or TOP –R: the set of rules Of the form X → Y1 Y2 … Yk, with X, Yi ∈ N Examples: S → NP VP, VP → VP CC VP Also called rewrites, productions, or local trees

19 CFG: parsing Parse a sentence admitted by the grammar. Use deduction systems to prove parses from words. –Simple 10-rule grammar: 592 parses –Real-size grammar: many millions of parses! This scales very badly and didn’t yield broad-coverage tools.

20 Treebank Access a treebank (a corpus of sentences hand-annotated with gold-standard parse trees, e.g. the Penn Treebank) to develop broad-coverage grammars.
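As a concrete sketch, NLTK ships a small sample of the Penn Treebank (this assumes the sample has been fetched via nltk.download('treebank'); the file name below is from that sample):

```python
from nltk.corpus import treebank

# First gold-standard tree of the Wall Street Journal sample.
tree = treebank.parsed_sents('wsj_0001.mrg')[0]
print(tree)

# Grammar productions can be read straight off the annotated trees.
for production in tree.productions():
    print(production)
```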

21 Treebank Grammar Scale Treebank grammars can be enormous –The raw grammar has ~10K states, excluding the lexicon –Better parsers usually make the grammars larger, not smaller Solution?

22 Probabilistic Context Free Grammar (PCFG) A context-free grammar that associates a probability with each of its productions: P(Y1 Y2 … Yk | X). The probability of a parse generated by a PCFG is simply the product of the probabilities of the productions used to generate it.
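A minimal NLTK sketch (assuming NLTK 3’s PCFG.fromstring and ViterbiParser; the toy rules and probabilities are invented for illustration):

```python
import nltk

# Toy PCFG: the probabilities of rules sharing a left-hand side sum to 1.
pcfg = nltk.PCFG.fromstring("""
    S   -> NP VP   [1.0]
    NP  -> DT NN   [0.6]
    NP  -> NP PP   [0.4]
    VP  -> VBD NP  [0.7]
    VP  -> VP PP   [0.3]
    PP  -> IN NP   [1.0]
    DT  -> 'the'   [1.0]
    NN  -> 'dog' [0.5] | 'telescope' [0.5]
    VBD -> 'saw'   [1.0]
    IN  -> 'with'  [1.0]
""")

# ViterbiParser returns the single most probable parse; its probability is
# the product of the probabilities of the rules used to build it.
parser = nltk.ViterbiParser(pcfg)
sent = "the dog saw the dog with the telescope".split()
for tree in parser.parse(sent):
    print(tree.prob())
    print(tree)
```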

23 Outline Sentence Structure Constituency Syntactic Ambiguities Context Free Grammars (CFG) Probabilistic CFG (PCFG) Main issues –Designing grammars –Learning grammars (learn the set of rules automatically) –Parsing (inference: analyze a sentence and automatically build a syntax tree) Lexicalized Trees

24 The Game of Designing a Grammar –Annotation refines base treebank symbols to improve statistical fit of the grammar –Parent annotation [Johnson ’98]

25 The Game of Designing a Grammar –Annotation refines base treebank symbols to improve statistical fit of the grammar –Parent annotation [Johnson ’98] –Head lexicalization [Collins ’99, Charniak ’00]

26 The Game of Designing a Grammar –Annotation refines base treebank symbols to improve statistical fit of the grammar –Parent annotation [Johnson ’98] –Head lexicalization [Collins ’99, Charniak ’00] –Automatic clustering

27 Learning Many complicated learning algorithms… –Another time ;-) –Or take CS 288 in spring 2010 (recommended!)

28 Parsing with Context Free Grammar A parser processes input sentences according to the productions of a grammar, and builds one or more constituent structures that conform to the grammar. (Inference) –It is a procedural interpretation of the grammar. –It searches through the space of trees licensed by a grammar to find one that has the required sentence along its fringe.

29 Parsing algorithms –Top-down method (aka recursive descent parsing) –Bottom-up method (aka shift-reduce parsing) –Left-corner parsing –Dynamic programming techniques (chart parsing) –Etc.

30 Top-down parser (recursive descent): Begins with a tree consisting of the node S At each stage it consults the grammar to find a production that can be used to enlarge the tree When a lexical production is encountered, its word is compared against the input After a complete parse has been found, the parser backtracks to look for more parses.
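A sketch using NLTK’s reference implementation of this strategy (note that recursive descent cannot handle left-recursive rules like NP → NP PP, so this toy grammar avoids them; trace=2 prints each expansion, match, and backtrack step):

```python
import nltk

# Small grammar without left recursion, which would send a recursive
# descent parser into an infinite loop.
rd_grammar = nltk.CFG.fromstring("""
    S   -> NP VP
    NP  -> DT NN
    VP  -> VBZ NP
    DT  -> 'the'
    NN  -> 'fed' | 'rate'
    VBZ -> 'raises'
""")

rd_parser = nltk.RecursiveDescentParser(rd_grammar, trace=2)
for tree in rd_parser.parse("the fed raises the rate".split()):
    print(tree)
```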

31 Issues Memory requirements Computation time

32 Runtime: Practice Parsing with the vanilla treebank grammar (~20K rules, not an optimized parser), the observed runtime exponent is about 3.6, worse than the theoretical cubic bound of chart parsing. Why is it worse in practice? –Longer sentences “unlock” more of the grammar –All kinds of systems issues don’t scale

33 Problems with PCFGs If we do no annotation, these trees differ only in one rule: –VP → VP PP –NP → NP PP The parse will go one way or the other, regardless of the words Lexicalization allows us to be sensitive to specific words

34 Lexicalized Trees Add “headwords” to each phrasal node –Syntactic vs. semantic heads –Headship not in (most) treebanks –Usually use head rules, applied as a priority list, e.g.: NP: take the leftmost NP; else the rightmost N*; else the rightmost JJ; else the right child VP: take the leftmost VB*; else the leftmost VP; else the left child
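To make the priority-list reading concrete, a toy head-finding sketch (the function and its input format are hypothetical, not taken from any treebank toolkit):

```python
def find_np_head(children):
    """Pick the headword of an NP from its children, given as (label, word)
    pairs in left-to-right order, using the priority rules above."""
    for label, word in children:            # 1. leftmost NP
        if label == 'NP':
            return word
    for label, word in reversed(children):  # 2. rightmost N* (NN, NNS, ...)
        if label.startswith('N'):
            return word
    for label, word in reversed(children):  # 3. rightmost JJ
        if label == 'JJ':
            return word
    return children[-1][1]                  # 4. default: right child

print(find_np_head([('DT', 'the'), ('JJ', 'fat'), ('NN', 'trout')]))  # -> trout
```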

35 Lexicalized PCFGs? Problem: we now have to estimate probabilities of fully lexicalized rules, where each symbol carries its headword. We are never going to get these atomically off of a treebank: the counts are far too sparse. Solution: break up the derivation into smaller steps.

36 Resources Foundations of Statistical Natural Language Processing (chapter 12) Dan Klein’s group (and his class CS 288) Speech and Language Processing, Jurafsky and Martin (chapters 12, 13, 14) Software: –Berkeley parser (Klein group) –Michael Collins parser

37 Dependency grammars Phrase structure grammar is concerned with how words and sequences of words combine to form constituents. A distinct and complementary approach, dependency grammar, focuses instead on how words relate to other words. Dependency is a binary asymmetric relation that holds between a head and its dependents.

38 Dependency grammars Dependency graph: labeled directed graph –nodes are the lexical items –labeled arcs represent dependency relations from heads to dependents Can be used to directly express grammatical functions as a type of dependency.
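A toy NLTK sketch of such a graph, for the sentence on the next slide (assuming NLTK’s DependencyGrammar and ProjectiveDependencyParser; the head–dependent rules are invented for illustration and are unlabeled):

```python
import nltk

# Each rule names a head and the words that may depend on it.
dep_grammar = nltk.DependencyGrammar.fromstring("""
    'acquired' -> 'Publishing' | '%' | 'in'
    'Publishing' -> 'Shaw'
    '%' -> '30' | 'of'
    'of' -> 'City'
    'City' -> 'American'
    'in' -> 'March'
""")

parser = nltk.ProjectiveDependencyParser(dep_grammar)
sent = "Shaw Publishing acquired 30 % of American City in March".split()
for tree in parser.parse(sent):
    print(tree)
```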

39 Dependency grammars Dependency structure gives attachments. In principle, it can express any kind of dependency. How do we find the dependencies? Example: Shaw Publishing acquired 30% of American City in March (WHO acquired WHAT, WHEN)

40 Idea: Lexical Affinity Models Link up pairs with high mutual information –Mutual information measures how much one word tells us about another. –“The” doesn’t tell us much about what follows, i.e. “the” and “red” have small mutual information –“United”? Example: congress narrowly passed the amended bill
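For concreteness, a pointwise mutual information sketch over corpus counts, PMI(x, y) = log2 [P(x, y) / (P(x) P(y))] (the counts below are hypothetical):

```python
import math

def pmi(pair_count, x_count, y_count, total):
    """Pointwise mutual information of a word pair, from corpus counts."""
    p_xy = pair_count / total
    p_x = x_count / total
    p_y = y_count / total
    return math.log2(p_xy / (p_x * p_y))

# Hypothetical counts over a 1M-word corpus: "the red" barely beats chance,
# while "new york" co-occurs ~900x more often than independence predicts.
print(pmi(60, 60000, 900, 1_000_000))   # the/red:  PMI ~ 0.15 (weak affinity)
print(pmi(800, 1000, 900, 1_000_000))   # new/york: PMI ~ 9.8  (strong affinity)
```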

41 Problem: Non-Syntactic Affinity Words select other words (also) on syntactic grounds. Mutual information between words does not necessarily indicate syntactic selection. Examples: –a new year begins in new york –expect brushbacks but no beanballs –congress narrowly passed the amended bill

42 Idea: Word Classes Individual words like congress are entwined with semantic facts about the world. Syntactic classes, like NOUN and ADVERB, are bleached of word-specific semantics. Automatic word classes are more likely to look like DAYS-OF-WEEK or PERSON-NAME. We could build dependency models over word classes. [cf. Carroll and Charniak, 1992] Example: congress narrowly passed the amended bill → NOUN ADVERB VERB DET PARTICIPLE NOUN

43 Review Python and NLTK Lower level text processing (stemming, segmentation, …) Grammar –Morphology –Part-of-speech (POS) tagging –Phrase-level syntax (PCFG, parsing) Semantics –Word sense disambiguation (WSD) –Lexical acquisition

44 Review “Higher level” apps –Information extraction –Machine translation –Summarization –Question answering –Information retrieval Intro to probability theory and graphical models (GM) –Example for WSD –Language Models (LM) and smoothing Corpus-based statistical approaches to tackle NLP problems –Data (corpora, labels, linguistic resources) –Feature extractions –Statistical models: Classification and clustering

45 Review What I hope we achieved: –Given a language problem, know how to frame it in NLP terms and use the appropriate algorithms to tackle it –Overall idea of linguistic problems –Overall understanding of NLP tasks, both lower-level and higher-level applications –Basic understanding of statistical NLP Corpora & annotation Classification, clustering Sparsity problem –Familiarity with Python and NLTK