Parsing with Context-Free Grammars for ASR Julia Hirschberg CS 4706 Slides with contributions from Owen Rambow, Kathy McKeown, Dan Jurafsky and James Martin.

Slides:



Advertisements
Similar presentations
Introduction to Syntax and Context-Free Grammars Owen Rambow
Advertisements

Introduction to Syntax, with Part-of-Speech Tagging Owen Rambow September 17 & 19.
1 Context Free Grammars Chapter 12 (Much influenced by Owen Rambow) September 2012 Lecture #5.
Syntax and Context-Free Grammars Julia Hirschberg CS 4705 Slides with contributions from Owen Rambow, Kathy McKeown, Dan Jurafsky and James Martin.
Chapter 4 Syntax.
Syntactic analysis using Context Free Grammars. Analysis of language Morphological analysis – Chairs, Part Of Speech (POS) tagging – The/DT man/NN left/VBD.
Context-Free Grammars Julia Hirschberg CS 4705 Slides with contributions from Owen Rambow, Kathy McKeown, Dan Jurafsky and James Martin.
Sub-constituents of NP in English September 12, 2007.
Introduction to Syntax Owen Rambow September 30.
Grammars, constituency and order A grammar describes the legal strings of a language in terms of constituency and order. For example, a grammar for a fragment.
Statistical NLP: Lecture 3
Midterm Exam Nov. 2 1pm to 4pm Room: 3002 NSH Open book –But no internet or cell phone May bring food. May step outside to smoke. May go to restrooms.
1 Words and the Lexicon September 10th 2009 Lecture #3.
Introduction to Syntax Owen Rambow September
Introduction to Syntax Owen Rambow October
Syntax: Structural Descriptions of Sentences. Why Study Syntax? Syntax provides systematic rules for forming new sentences in a language. can be used.
Introduction to Syntax, with Part-of-Speech Tagging Owen Rambow September 17 & 19.
1 Introduction to Computational Linguistics Eleni Miltsakaki AUTH Fall 2005-Lecture 2.
Introduction to Syntax and Context-Free Grammars Owen Rambow September
1 CONTEXT-FREE GRAMMARS. NLE 2 Syntactic analysis (Parsing) S NPVP ATNNSVBD NP AT NNthechildrenate thecake.
Stochastic POS tagging Stochastic taggers choose tags that result in the highest probability: P(word | tag) * P(tag | previous n tags) Stochastic taggers.
Linguistics II Syntax. Rules of how words go together to form sentences What types of words go together How the presence of some words predetermines others.
Today  What is syntax?  Grammaticality  Ambiguity  Phrase structure Readings: 6.1 – 6.2.
Basic Parsing with Context- Free Grammars 1 Some slides adapted from Julia Hirschberg and Dan Jurafsky.
Models of Generative Grammar Smriti Singh. Generative Grammar  A Generative Grammar is a set of formal rules that can generate an infinite set of sentences.
Constituency Tests Phrase Structure Rules
Syntax Nuha AlWadaani.
THE PARTS OF SYNTAX Don’t worry, it’s just a phrase ELL113 Week 4.
11 CS 388: Natural Language Processing: Syntactic Parsing Raymond J. Mooney University of Texas at Austin.
Context Free Grammars Reading: Chap 12-13, Jurafsky & Martin This slide set was adapted from J. Martin, U. Colorado Instructor: Paul Tarau, based on Rada.
TEORIE E TECNICHE DEL RICONOSCIMENTO Linguistica computazionale in Python: -Analisi sintattica (parsing)
Probabilistic Parsing Reading: Chap 14, Jurafsky & Martin This slide set was adapted from J. Martin, U. Colorado Instructor: Paul Tarau, based on Rada.
For Monday Read chapter 23, sections 1-2 FOIL exercise due.
Natural Language Processing Lecture 6 : Revision.
GRAMMARS David Kauchak CS159 – Fall 2014 some slides adapted from Ray Mooney.
Chapter 12: FORMAL GRAMMARS OF ENGLISH Heshaam Faili University of Tehran.
NLP. Introduction to NLP Is language more than just a “bag of words”? Grammatical rules apply to categories and groups of words, not individual words.
Today Phrase structure rules, trees Constituents Recursion Conjunction
PARSING David Kauchak CS159 – Spring 2011 some slides adapted from Ray Mooney.
Context Free Grammars Reading: Chap 9, Jurafsky & Martin This slide set was adapted from J. Martin, U. Colorado Instructor: Rada Mihalcea.
11 Chapter 14 Part 1 Statistical Parsing Based on slides by Ray Mooney.
For Wednesday Read chapter 23 Homework: –Chapter 22, exercises 1,4, 7, and 14.
Linguistic Essentials
CPE 480 Natural Language Processing Lecture 4: Syntax Adapted from Owen Rambow’s slides for CSc Fall 2006.
Rules, Movement, Ambiguity
CSA2050 Introduction to Computational Linguistics Parsing I.
PARSING 2 David Kauchak CS159 – Spring 2011 some slides adapted from Ray Mooney.
1 Context Free Grammars October Syntactic Grammaticality Doesn’t depend on Having heard the sentence before The sentence being true –I saw a unicorn.
1 Context Free Grammars Chapter 9 (Much influenced by Owen Rambow) October 2009 Lecture #7.
For Friday No reading Program 4 due. Program 4 Any questions?
CPSC 422, Lecture 27Slide 1 Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 27 Nov, 16, 2015.
1 Introduction to Computational Linguistics Eleni Miltsakaki AUTH Spring 2006-Lecture 2.
SYNTAX.
English Syntax Read J & M Chapter 9.. Two Kinds of Issues Linguistic – what are the facts about language? The rules of syntax (grammar) Algorithmic –
◦ Process of describing the structure of phrases and sentences Chapter 8 - Phrases and sentences: grammar1.
GRAMMARS David Kauchak CS457 – Spring 2011 some slides adapted from Ray Mooney.
CSA3050: NLP Algorithms Sentence Grammar NLP Algorithms.
NLP. Introduction to NLP #include int main() { int n, reverse = 0; printf("Enter a number to reverse\n"); scanf("%d",&n); while (n != 0) { reverse =
College of Science and Humanity Studies, Al-Kharj.
SYNTAX.
King Faisal University جامعة الملك فيصل Deanship of E-Learning and Distance Education عمادة التعلم الإلكتروني والتعليم عن بعد [ ] 1 King Faisal University.
Introduction to Syntax and Context-Free Grammars
An Introduction to the Government and Binding Theory
Statistical NLP: Lecture 3
Part I: Basics and Constituency
CSC NLP -Context-Free Grammars
Introduction to Syntax and Context-Free Grammars cs
Introduction to Linguistics
Introduction to Syntax
David Kauchak CS159 – Spring 2019
Presentation transcript:

Parsing with Context-Free Grammars for ASR Julia Hirschberg CS 4706 Slides with contributions from Owen Rambow, Kathy McKeown, Dan Jurafsky and James Martin

What is Syntax? Structure of language How words are arranged together and related to one another Goal of syntactic analysis: relate surface form (what someone says or writes) to underlying structure, to support semantic analysis (what the utterance or text means) Syntactic representation: typically a tree structure

Structure in Strings A set of words, or, a lexicon: the a small nice big very boy girl sees likes Some `good’ (grammatical) sentences: –the boy likes a girl –the small girl likes the big girl –a very small nice boy sees a very nice boy Some bad (ungrammatical) sentences: –*the boy the girl –*small boy likes nice girl Can we find a way of distinguishing between the two kinds of sequences? Can we identify similarities among grammatical subsequences?

One Version of Constituent Structure Lexicon: the a small nice big very boy girl sees likes Grammatical sentences: –(the) boy (likes a girl) –(the small) girl (likes the big girl) –(a very small nice) boy (sees a very nice boy) Ungrammatical sentences: –*(the) boy (the girl) –*(small) boy (likes the nice girl)

Another Constituency Hypothesis Lexicon: the a small nice big very boy girl sees likes Grammatical sentences: –(the boy) likes (a girl) –(the small girl) likes (the big girl) –(a very small nice boy) sees (a very nice boy) Ungrammatical sentences: –*(the boy) (the girl) –*(small boy) likes (the nice girl) Better: fewer types of constituents (blue and red are of same type)

Even More Structures Lexicon: the a small nice big very boy girl sees likes Grammatical sentences: –((the) boy) likes ((a) girl) –((the) (small) girl) likes ((the) (big) girl) –((a) ((very) small) (nice) boy) sees ((a) ((very) nice) girl) Ungrammatical sentences: –*((the) boy) ((the) girl) –*((small) boy) likes ((the) (nice) girl)

From Substrings to Trees (((the) boy) likes ((a) girl)) boy the likes girl a

How do we Label the Nodes? ( ((the) boy) likes ((a) girl) ) Choose constituents so each one has one non-bracketed word: the head Group words by distribution of constituents they head (POS) –Noun (N), verb (V), adjective (Adj), adverb (Adv), determiner (Det) Category of constituent: XP, where X is POS –NP, S, AdjP, AdvP, DetP

Types of Nodes (((the/Det) boy/N) likes/V ((a/Det) girl/N)) boy the likes girl a DetP NP DetP S Phrase-structure tree nonterminal symbols = constituents terminal symbols = words

Determining Part-of-Speech A blue seat/a child seat: noun or adjective? –Syntax: a blue seat a child seat a very blue seat *a very child seat this seat is blue *this seat is child –Morphology: bluer *childer –blue and child are not the same POS –blue is Adj, child is Noun

Determining Part-of-Speech –Preposition or particle? A he threw out the garbage B he threw the garbage out the door A he threw the garbage out B *he threw the garbage the door out –The two out are not same POS A is particle, B is Preposition

Constituency Some Noun phrases (NPs) A red dog on a blue tree A blue dog on a red tree Some big dogs and some little dogs A dog I Big dogs, little dogs, red dogs, blue dogs, yellow dogs, green dogs, black dogs, and white dogs How do we know these form a constituent?

NP Constituency NPs can all appear before a verb: –Some big dogs and some little dogs are going around in cars… –Big dogs, little dogs, red dogs, blue dogs, yellow dogs, green dogs, black dogs, and white dogs are all at a dog party! –I do not But individual words can’t always appear before verbs: –*little are going… –*blue are… –*and are Must be able to state generalizations like: –Noun phrases occur before verbs

PP Constituency Preposing and postposing: –Under a tree is a yellow dog. –A yellow dog is under a tree. But not: –*Under, is a yellow dog a tree. –*Under a is a yellow dog tree. Prepositional phrases notable for ambiguity in attachment –I saw a man on a hill with a telescope.

Phrase Structure and Dependency Structure likes/ V boy/ N girl/ N the/ Det a/ Det boy the likes girl a DetP NP DetP S All nodes are labeled with words! Only leaf nodes labeled with words!

Phrase Structure and Dependency Structure likes/ V boy/ N girl/ N the/ Det a/ Det boy the likes girl a DetP NP DetP S Representationally equivalent if each nonterminal node has one lexical daughter (its head)

Types of Dependency likes/ V boy/ N girl/ N a/ Det small/ Adj the/ Det very/ Adv sometimes/ Adv Obj Subj Adj(unct) Fw Adj

Grammatical Relations Types of relations between words –Arguments: subject, object, indirect object, prepositional object –Adjuncts: temporal, locative, causal, manner, … –Function Words

Subcategorization List of arguments of a word (typically, a verb), with features about realization (POS, perhaps case, verb form etc) In canonical order Subject-Object-IndObj Example: –like: N-N, N-V(to-inf) –see: N, N-N, N-N-V(inf) NB: J&M talk about subcategorization only within VP

VP Constituency boy the likes girl a DetP NP DetP S boy the likes DetP NP girl a NP DetP S VP

VP Constituency Existence of VP is a linguistic (i.e., empirical) claim, not a methodological claim Syntactic evidence –VP-fronting (and quickly clean the carpet he did! ) –VP-ellipsis (He cleaned the carpet quickly, and so did she ) –Adjuncts can occur before and after VP, but not in VP (He often eats beans, *he eats often beans ) NB: VP cannot be represented in a dependency representation

Context-Free Grammars Defined in formal language theory –Terminals: e.g. cat –Non-terminal symbols: e.g. NP, VP –Start symbol: e.g. S –Rewriting rules: e.g. S  NP VP Start with start symbol, rewrite using rules, done when only terminals left

A Fragment of English S  NP VP VP  V PP NP  DetP N N  cat | mat V  is PP  Prep NP Prep  on DetP  the Input: the cat is on the mat

Derivations in a CFG S S S  NP VP VP  V PP NP  DetP N N  cat | mat V  is PP  Prep NP Prep  on DetP  the

Derivations in a CFG NP VP NP S VP S  NP VP VP  V PP NP  DetP N N  cat | mat V  is PP  Prep NP Prep  on DetP  the

Derivations in a CFG DetP N VP DetP NP S VP N S  NP VP VP  V PP NP  DetP N N  cat | mat V  is PP  Prep NP Prep  on DetP  the

Derivations in a CFG the cat VP catthe DetP NP S VP N S  NP VP VP  V PP NP  DetP N N  cat | mat V  is PP  Prep NP Prep  on DetP  the

Derivations in a CFG the cat V PP catthe DetP NP PP S VP NV S  NP VP VP  V PP NP  DetP N N  cat | mat V  is PP  Prep NP Prep  on DetP  the

Derivations in a CFG the cat is Prep NP cattheis DetP NP PP Prep S VP N NP V S  NP VP VP  V PP NP  DetP N N  cat | mat V  is PP  Prep NP Prep  on DetP  the

Derivations in a CFG the cat is on Det N cattheis DetP NP DetP PP Prep S VP N NP V S  NP VP VP  V PP NP  DetP N N  cat | mat V  is PP  Prep NP Prep  on DetP  the on N

Derivations in a CFG the cat is on the mat cattheis DetP NP DetP PP Prep S VP N NP V S  NP VP VP  V PP NP  DetP N N  cat | mat V  is PP  Prep NP Prep  on DetP  the on N themat

A More Complicated Fragment of English S  NP VP S  VP VP  V PP VP  V NP VP  V NP  DetP NP NP  N NP NP  N PP  Prep NP N  cat | mat | food | bowl | Mary V  is | likes | sits Prep  on | in | under DetP  the | a Mary likes the cat bowl. The cat ate the tasty food. Hello. Nice talking to you.

Pocket Sphinx Grammar Format Variables go in angle brackets, e.g. Terminals must appear in your pronunciation dictionary (case sensitive) X Y is concatenation -- e.g., I WANT (X | Y) means X or Y -- e.g., (WANT|NEED) Square brackets mean optional -- e.g., [ON] FRIDAY * means that the expansion may be spoken zero or more times -- e.g. * + means one or more times -- e.g. +

Example = BOSTON | NEWYORK | WASHINGTON | BALTIMORE; = MORNING | EVENING; = FRIDAY | MONDAY; public = (((WHAT TRAINS LEAVE) | (WHAT TIME CAN I TRAVEL) | (IS THERE A TRAIN)) (FROM|TO) [(FROM|TO) ] ON [ ]); Hello. No. I want to go on Tuesday. When does the train leave?

Next Class Language modeling for large vocabulary applications: Ngrams