Modeling Grammaticality


Modeling Grammaticality (600.465 Intro to NLP, J. Eisner)

Word trigrams: a good model of English? Which sentences are acceptable? [The slide showed example sentences judged for acceptability, probing errors such as wrong verb forms ("has" vs. "have"), singular/plural agreement, and a sentence with no main verb.]

Why it does okay … We never see "the go of" in our training text, so our dice will never generate "the go of": that trigram has probability 0.

Why it does okay … but isn't perfect. We still get some ungrammatical sentences: every 3-gram in them is "attested" in the training text, yet the sentence as a whole isn't good. For example, the training sentence "You shouldn't eat these chickens because these chickens eat arsenic and bone meal …" attests both "eat these chickens" and "these chickens eat", so the 3-gram model can glue them into "… eat these chickens eat …".

Could we rule these bad sentences out? With 4-grams, 5-grams, …, 50-grams? Would we then generate only grammatical English?
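To make the "attested 3-grams" idea concrete, here is a minimal sketch (not the lecture's code; the helper names `trigrams`, `all_trigrams_attested`, and `extend` are invented, and it assumes whitespace tokenization):

```python
import random
from collections import defaultdict

def trigrams(words):
    """All consecutive 3-word windows of a sentence."""
    return [tuple(words[i:i+3]) for i in range(len(words) - 2)]

training = [
    "you shouldn't eat these chickens because these chickens eat arsenic and bone meal".split(),
]

# Attested trigrams, plus a map from each 2-word history to possible next words.
attested = set()
continuations = defaultdict(list)
for sent in training:
    for w1, w2, w3 in trigrams(sent):
        attested.add((w1, w2, w3))
        continuations[(w1, w2)].append(w3)

def all_trigrams_attested(sentence):
    """True iff every 3-gram of the sentence occurred in training."""
    return all(tg in attested for tg in trigrams(sentence.split()))

# Every 3-gram of this fragment is attested, yet the fragment is bad English:
print(all_trigrams_attested("eat these chickens eat"))   # True

def extend(w1, w2, steps):
    """Generate by gluing overlapping trigrams ("rolling dice")."""
    out = [w1, w2]
    for _ in range(steps):
        nxt = random.choice(continuations.get((out[-2], out[-1]), ["<eos>"]))
        if nxt == "<eos>":
            break
        out.append(nxt)
    return " ".join(out)

print(extend("eat", "these", 5))  # may wander into "eat these chickens eat ..."
```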

[Venn diagram: the set of grammatical English sentences; inside it, the training sentences; around the training sentences, nested regions of sentences possible under the trained 3-gram, 4-gram, and 50-gram models (a sentence is "possible" under the 3-gram model if it can be built from observed 3-grams by rolling dice). The 50-gram region is marked with a "?".]

What happens as you increase the amount of training text? [Same diagram as above: the training sentences and the nested 3-gram, 4-gram, and 50-gram regions; with more training text, the regions grow.]

What happens as you increase the amount of training text? In the limit, the training sentences are all of English! Now where are the 3-gram, 4-gram, and 50-gram boxes? Is the 50-gram box now perfect? (Can any model of language be perfect?) Can you name some sentences in the 50-gram box that fall outside the set of grammatical English sentences (the blue region of the diagram)?

Are n-gram models enough? Can we make a list of (say) 3-grams that combine into all the grammatical sentences of English? OK, how about only the grammatical sentences? How about all and only?

Can we avoid the systematic problems with n-gram models? The trouble is remembering things from arbitrarily far back in the sentence: Was the subject singular or plural? Have we had a verb yet? The formal-language equivalent: a language whose strings have the forms a x* b and c x* d (x* means "0 or more x's"), but not a x* d or c x* b. Can we check grammaticality using a 50-gram model? No? Then what can we use instead?
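One quick way to see why no fixed n works: once the run of x's is longer than n, an n-gram attestation check sees exactly the same windows in the bad string a x^m d as in the good strings a x^m b and c x^m d. A minimal sketch (the helper name `ngrams` is invented, and n = 5 stands in for 50):

```python
def ngrams(s, n):
    """All length-n windows of the symbol sequence s."""
    return {tuple(s[i:i+n]) for i in range(len(s) - n + 1)}

n = 5    # stand-in for 50
m = 20   # x-run longer than n

good1 = ["a"] + ["x"] * m + ["b"]
good2 = ["c"] + ["x"] * m + ["d"]
bad   = ["a"] + ["x"] * m + ["d"]   # mixes the endpoints: not in the language

attested = ngrams(good1, n) | ngrams(good2, n)

# Every n-gram of the bad string already occurs in some good string,
# so an n-gram check cannot reject it:
print(ngrams(bad, n) <= attested)   # True
```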

Finite-state models. Regular expression: a x* b | c x* d. [Diagram: a finite-state acceptor with two branches, an a-branch that loops on x and exits on b, and a c-branch that loops on x and exits on d.] The model must remember whether the first letter was a or c. Where does the FSA do that? (See the sketch below.)
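It does it in its state: after reading a it sits in one state, after c in another, and only the matching final letter reaches the accepting state. A minimal sketch (the state names are invented):

```python
TRANSITIONS = {
    ("START", "a"): "SAW_A",
    ("START", "c"): "SAW_C",
    ("SAW_A", "x"): "SAW_A",   # loop on x, still remembering 'a'
    ("SAW_C", "x"): "SAW_C",   # loop on x, still remembering 'c'
    ("SAW_A", "b"): "ACCEPT",
    ("SAW_C", "d"): "ACCEPT",
}

def accepts(string):
    """Run the acceptor; reject on any missing transition."""
    state = "START"
    for ch in string:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state == "ACCEPT"

for s in ["axxb", "cxd", "ab", "axxd", "cxxb", "x"]:
    print(s, accepts(s))
# axxb True, cxd True, ab True, axxd False, cxxb False, x False
```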

Context-free grammars. Sentence → Noun Verb Noun, abbreviated S → N V N. N → Mary. V → likes. How many sentences? Let's add: N → John. Let's add: V → sleeps, S → N V. Let's add: V → thinks, S → N V S.
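The last rule is recursive (S appears on its own right-hand side), so the grammar now licenses infinitely many sentences, e.g. "Mary thinks John thinks Mary sleeps". A minimal sketch that enumerates derivations up to a depth bound (the bound is an assumption, since true enumeration never terminates; the name `expand` is invented):

```python
GRAMMAR = {
    "S": [["N", "V", "N"], ["N", "V"], ["N", "V", "S"]],
    "N": [["Mary"], ["John"]],
    "V": [["likes"], ["sleeps"], ["thinks"]],
}

def expand(symbols, depth):
    """Yield terminal strings derivable from `symbols` in at most `depth` rewrites."""
    if all(s not in GRAMMAR for s in symbols):   # all terminals: a full sentence
        yield " ".join(symbols)
        return
    if depth == 0:                               # derivation too deep: give up
        return
    # Rewrite the leftmost nonterminal each possible way.
    i = next(j for j, s in enumerate(symbols) if s in GRAMMAR)
    for rhs in GRAMMAR[symbols[i]]:
        yield from expand(symbols[:i] + rhs + symbols[i + 1:], depth - 1)

sentences = set(expand(["S"], depth=10))
print(len(sentences))     # finite at any fixed depth, but grows without bound
print("Mary thinks John thinks Mary sleeps" in sentences)   # True
```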

Write a grammar of English. Syntactic rules (each rule is preceded by its weight):
1 S → NP VP .
1 VP → VerbT NP
20 NP → Det N'
1 NP → Proper
20 N' → Noun
1 N' → N' PP
1 PP → Prep NP
You have two weeks. What's a grammar?

Now write a grammar of English. Syntactic rules:
1 S → NP VP .
1 VP → VerbT NP
20 NP → Det N'
1 NP → Proper
20 N' → Noun
1 N' → N' PP
1 PP → Prep NP
Lexical rules:
1 Noun → castle
1 Noun → king
…
1 Proper → Arthur
1 Proper → Guinevere
1 Det → a
1 Det → every
1 VerbT → covers
1 VerbT → rides
1 Misc → that
1 Misc → bloodier
1 Misc → does

Now write a grammar of English. Here's one to start with:
1 S → NP VP .
1 VP → VerbT NP
20 NP → Det N'
1 NP → Proper
20 N' → Noun
1 N' → N' PP
1 PP → Prep NP
[Diagram: random generation begins at S, which rewrites as NP VP . (the only S rule, so it fires with probability 1).]

Now write a grammar of English (continued). [Diagram: the NP rewrites as Det N' with probability 20/21 rather than as Proper with probability 1/21; the rule weights 20 and 1 are normalized into probabilities.]

Now write a grammar of English (continued). [Diagram: the derivation completes, e.g. Det → every and Noun → castle in the subject, yielding the sampled sentence "every castle drinks [[Arthur [across the [coconut in the castle]]] [above another chalice]]" (brackets show the constituency of the object NP).]
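The weights become probabilities by normalizing over all rules that share a left-hand side, so NP → Det N' gets 20/(20+1) = 20/21 and NP → Proper gets 1/21. A minimal sketch of that bookkeeping (the names `RULES` and `rule_probabilities` are invented):

```python
from collections import defaultdict

# (weight, lhs, rhs) triples for the starter grammar's NP rules.
RULES = [
    (20, "NP", ["Det", "N'"]),
    (1,  "NP", ["Proper"]),
]

def rule_probabilities(rules):
    """Normalize weights into probabilities, separately per left-hand side."""
    totals = defaultdict(float)
    for w, lhs, _ in rules:
        totals[lhs] += w
    return {(lhs, tuple(rhs)): w / totals[lhs] for w, lhs, rhs in rules}

print(rule_probabilities(RULES))
# {('NP', ('Det', "N'")): 0.952..., ('NP', ('Proper',)): 0.047...}  i.e. 20/21 and 1/21
```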

Randomly Sampling a Sentence.
S → NP VP
NP → Det N
NP → NP PP
VP → V NP
VP → VP PP
PP → P NP
NP → Papa
N → caviar
N → spoon
V → spoon
V → ate
P → with
Det → the
Det → a
[Diagram: a derivation tree sampled from this grammar, yielding "Papa ate the caviar with a spoon".]
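A minimal recursive sampler over this grammar, choosing uniformly among a symbol's rules since no weights are given here (the function name `sample` is invented). Because of the NP → NP PP and VP → VP PP recursions, sampled sentences can be arbitrarily long:

```python
import random

GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"], ["NP", "PP"], ["Papa"]],
    "VP":  [["V", "NP"], ["VP", "PP"]],
    "PP":  [["P", "NP"]],
    "N":   [["caviar"], ["spoon"]],
    "V":   [["spoon"], ["ate"]],
    "P":   [["with"]],
    "Det": [["the"], ["a"]],
}

def sample(symbol="S"):
    """Expand `symbol` by picking one of its rules uniformly at random."""
    if symbol not in GRAMMAR:            # terminal: a word
        return [symbol]
    rhs = random.choice(GRAMMAR[symbol])
    return [word for sym in rhs for word in sample(sym)]

print(" ".join(sample()))   # e.g. "Papa ate the caviar with a spoon"
```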

Ambiguity. [Diagrams, two slides: the same grammar yields two parses of "Papa ate the caviar with a spoon". In one, the PP "with a spoon" attaches to the VP (the spoon is the instrument of eating); in the other, it attaches to the NP "the caviar" (the caviar comes with a spoon). Grammar rules as on the previous slide.]
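To see both parses concretely, we can hand the grammar to a chart parser. A sketch using NLTK (assuming the nltk package is installed; this is my encoding of the slide's grammar, not code from the lecture):

```python
import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N | NP PP | 'Papa'
VP -> V NP | VP PP
PP -> P NP
N -> 'caviar' | 'spoon'
V -> 'spoon' | 'ate'
P -> 'with'
Det -> 'the' | 'a'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("Papa ate the caviar with a spoon".split()):
    print(tree)   # prints two trees: VP-attachment and NP-attachment of the PP
```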

Parsing. [Diagram: the reverse task: given the grammar and the input sentence "Papa ate the caviar with a spoon", reconstruct the parse tree(s) over it.]
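The companion lectures in this series (Probabilistic CKY, Earley's Algorithm) cover how to do this efficiently. As a taste, here is a minimal CKY-style recognizer for the Papa grammar, which is conveniently already in the binary form CKY needs (the names `BINARY`, `LEXICAL`, and `cky_recognize` are invented):

```python
BINARY = [           # A -> B C rules of the Papa grammar (already binary)
    ("S", "NP", "VP"), ("NP", "Det", "N"), ("NP", "NP", "PP"),
    ("VP", "V", "NP"), ("VP", "VP", "PP"), ("PP", "P", "NP"),
]
LEXICAL = {          # word -> nonterminals that can rewrite to it
    "Papa": {"NP"}, "caviar": {"N"}, "spoon": {"N", "V"},
    "ate": {"V"}, "with": {"P"}, "the": {"Det"}, "a": {"Det"},
}

def cky_recognize(words):
    """True iff the grammar derives `words` (CKY dynamic programming)."""
    n = len(words)
    # chart[i][j] = set of nonterminals deriving the span words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(LEXICAL.get(w, ()))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):             # try every split point
                for a, b, c in BINARY:
                    if b in chart[i][k] and c in chart[k][j]:
                        chart[i][j].add(a)
    return "S" in chart[0][n]

print(cky_recognize("Papa ate the caviar with a spoon".split()))   # True
print(cky_recognize("Papa the caviar ate".split()))                # False
```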

Dependency Parsing (slide adapted from Yuji Matsumoto). [Diagram: labeled dependency arcs drawn over the sentence "He reckons the current account deficit will narrow to only 1.8 billion in September." The arc labels include SUBJ, COMP, S-COMP, SPEC, MOD, and ROOT.]
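In code, a dependency parse is just a head (and arc label) for each word. A sketch of that representation; the particular head and label assignments below are my reconstruction from the slide's arc labels, illustrative rather than authoritative:

```python
# Each word points to its head; head index 0 marks the ROOT.
# Rows are (word_index, word, head_index, label), an illustrative guess.
PARSE = [
    (1, "He",        2,  "SUBJ"),
    (2, "reckons",   0,  "ROOT"),
    (3, "the",       6,  "SPEC"),
    (4, "current",   6,  "MOD"),
    (5, "account",   6,  "MOD"),
    (6, "deficit",   8,  "SUBJ"),
    (7, "will",      8,  "MOD"),
    (8, "narrow",    2,  "S-COMP"),
    (9, "to",        8,  "COMP"),
    (10, "only",     12, "MOD"),
    (11, "1.8",      12, "MOD"),
    (12, "billion",  9,  "COMP"),
    (13, "in",       8,  "MOD"),
    (14, "September", 13, "COMP"),
]

def children(head):
    """All dependents of the word at index `head`."""
    return [w for i, w, h, _ in PARSE if h == head]

print(children(8))   # dependents of "narrow": ['deficit', 'will', 'to', 'in']
```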