Amirkabir University of Technology Computer Engineering Faculty AILAB Grammars for Natural Language Ahmad Abdollahzadeh Barfouroush Mehr 1381

An Example of an NLU System Structure
[Block diagram] Interpretation pipeline: Words (input) → Parsing → Syntactic Structure and Logical Form → Contextual Interpretation → Final Meaning → Application Reasoning. Generation pipeline: Meaning of Response → Utterance Planning → Syntactic Structure and Logical Form of the Response → Realization → Words (response). Knowledge sources used along the way: Lexicon, Grammars, Discourse Context, Application Context.

Grammar and Parsing
To examine how the syntactic structure of a sentence can be computed, you must consider two things:
- Grammar: a formal specification of the allowable structures in the language.
- Parsing: the method of analysing a sentence to determine its structure according to the grammar.

Grammar and Language
A grammar G generates a characteristic language L(G) and assigns structures to all s ∈ L(G).
For grammar G and start symbol S: L(G) = {x | S derives x}.
For X ∈ N and α, β, Y ∈ (Σ ∪ N)* (i.e. sequences of terminal/non-terminal symbols):
- αXβ immediately derives αYβ iff X → Y ∈ G
- αXβ derives αZβ iff αXβ immediately derives αZβ, or αXβ immediately derives αYβ and αYβ derives αZβ
A grammar does not tell us how to generate L(G) or how to discover such structures.

Grammar Definition
A grammar G is defined by a four-tuple, written G = (N, Σ, P, S0), where:
- N is the set of non-terminal symbols
- Σ is the set of terminal symbols
- P is the set of rewrite rules of the form α → β, where α and β are strings of symbols
- S0 is the start symbol
In this definition N and Σ are disjoint sets. Only non-terminals are rewritable and can occur on both sides of a rule.

An Example of a Grammar
1- S → NP VP
2- VP → V NP
3- NP → NAME
4- NP → ART N
5- NAME → Amir
6- V → ate
7- ART → the
8- N → biscuit
S0 = S
N = {S, NP, VP, NAME, V, ART, N}
P = {rules 1 to 8}
Σ = {Amir, ate, the, biscuit}
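
A minimal sketch (not part of the original slides) of how this toy grammar could be encoded and searched for a derivation in Python; the rules and the sentence "Amir ate the biscuit" come from the slide, while the function and variable names are illustrative assumptions.

    # The toy grammar from the slide, encoded as rewrite rules.
    # Left-hand sides are non-terminals; right-hand sides are tuples of symbols.
    RULES = [
        ("S",    ("NP", "VP")),
        ("VP",   ("V", "NP")),
        ("NP",   ("NAME",)),
        ("NP",   ("ART", "N")),
        ("NAME", ("Amir",)),
        ("V",    ("ate",)),
        ("ART",  ("the",)),
        ("N",    ("biscuit",)),
    ]
    NONTERMINALS = {lhs for lhs, _ in RULES}

    def rewrite_once(form):
        """Yield every string obtainable by rewriting one non-terminal
        (the 'immediately derives' relation from the earlier slide)."""
        for i, symbol in enumerate(form):
            if symbol in NONTERMINALS:
                for lhs, rhs in RULES:
                    if lhs == symbol:
                        yield form[:i] + list(rhs) + form[i + 1:]

    def derives(target, max_steps=10):
        """Bounded breadth-first search for a derivation S =>* target."""
        frontier = [["S"]]
        for _ in range(max_steps):
            next_frontier = []
            for form in frontier:
                if form == target:
                    return True
                if len(form) <= len(target):   # this grammar never shrinks strings
                    next_frontier.extend(rewrite_once(form))
            frontier = next_frontier
        return False

    print(derives("Amir ate the biscuit".split()))   # True
    print(derives("Amir ate biscuit".split()))       # False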

Sentence Structure
Two methods for representing sentence structure are:
- Parse trees
- Lists

Parse Tree
"Man ate the apple"
[Tree diagram: S → NP VP; NP → NAME → Man; VP → V NP; V → ate; NP → ART N; ART → the; N → apple]

Parse Tree
A parse tree includes information about:
- precedence between constituents
- dominance between constituents
It constitutes a trace of the rule applications used to derive a sentence, but does not tell you the order in which the rules were used.

Lists Man ate the apple ( S (NP (NAME Man)) (VP (V ate) (NP (ART the) (N apple))))
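
As a small illustration (not from the slides), the list representation above maps directly onto nested lists in Python; the parse and sentence come from the slide, and the helper name is an assumption.

    # The slide's list representation as nested Python lists.
    parse = ["S",
             ["NP", ["NAME", "Man"]],
             ["VP", ["V", "ate"],
                    ["NP", ["ART", "the"], ["N", "apple"]]]]

    def leaves(tree):
        """Collect the terminal words of a parse, left to right."""
        _label, *children = tree
        if all(isinstance(c, str) for c in children):
            return list(children)          # pre-terminal: children are words
        return [word for child in children for word in leaves(child)]

    print(" ".join(leaves(parse)))         # Man ate the apple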

Chomsky's Hierarchy
Different classes of grammar result from various restrictions on the form of rules. Grammars can be compared according to the range of languages each formalism can describe.

Types of Grammar in the Hierarchy
- Regular or Right-Linear (Type 3): every rewrite rule is of the form X → aY or X → a, where a is a sequence of terminals.
- Context-Free Grammar (CFG) (Type 2): every rewrite rule is of the form X → α, where X ∈ N and α ∈ (Σ ∪ N)+.
- Context-Sensitive Grammar (Type 1): every rewrite rule is of the form γ1 X γ2 → γ1 α γ2, where X ∈ N, γ1, γ2 ∈ (Σ ∪ N)*, and α ∈ (Σ ∪ N)+.
- Unrestricted (Type 0): every rewrite rule is of the form α → β; there is no restriction on the form of the rules.
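
A small illustrative sketch (not from the slides) that checks which of these rule forms a given rewrite rule fits; for Type 1 it uses the common length-non-decreasing test rather than the context form above, and all names are assumptions.

    # Simplified rule classifier for the Chomsky hierarchy.
    # A rule is (lhs, rhs), each a tuple of symbols; NONTERMS plays the role of N.
    NONTERMS = {"S", "NP", "VP", "PP", "X", "Y"}

    def is_terminal(sym):
        return sym not in NONTERMS

    def rule_type(lhs, rhs):
        """Return the most restrictive type (3, 2, 1 or 0) this rule satisfies."""
        if len(lhs) == 1 and lhs[0] in NONTERMS:
            body, last = rhs[:-1], rhs[-1:]
            # Type 3 (right-linear): X -> a  or  X -> aY, with a a terminal string
            if all(is_terminal(s) for s in rhs):
                return 3
            if last and last[0] in NONTERMS and all(is_terminal(s) for s in body):
                return 3
            # Type 2 (context-free): X -> any non-empty string of symbols
            if len(rhs) >= 1:
                return 2
        # Type 1 (context-sensitive), simplified: right side no shorter than left
        if 0 < len(lhs) <= len(rhs):
            return 1
        return 0                              # Type 0 (unrestricted)

    print(rule_type(("X",), ("a", "Y")))           # 3
    print(rule_type(("NP",), ("NP", "PP")))        # 2
    print(rule_type(("X", "Y"), ("X", "a", "Y")))  # 1
    print(rule_type(("X", "Y"), ("a",)))           # 0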

Categorized Grammar
A grammar G is defined by a five-tuple, written G = (N, Σ, T, P, S0), where:
- N is the set of non-terminal symbols
- Σ is the set of terminal symbols
- P is the set of rewrite rules of the form α → β, where α and β are strings of symbols
- S0 is the start symbol
- T is the set of category (terminal or lexical) symbols, written T1, T2, …, Tn
- Σ is written as Σ = T1, T2, …, Tn
- Every categorized terminal is written as Ti → ai1 | ai2 | … | ain

An Example of a Categorized Grammar
1- S → NP VP
2- NP → Art NP2
3- NP → NP2
4- NP2 → Noun
5- NP2 → Adj NP2
6- NP2 → NP2 PP
7- PP → Prep NP
8- PP → Prep NP PP
9- VP → Verb
10- VP → Verb NP
11- VP → VP PP
S0 = S
N = {S, NP, VP, NP2, PP}
T = {Art, Noun, Adj, Prep, Verb}
Art = {a, the}, Noun = {man, woman, boy, cow, chicken}
Verb = {eat, run, put}
Adj = {old, young, heavy}
Prep = {in, by, of, over}
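
As a small illustration (not from the slides) of how the lexical categories keep the word lists separate from the phrase-structure rules, here is a sketch that maps the words of a sentence to the categories Ti that rewrite to them; the sample sentence and the helper name are assumptions.

    # The lexical categories from the slide, as word lists.
    CATEGORIES = {
        "Art":  {"a", "the"},
        "Noun": {"man", "woman", "boy", "cow", "chicken"},
        "Verb": {"eat", "run", "put"},
        "Adj":  {"old", "young", "heavy"},
        "Prep": {"in", "by", "of", "over"},
    }

    def categorize(sentence):
        """Replace each word by the categories Ti such that Ti -> word."""
        return [[cat for cat, words in CATEGORIES.items() if w in words]
                for w in sentence.lower().split()]

    print(categorize("The old man eat the chicken"))
    # [['Art'], ['Adj'], ['Noun'], ['Verb'], ['Art'], ['Noun']]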

Criteria for Evaluating Grammars
- Does it undergenerate?
- Does it overgenerate?
- Does it assign appropriate structures to the sentences it generates?
- Is it simple to understand? How many rules does it have?
- Does it contain generalisations, or just special cases?
- How ambiguous is it?

Overgeneration and Undergeneration
Overgeneration: a grammar should generate only sentences in the language. It should reject sentences that are not in the language.
Undergeneration: a grammar should generate all sentences in the language. There should not be sentences in the language that the grammar fails to generate.

Appropriate Structures
A grammar should assign linguistically plausible structures. For example, the grammar below generates "John ate a juicy hamburger", but its flat VP rule does not group "a juicy hamburger" into a noun phrase:
S → N VP
VP → V ART ADJ N
N → [John]
V → [ate]
ART → [a]
ADJ → [juicy]
N → [hamburger]

Understandability/Generality - Understandability: The grammar should be simple. - Generality: The range of sentences the grammar analyzes correctly.

Ambiguity
NP → NP PP
PP → Prep NP
(the man)(on the hill with a telescope by the sea)
(the man on the hill)(with a telescope by the sea)
(the man on the hill with a telescope)(by the sea)
etc.
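
To make the growth concrete: with just these two rules, the number of analyses of a noun phrase followed by n prepositional phrases grows as the Catalan numbers, a standard observation about PP attachment, so the three PPs above already allow five groupings. A throwaway sketch (names are illustrative):

    from math import comb

    def catalan(n):
        """n-th Catalan number: the number of parses of an NP followed by
        n PPs under the rules NP -> NP PP and PP -> Prep NP."""
        return comb(2 * n, n) // (n + 1)

    for n in range(1, 7):
        print(n, "PPs:", catalan(n), "analyses")
    # 1 PPs: 1, 2 PPs: 2, 3 PPs: 5, 4 PPs: 14, 5 PPs: 42, 6 PPs: 132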

Context-Free Grammars (CFG)
The CFG formalism is powerful enough to describe most of the structure in natural languages.
A CFG is restricted enough that efficient parsers can be built to analyse sentences.

CFGs: Advantages and Disadvantages
Advantages: easy to write; declarative; linguistically natural (sometimes); well-understood formal properties; computationally effective.
Disadvantages: the notion of "head" is absent; categories are unanalysable.

Chomsky Normal Form (CNF)
Suppose G = (N, Σ, P, S0) is a context-free grammar. G is in Chomsky Normal Form if every rule in P is in one of the following forms:
1) X → YZ, with X, Y, Z ∈ N, or
2) X → a, with a ∈ Σ
There is an algorithm showing that every CFG can be converted into an equivalent CNF grammar.

An Algorithm for Converting a CFG to CNF
For every grammar G = (N, Σ, P, S0) there is an equivalent grammar G' in Chomsky Normal Form.
1- Copy every rule already of the form X → YZ (X, Y, Z ∈ N) or X → a (a ∈ Σ) into P' unchanged.
2- Consider every remaining rule of the form X → Y1 a1 Y2 … Yn. Each terminal symbol ai is replaced by a new non-terminal Xi, and the rule Xi → ai is added to P'.
3- Step 2 produces rules of the form X → Y1 Y2 … Yn. If n > 2, new non-terminals Zi are added as follows:
X → Y1 Z1
Z1 → Y2 Z2
…
Zn-2 → Yn-1 Yn
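
A minimal sketch of steps 2 and 3 in Python, assuming the grammar has no ε-rules or unit rules to worry about; the example rules at the bottom and all names are illustrative, not taken from the slides.

    # Sketch of CNF conversion: lift terminals out of long right-hand sides,
    # then binarize.  Assumes no epsilon rules and no unit rules.
    def to_cnf(rules, nonterminals):
        """rules: list of (lhs, rhs-tuple); returns an equivalent CNF rule list."""
        new_rules, fresh = [], 0

        def new_symbol(prefix):
            nonlocal fresh
            fresh += 1
            sym = f"{prefix}{fresh}"
            nonterminals.add(sym)
            return sym

        for lhs, rhs in rules:
            if len(rhs) == 1 and rhs[0] not in nonterminals:
                new_rules.append((lhs, rhs))          # already of the form X -> a
                continue
            # Step 2: replace each terminal a_i in the RHS by X_i, add X_i -> a_i
            rhs = list(rhs)
            for i, sym in enumerate(rhs):
                if sym not in nonterminals:
                    x_i = new_symbol("X")
                    new_rules.append((x_i, (sym,)))
                    rhs[i] = x_i
            # Step 3: binarize X -> Y1 Y2 ... Yn (n > 2) with fresh Z symbols
            while len(rhs) > 2:
                z = new_symbol("Z")
                new_rules.append((lhs, (rhs[0], z)))
                lhs, rhs = z, rhs[1:]
            new_rules.append((lhs, tuple(rhs)))
        return new_rules

    grammar = [("S", ("NP", "VP")),
               ("VP", ("Verb", "NP", "PP")),
               ("NP", ("the", "Noun"))]
    for lhs, rhs in to_cnf(grammar, {"S", "NP", "VP", "PP", "Verb", "Noun"}):
        print(lhs, "->", " ".join(rhs))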

Greibach Normal Form (GNF)
Suppose G = (N, Σ, T, P, S0) is a categorized context-free grammar. A rule of the form X → a1 a2 … an is in GNF if a1 is in Σ or T and a2, …, an are non-terminals. If all rules in P are in GNF, then G is in GNF. Consequently, G must not contain rules of the form X → ε. GNF can be reached by direct substitution.

An Example of CFG → GNF
CFG:
1- S → NP VP
2- S → NP VP PREPS
3- NP → Det NP2
4- NP → NP2
5- NP2 → Noun
6- NP2 → Adj NP2
7- NP2 → NP3 PREPS
8- NP3 → Noun
9- PREPS → PP
10- PREPS → PP PREPS
11- PP → Prep NP
12- VP → Verb

An Example of CFG → GNF
GNF:
1a- S → Det NP2 VP
1b- S → Noun VP
1c- S → Adj NP2 VP
1d- S → Noun PREPS VP
2a- S → Det NP2 VP PREPS
2b- S → Noun VP PREPS
2c- S → Adj NP2 VP PREPS
2d- S → Noun PREPS VP PREPS
3- NP → Det NP2
4a- NP → Noun
4b- NP → Adj NP2
4c- NP → Noun PREPS
5- NP2 → Noun
6- NP2 → Adj NP2
7- NP2 → Noun PREPS
8- NP3 → Noun
9- PREPS → Prep NP
10- PREPS → Prep NP PREPS
11- PP → Prep NP
12- VP → Verb

Phrase Structure

Problems with Phrase Structure
The shooting of the hunters was terrible.
(The shooting) (of the hunters) (was terrible.)
The boy hit the ball.
The ball was hit by the boy.

Surface Structure

Surface vs. Deep Structure
Surface structure: the phrase structure of the current utterance.
Deep structure: a canonical phrase structure that has the same meaning as the surface structure.
Transformational grammar: rules that transform a deep phrase structure into surface phrase structures with the same meaning.

Surface Structure

Deep Structure: "The boy hit the ball" AND "The ball was hit by the boy"
[Tree diagram: Sentence → NP VP; NP → The boy; VP → V NP; V → hit; NP → the ball]

Transformational Grammar (1965)
Generates surface structure from deep structure.
[Diagram: the syntactic component's phrase-structure rules produce the Deep Structure, which transformational rules map to the Surface Structure; the deep structure feeds the semantic component and the surface structure feeds the phonological component.]

Example of TG
A context-free grammar generates the deep structure, then a set of transformations transforms the deep structure into the surface structure.
[Tree diagram of the deep structure: S → NP VP; NP → ART N (The cat); VP → AUX V NP (will catch man)]

Example of TG
Yes/No question transformation:
[Diagram: the deep-structure tree S(NP(ART N) VP(AUX V NP)) is transformed into the surface-structure tree S(AUX NP(ART N) VP(V NP) ?), yielding "Will the cat catch man?"]

Transformational Grammar
Base component: generates the deep structure.
Transformational component: transforms the deep structure into the surface structure by using transformational rules. Transformational rules change sentence elements, insert or delete elements, and/or replace one element with another.
Example of a rule: NP + V + ed + NP → Did + NP + V + NP + ?
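
For illustration only (not from the slides), here is a sketch of applying a yes/no-question transformation of this shape to a flat, tagged deep structure; the tiny past-tense lexicon, the sample sentence, and the function name are all assumptions.

    # Sketch: apply  NP + V+ed + NP  ->  Did + NP + V + NP + ?
    PAST = {"kicked": "kick", "hit": "hit", "ate": "eat"}   # illustrative lexicon

    def yes_no_question(constituents):
        """constituents: [('NP', words), ('V+ed', word), ('NP', words)]"""
        (_, np1), (_, verb), (_, np2) = constituents
        base = PAST.get(verb, verb[:-2] if verb.endswith("ed") else verb)
        return " ".join(["Did", *np1, base, *np2]) + "?"

    deep = [("NP", ["the", "boy"]), ("V+ed", "kicked"), ("NP", ["the", "ball"])]
    print(yes_no_question(deep))   # Did the boy kick the ball?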

Grammar Types (1)
Constraint-Based Lexicalist Grammar (CBLG)
- Sag, I. A. and Wasow, T., Syntactic Theory: A Formal Introduction, CSLI Publications, 1999.
Categorial Grammar (CG)
- König, E., LexGram: A Practical Categorial Grammar Formalism, Journal of Language and Computation.
Dependency Grammar (DG)
- Sag, I. A. and Wasow, T., Syntactic Theory: A Formal Introduction, CSLI Publications, 1999.

Grammar Types (2)
Link Grammar
- Sleator, D. and Temperley, D., Parsing English with a Link Grammar, Carnegie Mellon University.
Lexical Functional Grammar (LFG)
- Sag, I. A. and Wasow, T., Syntactic Theory: A Formal Introduction, CSLI Publications, 1999.
Tree-Adjoining Grammar (TAG)
- Allen, James, Natural Language Understanding, 1995.

Grammar Types (3)
Generalized Phrase Structure Grammar (GPSG)
- Sag, I. A. and Wasow, T., Syntactic Theory: A Formal Introduction, CSLI Publications, 1999.
Head-Driven Phrase Structure Grammar (HPSG)
- Pollard, C. and Sag, I. A., Head-Driven Phrase Structure Grammar, University of Chicago Press.
- The HPSG page at the Center for the Study of Language and Information (CSLI), Stanford University, provides information about HPSG-related activities and pointers to other resources on the web.

Grammar Types (4)
Probabilistic Feature Grammar (PFG)
- Goodman, Joshua, Probabilistic Feature Grammar, Harvard University.