CS4025: Grammars
• Grammars for English
• Features
• Definite Clause Grammars
See J&M Chapter 9 in 1st ed., Chapter 12 in 2nd ed.; Prolog textbooks (for DCGs)

Grammar: Definition
• The surface structure (syntax) of sentences is usually described by some kind of grammar.
• A grammar defines syntactically legal sentences.
  » John ate an apple (syntactically legal)
  » John ate apple (not syntactically legal)
  » John ate a building (syntactically legal)
• More importantly, a grammar provides a description of the structure of a legal sentence.

Facts about Grammars
• There is no single “correct” or “complete” grammar for any natural language (though people have tried to write them, and certain terminology has emerged that is generally useful).
• Linguists dispute exactly what formalism should be used for stating a grammar.
• Linguists dispute exactly what data the grammar should apply to.
• Grammars describe subsets of natural languages. They all have their strengths and weaknesses.

Sentence Constituents
• Grammar rules mention constituents (e.g. NP – noun phrase) as well as single word types (e.g. noun).
• A constituent can occur in many rules
  » An NP also occurs after a preposition (near the dog), in possessives (the cat’s paws), ...
• These capture, to some extent, ideas such as:
  » Where NP occurs in a rule, any phrase of this class can appear
  » NPs have something in common in terms of their meaning
  » Two phrases of the same kind, e.g. NP, can be joined together by the word “and” to yield a larger phrase of the same kind. (These are phrase-structure/constituency tests.)

Common Constituents
• Some common constituents:
  » Noun phrase: the dog, an apple, the man I saw yesterday
  » Adjective phrase: happy, very happy
  » Verb group: eat, ate, am eating, have been eating
  » Verb phrase: eating an apple, ran into the store
  » Prepositional phrase: in the car
  » Sentence: A woman ate an apple.
• Constituents named “X phrase” in general have a most important word, the head, of category X (e.g. dog is the head of the noun phrase the dog). The head determines properties of the rest of the phrase.

Phrase-Structure Grammar
• PS grammars use context-free rewrite rules to define constituents and sentences
  » S => NP Verb NP
• Similar to the rules used to define the syntax of programming languages
  » if-statement => if condition then statement
• Also called context-free grammar (CFG), because the context does not restrict the application of the rules.
• Adequate for (almost all of?) English
  » see J&M, chap. 13

A simple grammar
  – S => NP VP
  – NP => Det Adj* Noun
  – NP => ProperNoun
  – VP => IntranVerb
  – VP => TranVerb NP
  – Det: a, the
  – Adj: big, black
  – CommonNoun: dog, cat
  – ProperNoun: Fido, Misty, Sam
  – IntranVerb: runs, sleeps
  – TranVerb: chases, sees, saw
How does this assign structure to simple sentences? (A sample derivation follows; the next slides show the corresponding trees.)
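
As a worked illustration, here is one rewrite derivation for “Sam sleeps”, using only the rules and lexicon above; the next slide draws the same analysis as a tree.

  S => NP VP              (S => NP VP)
    => ProperNoun VP      (NP => ProperNoun)
    => Sam VP             (ProperNoun: Sam)
    => Sam IntranVerb     (VP => IntranVerb)
    => Sam sleeps         (IntranVerb: sleeps)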

Example: Sam sleeps
Parse tree (NP = NounPhrase, VP = VerbPhrase), built by the rewrite/expansion rules:
  [Sentence [NP [ProperNoun Sam]] [VP [IntranVerb sleeps]]]

Ex: The cat sees the dog
Parse tree:
  [Sentence [NP [Det The] [Noun cat]] [VP [TranVerb sees] [NP [Det the] [Noun dog]]]]

Fido chases the big black cat
Parse tree:
  [Sentence [NP [ProperNoun Fido]] [VP [TranVerb chases] [NP [Det the] [Adj big] [Adj black] [Noun cat]]]]

Recursion
• NL grammars allow recursion
  » NP => Det Adj* CommonNoun PP*
  » PP => Prep NP
• For example:
  » The big cat ...
  » The big black wild cat ...
  » The man in the store saw John.
  » I fixed the intake of the engine of BA’s first jet. (Its nested structure is sketched below.)
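
To make the recursion concrete, here is a bracketing of the nested noun phrase in the last example, using the two rules above (the internal details of “BA’s first jet” are glossed over, since possessives are not covered by this grammar fragment):

  [NP the intake [PP of [NP the engine [PP of [NP BA’s first jet]]]]]

Each NP may contain a PP, and each PP contains another NP, so the rules can nest to arbitrary depth.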

The man in the store saw John
Parse tree (the VP is abbreviated):
  [Sentence [NP [Det The] [Noun man] [PP [Prep in] [NP [Det the] [Noun store]]]] [VP saw John]]

Grammar is not everything
• Sentences may be grammatically OK but not acceptable (grammaticality vs. acceptability):
  » The sleepy table eats the green idea.
  » Fido chases the black big cat.
• Sentences may be understandable even if grammatically flawed:
  » Fido chases a cat small.
BUT:
• Grammar terminology may be useful for describing sentences even if they are flawed (e.g. in tutoring systems).
• Grammars may be useful for analysing fragments within unrestricted and badly-formed text.

Features
• Features are annotations on words and constituents.
• Grammar rules use unification (as in Prolog) to manipulate features. (See references on unification; informally, unification matches two feature sets and merges them into the most specific feature set compatible with both.)
• Features allow grammars to be smaller and simpler: writing out every feature distinction as separate rules in the grammar above would make it very complex.
• See J&M, chap. 11 (1st edition), chap. 15 (2nd edition). A small Prolog sketch of unification follows.
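
A minimal Prolog sketch of feature matching by unification. This is only an illustration: the term encoding np(num:N) is assumed here for the example, not the notation used in the DCG slides later.

  % Two terms unify if their feature values can be made equal;
  % unification instantiates the variable N:
  ?- np(num:N) = np(num:sing).
  N = sing.

  % Conflicting feature values make unification fail:
  ?- np(num:sing) = np(num:plur).
  false.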

Ex: Number
• One feature is [number:N], where N is singular or plural.
  » A noun’s number comes from morphology (and semantics):
    – cat – singular noun
    – cats – plural noun
  » An NP’s number comes from its head noun.
  » A VP’s number comes from its verb (or auxiliary verb):
    – is happy – singular VP
    – are happy – plural VP
  » NP and VP must agree in number:
    – The rat has an injury
    – *The rat have an injury

Ex: Number (continued)
• A similar issue arises between a determiner and the head common noun:
  » This tall man ...
  » *This tall men ...
  » *These tall man ...
  » These tall men ...
• Languages can be morphologically richer (e.g. Slavic languages) or poorer (e.g. English, Chinese). (Morphology = the analysis of word forms.)
• There are many different sorts of features in the world’s languages (animacy, shape, gender, ...).

Modified Grammar
  » S => NP[num:N] VP[num:N]
  » NP[num:N] => Det[num:N] Adj* Noun[num:N] PP*
  » NP[num:sing] => ProperNoun
  » VP[num:N] => IntranVerb[num:N]
  » VP[num:N] => TranVerb[num:N] NP
  » PP => Prep NP
  » Noun[num:sing]: dog
  » Noun[num:plur]: dogs
  » etc.
  » ?? Should the Sentence constituent carry features too?

Example rule
  NP[num:N] => Det[num:N] Adj* Noun[num:N]
• An NP’s determiner and noun must agree in number (but not its adjectives, in English; in Russian, adjectives agree as well):
  – this dog is OK
  – these dogs is OK
  – this dogs is not OK (det must agree)
  – these dog is not OK (det must agree)
  – the big red dog is OK
  – the big red dogs is OK (adjectives don’t have to agree)
• [number of NP] is [number of Noun]
• Might also need some ‘feature passing’ in complex structures.

Ex: These cats see this dog
Parse tree with number features:
  [Sentence[plur] [NP[plur] [Det[plur] These] [Noun[plur] cats]] [VP[plur] [TranVerb[plur] see] [NP[sing] [Det[sing] this] [Noun[sing] dog]]]]

Ex: *This cats see this dog
Attempted parse tree: Det[sing] and Noun[plur] cannot agree, so the subject NP gets no consistent number feature and the sentence rule cannot apply:
  *[Sentence[] [NP[??] [Det[sing] This] [Noun[plur] cats]] [VP[plur] [TranVerb[plur] see] [NP[sing] [Det[sing] this] [Noun[sing] dog]]]]

Other features
• Person (I, you, him)
• Verb form (past, present, base, past participle, present participle)
• Gender (masc, fem, neuter)
• Polarity (positive, negative)
• etc., etc., etc.

Definite Clause Grammars
• A definite clause grammar (DCG) is a CFG with added feature information.
• The syntax is also Prolog-like, and DCGs are usually implemented in Prolog.
• You don’t need to know Prolog to use DCGs.
• The formalism was (partially) invented in Scotland.

DCG Notation (1)
• Use ---> between a rule’s LHS and RHS.
• Non-terminal symbols (e.g. np) are written as Prolog constants (they start with lower-case letters).
• Features (e.g. Num) are arguments in ()
  » Features are shown by position, not by name
  » Prolog variables (they start with upper-case letters) indicate unknown values
• Actual words are lists, in []
• Prolog tests can be attached in {}
• _ is the “wild card” (anonymous) variable
• (Optional) use the distinguished predicate to define the goal symbol (e.g. s).

DCG Notation (2)
  Rule notation                        DCG notation
  S => NP VP                           s ---> np, vp.
  S[num:N] => NP[num:N] VP[num:N]      s(Num) ---> np(Num), vp(Num).
  N => dog                             n ---> [dog].
  N => New York                        n ---> [new, york].
  NP => (any word that is a name)      np ---> [X], {isname(X)}.
• NB: the number of “-” characters in “--->” varies according to the implementation.
• Variables have to match, e.g. N, Num, X ...

An example DCG
  distinguished(s).
  s ---> np(Num), vp(Num).
  np(Num) ---> det(Num), noun(Num).
  np(sing) ---> name.
  vp(Num) ---> intranverb(Num).
  vp(Num) ---> tranverb(Num), np(_).
      (the _ means no constraint: there is no Num agreement between a verb and its direct object)
  det(_) ---> [the].
  noun(sing) ---> [dog].
  name ---> [sam].
  intranverb(sing) ---> [sleeps].
  intranverb(plur) ---> [sleep].
  tranverb(sing) ---> [chases].
  tranverb(plur) ---> [chase].
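
For comparison, here is essentially the same grammar in standard Prolog DCG notation (--> instead of the package’s --->, and no distinguished/1 declaration). This is a sketch of plain SWI-Prolog usage with the built-in phrase/2, not of the Edinburgh package described on the following slides.

  s --> np(Num), vp(Num).
  np(Num) --> det(Num), noun(Num).
  np(sing) --> name.
  vp(Num) --> intranverb(Num).
  vp(Num) --> tranverb(Num), np(_).
  det(_) --> [the].
  noun(sing) --> [dog].
  name --> [sam].
  intranverb(sing) --> [sleeps].
  intranverb(plur) --> [sleep].
  tranverb(sing) --> [chases].
  tranverb(plur) --> [chase].

  % Example queries:
  ?- phrase(s, [sam, chases, the, dog]).   % succeeds
  ?- phrase(s, [sam, sleep]).              % fails: number clash (sam is sing, sleep is plur)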

DCGs and Prolog
• DCGs are translated into Prolog in a fairly straightforward fashion (a sketch of the standard translation follows).
  » See Sterling & Shapiro, The Art of Prolog.
• In fact, Prolog was originally designed as a language for NLP, as part of a machine-translation project.
• Prolog has evolved since, but is still well suited to many NLP operations.
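
As an illustration of the standard translation (shown for the usual --> notation; details vary slightly between Prolog systems): every non-terminal gets two extra arguments forming a difference list, which threads the remaining input through the rule.

  % The DCG rule
  %   s --> np, vp.
  % becomes (roughly) the ordinary Prolog clause
  s(S0, S) :- np(S0, S1), vp(S1, S).
  % S0 is the whole input list; np consumes a prefix of it, leaving S1,
  % and vp consumes a prefix of S1, leaving S.

  % A terminal rule such as
  %   n --> [dog].
  % becomes
  n([dog|S], S).

A call such as phrase(s, Words) then simply runs s(Words, []).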

Edinburgh Package
• We will use a parsing system for DCGs developed (for teaching purposes) by Holt and Ritchie from Edinburgh University.
• Use SWI-Prolog.

Using the package
• Statements always end with “.”
• The prompt is ?-
• Type code into a file, not directly into the interpreter.
• Load files with [file].
  » .pl = Prolog source file
• By default, answer “n.” to the backtrack request (i.e. when asked whether to look for more answers).

Using the DCG package
  ?- prp([sam, chases, the, dog]).         parse and pretty-print
  ?- prp(np, [the, dog]).                  parse and print an np
  ?- parse([sam, chases, the, dog], X).    returns the parse in X
  ?- parse(np, [the, dog], X).             parse a constituent (an np)
• For all of the above, after each parse is printed you will be asked whether you want to look for additional parses.
• NB: in other DCG implementations, the grammar writer explicitly builds parse structures in the grammar rules (a sketch follows).
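
A sketch of that explicit style, in standard Prolog DCG notation: each non-terminal carries an extra argument holding the parse tree it builds, here using hypothetical tree terms such as s/2 and np/2 (this is illustrative, not the Edinburgh package).

  s(s(NP, VP))              --> np(Num, NP), vp(Num, VP).
  np(Num, np(D, N))         --> det(Num, D), noun(Num, N).
  vp(Num, vp(V, NP))        --> tranverb(Num, V), np(_, NP).
  det(_, det(the))          --> [the].
  noun(sing, noun(dog))     --> [dog].
  tranverb(sing, v(chases)) --> [chases].

  % ?- phrase(s(Tree), [the, dog, chases, the, dog]).
  % Tree = s(np(det(the), noun(dog)), vp(v(chases), np(det(the), noun(dog)))).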

Other Grammar Formalisms
• There are many other grammar formalisms:
  » GPSG, HPSG, LFG, TAG, ATN, GB, ...
• There is no consensus on which is best, although there are many strong opinions (like C vs Pascal vs Lisp).
• As with programming languages, the formalism used is often determined by the experience of the developers and by the available support tools.