© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 7: Definite Clause Grammars Theory –Introduce context free grammars and some related concepts.

Slides:



Advertisements
Similar presentations
Artificial Intelligence: Natural Language and Prolog
Advertisements

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 10: Cuts and Negation Theory –Explain how to control Prolog`s backtracking behaviour with.
Prolog programming....Dr.Yasser Nada. Chapter 8 Parsing in Prolog Taif University Fall 2010 Dr. Yasser Ahmed nada prolog programming....Dr.Yasser Nada.
November 2008NLP1 Natural Language Processing Definite Clause Grammars.
CSA2050: DCG I1 CSA2050 Introduction to Computational Linguistics Lecture 8 Definite Clause Grammars.
© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 6: More Lists Theory –Define append/3, a predicate for concatenating two lists, and illustrate.
© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 6: More Lists Theory –Define append/3, a predicate for concatenating two lists, and illustrate.
Prolog: List © Patrick Blackburn, Johan Bos & Kristina Striegnitz.
C O N T E X T - F R E E LANGUAGES ( use a grammar to describe a language) 1.
Grammars, constituency and order A grammar describes the legal strings of a language in terms of constituency and order. For example, a grammar for a fragment.
Grammars, Languages and Parse Trees. Language Let V be an alphabet or vocabulary V* is set of all strings over V A language L is a subset of V*, i.e.,
Chapter Chapter Summary Languages and Grammars Finite-State Machines with Output Finite-State Machines with No Output Language Recognition Turing.
GRAMMAR & PARSING (Syntactic Analysis) NLP- WEEK 4.
For Monday Read Chapter 23, sections 3-4 Homework –Chapter 23, exercises 1, 6, 14, 19 –Do them in order. Do NOT read ahead.
LING 438/538 Computational Linguistics Sandiway Fong Lecture 7: 9/12.
CS5371 Theory of Computation
© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 8: More DCGs Theory –Examine two important capabilities offered by DCG notation: Extra arguments.
CS 330 Programming Languages 09 / 13 / 2007 Instructor: Michael Eckmann.
© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 5: Arithmetic Theory –Introduce Prolog`s built-in abilities for performing arithmetic –Apply.
© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 7: Definite Clause Grammars Theory –Introduce context free grammars and some related concepts.
© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 4: Lists Theory –Introduce lists, an important recursive data structure often used in Prolog.
© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 9: A closer look at terms Theory –Introduce the == predicate –Take a closer look at term structure.
© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 5: Arithmetic Theory –Introduce Prolog`s built-in abilities for performing arithmetic –Apply.
1 CONTEXT-FREE GRAMMARS. NLE 2 Syntactic analysis (Parsing) S NPVP ATNNSVBD NP AT NNthechildrenate thecake.
Normal forms for Context-Free Grammars
Chapter 3: Formal Translation Models
LING 388: Language and Computers Sandiway Fong Lecture 13: 10/10.
© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 8: More DCGs Exercises –Solution to exercises LPN 7 –Recall of previous lecture Theory –Examine.
© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 4: Lists Theory –Introduce lists, an important recursive data structure often used in Prolog.
Context-Free Grammar CSCI-GA.2590 – Lecture 3 Ralph Grishman NYU.
Models of Generative Grammar Smriti Singh. Generative Grammar  A Generative Grammar is a set of formal rules that can generate an infinite set of sentences.
LING 388: Language and Computers Sandiway Fong Lecture 8.
LING 388: Language and Computers Sandiway Fong 10/4 Lecture 12.
Linguistic Theory Lecture 3 Movement. A brief history of movement Movements as ‘special rules’ proposed to capture facts that phrase structure rules cannot.
© Patrick Blackburn, Johan Bos & Kristina Striegnitz Horn clauses A literal is an atomic formula or its negation A clause is a disjunction of literals.
Logic Programming Lecture 6: Parsing, Difference Lists and Definite Clause Grammars.
1 Introduction to Parsing Lecture 5. 2 Outline Regular languages revisited Parser overview Context-free grammars (CFG’s) Derivations.
Chapter 4 Context-Free Languages Copyright © 2011 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1.
For Friday Finish chapter 23 Homework: –Chapter 22, exercise 9.
LING 388: Language and Computers Sandiway Fong Lecture 7.
Chapter 10: Compilers and Language Translation Invitation to Computer Science, Java Version, Third Edition.
CSC312 Automata Theory Lecture # 2 Languages.
LANGUAGE TRANSLATORS: WEEK 3 LECTURE: Grammar Theory Introduction to Parsing Parser - Generators TUTORIAL: Questions on grammar theory WEEKLY WORK: Read.
Languages, Grammars, and Regular Expressions Chuck Cusack Based partly on Chapter 11 of “Discrete Mathematics and its Applications,” 5 th edition, by Kenneth.
Formal Semantics Chapter Twenty-ThreeModern Programming Languages, 2nd ed.1.
1 Syntax In Text: Chapter 3. 2 Chapter 3: Syntax and Semantics Outline Syntax: Recognizer vs. generator BNF EBNF.
Parsing. Language A set of strings from an alphabet which may be empty, finite or infinite. A language is defined by a grammar and we may also say that.
Copyright © Curt Hill Languages and Grammars This is not English Class. But there is a resemblance.
Semantic Construction lecture 2. Semantic Construction Is there a systematic way of constructing semantic representation from a sentence of English? This.
November 2011CLINT-LN CFG1 Computational Linguistics Introduction Context Free Grammars.
Artificial Intelligence: Natural Language
CSA2050 Introduction to Computational Linguistics Parsing I.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 3: Introduction to Syntactic Analysis.
© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 9: A closer look at terms Theory –Introduce the == predicate –Take a closer look at term structure.
1Computer Sciences Department. Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER Reference 3Computer Sciences Department.
LING/C SC/PSYC 438/538 Lecture 15 Sandiway Fong. Did you install SWI Prolog?
Syntax Analysis – Part I EECS 483 – Lecture 4 University of Michigan Monday, September 17, 2006.
CSC312 Automata Theory Lecture # 26 Chapter # 12 by Cohen Context Free Grammars.
Programming Languages and Design Lecture 2 Syntax Specifications of Programming Languages Instructor: Li Ma Department of Computer Science Texas Southern.
Logic Programming Lecture 6: Parsing and Definite Clause Grammars.
UMBC  CSEE   1 Chapter 4 Chapter 4 (b) parsing.
Describing Syntax and Semantics
Formal Language Theory
CHAPTER 2 Context-Free Languages
Theory of Computation Lecture #
Chapter 10: Compilers and Language Translation
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Formal Languages Context free languages provide a convenient notation for recursive description of languages. The original goal of formalizing the structure.
Lecture 7: Definite Clause Grammars
Faculty of Computer Science and Information System
Presentation transcript:

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Lecture 7: Definite Clause Grammars Theory –Introduce context free grammars and some related concepts –Introduce definite clause grammars, the Prolog way of working with context free grammars (and other grammars too) Exercises

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Context free grammars Prolog offers a special notation for defining grammars, namely definite clause grammars (DCGs) So what is a grammar? We will answer this question by discussing context free grammars CFGs are a very powerful mechanism, and can handle most syntactic aspects of natural languages (such as English or Italian)

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Example of a CFG s  np vp np  det n vp  v np vp  v det  the det  a n  man n  woman v  shoots

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Ingredients of a grammar The  symbol is used to define the rules The symbols s, np, vp, det, n, v are called the non-terminal symbols The symbols in italics are the terminal symbols: the, a, man, woman, shoots s  np vp np  det n vp  v np vp  v det  the det  a n  man n  woman v  shoots

© Patrick Blackburn, Johan Bos & Kristina Striegnitz A little bit of linguistics The non-terminal symbols in this grammar have a traditional meaning in linguistics: –np: noun phrase –vp: verb phrase –det: determiner –n: noun –v: verb –s: sentence

© Patrick Blackburn, Johan Bos & Kristina Striegnitz More linguistics In a linguistic grammar, the non- terminal symbols usually correspond to grammatical categories In a linguistic grammar, the terminal symbols are called the lexical items, or simply words (a computer scientist might call them the alphabet)

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Context free rules The grammar contains nine context free rules A context free rule consists of: –A single non-terminal symbol –followed by  –followed by a finite sequence of terminal or non-terminal symbols s  np vp np  det n vp  v np vp  v det  the det  a n  man n  woman v  shoots

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Grammar coverage Consider the following string: the woman shoots a man Is this string grammatical according to our grammar? And if it is, what syntactic structure does it have?

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Syntactic structure s vp np np det n v det n the woman shoots a man s  np vp np  det n vp  v np vp  v det  the det  a n  man n  woman v  shoots

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Parse trees Trees representing the syntactic structure of a string are often called parse trees Parse trees are important: –They give us information about the string –They gives us information about structure

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Grammatical strings If we are given a string of words, and a grammar, and it turns out we can build a parse tree, then we say that the string is grammatical (with respect to the given grammar) –E.g., the man shoots is grammatical If we cannot build a parse tree, the given string is ungrammatical (with respect to the given grammar) –E.g., a shoots woman is ungrammatical

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Generated language The language generated by a grammar consists of all the strings that the grammar classifies as grammatical For instance a woman shoots a man a man shoots belong to the language generated by our little grammar

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Recogniser A context free recogniser is a program which correctly tells us whether or not a string belongs to the language generated by a context free grammar To put it another way, a recogniser is a program that correctly classifies strings as grammatical or ungrammatical

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Information about structure But both in linguistics and computer science, we are not merely interested in whether a string is grammatical or not We also want to know why it is grammatical: we want to know what its structure is The parse tree gives us this structure

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Parser A context free parser correctly decides whether a string belongs to the language generated by a context free grammar And it also tells us what its structure is To sum up: –A recogniser just says yes or no –A parser also gives us a parse tree

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Context free language We know what a context free grammar is, but what is a context free language? Simply: a context free language is a language that can be generated by a context free grammar Some human languages are context free, some others are not –English and Italian are probably context free –Dutch and Swiss-German are not context free

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Theory vs. Practice So far the theory, but how do we work with context free grammars in Prolog? Suppose we are given a context free grammar. –How can we write a recogniser for it? –How can we write a parser for it? In this lecture we will look at how to define a recogniser

© Patrick Blackburn, Johan Bos & Kristina Striegnitz CFG recognition in Prolog We shall use lists to represent strings [a,woman,shoots,a,man] The rule s  np vp can be thought as concatenating an np-list with a vp-list resulting in an s-list We know how to concatenate lists in Prolog: using append/3 So let`s turn this idea into Prolog

© Patrick Blackburn, Johan Bos & Kristina Striegnitz CFG recognition using append/3 s(C):- np(A), vp(B), append(A,B,C). np(C):- det(A), n(B), append(A,B,C). vp(C):- v(A), np(B), append(A,B,C). vp(C):- v(C). det([the]). det([a]). n([man]). n([woman]). v([shoots]).

© Patrick Blackburn, Johan Bos & Kristina Striegnitz CFG recognition using append/3 ?- s([the,woman,shoots,a,man]). yes ?- s(C):- np(A), vp(B), append(A,B,C). np(C):- det(A), n(B), append(A,B,C). vp(C):- v(A), np(B), append(A,B,C). vp(C):- v(C). det([the]). det([a]). n([man]). n([woman]). v([shoots]).

© Patrick Blackburn, Johan Bos & Kristina Striegnitz CFG recognition using append/3 ?- s(S). S = [the,man,shoots,the,man]; S = [the,man,shoots,the,woman]; S = [the,woman,shoots,a,man] … s(C):- np(A), vp(B), append(A,B,C). np(C):- det(A), n(B), append(A,B,C). vp(C):- v(A), np(B), append(A,B,C). vp(C):- v(C). det([the]). det([a]). n([man]). n([woman]). v([shoots]).

© Patrick Blackburn, Johan Bos & Kristina Striegnitz CFG recognition using append/3 ?- np([the,woman]). yes ?- np(X). X = [the,man]; X = [the,woman] s(C):- np(A), vp(B), append(A,B,C). np(C):- det(A), n(B), append(A,B,C). vp(C):- v(A), np(B), append(A,B,C). vp(C):- v(C). det([the]). det([a]). n([man]). n([woman]). v([shoots]).

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Problems with this recogniser It doesn`t use the input string to guide the search Goals such as np(A) and vp(B) are called with uninstantiated variables Moving the append/3 goals to the front is still not very appealing --- this will only shift the problem --- there will be a lot of calls to append/3 with uninstantiated variables

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Difference lists A more efficient implementation can be obtained by using difference lists This is a sophisticated Prolog technique for representing and working with lists Examples: [a,b,c]-[ ] is the list [a,b,c] [a,b,c,d]-[d] is the list [a,b,c] [a,b,c|T]-T is the list [a,b,c] X-X is the empty list [ ]

© Patrick Blackburn, Johan Bos & Kristina Striegnitz CFG recognition using difference lists s(A-C):- np(A-B), vp(B-C). np(A-C):- det(A-B), n(B-C). vp(A-C):- v(A-B), np(B-C). vp(A-C):- v(A-C). det([the|W]-W). det([a|W]-W). n([man|W]-W). n([woman|W]-W). v([shoots|W]-W). A Cs(A-C) Bnp(A-B) vp(B-C)

© Patrick Blackburn, Johan Bos & Kristina Striegnitz CFG recognition using difference lists ?- s([the,man,shoots,a,man]-[ ]). yes ?- s(A-C):- np(A-B), vp(B-C). np(A-C):- det(A-B), n(B-C). vp(A-C):- v(A-B), np(B-C). vp(A-C):- v(A-C). det([the|W]-W). det([a|W]-W). n([man|W]-W). n([woman|W]-W). v([shoots|W]-W).

© Patrick Blackburn, Johan Bos & Kristina Striegnitz CFG recognition using difference lists ?- s(X-[ ]). S = [the,man,shoots,the,man]; S = [the,man,shoots,a,man]; …. s(A-C):- np(A-B), vp(B-C). np(A-C):- det(A-B), n(B-C). vp(A-C):- v(A-B), np(B-C). vp(A-C):- v(A-C). det([the|W]-W). det([a|W]-W). n([man|W]-W). n([woman|W]-W). v([shoots|W]-W).

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Summary so far The recogniser using difference lists is a lot more efficient than the one using append/3 However, it is not that easy to understand and it is a pain having to keep track of all those difference list variables It would be nice to have a recogniser as simple as the first and as efficient as the second This is possible: using DCGs

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Definite Clause Grammars What are DCGs? Quite simply, a nice notation for writing grammars that hides the underlying difference list variables Let us look at three examples

© Patrick Blackburn, Johan Bos & Kristina Striegnitz DCGs: first example s --> np, vp. np --> det, n. vp --> v, np. vp --> v. det --> [the]. det --> [a]. n --> [man]. n --> [woman]. v --> [shoots].

© Patrick Blackburn, Johan Bos & Kristina Striegnitz DCGs: first example ?- s([a,man,shoots,a,woman],[ ]). yes ?- s --> np, vp. np --> det, n. vp --> v, np. vp --> v. det --> [the]. det --> [a]. n --> [man]. n --> [woman]. v --> [shoots].

© Patrick Blackburn, Johan Bos & Kristina Striegnitz DCGs: first example ?- s(X,[ ]). S = [the,man,shoots,the,man]; S = [the,man,shoots,a,man]; …. s --> np, vp. np --> det, n. vp --> v, np. vp --> v. det --> [the]. det --> [a]. n --> [man]. n --> [woman]. v --> [shoots].

© Patrick Blackburn, Johan Bos & Kristina Striegnitz DCGs: second example We added some recursive rules to the grammar… What and how many sentences does this grammar generate? What does Prolog do with this DCG? s --> s, conj, s. s --> np, vp. np --> det, n. vp --> v, np. vp --> v. det --> [the]. det --> [a]. n --> [man]. n --> [woman]. v --> [shoots]. conj --> [and]. conj --> [or]. conj --> [but].

© Patrick Blackburn, Johan Bos & Kristina Striegnitz DCG without left-recursive rules s --> simple_s, conj, s. s --> simple_s. simple_s --> np, vp. np --> det, n. vp --> v, np. vp --> v. det --> [the]. det --> [a]. n --> [man]. n --> [woman]. v --> [shoots]. conj --> [and]. conj --> [or]. conj --> [but].

© Patrick Blackburn, Johan Bos & Kristina Striegnitz DCGs are not magic! The moral: DCGs are a nice notation, but you cannot write arbitrary context- free grammars as a DCG and have it run without problems DCGs are ordinary Prolog rules in disguise So keep an eye out for left-recursion!

© Patrick Blackburn, Johan Bos & Kristina Striegnitz DCGs: third example We will define a DCG for a formal language A formal language is simply a set of strings –Formal languages are objects that computer scientists and mathematicians define and study –Natural languages are languages that human beings normally use to communicate We will define the language a n b n

© Patrick Blackburn, Johan Bos & Kristina Striegnitz DCGs: third example s --> []. s --> l,s,r. l --> [a]. r --> [b]. ?- s([a,a,a,b,b,b],[ ]). yes ?- s([a,a,a,a,b,b,b],[ ]). no We will define the formal language a n b n

© Patrick Blackburn, Johan Bos & Kristina Striegnitz DCGs: third example s --> []. s --> l,s,r. l --> [a]. r --> [b]. ?- s(X,[ ]). X = [ ]; X = [a,b]; X = [a,a,b,b]; X = [a,a,a,b,b,b] …. We will define the formal language a n b n

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Solution to Exercise 6.1 Let's call a list doubled if it is made of two consecutive blocks of elements that are exactly the same. For example, [a,b,c,a,b,c] is doubled (it's made up of [a,b,c] followed by [a,b,c]) and so is [foo,gubble,foo,gubble]. On the other hand, [foo,gubble,foo] is not doubled. Write a predicate doubled(List) which succeeds when List is a doubled list. doubled(List) :- append(X,X,List). doubled(List) :- List-X = X.

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Solution to Exercise 6.2 A palindrome is a word or phrase that spells the same forwards and backwards. For example, `rotator', `eve', and `nurses run' are all palindromes. Write a predicate palindrome(List), which checks whether List is a palindrome. accReverse([ ],L,L). accReverse([H|T],Acc,Rev):- accReverse(T,[H|Acc],Rev). reverse(L1,L2):- accReverse(L1,[ ],L2). palindrome(Word) :- reverse(Word,Word).

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Solution to Exercise Write a predicate second(X,List) which checks whether X is the second element of List. second(X,List) :- [_,X|_] = List.

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Solution to Exercise Write a predicate swap12(List1,List2) which checks whether List1 is identical to List2, except that the first two elements are exchanged. swap12([A,B|T],[B,A|T]).

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Solution to Exercise Write a predicate final(X,List) which checks whether X is the last element of List. final(X,[X]). final(X,[_|T]) :- final(X,T).

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Solution to Exercise Write a predicate toptail(InList,Outlist) which says `no' if inlist is a list containing fewer than 2 elements, and which deletes the first and the last elements of Inlist and returns the result as Outlist, when Inlist is a list containing at least 2 elements. toptail(InList,OutList) :- InList = [First|T], final(Last,InList), append(OutList,Last,T). abcdefghi T Last = i First = a

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Solution to Exercise Write a predicate swapfl(List1,List2) which checks whether List1 is identical to List2, except that the first and last elements are exchanged. Hint: here's where append comes in useful again. swapfl([H1|T1], [H2|T2]) :- append(Middle,[H2],T1), append(Middle,[H1],T2). abcdefghi ibcdefgha Middle

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Exercise LPN Chapter 7 7.1, 7.2, 7.3

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Summary of this lecture We explained the idea of grammars and context free grammars are We introduced the Prolog technique of using difference lists We showed that difference lists can be used to describe grammars Definite Clause Grammars is just a nice Prolog notation for programming with difference lists

© Patrick Blackburn, Johan Bos & Kristina Striegnitz Next lecture More Definite Clause Grammars –Examine two important capabilities offered by DCG notation Extra arguments Extra tests –Discuss the status and limitations of definite clause grammars