LING/C SC/PSYC 438/538 Lecture 15 Sandiway Fong. Did you install SWI Prolog?

Slides:



Advertisements
Similar presentations
Natural Language Processing - Formal Language - (formal) Language (formal) Grammar.
Advertisements

C O N T E X T - F R E E LANGUAGES ( use a grammar to describe a language) 1.
CFGs and PDAs Sipser 2 (pages ). Long long ago…
About Grammars CS 130 Theory of Computation HMU Textbook: Sec 7.1, 6.3, 5.4.
LING 388: Language and Computers Sandiway Fong Lecture 2: 8/24.
ICE1341 Programming Languages Spring 2005 Lecture #5 Lecture #5 In-Young Ko iko.AT. icu.ac.kr iko.AT. icu.ac.kr Information and Communications University.
ICE1341 Programming Languages Spring 2005 Lecture #4 Lecture #4 In-Young Ko iko.AT. icu.ac.kr iko.AT. icu.ac.kr Information and Communications University.
LING 438/538 Computational Linguistics Sandiway Fong Lecture 8: 9/29.
LING 388: Language and Computers Sandiway Fong Lecture 9: 9/22.
ISBN Chapter 3 Describing Syntax and Semantics.
LING 438/538 Computational Linguistics Sandiway Fong Lecture 7: 9/12.
CFGs and PDAs Sipser 2 (pages ). Last time…
LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 10: 9/27.
Theory of Computation What types of things are computable? How can we demonstrate what things are computable?
LING 364: Introduction to Formal Semantics
LING 388 Language and Computers Lecture 8 9/25/03 Sandiway FONG.
Chapter 3 Describing Syntax and Semantics Sections 1-3.
LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 6: 9/6.
LING 438/538 Computational Linguistics Sandiway Fong Lecture 6: 9/7.
January 14, 2015CS21 Lecture 51 CS21 Decidability and Tractability Lecture 5 January 14, 2015.
Prof. Bodik CS 164 Lecture 61 Building a Parser II CS164 3:30-5:00 TT 10 Evans.
LING 388 Language and Computers Take-Home Final Examination 12/9/03 Sandiway FONG.
LING 388: Language and Computers Sandiway Fong Lecture 17: 10/25.
LING 388 Language and Computers Lecture 11 10/7/03 Sandiway FONG.
LING 438/538 Computational Linguistics Sandiway Fong Lecture 5: 9/5.
LING 388 Language and Computers Lecture 9 9/30/03 Sandiway FONG.
Chapter 3: Formal Translation Models
A shorted version from: Anastasia Berdnikova & Denis Miretskiy.
LING 364: Introduction to Formal Semantics Lecture 5 January 26th.
LING 388: Language and Computers Sandiway Fong Lecture 13: 10/10.
Finite State Machines Data Structures and Algorithms for Information Processing 1.
LING 388 Language and Computers Lecture 6 9/18/03 Sandiway FONG.
LING 388: Language and Computers Sandiway Fong Lecture 8.
LING 388: Language and Computers Sandiway Fong 10/4 Lecture 12.
Syntax and Semantics Structure of programming languages.
LING 388: Language and Computers Sandiway Fong Lecture 4.
LING 388: Language and Computers Sandiway Fong Lecture 7.
CSI 3120, Grammars, page 1 Language description methods Major topics in this part of the course: –Syntax and semantics –Grammars –Axiomatic semantics (next.
Design contex-free grammars that generate: L 1 = { u v : u ∈ {a,b}*, v ∈ {a, c}*, and |u| ≤ |v| ≤ 3 |u| }. L 2 = { a p b q c p a r b 2r : p, q, r ≥ 0 }
A sentence (S) is composed of a noun phrase (NP) and a verb phrase (VP). A noun phrase may be composed of a determiner (D/DET) and a noun (N). A noun phrase.
Grammars CPSC 5135.
30/09/04 AIPP Lecture 3: Recursion, Structures, and Lists1 Recursion, Structures, and Lists Artificial Intelligence Programming in Prolog Lecturer: Tim.
1 Computability Five lectures. Slides available from my web page There is some formality, but it is gentle,
TextBook Concepts of Programming Languages, Robert W. Sebesta, (10th edition), Addison-Wesley Publishing Company CSCI18 - Concepts of Programming languages.
Copyright © by Curt Hill Grammar Types The Chomsky Hierarchy BNF and Derivation Trees.
Parsing Introduction Syntactic Analysis I. Parsing Introduction 2 The Role of the Parser The Syntactic Analyzer, or Parser, is the heart of the front.
LING/C SC/PSYC 438/538 Lecture 13 Sandiway Fong. Administrivia Reading Homework – Chapter 3 of JM: Words and Transducers.
LING/C SC/PSYC 438/538 Lecture 14 Sandiway Fong. Administrivia Homework 6 graded.
CS 3240 – Chapter 5. LanguageMachineGrammar RegularFinite AutomatonRegular Expression, Regular Grammar Context-FreePushdown AutomatonContext-Free Grammar.
9.7: Chomsky Hierarchy.
1Computer Sciences Department. Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER Reference 3Computer Sciences Department.
Grammars A grammar is a 4-tuple G = (V, T, P, S) where 1)V is a set of nonterminal symbols (also called variables or syntactic categories) 2)T is a finite.
CSC312 Automata Theory Lecture # 26 Chapter # 12 by Cohen Context Free Grammars.
LING/C SC/PSYC 438/538 Lecture 16 Sandiway Fong. SWI Prolog Grammar rules are translated when the program is loaded into Prolog rules. Solves the mystery.
07/10/04 AIPP Lecture 5: List Processing1 List Processing Artificial Intelligence Programming in Prolog Lecturer: Tim Smith Lecture 5 07/10/04.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 12 Mälardalen University 2007.
Lecture 6: Context-Free Languages
Formal grammars A formal grammar is a system for defining the syntax of a language by specifying sequences of symbols or sentences that are considered.
CMSC 330: Organization of Programming Languages Pushdown Automata Parsing.
Week 14 - Friday.  What did we talk about last time?  Simplifying FSAs  Quotient automata.
Chapter 3 – Describing Syntax CSCE 343. Syntax vs. Semantics Syntax: The form or structure of the expressions, statements, and program units. Semantics:
LING/C SC/PSYC 438/538 Lecture 17 Sandiway Fong. Last Time Talked about: – 1. Declarative (logical) reading of grammar rules – 2. Prolog query: s(String,[]).
20 G M aaba acba aaba.. What is it about? Models of Language Generation Models of Language Recognition.
Modeling Arithmetic, Computation, and Languages Mathematical Structures for Computer Science Chapter 8 Copyright © 2006 W.H. Freeman & Co.MSCS SlidesAlgebraic.
Programming Languages Translator
Natural Language Processing - Formal Language -
LING/C SC/PSYC 438/538 Lecture 17 Sandiway Fong.
LING/C SC/PSYC 438/538 Lecture 19 Sandiway Fong.
LING/C SC/PSYC 438/538 Lecture 21 Sandiway Fong.
LING/C SC/PSYC 438/538 Lecture 20 Sandiway Fong.
Presentation transcript:

LING/C SC/PSYC 438/538 Lecture 15 Sandiway Fong

Did you install SWI Prolog?

SWI Prolog Cheatsheet At the prompt ?- 1.halt. 2.listing. listing(name). more useful 3.[filename]. loads filename.pl 4.trace.(step through derivation, hit return) 5.notrace. 6.debug. 7.nodebug.turn off debugger 8.spy(name).spy on predicate name 9.pwd.print working directory 10.working_directory(_,Y). switch directories to Y Anytime –^C (then a(bort) or h(elp) for other options) Everything typed at the prompt must end in a period.

Prolog online resources Useful Online Tutorial – Learn Prolog Now! Patrick Blackburn, Johan Bos & Kristina Striegnitz

Prolog Recursion Example (factorial): – 0! = 1 – n! = n * (n-1)! for n>0 In Prolog: –factorial(0,1). –factorial(N,NF) :- M is N-1, factorial(M,MF), NF is N * MF. Problem: infinite loop Fix: 2 nd case only applies to numbers > 0 factorial(N,NF) :- N>0, M is N-1, factorial(M,MF), NF is N * MF. Prolog built-in: X is

Regular Languages Three formalisms, same expressive power 1.Regular expressions 2.Finite State Automata 3.Regular Grammars We’ll look at this case using a logic programming language: Prolog We’ll look at this case using a logic programming language: Prolog

Chomsky Hierarchy division of grammar into subclasses partitioned by “generative power/capacity” Type-0 General rewrite rules – Turing-complete, powerful enough to encode any computer program – can simulate a Turing machine – anything that’s “computable” can be simulated using a Turing machine – Type-1 Context-sensitive rules – weaker, but still very powerful – a n b n c n Type-2 Context-free rules weaker still a n b n Pushdown Automata (PDA) – Type-3 Regular grammar rules – very restricted – Regular Expressions a + b + – Finite State Automata (FSA) Turing machine: artist’s conception from Wikipedia tape read/write head finite state machine Natural language s Natural language s

Chomsky Hierarchy Regular Grammars FSA Regular Expressions DCG = Type-0 Type-3 Type-2 Type-1

Prolog Grammar Rule System known as “Definite Clause Grammars” (DCG) – based on type-2 restrictions (context-free grammars) – but with extensions – (powerful enough to encode the hierarchy all the way up to type-0) – Prolog was originally designed (1970s) to also support natural language processing – we’ll start with the bottom of the hierarchy i.e. the least powerful regular grammars (type-3)

Definite Clause Grammars (DCG) Background – a “typical” formal grammar contains 4 things – a set of non-terminal symbols (N) – these symbols will be expanded or rewritten by the rules a set of terminal symbols (T) – these symbols cannot be expanded production rules (P) of the form – LHS  RHS – In regular and CF grammars, LHS must be a single non-terminal symbol – RHS: a sequence of terminal and non-terminal symbols: possibly with restrictions, e.g. for regular grammars a designated start symbol (S) – a non-terminal to start the derivation Language – set of terminal strings generated by – e.g. through a top-down derivation

Definite Clause Grammars (DCG) Background a “typical” formal grammar contains 4 things – a set of non-terminal symbols (N) – a set of terminal symbols (T) – production rules (P) of the form LHS  RHS – a designated start symbol (S) Example grammar (regular): S  aB B  aB B  bC B  b C  bC C  b Notes: Start symbol: S Non-terminals: {S,B,C} (uppercase letters) Terminals: {a,b} (lowercase letters)

DefiniteClause Grammars (DCG) Example – Formal grammarDCG format – S  aB s --> [a],b. – B  aB b --> [a],b. – B  bC b --> [b],c. – B  b b --> [b]. – C  bC c --> [b],c. – C  b c --> [b]. Notes: – Start symbol: S – Non-terminals: {S,B,C} – (uppercase letters) – Terminals: {a,b} – (lowercase letters) DCG format: both terminals and non-terminal symbols begin with lowercase letters – variables begin with an uppercase letter (or underscore) --> is the rewrite symbol terminals are enclosed in square brackets (list notation) nonterminals don’t have square brackets surrounding them the comma (,: and) represents the concatenation symbol a period (. ) is required at the end of every DCG rule

13 Regular Grammars Regular or Chomsky hierarchy type-3 grammars – are a class of formal grammars with a restricted RHS LHS → RHS“LHS rewrites/expands to RHS” all rules contain only a single non-terminal, and (possibly) a single terminal) on the right hand side Canonical Forms: x --> y, [t].x --> [t]. (left recursive) or x --> [t], y.x --> [t]. (right recursive) – where x and y are non-terminal symbols and – t (enclosed in square brackets) represents a terminal symbol. Note: – can’t mix these two forms (and still have a regular grammar)! – can’t have both left and right recursive rules in the same grammar Terminology: or “left/right linear” Terminology: or “left/right linear”

Definite Clause Grammars (DCG) What language does our regular grammar generate? by writing the grammar in Prolog, we have a ready-made recognizer program – no need to write a separate grammar rule interpreter (in this case) Example queries – ?- s([a,a,b,b,b],[]). Yes – ?- s([a,b,a],[]). No Note: – Query uses the start symbol s with two arguments: – (1) sequence (as a list) to be recognized and – (2) the empty list [] one or more a’s followed by one or more b’s 1. s --> [a],b. 2. b --> [a],b. 3. b --> [b],c. 4. b --> [b]. 5. c --> [b],c. 6. c --> [b]. 1. s --> [a],b. 2. b --> [a],b. 3. b --> [b],c. 4. b --> [b]. 5. c --> [b],c. 6. c --> [b]. Prolog lists: In square brackets, separated by commas e.g. [a] [a,b,c] Prolog lists: In square brackets, separated by commas e.g. [a] [a,b,c]

Prolog lists Perl lists: = (“a”, “b”, “c”); = qw(a b c); = (); Prolog lists: – List = [a, b, c](List is a variable) – List = [a|[b|[c|[]]]](a = head, tail = [b|[c|[]]]) – List = [] Mixed notation: [a|[b,c]] [a,b|[c]]

Regular Grammars Tree representation – Example ?- s([a,a,b],[]). true 1. s --> [a],b. 2. b --> [a],b. 3. b --> [b],c. 4. b --> [b]. 5. c --> [b],c. 6. c --> [b]. Derivation: s [a], b(rule 1) [a],[a],b(rule 2) [a],[a],[b](rule 4) Derivation: s [a], b(rule 1) [a],[a],b(rule 2) [a],[a],[b](rule 4) our a regular grammar Using trace, we can observe the progress of the derivation… There’s a choice of rules for nonterminal b: Prolog tries the first rule There’s a choice of rules for nonterminal b: Prolog tries the first rule Through backtracking It can try other choices Through backtracking It can try other choices all terminals, so we stop

Regular Grammars Tree representation – Example ?- s([a,a,b,b,b],[]). 1. s --> [a],b. 2. b --> [a],b. 3. b --> [b],c. 4. b --> [b]. 5. c --> [b],c. 6. c --> [b]. Derivation: s [a], b(rule 1) [a],[a],b(rule 2) [a],[a],[b],c(rule 3) [a],[a],[b],[b],c(rule 5) [a],[a],[b],[b],[b](rule 6) Derivation: s [a], b(rule 1) [a],[a],b(rule 2) [a],[a],[b],c(rule 3) [a],[a],[b],[b],c(rule 5) [a],[a],[b],[b],[b](rule 6)

Prolog Derivations Prolog’s computation rule: – Try first matching rule in the database (remember others for backtracking) – Backtrack if matching rule leads to failure – undo and try next matching rule (or if asked for more solutions) For grammars: – Top-down left to right derivations left to right = expand leftmost nonterminal first Leftmost expansion done recursively = depth-first

Prolog Derivations For a top-down derivation, logically, we have: Choice – about which rule to use for nonterminals b and c No choice – About which nonterminal to expand next 1. s --> [a],b. 2. b --> [a],b. 3. b --> [b],c. 4. b --> [b]. 5. c --> [b],c. 6. c --> [b]. Bottom up derivation for [a,a,b,b] 1.[a],[a],[b],[b] 2.[a],[a],[b],c(rule 6) 3.[a],[a],b(rule 3) 4.[a],b(rule 2) 5.s (rule 1) Prolog doesn’t give you bottom-up derivations … you’d have to program it up Prolog doesn’t give you bottom-up derivations … you’d have to program it up

SWI Prolog Grammar rules are translated when the program is loaded into Prolog rules. Solves the mystery why we have to type two arguments with the nonterminal at the command prompt Recall list notation: – [1|[2,3,4]] = [1,2,3,4] 1. s --> [a],b. 2. b --> [a],b. 3. b --> [b],c. 4. b --> [b]. 5. c --> [b],c. 6. c --> [b]. 1.s([a|A], B) :- b(A, B). 2. b([a|A], B) :- b(A, B). 3. b([b|A], B) :- c(A, B). 4. b([b|A], A). 5. c([b|A], B) :- c(A, B). 6. c([b|A], A).