LING 438/538 Computational Linguistics Sandiway Fong Lecture 25: 11/21.

Administrivia
Lecture schedule (from last time):
– Tuesday 21st November: Homework #6 (Context-free Grammars and Parsing), due Tuesday 28th
– Thursday 23rd November: Turkey Day
– Tuesday 28th November
– Thursday 30th November: Homework #7 (Machine Translation), due December 7th; 538 Presentations
– Tuesday 5th December: Homework #7 (Machine Translation); 538 Presentations

Administrivia
538 Presentations: assignments

Last Time
Chapter 10: Parsing with Context-Free Grammars
– Top-down parsing: Prolog's DCG rule system; left recursion; the left-corner idea
– Bottom-up parsing: dotted rules; LR parsing (shift and reduce operations)

Bottom-Up Parsing
LR(0) parsing:
– an example of bottom-up tabular parsing
– similar to the top-down Earley algorithm described in the textbook in that both methods use the idea of dotted rules
– LR is more efficient: it computes the dotted rules offline (during parser/grammar construction), whereas Earley computes the dotted rules at parse time
LR actions:
– Shift: read an input word, i.e. advance the current input word pointer to the next word
– Reduce: complete a nonterminal, i.e. complete parsing a grammar rule
– Accept: complete the parse, i.e. the start symbol (e.g. S) derives the terminal string

Tabular Parsing
Dotted rule notation:
– the "dot" is used to indicate the progress of a parse through a phrase structure rule
– examples:
  vp --> v . np   means we've seen v and predict np
  np --> . d n    means we're predicting a d (followed by n)
  vp --> vp pp .  means we've completed a vp
State:
– a set of dotted rules encodes the state of the parse
– kernel:
  vp --> v . np
  vp --> v .
– completion (of the predicted np):
  np --> . d n
  np --> . n
  np --> . np cp
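The dotted-rule notation above is easy to make concrete. This is an illustrative sketch in Python (not the lecture's Prolog code): a dotted rule is represented as a left-hand side, a right-hand side list, and an integer index for the dot.

```python
# Illustrative sketch: a dotted rule as (lhs, rhs, dot), where `dot` is the
# index of the "dot" position within rhs.
def show(lhs, rhs, dot):
    """Render a dotted rule as a string, with the dot inserted at position `dot`."""
    return f"{lhs} --> " + " ".join(rhs[:dot] + ["."] + rhs[dot:])
```

For example, show("np", ["d", "n"], 0) renders the prediction np --> . d n, and show("vp", ["vp", "pp"], 2) renders the completed rule vp --> vp pp .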

Tabular Parsing
Compute possible states by advancing the dot.
Example (assume d is next in the input):
  vp --> v . np
  vp --> v .       (eliminated)
  np --> d . n
  np --> . n       (eliminated)
  np --> . np cp

Tabular Parsing
Dotted rules, example State 0:
  s --> . np vp
  np --> . d n
  np --> . n
  np --> . np pp
Possible actions:
– shift d and go to a new state
– shift n and go to a new state
Creating new states:
  State 0: S --> . NP VP, NP --> . D N, NP --> . N, NP --> . NP PP
  shift d => State 1: NP --> D . N
  shift n => State 2: NP --> N .
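The state-building idea on this slide can be sketched in Python (an illustration only; the course code does this in Prolog). Items are (rule, dot) pairs, closure adds predictions for any nonterminal after a dot, and goto advances the dot over a symbol. Only an NP fragment of the toy grammar is used here.

```python
# Sketch of LR(0) state construction for a fragment of the toy grammar.
RULES = [
    ("s",  ["np", "vp"]),   # rule 0
    ("np", ["d", "n"]),     # rule 1
    ("np", ["n"]),          # rule 2
    ("np", ["np", "pp"]),   # rule 3
]
NONTERMINALS = {"s", "np", "vp", "pp"}

def closure(items):
    """Add (r, 0) for every rule r whose left-hand side appears after a dot."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for r, dot in list(items):
            rhs = RULES[r][1]
            if dot < len(rhs) and rhs[dot] in NONTERMINALS:
                for i, (lhs, _) in enumerate(RULES):
                    if lhs == rhs[dot] and (i, 0) not in items:
                        items.add((i, 0))
                        changed = True
    return items

def goto(items, symbol):
    """Advance the dot over `symbol` where possible, then close the result."""
    moved = {(r, d + 1) for r, d in items
             if d < len(RULES[r][1]) and RULES[r][1][d] == symbol}
    return closure(moved)

state0 = closure({(0, 0)})   # kernel: s --> . np vp
state1 = goto(state0, "d")   # shift d: np --> d . n
state2 = goto(state0, "n")   # shift n: np --> n .
```

Since the grammar has finitely many rules and each rule has finitely many dot positions, only finitely many item sets exist, which is why the state-construction loop terminates.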

Tabular Parsing
State 1: shift N, goto State 3:
  State 0: S --> . NP VP, NP --> . D N, NP --> . N, NP --> . NP PP
  State 1: NP --> D . N
  State 2: NP --> N .
  shift n from State 1 => State 3: NP --> D N .

Tabular Parsing
Shift:
– take the input word, and
– place it on the stack
Example (in state 3, after shifting a and man):
  Input: [V hit] …
  Stack: [N man] [D a]

Tabular Parsing
State 2: reduce action NP --> N .
  State 0: S --> . NP VP, NP --> . D N, NP --> . N, NP --> . NP PP
  State 1: NP --> D . N
  State 2: NP --> N .
  State 3: NP --> D N .

Tabular Parsing
Reduce NP --> N . :
– pop [N milk] off the stack, and
– replace it with [NP [N milk]]
  Input: [V is] …
  Stack: [NP milk]   (State 2)

Tabular Parsing
State 3: reduce NP --> D N .
  State 0: S --> . NP VP, NP --> . D N, NP --> . N, NP --> . NP PP
  State 1: NP --> D . N
  State 2: NP --> N .
  State 3: NP --> D N .

Tabular Parsing
Reduce NP --> D N . :
– pop [N man] and [D a] off the stack
– replace them with [NP [D a] [N man]]
  Input: [V hit] …
  Stack: [NP [D a] [N man]]   (State 3)

Tabular Parsing
State 0: transition on NP:
  State 0: S --> . NP VP, NP --> . D N, NP --> . N, NP --> . NP PP
  goto NP => State 4: S --> NP . VP, NP --> NP . PP, VP --> . V NP, VP --> . V, VP --> . VP PP, PP --> . P NP

Tabular Parsing
For both states 2 and 3:
– NP --> N .     (reduce NP --> N)
– NP --> D N .   (reduce NP --> D N)
After the Reduce NP operation: goto state 4.
Notes:
– states are unique
– the grammar is finite
– the procedure generating states must terminate, since the number of possible dotted rules is finite

Tabular Parsing

  State | Action            | Goto
  0     | Shift D, Shift N  |
  1     | Shift N           |
  2     | Reduce NP --> N   | 4
  3     | Reduce NP --> D N | 4
  4     | …                 | …
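A table like this is all the machine needs at parse time. The following Python sketch is hypothetical and simplified (in particular, the reduce entries here carry a fixed goto state, whereas the full LR machine consults the state uncovered on the control stack), but it shows how a step function is driven entirely by table lookup:

```python
# Hypothetical, simplified table-driven shift-reduce stepper for the NP
# fragment above; states and actions mirror the table on this slide.
ACTION = {
    0: ("shift", {"d": 1, "n": 2}),   # State 0: shift D or N
    1: ("shift", {"n": 3}),           # State 1: shift N
    2: ("reduce", ("np", 1, 4)),      # NP --> N .    then goto 4
    3: ("reduce", ("np", 2, 4)),      # NP --> D N .  then goto 4
}

def step(states, stack, words):
    kind, info = ACTION[states[-1]]
    if kind == "shift":                # consume a word, push its category
        cat, word = words[0]
        return states + [info[cat]], stack + [(cat, word)], words[1:]
    lhs, n, goto_state = info          # reduce: pop n items, push lhs
    return states[:-n] + [goto_state], stack[:-n] + [(lhs, stack[-n:])], words

# Parse "a man": shift d, shift n, then reduce NP --> D N .
states, stack, words = [0], [], [("d", "a"), ("n", "man")]
while words:
    states, stack, words = step(states, stack, words)
states, stack, words = step(states, stack, words)   # final reduce
```

After the final reduce, the structure stack holds a single NP constituent and the control stack sits in state 4, just as the table's goto column says.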

Tabular Parsing
Observations:
– the table is sparse
  example: State 0, input [V …]: the parse fails immediately
– in a given state, the input may be irrelevant
  example: State 2 (there is no shift operation)
– there may be action conflicts
  example: State 1: shift D, shift N
  more interesting cases: shift-reduce and reduce-reduce conflicts

Tabular Parsing
Finishing up:
– an extra initial rule is usually added to the grammar:
  SS --> S $   (SS = start symbol, $ = end-of-sentence marker)
– input: milk is good for you $
– accept action:
  discard $ from the input
  return the element at the top of the stack as the parse tree

LR Parsing in Prolog
Recap:
– finite state machine: each state represents a set of dotted rules, e.g.
  S --> . NP VP
  NP --> . D N
  NP --> . N
  NP --> . NP PP
– we transition, i.e. move, from state to state by advancing the "dot" over terminal and nonterminal symbols

LR Parsing in Prolog
Plan:
– formally describe an LR finite state machine construction process
– define the parse procedure parse(Sentence,Tree) in terms of the LR finite state machine
– run John saw the man with a telescope:
  ?- parse([john,saw,the,man,with,a,telescope],T).
  which produces two parses (PP-attachment ambiguity)

Grammar
Assume grammar rules and lexicon (a convenient format for the LR(0) generator):
  rule(s,[np,vp]).
  rule(np,[d,n]).
  rule(np,[n]).
  rule(np,[np,pp]).
  rule(vp,[v,np]).
  rule(vp,[v]).
  rule(vp,[vp,pp]).
  rule(pp,[p,np]).
  lexicon(the,d).  lexicon(a,d).
  lexicon(man,n).  lexicon(john,n).  lexicon(telescope,n).
  lexicon(saw,v).  lexicon(runs,v).
  lexicon(with,p).

Grammar
Extra definitions:
  :- dynamic rule/2.
  start(ss).
  rule(ss,[s,$]).
  nonT(ss). nonT(s). nonT(np). nonT(vp). nonT(pp).
  term(n). term(v). term(p). term(d). term($).
Notes:
– $ = end-of-sentence marker
– Prolog programming trick: declaring rule/2 as dynamic allows us to use the builtin clause(rule(LHS,RHS),true,Ref) to keep a pointer (Ref) to a particular rule

Grammar Rule Predicates
Define:
  %% Assume grammar rules are stored as database facts
  %% rule(LHS,RHS)
  ruleLHS(NonT,Ref) :- clause(rule(NonT,_),true,Ref).
  ruleRHS(RHS,Ref) :- clause(rule(_,RHS),true,Ref).
  ruleElements(LHS,RHS,Ref) :-   % assume Ref instantiated
      clause(rule(LHS,RHS),true,Ref).
Note:
– Ref (when instantiated) is a pointer to an instance of rule(LHS,RHS)

A Counter in Prolog
Define stateCounter(N) to hold the current state number (N = 0,1,2,3,…).
Define the predicates:
  resetStateCounter :-
      retractall(stateCounter(_)),
      assert(stateCounter(0)).
  incStateCounter :-
      retract(stateCounter(X)),
      Y is X + 1,
      assert(stateCounter(Y)).
Prolog builtins used:
– retract/1 removes a matching item from the database
– retractall/1 removes all matching items from the database
– assert/1 adds an item to the database

Data Structures
Define cfsm/3, a "state configuration":
  cfsm(L,CFSet,N)
  CFSet = list of dotted rules for state N
  L = |CFSet| (used for quicker lookup)
Define cf/2, a "dotted rule configuration":
  cf(Ref,I)
  Ref points to a rule(LHS,RHS)
  I (= 0,1,2,…) is the index of the "dot" in RHS

Build FSA
Initially R1 = rule(ss,[s,$]), i.e. ss --> . s $, represented as cf(R1,0).
Do a closure on the dotted rule, adding s --> . np vp, np --> . d n, …
  State 0:
  SS --> . S $
  S --> . NP VP
  NP --> . D N
  NP --> . N
  NP --> . NP PP

Build FSM: Closure Operation
Define:
  mkStartCF(cf(Ref,0)) :- start(Start), ruleLHS(Start,Ref).
Call:
  mkStartCF(StartCF),
  closure([StartCF],S0),
Define closure/2 recursively:
  closure(CFSet,CFSet1) :-
      dotNonT(CFSet,NonT),
      predict(NonT,CFSet,CFSet2),
      closure(CFSet2,CFSet1).
  closure(CFSet,CFSet).

Build FSM: Closure Operation
Define dotNonT/2 to pick out possible instances of Y in X --> … . Y …
  dotNonT([cf(Ref,Pos)|_],NonT) :-
      dotNonT1(Ref,Pos,NonT).
  dotNonT([_|L],NonT) :- dotNonT(L,NonT).
  dotNonT1(Ref,Pos,NonT) :-
      ruleRHS(RHS,Ref), nth(Pos,RHS,NonT), nonT(NonT).
Notes:
– dotNonT/2 works just like list member/2
– nth(N,L,X) picks out the (N+1)th element (X) in list L

Build FSM: Closure Operation
Define predict/3 to add new dotted rules for NonT:
  predict(NonT,CFSet,NewCFSet) :-
      findall(cf(Ref,0),ruleLHS(NonT,Ref),NewCFs),
      merge(NewCFs,CFSet,NewCFSet,[],new).
Define merge/5 to add new dotted rules only if they're not already present in CFSet:
  merge([],L,L,Flag,Flag).
  merge([cf(Ref,Pos)|L],CFSet,CFSet1,Flag,Flag1) :-   % already present
      member(cf(Ref,Pos),CFSet),
      merge(L,CFSet,CFSet1,Flag,Flag1).
  merge([CF|L],CFSet,CFSet1,_,Flag) :-   % CF is new
      merge(L,[CF|CFSet],CFSet1,new,Flag).
Note:
– the variable Flag ([]/new) is used to make sure something has been added to CFSet

Build FSM: Closure Operation
Call:
  mkStartCF(StartCF),
  closure([StartCF],S0),
  resetStateCounter,
  length(S0,L),
  cfsmEntry(S0,L),
Define the storage predicate cfsmEntry/2:
  cfsmEntry(CFSet,L) :-
      stateCounter(State),
      incStateCounter,
      asserta(cfsm(L,CFSet,State)).
Recall cfsm(L,CFSet,N), a "state configuration": CFSet = list of dotted rules for state N; L = |CFSet| (used for quicker lookup).

Build FSM: Build New State
Define buildState/2:
  buildState(CFSet,S1) :-
      transition(CFSet,Symbol,CFSet1),
      length(CFSet1,L),
      addCFSet(CFSet1,L,S2),
      assert(goto(S1,Symbol,S2)),
      fail.
  buildState(_,_).
Notes:
– transition/3 produces a new CFSet by advancing the dot over Symbol
– addCFSet/3 adds a new state represented by CFSet1 (if it doesn't already exist)
– state transitions are represented by goto(S1,Symbol,S2)

Build FSM: Build New State
Define transition/3:
  transition(CFSet,Symbol,CFSet1) :-
      pickSymbol(CFSet,Symbol),
      advanceDot(CFSet,Symbol,CFSet2),
      closure(CFSet2,CFSet1).
Note: pickSymbol/2 picks a symbol next to a dot in a dotted rule in CFSet.
Define advanceDot/3:
  advanceDot([cf(Ref,Pos)|L],Symbol,[cf(Ref,Pos1)|CFSet]) :-
      ruleRHS(RHS,Ref), nth(Pos,RHS,Symbol),
      !,
      Pos1 is Pos+1,
      advanceDot(L,Symbol,CFSet).
  advanceDot([_|L],Symbol,CFSet) :- !, advanceDot(L,Symbol,CFSet).
  advanceDot([],_,[]).
Example: State 0 (S --> . NP VP, NP --> . D N, NP --> . N, NP --> . NP PP) goes via NP to State 4 (S --> NP . VP, NP --> NP . PP, VP --> . V NP, VP --> . V, VP --> . VP PP, PP --> . P NP).

Build FSM: Build New State
Define addCFSet/3:
  addCFSet(CFSet,L,S) :-   % CFSet already established
      findCFSet(CFSet,S,L),
      !.
  addCFSet(CFSet,L,S) :-   % CFSet is new state #S
      cfsmEntry(CFSet,L,S).   % add it
Notes:
– findCFSet/3 succeeds only if CFSet exists in the current cfsm/3 database
– cfsmEntry/3 (defined earlier) increments the state number (S) and performs: ?- asserta(cfsm(L,CFSet,S)).
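The bookkeeping that addCFSet/3 performs (number a state only if its item set has not been seen before) can be sketched in Python as a registry keyed by the item set itself. This is an illustration of the idea, not the course's Prolog code:

```python
# Sketch: states keyed by their (frozen) item set, so an existing state is
# found rather than duplicated; len(states) plays the role of stateCounter.
states = {}   # frozenset of items -> state number

def add_state(items):
    key = frozenset(items)
    if key not in states:
        states[key] = len(states)   # assign the next state number
    return states[key]

s0 = add_state({(0, 0)})
s1 = add_state({(1, 1)})
s0_again = add_state({(0, 0)})     # same item set: same state number
```

Because item sets are compared as sets, the same state reached by two different transition paths is registered only once, which is exactly why the construction terminates with a finite machine.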

Build Actions
Two main actions:
– Shift: move a word from the input onto the stack
  Example: NP --> . D N
– Reduce: build a new constituent
  Example: NP --> D N .

Build Actions
Machine components:
– Input, e.g. [V hit] …
– Structure Stack (items), e.g. [N man] [D a]
– Control Stack (states)
A machine operation step (action) has the signature:
  CS x Input x SS --> CS' x Input' x SS'
where CS = control stack and SS = (constituent) structure stack.

Build Actions: Shift Action
Example: shift(n)
Code:
  action(S, CS, Input, SS, CS2, Input2, SS2) :-
      Input = [Item|Input2],
      category(Item,n),
      goto(S,n,S2),
      CS2 = [S2|CS],
      SS2 = [Item|SS].
Notes (changes):
– Input2 is Input minus Item
– SS2 is SS plus Item
– CS2 is CS plus S2 from goto(S,n,S2)

Build Actions: Shift Action
Calling pattern for action/7:
– given values for the current state (S) and the control and structure stacks (CS, SS),
– compute new values for the state (S2) and the control and structure stacks (CS2, SS2):
  action(S, CS, Input, SS, CS2, Input2, SS2)
  (S, CS, Input, SS given; CS2, Input2, SS2 computed)

Build Actions: Reduce Action
Example: reduce NP --> D N .
Code:
  action(S, CS, Input, SS, CS2, Input2, SS2) :-
      Input = Input2,
      SS = [N,D|SS3],
      SS2 = [np(D,N)|SS3],
      CS = [_,_,S1|CS3],
      CS2 = [S2,S1|CS3],
      goto(S1,np,S2).
Notes:
– the input is unchanged
– pop 2 items off the stacks
– goto is not based on the current state

Build Actions
Define the shift/reduce action generation procedure:
  buildActions :-
      cfsm(_,CFSet,State),
      actions(CFSet,Instructions),
      genActions(State,Instructions),
      fail.
  buildActions.
Define actions/2:
  actions([],[]).
  actions([CF|CFs],L) :-
      reduceAction(CF,L1),
      shiftAction(CF,L2),
      append(L1,L2,L3),
      actions(CFs,L4),
      union(L3,L4,L).   % there should be no duplicate actions

Build Actions
Define the shift and reduce actions:
  reduceAction(cf(Ref,Pos),[reduce(Ref)]) :-
      ruleRHS(RHS,Ref),
      length(RHS,Pos),   % finds configuration A --> α .
      !.
  reduceAction(_,[]).
  % assume that Symbol is in Vt
  shiftAction(cf(Ref,Pos),[shift(Symbol)]) :-
      ruleRHS(RHS,Ref),   % finds configuration A --> α . a β
      nth(Pos,RHS,Symbol),
      term(Symbol),
      !.
  shiftAction(_,[]).
This builds sequences of instructions of the form [shift(n), reduce(R3)], etc.
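The per-item classification that reduceAction/shiftAction perform can be sketched in Python (an illustration, not the lecture's Prolog): a completed item yields a reduce instruction, and a dot before a terminal yields a shift instruction.

```python
# Sketch of classifying a single dotted-rule item into parser instructions.
TERMINALS = {"d", "n", "v", "p", "$"}

def item_actions(rules, item):
    r, dot = item
    rhs = rules[r][1]
    if dot == len(rhs):            # A --> alpha .        : reduce
        return [("reduce", r)]
    if rhs[dot] in TERMINALS:      # A --> alpha . a beta : shift
        return [("shift", rhs[dot])]
    return []                      # dot before a nonterminal: no action here

RULES = [("np", ["d", "n"]), ("np", ["n"])]
```

For example, the item (0, 0), i.e. np --> . d n, classifies as a shift of d, while (0, 2), i.e. np --> d n ., classifies as a reduce by rule 0.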

Build Actions
Define the procedure genActions/2, which turns instructions such as shift(n) into code like:
  action(S, CS, Input, SS, CS2, Input2, SS2) :-
      Input = [Item|Input2],
      category(Item,n),
      goto(S,n,S2),
      CS2 = [S2|CS].

Build Actions
genActions/2 processes a list of actions for a given state S:
  genActions(_,[]).
  genActions(S,[Action|As]) :-
      nl,
      actionClause(S,Action,Clause),
      write(Clause), write('.'),
      genActions(S,As).
Prolog builtins:
– nl writes a newline to standard output
– write/1 writes its argument to standard output

Build Actions: Shift
Generate action/7 for shift:
  % shifting a $
  actionClause(State,shift($),action(State,_,[$],SS,accept,[],SS)) :- !.
  % shifting anything other than a $
  actionClause(State,shift(Symbol),
      (action(State,CS,[I|Is],SS,[S|CS],Is,[I|SS]) :-
          functor(I,Symbol,_),
          goto(State,Symbol,S))).
Note: see words/2 later; assume each input item is of the form c(word), e.g. n(john).

Build Actions: Reduce
Generate action/7 for reduce:
  actionClause(State,reduce(Ref),
      (action(State,CS,I,SS,[S2,Last|CS1],I,[Item|SS1]) :-
          goto(Last,NT,S2))) :-
      ruleElements(NT,RHS,Ref),
      popStk(RHS,CS,Last,CS1),
      popAndLink(RHS,SS,SS1,L),
      Item =.. [NT|L].
Note:
– popStk/4 and popAndLink/4 both generate code to pop the control and structure stacks

LR Machine: Goto Table
Example of the LR machine constructed:
  % State 8: pp --> . p np   vp --> vp . pp   s --> np vp .
  goto(4,vp,8).
  % State 9: vp --> vp pp .
  goto(8,pp,9).
  goto(8,p,6).
  goto(7,d,2).
  goto(7,n,3).
  % State 10: pp --> . p np   np --> np . pp   vp --> v np .
  goto(7,np,10).
  goto(10,pp,5).
  goto(10,p,6).
  goto(6,d,2).
  goto(6,n,3).
  % State 11: pp --> . p np   np --> np . pp   pp --> p np .
  goto(6,np,11).
  goto(11,pp,5).
  goto(11,p,6).
  % State 12: np --> d n .
  goto(2,n,12).
  % State 13: ss --> s $ .
  goto(1,$,13).

LR Machine: Action Table
Example of the action table constructed (action(State,CS,Input,SS,CS',Input',SS')):
  % 7
  action(7,_14,[_20|_18],_16,[_22|_14],_18,[_20|_16]) :- functor(_20,n,_32), goto(7,n,_22).
  action(7,_58,[_64|_62],_60,[_66|_58],_62,[_64|_60]) :- functor(_64,d,_76), goto(7,d,_66).
  action(7,[_38,_10|_11],_03,[_44|_13],[_08,_10|_11],_03,[vp(_44)|_13]) :- goto(_10,vp,_08).
  % 6
  action(6,_78,[_84|_82],_80,[_86|_78],_82,[_84|_80]) :- functor(_84,n,_96), goto(6,n,_86).
  action(6,_22,[_28|_26],_24,[_30|_22],_26,[_28|_24]) :- functor(_28,d,_40), goto(6,d,_30).
  % 5
  action(5,[_68,_70,_38|_39],_31,[_78,_82|_41],[_36,_38|_39],_31,[np(_82,_78)|_41]) :- goto(_38,np,_36).

Parser
Define parse/2 as follows:
  parse(Words,Parse) :-
      words(Words,L),
      machine([0],L,[],Parse).
  machine(CS,Input,SS,Parse) :-
      (  CS = accept
      -> SS = [Parse]
      ;  CS = [State|_],
         action(State,CS,Input,SS,CS2,Input2,SS2),
         machine(CS2,Input2,SS2,Parse)
      ).
  words([],[$]).
  words([W|Ws],[I|Is]) :- lexicon(W,C), I =.. [C,W], words(Ws,Is).
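When action/7 has several matching clauses for a state, Prolog's backtracking explores them all, which is how machine/4 finds every parse of an ambiguous sentence. A simplified, hypothetical Python sketch of that control structure (a generator standing in for backtracking):

```python
# Sketch of machine/4's control structure: a state may offer several actions;
# a generator tries each in turn, mirroring Prolog's backtracking.
def machine(state_actions, config):
    state, words, stack = config
    if state == "accept":
        yield stack
        return
    for action in state_actions.get(state, []):
        new_config = action(config)     # an action may fail (return None)
        if new_config is not None:
            yield from machine(state_actions, new_config)

# Two competing actions in state 0 give two complete "parses" (toy data,
# not the telescope grammar).
actions = {
    0: [lambda c: (1, c[1], c[2] + ["a"]),
        lambda c: (1, c[1], c[2] + ["b"])],
    1: [lambda c: ("accept", c[1], c[2])],
}
results = list(machine(actions, (0, [], [])))
```

Requesting all solutions here corresponds to typing ; at the Prolog prompt after the first parse is printed.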

Administrivia
Prolog code is available on the course webpage:
– grammar0.pl: example grammar
– lr0.pl: LR(0) parser/generator
– machine0.pl: generated tables

LR Parsing in Prolog
How to use:
  1. ?- [grammar0].     (consult toy grammar)
  2. ?- [lr0].          (consult LR code)
  3. ?- build.          (constructs goto table)
  4. ?- buildActions.   (constructs shift/reduce actions)
How to use (saving output to a file):
  1. ?- [grammar0].     (consult toy grammar)
  2. ?- [lr0].          (consult LR code)
  3. ?- tell('filename.pl').   (redirect screen output to filename.pl)
  4. ?- build.          (constructs goto table)
  5. ?- buildActions.   (constructs shift/reduce actions)
  6. ?- told.           (close filename.pl)

Parser
Example:
  ?- parse([john,saw,the,man,with,a,telescope],X).
  X = s(np(n(john)),vp(v(saw),np(np(d(the),n(man)),pp(p(with),np(d(a),n(telescope)))))) ;
  X = s(np(n(john)),vp(vp(v(saw),np(d(the),n(man))),pp(p(with),np(d(a),n(telescope))))) ;
  no

LR(0) Goto Table
(goto entries by grammar symbol; the source-state column labels did not survive transcription — see machine0.pl for the full table)
  D:  2 2
  N:  3 12 3
  NP: 4 10
  V:  7
  VP: 8
  P:  6 6 6
  PP: 5 9 5
  S:  1
  $:  13

LR(0) Action Table
(table cells garbled in transcription — see machine0.pl for the generated actions)
Legend: S = shift, R = reduce, A = accept
– empty cells = error states
– multiple actions in a cell = machine conflict
– Prolog's computation rule: backtrack

LR(0) Conflict Statistics
Toy grammar:
– 14 states
– 6 states with 2 competing actions
  states 11, 10, 8: shift-reduce conflict
– 1 state with 3 competing actions
  State 7: shift(d), shift(n), reduce(vp --> v)

Lookahead
LR(1):
– a shift/reduce tabular parser using one (terminal) lookahead symbol
– decides on the action (shift, reduce) to take based on state x input symbol
– example: select a reduce operation by consulting the current input symbol
– cf. LR(0): selects an action based on just the current state

Lookahead
Potential advantage:
– the input symbol may partition the action space, resulting in fewer conflicts, provided the current input symbol can help to choose between possible actions
Potential disadvantages:
– a larger finite state machine: more possible dotted rule/lookahead combinations than dotted rule combinations alone
– it might not help much (depends on the grammar)
– more complex (off-line) computation: building the LR machine gets more complicated

Lookahead
Formally:
  X --> … . Y …, L
  L = lookahead set = set of possible terminals that can follow X
Example, State 0:
  ss --> . s $      [[]]
  s --> . np vp     [$]
  np --> . d n      [p,v]
  np --> . n        [p,v]
  np --> . np pp    [p,v]

Lookahead
Central idea for propagating lookahead in the state machine:
– if a dotted rule is complete, the lookahead informs the parser about what the next terminal symbol should be
– example: NP --> D N ., L
  reduce by the NP rule provided the current input symbol is in the set L
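One way to see where such sets come from is a FOLLOW-set computation over the toy grammar, sketched in Python below. Note the hedge: FOLLOW(X) is the coarser, SLR(1)-style approximation (one set per nonterminal), whereas the LR(1) construction on these slides computes per-item, context-specific lookahead sets; the reduce criterion is the same, though — reduce A --> α . only when the current input symbol can actually follow A.

```python
# Illustration: FIRST and FOLLOW sets for the toy grammar (no empty rules).
RULES = [
    ("ss", ["s", "$"]),
    ("s",  ["np", "vp"]),
    ("np", ["d", "n"]), ("np", ["n"]), ("np", ["np", "pp"]),
    ("vp", ["v", "np"]), ("vp", ["v"]), ("vp", ["vp", "pp"]),
    ("pp", ["p", "np"]),
]
TERMINALS = {"d", "n", "v", "p", "$"}

def first(sym):
    """Terminals that can begin a string derived from sym."""
    if sym in TERMINALS:
        return {sym}
    out = set()
    for lhs, rhs in RULES:
        if lhs == sym and rhs[0] != sym:   # skip the left-recursive case
            out |= first(rhs[0])
    return out

def follow(target, seen=None):
    """Terminals that may appear immediately after target in a derivation."""
    seen = set() if seen is None else seen
    if target in seen:                     # guard against mutual recursion
        return set()
    seen.add(target)
    out = set()
    for lhs, rhs in RULES:
        for i, sym in enumerate(rhs):
            if sym == target:
                if i + 1 < len(rhs):
                    out |= first(rhs[i + 1])
                else:
                    out |= follow(lhs, seen)
    return out
```

So a completed item np --> d n . would be reduced only on seeing v, p, or $, which matches the intuition from the lookahead sets shown two slides back.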

Lookahead
Prolog code to compute the LR(1) machine is given on the course homepage:
– see file lr1.pl
Procedure, after loading the grammar and lr1.pl:
  ?- computeF.        (extra step: compute first sets)
  ?- build.           (build goto table)
  ?- buildActions.

LR(1) Machine
Toy grammar revisited:
– number of states: 20 (cf. 14 for LR(0))
– almost deterministic: the grammar is almost LR(1); only 3 state/lookahead combinations have a conflict

LR(0) vs. LR(1)
The LR(1) parser-generator disambiguates using one symbol of lookahead:
– this allows it to improve the determinacy of the parser
– theoretical result: LR(1) is optimal (for one symbol of lookahead)
– the LR(0) grammars are a subset of the LR(1) grammars: LR(0) ⊂ LR(1)

LR(1) Action Table
(table cells garbled in transcription — see machine1.pl for the generated actions)
Legend: S = shift, R = reduce, A = accept; empty cells = error states

LR Parsing
In fact:
– LR parsers are generally acknowledged to be the fastest parsers, especially when combined with the chart technique (to be described today)
– reference: (Tomita, 1985)
– the textbook's Earley algorithm uses a chart but builds dotted-rule configurations dynamically at parse time instead of ahead of time (so it is slower than LR)

Homework 6
Don't panic:
– all the Prolog code is supplied
– you just have to run it
Goal:
– test understanding of the ideas behind the algorithms discussed in class

LR Code
Grammar:
– grammar0.pl
Machine generators:
– lr0.pl
– lr1.pl
Generated machines:
– LR(0): machine0.pl
– LR(1): machine1.pl

  :- dynamic rule/2.
  nonT(ss). nonT(s). nonT(np). nonT(vp). nonT(pp).
  term(n). term(v). term(p). term(d). term($).
  start(ss).
  rule(ss,[s,$]).
  rule(s,[np,vp]).
  rule(np,[d,n]).
  rule(np,[n]).
  rule(np,[np,pp]).
  rule(vp,[v,np]).
  rule(vp,[v]).
  rule(vp,[vp,pp]).
  rule(pp,[p,np]).
  lexicon(the,d). lexicon(a,d).
  lexicon(man,n). lexicon(john,n). lexicon(telescope,n).
  lexicon(saw,v). lexicon(runs,v).
  lexicon(with,p).

Homework 6
Question 1 (5pts): compare the LR(0) and LR(1) algorithms
– run both the LR(0) and LR(1) machines on the sentence John saw the man with a telescope, looking for all answers
– compare the number of calls to the predicate machine
– which one makes the fewer calls? by how many?
Question 2 (5pts): LR(1) Action Table
– states 14 and 15 in machine1.pl are very similar:
  14: np --> np pp . / [p,$]
  15: np --> d n . / [p,$]
– can they be merged? explain your answer