Bernd Fischer RW713: Compiler and Software Language Engineering.


Bottom-Up Parsing

Top-down vs. bottom-up parsing

Grammar: Ex → Nat | (Ex) | Ex + Ex | Ex * Ex

Top-down parse of Nat + Nat * Nat:
Ex
⇒ Ex + Ex
⇒ Nat + Ex
⇒ Nat + Ex * Ex
⇒ Nat + Nat * Ex
⇒ Nat + Nat * Nat

Matched input string ⇒ success! Corresponds to a leftmost derivation, hence LL.

Top-down vs. bottom-up parsing

Grammar: Ex → Nat | (Ex) | Ex + Ex | Ex * Ex

Bottom-up parse of Nat + Nat * Nat:
Nat + Nat * Nat
Ex + Nat * Nat
Ex + Ex * Nat
Ex + Ex * Ex
Ex + Ex
Ex

Reached start symbol ⇒ success! Corresponds to a rightmost derivation (in reverse), hence LR.

Shift-Reduce Parsing

REMINDER: Top-down parsing searches for the (leftmost) derivation using a stack.

Use a parse stack to represent the derivation:
initialize s = S
if x = ε
–if s = ε then accept else reject
if tos ∈ T
–if x_i = tos then pop; skip x_i else reject
if tos ∈ N
–pick a production tos → α in P; pop; push(α) symbol by symbol, in reverse order

The parser stack can be explicit or implicit.

Shift-reduce parsing searches for the (rightmost) derivation using a stack.

Use a parse stack to represent the derivation:
initialize s = ε
if x = ε
–if s = S then accept else reject
shift: push(x_i)
reduce: if s = αX_1 X_2 ... X_n
–pick a production A → X_1 X_2 ... X_n in P; pop n; push(A)

Open choices: shift or reduce? Which production?

The parser stack is typically explicit.
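The shift and reduce steps above leave the choice of action open. A minimal sketch that resolves the choice by exhaustive search is shown below; it uses the Ex grammar from the earlier slides, and the names GRAMMAR, START and shift_reduce_accepts are illustrative assumptions, not part of the lecture. It only shows why the choice needs to be constrained and is not how LR parsers work in practice.

```python
from functools import lru_cache

# Grammar from the slides: Ex -> Nat | ( Ex ) | Ex + Ex | Ex * Ex
GRAMMAR = [
    ("Ex", ("Nat",)),
    ("Ex", ("(", "Ex", ")")),
    ("Ex", ("Ex", "+", "Ex")),
    ("Ex", ("Ex", "*", "Ex")),
]
START = "Ex"


def shift_reduce_accepts(tokens):
    """Return True if some sequence of shift/reduce steps accepts the tokens."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def search(stack, pos):
        # Accept: input exhausted and the stack holds exactly the start symbol.
        if pos == len(tokens) and stack == (START,):
            return True
        # Reduce: try every production whose right-hand side is on top of the stack.
        for lhs, rhs in GRAMMAR:
            n = len(rhs)
            if len(stack) >= n and stack[len(stack) - n:] == rhs:
                if search(stack[:len(stack) - n] + (lhs,), pos):
                    return True
        # Shift: push the next input symbol.
        if pos < len(tokens):
            return search(stack + (tokens[pos],), pos + 1)
        return False

    return search((), 0)
```

For example, shift_reduce_accepts(["Nat", "+", "Nat", "*", "Nat"]) succeeds, while a truncated input such as ["Nat", "+"] is rejected after all choices have been tried.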

[Figure: schematic syntax tree with α ∈ (N ∪ T)*, x, y ∈ T*, a ∈ T, and start symbol S, showing the stack and the read pointer.]

In each configuration the parser can either "shift a" or "reduce with A → γ". We need to constrain this choice.

Shift-reduce parsing maintains a viable prefix on the stack.

Definition: Let G = (N, T, P, S) be a context-free grammar and S ⇒r* βAy ⇒r βγy, where ⇒r denotes a rightmost derivation step. Then γ is called a handle or redex of the right-sentential form βγy. Each prefix of βγ is called a viable prefix of G.

Shift-reduce parsing invariants: The parser stack s is a viable prefix, and S ⇒r* sy for the remaining input y.

Theorem: The language of viable prefixes of a grammar G is regular.

Corollary: We can build and use a DFA to recognize viable prefixes and so constrain the choice of a shift-reduce parser.

LR(0) Parsing

LR(0) items

Parsing with an NFA over LR(0) items.

Constructing the LR(0) NFA

Let G = (N, T, P, S) be a context-free grammar.
–For each nonterminal A ∈ N, construct the item automaton.
–Build the union of the item automata: the start state is the start state of the item automaton for S, the final states are the final states of the item automata.
–Add transitions from each state which contains the dot in front of a nonterminal A to the start state of the item automaton of A.

Theorem: The automaton obtained in this way accepts exactly the language of viable prefixes of G if all states are declared to be final.

Constructing the LR(0) NFA

Constructing the LR(0) DFA

Direct Construction of the LR(0) DFA Needs closure operation on itemsets

Direct Construction of the LR(0) DFA Needs goto operation to represent transition relation
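The closure and goto operations named above can be sketched as follows. The grammar (S' → S, S → ( S ) | x) is a hypothetical example chosen for brevity, not one from the slides, and the item encoding (lhs, rhs, dot) and the function names are likewise assumptions made for illustration.

```python
from collections import deque

# Hypothetical example grammar: S' -> S, S -> ( S ) | x
GRAMMAR = {
    "S'": [("S",)],
    "S": [("(", "S", ")"), ("x",)],
}


def closure(items):
    """Add B -> . gamma for every nonterminal B that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for production in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], production, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in all items that allow it, then close."""
    moved = {
        (lhs, rhs, dot + 1)
        for lhs, rhs, dot in state
        if dot < len(rhs) and rhs[dot] == symbol
    }
    return closure(moved)


def lr0_states(start="S'"):
    """Build all reachable itemsets and the transition relation between them."""
    initial = closure({(start, rhs, 0) for rhs in GRAMMAR[start]})
    states = [initial]
    index = {initial: 0}
    transitions = {}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        symbols = sorted({rhs[dot] for _, rhs, dot in state if dot < len(rhs)})
        for symbol in symbols:
            target = goto(state, symbol)
            if target not in index:
                index[target] = len(states)
                states.append(target)
                queue.append(target)
            transitions[(index[state], symbol)] = index[target]
    return states, transitions
```

Calling lr0_states() returns the list of itemsets (the DFA states) together with a dictionary mapping (state number, grammar symbol) to the successor state number. Itemsets are frozensets so that they can serve as dictionary keys when checking whether a state has been seen before.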

Direct Construction of the LR(0) DFA Example:

Parsing with the LR(0) DFA

In principle the LR(0) DFA can be used for parsing:
–run the DFA over the sentential form until an accepting state is reached
–apply the accepting rule from the itemset to reduce the tail of the viable prefix (the redex)
–re-run the DFA over the new sentential form until an accepting state is reached
–apply the accepting rule from the itemset to reduce the tail of the viable prefix
–...

⇒ instead: use a pushdown automaton

LR(0) Pushdown Automata

Basic ideas:
states == itemsets (conceptually)
uses two stacks:
–states
–grammar elements (usually ignored)
uses four kinds of actions per state:
–shift: push current input symbol
–reduce(rule): reduce with rule
–accept
–error: default
uses goto-table: state × grammar symbol → state

LR(0) Pushdown Automata

Basic loop:
state contains shift item A → α ● aβ ("dot before terminal")
–check that x_i = a; syntax error if not
–push a on symbol stack
–push goto[tos, a] on state stack
state contains reduce item A → α ● ("dot at end")
–pop |α| elements off symbol stack
–pop |α| elements off state stack
–push A on symbol stack
–push goto[tos, A] on state stack
–accept if A = S and x = ε
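The basic loop above can be sketched as a small table-driven driver. The ACTION and GOTO tables below were built by hand for a hypothetical grammar (S' → S, S → ( S ) | x) that does not appear in the slides, and the state numbering is an arbitrary choice. The sketch keeps only the state stack, since the symbol stack is not needed to drive the automaton, and it signals acceptance through a dedicated accept state for the item S' → S ●.

```python
# Hand-built LR(0) tables for the hypothetical grammar S' -> S, S -> ( S ) | x.
# In LR(0) the action depends on the state only.
ACTION = {
    0: ("shift",),
    1: ("accept",),          # contains S' -> S .
    2: ("shift",),
    3: ("reduce", "S", 1),   # S -> x .
    4: ("shift",),
    5: ("reduce", "S", 3),   # S -> ( S ) .
}
GOTO = {
    (0, "S"): 1, (0, "("): 2, (0, "x"): 3,
    (2, "S"): 4, (2, "("): 2, (2, "x"): 3,
    (4, ")"): 5,
}


def lr0_parse(tokens):
    """Return the list of reductions performed, or None on a syntax error."""
    states = [0]
    pos = 0
    reductions = []
    while True:
        action = ACTION[states[-1]]
        if action[0] == "shift":
            if pos >= len(tokens):
                return None                      # input ended too early
            target = GOTO.get((states[-1], tokens[pos]))
            if target is None:
                return None                      # empty entry: syntax error
            states.append(target)
            pos += 1
        elif action[0] == "reduce":
            _, lhs, length = action
            del states[-length:]                 # pop |alpha| states
            states.append(GOTO[(states[-1], lhs)])
            reductions.append((lhs, length))
        else:                                    # accept state reached
            return reductions if pos == len(tokens) else None
```

For the input ( ( x ) ) the driver reports one reduction with S → x followed by two reductions with S → ( S ), which is the rightmost derivation in reverse.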

Parsing with the LR(0) PDA

Does this always work...? No: an LR(0) state can contain a shift/reduce conflict or a reduce/reduce conflict.

Recess Refresher

Pop-Quiz... Remember: Check whether Γ5 is LR(0)!

SLR(1)-Parsing

Does this always work...? The state has both a reduce/reduce conflict and a shift/reduce conflict.

The grammar tells us to…
–… reduce to A if the next input is a
–… reduce to B if the next input is b
–… shift if the next input is c

⇒ use follow sets

SLR(1) parsing uses follow sets to resolve conflicts in an LR(0) state.

follow(A) = {a}
follow(B) = {b}
follow(A) ∩ follow(B) = ∅ ⇒ use next token to pick rule ⇒ resolves reduce/reduce conflict
c ∉ follow(A) ∪ follow(B) ⇒ resolves shift/reduce conflict

SLR(1) Grammars

Definition: Let G = (N, T, P, S) be a context-free grammar and I be a state of the LR(0) DFA for G. I has an SLR(1) conflict iff I contains
–two different reduce items A → α ● and B → β ● such that follow(A) ∩ follow(B) ≠ ∅; or
–two items A → α ● and B → β ● aγ such that a ∈ follow(A).

G is an SLR(1) grammar if no state of its LR(0) DFA has an SLR(1) conflict.
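The definition can be checked mechanically for a single state. The sketch below assumes that items are encoded as (lhs, rhs, dot) triples, that the follow sets are given as a dictionary with one entry per nonterminal, and that any symbol without an entry is a terminal. The concrete items are hypothetical, since the grammar behind the slide's example state is not reproduced in the transcript; only the follow sets are taken from the slides.

```python
def slr1_conflicts(state, follow):
    """List the SLR(1) conflicts of one LR(0) state, following the definition."""
    conflicts = []
    reduce_items = [(lhs, rhs) for lhs, rhs, dot in state if dot == len(rhs)]
    # Terminals that appear directly after a dot, i.e. possible shifts.
    shift_terminals = {
        rhs[dot]
        for _, rhs, dot in state
        if dot < len(rhs) and rhs[dot] not in follow
    }
    for i, (a, _) in enumerate(reduce_items):
        # Two different reduce items whose follow sets overlap.
        for b, _ in reduce_items[i + 1:]:
            common = follow[a] & follow[b]
            if common:
                conflicts.append(("reduce/reduce", a, b, common))
        # A reduce item and a shift item on a terminal in follow(A).
        overlap = shift_terminals & follow[a]
        if overlap:
            conflicts.append(("shift/reduce", a, overlap))
    return conflicts


# Hypothetical state: A -> x . , B -> x . , C -> x . c
STATE = [("A", ("x",), 1), ("B", ("x",), 1), ("C", ("x", "c"), 1)]
```

With follow(A) = {a} and follow(B) = {b} the state has no SLR(1) conflict, whereas follow(A) = {a, b} and follow(B) = {b} yields a reduce/reduce conflict on b.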

LR(0) vs. SLR(1)

LR(0):
–uses sets of LR(0) items as states
–uses GOTO[state, grammar symbol] as transitions
–actions depend on state only

SLR(1):
–uses sets of LR(0) items as states
–uses GOTO[state, grammar symbol] as transitions
–actions depend on state and next input token

xLR(1) parsing tables

Different LR(1) parsing variants use the same tables:
–empty ACTION entry == syntax error
–empty GOTO entry == can't happen

SLR(1) tables can easily be constructed from the LR(0) DFA via the SLR(1) definition.

Construction of SLR(1) tables

Does this always work...?

follow(A) = {a,b}
follow(B) = {b}
Both a ∈ follow(A) and b ∈ follow(A), so follow(A) ∩ follow(B) = {b} ≠ ∅ and the conflict remains.

The follow sets...
–are a global approximation of possible continuations
–ignore (left) derivation context

Canonical LR(1)-Parsing

LR(1) items are pairs of LR(0) items and a look-ahead symbol.

LR(1) item computation 4. LR(1) property same as SLR(1) property (but uses lookaheads from LR(1) items)
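The lookahead computation for LR(1) items can be sketched with the standard closure rule: for an item [A → α ● Bβ, a], add [B → ● γ, b] for every production B → γ and every b ∈ first(βa). The grammar below (S' → S, S → C C, C → c C | d) is a hypothetical textbook-style example, not taken from the slides, and it has no ε-productions, which lets the sketch use a simplified first computation; a general implementation would also need to handle nullable symbols.

```python
# Hypothetical grammar without epsilon-productions:
# S' -> S, S -> C C, C -> c C | d
GRAMMAR = {
    "S'": [("S",)],
    "S": [("C", "C")],
    "C": [("c", "C"), ("d",)],
}


def first_of_symbol(symbol, seen=frozenset()):
    """first set of one symbol; valid only because no production is empty."""
    if symbol not in GRAMMAR:
        return {symbol}                  # a terminal starts with itself
    if symbol in seen:
        return set()                     # guard against left recursion
    result = set()
    for rhs in GRAMMAR[symbol]:
        result |= first_of_symbol(rhs[0], seen | {symbol})
    return result


def closure_lr1(items):
    """Close a set of LR(1) items (lhs, rhs, dot, lookahead)."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, lookahead = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            beta = rhs[dot + 1:]
            # first(beta a): beta cannot vanish, so only its first symbol matters.
            lookaheads = first_of_symbol(beta[0]) if beta else {lookahead}
            for production in GRAMMAR[rhs[dot]]:
                for b in lookaheads:
                    item = (rhs[dot], production, 0, b)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)
```

Closing the start item [S' → ● S, $] yields six items: the items for C carry the lookaheads c and d (from first(C)), while the item for S inherits $.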

LR(1) DFA construction - example

same LR(0) item in different LR(1) states
⇒ different look-aheads reflect different derivation contexts
⇒ state-splitting removes SLR(1) conflict

LALR(1)-Parsing

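LALR(1) DFA construction - conceptual

Conceptually, LR(1) states with the same LR(0) items are merged, and each merged item gets the union of the lookahead sets.

LALR(1) tables can be constructed directly, without going via the LR(1) states. [ASLU, 4.7.5]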

LALR(1) DFA construction - conceptual 4. LALR(1) property same as SLR(1) property (but uses lookaheads from merged LR(1) items)
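The conceptual merging step can be sketched as follows, assuming LR(1) states are given as sets of (lhs, rhs, dot, lookahead) items. The function names core and merge_lalr and the sample states are illustrative assumptions. The sketch merges the itemsets only; a full construction would also redirect the transitions to the merged states.

```python
from collections import defaultdict


def core(state):
    """The LR(0) core of an LR(1) state: its items without lookaheads."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)


def merge_lalr(lr1_states):
    """Merge LR(1) states with equal cores; lookaheads are unioned per item."""
    merged = defaultdict(set)
    for state in lr1_states:
        merged[core(state)] |= set(state)
    return [frozenset(items) for items in merged.values()]


# Hypothetical LR(1) states: the first two share the core { C -> d . }.
LR1_STATES = [
    frozenset({("C", ("d",), 1, "c"), ("C", ("d",), 1, "d")}),
    frozenset({("C", ("d",), 1, "$")}),
    frozenset({("S", ("C", "C"), 2, "$")}),
]
```

Here merge_lalr(LR1_STATES) returns two states: the merged state holds C → d ● with lookaheads c, d and $, and the state for S → C C ● is unchanged.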

LALR(1) DFA construction - example

Merging LR(1) states can introduce new reduce/reduce conflicts; such conflicts are rare, though.

“Hacking” xLR(1) tables

Handling precedence and associativity Idea: remove conflicting parse table entries
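One common way to decide which conflicting entry to remove is the convention used by yacc-style generators: compare the precedence of the rule to be reduced with that of the next token, and fall back on associativity when they are equal. The sketch below assumes a hypothetical PRECEDENCE table and identifies each rule by its operator; the "^" entry is added only to show right associativity and is not part of the slides' grammar.

```python
# Hypothetical precedence table: higher level binds tighter.
PRECEDENCE = {
    "+": (1, "left"),
    "*": (2, "left"),
    "^": (3, "right"),
}


def resolve(rule_operator, lookahead):
    """Decide a shift/reduce conflict between a rule and the next token.

    Returns the table entry to keep: "shift", "reduce", or "error".
    """
    rule_level, _ = PRECEDENCE[rule_operator]
    token_level, associativity = PRECEDENCE[lookahead]
    if rule_level > token_level:
        return "reduce"      # the rule on the stack binds tighter
    if rule_level < token_level:
        return "shift"       # the incoming operator binds tighter
    # Same level: associativity decides.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[associativity]
```

With Ex + Ex on the stack and * as the next token, resolve("+", "*") keeps the shift entry, which matches the bottom-up parse of Nat + Nat * Nat shown earlier; resolve("+", "+") keeps the reduce entry and so makes + left-associative.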

Error handling

Idea: add specific error handlers into the ACTION table.
Default: report the legal tokens (i.e., those that have a non-error entry).

xLR comparison

LR vs LL

LR(0) vs SLR(1) vs LALR(1) vs LR(1)

LALR(1) is the method of choice (yacc and friends).

Formal Language Theory

Classes of Grammars

Classes of Languages