Mooly Sagiv and Roman Manevich School of Computer Science

Presentation transcript:

Winter 2006-2007 Compiler Construction T3 – Syntax Analysis (Parsing, part 1 of 2) Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv University

Material taught in lecture
  The task of syntax analysis
  Error handling
  Context-free grammars
  Derivations and parse trees
  Leftmost derivation, rightmost derivation
  Parse trees
  Ambiguous grammars
  Rewriting grammars
  Top-down (predictive) parsing
  LL(k) / recursive descent
  Top-down vs. bottom-up parsing

Today
  [Figure: compiler pipeline. IC language, lexical analysis, syntax analysis (parsing), AST, symbol table etc., intermediate representation (IR), code generation, executable code (exe).]
  Today: review (grammars, parse trees, ambiguity; top-down parsing); bottom-up parsing
  Next week: conflict resolution; shift/reduce parsing via JavaCup; (error handling); AST intro; PA2

Goals of parsing
  A programming language has syntactic rules: context-free grammars
  Decide whether a program satisfies the syntactic structure: error detection, error recovery
  Simplification: rules on tokens
  Build the abstract syntax tree

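From text to abstract syntax
  Program text: 5 + (7 * x)
  The lexical analyzer produces the token stream: num + ( num * id )
  Grammar: E → id, E → num, E → E + E, E → E * E, E → ( E )
  The parser either reports a syntax error or accepts the input as valid and builds a parse tree
  Abstract syntax tree: + with children num 5 and *, where * has children num 7 and id x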
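The lexical analysis step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token patterns and the names TOKEN_SPEC and tokenize are hypothetical and are not taken from the course's lexer.

```python
import re

# Hypothetical token patterns for the slide's expression language.
TOKEN_SPEC = [
    ("num", re.compile(r"\d+")),
    ("id", re.compile(r"[A-Za-z_]\w*")),
    ("+", re.compile(r"\+")),
    ("*", re.compile(r"\*")),
    ("(", re.compile(r"\(")),
    (")", re.compile(r"\)")),
    ("skip", re.compile(r"\s+")),
]


def tokenize(text):
    """Turn program text into a list of (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        for kind, pattern in TOKEN_SPEC:
            match = pattern.match(text, pos)
            if match:
                if kind != "skip":  # whitespace produces no token
                    tokens.append((kind, match.group()))
                pos = match.end()
                break
        else:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
    return tokens


kinds = [kind for kind, _ in tokenize("5 + (7 * x)")]
print(" ".join(kinds))
```

The parser then works on the token kinds only, which is the "rules on tokens" simplification.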

Terminology
  Symbols: terminals (tokens) + * ( ) id num; non-terminals E
  Grammar rules: E → id, E → num, E → E + E, E → E * E, E → ( E )
  Derivation: E ⇒ E + E ⇒ 1 + E ⇒ 1 + E * E ⇒ 1 + 2 * E ⇒ 1 + 2 * 3
  Parse tree: E( E(1) + E( E(2) * E(3) ) )

Ambiguity
  Grammar rules: E → id, E → num, E → E + E, E → E * E, E → ( E )
  Leftmost derivation: E ⇒ E + E ⇒ 1 + E ⇒ 1 + E * E ⇒ 1 + 2 * E ⇒ 1 + 2 * 3
    Parse tree: E( E(1) + E( E(2) * E(3) ) )
  Rightmost derivation: E ⇒ E * E ⇒ E * 3 ⇒ E + E * 3 ⇒ E + 2 * 3 ⇒ 1 + 2 * 3
    Parse tree: E( E( E(1) + E(2) ) * E(3) )
  The same string has two different parse trees, so the grammar is ambiguous

Grammar rewriting
  Ambiguous grammar: E → id, E → num, E → E + E, E → E * E, E → ( E )
  Non-ambiguous grammar: E → E + T, E → T, T → T * F, T → F, F → id, F → ( E )
  Derivation: E ⇒ E + T ⇒* 1 + T ⇒ 1 + T * F ⇒ 1 + F * F ⇒ 1 + 2 * F ⇒ 1 + 2 * 3
  Parse tree: E( E(T(F(1))) + T( T(F(2)) * F(3) ) )

Parsing methods
  Top-down / predictive / recursive descent without backtracking: LL(1)
    “L”: left-to-right scan of input
    “L”: leftmost derivation
    “1”: predict based on one token of look-ahead
    For every non-terminal and token, predict the next production
  Bottom-up: LR(0), SLR(1), LR(1), LALR(1)
    “R”: rightmost derivation (in reverse order)
    For every potential right-hand side and token, decide when a production is found

Top-down parsing
  Builds the parse tree in preorder
  LL(1) example
  Grammar: S → if E then S else S, S → begin S L, S → print E, L → end, L → ; S L, E → num
  Input: if 5 then print 8 else…
  Token : rule : sentential form
    if : S → if E then S else S : if E then S else S
    5 : E → num : if 5 then S else S
    print : S → print E : if 5 then print E else S
    …
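The LL(1) example above can be sketched as a recursive-descent recognizer with one procedure per non-terminal. This is a minimal sketch, not the course's implementation; the helper names lex, Parser, and parse and the trace format are assumptions made for illustration.

```python
class ParseError(Exception):
    pass


def lex(text):
    # Hypothetical helper: whitespace-separated tokens, digits become "num".
    return ["num" if t.isdigit() else t for t in text.split()] + ["$"]


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.trace = []  # rules predicted, in preorder

    def peek(self):
        return self.tokens[self.pos]

    def eat(self, expected):
        if self.peek() != expected:
            raise ParseError(f"expected {expected!r}, got {self.peek()!r}")
        self.pos += 1

    def S(self):
        token = self.peek()  # one token of look-ahead selects the rule
        if token == "if":
            self.trace.append("S -> if E then S else S")
            self.eat("if"); self.E(); self.eat("then")
            self.S(); self.eat("else"); self.S()
        elif token == "begin":
            self.trace.append("S -> begin S L")
            self.eat("begin"); self.S(); self.L()
        elif token == "print":
            self.trace.append("S -> print E")
            self.eat("print"); self.E()
        else:
            raise ParseError(f"no rule for S on {token!r}")

    def L(self):
        token = self.peek()
        if token == "end":
            self.trace.append("L -> end")
            self.eat("end")
        elif token == ";":
            self.trace.append("L -> ; S L")
            self.eat(";"); self.S(); self.L()
        else:
            raise ParseError(f"no rule for L on {token!r}")

    def E(self):
        self.trace.append("E -> num")
        self.eat("num")


def parse(text):
    parser = Parser(lex(text))
    parser.S()
    parser.eat("$")  # the whole input must be consumed
    return parser.trace


for rule in parse("if 5 then print 8 else print 9"):
    print(rule)
```

The first three rules in the trace match the Token : rule table on the slide.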

Problem: left recursion
  Left recursion: E → E + T (the symbol on the left is also the first symbol on the right)
  Predictive parsing fails when two rules can start with the same token: E → E + T, E → T
  Rewrite the grammar to eliminate left recursion
  Nullable, FIRST, FOLLOW sets
  Arithmetic expressions: E → E + T, E → T, T → T * F, T → F, F → id, F → ( E )
  Grammar without left recursion: E → T Ep, Ep → + T Ep, Ep → ε, T → F Tp, Tp → * F Tp, Tp → ε, F → id, F → ( E )
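The rewrite on this slide follows the standard scheme for immediate left recursion: A → A α | β becomes A → β Ap, Ap → α Ap | ε. A minimal sketch, assuming each right-hand side is a list of symbols and an empty list stands for ε; the function name is hypothetical, and indirect left recursion is not handled.

```python
def eliminate_left_recursion(nonterminal, productions):
    """Remove immediate left recursion from the rules of one non-terminal.

    productions is a list of right-hand sides, each a list of symbols.
    Returns a dict from non-terminal to its new right-hand sides.
    """
    new_nt = nonterminal + "p"  # naming mirrors Ep / Tp on the slide
    # alpha parts of the rules A -> A alpha
    recursive = [rhs[1:] for rhs in productions if rhs and rhs[0] == nonterminal]
    # beta parts of the rules A -> beta
    others = [rhs for rhs in productions if not rhs or rhs[0] != nonterminal]
    if not recursive:
        return {nonterminal: productions}
    return {
        nonterminal: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[]],  # [] is epsilon
    }


grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["id"], ["(", "E", ")"]],
}
for nt, rules in grammar.items():
    for lhs, rhss in eliminate_left_recursion(nt, rules).items():
        for rhs in rhss:
            print(lhs, "->", " ".join(rhs) if rhs else "ε")
```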

Problem: common prefix
  A non-terminal with two rules starting with the same prefix
  Grammar: S → if E then S else S, S → if E then S
  Left-factored grammar: S → if E then S X, X → ε, X → else S

Bottom-up parsing
  No problem with left recursion
  Widely used in practice
  LR(0), SLR(1), LR(1), LALR(1)
  JavaCup implements LALR(1)

Bottom-up parsing
  Grammar: E → E + ( E ), E → i
  Input: 1 + (2) + (3)
  After the first reduction (E → i): E + (2) + (3)

Shift-reduce parsing
  Parser stack: symbols (terminals and non-terminals) + automaton states
  Parsing actions: a sequence of shift and reduce operations
  The action is determined by the top of the stack and k input tokens
  Shift: move the next token to the top of the stack
  Reduce: for rule X → A B C, pop C, B, A, then push X
  Convention: $ stands for end of file

Pushdown automaton
  [Figure: the control reads the input (u t w $), consults the parser table, and maintains a stack with $ at the bottom and V on top.]

LR parsing table
  Rows: states. Columns: terminals (shift/reduce actions) and non-terminals (goto part).
  sn: shift and move to state n
  rk: reduce by rule k
  gm: goto state m

Parsing table example
  Grammar: E → E + T, E → T, T → i, T → ( E )

  STATE   i     +     (     )     $     E     T
  0       s5    err   s7    err   err   1     6
  1       err   s3    err   err   s2
  2       accept
  3       s5    err   s7    err   err         4
  4       reduce E → E + T
  5       reduce T → i
  6       reduce E → T
  7       s5    err   s7    err   err   8     6
  8       err   s3    err   s9    err
  9       reduce T → ( E )
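A table-driven driver for this example might look as follows. This is a sketch under assumptions: the dictionaries encode the table as read from the slide (missing action entries are errors), states 4, 5, 6 and 9 reduce on every token as in LR(0), and the names ACTION, GOTO, REDUCE, and parse are illustrative only.

```python
# Rule name -> (left-hand side, length of right-hand side)
RULES = {
    "E -> E + T": ("E", 3),
    "E -> T": ("E", 1),
    "T -> i": ("T", 1),
    "T -> ( E )": ("T", 3),
}
# Shift actions: ACTION[state][token] = next state. Missing entries are errors.
ACTION = {
    0: {"i": 5, "(": 7},
    1: {"+": 3, "$": 2},
    3: {"i": 5, "(": 7},
    7: {"i": 5, "(": 7},
    8: {"+": 3, ")": 9},
}
# States that reduce regardless of the next token (LR(0)).
REDUCE = {4: "E -> E + T", 5: "T -> i", 6: "E -> T", 9: "T -> ( E )"}
GOTO = {0: {"E": 1, "T": 6}, 3: {"T": 4}, 7: {"E": 8, "T": 6}}
ACCEPT_STATE = 2


def parse(tokens):
    """Return the list of reductions performed, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = [0]  # stack of automaton states
    pos = 0
    reductions = []
    while True:
        state = stack[-1]
        if state == ACCEPT_STATE:
            return reductions
        if state in REDUCE:
            rule = REDUCE[state]
            lhs, length = RULES[rule]
            del stack[-length:]  # pop one state per right-hand-side symbol
            stack.append(GOTO[stack[-1]][lhs])
            reductions.append(rule)
            continue
        token = tokens[pos]
        target = ACTION.get(state, {}).get(token)
        if target is None:
            raise SyntaxError(f"unexpected {token!r} in state {state}")
        stack.append(target)  # shift
        pos += 1


for rule in parse(["i", "+", "(", "i", ")"]):
    print("reduce", rule)
```

Read backwards, the printed reductions form a rightmost derivation of the input.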

Items
  Items indicate the position inside a rule
  LR(0) items are of the form A → α • t β
  (LR(1) items are of the form A → α • t β plus a look-ahead token)
  Grammar: S → E $, E → T, E → E + T, T → i, T → ( E )
  1: S → • E $
  2: S → E • $
  3: S → E $ •
  4: E → • T
  5: E → T •
  6: E → • E + T
  7: E → E • + T
  8: E → E + • T
  9: E → E + T •
  10: T → • i
  11: T → i •
  12: T → • ( E )
  13: T → ( • E )
  14: T → ( E • )
  15: T → ( E ) •

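Automaton states
  State 0: 1: S → • E $, 4: E → • T, 6: E → • E + T, 10: T → • i, 12: T → • ( E ); on E go to 1, on T go to 6, on i go to 5, on ( go to 7
  State 1: 2: S → E • $, 7: E → E • + T; on $ go to 2, on + go to 3
  State 2: 3: S → E $ •
  State 3: 8: E → E + • T, 10: T → • i, 12: T → • ( E ); on T go to 4, on i go to 5, on ( go to 7
  State 4: 9: E → E + T •
  State 5: 11: T → i •
  State 6: 5: E → T •
  State 7: 13: T → ( • E ), 4: E → • T, 6: E → • E + T, 10: T → • i, 12: T → • ( E ); on E go to 8, on T go to 6, on i go to 5, on ( go to 7
  State 8: 14: T → ( E • ), 7: E → E • + T; on ) go to 9, on + go to 3
  State 9: 15: T → ( E ) •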

Identifying handles
  Create a finite state automaton over grammar symbols; its states are sets of LR(0) items
  Use the automaton to build the parser tables
    shift: for items A → α • t β, on token t
    reduce: for items A → α •, on every token
  Any grammar has a transition diagram and a GOTO table
  Not every grammar has a deterministic action table
  When no conflicts occur, use a DPDA which pushes states on the stack
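The construction of the automaton from sets of LR(0) items can be sketched with the usual closure and goto operations. This is an illustrative sketch, not the course's code: items are represented as (rule index, dot position) pairs, the function names are assumptions, and the states are numbered in discovery order, which need not match the numbering on the slides.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add A -> . gamma for every non-terminal A that follows a dot."""
    result = set(items)
    work = list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over symbol in every item where that is possible."""
    moved = {(rule, dot + 1) for rule, dot in state
             if dot < len(GRAMMAR[rule][1]) and GRAMMAR[rule][1][dot] == symbol}
    return closure(moved)


def build_automaton():
    start = closure({(0, 0)})
    states = [start]
    transitions = {}
    work = [start]
    while work:
        state = work.pop()
        symbols = {GRAMMAR[rule][1][dot] for rule, dot in state
                   if dot < len(GRAMMAR[rule][1])}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
                work.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions


def has_conflict(state):
    """LR(0) conflict: a complete item together with any other item."""
    complete = [1 for rule, dot in state if dot == len(GRAMMAR[rule][1])]
    return bool(complete) and len(state) > 1


states, transitions = build_automaton()
print(len(states), "states,", len(transitions), "transitions")
print("conflicts:", any(has_conflict(s) for s in states))
```

For this grammar no state mixes a complete item with other items, so the action table is deterministic.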

Non-LR(0) grammars
  When conflicts occur the grammar is not LR(0): the parsing table contains non-determinism
    shift-reduce conflicts
    reduce-reduce conflicts
    shift-shift conflicts?
  Known cases: operator precedence, operator associativity, dangling if-then-else, unary minus
  Solutions
    Develop an equivalent non-ambiguous grammar
    Patch the parsing table to shift/reduce
    Precedence and associativity of tokens
    Stronger parser algorithm: SLR/LR(1)/LALR(1)

Precedence and associativity
  Precedence: E + E is on the stack and the next token is * (E → E + E • * E versus E → E + E •)
    Reduce: + precedes *, so 1 + 2 * 3 is parsed as (1 + 2) * 3 = 9
    Shift: * precedes +, so 1 + 2 * 3 is parsed as 1 + (2 * 3) = 7
  Associativity: E + E is on the stack and the next token is + (E → E + E • + E versus E → E + E •)
    Shift: + is right-associative
    Reduce: + is left-associative
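The shift/reduce decision on this slide can be illustrated with a small operator-precedence parser. This is a sketch of the idea only, not how JavaCup patches its LALR(1) table; the function names and the precedence dictionaries are assumptions made for the example.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def parse_with_precedence(tokens, precedence, right_assoc=()):
    """Build a tree from numbers and operators, choosing shift or reduce."""
    values, ops = [], []

    def reduce():
        op = ops.pop()
        right = values.pop()
        left = values.pop()
        values.append((op, left, right))

    for token in tokens:
        if isinstance(token, int):
            values.append(token)
            continue
        # Reduce while the operator on the stack wins over the incoming one:
        # higher precedence, or equal precedence and left-associative.
        while ops and (precedence[ops[-1]] > precedence[token]
                       or (precedence[ops[-1]] == precedence[token]
                           and token not in right_assoc)):
            reduce()
        ops.append(token)  # shift
    while ops:
        reduce()
    return values[0]


def evaluate(tree):
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left), evaluate(right))


expr = [1, "+", 2, "*", 3]
print(evaluate(parse_with_precedence(expr, {"+": 2, "*": 1})))  # reduce first
print(evaluate(parse_with_precedence(expr, {"+": 1, "*": 2})))  # shift first
```

Giving + the higher precedence reproduces the reduce choice (9); giving * the higher precedence reproduces the shift choice (7). Passing right_assoc={"+"} makes the parser shift on a second +.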

Dangling else/if-else ambiguity
  Grammar: S → if E then S else S, S → if E then S, S → other
  if a then if b then e1 else e2: which interpretation should we use?
    (1) if a then { if b then e1 else e2 } -- standard interpretation
    (2) if a then { if b then e1 } else e2
  Shift/reduce conflict
  LR(1) items and their look-ahead tokens:
    S → if E then S • : else
    S → if E then S • else S : (any)
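Preferring shift over reduce attaches each else to the nearest unmatched if, which gives interpretation (1). The sketch below shows the same choice in a small recursive-descent parser; it is an illustration under assumptions, not an LR parser: parse_stmt is a hypothetical name, conditions are single tokens, and any token other than if is treated as the statement "other".

```python
def parse_stmt(tokens, pos=0):
    """Parse one statement starting at pos; return (tree, next position)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] != "if":
        return tokens[pos], pos + 1  # S -> other
    if pos + 2 >= len(tokens) or tokens[pos + 2] != "then":
        raise SyntaxError("expected 'then'")
    condition = tokens[pos + 1]
    then_branch, pos = parse_stmt(tokens, pos + 3)
    else_branch = None
    # Greedy choice (the "shift"): take the else now if one is available.
    if pos < len(tokens) and tokens[pos] == "else":
        else_branch, pos = parse_stmt(tokens, pos + 1)
    return ("if", condition, then_branch, else_branch), pos


tree, _ = parse_stmt("if a then if b then e1 else e2".split())
print(tree)
```

The else branch ends up inside the inner if, and the outer if has no else branch.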

See you next week

Grammar hierarchy
  [Figure: nested grammar classes. LR(0) ⊂ SLR(1) ⊂ LALR(1) ⊂ non-ambiguous CFG; LL(1) also lies inside the non-ambiguous CFGs.]