Syntax Analysis – Part II Quick Look at Using Bison Top-Down Parsers EECS 483 – Lecture 5 University of Michigan Wednesday, September 20, 2006.

Slides:

Advertisements

Similar presentations

Parsing V: Bottom-up Parsing

Advertisements

Chapter 3 Syntax Analysis

Exercise: Balanced Parentheses

Chap. 5, Top-Down Parsing J. H. Wang Mar. 29, 2011.

YANGYANG 1 Chap 5 LL(1) Parsing LL(1) left-to-right scanning leftmost derivation 1-token lookahead parser generator: Parsing becomes the easiest! Modifying.

Mooly Sagiv and Roman Manevich School of Computer Science

Yacc YACC BNF grammar example.y Other modules example.tab.c Executable

6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)

Top-Down Parsing.

By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.

Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter

CS Summer 2005 Top-down and Bottom-up Parsing - a whirlwind tour June 20, 2005 Slide acknowledgment: Radu Rugina, CS 412.

COS 320 Compilers David Walker. last time context free grammars (Appel 3.1) –terminals, non-terminals, rules –derivations & parse trees –ambiguous grammars.

Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)

Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Implementation in C Chapter 3.

Professor Yihjia Tsai Tamkang University

COS 320 Compilers David Walker. last time context free grammars (Appel 3.1) –terminals, non-terminals, rules –derivations & parse trees –ambiguous grammars.

CSC3315 (Spring 2009)1 CSC 3315 Lexical and Syntax Analysis Hamid Harroud School of Science and Engineering, Akhawayn University

Parser construction tools: YACC

Compilers: Yacc/7 1 Compiler Structures Objective – –describe yacc (actually bison) – –give simple examples of its use , Semester 1,

Parsing IV Bottom-up Parsing Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.

Syntax and Semantics Structure of programming languages.

LEX and YACC work as a team

LR Parsing Compiler Baojian Hua

Top-Down Parsing - recursive descent - predictive parsing

1 Chapter 5 LL (1) Grammars and Parsers. 2 Naming of parsing techniques The way to parse token sequence L: Leftmost R: Righmost Top-down  LL Bottom-up.

Chapter 5 Top-Down Parsing.

1 Top Down Parsing. CS 412/413 Spring 2008Introduction to Compilers2 Outline Top-down parsing SLL(1) grammars Transforming a grammar into SLL(1) form.

Parsing Jaruloj Chongstitvatana Department of Mathematics and Computer Science Chulalongkorn University.

Lesson 10 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.

Profs. Necula CS 164 Lecture Top-Down Parsing ICOM 4036 Lecture 5.

Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.

Syntax and Semantics Structure of programming languages.

–Writing a parser with YACC (Yet Another Compiler Compiler). Automatically generate a parser for a context free grammar (LALR parser) –Allows syntax direct.

Prof. Necula CS 164 Lecture 8-91 Bottom-Up Parsing LR Parsing. Parser Generators. Lecture 6.

1 Using Yacc. 2 Introduction Grammar –CFG –Recursive Rules Shift/Reduce Parsing –See Figure 3-2. –LALR(1) –What Yacc Cannot Parse It cannot deal with.

YACC. Introduction What is YACC ? a tool for automatically generating a parser given a grammar written in a yacc specification (.y file) YACC (Yet Another.

1 LEX & YACC Tutorial February 28, 2008 Tom St. John.

Top-down Parsing Recursive Descent & LL(1) Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412.

Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.

1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.

PL&C Lab, DongGuk University Compiler Lecture Note, MiscellaneousPage 1 Yet Another Compiler-Compiler Stephen C. Johnson July 31, 1978 YACC.

Top-Down Parsing.

CSE 5317/4305 L3: Parsing #11 Parsing #1 Leonidas Fegaras.

Top-Down Predictive Parsing We will look at two different ways to implement a non- backtracking top-down parser called a predictive parser. A predictive.

CS 330 Programming Languages 09 / 25 / 2007 Instructor: Michael Eckmann.

LECTURE 11 Semantic Analysis and Yacc. REVIEW OF LAST LECTURE In the last lecture, we introduced the basic idea behind semantic analysis. Instead of merely.

More yacc. What is yacc – Tool to produce a parser given a grammar – YACC (Yet Another Compiler Compiler) is a program designed to compile a LALR(1) grammar.

Parsing III (Top-down parsing: recursive descent & LL(1) )

Bottom Up Parsing CS 671 January 31, CS 671 – Spring Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis.

COMP 3438 – Part II-Lecture 6 Syntax Analysis III Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.

CMSC 330: Organization of Programming Languages Pushdown Automata Parsing.

YACC (Yet Another Compiler-Compiler) Chung-Ju Wu

1 Syntax Analysis Part III Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University,

CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture Ahmed Ezzat.

Compiler Design BMZ 1 Chapter4: Syntax Analysis. Compiler Design BMZ 2 Syntax Analysis Source Program Target Program Semantic Analyser Intermediate Code.

Syntax and Semantics Structure of programming languages.

Announcements/Reading

Syntax Analysis Part III

Programming Languages Translator

Chapter 4 Syntax Analysis.

Context-free Languages

Syntax Analysis Part III

Top-Down Parsing CS 671 January 29, 2008.

Syntax Analysis Part III

Bison Marcin Zubrowski.

Syntax Analysis Part III

Subject Name:Sysytem Software Subject Code: 10SCS52

Syntax Analysis Part III

Compiler Structures 7. Yacc Objectives , Semester 2,

Presentation transcript:

Syntax Analysis – Part II Quick Look at Using Bison Top-Down Parsers EECS 483 – Lecture 5 University of Michigan Wednesday, September 20, 2006

- 1 - Reading/Announcements v Reading: Section 4.4 (top-down parsing) v Working example posted on webpage »Converts converts expressions with infix notation to expression with prefix notation »Running the example  bison –d example.y u Creates example.tab.c and example.tab.h  flex example.l u Creates lex.yy.c  g++ example.tab.c lex.yy.c –lfl u g++ required here since user code uses C++ (new,<<)  a.out < ex_input.txt

- 2 - Bison Overview declarations % rules % support code Format of.y file (same structure as lex file) foo.y foo.tab.c a.out gcc bison yyparse() is main routine

- 3 - Declarations Section v User types: As in flex, these are in a section bracketed by “%{“ and “%}” v Tokens – terminal symbols of the grammar »%token terminal1 terminal2...  Values for tokens assigned sequentially after all ASCII characters »or %token terminal1 val1 terminal2 val2... v Tip – Use ‘-d’ option in bison to get foo.tab.h that contains the token definitions that can be included in the flex file

- 4 - Declarations (2) v Start symbol »%start non-terminal v Associativity – (left, right or none) »%leftTK_PLUS »%rightTK_EXPONENT »%nonassocTK_LESSTHAN v Precedence »Order of the directives specifies precedence »%prec changes the precedence of a rule

- 5 - Declarations (3) v Attribute values – information associated with all terminal/non-terminal symbols – passed from the lexer »%union {  int ival;  char *name;  double dval; »} »Becomes YYSTYPE v Symbol attributes – types of non-terminals »%type non_terminal »Example: %type IntNumber

- 6 - Values Used by yyparse() v Error function »yyerror(char *s); v Last token value »yylval of type YYSTYPE (%union decl) v Setting yylval in flex »[a-z]{yylval.ival = yytext[0] – ‘a’; return TK_NAME;} v Then, yylval is available in bison »But in a strange way

- 7 - Rules Section v Every name appearing that has not been declared is a non-terminal v Productions »non-terminal : first_production | second_production |... ; »  production has the form  non-terminal : ;  Thus you can say, foo: production1 | /* nothing*/ ; »Adding actions  non-terminal : RHS {action routine} ;  Action called before LHS is pushed on parse stack

- 8 - Attribute Values (aka $ vars) v Each terminal/non-terminal has one v Denoted by $n where n is its rank in the rule starting by 1 »$$ = LHS »$1 = first symbol of the RHS »$2 = second symbol, etc. »Note, semantic actions have values too!!!  A: B {...} C {...} ;  C’s value is denoted by $3

- 9 - Example.y File – Partial Calculator %union { intvalue; char*symbol; } %type exp term factor %type ident... exp : exp ‘+’ term {$$ = $1 + $3; }; /* Note, $1 and $3 are ints here */ factor : ident {$$ = lookup(symbolTable, $1); }; /* Note, $1 is a char* here */

Conflicts v Bison reports the number of shift/reduce and reduce/reduce conflicts found v Shift/reduce conflicts »Occurs when there are 2 possible parses for an input string, one parse completes a rule (reduce) and one does not (shift) »Example  e:‘X’ | e ‘+’ e ;\  “X+X+X” has 2 possible parses “(X+X)+X” or “X+(X+X)”

Conflicts (2) v Reduce/reduce conflict occurs when the same token could complete 2 different rules »Example  prog : proga | progb ;  proga : ‘X’ ;  progb : ‘X’ ;  “X” can either be a proga or progb »Ambiguous grammar!!

Ambiguity Review: Class Problem S  if (E) S S  if (E) S else S S  other Anything wrong with this grammar?

Grammar for Closest-if Rule v Want to rule out: if (E) if (E) S else S v Impose that unmatched “if” statements occur only on the “else” clauses »statement  matched | unmatched »matched  if (E) matched else matched | other »unmatched  if (E) statement | if (E) matched else unmatched

Parsing Top-Down Goal: construct a leftmost derivation of string while reading in sequential token stream Partly-derived StringLookaheadparsed part unparsed part  E + S((1+2+(3+4))+5  (S) + S1(1+2+(3+4))+5  (E+S)+S1(1+2+(3+4))+5  (1+S)+S2(1+2+(3+4))+5  (1+E+S)+S2(1+2+(3+4))+5  (1+2+S)+S2(1+2+(3+4))+5  (1+2+E)+S((1+2+(3+4))+5  (1+2+(S))+S3(1+2+(3+4))+5  (1+2+(E+S))+S3(1+2+(3+4))+5 ... S  E + S | E E  num | (S)

Problem with Top-Down Parsing Want to decide which production to apply based on next symbol Ex1: “(1)”S  E  (S)  (E)  (1) Ex2: “(1)+2”S  E+S  (S)+S  (E)+S  (1)+E  (1)+2 S  E + S | E E  num | (S) How did you know to pick E+S in Ex2, if you picked E followed by (S), you couldn’t parse it?

Grammar is Problem v This grammar cannot be parsed top-down with only a single look-ahead symbol! v Not LL(1) = Left-to-right scanning, Left- most derivation, 1 look-ahead symbol v Is it LL(k) for some k? v If yes, then can rewrite grammar to allow top-down parsing: create LL(1) grammar for same language S  E + S | E E  num | (S)

Making a Grammar LL(1) S  E + S S  E E  num E  (S) S  ES’ S’   S’  +S E  num E  (S) Problem: Can’t decide which S production to apply until we see the symbol after the first expression Left-factoring: Factor common S prefix, add new non-terminal S’ at decision point. S’ derives (+S)* Also: Convert left recursion to right recursion

Parsing with New Grammar Partly-derived StringLookaheadparsed part unparsed part  ES’((1+2+(3+4))+5  (S)S’1(1+2+(3+4))+5  (ES’)S’1(1+2+(3+4))+5  (1S’)S’+(1+2+(3+4))+5  (1+ES’)S’2(1+2+(3+4))+5  (1+2S’)S’+(1+2+(3+4))+5  (1+2+S)S’((1+2+(3+4))+5  (1+2+ES’)S’((1+2+(3+4))+5  (1+2+(S)S’)S’3(1+2+(3+4))+5  (1+2+(ES’)S’)S’3(1+2+(3+4))+5  (1+2+(3S’)S’)S’+(1+2+(3+4))+5  (1+2+(3+E)S’)S’4(1+2+(3+4))+5 ... S  ES’S’   | +SE  num | (S)

Class Problem Are the following grammars LL(1)? S  Abc | aAcb A  b | c |  S  aAS | b A  a | bSA

Predictive Parsing v LL(1) grammar: »For a given non-terminal, the lookahead symbol uniquely determines the production to apply »Top-down parsing = predictive parsing »Driven by predictive parsing table of  non-terminals x terminals  productions

Parsing with Table Partly-derived StringLookaheadparsed part unparsed part  ES’((1+2+(3+4))+5  (S)S’1(1+2+(3+4))+5  (ES’)S’1(1+2+(3+4))+5  (1S’)S’+(1+2+(3+4))+5  (1+ES’)S’2(1+2+(3+4))+5  (1+2S’)S’+(1+2+(3+4))+5 S  ES’S’   | +SE  num | (S) num+()$ S  ES’  ES’ S’  +S     E  num  (S)

How to Implement This? Table can be converted easily into a recursive descent parser 3 procedures: parse_S(), parse_S’(), and parse_E() num+()$ S  ES’  ES’ S’  +S     E  num  (S)

Recursive-Descent Parser num+()$ S  ES’  ES’ S’  +S     E  num  (S) void parse_S() { switch (token) { case num: parse_E(); parse_S’(); return; case ‘(‘: parse_E(); parse_S’(); return; default: ParseError(); } lookahead token

Recursive-Descent Parser (2) num+()$ S  ES’  ES’ S’  +S     E  num  (S) void parse_S’() { switch (token) { case ‘+’: token = input.read(); parse_S(); return; case ‘)‘: return; case EOF: return; default: ParseError(); }

Recursive-Descent Parser (3) num+()$ S  ES’  ES’ S’  +S     E  num  (S) void parse_E() { switch (token) { case number: token = input.read(); return; case ‘(‘: token = input.read(); parse_S(); if (token != ‘)’) ParseError(); token = input.read(); return; default: ParseError(); }

Call Tree = Parse Tree S E+S ( S )E E + S 5 1 2E ( S ) E + S E34 parse_S parse_E parse_S parse_Eparse_S’ parse_S parse_Eparse_S’ parse_S parse_E parse_S parse_S’ parse_S

How to Construct Parsing Tables? Needed: Algorithm for automatically generating a predictive parse table from a grammar S  ES’ S’   | +S E  number | (S) num+()$ SES’ES’ S’+S  Enum (S) ??

Constructing Parse Tables v Can construct predictive parser if: »For every non-terminal, every lookahead symbol can be handled by at most 1 production v FIRST(  ) for an arbitrary string of terminals and non-terminals  is: »Set of symbols that might begin the fully expanded version of  v FOLLOW(X) for a non-terminal X is: »Set of symbols that might follow the derivation of X in the input stream FIRSTFOLLOW X

Parse Table Entries v Consider a production X   v Add   to the X row for each symbol in FIRST(  ) v If  can derive  (  is nullable), add   for each symbol in FOLLOW(X) v Grammar is LL(1) if no conflicting entries num+()$ SES’ES’ S’+S  Enum (S) S  ES’ S’   | +S E  number | (S)