Syntax-Directed Definitions CS375 Compilers. UT-CS. 1.

Slides:



Advertisements
Similar presentations
Abstract Syntax Mooly Sagiv html:// 1.
Advertisements

Mooly Sagiv and Roman Manevich School of Computer Science
Compiler Construction LR(0) + Visitor Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University.
9/27/2006Prof. Hilfinger, Lecture 141 Syntax-Directed Translation Lecture 14 (adapted from slides by R. Bodik)
6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)
1 Semantic Processing. 2 Contents Introduction Introduction A Simple Compiler A Simple Compiler Scanning – Theory and Practice Scanning – Theory and Practice.
ML-YACC David Walker COS 320. Outline Last Week –Introduction to Lexing, CFGs, and Parsing Today: –More parsing: automatic parser generation via ML-Yacc.
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter
CS Summer 2005 Top-down and Bottom-up Parsing - a whirlwind tour June 20, 2005 Slide acknowledgment: Radu Rugina, CS 412.
Context-Free Grammars Lecture 7
Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter 2.2 (Partial) Hashlama 11:00-14:00.
Bottom Up Parsing.
Prof. Fateman CS 164 Lecture 91 Bottom-Up Parsing Lecture 9.
Yu-Chen Kuo1 Chapter 2 A Simple One-Pass Compiler.
Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter 2.2 (Partial)
Table-driven parsing Parsing performed by a finite state machine. Parsing algorithm is language-independent. FSM driven by table (s) generated automatically.
Chapter 2 A Simple Compiler
Chapter 2 Chang Chi-Chung rev.1. A Simple Syntax-Directed Translator This chapter contains introductory material to Chapters 3 to 8  To create.
Lexical and syntax analysis
CS784 (Prasad)L167AG1 Attribute Grammars Attribute Grammar is a Framework for specifying semantics and enables Modular specification.
CSC3315 (Spring 2009)1 CSC 3315 Lexical and Syntax Analysis Hamid Harroud School of Science and Engineering, Akhawayn University
Parser construction tools: YACC
2.2 A Simple Syntax-Directed Translator Syntax-Directed Translation 2.4 Parsing 2.5 A Translator for Simple Expressions 2.6 Lexical Analysis.
Syntax & Semantic Introduction Organization of Language Description Abstract Syntax Formal Syntax The Way of Writing Grammars Formal Semantic.
1 Abstract Syntax Tree--motivation The parse tree –contains too much detail e.g. unnecessary terminals such as parentheses –depends heavily on the structure.
Syntax and Semantics Structure of programming languages.
Parsing. Goals of Parsing Check the input for syntactic accuracy Return appropriate error messages Recover if possible Produce, or at least traverse,
Semantic Analysis (Generating An AST) CS 471 September 26, 2007.
1 Semantic Analysis Aaron Bloomfield CS 415 Fall 2005.
1 Top Down Parsing. CS 412/413 Spring 2008Introduction to Compilers2 Outline Top-down parsing SLL(1) grammars Transforming a grammar into SLL(1) form.
3-1 Chapter 3: Describing Syntax and Semantics Introduction Terminology Formal Methods of Describing Syntax Attribute Grammars – Static Semantics Describing.
Topic #2: Infix to Postfix EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 8: Semantic Analysis and Symbol Tables.
Interpretation Environments and Evaluation. CS 354 Spring Translation Stages Lexical analysis (scanning) Parsing –Recognizing –Building parse tree.
Review 1.Lexical Analysis 2.Syntax Analysis 3.Semantic Analysis 4.Code Generation 5.Code Optimization.
Overview of Previous Lesson(s) Over View  An ambiguous grammar which fails to be LR and thus is not in any of the classes of grammars i.e SLR, LALR.
Syntax and Semantics Structure of programming languages.
Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter 2.2 (Partial) 1.
Bernd Fischer RW713: Compiler and Software Language Engineering.
Chapter 2. Design of a Simple Compiler J. H. Wang Sep. 21, 2015.
CPS 506 Comparative Programming Languages Syntax Specification.
11/25/2015© Hal Perkins & UW CSEH-1 CSE P 501 – Compilers Implementing ASTs (in Java) Hal Perkins Winter 2008.
Prof. Necula CS 164 Lecture 8-91 Bottom-Up Parsing LR Parsing. Parser Generators. Lecture 6.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 3: Introduction to Syntactic Analysis.
Compiler Principles Fall Compiler Principles Lecture 6: Parsing part 5 Roman Manevich Ben-Gurion University.
Overview of Previous Lesson(s) Over View  In syntax-directed translation 1 st we construct a parse tree or a syntax tree then compute the values of.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
Syntax Directed Definition and Syntax directed Translation
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
CSE 5317/4305 L3: Parsing #11 Parsing #1 Leonidas Fegaras.
Semantic Analysis II Type Checking EECS 483 – Lecture 12 University of Michigan Wednesday, October 18, 2006.
Overview of Previous Lesson(s) Over View 3 Model of a Compiler Front End.
1 Introduction to Parsing. 2 Outline l Regular languages revisited l Parser overview Context-free grammars (CFG ’ s) l Derivations.
Chap. 7, Syntax-Directed Compilation J. H. Wang Nov. 24, 2015.
Bernd Fischer COMP2010: Compiler Engineering Abstract Syntax Trees.
Bottom Up Parsing CS 671 January 31, CS 671 – Spring Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 6: LR grammars and automatic parser generators.
CMSC 330: Organization of Programming Languages Pushdown Automata Parsing.
CSE 420 Lecture Program is lexically well-formed: ▫Identifiers have valid names. ▫Strings are properly terminated. ▫No stray characters. Program.
LECTURE 10 Semantic Analysis. REVIEW So far, we’ve covered the following: Compilation methods: compilation vs. interpretation. The overall compilation.
Announcements/Reading
Chapter 3 – Describing Syntax
A Simple Syntax-Directed Translator
CS510 Compiler Lecture 4.
Table-driven parsing Parsing performed by a finite state machine.
4 (c) parsing.
Basic Program Analysis: AST
Syntax-Directed Translation
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Presentation transcript:

Syntax-Directed Definitions CS375 Compilers. UT-CS. 1

Organization Building an AST –Top-down parsing –Bottom-up parsing –Making YACC, CUP build ASTs AST class definitions –Exploiting type-checking features of OO languages Writing AST visitors –Separate AST code from visitor code for better modularity 2

LL Parsing Techniques LL parsing –Computes a Leftmost derivation Build First and Follows Sets Build Parsing table Repeatedly apply rules on remaining input. –Determines the derivation top-down –LL parsing table indicates which production to use for expanding the leftmost non-terminal –Can build parse tree from the series of derivations selected/applied for the input 3

LL Example Consider the grammar: S->F S->(E+F)‏ F->int –Suppose input is (3+5)‏ Series of derivations are: –S->(S+F)‏ –S->(F+F)‏ –S->(int(3)+F)‏ –S->(int(3)+int(5))‏ And Parse tree is 4

LR - Parsing Techniques LR parsing –Computes a Rightmost derivation –Determines the derivation bottom-up –Uses a set of LR states and a stack of symbols –LR parsing table indicates, for each state, what action to perform (shift/reduce) and what state to go to next –Apply reductions to generate rightmost derivation of input. 5

AST Review Derivation = sequence of applied productions Parse tree = graph representation of a derivation –Doesn’t capture the order of applying the productions Abstract Syntax Tree (AST) discards unnecessary information from the parse tree 1 + S E + S E Parse Tree AST 6

Building AST. We would like to build AST while parsing the input. Using the parse tree is redundant, it contains information not required for code generation. This information is a by-product of grammar. 7

SIMPLE APPROACH TO BUILDING AST 8

Building AST – Simple approach Simple approach is to have the parser return a Tree as it parses the input. Build a Tree hierarchy to support all the different type of internal nodes based on the productions. Different parsing techniques would only need minor adjustments to build AST. 9

AST Data Structures abstract class Expr { } class Add extends Expr { Expr left, right; Add(Expr L, Expr R) { left = L; right = R; } class Num extends Expr { int value; Num (int v) { value = v; } } Num 3Num 1Num 2 Add 10

AST Construction LL/LR parsing implicitly walks parse tree during parsing –LL parsing: Parse tree implicitly represented by sequence of derivation steps (preorder)‏ –LR parsing: Parse tree implicitly represented by sequence of reductions (endorder)‏ The AST is implicitly defined by parse tree Want to explicitly construct AST during parsing: –add code in parser to build AST 11

LL AST Construction LL parsing: extend procedures for nonterminals Example: Expr parse_S() { switch (token) { case num: case ‘(’: Expr left = parse_E(); Expr right = parse_S’(); if (right == null) return left; else return new Add(left, right); default: throw new ParseError(); } void parse_S() { switch (token) { case num: case ‘(’: parse_E(); parse_S’(); return; default: throw new ParseError(); } S  ES' S'   | + S E  num | ( S )‏ 12

LR AST Construction LR parsing –Need to add code for explicit AST construction AST construction mechanism for LR Parsing –With each symbol X on stack, also store AST sub-tree for X on stack –When parser performs reduce operation for , create AST subtree for  from AST fragments on stack for , pop |  | subtrees from stack, push subtree for . –Sub-tree for  can be built using nodes in  13

LR AST Construction, ctd. Example Num(2)‏ Num(3)‏ Add Num(1)‏ Add stack Num(2)‏ Num(3)‏ Add Num(1)‏ Before reduction S  E+S After reduction S  E+S S S + E S  E+S | S E  num | ( S )‏ 14

Issues Unstructured code: mixed parsing code with AST construction code Automatic parser generators –The generated parser needs to contain AST construction code –How to construct a customized AST data structure using an automatic parser generator? May want to perform other actions concurrently with the parsing phase –E.g., semantic checks –This can reduce the number of compiler passes 15

SYNTAX DIRECTED DEFINITIONS 16

Syntax-Directed Definition Solution: syntax-directed definition –Extends each grammar production with an associated semantic action (code): S  E+S { action } –The parser generator adds these actions into the generated parser –Each action is executed when the corresponding production is reduced 17

Semantic Actions Actions = code in a programming language –Same language as the automatically generated parser Examples: –Yacc = actions written in C –CUP = actions written in Java The actions can access the parser stack! –Parser generators extend the stack of states (corresponding to RHS symbols) symbols with entries for user-defined structures (e.g., parse trees)‏ The action code need to refer to the states (corresponding to the RHS grammar symbols) in the production –Need a naming scheme… 18

Naming Scheme Need names for grammar symbols to use in the semantic action code Need to refer to multiple occurrences of the same nonterminal symbol E  E 1 + E 2 Distinguish the nonterminal on the LHS E 0  E + E 19

Naming Scheme: CUP CUP: –Name RHS nonterminal occurrences using distinct, user-defined labels: expr ::= expr:e1 PLUS expr:e2 –Use keyword RESULT for LHS nonterminal CUP Example (an interpreter): expr ::= expr:e1 PLUS expr:e2 {: RESULT = e1 + e2; :} 20

Naming Scheme: yacc Yacc: –Uses keywords: $1 refers to the first RHS symbol, $2 refers to the second RHS symbol, etc. –Keyword $$ refers to the LHS nonterminal Yacc Example (an interpreter): expr ::= expr PLUS expr { $$ = $1 + $3; } 21

Building the AST Use semantic actions to build the AST AST is built bottom-up during parsing non terminal Expr expr; expr ::= NUM:i {: RESULT = new Num(i.val); :} expr ::= expr:e1 PLUS expr:e2 {: RESULT = new Add(e1,e2); :} expr ::= expr:e1 MULT expr:e2 {: RESULT = new Mul(e1,e2); :} expr ::= LPAR expr:e RPAR {: RESULT = e; :} User-defined type for semantic objects on the stack Nonterminal name 22

Example Parser stack stores value of each symbol (1+2)*3 (E+2)*3 RESULT=new Num(1)‏ (E+2)*3 (E+E)*3 RESULT=new Num(2)‏ (E)*3 RESULT=new Add(e1,e2)‏ (E)*3 E*3 RESULT=e E  num | (E) | E+E | E*E Num(1)‏ Add(, )‏ Num(2)‏ 23

AST - DESIGN 24

AST Design Keep the AST abstract Do not introduce tree node for every node in parse tree (not very abstract)‏ Add 12 3 ? E + S S E 1 2 ( S )‏ E 3 Sum Expr Sum LPar Sum RPar 3 Expr 1 Sum 2 Expr 25

AST Design Do not use single class AST_node E.g., need information for if, while, +, *, ID, NUM class AST_node { int node_type; AST_node[ ] children; String name; int value; …etc… } Problem: must have fields for every different kind of node with attributes Not extensible, Java type checking no help 26

Use Class Hierarchy Use subclassing to solve problem –Use abstract class for each “interesting” set of nonterminals (e.g., expressions)‏ E  E+E | E*E | -E | (E)‏ abstract class Expr { … } class Add extends Expr { Expr left, right; … } class Mult extends Expr { Expr left, right; … } // or: class BinExpr extends Expr { Oper o; Expr l, r; } class Minus extends Expr { Expr e; …} 27

Another Example E ::= num | (E) | E+E | id S ::= E ; | if (E) S | if (E) S else S | id = E ; | ; abstract class Expr { … } class Num extends Expr { Num(int value) … } class Add extends Expr { Add(Expr e1, Expr e2) … } class Id extends Expr { Id(String name) … } abstract class Stmt { … } class IfS extends Stmt { IfS(Expr c, Stmt s1, Stmt s2) } class EmptyS extends Stmt { EmptyS() … } class AssignS extends Stmt { AssignS(String id, Expr e)…} 28

Other Syntax-Directed Definitions Can use syntax-directed definitions to perform semantic checks during parsing –E.g., type-checking Benefit = efficiency –One compiler pass for multiple tasks Disadvantage = unstructured code –Mixes parsing and semantic checking phases –Perform checks while AST is changing –Limited to one pass in bottom-up order 29

Structured Approach Separate AST construction from semantic checking phase Traverse AST (visit) and perform semantic checks (or other actions) only after tree has been built and its structure is stable Approach is more flexible and less error-prone –It is better when efficiency is not a critical issue 30

Where We Are Source code (character stream)‏ Lexical Analysis Syntax Analysis (Parsing)‏ Token stream Abstract syntax tree (AST)‏ Semantic Analysis if (b == 0) a = b; if(b)‏)‏a=b;0== if == b0 = a b 31

AST Data Structure abstract class Expr { } class Add extends Expr {... Expr e1, e2; } class Num extends Expr {... int value; } class Id extends Expr {... String name; } abstract class Expr { } class Add extends Expr {... Expr e1, e2; } class Num extends Expr {... int value; } class Id extends Expr {... String name; } 32

Could add AST computation to class, but… abstract class Expr { …; /* state variables for visitA */ } class Add extends Expr {... Expr e1, e2; void visitA(){ …; visitA(this.e1); …; visitA(this.e2); …} } class Num extends Expr {... int value; void visitA(){…} } class Id extends Expr {... String name; void visitA(){…} } abstract class Expr { …; /* state variables for visitA */ } class Add extends Expr {... Expr e1, e2; void visitA(){ …; visitA(this.e1); …; visitA(this.e2); …} } class Num extends Expr {... int value; void visitA(){…} } class Id extends Expr {... String name; void visitA(){…} } 33

Undesirable Approach to AST Computation abstract class Expr { …; /* state variables for visitA */ …; /* state variables for visitB */ } class Add extends Expr {... Expr e1, e2; void visitA(){ …; visitA(this.e1); …; visitA(this.e2); …} void visitB(){ …; visitB(this.e2); …; visitB(this.e1); …} } class Num extends Expr {... int value; void visitA(){…} void visitB(){…} } class Id extends Expr {... String name; void visitA(){…} void visitB(){…} } abstract class Expr { …; /* state variables for visitA */ …; /* state variables for visitB */ } class Add extends Expr {... Expr e1, e2; void visitA(){ …; visitA(this.e1); …; visitA(this.e2); …} void visitB(){ …; visitB(this.e2); …; visitB(this.e1); …} } class Num extends Expr {... int value; void visitA(){…} void visitB(){…} } class Id extends Expr {... String name; void visitA(){…} void visitB(){…} } 34

The problem with this approach is incorporating different semantic actions into the classes. –Type checking –Code generation –Optimization Each class would have to implement each “action” as a separate method. Undesirable Approach to AST Computation 35

VISITOR PATTERN 36

Visitor Methodology for AST Traversal Visitor pattern: separate data structure definition (e.g., AST) from algorithms that traverse the structure (e.g., name resolution code, type checking code, etc.). Define Visitor interface for all AST traversals types. i.e., code generation, type checking etc. Extend each AST class with a method that accepts any Visitor (by calling it back)‏ Code each traversal as a separate class that implements the Visitor interface 37

Visitor Interface interface Visitor { void visit(Add e); void visit(Num e); void visit(Id e); } interface Visitor { void visit(Add e); void visit(Num e); void visit(Id e); } class CodeGenVisitor implements Visitor { void visit(Add e) {…}; void visit(Num e){…}; void visit(Id e){…}; } class CodeGenVisitor implements Visitor { void visit(Add e) {…}; void visit(Num e){…}; void visit(Id e){…}; } class TypeCheckVisitor implements Visitor { void visit(Add e) {…}; void visit(Num e){…}; void visit(Id e){…}; } class TypeCheckVisitor implements Visitor { void visit(Add e) {…}; void visit(Num e){…}; void visit(Id e){…}; } 38

Accept methods abstract class Expr { … abstract public void accept(Visitor v); } class Add extends Expr { … public void accept(Visitor v) { v.visit( this ); } } class Num extends Expr { … public void accept(Visitor v) { v.visit( this ); } } class Id extends Expr { … public void accept(Visitor v) { v.visit( this ); } } abstract class Expr { … abstract public void accept(Visitor v); } class Add extends Expr { … public void accept(Visitor v) { v.visit( this ); } } class Num extends Expr { … public void accept(Visitor v) { v.visit( this ); } } class Id extends Expr { … public void accept(Visitor v) { v.visit( this ); } } The declared type of this is the subclass in which it occurs. Overload resolution of v.visit(this); invokes appropriate visit function in Visitor v. 39

Visitor Methods For each kind of traversal, implement the Visitor interface, e.g., class PostfixOutputVisitor implements Visitor { void visit(Add e) { e.e1.accept(this); e.e2.accept(this); System.out.print( “+” ); } void visit(Num e) { System.out.print(e.value); } void visit(Id e) { System.out.print(e.id); } To traverse expression e: PostfixOutputVisitor v = new PostfixOutputVisitor(); e.accept(v); Dynamic dispatch e’.accept invokes accept method of appropriate AST subclass and eliminates case analysis on AST subclasses 40

Inherited and Synthesized Information So far, OK for traversal and action w/o communication of values But we need a way to pass information –Down the AST (inherited)‏ –Up the AST (synthesized)‏ To pass information down the AST –add parameter to visit functions To pass information up the AST –add return value to visit functions 41

Visitor Interface (2)‏ interface Visitor { Object visit(Add e, Object inh); Object visit(Num e, Object inh); Object visit(Id e, Object inh); } interface Visitor { Object visit(Add e, Object inh); Object visit(Num e, Object inh); Object visit(Id e, Object inh); } 42

Accept methods (2)‏ abstract class Expr { … abstract public Object accept(Visitor v, Object inh); } class Add extends Expr { … public Object accept(Visitor v, Object inh) { return v.visit(this, inh); } } class Num extends Expr { … public Object accept(Visitor v, Object inh) { return v.visit(this, inh); } class Id extends Expr { … public Object accept(Visitor v, Object inh) { return v.visit(this, inh); } } abstract class Expr { … abstract public Object accept(Visitor v, Object inh); } class Add extends Expr { … public Object accept(Visitor v, Object inh) { return v.visit(this, inh); } } class Num extends Expr { … public Object accept(Visitor v, Object inh) { return v.visit(this, inh); } class Id extends Expr { … public Object accept(Visitor v, Object inh) { return v.visit(this, inh); } } 43

Visitor Methods (2)‏ For each kind of traversal, implement the Visitor interface, e.g., class EvaluationVisitor implements Visitor { Object visit(Add e, Object inh) { int left = (int) e.e1.accept(this, inh); int right = (int) e.e2.accept(this, inh); return left+right; } Object visit(Num e, Object inh) { return value; } Object visit(Id e, Object inh) { return Lookup(id, (SymbolTable)inh); } To traverse expression e: EvaluationVisitor v = new EvaluationVisitor (); e.accept(v, EmptyTable()); 44

Summary Syntax-directed definitions attach semantic actions to grammar productions Easy to construct the AST using syntax-directed definitions Can use syntax-directed definitions to perform semantic checks, but better not to Separate AST construction from semantic checks or other actions that traverse the AST 45

See Also… Eclipse AST Plugin. –Shows AST for classes. –For instance : –Produces the following AST. Window -> Show View -> Other -> Java ->AST View public class SimpleExample{ public static void main(String[] args){ int x = 10; for (int i=0;i<10;i++) System.out.println(x++); } 46

47

Example of an Attribute Grammar 48

Example (cont.) 49