Abstract Syntax Leonidas Fegaras.

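Abstract Syntax Tree (AST)
A parser typically generates an Abstract Syntax Tree (AST). A parse tree is not an AST.
[Figure: the source file feeds the scanner, which feeds the parser, which produces the AST (arrow labels: get next character, token, get token). The parse tree of id(x) + id(y) * id(z) over the nonterminals E, T, F is shown next to its AST: + at the root with children x and *, where * has children y and z.]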

Building Abstract Syntax Trees in Java
    abstract class Exp { }
    class IntegerExp extends Exp {
        public int value;
        public IntegerExp ( int n ) { value=n; }
    }
    class TrueExp extends Exp {
        public TrueExp () {}
    }
    class FalseExp extends Exp {
        public FalseExp () {}
    }
    class VariableExp extends Exp {
        public String value;
        public VariableExp ( String n ) { value=n; }
    }

Exp (cont.)
    class BinaryExp extends Exp {
        public String operator;
        public Exp left;
        public Exp right;
        public BinaryExp ( String o, Exp l, Exp r ) { operator=o; left=l; right=r; }
    }
    class UnaryExp extends Exp {
        public String operator;
        public Exp operand;
        public UnaryExp ( String o, Exp e ) { operator=o; operand=e; }
    }
    class ExpList {
        public Exp head;
        public ExpList next;
        public ExpList ( Exp h, ExpList n ) { head=h; next=n; }
    }

Exp (cont.)
    class CallExp extends Exp {
        public String name;
        public ExpList arguments;
        public CallExp ( String nm, ExpList s ) { name=nm; arguments=s; }
    }
    class ProjectionExp extends Exp {
        public Exp value;
        public String attribute;
        public ProjectionExp ( Exp v, String a ) { value=v; attribute=a; }
    }

Exp (cont.)
    class RecordElements {
        public String attribute;
        public Exp value;
        public RecordElements next;
        public RecordElements ( String a, Exp v, RecordElements el )
            { attribute=a; value=v; next=el; }
    }
    class RecordExp extends Exp {
        public RecordElements elements;
        public RecordExp ( RecordElements el ) { elements=el; }
    }

Examples
The AST for the input (x-2)+3:
    new BinaryExp("+",
                  new BinaryExp("-", new VariableExp("x"), new IntegerExp(2)),
                  new IntegerExp(3))
The AST for the input f(x.A,true):
    new CallExp("f",
                new ExpList(new ProjectionExp(new VariableExp("x"), "A"),
                            new ExpList(new TrueExp(), null)))
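To make the first example concrete, the following is a minimal sketch of how such a tree could be walked once it is built. It copies three of the Exp classes into one file; the eval helper, the environment map, and the class name ExpDemo are assumptions made for illustration and are not part of the slides.

```java
import java.util.Collections;
import java.util.Map;

// Minimal copies of three Exp classes, nested so the sketch is a single file.
class ExpDemo {
    static abstract class Exp { }
    static class IntegerExp extends Exp {
        final int value;
        IntegerExp(int n) { value = n; }
    }
    static class VariableExp extends Exp {
        final String value;
        VariableExp(String n) { value = n; }
    }
    static class BinaryExp extends Exp {
        final String operator;
        final Exp left, right;
        BinaryExp(String o, Exp l, Exp r) { operator = o; left = l; right = r; }
    }

    // Hypothetical helper: evaluates an Exp given integer bindings for its variables.
    static int eval(Exp e, Map<String, Integer> env) {
        if (e instanceof IntegerExp) return ((IntegerExp) e).value;
        if (e instanceof VariableExp) {
            Integer v = env.get(((VariableExp) e).value);
            if (v == null) throw new IllegalArgumentException("unbound variable");
            return v;
        }
        if (e instanceof BinaryExp) {
            BinaryExp b = (BinaryExp) e;
            int l = eval(b.left, env);
            int r = eval(b.right, env);
            switch (b.operator) {
                case "+": return l + r;
                case "-": return l - r;
                default: throw new IllegalArgumentException("unknown operator " + b.operator);
            }
        }
        throw new IllegalArgumentException("unsupported node");
    }

    // The AST for the input (x-2)+3.
    static Exp example() {
        return new BinaryExp("+",
                new BinaryExp("-", new VariableExp("x"), new IntegerExp(2)),
                new IntegerExp(3));
    }

    public static void main(String[] args) {
        // With x bound to 10 this prints 11.
        System.out.println(eval(example(), Collections.singletonMap("x", 10)));
    }
}
```

The tree shape alone fixes the grouping: the subtraction is a child of the addition, so no parentheses need to be stored in the AST.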

Gen
A Java package for constructing and manipulating ASTs. You are required to use Gen for your project.
It is basically a Java preprocessor that adds syntactic constructs to the Java language to make the task of handling ASTs easier. It
    uses a universal class Tree to capture any kind of AST
    supports easy construction of ASTs using the #<...> syntax
    supports pattern matching, editing, pretty-printing, etc.
    includes a symbol table class
Architecture: file.gen -> (Gen) -> file.java -> (javac) -> file.class

The Gen Tree Class
    abstract class Tree { }
    class LongLeaf extends Tree {
        public long value;
        public LongLeaf ( long n ) { value = n; }
    }
    class DoubleLeaf extends Tree {
        public double value;
        public DoubleLeaf ( double n ) { value = n; }
    }
    class VariableLeaf extends Tree {
        public String value;
        public VariableLeaf ( String s ) { value = s; }
    }
    class StringLeaf extends Tree {
        public String value;
        public StringLeaf ( String s ) { value = s; }
    }

AST Nodes are Instances of Tree
    class Node extends Tree {
        public String name;
        public Trees args;
        public Node ( String n, Trees a ) { name = n; args = a; }
    }
    // Interface summary only: member bodies and initializers are omitted.
    class Trees {
        public Tree head;
        public Trees tail;
        public Trees ( Tree h, Trees t );
        public final static Trees nil;
        public Trees append ( Tree e );
    }

Example
To construct Binop(Plus,x,Binop(Minus,y,z)) in Java, use:
    new Node("Binop",
             Trees.nil.append(new VariableLeaf("Plus"))
                      .append(new VariableLeaf("x"))
                      .append(new Node("Binop",
                                       Trees.nil.append(new VariableLeaf("Minus"))
                                                .append(new VariableLeaf("y"))
                                                .append(new VariableLeaf("z")))))
Ugly! You should never use this kind of code in your project.
[Figure: the tree for Binop(Plus, x, Binop(Minus, y, z)).]

The #< > Brackets
When you write
    #<Binop(Plus,x,Binop(Minus,y,z))>
in your Gen file, it generates the following Java code:
    new Node("Binop",
             Trees.nil.append(new VariableLeaf("Plus"))
                      .append(new VariableLeaf("x"))
                      .append(new Node("Binop",
                                       Trees.nil.append(new VariableLeaf("Minus"))
                                                .append(new VariableLeaf("y"))
                                                .append(new VariableLeaf("z")))))
which represents the AST Binop(Plus,x,Binop(Minus,y,z)).
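The generated expression can be tried out with a small stand-in for the Tree classes. The sketch below is an assumption about how Node, Trees, and append might behave (an immutable linked list with a shared nil), not Gen's actual implementation; the toString methods and the class name TreeDemo are added here only to make the result visible.

```java
class TreeDemo {
    static abstract class Tree { }

    static class VariableLeaf extends Tree {
        final String value;
        VariableLeaf(String s) { value = s; }
        @Override public String toString() { return value; }
    }

    // A linked list of trees; nil is the shared empty list.
    static class Trees {
        static final Trees nil = new Trees(null, null);
        final Tree head;
        final Trees tail;
        Trees(Tree h, Trees t) { head = h; tail = t; }

        boolean isEmpty() { return this == nil; }

        // Returns a new list with e added at the end; the receiver is unchanged.
        Trees append(Tree e) {
            if (isEmpty()) return new Trees(e, nil);
            return new Trees(head, tail.append(e));
        }

        @Override public String toString() {
            StringBuilder sb = new StringBuilder();
            for (Trees t = this; !t.isEmpty(); t = t.tail) {
                if (sb.length() > 0) sb.append(",");
                sb.append(t.head);
            }
            return sb.toString();
        }
    }

    static class Node extends Tree {
        final String name;
        final Trees args;
        Node(String n, Trees a) { name = n; args = a; }
        @Override public String toString() { return name + "(" + args + ")"; }
    }

    // The expression that #<Binop(Plus,x,Binop(Minus,y,z))> is said to expand to.
    static Tree example() {
        return new Node("Binop",
            Trees.nil.append(new VariableLeaf("Plus"))
                     .append(new VariableLeaf("x"))
                     .append(new Node("Binop",
                         Trees.nil.append(new VariableLeaf("Minus"))
                                  .append(new VariableLeaf("y"))
                                  .append(new VariableLeaf("z")))));
    }

    public static void main(String[] args) {
        System.out.println(example());   // prints Binop(Plus,x,Binop(Minus,y,z))
    }
}
```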

Escaping a Value Using Backquote
Objects of the class Tree can be included into the form generated by the #< > brackets by "escaping" them with a backquote (`).
The operand of the escape operator is expected to be an object of class Tree that provides the value to "fill in" the hole in the bracketed text at that point. (Actually, an escaped string/long/double value is also lifted to a Tree.)
For example,
    Tree x = #<join(a,b,p)>;
    Tree y = #<select(`x,q)>;
    Tree z = #<project(`y,A)>;
are equivalent to:
    Tree y = #<select(join(a,b,p),q)>;
    Tree z = #<project(select(join(a,b,p),q),A)>;

BNF of #< >
    bracketed ::= "#<" expr ">"                          construct an AST (instance of Tree)
                | "#[" arg "," ... "," arg "]"           construct a list of ASTs (instance of Trees)
    expr ::= name                                        the representation of a variable name
           | long                                        the repr. of a long integer
           | double                                      the repr. of a double number
           | string                                      the repr. of a string
           | "`" name                                    escaping to the value of name
           | "`(" code ")"                               escaping to the value of code
           | name "(" arg "," ... "," arg ")"            the repr. of an AST node
           | "`" name "(" arg "," ... "," arg ")"        the repr. of an AST node with escaped name
    arg ::= expr                                         the repr. of an expression
          | "..." name                                   escaping to a list of ASTs bound to name
          | "...(" code ")"                              escaping to a list of ASTs returned by code

"..." is for Trees
The three dots (...) construct is used to indicate a list of children in an AST node. The name in "...name" must be an instance of the class Trees.
For example, in
    Trees r = #[join(a,b,p),select(c,q)];
    Tree z = #<project(...r)>;
z will be bound to #<project(join(a,b,p),select(c,q))>.

Example
For example,
    #<`f(6,...r,g("ab",`(k(x))),`y)>
is equivalent to the following Java code:
    new Node(f, Trees.nil.append(new LongLeaf(6))
                         .append(r)
                         .append(new Node("g", Trees.nil.append(new StringLeaf("ab"))
                                                        .append(k(x))))
                         .append(y))
If f="h", r=#[2,z], y=#<m(1,"a")>, and k(x) returns the value #<8>, then the above term is equivalent to #<h(6,2,z,g("ab",8),m(1,"a"))>.
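The combined effect of the backquote and the three dots can be imitated in plain Java. The sketch below is a stand-in, not Gen's generated code: it keeps children in a java.util.List instead of Trees, and the Args builder, the k method, and the class name SpliceDemo are hypothetical. Under the bindings listed in the example it rebuilds the same term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SpliceDemo {
    static abstract class Tree { }
    static class LongLeaf extends Tree {
        final long value;
        LongLeaf(long n) { value = n; }
        @Override public String toString() { return Long.toString(value); }
    }
    static class StringLeaf extends Tree {
        final String value;
        StringLeaf(String s) { value = s; }
        @Override public String toString() { return "\"" + value + "\""; }
    }
    static class VariableLeaf extends Tree {
        final String value;
        VariableLeaf(String s) { value = s; }
        @Override public String toString() { return value; }
    }
    static class Node extends Tree {
        final String name;
        final List<Tree> args;
        Node(String n, List<Tree> a) { name = n; args = a; }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder(name).append("(");
            for (int i = 0; i < args.size(); i++) {
                if (i > 0) sb.append(",");
                sb.append(args.get(i));
            }
            return sb.append(")").toString();
        }
    }

    // Hypothetical builder: add() places one tree (like `y), addAll() splices a list (like ...r).
    static class Args {
        private final List<Tree> items = new ArrayList<>();
        Args add(Tree t) { items.add(t); return this; }
        Args addAll(List<Tree> ts) { items.addAll(ts); return this; }
        List<Tree> done() { return items; }
    }

    // Stands in for the user code k(x); here it always returns the tree 8.
    static Tree k(String x) { return new LongLeaf(8); }

    static Tree example() {
        String f = "h";
        List<Tree> r = Arrays.<Tree>asList(new LongLeaf(2), new VariableLeaf("z"));
        Tree y = new Node("m", Arrays.<Tree>asList(new LongLeaf(1), new StringLeaf("a")));
        return new Node(f, new Args()
                .add(new LongLeaf(6))
                .addAll(r)
                .add(new Node("g", new Args().add(new StringLeaf("ab")).add(k("x")).done()))
                .add(y)
                .done());
    }

    public static void main(String[] args) {
        System.out.println(example());   // prints h(6,2,z,g("ab",8),m(1,"a"))
    }
}
```

The difference between add and addAll is the point of the sketch: an escaped Tree fills one argument position, while an escaped list contributes as many positions as it has elements.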

Pattern Matching
Gen provides a match statement syntax for pattern matching. Patterns match the Tree representations with similar shape.
Escape operators applied to variables inside these patterns represent variable patterns, which "bind" to corresponding subterms upon a successful match.
This capability makes it particularly easy to write functions that perform source-to-source transformations.

Example
A function that simplifies arithmetic expressions:
    Tree simplify ( Tree e ) {
        match e {
        case plus(`x,0): return x;
        case times(`x,1): return x;
        case times(`x,0): return #<0>;
        case _: return e;
        }
    }
where the _ pattern matches any value. For example, simplify(#<times(z,1)>) returns #<z>.
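For comparison, here is roughly what the same function looks like when written by hand in plain Java, without the match statement. This is a sketch under assumptions: children are kept in a java.util.List rather than Gen's Trees, and the isLong helper and the class name SimplifyDemo are invented for the illustration.

```java
import java.util.Arrays;
import java.util.List;

class SimplifyDemo {
    static abstract class Tree { }
    static class LongLeaf extends Tree {
        final long value;
        LongLeaf(long n) { value = n; }
        @Override public String toString() { return Long.toString(value); }
    }
    static class VariableLeaf extends Tree {
        final String value;
        VariableLeaf(String s) { value = s; }
        @Override public String toString() { return value; }
    }
    // Assumption: children are stored in a List instead of Gen's Trees class.
    static class Node extends Tree {
        final String name;
        final List<Tree> args;
        Node(String n, Tree... a) { name = n; args = Arrays.asList(a); }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder(name).append("(");
            for (int i = 0; i < args.size(); i++) {
                if (i > 0) sb.append(",");
                sb.append(args.get(i));
            }
            return sb.append(")").toString();
        }
    }

    // True when t is the long constant c: the hand-written form of a literal pattern.
    static boolean isLong(Tree t, long c) {
        return t instanceof LongLeaf && ((LongLeaf) t).value == c;
    }

    static Tree simplify(Tree e) {
        if (e instanceof Node && ((Node) e).args.size() == 2) {
            Node n = (Node) e;
            Tree x = n.args.get(0);        // plays the role of the pattern variable `x
            Tree second = n.args.get(1);
            if (n.name.equals("plus") && isLong(second, 0)) return x;
            if (n.name.equals("times") && isLong(second, 1)) return x;
            if (n.name.equals("times") && isLong(second, 0)) return new LongLeaf(0);
        }
        return e;                          // the _ case
    }

    public static void main(String[] args) {
        Tree t = new Node("times", new VariableLeaf("z"), new LongLeaf(1));
        System.out.println(simplify(t));   // prints z
    }
}
```

Each case of the match statement turns into a shape test plus a field access, which is the bookkeeping the pattern syntax is meant to hide.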

BNF
    case_stmt ::= "match" code "{" case ... case "}"
    case ::= "case" expr ":" code
    expr ::= name                                        exact match with a variable name
           | long                                        exact match with a long integer
           | double                                      exact match with a double number
           | string                                      exact match with a string
           | "`" name                                    match with the value of name
           | name "(" arg "," ... "," arg ")"            match an AST node
           | "`" name "(" arg "," ... "," arg ")"        match an AST node with escaped name
           | "_"                                         match any AST
    arg ::= expr                                         match an AST
          | "..." name                                   match a list of ASTs bound to name
          | "..."                                        match the rest of the arguments

Examples
The pattern `f(...r) matches any Node. When it is matched with #<join(a,b,c)>, it binds
    f to the string "join"
    r to the list of ASTs #[a,b,c]
The following function adds the terms #<8> and #<9> as children to any Node e:
    Tree add_arg ( Tree e ) {
        match e {
        case `f(...r): return #<`f(8,9,...r)>;
        case `x: return x;
        }
    }

Another Example
The following function switches the inputs of a binary join found as a parameter to a Node e:
    Tree switch_join_args ( Tree e ) {
        match e {
        case `f(...r,join(`x,`y),...s):
            return #<`f(...r,join(`y,`x),...s)>;
        case `x: return x;
        }
    }

Misc
To iterate over Trees, use Java's for-loop:
    for ( Tree v: #[a,b,c] )
        System.out.println(v);
For conditional pattern matching, use if-then-else with a fail:
    match e {
    case `f(`x):
        if (x instanceof VariableLeaf)
            fail;
        return #<`f(y)>;
    case `z: return z;
    }

Adding Semantic Actions to a Parser
Right-associative grammar:
    E ::= T + E
        | T - E
        | T
    T ::= num
After left factoring:
    E  ::= T E'
    E' ::= + E
         | - E
         |
Recursive descent parser:
    int E () {
        int left = T();
        if (current_token == '+') {
            read_next_token();
            return left + E();
        } else if (current_token == '-') {
            read_next_token();
            return left - E();
        } else return left;
    };
    int T () {
        if (current_token == 'num') {
            int n = num_value;
            read_next_token();
            return n;
        } else error();
    };
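A runnable version of this parser helps show what right associativity does to the computed value. The sketch below assumes that E' may also be empty, so that a lone number is a valid expression, and it scans characters directly instead of calling a separate scanner; the class name RightAssocParser and the eval helper are hypothetical.

```java
class RightAssocParser {
    private final String input;
    private int pos = 0;

    RightAssocParser(String s) { input = s.replace(" ", ""); }

    // The current character, or '$' at the end of the input.
    private char current() { return pos < input.length() ? input.charAt(pos) : '$'; }

    // E ::= T E'   with   E' ::= + E | - E | (empty)
    int E() {
        int left = T();
        if (current() == '+') {
            pos++;                    // consume '+'
            return left + E();
        } else if (current() == '-') {
            pos++;                    // consume '-'
            return left - E();
        } else {
            return left;
        }
    }

    // T ::= num
    int T() {
        int start = pos;
        while (Character.isDigit(current())) pos++;
        if (start == pos) throw new IllegalStateException("number expected at position " + pos);
        return Integer.parseInt(input.substring(start, pos));
    }

    static int eval(String s) {
        RightAssocParser p = new RightAssocParser(s);
        int v = p.E();
        if (p.current() != '$') throw new IllegalStateException("unexpected input at position " + p.pos);
        return v;
    }

    public static void main(String[] args) {
        System.out.println(eval("7-2-1"));   // prints 6, i.e. 7-(2-1)
    }
}
```

Because the recursive call to E happens before the subtraction is applied, 7-2-1 is grouped as 7-(2-1), which is usually not what is wanted for subtraction.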

Adding Semantic Actions to a Parser
Left-associative grammar:
    E ::= E + T
        | E - T
        | T
    T ::= num
After left recursion elimination:
    E  ::= T E'
    E' ::= + T E'
         | - T E'
         |
Recursive descent parser:
    int E () { return Eprime(T()); };
    int Eprime ( int left ) {
        if (current_token == '+') {
            read_next_token();
            return Eprime(left + T());
        } else if (current_token == '-') {
            read_next_token();
            return Eprime(left - T());
        } else return left;
    };
    int T () {
        if (current_token == 'num') {
            int n = num_value;
            read_next_token();
            return n;
        } else error();
    };
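The same experiment can be repeated for the left-associative version, where the value computed so far is passed down to E' as an argument. As before, this is a sketch that scans characters directly; the class name LeftAssocParser and the eval helper are assumptions made for the illustration.

```java
class LeftAssocParser {
    private final String input;
    private int pos = 0;

    LeftAssocParser(String s) { input = s.replace(" ", ""); }

    // The current character, or '$' at the end of the input.
    private char current() { return pos < input.length() ? input.charAt(pos) : '$'; }

    // E ::= T E'
    int E() { return Eprime(T()); }

    // E' ::= + T E' | - T E' | (empty); left carries the value of everything parsed so far.
    int Eprime(int left) {
        if (current() == '+') {
            pos++;                    // consume '+'
            return Eprime(left + T());
        } else if (current() == '-') {
            pos++;                    // consume '-'
            return Eprime(left - T());
        } else {
            return left;
        }
    }

    // T ::= num
    int T() {
        int start = pos;
        while (Character.isDigit(current())) pos++;
        if (start == pos) throw new IllegalStateException("number expected at position " + pos);
        return Integer.parseInt(input.substring(start, pos));
    }

    static int eval(String s) {
        LeftAssocParser p = new LeftAssocParser(s);
        int v = p.E();
        if (p.current() != '$') throw new IllegalStateException("unexpected input at position " + p.pos);
        return v;
    }

    public static void main(String[] args) {
        System.out.println(eval("7-2-1"));   // prints 4, i.e. (7-2)-1
    }
}
```

Here the subtraction is applied before the recursive call, so the operators group to the left even though the transformed grammar is right-recursive.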

Table-Driven Predictive Parsers
Table-driven predictive parsers use the parse stack to push/pop both actions and symbols, but they use a separate semantic stack to execute the actions.
    push(S);
    read_next_token();
    repeat
        X = pop();
        if (X is a terminal or '$')
            if (X == current_token)
                read_next_token();
            else error();
        else if (X is an action)
            perform the action;
        else if (M[X,current_token] == "X ::= Y1 Y2 ... Yk")
            { push(Yk); ... push(Y1); }
        else error();
    until X == '$';

Example
We need to embed actions { code; } in the grammar rules. Suppose that pushV and popV are the functions that manipulate the semantic stack.
The following is the grammar of an interpreter that uses the semantic stack to perform additions and subtractions:
    E  ::= T E' $ { print(popV()); }
    E' ::= + T { pushV(popV() + popV()); } E'
         | - T { pushV(-popV() + popV()); } E'
         |
    T  ::= num { pushV(num); }
For example, for 1+5-2, we have the following sequence of actions:
    pushV(1); pushV(5); pushV(popV()+popV()); pushV(2); pushV(-popV()+popV()); print(popV());
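The grammar above can be run with a small table-driven loop. The following is a sketch under several assumptions: actions are encoded as strings starting with #, the parsing table is hard-coded in a switch, the loop runs until the parse stack is empty so that the action after $ still executes, and the result is stored in a field instead of being printed. The class name PredictiveDemo and all helper names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class PredictiveDemo {
    private final List<String> tokens = new ArrayList<>();    // number literals, "+", "-", "$"
    private int pos = 0;
    private int lastNum;                                       // value of the last matched num
    private final Deque<String> parse = new ArrayDeque<>();    // symbols and actions
    private final Deque<Integer> values = new ArrayDeque<>();  // semantic stack
    private Integer result;                                    // set by the #print action

    PredictiveDemo(String input) {
        String s = input.replace(" ", "");
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isDigit(c)) {
                int j = i;
                while (j < s.length() && Character.isDigit(s.charAt(j))) j++;
                tokens.add(s.substring(i, j));
                i = j;
            } else if (c == '+' || c == '-') {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character " + c);
            }
        }
        tokens.add("$");
    }

    // The kind of the current token: "num", "+", "-", or "$".
    private String kind() {
        String t = tokens.get(pos);
        return Character.isDigit(t.charAt(0)) ? "num" : t;
    }

    // Pushes a right-hand side in reverse so that its first symbol ends up on top.
    private void pushRhs(String... rhs) {
        for (int i = rhs.length - 1; i >= 0; i--) parse.push(rhs[i]);
    }

    int run() {
        parse.push("E");
        while (!parse.isEmpty()) {
            String x = parse.pop();
            String tok = kind();
            switch (x) {
                case "#num":   values.push(lastNum); break;
                case "#add":   values.push(values.pop() + values.pop()); break;
                case "#sub":   values.push(-values.pop() + values.pop()); break;
                case "#print": result = values.pop(); break;
                case "E":      pushRhs("T", "E'", "$", "#print"); break;
                case "T":      pushRhs("num", "#num"); break;
                case "E'":
                    if (tok.equals("+")) pushRhs("+", "T", "#add", "E'");
                    else if (tok.equals("-")) pushRhs("-", "T", "#sub", "E'");
                    break;     // otherwise use the empty alternative
                default:       // x is a terminal
                    if (!x.equals(tok))
                        throw new IllegalStateException("expected " + x + " but found " + tok);
                    if (tok.equals("num")) lastNum = Integer.parseInt(tokens.get(pos));
                    if (!tok.equals("$")) pos++;
            }
        }
        return result;
    }

    static int eval(String s) { return new PredictiveDemo(s).run(); }

    public static void main(String[] args) {
        System.out.println(eval("1+5-2"));   // prints 4
    }
}
```

In the subtraction action the first pop yields the right operand, which is why the rule negates it before adding the second pop.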

Bottom-Up Parsers
A bottom-up parser can only perform an action after a reduction.
We can only have rules of the form
    X ::= Y1 ... Yn { action }
where the action is always at the end of the rule; this action is evaluated after the rule X ::= Y1 ... Yn is reduced.
How? In addition to state numbers, the parser pushes values into the parse stack.
If we want to put an action in the middle of the right-hand side of a rule, we use a dummy non-terminal, called a marker. For example,
    X ::= a { action } b
is equivalent to
    X ::= M b
    M ::= a { action }

CUP
Both terminals and non-terminals are associated with typed values:
    these values are instances of the Object class (or of some subclass of the Object class)
    the value associated with a terminal is in most cases an Object, except for an identifier, which is a String, for an integer, which is an Integer, etc.
    the typical values associated with non-terminals in a compiler are ASTs, lists of ASTs, etc.
You can retrieve the value of a symbol s at the right-hand side of a rule by using the notation s:x, where x is a variable name that hasn't appeared elsewhere in this rule.
The value of the non-terminal defined by a rule is called RESULT and should always be assigned a value in the action.
E.g., if the non-terminal E is associated with an Integer object, then
    E ::= E:n PLUS E:m {: RESULT = n+m; :}

Machinery
The parse stack elements are of type struct( state: int, value: Object ), where the int is the state number and the Object is the value.
When a reduction occurs, the RESULT value is calculated from the values in the stack and is pushed along with the GOTO state.
Example: after the reduction by
    E ::= E:n PLUS E:m {: RESULT = n+m; :}
the RESULT value is
    stack[top-2].value + stack[top].value
which is the new value pushed in the stack along with the GOTO state.
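The reduction step can be mimicked with an explicit stack of (state, value) pairs. This is only a sketch of the bookkeeping, not CUP's runtime: the state numbers are made up, the GOTO state is passed in by the caller instead of being read from a table, and the names ValueStackDemo, Entry, shift, and reducePlus are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ValueStackDemo {
    // One parse-stack element: a state number plus the semantic value of a symbol.
    static class Entry {
        final int state;
        final Object value;
        Entry(int s, Object v) { state = s; value = v; }
    }

    private final List<Entry> stack = new ArrayList<>();

    void shift(int state, Object value) { stack.add(new Entry(state, value)); }

    Entry top() { return stack.get(stack.size() - 1); }

    // Reduction by E ::= E:n PLUS E:m {: RESULT = n+m; :}
    // gotoState is supplied by the caller; a real parser looks it up in its GOTO table.
    void reducePlus(int gotoState) {
        int top = stack.size() - 1;
        int n = (Integer) stack.get(top - 2).value;   // value of the left E
        int m = (Integer) stack.get(top).value;       // value of the right E
        Object result = n + m;                        // RESULT
        for (int i = 0; i < 3; i++) stack.remove(stack.size() - 1);   // pop E PLUS E
        stack.add(new Entry(gotoState, result));      // push RESULT with the GOTO state
    }

    // Simulates the stack just before and after reducing "3 + 4".
    static int demo() {
        ValueStackDemo p = new ValueStackDemo();
        p.shift(0, null);   // initial state (made-up state numbers throughout)
        p.shift(1, 3);      // E with value 3
        p.shift(2, null);   // PLUS carries no useful value
        p.shift(3, 4);      // E with value 4
        p.reducePlus(1);
        return (Integer) p.top().value;
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints 7
    }
}
```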

ASTs in CUP
We need to associate each non-terminal symbol with an AST type:
    non terminal Ast exp;
    non terminal Arguments expl;
    exp ::= exp:e1 PLUS exp:e2    {: RESULT = new Node(plus_exp,e1,e2); :}
         |  exp:e1 MINUS exp:e2   {: RESULT = new Node(minus_exp,e1,e2); :}
         |  id:nm LP expl:el RP   {: RESULT = new Node(call_exp,el.reverse()
                                                 .cons(new VariableLeaf(nm))); :}
         |  INT:n                 {: RESULT = new LongValue(n.intValue()); :}
         ;
    expl ::= expl:el COMMA exp:e  {: RESULT = el.cons(e); :}
          |  exp:e                {: RESULT = nil.cons(e); :}
          ;