Presentation is loading. Please wait.

Presentation is loading. Please wait.

Abstract Syntax Mooly Sagiv html://www.cs.tau.ac.il/~msagiv/courses/wcc04.html.

Similar presentations


Presentation on theme: "Abstract Syntax Mooly Sagiv html://www.cs.tau.ac.il/~msagiv/courses/wcc04.html."— Presentation transcript:

1 Abstract Syntax Mooly Sagiv html://www.cs.tau.ac.il/~msagiv/courses/wcc04.html

2 Outline The general idea Bison Motivating example Interpreter for arithmetic expressions The need for abstract syntax Abstract syntax for arithmetic expressions

3 Semantic Analysis during Recursive Descent Parsing Scanner returns “semantic values” for some tokens The function of every non-terminal returns the “corresponding subtree value” When A  B C D is applied the function for A can use the values returned by B, C, and D –The function can also pass parameters, e.g., to D(), reflecting left contexts

4 E  num E’ E’   E’  + num E’ int E() { switch (tok) { case num : temp=tok.val; eat(num); return EP(temp); default: error(…); }} int EP(int left) { switch (tok) { case $: return left; case + : eat(+); temp=tok.val; eat(num); return EP(left + temp); default: error(…) ;}}

5 Semantic Analysis during Bottom-Up Parsing Scanner returns “semantic values” for some tokens Use parser stack to store the “corresponding subtree values” When A  B C D is reduced the function for A can use the values returned by B, C, and D No action in the middle of the rule

6 E  E + num E  num Example E 7 + num 5 E 12

7 Bison Specification Declarations % Productions % C -Routines

8 Interpreter (in Bison) %{ declarations of yylex() and yyeror() %} %union { int num; string id;} %token ID %token NUM %type e f t %start e % e : e ‘+’ t {$$ = $1 + $3 ; } | e ‘-’ t { $$ = $1 - $3 ; } | t { $$ = $1; } ; t : t ‘*’ f { $$ = $1 * $3; } | t ‘/’ f { $$= $1 / $3; } | f { $$ = $1; } ; f : NUM { $$ = $1; } | ID { $$ = lookup($1); } | ‘-’ e { $$ = - $2; } | ‘(‘ e ‘)’ { $$ = $2; } ;

9 Interpreter (compact spec.) %{ declarations of yylex() and yyeror() %} %union { int num; string id;} %token ID %token NUM %type e %start e %left PLUS MINUS %left MUL DIV %right UMINUS % e : e PLUS e {$$ = $1 + $3 ; } | e MINUS e { $$ = $1 - $3 ; } | e MUL e { $$ = $1 * $3; } | e DIV e { $$= $1 / $3; } | NUM | ID { $$ = lookup($1); } | MINUS e % prec UMINUS { $$ = - $2; } | ‘(‘ e ‘)’ { $$ = $2; } ;

10 stackinputaction $ 7+11+17$shift num 7 $ +11+17$ reduce e  num e 7 $ +11+17$shift + e 7 $ 11+17$shift num 11 + e 7 $ +17$ reduce e  num

11 stackinputaction e 11 + e 7 $ +17$ reduce e  e+e e 18 $ +17$shift + e 18 $ 17$shift num 17 + e 18 $ $ reduce e  num

12 stackinputaction e 17 + e 18 $ $ reduce e::= e+e e 35 $ $ accept

13 So why can’t we write all the compiler code in Bison/CUP?

14 Historical Perspective Originally parsers were written w/o tools yacc, bison,... make tools acceptable But it is still difficult to write compilers in parser actions (top-down and bottom-up) –Natural grammars are ambiguous –No modularity principle –Many useful programming language features prevent code generation while parsing Use before declaration gotos

15 Abstract Syntax Intermediate program representation Defines a tree - Preserves program hierarchy Generated by the parser Declared using an (ambiguous) context free grammar (relatively flat) –Not meant for parsing Keywords and punctuation symbols are not stored (Not relevant once the tree exists) Big programs can be also handled (possibly via virtual memory)

16 Issues Concrete vs. Abstract syntax tree Need to store concrete source position Abstract syntax can be defined by: –Ambiguous context free grammar –C recursive data type –“Constructor” functions Debugging routines linearize the tree

17 Abstract Syntax for Arithmetic Expressions Exp  id (IdExp) Exp  num (NumExp) Exp  Exp Binop Exp (BinOpExp) Binop  + (Plus) Binop  - (Minus) Binop  * Binop  / (Times) (Div) Exp  Unop Exp (UnOpExp) Unop  - (UnMin)

18 /* absyn.h */ typedef enum {A_Plus, A_Minus, A_Times, A_Div} Binop; typedef enum {A_UnMin} Unop; typedef enum {A_IdExp, A_NumExp, A_BinOpExp, A_UnopExp} ExpKind; typedef struct exp { ExpKind kind ; union { struct {string repr ;} IdExp; struct {int number ; } NumExp; struct {Binop op; struct exp *left; struct exp *right;} BinExp; struct {Unop op; struct exp *arg;} UnExp; } exp; } *Exp; extern Exp idExp(string repr); extern Exp numExp(int number); extern Exp binOp(Binop, struct exp *, struct exp*); extern Exp unOp(Unop, struct exp*);

19 %{ declarations of yylex() and yyeror() #include “absyn.h” %} %union { int num; string id; Exp exp;} %token ID %token NUM %type e f t %start e % e : e ‘+’ t {$$ = binop(A_Plus, $1, $3) ; } | e ‘-’ t { $$ = binop(A_Minus, $1, $3) ; } | t { $$ = $1; } ; t : t ‘*’ f { $$ = binop(A_Times,$1, $3); } | t ‘/’ f { $$= binop(A_Div, $1, $3); } | f { $$ = $1; } ; f : NUM { $$ = numExp($1); } | ID { $$ = idExp($1); } | ‘-’ e { $$ = unOp(A_UnMinus, $2); } | ‘(‘ e ‘)’ { $$ = $2; } ;

20 %{ declarations of yylex() and yyeror() #include “absyn.h” %} %union { int num; string id; Exp exp; } %token ID %token NUM %type e %start e %left PLUS MINUS %left MUL DIV %right UMINUS % e : e PLUS e {$$ = binop(A_Plus, $1, $3) ; } | e MINUS e { $$ = binop(A_Minus, $1, $3) ; } | e MUL e { $$ = binop(A_Times, $1, $3); } | e DIV e { $$= binop(A_Div, $1, $3); } | NUM {$$ = numExp($1); } | ID { $$ = idExp($1); } | MINUS e % prec UMINUS { $$ = unOp(A_UnMin,$2); } | ‘(‘ e ‘)’ { $$ = $2; } ;

21 Summary Flex and Bison simplify the task of writing compiler/interpreter front-ends Abstract syntax provides a clear interface with other compiler phases –Supports general programming languages But the design of an abstract syntax for a given PL may take some time


Download ppt "Abstract Syntax Mooly Sagiv html://www.cs.tau.ac.il/~msagiv/courses/wcc04.html."

Similar presentations


Ads by Google