
SEMANTIC PROCESSING Chuen-Liang Chen Department of Computer Science
and Information Engineering National Taiwan University Taipei, TAIWAN

Action symbols to determine when to call semantic routines
1. <program> → #start begin <statement list> end
2. <statement list> → <statement> { <statement> }
3. <statement> → <ident> := <expression> #assign ;
4. <statement> → read ( <id list> ) ;
5. <statement> → write ( <expr list> ) ;
6. <id list> → <ident> #read_id { , <ident> #read_id }
7. <expr list> → <expression> #write_id { , <expression> #write_id }
8. <expression> → <primary> { <add op> <primary> #gen_infix }
9. <primary> → ( <expression> )
10. <primary> → <ident>
11. <primary> → INTLITERAL #process_literal
12. <add op> → + #process_op
13. <add op> → - #process_op
14. <ident> → ID #process_id
15. <system goal> → <program> SCANEOF #finish
(possibly, with some modifications)

Semantic record
to keep semantic information associated with each grammar symbol
    #define MAXIDLEN 33
    typedef char string[MAXIDLEN];

    /* for operators */
    typedef struct operator {
        enum op { PLUS, MINUS } operator;
    } op_rec;

    /* for <primary> and <expression> */
    enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };
    typedef struct expression {
        enum expr kind;
        union {
            string name;  /* for IDEXPR, TEMPEXPR */
            int val;      /* for LITERALEXPR */
        };
    } expr_rec;

Parser + semantic routines
    /* parser only */
    void expression(void)
    {
        token t;
        /* <expression> ::= <primary> { <add op> <primary> } */
        primary();
        for (t = next_token(); t == PLUSOP || t == MINUSOP; t = next_token()) {
            add_op();
            primary();
        }
    }

    /* parser with semantic routines */
    void expression(expr_rec *result)
    {
        expr_rec left_operand, right_operand;
        op_rec op;
        /* <expression> ::= <primary> { <add op> <primary> #gen_infix } */
        primary(&left_operand);
        while (next_token() == PLUSOP || next_token() == MINUSOP) {
            add_op(&op);
            primary(&right_operand);
            left_operand = gen_infix(left_operand, op, right_operand);
        }
        *result = left_operand;
    }
QUIZ: where is the syntactic structure?
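The pattern above, a recursive-descent routine that returns a semantic record through a parameter and calls a semantic routine inside the loop, can be tried out with a minimal sketch. Everything here is a simplification assumed for illustration: identifiers and literals are single characters, the `operand` type stands in for `expr_rec`, and `compile_expression`, the `ADD`/`SUB` opcode names, and the `code` buffer are hypothetical rather than part of the slides.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for expr_rec: just the operand's text. */
typedef struct { char text[16]; } operand;

static const char *input;   /* remaining source text */
static char code[256];      /* generated tuples, one per line */
static int temp_count;      /* number of temporaries handed out */

void skip_spaces(void) { while (*input == ' ') input++; }

/* <primary> ::= ID | INTLITERAL (single characters in this sketch) */
void primary(operand *result)
{
    skip_spaces();
    assert(isalnum((unsigned char)*input));
    result->text[0] = *input++;
    result->text[1] = '\0';
}

/* #gen_infix: emit one tuple and return the temporary holding its result */
operand gen_infix(operand left, char op, operand right)
{
    operand temp;
    size_t used = strlen(code);
    snprintf(temp.text, sizeof temp.text, "T%d", ++temp_count);
    snprintf(code + used, sizeof code - used, "(%s, %s, %s, %s)\n",
             op == '+' ? "ADD" : "SUB", left.text, right.text, temp.text);
    return temp;
}

/* <expression> ::= <primary> { <add op> <primary> #gen_infix } */
void expression(operand *result)
{
    operand left, right;
    primary(&left);
    for (skip_spaces(); *input == '+' || *input == '-'; skip_spaces()) {
        char op = *input++;     /* <add op>, with #process_op folded in */
        primary(&right);
        left = gen_infix(left, op, right);
    }
    *result = left;
}

/* Hypothetical entry point: compile one expression, return the code text. */
const char *compile_expression(const char *source, operand *result)
{
    input = source;
    code[0] = '\0';
    temp_count = 0;
    expression(result);
    return code;
}
```

Calling `compile_expression("A + 3 - B", &r)` should leave two tuples in the buffer and the name of the last temporary in `r`. The syntactic structure lives only in the call sequence of the parsing routines; the records carry the meaning.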

Semantics - meaning
syntax : semantics = structure : meaning
implementation of "meaning" -- attributes attached to each node of the (abstract) syntax tree
operations on "meaning" --
    understand
        associate semantic information (attributes) with each node
        initially, on some nodes (usually leaves)
        propagate until the tree is "decorated"
    check "meaningful"
        checking static semantics
        may depend only on attributes, or also on structure
    interpret
        generating code (intermediate representation or final output of compiler)

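Derivation tree vs. abstract syntax tree
[Figure: derivation trees next to the corresponding abstract syntax trees. Assignment statement: <assign stmt> with children id, :=, <exp>, where <exp> expands through <term> and <prim> to id, const, + and *. If statement: <if stmt> with children if, <cond>, then, <stmts>, endif, abstracted to an if-then-endif node.]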

Brief example of semantic processing
example -- Y := 3 * X + I
abstract syntax tree:
    :=
        id (Y)
        +
            *
                const (3)
                id (X)
            id (I)
processing steps (post-order traversal):
    1   (Y, real)
    2   (3, int)
    3   (X, real)
    4   check at *
    5   output: (3, int) ⇒ (3.0, real)
    6   output: 3.0 * X → T1
    7   (T1, real)
    8   (I, int)
    9   check at +
    10  output: (I, int) ⇒ (II, real)
    11  output: T1 + II → T2
    12  (T2, real)
    13  check at :=
    14  output: T2 → Y
after step 7, the lowest level is no longer needed
the tree encountered at any moment is usually not the whole tree
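The decorate, check, and generate steps of this example can be sketched as one post-order traversal. This is an illustrative sketch only: the `node` and `attr` types, `decorate`, `to_real`, and `compile_tree` are hypothetical names, and unlike the slide (which folds the literal 3 into 3.0) it emits a `FLOAT` tuple for every int-to-real conversion, so the temporaries are numbered differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { T_INT, T_REAL } value_type;

typedef struct node {
    char op;                  /* '+' or '*'; 0 marks a leaf */
    const char *name;         /* leaf: identifier or literal text */
    value_type type;          /* leaf: attribute known initially */
    const struct node *left, *right;
} node;

/* Attribute propagated upward: where the value lives and its type. */
typedef struct { char name[16]; value_type type; } attr;

static char out[256];         /* emitted tuples */
static int temps;

static attr new_temp(value_type type)
{
    attr t;
    t.type = type;
    snprintf(t.name, sizeof t.name, "T%d", ++temps);
    return t;
}

static void emit(const char *op, const char *a, const char *b, const char *c)
{
    size_t used = strlen(out);
    if (b)
        snprintf(out + used, sizeof out - used, "(%s, %s, %s, %s)\n", op, a, b, c);
    else
        snprintf(out + used, sizeof out - used, "(%s, %s, %s)\n", op, a, c);
}

/* Coerce an int operand to real; a real operand is returned unchanged. */
static attr to_real(attr a)
{
    attr r;
    if (a.type == T_REAL)
        return a;
    r = new_temp(T_REAL);
    emit("FLOAT", a.name, NULL, r.name);
    return r;
}

/* Post-order: decorate both children, check types, then generate code. */
attr decorate(const node *n)
{
    attr l, r, result;
    const char *opcode;
    if (n->op == 0) {
        attr leaf;
        leaf.type = n->type;
        snprintf(leaf.name, sizeof leaf.name, "%s", n->name);
        return leaf;
    }
    l = decorate(n->left);
    r = decorate(n->right);
    if (l.type != r.type) {           /* the "check" step */
        l = to_real(l);
        r = to_real(r);
    }
    result = new_temp(l.type);
    opcode = n->op == '*' ? (l.type == T_REAL ? "MULTF" : "MULTI")
                          : (l.type == T_REAL ? "ADDF" : "ADDI");
    emit(opcode, l.name, r.name, result.name);
    return result;
}

const char *compile_tree(const node *root, attr *result)
{
    out[0] = '\0';
    temps = 0;
    *result = decorate(root);
    return out;
}
```

Only the attribute of the subtree just finished is kept, which mirrors the remark that lower levels of the tree become unnecessary once their result has been propagated.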

Semantic processing techniques
semantic record -- representation of meaning
semantic routine -- executor for semantic processing
    when to call?
    do what?
semantic stack
communications among semantic routines
    local variables, parameters (for non-table-driven parser)
    semantic stack (for table-driven parser)

Semantic record (1/2) representation for attribute
parameters among semantic routines
a unified declaration is required when passing through the semantic stack
example --
    #define MAXIDLEN 33
    typedef char string[MAXIDLEN];

    typedef struct operator {
        enum op { PLUS, MINUS } operator;
    } op_rec;

    enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };
    typedef struct expression {
        enum expr kind;
        union {
            string name;  /* for IDEXPR and TEMPEXPR */
            int val;      /* for LITERALEXPR */
        };
    } expr_rec;

    enum semantic_record_kind { OPREC, EXPRREC, ERROR };
    typedef struct sem_rec {
        enum semantic_record_kind record_kind;
        union {
            op_rec op_record;      /* OPREC */
            expr_rec expr_record;  /* EXPRREC */
            /* empty variant */    /* ERROR */
        };
    } semantic_record;

Semantic record (2/2)
    source   record_kind   contents   kind
    3        EXPRREC       00000011   LITERALEXPR
    X        EXPRREC       X          IDEXPR
    +        OPREC         PLUS
    3+X      EXPRREC       T1         TEMPEXPR
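As a small illustration of how these records might be filled in, the sketch below repeats the declarations (relying on C11 anonymous unions, as the slides do) and adds three constructor helpers. The helpers `make_literal`, `make_id`, and `make_op` are hypothetical names introduced for this example; the slides build the records inside the semantic routines instead.

```c
#include <assert.h>
#include <string.h>

#define MAXIDLEN 33
typedef char string[MAXIDLEN];

typedef struct operator {
    enum op { PLUS, MINUS } operator;
} op_rec;

enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };
typedef struct expression {
    enum expr kind;
    union {
        string name;  /* for IDEXPR and TEMPEXPR */
        int val;      /* for LITERALEXPR */
    };
} expr_rec;

enum semantic_record_kind { OPREC, EXPRREC, ERROR };
typedef struct sem_rec {
    enum semantic_record_kind record_kind;
    union {
        op_rec op_record;      /* OPREC */
        expr_rec expr_record;  /* EXPRREC */
    };
} semantic_record;

/* Record for an integer literal such as 3. */
semantic_record make_literal(int value)
{
    semantic_record r;
    memset(&r, 0, sizeof r);
    r.record_kind = EXPRREC;
    r.expr_record.kind = LITERALEXPR;
    r.expr_record.val = value;
    return r;
}

/* Record for an identifier such as X; long names are truncated. */
semantic_record make_id(const char *name)
{
    semantic_record r;
    memset(&r, 0, sizeof r);
    r.record_kind = EXPRREC;
    r.expr_record.kind = IDEXPR;
    strncpy(r.expr_record.name, name, MAXIDLEN - 1);
    return r;
}

/* Record for an operator token. */
semantic_record make_op(enum op which)
{
    semantic_record r;
    memset(&r, 0, sizeof r);
    r.record_kind = OPREC;
    r.op_record.operator = which;
    return r;
}
```

The `record_kind` tag is what lets a single stack element type hold either an operator or an expression, which is why every routine checks it before touching the union.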

Semantic routines (1/7) action symbols in grammar
the same for top-down and bottom-up parsers, except for the places where they can be triggered
for top-down parsing
    may appear anywhere in a production rule, due to its predictive nature
    pushed onto the parse stack when the production rule is predicted
    executed and popped off the parse stack when on the top
for bottom-up parsing
    can appear only after a production rule is fully recognized, i.e., at the very end of the right-hand side
    state -- all possible partially matched production rules
    rewriting of some grammar rules is required
        <stmt> → if <exp> #start_if then <stmts> endif #finish_if
      ⇒ <stmt> → <if_head> then <stmts> endif #finish_if
        <if_head> → if <exp> #start_if    /* called semantic hook */
    Yacc automatically does the rewriting

Semantic routines (2/7) example grammar with parameterized action symbols
<program> → #start begin <statement list> end
<statement list> → <statement> <statement tail>
<statement tail> → <statement> <statement tail> | λ
<statement> → <ident> := <expression>; #assign($1,$3)
<statement> → read ( <id list> );
<statement> → write ( <expr list> );
<id list> → <ident> #read_id($1) <id tail>
<id tail> → , <ident> #read_id($2) <id tail> | λ
<expr list> → <expression> #write_expr($1) <expr tail>
<expr tail> → , <expression> #write_expr($2) <expr tail> | λ
<expression> → <primary> #copy($1,$2) <primary tail> #copy($2,$$)
<primary tail> → <add op> <primary> #gen_infix($$,$1,$2,$3) <primary tail> #copy($3,$$)
<primary tail> → λ
<primary> → ( <expression> ) #copy($2,$$)
<primary> → <ident> #copy($1,$$)
<primary> → INTLITERAL #process_literal($$)
<add op> → PLUSOP #process_op($$)
<add op> → MINUSOP #process_op($$)
<ident> → ID #process_id($$)
<system goal> → <program> $ #finish

Semantic routines (3/7) example semantic routines
    #include <assert.h>
    #include <string.h>

    /* <primary> → ( <expression> ) #copy($2,$$) */
    void copy(semantic_record *source, semantic_record *dest)
    {
        /* Copy information from one part of the semantic stack to another */
        *dest = *source;
    }

    /* <ident> → ID #process_id($$) */
    void process_id(semantic_record *id_record)
    {
        /* Declare ID & build corresponding semantic record */
        check_id(token_buffer);
        id_record->record_kind = EXPRREC;
        id_record->expr_record.kind = IDEXPR;
        strcpy(id_record->expr_record.name, token_buffer);
    }

Semantic routines (4/7)
    /* <primary> → INTLITERAL #process_literal($$) */
    void process_literal(semantic_record *id_record)
    {
        /* Convert literal to a numeric representation and build semantic record. */
        id_record->record_kind = EXPRREC;
        id_record->expr_record.kind = LITERALEXPR;
        sscanf(token_buffer, "%d", &id_record->expr_record.val);  /* needs <stdio.h> */
    }

    /* <add op> → PLUSOP #process_op($$) */
    void process_op(semantic_record *op)
    {
        /* Produce operator descriptor. */
        op->record_kind = OPREC;
        if (current_token == PLUSOP)
            op->op_record.operator = PLUS;
        else
            op->op_record.operator = MINUS;
    }

Semantic routines (5/7)
    /* <primary tail> → <add op> <primary> #gen_infix($$,$1,$2,$3) <primary tail> #copy($3,$$) */
    void gen_infix(const semantic_record e1, const semantic_record op,
                   const semantic_record e2, semantic_record *result)
    {
        assert(e1.record_kind == EXPRREC);
        assert(op.record_kind == OPREC);  /* was "=", which assigns instead of comparing */
        assert(e2.record_kind == EXPRREC);
        /* Result is an expr_rec with temp variant set. */
        result->record_kind = EXPRREC;
        result->expr_record.kind = TEMPEXPR;
        /* Generate code for infix operation.
         * Get result temp and semantic record for result. */
        strcpy(result->expr_record.name, get_temp());
        generate(extract(op), extract(e1), extract(e2), result->expr_record.name);
    }

Semantic routines (6/7)
    /* <statement> → <ident> := <expression>; #assign($1,$3) */
    void assign(const semantic_record target, const semantic_record source)
    {
        assert(target.record_kind == EXPRREC);
        assert(target.expr_record.kind == IDEXPR);
        assert(source.record_kind == EXPRREC);
        /* Generate code for assignment. */
        generate("Store", extract(source), target.expr_record.name, "");
    }

    /* <id list> → <ident> #read_id($1) <id tail> */
    void read_id(const semantic_record in_var)
    {
        assert(in_var.record_kind == EXPRREC);
        assert(in_var.expr_record.kind == IDEXPR);
        /* Generate code for read. */
        generate("Read", in_var.expr_record.name, "Integer", " ");
    }

    /* <expr tail> → , <expression> #write_expr($2) <expr tail> | λ */
    void write_expr(const semantic_record out_expr)
    {
        assert(out_expr.record_kind == EXPRREC);
        generate("Write", extract(out_expr), "Integer", " ");
    }
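The routines on these slides call `get_temp`, `extract`, and `generate` without showing them. The sketch below supplies assumed versions so that `gen_infix` can be exercised: `get_temp` numbers temporaries `T1`, `T2`, ..., `extract` writes a record's text into a caller-supplied buffer (a changed signature, chosen so that three calls in one argument list cannot overwrite each other), and `generate` appends a line to an in-memory `output` buffer. The instruction format and the `Add`/`Sub` opcode strings are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXIDLEN 33
typedef char string[MAXIDLEN];
typedef struct operator { enum op { PLUS, MINUS } operator; } op_rec;
enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };
typedef struct expression {
    enum expr kind;
    union { string name; int val; };
} expr_rec;
enum semantic_record_kind { OPREC, EXPRREC, ERROR };
typedef struct sem_rec {
    enum semantic_record_kind record_kind;
    union { op_rec op_record; expr_rec expr_record; };
} semantic_record;

static char output[512];   /* generated code accumulates here */
static int temp_count;

/* Assumed helper: name of the next free temporary. */
const char *get_temp(void)
{
    static char name[MAXIDLEN];
    snprintf(name, sizeof name, "T%d", ++temp_count);
    return name;
}

/* Assumed helper: textual form of a record, written into buf. */
const char *extract(const semantic_record *r, char *buf, size_t size)
{
    if (r->record_kind == OPREC)
        snprintf(buf, size, "%s", r->op_record.operator == PLUS ? "Add" : "Sub");
    else if (r->expr_record.kind == LITERALEXPR)
        snprintf(buf, size, "%d", r->expr_record.val);
    else
        snprintf(buf, size, "%s", r->expr_record.name);
    return buf;
}

/* Assumed helper: append one instruction line to the output buffer. */
void generate(const char *op, const char *a, const char *b, const char *c)
{
    size_t used = strlen(output);
    snprintf(output + used, sizeof output - used, "%s %s,%s,%s\n", op, a, b, c);
}

void gen_infix(const semantic_record e1, const semantic_record op,
               const semantic_record e2, semantic_record *result)
{
    char op_text[MAXIDLEN], left[MAXIDLEN], right[MAXIDLEN];
    assert(e1.record_kind == EXPRREC);
    assert(op.record_kind == OPREC);
    assert(e2.record_kind == EXPRREC);
    /* Result is an expression record with the temp variant set. */
    result->record_kind = EXPRREC;
    result->expr_record.kind = TEMPEXPR;
    strcpy(result->expr_record.name, get_temp());
    generate(extract(&op, op_text, sizeof op_text),
             extract(&e1, left, sizeof left),
             extract(&e2, right, sizeof right),
             result->expr_record.name);
}
```

With records for `3`, `+`, and `X`, one call to `gen_infix` should emit a single `Add` line and return a `TEMPEXPR` record named `T1`, matching the `3+X` row of the semantic record table.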

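Semantic routines (7/7) tracing example
abbreviations: #a = #assign, #i = #process_id, #c = #copy, #l = #process_literal, #g = #gen_infix, #o = #process_op; <p> = <primary>, <pt> = <primary tail>, <ao> = <add op>
numbers on the nodes refer to the semantic records listed below
parse tree:
    <stmt> → <id>1 := <exp>9 ; #a($1,$3)
    <id>1 → ID #i($$)
    <exp>9 → <p>2 #c($1,$2) <pt>3,8 #c($2,$$)
    <p>2 → INT #l($$)
    <pt>3,8 → <ao>4 <p>6 #g($$,$1,$2,$3) <pt>7 #c($3,$$)
    <ao>4 → PLUS #o($$)
    <p>6 → <id>5 #c($1,$$)
    <pt>7 → λ
semantic records:
    1        EXPRREC   Y      IDEXPR
    2,3      EXPRREC          LITERALEXPR
    4        OPREC     PLUS
    5,6      EXPRREC   X      IDEXPR
    7,8,9    EXPRREC   T1     TEMPEXPR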

Semantic stack
the place to interchange information among semantic routines
not necessarily treated as an abstract stack
    action-controlled : controlled by action routines
    parser-controlled : controlled by the parser driver
action-controlled semantic stack
    opens the stack interface to all semantic action routines
    disadvantages:
        1. difficult to change
        2. action routines have to manipulate the stack
QUIZ: detailed implementation

Tracing example (1/2)
abbreviations: <s.g.> = <system goal>, <p> = <program> in step (2) and <primary> elsewhere, <s.l.> = <statement list>, <s> = <statement>, <s.t.> = <statement tail>, <e> = <expression>, <p.t.> = <primary tail>, <a.o.> = <add op>
    Step   Remaining Input             Parse Stack                           Action
    (1)    begin A:=BB-314+A; end $    <s.g.>                                Predict 22
    (2)    begin A:=BB-314+A; end $    <p> $                                 Predict 1
    (3)    begin A:=BB-314+A; end $    begin <s.l.> end $                    Match
    (4)    A:=BB-314+A; end $          <s.l.> end $                          Predict 2
    (5)    A:=BB-314+A; end $          <s> <s.t.> end $                      Predict 5
    (6)    A:=BB-314+A; end $          ID := <e> ; <s.t.> end $              Match
    (7)    :=BB-314+A; end $           := <e> ; <s.t.> end $                 Match
    (8)    BB-314+A; end $             <e> ; <s.t.> end $                    Predict 14
    (9)    BB-314+A; end $             <p> <p.t.> ; <s.t.> end $             Predict 18
    (10)   BB-314+A; end $             ID <p.t.> ; <s.t.> end $              Match
    (11)   -314+A; end $               <p.t.> ; <s.t.> end $                 Predict 15
    (12)   -314+A; end $               <a.o.> <p> <p.t.> ; <s.t.> end $      Predict 21
    (13)   -314+A; end $               - <p> <p.t.> ; <s.t.> end $           Match

Example of shift-reduce parsing (3/3)
grammar G0
    1. <program> → begin <stmts> end $
    2. <stmts> → SimpleStmt ; <stmts>
    3. <stmts> → begin <stmts> end ; <stmts>
    4. <stmts> → λ
tracing steps
    Step   Parse Stack        Remaining Input                          Action
    (1)    0                  begin SimpleStmt ; SimpleStmt ; end $    Shift 1
    (2)    0,1                SimpleStmt ; SimpleStmt ; end $          Shift 5
    (3)    0,1,5              ; SimpleStmt ; end $                     Shift 6
    (4)    0,1,5,6            SimpleStmt ; end $                       Shift 5
    (5)    0,1,5,6,5          ; end $                                  Shift 6
    (6)    0,1,5,6,5,6,λ      end $                                    Reduce 4   /* goto(6,<stmts>) = 10 */
    (7)    0,1,5,6,5,6,10     end $                                    Reduce 2   /* goto(6,<stmts>) = 10 */
    (8)    0,1,5,6,10         end $                                    Reduce 2   /* goto(1,<stmts>) = 2 */
    (9)    0,1,2              end $                                    Shift 3
    (10)   0,1,2,3            $                                        Accept
QUIZ: compare LL and LR parse stack

Parser-controlled semantic stack
for LR parser
    example -- Y := 3 * X + I, with Y := 3 * X + scanned so far
    grammar --
        <S> → id := <E> #assign
        <E> → <E> + <T> #add
    still-useful semantic information --
        1. the location of Y
        2. the location of the temporary that stores 3*X
    parse stack -- id := <E> +
    O.K.; can be combined with the parse stack
    QUIZ: how to combine?
for LL parser
    example (continued)
    parse stack -- <T> #add #assign
    need new technique
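One possible answer to the LR case can be sketched as a semantic stack that moves in lockstep with the parse stack: each shift pushes one entry and each reduction replaces the entries of the right-hand side with one entry for the left-hand side. This is a sketch under assumptions, not the slides' implementation. The entry type holds only a name, `shift`, `reduce_add`, and `reduce_assign` are hypothetical names, and the `ADDI`/`STORE` tuple spellings are chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXSTACK 32
typedef struct { char name[16]; } sem_entry;

static sem_entry sem_stack[MAXSTACK];  /* parallel to the LR parse stack */
static int sem_top;                    /* number of entries in use */
static int temp_count;
static char code[256];

/* Shift: push the token's semantic information. */
void shift(const char *text)
{
    assert(sem_top < MAXSTACK);
    snprintf(sem_stack[sem_top].name, sizeof sem_stack[sem_top].name, "%s", text);
    sem_top++;
}

/* Reduce <E> -> <E> + <T> #add: the RHS is the top three entries. */
void reduce_add(void)
{
    sem_entry *rhs, result;
    size_t used = strlen(code);
    assert(sem_top >= 3);
    rhs = &sem_stack[sem_top - 3];     /* rhs[0] = <E>, rhs[1] = +, rhs[2] = <T> */
    snprintf(result.name, sizeof result.name, "T%d", ++temp_count);
    snprintf(code + used, sizeof code - used, "(ADDI, %s, %s, %s)\n",
             rhs[0].name, rhs[2].name, result.name);
    sem_top -= 3;                      /* pop the RHS ... */
    sem_stack[sem_top++] = result;     /* ... and push the entry for <E> */
}

/* Reduce <S> -> id := <E> #assign: the RHS is the top three entries. */
void reduce_assign(void)
{
    sem_entry *rhs;
    size_t used = strlen(code);
    assert(sem_top >= 3);
    rhs = &sem_stack[sem_top - 3];     /* rhs[0] = id, rhs[1] = :=, rhs[2] = <E> */
    snprintf(code + used, sizeof code - used, "(STORE, %s, %s)\n",
             rhs[2].name, rhs[0].name);
    sem_top -= 3;
    sem_stack[sem_top++].name[0] = '\0';  /* empty entry keeps <S> aligned */
}
```

Because a reduction happens only when the whole right-hand side is on top of the stack, `$1`, `$2`, `$3` are simply fixed offsets from the top. The LL case has no such property, which is why the slides introduce extra index bookkeeping for it.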

LL parser-controlled semantic stack (1/11)
LL driver
    void lldriver(void)
    {
        int left_index = -1, right_index = -1;
        int current_index, top_index;

        /* Push the Start Symbol onto an empty parse stack. */
        push(s);
        /* Initialize the semantic stack. */
        current_index = 0;
        top_index = 1;
        while (!stack_empty()) {
            /* Let a be the current input token. */
            X = pop();
            if (is_nonterminal(X) && T[X][a] == X → Y1 ... Ym) {
                /* Expand nonterminal */
                Push EOP(left_index, right_index, current_index, top_index) on the parse stack;
                Push Ym ... Y1 on the parse stack;
                left_index = current_index;
                right_index = top_index;
                top_index += m;  /* m is # of non-action symbols */
                current_index = right_index;
            } else if (is_terminal(X) && X == a) {
                Place token information from scanner in sem_stack[current_index];
                current_index++;
                scanner(&a);  /* Get next token */
            } else if (X == EOP) {
                Restore left_index, right_index, current_index, top_index from the EOP symbol;
                current_index++;  /* Move to next symbol in RHS of previous production */
            } else if (is_action_symbol(X))
                Call semantic routine corresponding to X;
            else
                /* Process syntax error */
        }
    }
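The index bookkeeping in this driver can be isolated and checked against the EOP values that appear in the trace. The sketch below keeps only the four indices and a stack of saved index sets standing in for the EOP symbols. The names `indices`, `predict`, `match_terminal`, `end_of_production`, and `reset_indices` are hypothetical, and the increment of `current_index` after restoring an EOP follows the driver's comment about moving to the next right-hand-side symbol.

```c
#include <assert.h>

typedef struct { int left, right, current, top; } indices;

#define MAXDEPTH 64
static indices eop_stack[MAXDEPTH];   /* stands in for EOP symbols on the parse stack */
static int eop_depth;
static indices ix = { -1, -1, 0, 1 }; /* state after initializing the semantic stack */

void reset_indices(void)
{
    eop_depth = 0;
    ix = (indices){ -1, -1, 0, 1 };
}

/* Predict a production whose RHS has m non-action symbols.
 * Returns the index set saved in the EOP symbol. */
indices predict(int m)
{
    assert(eop_depth < MAXDEPTH);
    eop_stack[eop_depth++] = ix;      /* push EOP(left, right, current, top) */
    ix.left = ix.current;             /* LHS slot: $$ */
    ix.right = ix.top;                /* first RHS slot: $1 */
    ix.top += m;                      /* reserve one slot per RHS symbol */
    ix.current = ix.right;
    return eop_stack[eop_depth - 1];
}

/* Match a terminal: its token information goes in the current slot. */
void match_terminal(void)
{
    ix.current++;
}

/* EOP reached: restore the indices and step past the finished nonterminal. */
void end_of_production(void)
{
    assert(eop_depth > 0);
    ix = eop_stack[--eop_depth];
    ix.current++;
}
```

Replaying the first predictions of `begin A := ...` with the RHS lengths 2, 3, 2, 4, 1 should reproduce `EOP(-1,-1,0,1)`, `EOP(0,1,1,3)`, `EOP(1,3,4,6)`, `EOP(4,6,6,8)`, and `EOP(6,8,8,12)`. Since `top_index` only grows while productions are open, the slots of every pending right-hand side stay allocated, which is the space cost the later quiz asks about.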

action: predict <system goal> → <program> $ #finish
parse stack (top first): <program>, $, #finish, EOP(-1,-1,0,1)
semantic stack (top first, index: entry): 3: (empty) [top_index]; 2: $; 1: <program> [right_index, current_index]; 0: <system goal> [left_index]

action: predict <program> → #start begin <statement list> end
parse stack (top first): #start, "begin", <stmt list>, "end", EOP(0,1,1,3), $, #finish, EOP(-1,-1,0,1)
semantic stack (top first, index: entry): 6: (empty) [top_index]; 5: "end"; 4: <stmt list>; 3: "begin" [right_index, current_index]; 2: $; 1: <program> [left_index]; 0: <system goal>

action: do #start; match begin
parse stack (top first): <stmt list>, "end", EOP(0,1,1,3), $, #finish, EOP(-1,-1,0,1)
semantic stack (top first, index: entry): 6: (empty) [top_index]; 5: "end"; 4: <stmt list> [current_index]; 3: "begin" [right_index]; 2: $; 1: <program> [left_index]; 0: <system goal>

action: predict <statement list> → <statement> <statement tail>
parse stack (top first): <statement>, <stmt tail>, EOP(1,3,4,6), "end", EOP(0,1,1,3), $, #finish, EOP(-1,-1,0,1)
semantic stack (top first, index: entry): 8: (empty) [top_index]; 7: <stmt tail>; 6: <statement> [right_index, current_index]; 5: "end"; 4: <stmt list> [left_index]; 3: "begin"; 2: $; 1: <program>; 0: <system goal>

action: predict <statement> → <ident> := <expression>; #assign($1,$3)
parse stack (top first): <ident>, ":=", <expression>, ";", #assign($1,$3), EOP(4,6,6,8), <stmt tail>, EOP(1,3,4,6), "end", EOP(0,1,1,3), $, #finish, EOP(-1,-1,0,1)
semantic stack (top first, index: entry): 12: (empty) [top_index]; 11: ";"; 10: <expression>; 9: ":="; 8: <ident> [right_index, current_index]; 7: <stmt tail>; 6: <statement> [left_index]; 5: "end"; 4: <stmt list>; 3: "begin"; 2: $; 1: <program>; 0: <system goal>

action: predict <ident> → ID #process_id($$)
parse stack (top first): ID, #proc_id($$), EOP(6,8,8,12), ":=", <expression>, ";", #assign($1,$3), EOP(4,6,6,8), <stmt tail>, EOP(1,3,4,6), "end", EOP(0,1,1,3), $, #finish, EOP(-1,-1,0,1)
semantic stack (top first, index: entry): 13: (empty) [top_index]; 12: ID [right_index, current_index]; 11: ";"; 10: <expression>; 9: ":="; 8: <ident> [left_index]; 7: <stmt tail>; 6: <statement>; 5: "end"; 4: <stmt list>; 3: "begin"; 2: $; 1: <program>; 0: <system goal>

action: match ID
parse stack (top first): #proc_id($$), EOP(6,8,8,12), ":=", <expression>, ";", #assign($1,$3), EOP(4,6,6,8), <stmt tail>, EOP(1,3,4,6), "end", EOP(0,1,1,3), $, #finish, EOP(-1,-1,0,1)
semantic stack (top first, index: entry): 13: (empty) [top_index, current_index]; 12: ID [right_index]; 11: ";"; 10: <expression>; 9: ":="; 8: <ident> [left_index]; 7: <stmt tail>; 6: <statement>; 5: "end"; 4: <stmt list>; 3: "begin"; 2: $; 1: <program>; 0: <system goal>

action: do #proc_id; restore EOP(6,8,8,12)
parse stack (top first): ":=", <expression>, ";", #assign($1,$3), EOP(4,6,6,8), <stmt tail>, EOP(1,3,4,6), "end", EOP(0,1,1,3), $, #finish, EOP(-1,-1,0,1)
semantic stack (top first, index: entry): 12: (empty) [top_index]; 11: ";"; 10: <expression>; 9: ":=" [current_index]; 8: <ident> [right_index]; 7: <stmt tail>; 6: <statement> [left_index]; 5: "end"; 4: <stmt list>; 3: "begin"; 2: $; 1: <program>; 0: <system goal>

action: match :=
parse stack (top first): <expression>, ";", #assign($1,$3), EOP(4,6,6,8), <stmt tail>, EOP(1,3,4,6), "end", EOP(0,1,1,3), $, #finish, EOP(-1,-1,0,1)
semantic stack (top first, index: entry): 12: (empty) [top_index]; 11: ";"; 10: <expression> [current_index]; 9: ":="; 8: <ident> [right_index]; 7: <stmt tail>; 6: <statement> [left_index]; 5: "end"; 4: <stmt list>; 3: "begin"; 2: $; 1: <program>; 0: <system goal>

action: predict <expression> → <primary> #copy($1,$2) <primary tail> #copy($2,$$)
parse stack (top first): <primary>, #copy($1,$2), <primary tail>, #copy($2,$$), EOP(6,8,10,12), ";", #assign($1,$3), EOP(4,6,6,8), <stmt tail>, EOP(1,3,4,6), "end", EOP(0,1,1,3), $, #finish, EOP(-1,-1,0,1)
semantic stack (top first, index: entry): 14: (empty) [top_index]; 13: <primary tail>; 12: <primary> [right_index, current_index]; 11: ";"; 10: <expression> [left_index]; 9: ":="; 8: <ident>; 7: <stmt tail>; 6: <statement>; 5: "end"; 4: <stmt list>; 3: "begin"; 2: $; 1: <program>; 0: <system goal>
QUIZ: space complexity
QUIZ: how to improve?
QUIZ: comparison, LL vs. LR, action-controlled vs. parser-controlled

Intermediate code
to separate high-level language-dependent realizations from low-level machine-dependent realizations
forms of intermediate codes
    postfix notation
    three-address code
        1 opcode + 2 input operands + 1 result operand
    abstract syntax tree (DAG)
QUIZ: comparison

Example tuple language (1/3)
an alternative representation of three-address code
varying number of operands
    ADDI, ADDF, SUBI, SUBF, MULTI, MULTF, DIVI, DIVF, MOD, REM, EXPI, EXPF, AND, OR, XOR, EQ, NE, GT, GE, LT, LE
                 RESULT := ARG1 OP ARG2
    UMINUS       ARG2 := -ARG1
    NOT          ARG2 := not ARG1
    ASSIGN       ARG3 := ARG1, size is ARG2
    FLOAT        ARG2 := FLOAT(ARG1) [ARG1 is an integer]
    ADDRESS      ARG2 := the address of ARG1
    RANGETEST    abort execution if ARG3 < ARG1 or ARG3 > ARG2
    LABEL        ARG1 is used to label the next tuple
    JUMP         jump to tuple labeled ARG1
    JUMP0        jump to ARG2 if ARG1 = 0
    JUMP1        jump to ARG2 if ARG1 = 1
    CASEJUMP     ARG1 is case selector expression
    CASELABEL    ARG1 is a case statement label
    CASERANGE    ARG1 is lower bound of label range, ARG2 is upper bound of range

    CASEEND      no arguments, end of case statement
    PROCENTRY    enter subprogram at nesting level ARG1
    PROCEXIT     exit subprogram at nesting level ARG1
    STARTCALL    ARG1 is temporary to reference activation record
    REFPARAM     ARG1 is actual parameter, ARG2 is parameter offset, ARG3 is reference to activation record
    COPYIN       ARG1 is actual parameter
    COPYOUT      ARG1 is actual parameter
    COPYINOUT    ARG1 is actual parameter
    PROCJUMP     ARG1 is subprogram start address (a label), ARG2 is reference to activation record

    source program                 tuples
    begin
      read(A,B);                   (READI, A)
                                   (READI, B)
      if A > B then                (GT, A, B, t1)
                                   (JUMP0, t1, L1)
        C := A + 5;                (ADDI, A, 5, C)
                                   (JUMP, L2)
      else                         (LABEL, L1)
        C := B + 5;                (ADDI, B, 5, C)
      end if;                      (LABEL, L2)
      write(2 * (C - 1));          (SUBI, C, 1, t2)
                                   (MULTI, 2, t2, t3)
                                   (WRITEI, t3)
    end
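To make the meaning of these tuples concrete, the sketch below interprets just the opcodes used in this example. It is an illustration under assumptions: the `tuple` struct, the string-based operand encoding, `run_tuples`, and the convention that `READI` takes values from an input array while the last `WRITEI` value is returned are all hypothetical choices, not part of the tuple language as defined on the slides.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef struct { const char *op, *arg1, *arg2, *arg3; } tuple;

/* The tuple sequence from the example above. */
const tuple example_program[] = {
    { "READI", "A", NULL, NULL },  { "READI", "B", NULL, NULL },
    { "GT", "A", "B", "t1" },      { "JUMP0", "t1", "L1", NULL },
    { "ADDI", "A", "5", "C" },     { "JUMP", "L2", NULL, NULL },
    { "LABEL", "L1", NULL, NULL }, { "ADDI", "B", "5", "C" },
    { "LABEL", "L2", NULL, NULL }, { "SUBI", "C", "1", "t2" },
    { "MULTI", "2", "t2", "t3" },  { "WRITEI", "t3", NULL, NULL },
};

#define MAXVARS 16
static struct { const char *name; int value; } vars[MAXVARS];
static int var_count;

/* Storage for a variable or temporary, created on first use. */
static int *slot(const char *name)
{
    for (int i = 0; i < var_count; i++)
        if (strcmp(vars[i].name, name) == 0)
            return &vars[i].value;
    assert(var_count < MAXVARS);
    vars[var_count].name = name;
    vars[var_count].value = 0;
    return &vars[var_count++].value;
}

/* An operand is either an integer literal or a variable name. */
static int value_of(const char *operand)
{
    return isdigit((unsigned char)operand[0]) ? atoi(operand) : *slot(operand);
}

static int find_label(const tuple *code, int n, const char *label)
{
    for (int i = 0; i < n; i++)
        if (strcmp(code[i].op, "LABEL") == 0 && strcmp(code[i].arg1, label) == 0)
            return i;
    assert(!"undefined label");
    return n;
}

/* Execute n tuples; returns the value of the last WRITEI. */
int run_tuples(const tuple *code, int n, const int *input)
{
    int written = 0;
    var_count = 0;
    for (int pc = 0; pc < n; pc++) {
        const tuple *t = &code[pc];
        if (strcmp(t->op, "READI") == 0) {
            *slot(t->arg1) = *input++;
        } else if (strcmp(t->op, "WRITEI") == 0) {
            written = value_of(t->arg1);
        } else if (strcmp(t->op, "JUMP") == 0) {
            pc = find_label(code, n, t->arg1);
        } else if (strcmp(t->op, "JUMP0") == 0) {
            if (value_of(t->arg1) == 0)
                pc = find_label(code, n, t->arg2);
        } else if (strcmp(t->op, "LABEL") != 0) {
            int a = value_of(t->arg1), b = value_of(t->arg2), r = 0;
            if (strcmp(t->op, "GT") == 0) r = a > b;
            else if (strcmp(t->op, "ADDI") == 0) r = a + b;
            else if (strcmp(t->op, "SUBI") == 0) r = a - b;
            else if (strcmp(t->op, "MULTI") == 0) r = a * b;
            else assert(!"opcode not covered by this sketch");
            *slot(t->arg3) = r;
        }
    }
    return written;
}
```

With inputs A = 7 and B = 2 the `then` branch sets C to 12 and the program writes 22; with A = 2 and B = 9 the `JUMP0` is taken, C becomes 14, and 26 is written.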
