Code Generation
How to produce intermediate or target code

Outline
- Intermediate code
  - Three-address code
  - P-code
- Code generation techniques
  - Using attribute grammar
  - Tree traversal
  - Macro expansion
- Address calculation
- Code generation for control statements
- Code generation for logical expressions

Intermediate Code
- Intermediate representation for programs
- Why intermediate code
  - Reduces the amount of work if optimization is done on the intermediate code
  - Makes retargeting compilers easier
- Forms of intermediate code
  - Abstract syntax tree
  - Linearization of the abstract syntax tree

Three-Address Code: Form of Instructions
- General form: x = y op z
- x, y, and z are addresses of
  - variables (perhaps temporaries)
  - locations in programs
- y and z must be different from x
- Operators can be
  - arithmetic operators (3-address)
  - relational operators (3-address)
  - conditional jump (2-address): if_false … goto …
  - jump (1-address)
  - input/output
  - halt
- Labels can be assigned to a location

Example of 3-Address Code
Source program:
    read (x);
    if x>0 then {
      fact:=1;
      repeat {
        fact:=fact*x;
        x:=x-1;
      } until x==0;
      write(fact);
    }
3-address code:
    read x
    t1=x>0
    if_false t1 goto L1
    fact=1
    label L2
    t2=fact*x
    fact=t2
    t3=x-1
    x=t3
    t4=x==0
    if_false t4 goto L2
    write fact
    label L1
    halt
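
To check that the three-address listing above really computes a factorial, it can be run through a small interpreter. The following is a minimal sketch, not part of the slides: the function names `run` and `value`, the whitespace-separated token format, and the use of a Python dictionary as memory are all assumptions made for illustration.

```python
import operator

# Binary operators that appear in the factorial listing.
OPS = {"*": operator.mul, "-": operator.sub, "+": operator.add,
       ">": operator.gt, "==": operator.eq}

def value(token, env):
    """A token is either an integer literal or a variable/temporary name."""
    return int(token) if token.lstrip("-").isdigit() else env[token]

def run(program, inputs):
    """Interpret three-address instructions; return the list of written values."""
    lines = [line.split() for line in program.strip().splitlines()]
    labels = {ins[1]: i for i, ins in enumerate(lines) if ins[0] == "label"}
    env, out, pc, inputs = {}, [], 0, list(inputs)
    while True:
        ins = lines[pc]
        pc += 1
        if ins[0] == "halt":
            return out
        elif ins[0] == "label":
            continue                          # labels only mark a location
        elif ins[0] == "read":
            env[ins[1]] = inputs.pop(0)
        elif ins[0] == "write":
            out.append(value(ins[1], env))
        elif ins[0] == "if_false":            # if_false t goto L
            if not value(ins[1], env):
                pc = labels[ins[3]]
        elif ins[0] == "goto":
            pc = labels[ins[1]]
        elif len(ins) == 5:                   # x = y op z
            env[ins[0]] = OPS[ins[3]](value(ins[2], env), value(ins[4], env))
        else:                                 # x = y
            env[ins[0]] = value(ins[2], env)

FACTORIAL = """
read x
t1 = x > 0
if_false t1 goto L1
fact = 1
label L2
t2 = fact * x
fact = t2
t3 = x - 1
x = t3
t4 = x == 0
if_false t4 goto L2
write fact
label L1
halt
"""

print(run(FACTORIAL, [5]))  # [120]
```

With an input of 0 the first `if_false` jumps straight to L1, so nothing is written.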

P-Code
- Code for a hypothetical stack machine
- Designed for Pascal compilers
- No variable names are required for intermediate results
- Instructions
  - Load onto the stack
  - Arithmetic and relational operators
  - Jumps
- Operations are performed on the topmost values on the stack
- 0-address or 1-address instructions

P-Code Instructions
- Load: push onto the stack
  - Load value
  - Load address
  - Load constant
- Store: save the top of the stack in memory
  - Destructive store
  - Nondestructive store
- Arithmetic operations
  - Add
  - Subtract
  - Multiply
- Compare
  - Greater
  - Less
  - Equal
- Label
- Jump
  - Unconditional jump
  - Conditional jump
- I/O
  - Read
  - Write
- Stop

Example of P-Code
Source program:
    read (x);
    if x>0 then {
      fact:=1;
      repeat {
        fact:=fact*x;
        x:=x-1;
      } until x==0;
      write(fact);
    }
P-code:
    loada x
    read
    loadv x
    loadc 0
    greater
    jumpF L1
    loada fact
    loadc 1
    store
    label L2
    loada fact
    loadv fact
    loadv x
    mult
    store
    loada x
    loadv x
    loadc 1
    sub
    store
    loadv x
    loadc 0
    equ
    jumpF L2
    loadv fact
    write
    label L1
    stop
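
The stack discipline of the P-code listing can be made concrete with a toy stack machine. This is a sketch under stated assumptions: the function name `run_pcode` is hypothetical, a variable's name stands in for its address, `store` is treated as a destructive store that pops both the value and the address, and `jumpF` is the conditional jump taken when the popped value is false.

```python
def run_pcode(program, inputs):
    """Execute P-code on a simple stack machine; return the written values."""
    code = [line.split() for line in program.strip().splitlines()]
    labels = {ins[1]: i for i, ins in enumerate(code) if ins[0] == "label"}
    binary = {"mult": lambda a, b: a * b, "sub": lambda a, b: a - b,
              "greater": lambda a, b: a > b, "equ": lambda a, b: a == b}
    mem, stack, out, pc, inputs = {}, [], [], 0, list(inputs)
    while True:
        op, *arg = code[pc]
        pc += 1
        if op == "stop":
            return out
        elif op == "loada":
            stack.append(arg[0])          # the name stands in for the address
        elif op == "loadv":
            stack.append(mem[arg[0]])
        elif op == "loadc":
            stack.append(int(arg[0]))
        elif op == "store":               # pops the value, then the address
            val = stack.pop()
            addr = stack.pop()
            mem[addr] = val
        elif op == "read":                # pops the address to read into
            mem[stack.pop()] = inputs.pop(0)
        elif op == "write":
            out.append(stack.pop())
        elif op in binary:                # operands are the two topmost values
            b = stack.pop()
            a = stack.pop()
            stack.append(binary[op](a, b))
        elif op == "jumpF":
            if not stack.pop():
                pc = labels[arg[0]]
        elif op == "jump":
            pc = labels[arg[0]]
        # "label" needs no action at run time

FACTORIAL_PCODE = """
loada x
read
loadv x
loadc 0
greater
jumpF L1
loada fact
loadc 1
store
label L2
loada fact
loadv fact
loadv x
mult
store
loada x
loadv x
loadc 1
sub
store
loadv x
loadc 0
equ
jumpF L2
loadv fact
write
label L1
stop
"""

print(run_pcode(FACTORIAL_PCODE, [3]))  # [6]
```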

Code Generation Using Synthesized Attributes
- An attribute is created for the sequence of characters representing the generated code.
- An attribute grammar is written to generate the intermediate/target code.
- The value of the attribute is passed from child nodes up to their parent node to construct a larger chunk of code.

Attribute Grammar for P-code
(|| joins two strings with a space; ++ joins two code sequences with a newline.)

Grammar rule               Semantic rule
exp1 -> id = exp2          exp1.code = "loada" || id.strval ++ exp2.code ++ "stn"
exp -> aexp                exp.code = aexp.code
aexp1 -> aexp2 + factor    aexp1.code = aexp2.code ++ factor.code ++ "adi"
aexp -> factor             aexp.code = factor.code
factor -> ( exp )          factor.code = exp.code
factor -> num              factor.code = "loadc" || num.strval
factor -> id               factor.code = "loadv" || id.strval

Generating P-Code: Example
Expression: (x = x + 3) + 4
[Figure: parse tree annotated with the code attribute at each node. The values, bottom-up:]

Node                       code
factor -> id (x)           loadv x
factor -> num (3)          loadc 3
aexp (x + 3)               loadv x, loadc 3, adi
exp (x = x + 3)            loada x, loadv x, loadc 3, adi, stn
factor ( (x = x + 3) )     loada x, loadv x, loadc 3, adi, stn
factor -> num (4)          loadc 4
exp (root)                 loada x, loadv x, loadc 3, adi, stn, loadc 4, adi
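
The bottom-up construction of the code attribute can be sketched as a function that returns, for each node, the code of its children combined with the node's own instruction. The tuple encoding of the tree and the function name `pcode` are assumptions for illustration; the chain rules (exp -> aexp, aexp -> factor, factor -> ( exp )) only copy the attribute, so they do not appear as separate node kinds.

```python
def pcode(node):
    """Return the synthesized code attribute of a node as a list of instructions."""
    kind = node[0]
    if kind == "assign":                  # exp1 -> id = exp2
        _, name, rhs = node
        return ["loada " + name] + pcode(rhs) + ["stn"]
    if kind == "plus":                    # aexp1 -> aexp2 + factor
        _, left, right = node
        return pcode(left) + pcode(right) + ["adi"]
    if kind == "num":                     # factor -> num
        return ["loadc " + node[1]]
    if kind == "id":                      # factor -> id
        return ["loadv " + node[1]]
    raise ValueError(f"unknown node kind: {kind}")

# (x = x + 3) + 4
tree = ("plus",
        ("assign", "x", ("plus", ("id", "x"), ("num", "3"))),
        ("num", "4"))

print(pcode(tree))
# ['loada x', 'loadv x', 'loadc 3', 'adi', 'stn', 'loadc 4', 'adi']
```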

Attribute Grammar for 3-address Code

Grammar rule               Semantic rules
exp1 -> id = exp2          exp1.name = exp2.name;
                           exp1.code = exp2.code ++ id.strval || "=" || exp2.name
exp -> aexp                exp.name = aexp.name;
                           exp.code = aexp.code
aexp1 -> aexp2 + factor    aexp1.name = newtemp();
                           aexp1.code = aexp2.code ++ factor.code ++
                                        aexp1.name || "=" || aexp2.name || "+" || factor.name
aexp -> factor             aexp.name = factor.name;
                           aexp.code = factor.code
factor -> ( exp )          factor.name = exp.name;
                           factor.code = exp.code
factor -> num              factor.name = num.strval;
                           factor.code = ""
factor -> id               factor.name = id.strval;
                           factor.code = ""
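
Generating 3-address Code: Example
Expression: (x = x + 3) + 4
[Figure: parse tree annotated with the name and code attributes at each node. The values, bottom-up:]

Node                       name    code
factor -> id (x)           x       (empty)
factor -> num (3)          3       (empty)
aexp (x + 3)               t1      t1=x+3
exp (x = x + 3)            t1      t1=x+3, x=t1
factor ( (x = x + 3) )     t1      t1=x+3, x=t1
factor -> num (4)          4       (empty)
exp (root)                 t2      t1=x+3, x=t1, t2=t1+4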
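
The two attributes, name and code, can be computed together by returning a pair from each node. This is a minimal sketch: the tuple encoding of the tree and the factory `make_generator` are assumptions for illustration, and `newtemp` is modeled as a simple counter.

```python
import itertools

def make_generator():
    """Return a gen(node) function with its own temporary counter."""
    counter = itertools.count(1)

    def newtemp():
        return f"t{next(counter)}"

    def gen(node):
        """Return (name, code): where the value lives and how to compute it."""
        kind = node[0]
        if kind in ("num", "id"):         # factor -> num | id: no code needed
            return node[1], []
        if kind == "plus":                # aexp1 -> aexp2 + factor
            lname, lcode = gen(node[1])
            rname, rcode = gen(node[2])
            name = newtemp()
            return name, lcode + rcode + [f"{name}={lname}+{rname}"]
        if kind == "assign":              # exp1 -> id = exp2
            rname, rcode = gen(node[2])
            return rname, rcode + [f"{node[1]}={rname}"]
        raise ValueError(f"unknown node kind: {kind}")

    return gen

# (x = x + 3) + 4
tree = ("plus",
        ("assign", "x", ("plus", ("id", "x"), ("num", "3"))),
        ("num", "4"))

name, code = make_generator()(tree)
print(name, code)  # t2 ['t1=x+3', 'x=t1', 't2=t1+4']
```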

Practical Code Generation: Tree Traversal
    procedure genCode ( t : node )
    { if ( t is not null ) then
      { generate code to prepare for code of left child;
        genCode ( left child of t );
        generate code to prepare for code of right child;
        genCode ( right child of t );
        generate code to implement the action of t;
      }
    }

Practical Code Generation for P-code
    gencode ( T : Tree )
    { if ( T is not null )
      { switch ( T.type )
        { case plusnode :
            { gencode ( T.lchild );
              gencode ( T.rchild );
              emitcode ( "adi" );
              break; }
          case asgnnode :
            { emitcode ( "loada", T.strval );
              gencode ( T.rchild );
              emitcode ( "stn" );
              break; }
          case constnode :
            { emitcode ( "loadc", T.strval ); break; }
          case idnode :
            { emitcode ( "loadv", T.strval ); break; }
          default :
            { emitcode ( "error" ); }
        }
      }
    }
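
The procedure above translates almost directly into a runnable form. The sketch below assumes a hypothetical `Node` class with the fields used in the pseudocode (`type`, `strval`, `lchild`, `rchild`) and passes the emit routine as a parameter; `adi` is used for integer addition, as in the other P-code listings.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    type: str                             # "plus", "asgn", "const" or "id"
    strval: str = ""
    lchild: Optional["Node"] = None
    rchild: Optional["Node"] = None

def gencode(t, emit):
    """Post-order traversal that emits P-code through the emit callback."""
    if t is None:
        return
    if t.type == "plus":
        gencode(t.lchild, emit)
        gencode(t.rchild, emit)
        emit("adi")
    elif t.type == "asgn":
        emit("loada " + t.strval)         # address first, then the value
        gencode(t.rchild, emit)
        emit("stn")
    elif t.type == "const":
        emit("loadc " + t.strval)
    elif t.type == "id":
        emit("loadv " + t.strval)
    else:
        emit("error")

# x = y + 2
tree = Node("asgn", "x",
            rchild=Node("plus", lchild=Node("id", "y"), rchild=Node("const", "2")))
out = []
gencode(tree, out.append)
print(out)  # ['loada x', 'loadv y', 'loadc 2', 'adi', 'stn']
```

Unlike the attribute-grammar formulation, nothing is passed back up the tree here: the code is emitted as a side effect in the order the nodes are visited.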

Macro Expansion
- From intermediate code to target code
- Example
  3-address code:
      x=y+n
  P-code:
      loada x
      loadv y
      loadc n
      adi
      sto
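
Macro expansion replaces each intermediate instruction with a fixed template of target instructions. The sketch below handles only the two forms x=y+z and x=y; the function names `expand` and `operand`, and the rule that an all-digit operand is a constant, are assumptions made for illustration.

```python
import re

def operand(tok):
    """Constants are pushed with loadc, variables with loadv."""
    return ("loadc " if tok.isdigit() else "loadv ") + tok

def expand(instr):
    """Expand one three-address instruction (x=y+z or x=y) into P-code."""
    m = re.fullmatch(r"\s*(\w+)\s*=\s*(\w+)\s*(?:\+\s*(\w+)\s*)?", instr)
    if m is None:
        raise ValueError(f"unsupported instruction: {instr!r}")
    target, y, z = m.groups()
    out = ["loada " + target, operand(y)]
    if z is not None:
        out += [operand(z), "adi"]
    out.append("sto")
    return out

print(expand("x=y+3"))
# ['loada x', 'loadv y', 'loadc 3', 'adi', 'sto']
```

Because every instruction is expanded in isolation, a sequence such as t1=x+3 followed by x=t1 stores t1 and immediately reloads it, so the result is typically longer than P-code generated directly from the tree.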

Address Calculation
- Addressing operations
- Array references
- Record structure references
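
Addressing Operations
- 3-address code
  - address of x: &x
  - indirect address (content pointed to by x): *x
- P-code
  - address of x: loada x
  - indirect load: ind x (replaces the address on top of the stack with the content at address top + x)
  - indexed address: ixa x (replaces the two topmost entries with the address top*x + (top-1), where top is the index and top-1 is the base address)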

Array References
3-address code
  x=a[i]
      t1=i*elesize(a)      (find offset)
      t2=&a+t1             (find address)
      x=*t2                (find content)
  a[i]=x
      t1=i*elesize(a)      (find offset)
      t2=&a+t1             (find address)
      *t2=x
P-code
  x=a[i]
      loada x
      loada a
      loadv i
      ixa elesize(a)
      ind 0
      sto
  a[i]=x
      loada a
      loadv i
      ixa elesize(a)
      loadv x
      sto
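
Both array forms share the same offset and address steps and differ only in the final instruction. A minimal sketch of a generator that factors out the shared part follows; the class `Gen` and its method names are hypothetical, and `elesize(a)` is left symbolic exactly as in the listing.

```python
import itertools

class Gen:
    """Collects three-address code and hands out fresh temporaries."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.code = []

    def newtemp(self):
        return f"t{next(self._counter)}"

    def element_address(self, array, index):
        """Emit the offset and address computation; return the address temp."""
        offset = self.newtemp()
        self.code.append(f"{offset}={index}*elesize({array})")   # find offset
        addr = self.newtemp()
        self.code.append(f"{addr}=&{array}+{offset}")            # find address
        return addr

    def load_element(self, target, array, index):                # target=array[index]
        addr = self.element_address(array, index)
        self.code.append(f"{target}=*{addr}")                    # find content

    def store_element(self, array, index, source):               # array[index]=source
        addr = self.element_address(array, index)
        self.code.append(f"*{addr}={source}")

g = Gen()
g.load_element("x", "a", "i")
print(g.code)  # ['t1=i*elesize(a)', 't2=&a+t1', 'x=*t2']
```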

More Complex Array References
3-address code for a[i+1]=a[j+2]+3
      t1=j+2
      t2=t1*elesize(a)
      t3=&a+t2
      t4=*t3               (t2 to t4: t4=a[t1])
      t5=t4+3
      t6=i+1
      t7=t6*elesize(a)
      t8=&a+t7
      *t8=t5               (t7 to *t8: a[t6]=t5)
P-code for a[i+1]=a[j+2]+3
      loada a
      loadv i
      loadc 1
      adi
      ixa elesize(a)       (top of stack: &a[i+1])
      loada a
      loadv j
      loadc 2
      adi
      ixa elesize(a)
      ind 0                (top of stack: a[j+2])
      loadc 3
      adi
      sto

Structure References
    typedef struct rec {
      int i;
      char c;
      int j;
    } Entry;
    Entry x;
3-address code
  Address of a field
      t1=&x+offset(x,j)
  Content of a field
      t1=&x+offset(x,j)
      t2=*t1
P-code
  Address of a field
      loada x
      loadc offset(x,j)
      ixa 1
  Content of a field
      loada x
      ind offset(x,j)
[Figure: memory layout of x. The fields x.i, x.c, and x.j are laid out in declaration order starting at the base address of x; the offsets of x.c and x.j are measured from that base address.]
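
The field-address calculation can be tried out with Python's standard `ctypes` module, which lays out a structure the way the platform's C compiler would. This is only an illustration of t1=&x+offset(x,j) and t2=*t1; the actual offsets depend on the platform's alignment rules, so none are hard-coded here.

```python
import ctypes

class Entry(ctypes.Structure):
    # Same fields, in the same order, as the typedef above.
    _fields_ = [("i", ctypes.c_int), ("c", ctypes.c_char), ("j", ctypes.c_int)]

x = Entry(i=1, c=b"a", j=42)

base = ctypes.addressof(x)                 # &x
offset_j = Entry.j.offset                  # offset(x,j), chosen by the platform ABI
t1 = base + offset_j                       # t1 = &x + offset(x,j)
t2 = ctypes.c_int.from_address(t1).value   # t2 = *t1

print(Entry.i.offset, Entry.c.offset, offset_j)
print(t2)  # 42
```

On many platforms padding is inserted after the `char` field so that `j` is aligned, which is why the offset of `j` is looked up rather than computed by adding field sizes.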

Code Generation for Control Statements
- If statements
- While statements
- Logical expressions

Code Generation for If-Statements
IF ( E ) S1 ELSE S2
3-address code
      <code evaluating E and assigning to t1>
      if_false t1 goto L1
      <code for S1>
      goto L2
      label L1
      <code for S2>
      label L2
P-code
      <code evaluating E>
      jumpF L1
      <code for S1>
      jump L2
      label L1
      <code for S2>
      label L2
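
The if-statement template can be written as a function that wraps already-generated code for E, S1, and S2 with the jumps and labels. This is a sketch; `make_newlabel` and `gen_if` are hypothetical helper names, and the code for the three parts is passed in as lists of three-address instructions.

```python
import itertools

def make_newlabel():
    """Return a function that yields fresh labels L1, L2, ..."""
    counter = itertools.count(1)
    return lambda: f"L{next(counter)}"

def gen_if(cond_code, cond_name, then_code, else_code, newlabel):
    """Three-address code for IF (E) S1 ELSE S2; E's value is in cond_name."""
    l_else, l_end = newlabel(), newlabel()
    return (cond_code
            + [f"if_false {cond_name} goto {l_else}"]
            + then_code
            + [f"goto {l_end}", f"label {l_else}"]
            + else_code
            + [f"label {l_end}"])

for line in gen_if(["t1=x>0"], "t1", ["y=1"], ["y=2"], make_newlabel()):
    print(line)
```

Sharing one `newlabel` function across nested statements keeps all labels in a program distinct.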

Code Generation for While-Statements
WHILE ( E ) S
3-address code
      label L1
      <code evaluating E and assigning to t1>
      if_false t1 goto L2
      <code for S>
      goto L1
      label L2
P-code
      label L1
      <code evaluating E>
      jumpF L2
      <code for S>
      jump L1
      label L2

Generating Labels
- Forward jump
  - A jump to the label must be generated before the label is defined
- For intermediate code
  - Generate the label at the jump instruction
  - Store the label until the actual location is found
- For target code (assembly)
  - Leave the destination address in the jump instruction blank
  - When the destination address is found, go back and fill in the address (backpatching)
  - Short jump or long jump?
    - Leave enough space for a long jump
    - If only a short jump is required, the extra space is filled with nop
Example (while-statement; jumpF L2 is a forward jump, jump L1 is a backward jump):
      label L1
      <code evaluating E>
      jumpF L2
      <code for S>
      jump L1
      label L2
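
Backpatching for the while-statement can be sketched with numeric jump targets, where an instruction's address is simply its index in the output list. The function name `gen_while` and the list-of-lists instruction encoding are assumptions; the short/long jump sizing issue is not modeled.

```python
def gen_while(cond_code, body_code):
    """Emit code for WHILE (E) S with numeric jump targets."""
    code = []
    loop_start = len(code)                # backward target: already known
    code.extend(cond_code)
    hole = len(code)
    code.append(["jumpF", None])          # forward jump: target not yet known
    code.extend(body_code)
    code.append(["jump", loop_start])
    code[hole][1] = len(code)             # backpatch: fill in the exit address
    return code

cond = [["loadv", "x"]]
body = [["loada", "x"], ["loadv", "x"], ["loadc", 1], ["sub"], ["store"]]
for addr, ins in enumerate(gen_while(cond, body)):
    print(addr, *ins)
```

The backward jump needs no patching because its target was recorded before the condition code was emitted; only the forward `jumpF` has to wait until the end of the loop is reached.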

Code Generation for Logical Expressions
- Data types and operators
  - Use the Boolean data type and operators if they are included in the target/intermediate language
  - Otherwise, use integers 0/1 and bitwise operators
- Short-circuit evaluation
  - IF a AND b THEN S
    - If a is false, the whole expression is false, so there is no need to evaluate b
  - IF a OR b THEN S
    - If a is true, the whole expression is true, so there is no need to evaluate b
Intermediate code for IF a AND b THEN S
      if_false a goto L1
      if_false b goto L1
      <code for S>
      label L1
Intermediate code for IF a OR b THEN S
      if_false a goto L1
      goto L2
      label L1
      if_false b goto L3
      label L2
      <code for S>
      label L3
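
The two templates generalize to nested AND/OR expressions if the condition code is generated with a "jump here when false" label passed down the tree. The following is a sketch under assumptions: conditions are either a variable name or a tuple ("and" | "or", left, right), and the helper names `gen_cond` and `gen_if_then` are hypothetical. Labels are numbered in allocation order, so the numbering may differ from the listings above while the structure is the same.

```python
import itertools

def make_newlabel():
    counter = itertools.count(1)
    return lambda: f"L{next(counter)}"

def gen_cond(expr, false_label, newlabel):
    """Emit code that falls through if expr is true, else jumps to false_label."""
    if isinstance(expr, str):                      # a plain Boolean variable
        return [f"if_false {expr} goto {false_label}"]
    op, left, right = expr
    if op == "and":                                # left false => skip right
        return (gen_cond(left, false_label, newlabel)
                + gen_cond(right, false_label, newlabel))
    if op == "or":                                 # left true => skip right
        l_next, l_true = newlabel(), newlabel()
        return (gen_cond(left, l_next, newlabel)
                + [f"goto {l_true}", f"label {l_next}"]
                + gen_cond(right, false_label, newlabel)
                + [f"label {l_true}"])
    raise ValueError(f"unknown operator: {op}")

def gen_if_then(expr, body_code):
    """Intermediate code for IF expr THEN S."""
    newlabel = make_newlabel()
    l_false = newlabel()
    return gen_cond(expr, l_false, newlabel) + body_code + [f"label {l_false}"]

for line in gen_if_then(("and", "a", "b"), ["<code for S>"]):
    print(line)
```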