Code Generation and Data Types
Translation and Address Calculation
Outline
- Intermediate codes
  - 3-address code
  - P-code
- Code generation, using attribute grammar
- Data types
  - Primitive data types
  - Arrays and strings
  - Records
- Address calculation
2301380 Chapter 6 Code Generation and Data Types 01/08/62
Intermediate Codes
- 3-address code
- P-code
Intermediate Code
- An intermediate representation for programs
- Why use intermediate code?
  - Less work overall if optimization is done on the intermediate code
  - Easier to retarget the compiler
- Forms of intermediate code
  - Abstract syntax tree
  - Linearization of the abstract syntax tree
3-address code
- General form: x = y op z
- x, y, and z are addresses of
  - Variables (perhaps temporaries)
  - Locations in programs
- y and z must differ from x.
- Labels can be assigned to a location.
- Operators can be:
  - Arithmetic operators (3-address)
  - Relational operators (3-address)
  - Conditional operators (2-address): if_false … goto …
  - Jump (1-address)
  - Input/Output
  - Halt
Example: 3-address code
Source program:
    read(x);
    if x>0 then {
      fact:=1;
      repeat {
        fact:=fact*x;
        x:=x-1;
      } until x==0;
      write(fact);
    }
3-address code:
    read x
    t1=x>0
    if_false t1 goto L1
    fact=1
    label L2
    t2=fact*x
    fact=t2
    t3=x-1
    x=t3
    t4=x==0
    if_false t4 goto L2
    write fact
    label L1
    halt
P-code
- Code for a hypothetical stack machine
- Designed for Pascal compilers
- No variable names are required for intermediate values, which are kept on the stack
- Instructions
  - Load onto the stack
  - Arithmetic and relational operators
  - Jumps
- Operations are performed on the topmost values on the stack
- 0-address or 1-address instructions
P-code Instructions
- Load: push onto the stack
  - Load value
  - Load address
  - Load constant
- Store: save the top of the stack in memory
  - Destructive store
  - Nondestructive store
- Arithmetic operations
  - Add
  - Subtract
  - Multiply
- Compare
  - Greater
  - Less
  - Equal
- Label
- Jump
  - Unconditional jump
  - Conditional jump
- I/O
  - Read
  - Write
- Stop
Example: P-code
    Source                  P-code
    read(x);                loada x
                            read
    if x>0 then             loadv x
                            loadc 0
                            greater
                            jumpF L1
    { fact:=1;              loada fact
                            loadc 1
                            store
      repeat                label L2
      { fact:=fact*x;       loada fact
                            loadv fact
                            loadv x
                            mult
                            store
        x:=x-1;             loada x
                            loadv x
                            loadc 1
                            sub
                            store
      } until x==0;         loadv x
                            loadc 0
                            equ
                            jumpF L2
      write(fact);          loadv fact
                            write
    }                       label L1
                            stop
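To make the stack discipline concrete, here is a minimal sketch of an interpreter for the P-code instructions used in this example. The function name `run_pcode`, the (opcode, operand) tuple encoding, and the use of variable names as addresses are assumptions made for illustration, not part of any real P-code implementation. The program is the factorial example, with every assignment ending in a store.

```python
def run_pcode(program, inputs):
    """Interpret a list of (opcode, operand) pairs; return the written values."""
    # Map each label name to the index of its 'label' pseudo-instruction.
    labels = {arg: pc for pc, (op, arg) in enumerate(program) if op == "label"}
    memory, stack, output = {}, [], []
    inputs = iter(inputs)
    binary = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mult": lambda a, b: a * b,
        "greater": lambda a, b: a > b,
        "equ": lambda a, b: a == b,
    }
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "loada":            # push the address (here: the name) of a variable
            stack.append(arg)
        elif op == "loadv":          # push the value of a variable
            stack.append(memory[arg])
        elif op == "loadc":          # push a constant
            stack.append(arg)
        elif op == "store":          # pop value, pop address, write memory
            value = stack.pop()
            address = stack.pop()
            memory[address] = value
        elif op == "read":           # pop address, read the next input into it
            memory[stack.pop()] = next(inputs)
        elif op == "write":          # pop value and output it
            output.append(stack.pop())
        elif op in binary:           # pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(binary[op](left, right))
        elif op == "jumpF":          # pop condition, jump when it is false
            if not stack.pop():
                pc = labels[arg]
        elif op == "jump":
            pc = labels[arg]
        elif op == "stop":
            break
        # 'label' itself does nothing at run time
    return output


FACTORIAL = [
    ("loada", "x"), ("read", None),
    ("loadv", "x"), ("loadc", 0), ("greater", None), ("jumpF", "L1"),
    ("loada", "fact"), ("loadc", 1), ("store", None),
    ("label", "L2"),
    ("loada", "fact"), ("loadv", "fact"), ("loadv", "x"), ("mult", None), ("store", None),
    ("loada", "x"), ("loadv", "x"), ("loadc", 1), ("sub", None), ("store", None),
    ("loadv", "x"), ("loadc", 0), ("equ", None), ("jumpF", "L2"),
    ("loadv", "fact"), ("write", None),
    ("label", "L1"), ("stop", None),
]

print(run_pcode(FACTORIAL, [5]))
```

Every operand lives on the stack between instructions, which is why no temporary names appear in the program.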
Code Generation Using Attribute Grammar
Code Generation Using Synthesized Attributes
- An attribute is created to hold the sequence of characters representing the generated code.
- An attribute grammar is written to generate the intermediate/target code.
- The value of the attribute is passed from child nodes up to their parent node, which combines them into a larger chunk of code.
Attribute Grammar: P-code
(|| joins strings on one line; ++ joins instruction sequences)
    Grammar Rules              Semantic Rules
    exp1 → id = exp2           exp1.code = "loada " || id.strval ++ exp2.code ++ "stn"
    exp → aexp                 exp.code = aexp.code
    aexp1 → aexp2 + factor     aexp1.code = aexp2.code ++ factor.code ++ "adi"
    aexp → factor              aexp.code = factor.code
    factor → ( exp )           factor.code = exp.code
    factor → num               factor.code = "loadc " || num.strval
    factor → id                factor.code = "loadv " || id.strval
Generating P-code: Example
Parse tree for (x=x+3)+4, with the code attribute synthesized at each node, listed bottom-up:
    Node                         code
    factor → id (x)              loadv x
    factor → num (3)             loadc 3
    aexp → aexp + factor         loadv x, loadc 3, adi
    exp → id = exp               loada x, loadv x, loadc 3, adi, stn
    factor → ( exp )             loada x, loadv x, loadc 3, adi, stn
    factor → num (4)             loadc 4
    aexp → aexp + factor         loada x, loadv x, loadc 3, adi, stn, loadc 4, adi
    exp → aexp (root)            loada x, loadv x, loadc 3, adi, stn, loadc 4, adi
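The bottom-up construction of the code attribute can be sketched as a recursive function that returns the code of a node built from the code of its children. The tuple-based tree encoding and the function name `pcode` are assumptions made for this sketch.

```python
def pcode(node):
    """Return the synthesized 'code' attribute of a node as a list of instructions."""
    kind = node[0]
    if kind == "assign":                      # exp1 -> id = exp2
        _, name, rhs = node
        return [f"loada {name}"] + pcode(rhs) + ["stn"]
    if kind == "plus":                        # aexp1 -> aexp2 + factor
        _, left, right = node
        return pcode(left) + pcode(right) + ["adi"]
    if kind == "num":                         # factor -> num
        return [f"loadc {node[1]}"]
    if kind == "id":                          # factor -> id
        return [f"loadv {node[1]}"]
    raise ValueError(f"unknown node kind: {kind}")


# (x = x + 3) + 4
tree = ("plus",
        ("assign", "x", ("plus", ("id", "x"), ("num", 3))),
        ("num", 4))
print("\n".join(pcode(tree)))
```

The rules for exp → aexp, aexp → factor, and factor → ( exp ) only copy the attribute upward, so they need no node of their own in this simplified tree.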
Attribute Grammar: 3-address code
    Grammar Rules              Semantic Rules
    exp1 → id = exp2           exp1.name = exp2.name;
                               exp1.code = exp2.code ++ id.strval || "=" || exp2.name
    exp → aexp                 exp.name = aexp.name; exp.code = aexp.code
    aexp1 → aexp2 + factor     aexp1.name = newtemp();
                               aexp1.code = aexp2.code ++ factor.code ++
                                            aexp1.name || "=" || aexp2.name || "+" || factor.name
    aexp → factor              aexp.name = factor.name; aexp.code = factor.code
    factor → ( exp )           factor.name = exp.name; factor.code = exp.code
    factor → num               factor.name = num.strval; factor.code = ""
    factor → id                factor.name = id.strval; factor.code = ""
Generating 3-address Code: Example
Parse tree for (x=x+3)+4, with the name and code attributes at each node, listed bottom-up:
    Node                         name    code
    factor → id (x)              x       (empty)
    factor → num (3)             3       (empty)
    aexp → aexp + factor         t1      t1=x+3
    exp → id = exp               t1      t1=x+3, x=t1
    factor → ( exp )             t1      t1=x+3, x=t1
    factor → num (4)             4       (empty)
    aexp → aexp + factor         t2      t1=x+3, x=t1, t2=t1+4
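The two attributes, name and code, can be sketched as a function that returns a pair for each node. The tuple-based tree encoding and the helper `make_generator` are assumptions for illustration; `newtemp` plays the role of the newtemp() function in the semantic rules.

```python
import itertools


def make_generator():
    """Return a gen(node) function with its own temporary counter."""
    counter = itertools.count(1)

    def newtemp():
        return f"t{next(counter)}"

    def gen(node):
        """Return (name, code): where the value lives, and the code computing it."""
        kind = node[0]
        if kind in ("num", "id"):             # factor -> num | id: no code needed
            return str(node[1]), []
        if kind == "assign":                  # exp1 -> id = exp2
            _, target, rhs = node
            name, code = gen(rhs)
            return name, code + [f"{target}={name}"]
        if kind == "plus":                    # aexp1 -> aexp2 + factor
            _, left, right = node
            left_name, left_code = gen(left)
            right_name, right_code = gen(right)
            temp = newtemp()
            return temp, left_code + right_code + [f"{temp}={left_name}+{right_name}"]
        raise ValueError(f"unknown node kind: {kind}")

    return gen


# (x = x + 3) + 4
tree = ("plus",
        ("assign", "x", ("plus", ("id", "x"), ("num", 3))),
        ("num", 4))
name, code = make_generator()(tree)
print(name)
print("\n".join(code))
```

Allocating the temporary after visiting both children makes the innermost addition receive t1, matching the numbering in the example.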
Code Generation: Tree Traversal
procedure genCode (T:node)
{ if (T != null) then
  { generate code to prepare for code of left child;
    genCode(T.leftChild);
    generate code to prepare for code of right child;
    genCode(T.rightChild);
    generate code to implement the action of T;
  }
}
Code Generation: P-code
gencode(T:Tree)
{ if (T is not null)
  { switch (T.type)
    { case plusnode:
      { gencode(T.lchild);
        gencode(T.rchild);
        emitcode("adi");
      }
      case asgnnode:
      { emitcode("loada", T.strval);
        gencode(T.lchild);   // code for the right-hand side expression
        emitcode("stn");
      }
      case constnode:
      { emitcode("loadc", T.strval); }
      case idnode:
      { emitcode("loadv", T.strval); }
      default:
      { emitcode("error"); }
    }
  }
}
Control Statements: Code Generation
- If statements
- While loops
- Logical expressions
Code Generation for If Statements
Source: IF ( E ) S1 ELSE S2
3-address code:
    <code evaluating E and assigning to t1>
    if_false t1 goto L1
    <code for S1>
    goto L2
    label L1
    <code for S2>
    label L2
P-code:
    <code evaluating E>
    jumpF L1
    <code for S1>
    jump L2
    label L1
    <code for S2>
    label L2
Code Generation for While Loops
Source: WHILE ( E ) S
3-address code:
    label L1
    <code evaluating E and assigning to t1>
    if_false t1 goto L2
    <code for S>
    goto L1
    label L2
P-code:
    label L1
    <code evaluating E>
    jumpF L2
    <code for S>
    jump L1
    label L2
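The 3-address templates for if and while statements can be sketched as functions that wrap already generated code for the condition and the bodies. The names `make_label_source`, `gen_if`, and `gen_while` are hypothetical helpers, and the code lists are plain strings for simplicity.

```python
import itertools


def make_label_source():
    """Return a function producing fresh labels L1, L2, ... (hypothetical helper)."""
    counter = itertools.count(1)
    return lambda: f"L{next(counter)}"


def gen_if(cond_code, cond_name, then_code, else_code, new_label):
    """IF ( E ) S1 ELSE S2, where cond_name holds the value of E."""
    else_label, end_label = new_label(), new_label()
    return (cond_code
            + [f"if_false {cond_name} goto {else_label}"]
            + then_code
            + [f"goto {end_label}", f"label {else_label}"]
            + else_code
            + [f"label {end_label}"])


def gen_while(cond_code, cond_name, body_code, new_label):
    """WHILE ( E ) S, where cond_name holds the value of E."""
    top_label, exit_label = new_label(), new_label()
    return ([f"label {top_label}"]
            + cond_code
            + [f"if_false {cond_name} goto {exit_label}"]
            + body_code
            + [f"goto {top_label}", f"label {exit_label}"])


loop = gen_while(["t1=x>0"], "t1", ["t2=x-1", "x=t2"], make_label_source())
print("\n".join(loop))
```

Passing the label source as a parameter keeps label numbers unique when statements are nested, since every construct draws from the same counter.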
Generating Labels
Example (while loop), where the jump to L2 is a forward jump:
    label L1
    <code for E>
    jumpF L2
    <code for S>
    jump L1
    label L2
- Forward jump
  - The jump instruction must be generated before its label is defined.
- For intermediate code
  - Generate the label at the jump instruction.
  - Store the label until the actual location is found.
- For target code (assembly)
  - Leave the destination address in the jump instruction empty.
  - When the destination address is found, go back and fill in the address (backpatching).
  - Short jump or long jump?
    - Leave enough space for a long jump.
    - If only a short jump is required, the extra space is filled with NOP.
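Backpatching can be sketched on a toy target code where each instruction is a two-element list and jump operands are absolute instruction indices. The function `emit_while` and this instruction encoding are assumptions for illustration; the sketch ignores the short/long jump question.

```python
def emit_while(cond_code, body_code):
    """Emit code for a while loop, filling in the forward jump once its target is known."""
    code = []
    top = len(code)                    # address of the loop test (backward jump target)
    code.extend(cond_code)
    hole = len(code)                   # remember where the forward jump sits
    code.append(["jumpF", None])       # destination not known yet
    code.extend(body_code)
    code.append(["jump", top])         # backward jump: target already known
    code[hole][1] = len(code)          # backpatch: first address after the loop
    return code


loop = emit_while([["loadv", "x"]], [["loadv", "x"], ["write", None]])
for address, instruction in enumerate(loop):
    print(address, instruction)
```

The backward jump needs no patching because its target was emitted earlier; only the forward jumpF has to wait until the end of the loop is reached.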
Code Generation for Logical Expressions
- Data types and operators
  - Use the Boolean data type and operators if they are included in the target/intermediate language.
  - Otherwise, use integer 0/1 and bitwise operators.
- Short-circuit evaluation
  - IF a AND b THEN S: if a is false, the whole expression is false, so there is no need to evaluate b.
  - IF a OR b THEN S: if a is true, the whole expression is true, so there is no need to evaluate b.
- Intermediate code
IF a AND b THEN S:
    <code for a>
    if_false a goto L1
    <code for b>
    if_false b goto L1
    <code for S>
    label L1
IF a OR b THEN S:
    <code for a>
    if_false a goto L1
    goto L2
    label L1
    <code for b>
    if_false b goto L3
    label L2
    <code for S>
    label L3
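Short-circuit evaluation can be sketched as "jumping code": a function that emits code which falls through when the condition is true and jumps to a given label when it is false. The tuple encoding of conditions and the names `jump_if_false` and `gen_if_then` are assumptions for this sketch, and the operands are assumed to be variables that are already evaluated.

```python
import itertools


def make_label_source():
    """Return a function producing fresh labels L1, L2, ... (hypothetical helper)."""
    counter = itertools.count(1)
    return lambda: f"L{next(counter)}"


def jump_if_false(cond, false_label, new_label):
    """Emit code that jumps to false_label when cond is false, else falls through."""
    kind = cond[0]
    if kind == "var":
        return [f"if_false {cond[1]} goto {false_label}"]
    if kind == "and":                  # either operand false makes the whole thing false
        return (jump_if_false(cond[1], false_label, new_label)
                + jump_if_false(cond[2], false_label, new_label))
    if kind == "or":                   # a true left operand skips the right operand
        try_right, is_true = new_label(), new_label()
        return (jump_if_false(cond[1], try_right, new_label)
                + [f"goto {is_true}", f"label {try_right}"]
                + jump_if_false(cond[2], false_label, new_label)
                + [f"label {is_true}"])
    raise ValueError(f"unknown condition kind: {kind}")


def gen_if_then(cond, body_code, new_label):
    """IF cond THEN S."""
    end_label = new_label()
    return jump_if_false(cond, end_label, new_label) + body_code + [f"label {end_label}"]


a_and_b = gen_if_then(("and", ("var", "a"), ("var", "b")), ["<code for S>"], make_label_source())
a_or_b = gen_if_then(("or", ("var", "a"), ("var", "b")), ["<code for S>"], make_label_source())
print("\n".join(a_and_b))
print("\n".join(a_or_b))
```

The label numbers produced for the OR case differ from a hand-numbered listing because labels are allocated in the order the generator requests them; the control flow is the same.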
Data Types
- Primitive data types
- Strings
- Arrays
- Records
Primitive Data Types
- Integer
  - Short, long, signed, unsigned
- Floating point
  - IEEE floating-point format
- Decimal
- Boolean
- Character
  - ASCII, Unicode
Ordinal Types
- The range of possible values can be associated with the positive integers.
- Integers
- User-defined ordinal types
  - Enumeration types: a user-defined set of possible values
  - Subrange types: an ordered contiguous subsequence of an ordinal type
Strings
- Can be either
  - Primitive type: Ada, FORTRAN, Basic
  - Array: Pascal, C, C++
  - Class: Java (String)
- Operations
  - Assignment
  - Comparison
  - Catenation
  - Substring reference
  - Pattern matching
Descriptor for Strings
- A descriptor is the information about a data object that is stored in the compiler.
- Static string
  - Type: static string
  - Length
  - Address
- Limited dynamic string
  - Type: dynamic string
  - Length: maximum and current length
  - Address
Arrays
- Types of subscripts
- Implementation
Subscripts
- Types of subscripts
  - Integers: FORTRAN, C, Java
  - Ordinal types, enumeration types: Pascal, Ada
- Range of subscripts
  - Static: C, FORTRAN, Java arrays
  - Dynamic: Ada, FORTRAN ALLOCATABLE, Perl, JavaScript, Java ArrayList
Implementation
- Number of subscripts
  - Most languages have no limit, but some versions of FORTRAN do.
- Descriptor
  - Type: array
  - Element type
  - Index type
  - Number of dimensions
  - Index ranges, one for each dimension (index range 1 … index range n)
  - Address
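Array: Address calculation
(Figure: array A with m rows and n columns, rows numbered 1 to m and columns 1 to n, with the element at row i, column j marked.)
With row-major storage and s the address of A[1, 1]:
Address of A[i, j] = s + ((i-1)*n + (j-1)) * size(elementType)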
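The row-major calculation can be checked with a short function. This sketch assumes 1-based indices and that the start address s is the address of A[1, 1], so both indices are shifted by one; the function and parameter names are hypothetical.

```python
def element_address(start, i, j, n_columns, element_size):
    """Address of A[i, j] for a row-major array with 1-based indices.

    Assumption: 'start' is the address of A[1, 1].
    """
    elements_before = (i - 1) * n_columns + (j - 1)   # full rows above, then columns to the left
    return start + elements_before * element_size


# A 3 x 5 array of 4-byte elements starting at address 1000
print(element_address(1000, 1, 1, 5, 4))
print(element_address(1000, 2, 3, 5, 4))
```

Under these assumptions A[1, 1] maps to the start address itself, and each step along a row advances by one element size.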
Records
- A heterogeneous aggregate of data elements in which the individual elements are identified by names.
- Example:
    record student
    { int id;
      char firstname[40];
      char lastname[40];
      float grade;
    }
Implementation
- Descriptor
  - Type: record
  - For each field (field 1 … field n): name, type, offset
  - Address
- Address calculation
  - Address of a field = S + field offset, where S is the address of the record
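The per-field offsets stored in a record descriptor can be sketched for the student record. This is a simplified layout that assumes 4-byte int and float, 1-byte char, and no alignment padding; real compilers may insert padding. The function name `layout` is hypothetical.

```python
def layout(fields):
    """Assign consecutive offsets to (name, size) fields; return (offsets, total size).

    Assumption: fields are packed with no alignment padding.
    """
    offsets, offset = {}, 0
    for name, size in fields:
        offsets[name] = offset
        offset += size
    return offsets, offset


# record student { int id; char firstname[40]; char lastname[40]; float grade; }
STUDENT_FIELDS = [("id", 4), ("firstname", 40), ("lastname", 40), ("grade", 4)]
offsets, record_size = layout(STUDENT_FIELDS)

base = 2000                                   # hypothetical address of one student record
grade_address = base + offsets["grade"]       # S + field offset
print(offsets, record_size, grade_address)
```

Because offsets are fixed at compile time, a field reference costs one addition at run time, or none when the record address is itself a constant.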
Address Calculation
- Addressing operations
- Array reference
- Record reference
Addressing Operations
3-address code:
    Address of x        &x
    Indirect address    *x
P-code:
    Address of x        loada x
    Indirect load       ind x     (load *(top+x))
    Indexed address     ixa x     (load top*x+(top-1))
Array References
3-address code for x=a[i]:
    t1=i*elesize(a)       (find offset)
    t2=&a+t1              (find address)
    x=*t2                 (find content)
3-address code for a[i]=x:
    t1=i*elesize(a)       (find offset)
    t2=&a+t1              (find address)
    *t2=x
P-code for x=a[i]:
    loada x
    loada a
    loadv i
    ixa elesize(a)        (find address)
    ind 0                 (find content)
    sto
P-code for a[i]=x:
    loada a
    loadv i
    ixa elesize(a)        (find address)
    loadv x
    sto
More Complex Array References
Source: a[i+1]=a[j+2]+3
3-address code:
    t1=j+2
    t2=t1*elesize(a)
    t3=&a+t2
    t4=*t3                (t4=a[t1])
    t5=t4+3
    t6=i+1
    t7=t6*elesize(a)
    t8=&a+t7
    *t8=t5                (a[t6]=t5)
P-code:
    loada a
    loadv i
    loadc 1
    adi
    ixa elesize(a)        (push &a[i+1])
    loada a
    loadv j
    loadc 2
    adi
    ixa elesize(a)
    ind 0                 (push a[j+2])
    loadc 3
    adi
    sto
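The 3-address sequence for an indexed assignment can be sketched with a small generator that computes the offset, then the address, then dereferences. The class `ThreeAddressGen`, its method names, and the tuple encoding of expressions are assumptions for illustration; `elesize(a)` is kept symbolic as in the example.

```python
import itertools


class ThreeAddressGen:
    """Emit 3-address code for expressions containing array references."""

    def __init__(self):
        self.code = []
        self._temps = itertools.count(1)

    def newtemp(self):
        return f"t{next(self._temps)}"

    def address_of(self, array, index):
        """Emit code computing &array[index]; return the temporary holding it."""
        index_name = self.rvalue(index)
        offset = self.newtemp()
        self.code.append(f"{offset}={index_name}*elesize({array})")   # find offset
        address = self.newtemp()
        self.code.append(f"{address}=&{array}+{offset}")              # find address
        return address

    def rvalue(self, expr):
        """Emit code evaluating expr; return the name holding its value."""
        kind = expr[0]
        if kind in ("id", "num"):
            return str(expr[1])
        if kind == "plus":
            left = self.rvalue(expr[1])
            right = self.rvalue(expr[2])
            temp = self.newtemp()
            self.code.append(f"{temp}={left}+{right}")
            return temp
        if kind == "index":
            address = self.address_of(expr[1], expr[2])
            temp = self.newtemp()
            self.code.append(f"{temp}=*{address}")                    # find content
            return temp
        raise ValueError(f"unknown expression kind: {kind}")

    def assign_element(self, array, index, rhs):
        """Emit code for array[index] = rhs, evaluating the right-hand side first."""
        value = self.rvalue(rhs)
        address = self.address_of(array, index)
        self.code.append(f"*{address}={value}")


# a[i+1] = a[j+2] + 3
gen = ThreeAddressGen()
gen.assign_element(
    "a",
    ("plus", ("id", "i"), ("num", 1)),
    ("plus", ("index", "a", ("plus", ("id", "j"), ("num", 2))), ("num", 3)),
)
print("\n".join(gen.code))
```

Evaluating the right-hand side before the target address reproduces the temporary numbering of the example; the opposite order would be equally valid and only renumber the temporaries.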
Record References
(Figure: memory layout of record x with fields x.i, x.c, and x.j; the base address of x is the address of x.i, and x.c and x.j lie at the offset of x.c and the offset of x.j from that base.)
3-address code:
    Address of a field    t1=&x+offset(x,j)
    Content of a field    t2=*t1
P-code:
    Address of a field    loada x
                          loadc offset(x.j)
                          ixa 1
    Content of a field    loada x
                          ind offset(x.j)