© 2008 Wayne Wolf. Overheads for Computers as Components, 2nd ed. Program design and analysis: compilation flow, basic statement translation, basic optimizations.

Presentation transcript:

Program design and analysis
- Compilation flow.
- Basic statement translation.
- Basic optimizations.
- Interpreters and just-in-time compilers.

Compilation
- Compilation strategy (Wirth):
  - compilation = translation + optimization
- Compiler determines quality of code:
  - use of CPU resources;
  - memory access scheduling;
  - code size.

Basic compilation phases
HLL -> parsing, symbol table -> machine-independent optimizations -> machine-dependent optimizations -> assembly

Statement translation and optimization
- Source code is translated into an intermediate form such as a CDFG.
- The CDFG is transformed/optimized.
- The CDFG is translated into instructions with optimization decisions.
- Instructions are further optimized.

Arithmetic expressions
Expression: a*b + 5*(c-d)
[Figure: DFG of the expression. a and b feed a multiply; c and d feed a subtract whose result is multiplied by 5; the two products feed the final add.]

Arithmetic expressions, cont'd.
[Figure: DFG of a*b + 5*(c-d).]
Code:
    ADR r4,a
    LDR r1,[r4]     ; load a
    ADR r4,b
    LDR r2,[r4]     ; load b
    MUL r3,r1,r2    ; a*b
    ADR r4,c
    LDR r1,[r4]     ; load c
    ADR r4,d
    LDR r5,[r4]     ; load d
    SUB r6,r1,r5    ; c-d
    MOV r7,#5       ; MUL takes register operands only
    MUL r7,r6,r7    ; 5*(c-d)
    ADD r8,r7,r3    ; a*b + 5*(c-d)
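The walk over the data flow graph can also be written out in C, one temporary per operator node. This is only a sketch: the function name `eval_dfg` and the temporaries `t1` to `t4` are hypothetical stand-ins for the registers a compiler would pick.

```c
#include <assert.h>

/* Hypothetical illustration: evaluates a*b + 5*(c-d) one DFG node at a
   time, the way a compiler emits one instruction per operator node. */
int eval_dfg(int a, int b, int c, int d)
{
    int t1 = a * b;      /* left multiply node  */
    int t2 = c - d;      /* subtract node       */
    int t3 = 5 * t2;     /* right multiply node */
    int t4 = t1 + t3;    /* root add node       */
    return t4;
}
```

Each temporary is used exactly once, which is why the registers holding them can be reused soon after.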

Control code generation
    if (a+b > 0)
        x = 5;
    else
        x = 7;
[Figure: CDFG with decision node a+b>0, true branch x=5, false branch x=7.]

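Control code generation, cont'd.
[Figure: CDFG with decision node a+b>0, true branch x=5, false branch x=7.]
    ADR r5,a
    LDR r1,[r5]     ; load a
    ADR r5,b
    LDR r2,[r5]     ; load b
    ADD r3,r1,r2
    CMP r3,#0       ; compare a+b with 0 to set the flags
    BLE label3      ; condition false: go to the false block
    MOV r3,#5       ; true block
    ADR r5,x
    STR r3,[r5]     ; x = 5
    B stmtent
label3
    MOV r3,#7       ; false block
    ADR r5,x
    STR r3,[r5]     ; x = 7
stmtent ...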
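The branch structure of the generated code can be mirrored in C with explicit labels, which may make the mapping from the if/else to the two blocks easier to follow. The function name `select_x` and the label names are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: the if/else rewritten with labels and gotos so that
   each goto corresponds to one branch instruction. */
int select_x(int a, int b)
{
    int x;
    int t = a + b;                  /* condition value */
    if (t <= 0) goto false_block;   /* like BLE: skip the true block */
    x = 5;                          /* true block */
    goto stmt_end;                  /* like B: jump over the false block */
false_block:
    x = 7;                          /* false block */
stmt_end:
    return x;
}
```

Note that the test is inverted: the code branches away when the condition is false and falls through into the true block.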

Procedure linkage
- Need code to:
  - call and return;
  - pass parameters and results.
- Parameters and returns are passed on stack.
  - Procedures with few parameters may use registers.

Procedure stacks
    proc1(int a) {
        proc2(5);
    }
[Figure: stack holding the frames of proc1 and proc2, with an arrow showing the direction of growth, labelled with SP (stack pointer) and FP (frame pointer). The parameter 5 is accessed relative to SP.]

ARM procedure linkage
- APCS (ARM Procedure Call Standard):
  - r0-r3 pass parameters into the procedure. Extra parameters are put on the stack frame.
  - r0 holds the return value.
  - r4-r7 hold register variables.
  - r11 is the frame pointer, r13 is the stack pointer.
  - r10 holds the limiting address on stack size, used to check for stack overflows.

Data structures
- Different types of data structures use different data layouts.
- Some offsets into a data structure can be computed at compile time; others must be computed at run time.

One-dimensional arrays
- C array name points to 0th element:
[Figure: a points to a[0], followed in memory by a[1] and a[2]; a[1] = *(a + 1).]
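The equivalence between indexing and pointer arithmetic can be checked directly. A minimal sketch, with `element_at` as a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Reads element i through explicit pointer arithmetic; equivalent to a[i].
   The compiler scales i by sizeof(int) when it forms the address. */
int element_at(const int *a, size_t i)
{
    return *(a + i);
}
```

For an array `int v[3] = {10, 20, 30};`, `element_at(v, 1)` and `v[1]` read the same memory location.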

Two-dimensional arrays
- Row-major layout (the layout C uses):
[Figure: an N-row by M-column array stored in memory as a[0,0], a[0,1], ..., a[1,0], a[1,1], ...; element a[i,j] is at offset i*M+j.]
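The offset formula i*M+j, where M is the number of elements in one row, can be written as a small helper. This is a sketch; `row_major_index` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Offset (in elements) of element (i, j) in an array stored row by row,
   where m is the number of elements in one row. */
size_t row_major_index(size_t i, size_t j, size_t m)
{
    return i * m + j;
}
```

When M is known at compile time and i and j are constants, the whole offset folds to a constant; otherwise the multiply and add happen at run time.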

Structures
- Fields within structures are static offsets:
    struct mystruct {
        int field1;
        char field2;
    };
    struct mystruct a, *aptr = &a;
[Figure: aptr points to field1, which occupies 4 bytes; field2 follows at *((char *)aptr + 4).]
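The static offset of a field is available in C through the standard `offsetof` macro. The sketch below reads `field2` through its byte offset; `read_field2` is a hypothetical helper, and the actual offset value depends on the target's size of `int`.

```c
#include <assert.h>
#include <stddef.h>

struct mystruct {
    int  field1;
    char field2;
};

/* Reads field2 via its compile-time byte offset from the start of the struct. */
char read_field2(const struct mystruct *p)
{
    const char *base = (const char *)p;   /* byte-granular view of the struct */
    return *(base + offsetof(struct mystruct, field2));
}
```

The cast to `char *` matters: adding 4 to a `struct mystruct *` would advance by four whole structures, not four bytes.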

Expression simplification
- Constant folding:
  - 8+1 = 9
- Algebraic:
  - a*b + a*c = a*(b+c)
- Strength reduction:
  - a*2 = a<<1
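The three rewrites can be applied by hand to one expression and compared against the original. This is a sketch with hypothetical function names; unsigned operands are used so that the two forms agree even when the arithmetic wraps around.

```c
#include <assert.h>

/* The expression as a programmer might write it. */
unsigned expr_before(unsigned a, unsigned b, unsigned c)
{
    return (a * b + a * c) + a * 2 + (8 + 1);
}

/* The same value after algebraic simplification, strength reduction,
   and constant folding. */
unsigned expr_after(unsigned a, unsigned b, unsigned c)
{
    return a * (b + c) + (a << 1) + 9;
}
```

The simplified form uses one multiply instead of three.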

Dead code elimination
- Dead code:
    #define DEBUG 0
    if (DEBUG) dbg(p1);
- Can be eliminated by analysis of control flow, constant folding.
[Figure: control flow graph in which the test on the constant 0 leads to dbg(p1); on the 1 branch and bypasses it on the 0 branch.]
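A compilable version of the DEBUG example is sketched below. The call counter `dbg_calls` and the function `process` are hypothetical additions that make the effect observable; whether the compiler actually removes the call depends on its optimization settings.

```c
#include <assert.h>

#define DEBUG 0

static int dbg_calls = 0;   /* hypothetical counter: how often dbg() ran */

static void dbg(int p1)
{
    (void)p1;
    dbg_calls++;
}

int process(int p1)
{
    if (DEBUG)              /* condition folds to 0, so the call is unreachable */
        dbg(p1);
    return p1 + 1;
}
```

Because the condition is a compile-time constant, the branch and the call can be dropped without changing what `process` returns.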

Procedure inlining
- Eliminates procedure linkage overhead:
    int foo(int a, int b, int c) { return a + b - c; }
    z = foo(w,x,y);
  becomes
    z = w + x - y;
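Substituting the arguments w, x, y for the parameters in a + b - c gives w + x - y. A sketch comparing the call with the hand-inlined form, using the hypothetical wrappers `call_version` and `inlined_version`:

```c
#include <assert.h>

static inline int foo(int a, int b, int c) { return a + b - c; }

/* Calls foo through the normal procedure linkage (unless the compiler
   chooses to inline it). */
int call_version(int w, int x, int y)
{
    return foo(w, x, y);
}

/* The body of foo substituted at the call site by hand. */
int inlined_version(int w, int x, int y)
{
    return w + x - y;
}
```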

Loop transformations
- Goals:
  - reduce loop overhead;
  - increase opportunities for pipelining;
  - improve memory system performance.

Loop unrolling
- Reduces loop overhead, enables some other optimizations.
    for (i=0; i<4; i++)
        a[i] = b[i] * c[i];
  becomes
    for (i=0; i<2; i++) {
        a[i*2] = b[i*2] * c[i*2];
        a[i*2+1] = b[i*2+1] * c[i*2+1];
    }
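The slide's loop has a trip count of 4, which divides evenly by the unroll factor. When the trip count is not known to be even, the unrolled loop needs a cleanup step. A sketch of one common way to handle this, with the hypothetical function name `mul_unrolled`:

```c
#include <assert.h>
#include <stddef.h>

/* Element-wise multiply, unrolled by two, with cleanup for odd n. */
void mul_unrolled(int *a, const int *b, const int *c, size_t n)
{
    size_t i = 0;
    for (; i + 1 < n; i += 2) {      /* two elements per iteration */
        a[i]     = b[i]     * c[i];
        a[i + 1] = b[i + 1] * c[i + 1];
    }
    if (i < n)                       /* leftover element when n is odd */
        a[i] = b[i] * c[i];
}
```

The loop test and increment now run about half as often, at the cost of larger code.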

Loop fusion and distribution
- Fusion combines two loops into one:
    for (i=0; i<N; i++)
        a[i] = b[i] * 5;
    for (j=0; j<N; j++)
        w[j] = c[j] * d[j];
  becomes
    for (i=0; i<N; i++) {
        a[i] = b[i] * 5;
        w[i] = c[i] * d[i];
    }
- Distribution breaks one loop into two.
- Changes optimizations within loop body.
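Distribution is the reverse transformation. The sketch below shows both forms side by side; the function names are hypothetical, and the two forms are only interchangeable under the assumption that the arrays do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* One loop with two independent statements in its body. */
void fused_loop(int *a, int *w, const int *b, const int *c,
                const int *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] * 5;
        w[i] = c[i] * d[i];
    }
}

/* The same work after distribution: one loop per statement. */
void distributed_loops(int *a, int *w, const int *b, const int *c,
                       const int *d, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] * 5;
    for (size_t j = 0; j < n; j++)
        w[j] = c[j] * d[j];
}
```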

Loop tiling
- Breaks one loop into a nest of loops.
- Changes order of accesses within array.
  - Changes cache behavior.

Loop tiling example
    for (i=0; i<N; i++)
        for (j=0; j<N; j++)
            c[i] = a[i][j]*b[i];
  becomes
    for (i=0; i<N; i+=2)
        for (j=0; j<N; j+=2)
            for (ii=i; ii<min(i+2,N); ii++)
                for (jj=j; jj<min(j+2,N); jj++)
                    c[ii] = a[ii][jj]*b[ii];
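A tiled loop nest is easy to get subtly wrong, so it helps to run it against the untiled version. The sketch below uses a 2-by-2 tile and a hypothetical array size of 5, chosen odd so that the last tile in each dimension is partial; `min_int`, `untiled`, and `tiled` are hypothetical names.

```c
#include <assert.h>

#define N 5      /* hypothetical size, odd so the last tile is partial */
#define TILE 2

static int min_int(int x, int y) { return x < y ? x : y; }

void untiled(int c[N], int a[N][N], const int b[N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            c[i] = a[i][j] * b[i];
}

void tiled(int c[N], int a[N][N], const int b[N])
{
    for (int i = 0; i < N; i += TILE)          /* tile origin, rows    */
        for (int j = 0; j < N; j += TILE)      /* tile origin, columns */
            for (int ii = i; ii < min_int(i + TILE, N); ii++)
                for (int jj = j; jj < min_int(j + TILE, N); jj++)
                    c[ii] = a[ii][jj] * b[ii];
}
```

The inner loops start at the tile origin (i, j), not at 0, and the `min_int` bounds clip the final partial tile.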

Array padding
- Add array elements to change mapping into cache:
[Figure: cache mapping of the rows a[0,0] a[0,1] a[0,2] and a[1,0] a[1,1] a[1,2], before and after padding.]
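In C, padding is usually expressed by declaring each row wider than the data it holds. A minimal sketch, assuming a 2-by-3 array and one hypothetical padding element per row:

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 2
#define COLS 3
#define PAD  1   /* hypothetical amount of padding per row */

int plain[ROWS][COLS];
int padded[ROWS][COLS + PAD];   /* last PAD elements of each row are unused */

/* Distance in elements between the starts of consecutive rows. */
size_t plain_stride(void)  { return sizeof plain[0]  / sizeof plain[0][0]; }
size_t padded_stride(void) { return sizeof padded[0] / sizeof padded[0][0]; }
```

The wider stride shifts where each row begins in memory, which changes the cache lines the rows map to. The right amount of padding depends on the cache geometry of the target.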

Register allocation
- Goals:
  - choose register to hold each variable;
  - determine lifespan of variable in the register.
- Basic case: within basic block.

Register lifetime graph
    w = a + b;    /* t=1 */
    x = c + w;    /* t=2 */
    y = c + d;    /* t=3 */
[Figure: lifetime graph with one bar per variable (a, b, c, d, w, x, y) plotted against time steps 1 to 3.]
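Once each variable has a lifetime interval, the number of registers needed is bounded by how many intervals overlap at the busiest time step. The sketch below counts that peak; `max_live` is a hypothetical helper, and the closed-interval model of a lifetime is an assumption rather than the slide's exact definition.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the largest number of variables live at any single time step.
   Variable k is assumed live during the closed interval [start[k], end[k]];
   time steps run from 1 to t_last. */
int max_live(const int *start, const int *end, size_t n, int t_last)
{
    int peak = 0;
    for (int t = 1; t <= t_last; t++) {
        int live = 0;
        for (size_t k = 0; k < n; k++)
            if (start[k] <= t && t <= end[k])
                live++;
        if (live > peak)
            peak = live;
    }
    return peak;
}
```

Variables whose intervals never overlap can share a register.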

Instruction scheduling
- Non-pipelined machines do not need instruction scheduling: any order of instructions that satisfies data dependencies runs equally fast.
- In pipelined machines, execution time of one instruction depends on the nearby instructions: opcode, operands.

Reservation table
- A reservation table relates instructions/time to CPU resources.
[Table: rows instr1 to instr4 against resource columns A and B, with an X marking each resource an instruction uses. instr1: X; instr2: X X; instr3: X; instr4: X.]
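One simple way to hold a reservation table row in software is a bit mask with one bit per resource. The sketch below checks whether two instructions could be issued together; the encoding and the names `RES_A`, `RES_B`, and `conflicts` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { RES_A = 1 << 0, RES_B = 1 << 1 };   /* hypothetical resources A and B */

/* True if two instructions issued in the same cycle need a common resource. */
bool conflicts(unsigned use1, unsigned use2)
{
    return (use1 & use2) != 0;
}
```

A scheduler would consult such a check before placing an instruction in a time slot.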

Software pipelining
- Schedules instructions across loop iterations.
- Reduces instruction latency in iteration i by inserting instructions from iteration i+1.
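The idea can be sketched at the source level: the load for iteration i+1 is issued before the computation of iteration i, so the load latency overlaps with useful work. Compilers normally do this on machine instructions; the function `scale_pipelined` and the computation in[i]*2+1 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Computes out[i] = in[i] * 2 + 1, with the load for iteration i+1
   moved into iteration i. */
void scale_pipelined(int *out, const int *in, size_t n)
{
    if (n == 0)
        return;
    int loaded = in[0];                  /* prologue: load for iteration 0 */
    for (size_t i = 0; i + 1 < n; i++) {
        int next = in[i + 1];            /* load for iteration i+1 */
        out[i] = loaded * 2 + 1;         /* compute and store iteration i */
        loaded = next;
    }
    out[n - 1] = loaded * 2 + 1;         /* epilogue: finish the last iteration */
}
```

The prologue and epilogue are the price of the overlap: they start and drain the pipeline of iterations.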

Instruction selection
- May be several ways to implement an operation or sequence of operations.
- Represent operations as graphs, match possible instruction sequences onto graph.
[Figure: expression graph with a multiply feeding an add, and instruction templates MUL (multiply), ADD (add), and MADD (multiply feeding add).]

Using your compiler
- Understand various optimization levels (-O1, -O2, etc.)
- Look at mixed compiler/assembler output.
- Modifying compiler output requires care:
  - correctness;
  - loss of hand-tweaked code.

Interpreters and JIT compilers
- Interpreter: translates and executes program statements on-the-fly.
- JIT compiler: compiles small sections of code into instructions during program execution.
  - Eliminates some translation overhead.
  - Often requires more memory.
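The interpreter case can be illustrated with a very small stack machine that decodes and executes one operation at a time. This is a sketch only: the opcodes, the 16-slot stack, and the function `interpret` are hypothetical, and the code assumes a well-formed program.

```c
#include <assert.h>

enum op { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };   /* hypothetical opcodes */

/* Hypothetical stack-machine interpreter. Assumes a well-formed program
   that ends with OP_HALT and needs at most 16 stack slots. */
int interpret(const int *code)
{
    int stack[16];
    int sp = 0;                         /* number of values on the stack */
    int pc = 0;                         /* index of the next code word   */
    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH:
            stack[sp++] = code[pc++];   /* operand follows the opcode */
            break;
        case OP_ADD:
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
        default:
            return stack[sp - 1];       /* result is the top of the stack */
        }
    }
}
```

Every operation pays the decode cost of the `switch` each time it runs. A JIT compiler would instead translate a sequence of operations to machine instructions once and reuse the result.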