Published by Maud Newman. Modified over 9 years ago.
2
© 2000 Morgan Kaufmann Overheads for Computers as Components
Program design and analysis
- Designing embedded programs is more difficult than designing PC/workstation programs.
- Design challenges:
  - provide rich functionality;
  - meet system deadlines;
  - fit into restricted memory;
  - meet power requirements;
  - debugging.
3
Assembly and linking
- Steps in compilation:
  HLL -> compiler -> assembly code -> assembler -> object code -> linker -> executable binary -> loader
  - Assembler: symbolic to bit-level translation.
  - Linker: executable built from many files, addresses linked, …
4
Multiple-module programs
- Programs may be composed from several files; addresses for instructions and data are assigned by the linker.
- Addresses become more specific during processing:
  - relative addresses are measured relative to the start of a module;
  - absolute addresses are measured relative to the start of the CPU address space.
5
Assemblers
- Major tasks:
  - generate binary for symbolic instructions;
  - translate labels into addresses;
  - handle pseudo-ops (data, etc.).
- Assembly labels:
          ORG 100        ; pseudo-op: starting point in memory
  label1  ADR r4,c
6
Two-pass assembly
- Pass 1:
  - scan the code to determine the address of each labeled location and generate the symbol table.
- Pass 2:
  - assemble the instructions using the labels computed in the first pass and generate the binary instructions.
7
Symbol table
- Assembly code (scanned while the PLC advances):
        ADD r0,r1,r2
    xx  ADD r3,r4,r5
        CMP r0,r3
    yy  SUB r5,r6,r7
- Symbol table:
    xx  0x8
    yy  0x10
8
Symbol table generation
- Use the program location counter (PLC) to determine the address of each location.
- Scan the program, keeping count of the PLC.
- Addresses are generated at assembly time, not execution time.
9
Symbol table example
- NOTE: assume that each instruction takes 4 memory locations.
    PLC=0x7        ADD r0,r1,r2
    PLC=0xb    xx  ADD r3,r4,r5
    PLC=0xf        CMP r0,r3
    PLC=0x13   yy  SUB r5,r6,r7
- Symbol table:
    xx  0xb
    yy  0x13
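The first pass in the example above can be sketched in C. This is a minimal illustration and not an actual assembler: the struct and function names (`src_line`, `symbol`, `build_symbol_table`, `lookup_symbol`) are hypothetical, and every instruction is assumed to occupy 4 memory locations, as the slide's note states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INSTR_SIZE 4  /* assumption from the slide: 4 memory locations per instruction */

/* One source line: an optional label (NULL if none) and the instruction text. */
struct src_line { const char *label; const char *text; };

/* One symbol table entry: a label and the address assigned to it. */
struct symbol { const char *name; unsigned addr; };

/* Pass 1: walk the lines, advance the PLC, and record each label's address.
   Returns the number of symbols written to table. */
size_t build_symbol_table(const struct src_line *lines, size_t n,
                          unsigned start, struct symbol *table, size_t cap)
{
    unsigned plc = start;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (lines[i].label != NULL) {
            assert(count < cap);  /* caller must supply enough room */
            table[count].name = lines[i].label;
            table[count].addr = plc;
            count++;
        }
        plc += INSTR_SIZE;
    }
    return count;
}

/* Pass 2 would use a lookup like this to replace labels with addresses.
   Returns 1 and stores the address on success, 0 if the label is unknown. */
int lookup_symbol(const struct symbol *table, size_t count,
                  const char *name, unsigned *addr)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            *addr = table[i].addr;
            return 1;
        }
    }
    return 0;
}
```

Feeding the four instructions of the example with a starting PLC of 0x7 yields xx at 0xb and yy at 0x13, matching the symbol table on the slide.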
10
Pseudo-operations
- Pseudo-ops do not generate instructions:
  - ORG sets the program location.
  - EQU generates a symbol table entry without advancing the PLC.
  - Example:
            ADD r0,r1,r2
      FOO   EQU 5
      BAZ   SUB r3,r4,#FOO
    - The label FOO=5 is added to the symbol table.
    - The address of BAZ is the same as it would be if the second statement were not present.
11
Linking
- Programs are written as several smaller pieces.
- Programs may use library routines (.o files).
- The linker combines several object modules into a single executable module.
- Jobs:
  - put modules in order;
  - resolve labels across modules.
12
Externals and entry points
- Some labels are both defined and used in the same file.
- Some are defined in a single file but used elsewhere.
- First module:
    xxx  ADD r1,r2,r3
         B a              ; external reference: a is defined elsewhere
    yyy  %1               ; entry point: yyy is used elsewhere
- Second module:
    a    ADR r4,yyy       ; entry point a; external reference to yyy
         ADD r3,r4,r5
13
Module ordering
- Controlling where code modules are loaded in memory is important, e.g.:
  - interrupt code must be placed at precise locations in memory;
  - code has to be placed in specific memory locations in EPROM/RAM.
14
Module ordering, cont’d.
- Code modules must be placed in absolute positions in the memory space.
- A load map or linker flags (given by the user) control the order of modules.
[Figure: memory space holding module1, module2, module3 in order]
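The step from relative to absolute addresses can be sketched as follows. This is only an illustration under simple assumptions: modules are placed back to back in the order given, and the names `module`, `place_modules`, and `absolute_address` are hypothetical rather than part of any real linker.

```c
#include <assert.h>
#include <stddef.h>

/* A module as the linker sees it: its size and, once placed, its base address. */
struct module { const char *name; unsigned size; unsigned base; };

/* Place the modules back to back in the given order, starting at load_addr.
   Returns the first free address after the last module. */
unsigned place_modules(struct module *mods, size_t n, unsigned load_addr)
{
    unsigned next = load_addr;
    for (size_t i = 0; i < n; i++) {
        mods[i].base = next;
        next += mods[i].size;
    }
    return next;
}

/* Convert a module-relative address into an absolute address. */
unsigned absolute_address(const struct module *m, unsigned relative)
{
    assert(relative < m->size);  /* the address must lie inside the module */
    return m->base + relative;
}
```

Changing the order of the array entries changes every base address, which is the effect a load map or linker flag has on the final layout.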
15
Dynamic linking
- Some operating systems link modules dynamically at run time:
  - one copy of a library is shared among all executing programs, which saves storage space because separate copies of commonly used routines such as I/O do not have to be linked into every executable program;
  - programs can be updated with new versions of libraries;
  - some delay is introduced before the program starts execution.
16
Program design and analysis
- We need to control:
  - where interrupt code is placed;
  - where data and instructions are placed;
  - execution speed, memory/power consumption.
- Compilation flow.
- Basic statement translation.
- Basic optimizations.
17
Compilation: high-level to low-level code
- Compilation strategy:
  - compilation = translation (of the high-level language) + optimization (a better instruction sequence).
- The compiler determines the quality of the code:
  - use of CPU resources, e.g., register usage;
  - memory access scheduling;
  - code size.
18
Basic compilation phases
  HLL -> parsing, symbol table, semantic analysis -> machine-independent optimizations -> machine-dependent optimizations -> assembly
- Example optimization: arithmetic simplifications (use one memory location for x in x = c*x).
19
Statement translation and optimization
- Source code is translated into an intermediate form such as a CDFG.
- The CDFG is transformed/optimized.
- The CDFG is translated into instructions, with machine-dependent optimization decisions.
20
Arithmetic expressions
- Expression: a*b + 5*(c-d)
[Figure: DFG for the expression, with operator nodes *, -, *, + over the inputs a, b, c, d, 5 and temporary variables w, x, y, z]
- Some machines perform memory-to-memory arithmetic.
- In many machines, e.g., ARM, we have to load variables into registers.
21
Arithmetic expressions, cont’d.
[Figure: DFG for a*b + 5*(c-d) with its nodes numbered 1 to 4, matching the code groups below]
- Code:
    ; 1: a*b
    ADR r4,a
    LDR r1,[r4]
    ADR r4,b
    LDR r2,[r4]
    MUL r3,r1,r2
    ; 2: c-d
    ADR r4,c
    LDR r1,[r4]
    ADR r4,d
    LDR r5,[r4]
    SUB r6,r1,r5
    ; 3: 5*(c-d)
    MUL r7,r6,#5      ; schematic: ARM MUL takes register operands, so 5 would first be loaded into a register
    ; 4: a*b + 5*(c-d)
    ADD r8,r7,r3
- An obvious optimization: reuse a register whose value is no longer needed.
22
Control code generation
    if (a+b > 0)
        x = 5;
    else
        x = 7;
[Figure: CDFG with decision node a+b>0 and blocks x=5 and x=7]
23
Control code generation, cont’d.
[Figure: CDFG model with nodes 1 (a+b>0), 2 (x=5, taken on T), 3 (x=7); the code follows a 1-2-3 walk]
    ; 1: test
            ADR r5,a
            LDR r1,[r5]
            ADR r5,b
            LDR r2,[r5]
            ADDS r3,r1,r2     ; set the flags from a+b
            BLE label3        ; go to the false block if a+b <= 0
    ; 2: true block, x = 5
            MOV r3,#5
            ADR r5,x
            STR r3,[r5]
            B stmtent
    ; 3: false block, x = 7
    label3  MOV r3,#7
            ADR r5,x
            STR r3,[r5]
    stmtent ...
24
Procedure linkage: the CPU’s subroutine call is not usually sufficient to support procedures in programming languages
- Need code to:
  - call and return;
  - pass parameters and results.
- Parameters and return values are passed on the stack.
  - Procedures with few parameters may use registers.
25
A typical procedure linkage: f(arg_0, …, arg_n-1)
- Push arg_n-1 (onto the stack)
- …
- Push arg_1
- Push arg_0
- Store the return address in a register
- Branch to the procedure: B (f, return address)
- De-allocate the arguments: pop n words off the stack
- Use a register for the return value of f(.)
26
Stack-based management (handled by the compiler)
- A block of memory called a “stack frame” is required for every call of a function.
- The stack grows and shrinks during execution, so the worst case is difficult to predict at compile time.
[Figure: total stack space, with SP marking the end of the used space]
27
Procedure stacks
    proc1(int a) {
        int b, c;
        b=5;
        proc2(b);
        c=2+b;
        …
    }
[Figure: stack growing from the proc1 frame to the proc2 frame; the value 5 is accessed relative to SP]
- SP (stack pointer): defines the end of the current frame.
- FP (frame pointer): defines the end of the last frame.
28
ARM procedure linkage
- APCS (ARM Procedure Call Standard), for a call such as BL foo:
  - r0-r3 pass parameters into the procedure; extra parameters are put on the stack frame;
  - r0 holds the return value;
  - r4-r7 hold register values;
  - r11 is the frame pointer, r13 is the stack pointer;
  - r10 holds the limiting address on stack size, used to check for stack overflows.
29
8051, PIC: static allocation (no run-time memory allocation)
- How each byte of RAM is used is established at compile time.
- Global and static data are allocated in fixed locations.
- Local data is stored in a block set aside for each function, i.e., local data x is stored in the same place for each invocation, which makes the code non-reentrant.
30
Data layout in memory: 1D, 2D arrays, structures
- The compiler must translate references to data structures into references to raw memory, which requires address computations.
- Different types of data structures use different data layouts.
- Some offsets into a data structure can be computed at compile time; others must be computed at run time.
31
One-dimensional arrays
- A C array name points to the 0th element:
    aptr -> a[0]
            a[1] = *(aptr + 1)
            a[2]
32
Two-dimensional arrays: different possibilities
- A two-dimensional array is accessed through a one-dimensional array.
- Row-major layout of an array with N rows and M columns:
    a[0,0] a[0,1] ... a[1,0] a[1,1] ...
    a[i,j] = a[i*M+j]
- The calculation is done at run time when i and j change.
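The row-major address computation can be written out directly. The sketch below uses hypothetical helper names (`row_major_index`, `read_2d`) and assumes the array is stored as one flat block of N*M elements.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element [i][j] in an array of n_rows rows and m_cols columns
   that is stored row by row. */
size_t row_major_index(size_t i, size_t j, size_t n_rows, size_t m_cols)
{
    assert(i < n_rows && j < m_cols);
    return i * m_cols + j;
}

/* Read element [i][j] through a flat one-dimensional view of the array. */
int read_2d(const int *flat, size_t i, size_t j, size_t n_rows, size_t m_cols)
{
    return flat[row_major_index(i, j, n_rows, m_cols)];
}
```

The multiplication by the column count is the run-time cost the slide refers to; it is also what induction variable elimination later removes from inner loops.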
33
Structures
- Fields within structures are static offsets:
    struct mystruct {
        int field1;     /* 4 bytes */
        char field2;
    };
    struct mystruct a, *aptr = &a;
- field2 (a.field2) lies 4 bytes past the address in aptr, i.e., *((char *)aptr + 4).
- The addition is done at compile time.
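A small C sketch makes the constant offset visible. It uses the standard `offsetof` macro from `<stddef.h>`; the helper name `read_field2_by_offset` is hypothetical, and the 4-byte figure on the slide assumes a 4-byte `int`.

```c
#include <assert.h>
#include <stddef.h>

struct mystruct {
    int  field1;
    char field2;
};

/* Read field2 the way compiled code does: base address plus a byte offset
   that is a compile-time constant. */
char read_field2_by_offset(const struct mystruct *p)
{
    assert(p != NULL);
    const char *bytes = (const char *)p;
    return bytes[offsetof(struct mystruct, field2)];
}
```

On a target with a 4-byte `int`, `offsetof(struct mystruct, field2)` evaluates to 4, so no address arithmetic beyond one constant addition remains at run time.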
34
Expression simplification: machine independent
- Constant folding:
  - 8+1 = 9
  - for (i=0; i<NOPS+1; i++) … (the addition is done at compile time)
- Algebraic:
  - a*b + a*c = a*(b+c)
- Strength reduction:
  - a*2 -> a<<1
35
Dead code elimination
- Dead code:
    #define DEBUG 0     /* compile flag */
    if (DEBUG) dbg(p1);
- Can be eliminated by analysis of control flow and constant folding.
[Figure: control flow with a test on the constant 0; the branch to dbg(p1); is never taken]
36
Procedure inlining
- Eliminates procedure linkage overhead:
    int foo(a,b,c) { return a + b - c; }
    z = foo(w,x,y);
  Inlining result:
    z = w + x - y;
- C++ provides an inline construct.
- Drawbacks:
  - more code/memory size;
  - may slow down cached systems.
37
Loop transformations
- Loops are compact but can contribute a large fraction of the computation time.
- Goals:
  - reduce loop overhead;
  - improve memory system performance.
38
Loop unrolling
- Reduces loop overhead, enables some other optimizations.
    for (i=0; i<4; i++)
        a[i] = b[i] * c[i];
  (the iteration and test contribute to overhead)
- Unrolled:
    a[0]=b[0]*c[0], …, a[3]=b[3]*c[3]
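For a longer array the same idea is usually applied partially. The sketch below unrolls by a factor of 4 and, to stay short, assumes the element count is a multiple of 4; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Rolled form: one index update and one test per element. */
void mul_rolled(int *a, const int *b, const int *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] * c[i];
}

/* Unrolled by 4: one index update and one test per four elements.
   Assumption: n is a multiple of 4 (no clean-up loop is included). */
void mul_unrolled4(int *a, const int *b, const int *c, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        a[i]     = b[i]     * c[i];
        a[i + 1] = b[i + 1] * c[i + 1];
        a[i + 2] = b[i + 2] * c[i + 2];
        a[i + 3] = b[i + 3] * c[i + 3];
    }
}
```

The trade-off is the one noted under code size: the unrolled body is four times longer, which costs instruction memory and can hurt instruction-cache behavior.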
39
Loop fusion
- Fusion combines two loops into one:
    for (i=0; i<N; i++)
        a[i] = b[i] * 5;
    for (j=0; j<N; j++)
        w[j] = c[j] * d[j];
  becomes
    for (i=0; i<N; i++) {
        a[i] = b[i] * 5;
        w[i] = c[i] * d[i];
    }
40
Register allocation
- Goals:
  - choose a register to hold each variable;
  - determine the lifespan of the variable in the register.
41
Register lifetime graph
    w = a + b;    /* t=1 */
    x = c + w;    /* t=2 */
    y = c + d;    /* t=3 */
[Figure: lifetimes of a, b, c, d, w, x, y plotted against time t=1, 2, 3]
- Naive allocation: assign each variable to a separate register (7 registers).
- At most 3 registers are in use at any time, so 3 registers are required.
42
Using your compiler
- Understand the various optimization levels (-O1, -O2, etc.).
- Study the assembly language output.
43
Program design and analysis
- Optimizing for execution time.
- Optimizing for energy/power.
- Optimizing for program size.
44
Motivation
- Embedded systems must often meet deadlines.
  - Faster may not be fast enough.
- Need to be able to analyze execution time.
  - Worst-case, not typical.
- Need techniques for reliably improving execution time.
45
Run times will vary
- Program execution times depend on several factors:
  - input data values, which select different execution paths;
  - state of the instruction and data caches;
  - pipelining effects;
  - other: e.g., floating-point operations, which are the most sensitive to data values.
46
Measuring program speed
- CPU simulator.
  - I/O may be hard.
  - May not be totally accurate.
- Hardware timer, started and stopped at the beginning and end of the code.
  - Requires board, instrumented program.
- Logic analyzer.
  - Limited logic analyzer memory buffer.
  - Relies on the code being able to produce identifiable bus events.
47
Program performance metrics
- Average-case:
  - for typical data values, whatever they are.
- Worst-case (used for meeting deadlines):
  - for any possible input set.
- Best-case:
  - for any possible input set.
- Too-fast programs may cause critical races at the system level, so timing requirements should be met as well.
48
Performance analysis
- Elements of program performance (Shaw):
  - execution time = program path + instruction timing
- The path is the sequence of instructions executed and depends on data values.
- Instruction timing depends on data dependencies and pipeline/cache behavior.
49
Program paths: longest path != longest execution time
- Consider the for loop:
    for (i=0, f=0; i<N; i++)
        f = f + c[i]*x[i];
- Loop initiation block executed once.
- Loop test executed N+1 times.
- Loop body and variable update executed N times.
[Figure: CDFG with blocks "i=0; f=0;", test "i<N" (Y/N), "f = f + c[i]*x[i];" and "i = i+1;"]
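The execution counts listed above can be checked by instrumenting the loop. The sketch below is a hypothetical counter (`count_loop_blocks`), not a timing tool: it rewrites the for loop as explicit initiation, test, body, and update steps and counts how often each runs.

```c
#include <assert.h>

/* How many times each block of the loop's CDFG is executed. */
struct loop_counts { unsigned init, test, body; };

/* Walk a loop of the form for (i=0; i<n; i++) with each block made explicit. */
struct loop_counts count_loop_blocks(unsigned n)
{
    struct loop_counts k = { 0, 0, 0 };
    unsigned i = 0;
    k.init++;                  /* loop initiation block */
    for (;;) {
        k.test++;              /* loop test, including the final failing one */
        if (!(i < n))
            break;
        k.body++;              /* loop body and variable update */
        i++;
    }
    assert(k.test == k.body + 1);
    return k;
}
```

For n = 5 this gives one initiation, six tests, and five body executions, which is why any expression inside the test is evaluated once more than the body runs.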
50
Loop optimizations
- Loops are good targets for optimization.
- Basic loop optimizations:
  - code motion;
  - induction-variable elimination;
  - strength reduction (x*2 -> x<<1).
51
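Code motion: moving unnecessary code out of a loop
    for (i=0; i<N*M; i++)
        z[i] = a[i] + b[i];
- As written, the product N*M in the loop test is evaluated on every test, i.e., N*M + 1 times.
- After code motion, X = N*M is computed once before the loop and the test becomes i<X.
[Figure: CDFG before (test i<N*M) and after (X = N*M hoisted, test i<X), each with blocks "i=0;", "z[i] = a[i] + b[i];" and "i = i+1;"]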
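Written out in C, the transformation looks like the sketch below. The function names are hypothetical, and an optimizing compiler will often perform this hoisting itself, so the second form mainly shows what the optimization produces.

```c
#include <assert.h>
#include <stddef.h>

/* Before code motion: n * m appears in the loop test and is re-evaluated
   on every test (unless the compiler hoists it). */
void add_naive(int *z, const int *a, const int *b, size_t n, size_t m)
{
    assert(z != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n * m; i++)
        z[i] = a[i] + b[i];
}

/* After code motion: the loop-invariant product is computed once. */
void add_hoisted(int *z, const int *a, const int *b, size_t n, size_t m)
{
    assert(z != NULL && a != NULL && b != NULL);
    const size_t x = n * m;
    for (size_t i = 0; i < x; i++)
        z[i] = a[i] + b[i];
}
```

Both functions compute the same result; only the number of multiplications in the loop test differs.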
52
Induction variable elimination
- Induction variable: a value derived from the loop iteration, e.g., the loop index.
- Consider the loop:
    for (i=0; i<N; i++)
        for (j=0; j<M; j++)
            z[i,j] = b[i,j];
- Rather than re-compute i*M+j for each array in each iteration, share the induction variable between the arrays and increment it at the end of the loop body.
53
Induction variable elimination
    for (i=0; i<N; i++)
        for (j=0; j<M; j++) {
            zbinduct = i*M + j;
            *(zptr+zbinduct) = *(bptr+zbinduct);
        }
- OR: eliminate the multiplication (strength reduction):
    zbinduct = 0;
    for (i=0; i<N; i++)
        for (j=0; j<M; j++) {
            *(zptr+zbinduct) = *(bptr+zbinduct);
            zbinduct++;
        }
54
Performance optimization hints
- Simple high-level statements may be time consuming, e.g., dynamic memory allocation with malloc().
- Use registers efficiently.
- Use page-mode memory accesses: arrange data contiguously so that it falls in one page.
- Analyze cache behavior.
55
Energy/power optimization
- Energy: ability to do work.
  - Most important in battery-powered systems.
- Power: energy per unit time.
  - Important even in wall-plug systems: power becomes heat, so controlling power affects both reliability and cost.
  - Memory accesses are a major component of power consumption, so reduce memory accesses.
  - Turn off parts of the system when they are not needed.
56
Measuring energy consumption
- Execute a small loop and measure the current I drawn by the CPU with an ammeter:
    while (TRUE) a();
- Calculate the power of a(): (power with a() in the while loop) - (power without a() in the while loop).
57
Sources of energy consumption
- Relative energy per operation:
  - memory transfer: 33 (most expensive)
  - external I/O: 10
  - SRAM write: 9
  - SRAM read: 4.4
  - multiply: 3.6
  - add: 1
  - register access: <1 (most efficient)
58
Cache behavior is important
- Cache is power hungry because it is built from SRAM.
- Energy consumption has a sweet spot as cache size changes:
  - cache too small: the program thrashes, burning energy on external memory accesses;
  - cache too large: the cache itself burns too much power;
  - there is an optimum cache size.
59
Optimizing for energy
- First-order optimization:
  - high performance = low energy.
- Not many instructions trade speed for energy.
- Generally speaking, making a program run faster also reduces its energy consumption.
60
Optimizing for program size
- Goal:
  - reduce the hardware cost of memory;
  - reduce the power consumption of memory units.
- Two opportunities:
  - data;
  - instructions.
61
Data size minimization
- Reuse constants, variables, data buffers in different parts of code.
- Select buffer sizes carefully.
  - Requires careful verification of correctness.
- Generate data on the fly using instructions: although the code takes space, there may be a net space saving.
62
Reducing code size
- Avoid function inlining.
- Encapsulate functions in subroutines: because of the call overhead, there is a minimum function body size for which a subroutine makes sense.
- Choose a CPU with compact instructions.
- Use specialized instructions where possible, e.g., multiply-accumulate is smaller and faster than separate arithmetic operations.
63
Program design and analysis
- Program validation and testing.
64
Goals
- Make sure software works as intended.
  - We will concentrate on functional testing; performance testing is harder.
- What tests are required to adequately test the program?
65
Basic testing procedure
- Provide the program with inputs.
- Execute the program.
- Compare the outputs to expected results.
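The three steps map naturally onto a table-driven test. The sketch below is illustrative only: `clamp` is a hypothetical function under test chosen for brevity, and `run_tests` is a hypothetical harness, neither taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function under test: limit x to the range [lo, hi]. */
int clamp(int x, int lo, int hi)
{
    assert(lo <= hi);
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}

/* One test case: the inputs and the expected output. */
struct test_case { int x, lo, hi, expected; };

/* Provide each input, execute the function, and compare the output with
   the expected result. Returns the number of failing cases. */
size_t run_tests(const struct test_case *cases, size_t n)
{
    size_t failures = 0;
    for (size_t i = 0; i < n; i++) {
        int got = clamp(cases[i].x, cases[i].lo, cases[i].hi);
        if (got != cases[i].expected)
            failures++;
    }
    return failures;
}
```

Keeping inputs and expected outputs in a table makes it easy to add cases, whether they are chosen from the specification or from the code structure.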
66
Types of software testing
- Black-box: tests are generated without knowledge of program internals.
- Clear-box (white-box): tests are generated from the program structure.
67
Program design and analysis
- Software modem.
68
Theory of operation
- Frequency-shift keying:
  - separate frequencies for 0 and 1.
[Figure: waveform over time, with one frequency for a 0 bit and another for a 1 bit]
69
FSK encoding
- Generate waveforms based on the current bit:
    bit stream (e.g., 0110101) -> bit-controlled waveform generator
70
FSK decoding
- A/D converter -> zero filter -> detector -> 0 bit
- A/D converter -> one filter -> detector -> 1 bit
- The one filter passes frequencies in the range representing 1 and rejects those representing 0.
71
Transmission scheme
- Send data in 8-bit bytes, with arbitrary spacing between bytes.
- A byte starts with a 0 start bit.
- The receiver measures the length of the start bit to synchronize itself to the remaining 8 bits.
[Figure: start (0), bit 1, bit 2, bit 3, ..., bit 8; the length of the start bit is measured, and samples are taken at the midpoints of the bits]
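The framing rule can be sketched in C as follows. The slides do not say in which order the data bits are sent, so the least-significant-bit-first order used here is an assumption, and the names `frame_byte` and `unframe_byte` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_BITS 9  /* 1 start bit + 8 data bits */

/* Build one frame: a 0 start bit followed by the 8 data bits.
   Assumption: data bits are sent least significant bit first. */
void frame_byte(uint8_t byte, uint8_t frame[FRAME_BITS])
{
    frame[0] = 0;  /* start bit */
    for (int i = 0; i < 8; i++)
        frame[1 + i] = (uint8_t)((byte >> i) & 1u);
}

/* Recover the byte from a frame.
   Returns 0 on success, -1 if the start bit is not 0. */
int unframe_byte(const uint8_t frame[FRAME_BITS], uint8_t *byte)
{
    assert(byte != NULL);
    if (frame[0] != 0)
        return -1;
    uint8_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint8_t)((frame[1 + i] & 1u) << i);
    *byte = v;
    return 0;
}
```

This covers only the bit-level framing; measuring the start-bit length and sampling at bit midpoints belong to the receiver's timing logic.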
72
Requirements
73
Specification
[Figure: class diagram]
- Line-in*: input()
- Receiver: sample-in(), bit-out()
- Transmitter: bit-in(), sample-out()
- Line-out*: output()
- Associations are one-to-one (1, 1).
74
System architecture
- Interrupt handlers for samples:
  - input and output.
- Transmitter.
- Receiver.
75
Transmitter
- Waveform generation by table lookup:
    float sine_wave[N_SAMP] = { 0.0, 0.5, 0.866, 1, 0.866, 0.5, 0.0, -0.5, -0.866, -1.0, -0.866, -0.5, 0 };
[Figure: sampled sine wave plotted against time]
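One way to turn such a table into an FSK transmitter is to step through it at a different rate for each bit value. The sketch below makes several assumptions that are not in the slides: the table holds one period in 12 entries (the repeated endpoint is dropped), a 0 bit advances one entry per sample, a 1 bit advances two entries per sample (twice the frequency), and the function name `fsk_generate` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define N_SAMP 12  /* assumption: one sine period without the repeated endpoint */

static const float sine_wave[N_SAMP] = {
    0.0f, 0.5f, 0.866f, 1.0f, 0.866f, 0.5f,
    0.0f, -0.5f, -0.866f, -1.0f, -0.866f, -0.5f
};

/* Write n output samples for one bit into out, starting at table index phase.
   A 0 bit steps through the table one entry at a time; a 1 bit steps two
   entries at a time. Returns the phase at which the next bit should start. */
size_t fsk_generate(int bit, size_t phase, float *out, size_t n)
{
    assert(phase < N_SAMP);
    const size_t step = bit ? 2 : 1;
    for (size_t i = 0; i < n; i++) {
        out[i] = sine_wave[phase];
        phase = (phase + step) % N_SAMP;
    }
    return phase;
}
```

Carrying the phase from one bit to the next keeps the waveform continuous at bit boundaries; a real design would pick the two step sizes from the required frequencies and sample rate.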
76
Receiver
- Filters (FIR for simplicity) use circular buffers to hold data.
- A timer measures the bit length.
- A state machine recognizes start bits and data bits.
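A FIR filter over a circular buffer can be sketched as below. The filter length, the coefficient values, and the names `fir`, `fir_init`, and `fir_step` are all hypothetical; the slides give no filter design, so this only shows the buffer mechanics.

```c
#include <assert.h>
#include <stddef.h>

#define FIR_TAPS 4  /* hypothetical filter length */

struct fir {
    float coeff[FIR_TAPS];    /* coeff[k] multiplies the sample from k steps ago */
    float history[FIR_TAPS];  /* circular buffer of the most recent samples */
    size_t head;              /* index of the newest sample in history */
};

/* Load the coefficients and clear the sample history. */
void fir_init(struct fir *f, const float coeff[FIR_TAPS])
{
    assert(f != NULL);
    for (size_t i = 0; i < FIR_TAPS; i++) {
        f->coeff[i] = coeff[i];
        f->history[i] = 0.0f;
    }
    f->head = 0;
}

/* Store one new sample, overwriting the oldest, and return the output
   y[n] = sum over k of coeff[k] * x[n-k]. */
float fir_step(struct fir *f, float sample)
{
    f->head = (f->head + 1) % FIR_TAPS;
    f->history[f->head] = sample;

    float acc = 0.0f;
    size_t idx = f->head;
    for (size_t k = 0; k < FIR_TAPS; k++) {
        acc += f->coeff[k] * f->history[idx];
        idx = (idx + FIR_TAPS - 1) % FIR_TAPS;  /* step back one sample, wrapping */
    }
    return acc;
}
```

The circular buffer avoids shifting the whole history on every sample: only the head index moves, which matters when `fir_step` runs once per A/D sample.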
77
Hardware platform
- CPU.
- A/D converter.
- D/A converter.
- Timer.
78
Component design and testing
- Easy to test the transmitter and receiver on the host.
- The transmitter can be verified with speaker outputs.
- Receiver verification tasks:
  - start bit recognition;
  - data bit recognition.
79
System integration and testing
- Use loopback mode to test the components against each other.
  - Loopback in software or by connecting the D/A and A/D converters.