Published by Reginald Rodgers. Modified over 9 years ago.
1
Bitwidth Analysis with Application to Silicon Compilation
Mark Stephenson, Jonathan Babb, Saman Amarasinghe
MIT Laboratory for Computer Science
2
June 19th, 2000 www.cag.lcs.mit.edu/bitwise
Goal
For a program written in a high-level language, automatically find the minimum number of bits needed to represent:
– Each static variable in the program
– Each operation in the program
3
Usefulness of Bitwidth Analysis
Higher language abstraction
Enables other compiler optimizations:
1. Synthesizing application-specific processors
2. Optimizing for power-aware processors
3. Extracting more parallelism for SIMD processors
4
Bitwidth Opportunities
Runtime profiling reveals plenty of bitwidth opportunities. For the SPECint95 benchmark suite:
– Over 50% of operands use less than half the number of bits specified by the programmer.
5
Analysis Constraints
Bitwidth results must maintain program correctness for all input data sets
– Results are not runtime/data dependent
A static analysis can do very well, even in light of this constraint.
6
Bitwidth Extraction
Use abundant hints in the source language to discover bitwidths with near-optimal precision.
Caveats
– Analysis limited to fixed-point variables.
– We assume source program correctness.
7
The Hints
Bitwidth refining constructs
1. Arithmetic operations
2. Boolean operations
3. Bitmask operations
4. Loop induction variable bounding
5. Clamping operations
6. Type castings
7. Static array index bounding
8
1. Arithmetic Operations
Example
    int a;
    unsigned b;
    a = random();
    b = random();    /* a: 32 bits, b: 32 bits */
    a = a / 2;       /* a: 31 bits, b: 32 bits */
    b = b >> 4;      /* a: 31 bits, b: 28 bits */
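The narrowing on this slide can be reproduced with a small interval calculation. The following is a minimal sketch, not code from the Bitwise compiler: `range_t`, `bits_needed`, `range_div_const`, and `range_shr_const` are hypothetical names, and bounds are held in 64-bit integers so that the full 32-bit ranges can be represented.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical closed interval [lo, hi]; not a type taken from Bitwise. */
typedef struct { int64_t lo, hi; } range_t;

/* Minimum width that holds every value in r. Non-negative ranges are
   treated as unsigned, other ranges as two's complement. */
static int bits_needed(range_t r) {
    int bits = 1;
    if (r.lo >= 0) {
        while (bits < 63 && (r.hi >> bits) != 0) bits++;
        return bits;
    }
    /* Signed: need -2^(bits-1) <= lo and hi <= 2^(bits-1) - 1. */
    while (bits < 64 &&
           (r.lo < -((int64_t)1 << (bits - 1)) ||
            r.hi > ((int64_t)1 << (bits - 1)) - 1)) {
        bits++;
    }
    return bits;
}

/* Division by a positive constant is monotonic, so dividing both
   bounds gives the bounds of the result. */
static range_t range_div_const(range_t r, int64_t c) {
    assert(c > 0);
    range_t out = { r.lo / c, r.hi / c };
    return out;
}

/* Right shift of a non-negative range by a constant amount. */
static range_t range_shr_const(range_t r, int k) {
    assert(r.lo >= 0 && k >= 0 && k < 63);
    range_t out = { r.lo >> k, r.hi >> k };
    return out;
}
```

Under these assumptions, a full signed 32-bit range divided by 2 needs 31 bits, and a full unsigned 32-bit range shifted right by 4 needs 28 bits, matching the annotations on the slide.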
9
2. Boolean Operations
Example
    int a;
    a = (b != 15);
a: 32 bits -> a: 1 bit
10
3. Bitmask Operations
Example
    int a;
    a = random() & 0xff;
a: 32 bits -> a: 8 bits
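The bitmask rule rests on one fact: `x & mask` can never exceed `mask` when `mask` is non-negative. A minimal sketch of the resulting width computation follows; `and_mask_bits` is a hypothetical helper name, not part of Bitwise.

```c
#include <assert.h>
#include <stdint.h>

/* Width of (x & mask) for a constant mask: the result fits in the
   position of the highest set bit of mask. A zero mask still gets
   one bit so that the value 0 is representable. */
static int and_mask_bits(uint32_t mask) {
    int bits = 0;
    while (mask != 0) {
        bits++;
        mask >>= 1;
    }
    return bits > 0 ? bits : 1;
}
```

For the mask `0xff` on the slide this yields 8 bits regardless of what `random()` returns.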
11
4. Loop Induction Variable Bounding
Applicable to for-loop induction variables.
Example
    int i;
    for (i = 0; i < 6; i++) {
        …
    }
i: 32 bits -> i: 3 bits
12
5. Clamping Optimization
Multimedia codes often simulate saturating instructions.
Example
    int valpred;
    if (valpred > 32767)
        valpred = 32767;
    else if (valpred < -32768)
        valpred = -32768;
valpred: 32 bits -> valpred: 16 bits
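To illustrate why the clamp justifies a 16-bit result, here is a small self-contained sketch. `clamp16` mirrors the slide's code; `fits_int16` is a hypothetical check added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Saturate a 32-bit value into the signed 16-bit range, as on the slide. */
static int32_t clamp16(int32_t v) {
    if (v > 32767)
        v = 32767;
    else if (v < -32768)
        v = -32768;
    return v;
}

/* Hypothetical check: true when v is representable in 16 signed bits. */
static int fits_int16(int32_t v) {
    return v >= INT16_MIN && v <= INT16_MAX;
}
```

Whatever the input, `fits_int16(clamp16(v))` holds, which is the property a range analysis can derive from the two comparisons.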
13
6. Type Casting (Part I)
Example
    int a;
    char b;
    a = b;
a: 32 bits, b: 8 bits -> a: 8 bits, b: 8 bits
14
6. Type Casting (Part II)
Example
    int a;
    char b;
    b = a;
a: 32 bits, b: 8 bits -> a: 8 bits, b: 8 bits
15
7. Array Index Optimization
The range of an array index can be bounded by the bounds of the array.
Example
    int a, b;
    int X[1024];
    X[a] = X[4*b];
a: 32 bits, b: 32 bits -> a: 10 bits, b: 8 bits
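The 10-bit and 8-bit results follow from the array length and the index scale. The sketch below assumes a correct source program (as the deck's caveats do) and a non-negative index; `index_bits` is a hypothetical helper, not a Bitwise function.

```c
#include <assert.h>
#include <stdint.h>

/* Bits needed for a variable v used as X[scale * v] into an array of
   the given length, assuming every access is in bounds and v >= 0. */
static int index_bits(uint32_t length, uint32_t scale) {
    assert(length > 0 && scale > 0);
    uint32_t max_value = (length - 1) / scale;  /* largest v with scale*v < length */
    int bits = 1;
    while (bits < 32 && (max_value >> bits) != 0) bits++;
    return bits;
}
```

With `length = 1024`, a scale of 1 gives 10 bits for `a` and a scale of 4 gives 8 bits for `b`.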
16
Data-flow analysis
Three candidate lattices
– Bitwidth
– Vector of bits
– Data-ranges
Propagating bitwidths
    a = a + 1
a: 4 bits -> a: 5 bits
17
Data-flow analysis
Three candidate lattices
– Bitwidth
– Vector of bits
– Data-ranges
Propagating bit vectors
    a = a + 1
a: 1X -> a: XXX
18
Data-flow analysis
Three candidate lattices
– Bitwidth
– Vector of bits
– Data-ranges
Propagating data-ranges
    a = a + 1
a: [range values missing in the transcript]
Four bits are required
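The difference between the bitwidth and data-range lattices for `a = a + 1` can be sketched as follows. The function names are hypothetical, and the starting range [0, 13] is an assumed value chosen only for illustration, since the range shown on the slide did not survive in the transcript.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int64_t lo, hi; } range_t;

/* Bitwidth lattice: an add may carry out, so the result is one bit
   wider than the wider operand. */
static int width_after_add(int wa, int wb) {
    return (wa > wb ? wa : wb) + 1;
}

/* Data-range lattice: adding a constant shifts both bounds. */
static range_t range_add_const(range_t r, int64_t c) {
    range_t out = { r.lo + c, r.hi + c };
    return out;
}

/* Bits needed for a non-negative range. */
static int unsigned_bits(range_t r) {
    assert(r.lo >= 0);
    int bits = 1;
    while (bits < 63 && (r.hi >> bits) != 0) bits++;
    return bits;
}
```

Starting from 4 bits, `width_after_add(4, 1)` gives 5 bits, while the assumed range [0, 13] becomes [1, 14] and still fits in 4 bits. This is the kind of precision gain that motivates propagating ranges rather than widths.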
19
Propagating Data-Ranges
Propagate data-ranges forward and backward over the control-flow graph using the transfer functions described in the paper.
Use Static Single Assignment (SSA) form with extensions to:
– Gracefully handle pointers and arrays
– Extract data-range information from conditional statements
20
Example of Data-Range Propagation
    a0 = input()
    a1 = a0 + 1
    a1 < 0
      true branch:
        a2 = a1:(a1 < 0)
        a3 = a2 + 1
      false branch:
        a4 = a1:(a1 ≥ 0)
        c0 = a4
    a5 = φ(a3, a4)
    b0 = array[a5]
Annotation: a2 and a4 are defined by range-refinement functions.
21
Example of Data-Range Propagation
    a0 = input()
    a1 = a0 + 1
    a1 < 0
      true branch:
        a2 = a1:(a1 < 0)
        a3 = a2 + 1
      false branch:
        a4 = a1:(a1 ≥ 0)
        c0 = a4
    a5 = φ(a3, a4)
    b0 = array[a5]
Annotation: array's bounds are [0:9].
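The forward and backward steps in this example can be traced with three interval operations. This is a minimal sketch under stated assumptions, not the transfer functions from the paper: `meet`, `join`, `add_const`, and `propagate_example` are hypothetical names, `a1` is assumed to start as the full 32-bit signed range, and wraparound is not modeled.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int64_t lo, hi; } range_t;

static int64_t min64(int64_t a, int64_t b) { return a < b ? a : b; }
static int64_t max64(int64_t a, int64_t b) { return a > b ? a : b; }

/* Intersection: applied when a condition or array bound restricts a value. */
static range_t meet(range_t a, range_t b) {
    range_t r = { max64(a.lo, b.lo), min64(a.hi, b.hi) };
    return r;
}

/* Smallest range covering both inputs: applied at a phi node. */
static range_t join(range_t a, range_t b) {
    range_t r = { min64(a.lo, b.lo), max64(a.hi, b.hi) };
    return r;
}

static range_t add_const(range_t a, int64_t c) {
    range_t r = { a.lo + c, a.hi + c };
    return r;
}

typedef struct { range_t a3, a4, a5; } example_result_t;

/* Walk the slide's SSA example once forward, then push the array
   bound [0:9] backward through the phi node. */
static example_result_t propagate_example(void) {
    const range_t array_bounds = { 0, 9 };
    const range_t a1 = { INT32_MIN, INT32_MAX };  /* assumed: nothing known */
    const range_t lt0 = { INT64_MIN, -1 };
    const range_t ge0 = { 0, INT64_MAX };

    range_t a2 = meet(a1, lt0);      /* a2 = a1:(a1 < 0)  */
    range_t a3 = add_const(a2, 1);   /* a3 = a2 + 1       */
    range_t a4 = meet(a1, ge0);      /* a4 = a1:(a1 >= 0) */
    range_t a5 = join(a3, a4);       /* a5 = phi(a3, a4)  */

    a5 = meet(a5, array_bounds);     /* array[a5] must be in bounds */
    a3 = meet(a3, a5);               /* backward through the phi    */
    a4 = meet(a4, a5);

    example_result_t out = { a3, a4, a5 };
    return out;
}
```

In this sketch `a5` and `a4` narrow to [0, 9] and `a3` narrows to [0, 0], because `a3` is at most 0 on the true branch and at least 0 once the array bound is applied.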
22
What to do with Loops?
Finding the fixed point around back edges will often saturate data-ranges.
23
What to do with Loops?
Finding the fixed point around back edges will often saturate data-ranges.
Example
    a = 0;
    for (y = 1; y < 100; y++)
        a = a + 5;
SSA form
    a0 = 0
    y0 = 1
    y1 = φ(y0, y2)
    a1 = φ(a0, a3)
    y1 < 100
    a2 = a1 + 5
    y2 = y1 + 1
Range annotations, in extraction order: a: 0..0, y: 1..1, a: 0..0, a: 0..5, y: 1..2, a: 0..10, y: 1..3, a: 0..20, y: 1..5, a: 0..25, y: 1..6, a: 0.., y: 1.. (final upper bounds missing in the transcript)
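The saturation problem can be seen by iterating the range equations for this loop directly. The sketch below is an illustration under stated assumptions, not Bitwise's algorithm: `iterate_loop` and its `max_passes` cut-off are hypothetical, and any range still changing when the cut-off is reached is saturated to `INT32_MAX`.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int64_t lo, hi; } range_t;

static range_t join(range_t a, range_t b) {
    range_t r = { a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi };
    return r;
}

/* Naive fixed-point iteration for
       a = 0; for (y = 1; y < 100; y++) a = a + 5;
   Ranges that are still growing after max_passes are saturated. */
static void iterate_loop(int max_passes, range_t *a_out, range_t *y_out) {
    const range_t a0 = { 0, 0 }, y0 = { 1, 1 };
    range_t a1 = a0, y1 = y0;
    int a_changed = 0, y_changed = 0;

    for (int pass = 0; pass < max_passes; pass++) {
        range_t y_body = y1;
        if (y_body.hi > 99) y_body.hi = 99;           /* y1 < 100 holds in the body */
        range_t a2 = { a1.lo + 5, a1.hi + 5 };        /* a2 = a1 + 5 */
        range_t y2 = { y_body.lo + 1, y_body.hi + 1 }; /* y2 = y1 + 1 */

        range_t a_new = join(a0, a2);                 /* phi at the loop header */
        range_t y_new = join(y0, y2);
        a_changed = (a_new.lo != a1.lo || a_new.hi != a1.hi);
        y_changed = (y_new.lo != y1.lo || y_new.hi != y1.hi);
        a1 = a_new;
        y1 = y_new;
        if (!a_changed && !y_changed) break;
    }
    if (a_changed) a1.hi = INT32_MAX;                 /* give up: saturate */
    if (y_changed) y1.hi = INT32_MAX;
    *a_out = a1;
    *y_out = y1;
}
```

In this sketch `y` stabilizes at [1, 100] because the loop condition bounds it, but nothing in the range lattice ties `a` to `y`, so `a` grows by 5 on every pass until the cut-off forces saturation.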
24
Our Loop Solution
Find the closed-form solutions to commonly occurring sequences.
– A sequence is a mutually dependent group of instructions.
Use the closed-form solutions to determine final ranges.
25
Finding the Closed-Form Solution
    a = 0
    for i = 1 to 10
    a = a + 1
    for j = 1 to 10
    a = a + 2
    for k = 1 to 10
    a = a + 3
    ... = a + 4
27
Finding the Closed-Form Solution
    a = 0
    for i = 1 to 10
    a = a + 1
    for j = 1 to 10
    a = a + 2
    for k = 1 to 10
    a = a + 3
    ... = a + 4
Non-trivial to find the exact ranges
29
Finding the Closed-Form Solution
    a = 0
    for i = 1 to 10
    a = a + 1
    for j = 1 to 10
    a = a + 2
    for k = 1 to 10
    a = a + 3
    ... = a + 4
Can easily find a conservative range [range value missing in the transcript]
30
Solving the Linear Sequence
    a = 0
    for i = 1 to 10
    a = a + 1
    for j = 1 to 10
    a = a + 2
    for k = 1 to 10
    a = a + 3
    ... = a + 4
Figure out the iteration count of each loop.
31
Solving the Linear Sequence
    a = 0
    for i = 1 to 10
    a = a + 1
    for j = 1 to 10
    a = a + 2
    for k = 1 to 10
    a = a + 3
    ... = a + 4
Find out how much each instruction contributes to the sequence using the iteration count.
[Per-instruction products shown on the slide; operand values missing in the transcript]
32
Solving the Linear Sequence
    a = 0
    for i = 1 to 10
    a = a + 1
    for j = 1 to 10
    a = a + 2
    for k = 1 to 10
    a = a + 3
    ... = a + 4
Sum all the contributions together, and take the data-range union with the initial value.
[Sum of contributions shown on the slide; operand values missing in the transcript]
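The three steps (iteration counts, per-instruction contributions, sum and union with the initial value) can be sketched as below. Everything here is an assumption made for illustration: the transcript lost the loop indentation and the numeric ranges, so the sketch assumes the `j` and `k` loops are siblings nested inside the `i` loop, and `update_t`, `linear_sequence_range`, and `run_loops` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int64_t lo, hi; } range_t;

/* One additive update "a = a + step" that executes at most `trips`
   times in total across the whole loop nest. */
typedef struct { int64_t step, trips; } update_t;

/* Conservative range of a: sum the contribution of every update,
   then take the union with the initial value. */
static range_t linear_sequence_range(int64_t init, const update_t *u, int n) {
    range_t sum = { 0, 0 };
    for (int i = 0; i < n; i++) {
        int64_t total = u[i].step * u[i].trips;  /* contribution is [0,total] or [total,0] */
        if (total >= 0) sum.hi += total;
        else            sum.lo += total;
    }
    range_t shifted = { init + sum.lo, init + sum.hi };
    range_t out = { shifted.lo < init ? shifted.lo : init,
                    shifted.hi > init ? shifted.hi : init };
    return out;
}

/* Brute-force run of the assumed nesting, for comparison only. */
static int64_t run_loops(void) {
    int64_t a = 0;
    for (int i = 1; i <= 10; i++) {
        a = a + 1;
        for (int j = 1; j <= 10; j++) a = a + 2;
        for (int k = 1; k <= 10; k++) a = a + 3;
    }
    return a;
}
```

Under the assumed nesting the updates run 10, 100, and 100 times, so the sketch yields the range [0, 510], which matches the final value produced by `run_loops()`. A different nesting would change the trip counts but not the method.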
33
Results
Standalone Bitwise compiler.
– Bits cut from scalar variables
– Bits cut from array variables
With the DeepC silicon compiler.
34
Percentage of Original Scalar Bits
35
Percentage of Original Array Bits
36
DeepC Compiler Targeted to FPGAs
[Compiler flow diagram. Stage labels, in extraction order: Suif Frontend; C/Fortran program; Pointer alias and other high-level analyses; MachSuif Codegen; DeepC specialization; Raw parallelization; Traditional CAD optimizations; Physical Circuit; Bitwidth Analysis; Verilog]
37
FPGA Area
[Bar chart: area (CLB count, axis 0 to 2000), without Bitwise vs. with Bitwise. Benchmarks (main datapath width): adpcm (8), bubblesort (32), convolve (16), histogram (16), intfir (32), intmatmul (16), jacobi (8), life (1), median (32), mpegcorr (16), newlife (1), parity (32), pmatch (32), sor (32)]
38
FPGA Clock Speed (50 MHz Target)
[Bar chart: XC4000-09 clock speed (MHz, axis 0 to 150), without Bitwise vs. with Bitwise. Benchmarks: adpcm, bubblesort, convolve, histogram, intfir, intmatmul, jacobi, life, median, mpegcorr, newlife, parity, pmatch, sor]
39
Power Savings
[Bar chart: average dynamic power (mW, axis 0 to 5), without bitwidth analysis vs. with bitwidth analysis. Benchmarks: bubblesort, histogram, jacobi, pmatch]
40
Related Work
– Data-range propagation for branch prediction [Patterson]
– Symbolic data-range analysis [Rugina et al.]
– Bitwidth propagation [Ananian]
– Bit-vector propagation [Razdan, Budiu et al.]
41
Summary
Developed Bitwise: a scalable bitwidth analyzer
– Standard data-flow analysis
– Loop analysis
– Incorporate pointer analysis
Demonstrate savings when targeting silicon from high-level languages
– 57% less area
– up to 86% improvement in clock speed
– less than 50% of the power
43
Power Savings
C → ASIC
– IBM SA27E process, 0.15 micron drawn
– 200 MHz
Methodology
– C → RTL
– RTL simulation (register switching activity)
– Synthesis reports dynamic power
44
Mismatched Bitwidths
When the operands of an instruction differ in size:
– Type-conversion instructions are added that convert both operands to an integer of the wider of the two widths, with the appropriate sign.
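The conversion described here amounts to picking the wider width and then sign- or zero-extending each operand. A minimal sketch follows, assuming two's-complement values of 1 to 62 bits held in a 64-bit container; `common_width` and `extend` are hypothetical helpers, not instructions emitted by Bitwise or DeepC.

```c
#include <assert.h>
#include <stdint.h>

/* Width both operands are converted to: the wider of the two. */
static int common_width(int wa, int wb) {
    return wa > wb ? wa : wb;
}

/* Interpret the low `width` bits of raw as a signed (sign-extended)
   or unsigned (zero-extended) value. */
static int64_t extend(uint64_t raw, int width, int is_signed) {
    assert(width >= 1 && width <= 62);
    uint64_t mask = ((uint64_t)1 << width) - 1;
    int64_t value = (int64_t)(raw & mask);
    if (is_signed && (value >> (width - 1)) != 0)
        value -= (int64_t)1 << width;   /* sign bit set: subtract 2^width */
    return value;
}
```

For example, the 4-bit pattern `0xF` extends to -1 when treated as signed and to 15 when treated as unsigned, so the sign of each operand has to be known before widening.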