Chris Savarese, Yashesh Shroff, Greg Lawrence


1 MAP ART: Mapping Architectural Properties to an Algorithm for Redundant Triangulation
Chris Savarese, Yashesh Shroff, Greg Lawrence
Advisor: Dr. Jan Rabaey
April 27, 2000
CS252

2 Outline
- Introduction
- Background
- Time and Energy Profiling
- Parallel Architectures
- Conclusions: Our Dream Architecture
- Future Work

3 Introduction
Goal: Given a basic localization algorithm, explore architectural alternatives for the minimization of energy consumption.
- The concept of localization
- Energy saving techniques
- What we did…

4 Outline: Background

5 The Localization Algorithm
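[Figure: nodes U, N1, N2, N3]
Nodes: N1(x1, y1, z1), N2(x2, y2, z2), N3(x3, y3, z3); U(x, y, z)
$$
\begin{bmatrix}
x_1 - x_n & y_1 - y_n & z_1 - z_n \\
\vdots & \vdots & \vdots \\
x_{n-1} - x_n & y_{n-1} - y_n & z_{n-1} - z_n
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} b_1 \\ \vdots \\ b_{n-1} \end{bmatrix}
$$
$$A_{m \times 3} \, U_{3 \times 1} = B_{(n-1) \times 1}$$
$$A_{m \times 3} = Q_{m \times 3} \cdot R_{3 \times 3}$$
Solve: $U = R^{-1} Q^{T} b$ using QRdcmp()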
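The solve step on this slide can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' QRdcmp() code: the slide does not define b_i, so the sketch assumes the standard linearization obtained by subtracting the n-th range equation from the others, b_i = ½(|N_i|² − |N_n|² − d_i² + d_n²), where d_i is the measured distance to node N_i. The names `qr_decompose` and `locate`, and the use of modified Gram-Schmidt for the factorization, are hypothetical choices made for the example.

```python
import math

def qr_decompose(a):
    """Thin QR of an m x k matrix (list of rows) by modified Gram-Schmidt.

    Assumption: stands in for the slide's QRdcmp(); the original routine
    may use a different method (e.g. Householder reflections).
    """
    m, k = len(a), len(a[0])
    q = [row[:] for row in a]              # columns become orthonormal
    r = [[0.0] * k for _ in range(k)]      # upper-triangular factor
    for j in range(k):
        norm = math.sqrt(sum(q[i][j] ** 2 for i in range(m)))
        if norm == 0.0:
            raise ValueError("matrix is rank-deficient")
        r[j][j] = norm
        for i in range(m):
            q[i][j] /= norm
        for c in range(j + 1, k):
            dot = sum(q[i][j] * q[i][c] for i in range(m))
            r[j][c] = dot
            for i in range(m):
                q[i][c] -= dot * q[i][j]
    return q, r

def locate(anchors, dists):
    """Estimate U = (x, y, z) from n known node positions and n ranges."""
    *others, last = anchors
    *d_others, d_last = dists
    sq = lambda p: sum(c * c for c in p)
    # Rows (x_i - x_n, y_i - y_n, z_i - z_n), as on the slide.
    a = [[p[c] - last[c] for c in range(3)] for p in others]
    # Assumed right-hand side (not given on the slide).
    b = [0.5 * (sq(p) - sq(last) - d * d + d_last * d_last)
         for p, d in zip(others, d_others)]
    q, r = qr_decompose(a)
    qtb = [sum(q[i][j] * b[i] for i in range(len(b))) for j in range(3)]
    # Back substitution solves R U = Q^T b without forming R^-1.
    u = [0.0] * 3
    for j in (2, 1, 0):
        s = qtb[j] - sum(r[j][c] * u[c] for c in range(j + 1, 3))
        u[j] = s / r[j][j]
    return u
```

With more than four nodes the system is overdetermined, and the QR solve returns the least-squares position, which is where the redundancy in "redundant triangulation" helps with noisy ranges.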

6 The StrongARM Architecture
- Power: 200 mW, 0.25 µm, 1.5 V
- Clock Speed: 200 MHz
- Cache: 16 KB I-cache, 8 KB D-cache, 32-way set-associative, round-robin replacement
- 512 B, 2-way Minicache
- 31/16 GPR (32-bit)
- Auto-increment addressing
- No FP processor
- MAC

7 The Tensilica Xtensa Architecture
Processor Configuration
- Power: 200 mW, 0.25 µm, 1.5 V
- Clock Speed: 170 MHz
- Cache: 16 KB I-cache, 16 KB D-cache, direct mapped
- 32 Registers (32-bit)
- Xtensibility → use of TIE instructions
- No FP processor
- Zero overhead loops

8 Outline: Time and Energy Profiling

9 Profiling Results
Profiler output (functions listed): _fmul, lubksb, _fneq, _fdiv, _fmul, _frsb
Floating Point:
- StrongARM Processor: 68 µJ
- Xtensa Processor: 144 µJ
Energy = nom. core power × #cycles × clock period
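The energy estimate on this slide is a product of three terms, which can be sketched as follows. The 200 mW and 200 MHz values are the StrongARM figures from the architecture slide; the cycle count is a hypothetical placeholder, not a profiled number, and the function name `energy_joules` is made up for the example.

```python
def energy_joules(core_power_w, cycles, clock_hz):
    """Energy = nominal core power x number of cycles x clock period."""
    clock_period_s = 1.0 / clock_hz
    return core_power_w * cycles * clock_period_s

# Hypothetical run: 100,000 cycles on a 200 mW core clocked at 200 MHz.
example_energy = energy_joules(0.2, 100_000, 200e6)
```

Because the nominal core power is treated as constant, this model makes energy directly proportional to execution time, so any cycle-count reduction translates into the same relative energy saving.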

10 Fixed Point Arithmetic
Floating Point vs. Fixed Point
- Add / Sub are straightforward
- Multiply / Divide require shifting
Why can we use it for localization?
- Low accuracy requirements
- Limited range in measurements (< 10 m)
- Small matrices → small error propagation
[Figure: floating-point word layout: S | E | Mantissa]
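The points above about shifting can be illustrated with a small fixed-point sketch. The Q16.16 layout (16 integer bits, 16 fraction bits) is an assumption chosen for the example; the slides do not say which format was used. All helper names are hypothetical.

```python
FRAC_BITS = 16            # assumed Q16.16 format, not stated in the slides
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0

def to_fixed(x):
    """Convert a float to its nearest fixed-point integer."""
    return int(round(x * ONE))

def to_float(f):
    """Convert a fixed-point integer back to a float."""
    return f / ONE

def fx_add(a, b):
    return a + b          # add is a plain integer add

def fx_sub(a, b):
    return a - b          # subtract is a plain integer subtract

def fx_mul(a, b):
    # The raw product carries 2 * FRAC_BITS fraction bits; shift back down.
    return (a * b) >> FRAC_BITS

def fx_div(a, b):
    # Pre-shift the dividend so the quotient keeps FRAC_BITS fraction bits.
    return (a << FRAC_BITS) // b
```

Python integers never overflow, so this sketch hides one practical detail: on a 32-bit core the intermediate product in `fx_mul` needs 64 bits. With measurements limited to under 10 m, 16 integer bits leave ample headroom for the values themselves.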

11 Fixed Point Profiling Results
Profiler output (functions listed): _fmul, lubksb, _fneq, _fdiv, _fmul, _frsb
Processor    Floating Point    Fixed Point
StrongARM    68 µJ             43 µJ (37% less)
Xtensa       144 µJ            69 µJ (52% less)
Energy = nom. core power × #cycles × clock period
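The quoted savings follow directly from the energy figures on this slide (68 to 43 for StrongARM, 144 to 69 for Xtensa). A short check, with a hypothetical helper name:

```python
def percent_less(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0

strongarm_saving = percent_less(68, 43)   # floating point -> fixed point
xtensa_saving = percent_less(144, 69)     # floating point -> fixed point
```

Rounded to whole percentages, these reproduce the 37% and 52% shown on the slide.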

12 Outline: Parallel Architectures

13 Parallel Architectures
- Write sequential code in Matlab
- Extract data-dependencies
- Workload analysis
[Figure labels: CP1, CP2, CP3, P]

14 Outline: Conclusions: Our Dream Architecture

15 Our Dream Architecture
- Floating point hardware
- MAC hardware
- Zero overhead loops
- Auto increment
- Register file size
- Cache → Direct mapped

16 Future Work
- FPGA implementation
- Xtensa customizations
- TIE instructions
- Floating Point Coprocessor
- Realistic algorithm for PicoRadio

17 Many Thanks To…
- Dr. Bart Kienhuis, EECS Post Doc: Ptolemy and other tools, parallel issues
- Fred Burghardt, BWRC Technical Staff: PicoRadio Testbed
- Marlene Wan, BWRC Student: StrongARM Energy Profiling
- Vandana Prabhu, BWRC Student: Tensilica Tools
- The Berkeley Wireless Research Center

