1
MAP ART: Mapping Architectural Properties to an Algorithm for Redundant Triangulation
Chris Savarese, Yashesh Shroff, Greg Lawrence
Advisor: Dr. Jan Rabaey
April 27, 2000
CS252
2
Outline
- Introduction
- Background
- Time and Energy Profiling
- Parallel Architectures
- Conclusions: Our Dream Architecture
- Future Work
3
Introduction
Goal: Given a basic localization algorithm, explore architectural alternatives that minimize energy consumption.
- The concept of localization
- Energy saving techniques
- What we did…
4
Outline, current section: Background
5
The Localization Algorithm
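Figure: anchor nodes $N_1(x_1, y_1, z_1)$, $N_2(x_2, y_2, z_2)$, $N_3(x_3, y_3, z_3)$ surrounding the unknown node $U(x, y, z)$.

$$
\underbrace{\begin{bmatrix}
x_1 - x_n & y_1 - y_n & z_1 - z_n \\
\vdots & \vdots & \vdots \\
x_{n-1} - x_n & y_{n-1} - y_n & z_{n-1} - z_n
\end{bmatrix}}_{A_{m \times 3}}
\underbrace{\begin{bmatrix} x \\ y \\ z \end{bmatrix}}_{U_{3 \times 1}}
=
\underbrace{\begin{bmatrix} b_1 \\ \vdots \\ b_{n-1} \end{bmatrix}}_{b_{(n-1) \times 1}},
\qquad m = n - 1
$$

$$
A_{m \times 3} = Q_{m \times 3} \cdot R_{3 \times 3}
$$

Solve: $U = R^{-1} Q^{T} b$, with the factorization computed by QRdcmp().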
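As a minimal sketch of the solve step above, the following pure-Python code builds the system, factors A = QR, and back-substitutes to get U = R⁻¹Qᵀb. The slide does not define the entries of b; the code assumes the standard range-difference linearization b_i = ½(‖N_i‖² − ‖N_n‖² − d_i² + d_n²), where d_i is the measured distance from U to N_i. The function names (`build_system`, `qr_decompose`, `solve_position`) are hypothetical, and modified Gram-Schmidt stands in for whatever QRdcmp() does internally.

```python
import math

def build_system(anchors, dists):
    """Rows (x_i - x_n, y_i - y_n, z_i - z_n) and the ASSUMED right-hand side b_i."""
    xn, yn, zn = anchors[-1]
    dn = dists[-1]
    norm_n = xn * xn + yn * yn + zn * zn
    A, b = [], []
    for (xi, yi, zi), di in zip(anchors[:-1], dists[:-1]):
        A.append([xi - xn, yi - yn, zi - zn])
        # Assumption: standard linearization, not stated on the slide.
        b.append(0.5 * (xi * xi + yi * yi + zi * zi - norm_n - di * di + dn * dn))
    return A, b

def qr_decompose(A):
    """Thin QR by modified Gram-Schmidt. Returns Q as a list of columns and R (k x k)."""
    m, k = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(k)]
    Q = []
    R = [[0.0] * k for _ in range(k)]
    for j in range(k):
        v = cols[j][:]
        for i in range(j):
            R[i][j] = sum(q * x for q, x in zip(Q[i], v))
            v = [x - R[i][j] * q for x, q in zip(v, Q[i])]
        R[j][j] = math.sqrt(sum(x * x for x in v))  # zero if A is rank deficient
        Q.append([x / R[j][j] for x in v])
    return Q, R

def solve_position(anchors, dists):
    """Least-squares position U = R^-1 Q^T b."""
    A, b = build_system(anchors, dists)
    Q, R = qr_decompose(A)
    y = [sum(q * bi for q, bi in zip(col, b)) for col in Q]  # y = Q^T b
    k = len(R)
    u = [0.0] * k
    for i in reversed(range(k)):  # back substitution on upper-triangular R
        s = y[i] - sum(R[i][j] * u[j] for j in range(i + 1, k))
        u[i] = s / R[i][i]
    return u
```

With five anchors the system has four equations for three unknowns, which is the redundancy the least-squares solve exploits.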
6
The StrongARM Architecture
- Power: 200 mW, 0.25 µm, 1.5 V
- Clock speed: 200 MHz
- Cache: 16 KB I-cache, 8 KB D-cache, 32-way set-associative, round-robin replacement; 512 B, 2-way minicache
- 31/16 GPR (32-bit)
- Auto-increment addressing
- No FP processor
- MAC
7
The Tensilica Xtensa Architecture
Processor configuration
- Power: 200 mW, 0.25 µm, 1.5 V
- Clock speed: 170 MHz
- Cache: 16 KB I-cache, 16 KB D-cache, direct mapped
- 32 registers (32-bit)
- Xtensibility: use of TIE instructions
- No FP processor
- Zero overhead loops
8
Outline, current section: Time and Energy Profiling
9
Profiling Results
Profiler output (percentage columns lost in extraction): _fmul, lubksb, _fneq; _fdiv, _fmul, _frsb
Floating point:
- StrongARM processor: 68 µJ
- Xtensa processor: 144 µJ
Energy = nominal core power × #cycles × clock period
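The energy estimate is a product of three terms, which can be sketched as below. The function name `core_energy_joules` and the cycle count in the usage lines are hypothetical; only the formula itself (nominal core power times cycle count times clock period) comes from the slide.

```python
def core_energy_joules(nominal_power_w, cycles, clock_hz):
    """Energy = nominal core power x number of cycles x clock period (1 / clock_hz)."""
    clock_period_s = 1.0 / clock_hz
    return nominal_power_w * cycles * clock_period_s

# Illustrative only: 1,000,000 cycles is a made-up count, not a profiled value.
energy = core_energy_joules(nominal_power_w=0.2, cycles=1_000_000, clock_hz=200e6)
print(f"{energy * 1e3:.3f} mJ")
```

Because the model uses nominal power, a lower cycle count or a faster clock at the same power both reduce the estimate proportionally.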
10
Fixed Point Arithmetic
- Floating point vs. fixed point
  - Add / sub are straightforward
  - Multiply / divide require shifting
- Why can we use it for localization?
  - Low accuracy requirements
  - Limited range in measurements (< 10 m)
  - Small matrices, so small error propagation
Figure: floating-point word fields S, E, Mantissa
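The point about shifting can be illustrated with a small fixed-point sketch. The Q16.16 format (16 fractional bits) and all function names here are assumptions chosen for illustration; the slides do not say which format the authors used. Python integers do not overflow, so on a 32-bit core the product in `fx_mul` would need a 64-bit intermediate.

```python
FRAC_BITS = 16            # assumed Q16.16 format: 16 integer bits, 16 fractional bits
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0

def to_fixed(x):
    """Convert a float to fixed point by scaling and rounding."""
    return int(round(x * ONE))

def to_float(q):
    """Convert a fixed-point value back to a float."""
    return q / ONE

def fx_add(a, b):
    """Addition needs no correction: both operands share the same scale."""
    return a + b

def fx_sub(a, b):
    """Subtraction is likewise a plain integer subtract."""
    return a - b

def fx_mul(a, b):
    """The raw product carries 2 * FRAC_BITS fractional bits, so shift right."""
    return (a * b) >> FRAC_BITS

def fx_div(a, b):
    """Pre-shift the dividend left so the quotient keeps FRAC_BITS fractional bits."""
    return (a << FRAC_BITS) // b

print(to_float(fx_mul(to_fixed(2.5), to_fixed(4.0))))  # 10.0
print(to_float(fx_div(to_fixed(7.5), to_fixed(2.5))))  # 3.0
```

With measurements below 10 m, values of this size sit well inside the ±32768 range of Q16.16, which is one reason a narrow fixed-point format is plausible here.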
11
Fixed Point Profiling Results
Profiler output (percentage columns lost in extraction): _fmul, lubksb, _fneq; _fdiv, _fmul, _frsb
- StrongARM processor: floating point 68 µJ, fixed point 43 µJ (37% less)
- Xtensa processor: floating point 144 µJ, fixed point 69 µJ (52% less)
Energy = nominal core power × #cycles × clock period
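The quoted savings follow directly from the floating-point and fixed-point energy figures. A short check, with `percent_saving` as a hypothetical helper name:

```python
def percent_saving(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before

# Energy figures from the slide; the unit cancels in the ratio.
strongarm = percent_saving(68, 43)
xtensa = percent_saving(144, 69)
print(round(strongarm), round(xtensa))  # 37 52
```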
12
Outline, current section: Parallel Architectures
13
Parallel Architectures
- Write sequential code in Matlab
- Extract data-dependencies
- Workload analysis
Figure labels: CP1, CP2, CP3, P
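One way to picture the dependency-extraction and workload-analysis steps is as a task graph whose total work and longest dependent chain bound the achievable speedup. The sketch below is an illustration under assumptions, not the authors' tool flow: the task names, costs, and dependencies are hypothetical, and it makes no claim about what the figure labels denote.

```python
def critical_path_length(costs, deps):
    """Longest chain of dependent work; costs: task -> cost, deps: task -> prerequisites."""
    finish = {}

    def earliest_finish(task):
        if task not in finish:
            start = max((earliest_finish(d) for d in deps.get(task, [])), default=0)
            finish[task] = start + costs[task]
        return finish[task]

    return max(earliest_finish(t) for t in costs)

# Hypothetical task graph for a QR-based solve (costs in arbitrary cycle units).
costs = {"build_A": 4, "build_b": 2, "qr": 10, "qtb": 3, "backsub": 3}
deps = {"qr": ["build_A"], "qtb": ["qr", "build_b"], "backsub": ["qtb"]}

total_work = sum(costs.values())
span = critical_path_length(costs, deps)
print(total_work, span, total_work / span)  # upper bound on parallel speedup
```

When the span is close to the total work, as in this made-up graph, extra processors buy little, which is the kind of conclusion a workload analysis is meant to surface.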
14
Outline, current section: Conclusions: Our Dream Architecture
15
Our Dream Architecture
- Floating point hardware
- MAC hardware
- Zero overhead loops
- Auto increment
- Register file size
- Cache: direct mapped
16
Future Work
- FPGA implementation
- Xtensa customizations
- TIE instructions
- Floating point coprocessor
- Realistic algorithm for PicoRadio
17
Many Thanks To…
- Dr. Bart Kienhuis, EECS Post Doc: Ptolemy and other tools, parallel issues
- Fred Burghardt, BWRC Technical Staff: PicoRadio Testbed
- Marlene Wan, BWRC Student: StrongARM Energy Profiling
- Vandana Prabhu, BWRC Student: Tensilica Tools
- The Berkeley Wireless Research Center