Design Exploration of a Human-Machine Interface (HMI) Application
Francis Li, Sam Madden

The Application
Data glove interface
  –Wired, bulky
SmartDust scenario
  –A mote on each fingertip
Investigate implementations
Explore design alternatives

Proof-of-Concept Prototype
By SmartDust group
  –Atmel AVR microprocessor
  –RFM TR1000 radio
  –6 accelerometers
  –Host PC performs processing
Analysis
  –Power: 45 mW measured
  –Continuous operation of processor, accelerometers, communication with host

Application Analysis
Processing (on PC)
  –Do 20 times per second, for each accelerometer
    Read in X and Y samples (10 bits each)
    Compute rolling average to smooth input data
    Convert averages to polar coordinates
  –Dominates cost: sqrt, acos, atan
  –Secondary cost: floating point operations
  –Periodically, calculate gesture via simple template matching (static hand positions)
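The per-sample processing listed on this slide could look roughly like the Python sketch below. It is an illustration only, not the authors' code: the window length, the assumption that samples are already centered on zero, the use of `atan2` in place of the slide's acos/atan calls, the squared-error metric, and the template names are all hypothetical.

```python
import math
from collections import deque


class AccelChannel:
    """Smooths one accelerometer's X/Y samples and converts them to polar form."""

    def __init__(self, window=8):  # window length is an assumption
        self.xs = deque(maxlen=window)
        self.ys = deque(maxlen=window)

    def update(self, x, y):
        """Add one X/Y sample and return (magnitude, angle) of the rolling average."""
        self.xs.append(x)
        self.ys.append(y)
        avg_x = sum(self.xs) / len(self.xs)
        avg_y = sum(self.ys) / len(self.ys)
        magnitude = math.sqrt(avg_x ** 2 + avg_y ** 2)
        angle = math.atan2(avg_y, avg_x)  # stand-in for the slide's acos/atan
        return magnitude, angle


def match_gesture(angles, templates):
    """Return the name of the template with the smallest squared error."""
    def error(template):
        return sum((a - t) ** 2 for a, t in zip(angles, template))
    return min(templates, key=lambda name: error(templates[name]))
```

A caller would keep one `AccelChannel` per accelerometer, call `update` 20 times per second, and pass the collected angles to `match_gesture` whenever a gesture report is due.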

Application Analysis (cont.)
Communication (from Atmel to PC)
  –20 samples/sec × 6 accelerometers × 4 bytes/sample → 480 bytes/sec
  –115.6 kb/sec RF link
  –Radio = 12 mA @ 3 V when transmitting → 1.2 mW for radio alone
Real-world power >> 1.2 mW, due to software and analog overhead (real-world analysis later)
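The arithmetic behind these figures can be checked with a few lines of Python. The duty-cycle interpretation (the radio draws full transmit power only for the fraction of each second it is actually sending) is an assumption, but it reproduces the slide's 1.2 mW figure; the constant names are illustrative.

```python
SAMPLES_PER_SEC = 20
ACCELEROMETERS = 6
BYTES_PER_SAMPLE = 4
LINK_BITS_PER_SEC = 115_600  # "115.6 kb/sec" as quoted on the slide
TX_CURRENT_A = 0.012         # 12 mA while transmitting
SUPPLY_V = 3.0

bytes_per_sec = SAMPLES_PER_SEC * ACCELEROMETERS * BYTES_PER_SAMPLE
bits_per_sec = bytes_per_sec * 8

# Fraction of each second the radio must transmit to carry the data.
duty_cycle = bits_per_sec / LINK_BITS_PER_SEC

tx_power_mw = TX_CURRENT_A * SUPPLY_V * 1000  # power while transmitting
avg_radio_mw = tx_power_mw * duty_cycle       # time-averaged radio power

print(f"{bytes_per_sec} bytes/sec, duty cycle {duty_cycle:.3f}, "
      f"average radio power {avg_radio_mw:.2f} mW")
```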

Optimization Process
Match Application to HW
  –Local computation to reduce communication
  –Floating point → Fixed point
Match Hardware to Application
  –Distributed vs. Centralized
  –TI vs. Atmel
  –DSP
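As a rough illustration of the floating point to fixed point step, the sketch below shows integer-only arithmetic in a Q8.8 format. The slides do not say which format or routines were used, so the bit split and all function names here are assumptions.

```python
FRACTION_BITS = 8          # Q8.8 format; the bit split is an assumption
SCALE = 1 << FRACTION_BITS


def to_fixed(value):
    """Convert a float to a Q8.8 integer, rounding to nearest."""
    return int(round(value * SCALE))


def to_float(fixed):
    """Convert a Q8.8 integer back to a float (for inspection only)."""
    return fixed / SCALE


def fixed_mul(a, b):
    """Multiply two Q8.8 values; the raw product has 16 fraction bits."""
    return (a * b) >> FRACTION_BITS


def rolling_average(samples):
    """Average a power-of-two number of integer samples with a shift, not a divide."""
    count = len(samples)
    assert count > 0 and count & (count - 1) == 0, "count must be a power of two"
    shift = count.bit_length() - 1
    return sum(samples) >> shift
```

On a microcontroller without a floating point unit, shifts and integer multiplies like these replace software-emulated floating point operations, which is where the savings would come from.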

Communication vs. Computation
Estimates of local processing cost on Atmel (via simulation of GCC program)
  –Loop 6 × 20 / sec
    Average: 2223 instr. × 2
    CalcPolar: 19017 instr.
    → 2.83×10^6 instructions
  –Report gesture once per second
    FindGestureError: 5444 instr.
    10 gestures, 6 accelerometers → 5444 × 60 → 3.26×10^5 instr.
  –Memory operations are 2 cycles/instruction
  –Total cycles ~3.7M → 4 MHz → 13.5 mW
  –Communication = 8 bits/sec → negligible cost
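The instruction counts on this slide can be re-derived as follows. This is a sketch of the arithmetic only; the variable names are illustrative, and the computed loop total (about 2.82 million) differs slightly from the slide's rounded 2.83×10^6. The step from instructions to the ~3.7M cycle total depends on the share of 2-cycle memory operations, which the slide does not give, so it is not reproduced here.

```python
AVERAGE_INSTR = 2223          # rolling average, run twice (X and Y)
CALC_POLAR_INSTR = 19017
FIND_GESTURE_ERROR_INSTR = 5444

ACCELEROMETERS = 6
SAMPLES_PER_SEC = 20
GESTURES = 10

# Per-sample work: two averages plus one polar conversion.
per_sample_instr = 2 * AVERAGE_INSTR + CALC_POLAR_INSTR

# The loop runs once per accelerometer per sample period.
loop_instr_per_sec = per_sample_instr * ACCELEROMETERS * SAMPLES_PER_SEC

# Once per second, every gesture template is compared against every accelerometer.
gesture_instr_per_sec = FIND_GESTURE_ERROR_INSTR * GESTURES * ACCELEROMETERS

total_instr_per_sec = loop_instr_per_sec + gesture_instr_per_sec
print(loop_instr_per_sec, gesture_instr_per_sec, total_instr_per_sec)
```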

Communication vs. Computation 2
Cost of communication to host PC (measured)
  –4317 nJ/bit
    From Culler, Hill, Szewczyk, Woo, “System Architecture Directions for Networked Sensors.”
  –4317 nJ/bit × 480 bytes/sec × 8 = 16.57 mW
Processor still sucks power
  –Current implementation requires 13.5 mW
  –Using sleep, only 1.17 mW → 17.74 mW total
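The communication cost follows directly from the measured energy per bit. A short Python check, using illustrative names, gives about 16.58 mW for the radio and 17.75 mW in total, matching the slide's 16.57 mW and 17.74 mW up to rounding.

```python
ENERGY_PER_BIT_NJ = 4317      # measured cost quoted on the slide
BYTES_PER_SEC = 480
PROCESSOR_SLEEP_MW = 1.17     # processor cost when sleeping between samples

bits_per_sec = BYTES_PER_SEC * 8

# nJ/bit * bits/sec = nJ/sec = nW; divide by 1e6 to get mW.
radio_mw = ENERGY_PER_BIT_NJ * bits_per_sec / 1e6
total_mw = radio_mw + PROCESSOR_SLEEP_MW

print(f"radio {radio_mw:.2f} mW, total {total_mw:.2f} mW")
```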

Distributed vs. Centralized
Move some processing to each sensor
  –6 processors
    Each computing average, polar transform
    Transmitting 4 × 8 = 32 bits once/second
Using Atmel processor on each mote
  –Computation ~0.5M cycles/sec → 2 mA @ 2.7 V → 5.4 mW
  –Communication very small: 4317 nJ × 32 = 0.13 mW
  –5.53 mW/mote = 33.2 mW total (Bad idea!)
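The per-mote budget can be reproduced with the same energy-per-bit figure. This is a sketch with illustrative names; it yields about 5.54 mW per mote and 33.2 mW for six motes, in line with the slide.

```python
ENERGY_PER_BIT_NJ = 4317
BITS_PER_SEC_PER_MOTE = 4 * 8   # 4 bytes, sent once per second
CPU_CURRENT_A = 0.002           # 2 mA at ~0.5M cycles/sec
CPU_SUPPLY_V = 2.7
MOTES = 6

compute_mw = CPU_CURRENT_A * CPU_SUPPLY_V * 1000
comm_mw = ENERGY_PER_BIT_NJ * BITS_PER_SEC_PER_MOTE / 1e6  # nW -> mW
per_mote_mw = compute_mw + comm_mw
total_mw = per_mote_mw * MOTES

print(f"per mote {per_mote_mw:.2f} mW, total {total_mw:.1f} mW")
```

Computation dominates each mote's budget, which is why replicating the Atmel processor six times costs more than the centralized design.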

TI Microcontroller Evaluation
A microcontroller with better specs
  –MSP430P112
    330 µA/MHz active mode
    1.5 µA standby (6 ns wakeup)
Used IAR Systems compiler, profiler, development environment
Analysis
  –Centralized, 3.3 V, 4 MHz: 3.8 mW
  –Distributed, 2.5 V, 1 MHz: 0.48 mW per mote
    Six processors → 2.9 mW

TI DSP Evaluation
TMS320C54x
Used TI Code Composer Studio, compiler, simulator
Power
  –Active mode, 3.3 V, 10 MHz: 33 mW
  –IDLE1: 0.36 mW
Analysis
  –Centralized: 7.8 mW
  –Distributed: 1.6 mW per mote
    Six processors = 9.6 mW total
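One way to arrive at these figures is a simple duty-cycle model: the DSP runs in active mode for the cycles it needs each second and sits in IDLE1 for the rest. The slides do not state the model, so this is an assumption, but with the C54x cycle counts quoted on the C55x slide (2290440 centralized, 381740 distributed) it reproduces 7.8 mW and 1.6 mW.

```python
def average_power_mw(cycles_per_sec, clock_hz, active_mw, idle_mw):
    """Time-weighted average of active and idle power over one second."""
    duty = cycles_per_sec / clock_hz
    return duty * active_mw + (1.0 - duty) * idle_mw


CLOCK_HZ = 10_000_000   # 10 MHz
ACTIVE_MW = 33.0        # active mode at 3.3 V
IDLE_MW = 0.36          # IDLE1

centralized_mw = average_power_mw(2_290_440, CLOCK_HZ, ACTIVE_MW, IDLE_MW)
distributed_mw = average_power_mw(381_740, CLOCK_HZ, ACTIVE_MW, IDLE_MW)
six_motes_mw = 6 * distributed_mw

print(f"centralized {centralized_mw:.1f} mW, "
      f"distributed {distributed_mw:.1f} mW per mote, "
      f"six motes {six_motes_mw:.1f} mW")
```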

TI DSP Evaluation Part 2
TMS320C55x (two parallel MACs)
Same tools, with C55x compiler, simulator
Power: no details available...
  –Advertised: 0.9 V, 0.05 mW/MHz
Analysis
  –Centralized: 1170240 cycles (vs. 2290440 on the C54x), 2 MHz: 0.1 mW
  –Distributed: 195040 cycles (vs. 381740 on the C54x), 1 MHz: 0.05 mW
    Six processors: 0.3 mW total
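These estimates scale the advertised power figure by the clock rate needed to fit the workload into one second. A minimal Python check follows; the names are illustrative, and the advertised 0.05 mW/MHz is taken at face value.

```python
ADVERTISED_MW_PER_MHZ = 0.05

CENTRALIZED_CYCLES = 1_170_240
DISTRIBUTED_CYCLES = 195_040
CENTRALIZED_CLOCK_MHZ = 2
DISTRIBUTED_CLOCK_MHZ = 1
MOTES = 6

# Each workload must fit within the cycles available per second.
assert CENTRALIZED_CYCLES <= CENTRALIZED_CLOCK_MHZ * 1_000_000
assert DISTRIBUTED_CYCLES <= DISTRIBUTED_CLOCK_MHZ * 1_000_000

centralized_mw = ADVERTISED_MW_PER_MHZ * CENTRALIZED_CLOCK_MHZ
distributed_mw = ADVERTISED_MW_PER_MHZ * DISTRIBUTED_CLOCK_MHZ
six_motes_mw = distributed_mw * MOTES

# Cycle-count reduction relative to the C54x on the centralized workload.
speedup_vs_c54x = 2_290_440 / CENTRALIZED_CYCLES

print(centralized_mw, distributed_mw, six_motes_mw, round(speedup_vs_c54x, 2))
```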

Other Explorations
Hand-optimized code
  –Possible to massively reduce computation cost
  –FP/transcendentals conspicuously painful
  –Outside scope of our exploration
Radio hardware
  –Bluetooth ~100 times more efficient
Reconfigurable computing
Other circuitry (e.g. accelerometers)

Results Summary
Cost, in mW, of various implementations
  –17.74 using sleep mode, 28 without
  –31/104% improvement with same hardware
  –170x improvement with new hardware

Conclusions
By finding better mappings from SW → HW → Application, big performance gains are possible.
Effective use of local processor resources can reduce communication overheads, which are significant.
DSPs and other specialized processors can be a big win and don’t require hand-coded assembly or reconfigurable design.
