Avrora Scalable Sensor Simulation with Precise Timing Ben L. Titzer UCLA CENS Seminar, February 18, 2005 IPSN 2005
Background - WSNs Wireless Sensor Networks –Microcontroller and battery powered –Wireless communication –Event-driven programming model –Programmed with TinyOS and nesC Mica2 Dot - based on Atmel AVR microcontroller
Background - Microcontrollers Microcontrollers are small –128KB code, 4KB RAM, 4KB EEPROM –Processor, memory, IO on a single chip –4 to 16 MHz clock speed –Interrupt-driven programming model –No operating system
Motivation Developing sensor software is hard –Constrained resources, bare hardware –Narrow interface for debugging –Delicately timed driver code –Distributed communication Precise measurements are difficult Current tools do a poor job –TOSSIM, AtEmu
The Question Can we achieve simulation of entire sensor networks? --[1] And make it precise? --[2] And make it flexible? --[3] And make it fast? --[4] And make it scalable?
The Goals of Avrora Build a simulator for sensor networks –Cycle accurate –Energy accurate –Simulates sensor devices –Scales to large sensor networks Allow detailed profiling and instrumentation
[1] Precision Can we make it precise? –Instruction-level simulation –Cycle accurate –Accurate device models –Accurate radio / interference model –Well-known
[2] Flexibility Can we make the simulator flexible? –Well-designed software architecture –Clear interfaces –Implemented in Java, object-oriented –Instrumentation infrastructure “Nonintrusive Precision Instrumentation of Microcontroller Software” submitted to LCTES 2005
Avrora Software Architecture [Diagram labels: Microcontroller; Simulator; Interpreter; Event queue; On-chip devices (SPI, Timer, Ports) via IOReg interface; Off-chip devices (Radio, LEDs) via Pin interface and SPI interface; Platform] On-chip devices are controlled by the program through IOReg objects. Off-chip devices are controlled through individual pins or through UART and SPI interfaces. Time-triggered behavior is accomplished by inserting events into the event queue.
[3] Speed - Event Queue How can we achieve speed while retaining cycle accuracy? –Naïve implementation scales poorly –Event interface simplifies devices –Better performance –Key to achieving parallelism for sensor network simulations
Event Queue Illustration [Diagram labels: Simulator; Interpreter; DeltaQueue holding Timer0Event, ProfilingEvent, UARTEvent] Interpreter tracks cycles consumed by each instruction. Decrement head of queue and fire event(s) when necessary. Retains cycle accuracy. Allows for sleep optimization.
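The delta-queue idea on this slide can be sketched as follows. This is a minimal illustration, not Avrora's actual code: the class name `DeltaQueue` comes from the slide, but the methods `insert`, `advance`, and `cyclesUntilNext` and the use of `Runnable` for events are assumptions made for the example. Each link stores its firing time relative to the link before it, so advancing time only touches the head of the list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical delta queue: each link stores cycles relative to the previous link. */
final class DeltaQueue {
    private static final class Link {
        long delta;                                  // cycles after the previous link
        final List<Runnable> events = new ArrayList<>();
        Link next;
        Link(long delta) { this.delta = delta; }
    }

    private Link head;

    /** Schedule an event to fire `cycles` cycles from now (cycles must be >= 1). */
    void insert(Runnable event, long cycles) {
        if (cycles < 1) throw new IllegalArgumentException("cycles must be >= 1");
        Link prev = null, cur = head;
        // Walk the list, turning the absolute offset into a relative one.
        while (cur != null && cycles > cur.delta) {
            cycles -= cur.delta;
            prev = cur;
            cur = cur.next;
        }
        if (cur != null && cycles == cur.delta) {
            cur.events.add(event);                   // same firing time: share the link
            return;
        }
        Link link = new Link(cycles);
        link.events.add(event);
        link.next = cur;
        if (cur != null) cur.delta -= cycles;        // successor keeps its absolute time
        if (prev == null) head = link; else prev.next = link;
    }

    /** Advance by the cycles the last instruction consumed, firing every due event. */
    void advance(long cycles) {
        while (head != null && cycles >= head.delta) {
            cycles -= head.delta;
            Link due = head;
            head = head.next;
            for (Runnable e : due.events) e.run();
        }
        if (head != null) head.delta -= cycles;      // only the head is decremented
    }

    /** Cycles until the next event, or -1 if empty (a sleeping node could skip ahead by this). */
    long cyclesUntilNext() { return head == null ? -1 : head.delta; }
}
```

In this sketch the interpreter would call `advance(n)` after each instruction that takes `n` cycles. While a node sleeps, it could instead jump straight to `cyclesUntilNext()`, which is one way to read the slide's "sleep optimization".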
Single-node Performance
[4] Scalability Sensor networks have many nodes (tens to thousands) Software-controlled radios Microsecond-level interactions High-fidelity simulation needed for precise measurements
The AtEmu Approach Introduce global clock Step all nodes one clock cycle at a time Compute radio waveform (bit level) Problems: –Slow –Scales poorly - O(n^2) interactions –No parallelism
Observations Communication has latency Nodes only influence each other through communications Other than that, nodes run in parallel Hmm….
Parallel Simulation Allow all nodes to run in parallel –One thread per node –Extends single-node simulation to network –Better overall simulation performance New Problem: –Synchronization necessary to preserve timing and order of communications –Efficient solutions?
Send-Receive Problem Nodes send bytes to each other –No node should be allowed to run too far ahead of other nodes that might try to send a byte to it [Diagram: timelines for Node A and Node B marked T=0, T=k, and T=k+L, with events Send A1 and Receive A1] Node B should never be more than L cycles ahead of A
Sampling Problem Nodes can sample current radio traffic –Sample cannot be computed until all possible transmitters have passed the time when sampling was begun [Diagram: timelines for Node A and Node B marked T=0, T=k, and T=k+S, with events Send A1 and RSSI] Node B cannot complete sample until node A passes time k
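The sampling rule above (a sample begun at time k cannot complete until every possible transmitter has passed k) can be sketched with a shared clock table guarded by a monitor. This is an illustration under assumptions, not Avrora's implementation: the class `NeighborClocks` and its methods are hypothetical names, and every other node is treated as a possible transmitter.

```java
/** Hypothetical shared table of per-node simulation clocks, in cycles. */
final class NeighborClocks {
    private final long[] clock;

    NeighborClocks(int nodes) { clock = new long[nodes]; }

    /** Called by node `id` as its local time advances; wakes any waiting samplers. */
    synchronized void advanceTo(int id, long time) {
        clock[id] = time;
        notifyAll();
    }

    /** Current local time of node `id`. */
    synchronized long timeOf(int id) { return clock[id]; }

    /** Blocks node `id` until every other node has reached `time`. */
    synchronized void waitForNeighbors(int id, long time) {
        try {
            while (!allPassed(id, time)) wait();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for neighbors", e);
        }
    }

    // True when all nodes other than `id` have a clock of at least `time`.
    private boolean allPassed(int id, long time) {
        for (int n = 0; n < clock.length; n++) {
            if (n != id && clock[n] < time) return false;
        }
        return true;
    }
}
```

A node that starts an RSSI sample at time k would call `waitForNeighbors(id, k)` before computing the sample. A single lock is the simplest choice here; a real simulator would likely need something cheaper.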
Reality RSSI sampled infrequently Nodes both send and receive Latency L to send a byte on mica2 –7,372,800 Hz / 2400 bytes per second = 3072 cycles Sampling time S to estimate RSSI –13 ADC cycles * 64 = 832 cycles
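The two figures on this slide can be worked through explicitly. The derivation below makes three assumptions: the mica2 clock is 7.3728 MHz (the value that makes the slide's 3072-cycle result consistent), the 2400 figure is a byte rate, and the factor of 64 is the ADC clock prescaler. The microsecond values are derived from those assumptions and do not appear on the slide.

```latex
\begin{align*}
L &= \frac{f_{\text{clk}}}{\text{byte rate}}
   = \frac{7\,372\,800\ \text{cycles/s}}{2400\ \text{bytes/s}}
   = 3072\ \text{cycles per byte} \approx 416.7\ \mu\text{s} \\
S &= 13\ \text{ADC cycles} \times 64\ \frac{\text{CPU cycles}}{\text{ADC cycle}}
   = 832\ \text{cycles} \approx 112.8\ \mu\text{s}
\end{align*}
```

Under these assumptions S is a little over a quarter of L.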
Two Approaches Synchronization Intervals –Threads can’t run too far ahead –Period has to be smaller than L –Utilize event queue of each simulator Wait for Neighbors –Each thread waits for neighbors when necessary (sample or receive) –Requires fast global data structure Avrora uses both
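The synchronization-interval approach can be sketched with one thread per node and a barrier that every thread reaches once per period. This is a simplified stand-in, not Avrora's mechanism: the slide says Avrora uses each simulator's event queue to trigger synchronization, whereas this example uses `java.util.concurrent.CyclicBarrier` directly. The class `IntervalSync` and its `run` method are hypothetical. The method returns the largest clock gap it sampled between any two nodes, which should never exceed the period.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

/** Hypothetical demo: node threads may drift apart by at most one period. */
final class IntervalSync {
    /** Runs `nodes` threads for `intervals` periods; returns the largest skew observed. */
    static long run(int nodes, int intervals, long period) {
        AtomicLongArray clock = new AtomicLongArray(nodes);   // per-node cycle counters
        AtomicLong maxSkew = new AtomicLong();
        CyclicBarrier barrier = new CyclicBarrier(nodes);
        Thread[] threads = new Thread[nodes];
        for (int n = 0; n < nodes; n++) {
            final int id = n;
            threads[n] = new Thread(() -> {
                try {
                    for (int i = 0; i < intervals; i++) {
                        for (long c = 0; c < period; c++) {
                            long now = clock.incrementAndGet(id);   // "simulate" one cycle
                            // Record how far this node is ahead of each node.
                            for (int o = 0; o < nodes; o++) {
                                maxSkew.accumulateAndGet(now - clock.get(o), Math::max);
                            }
                        }
                        barrier.await();    // synchronization point: wait for all nodes
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[n].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining node threads", e);
        }
        return maxSkew.get();
    }
}
```

For example, `IntervalSync.run(4, 20, 64)` returns a value no greater than 64. Choosing the period below L, as the slide requires, means a byte sent in one interval is never due for delivery in a past that the receiver has already simulated.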
Synchronization Illustration [Animated diagram: timelines for Nodes A through E from T=0 to T=3L, marked with starting points, synchronization points, and network delivery points. Events shown as the animation progresses: RSSI samples, Send C1, Send C2, Send E1, and deliveries of C1 and C2+E1.]
Results - Scalability
Results - Parallelism
Measurements Accurate timing useful for –AEON: power and lifetime estimation –MAC layer tuning –Debugging driver code –Latency estimation for in-network processing –Real-time monitoring
Channel Utilization
Partial Preamble Loss Real radios take time to lock on –First few bits of transmission lost –Subsequent bytes misaligned –MAC software layer must compensate Latency L between transmission and first reception is larger –Admits more concurrency in simulation
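Adaptive Synchronization Assume the first $k \in [k_l, k_h]$ bits of the first bytes transmitted are lost. Latency for the first byte is then: $L_f = L + k_l \cdot \frac{\text{cycles}}{\text{bit}}$ [Diagram: Node A sends S: A1, S: A2, S: A3, S: A4 starting at T=0; Node B's timeline is marked T=k, T=k+L, T=k+2L, T=k+3L, with $k_l$ and the reception R: A2+A3]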
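The first-byte latency formula can be made concrete with a small helper. The sketch assumes 8 bits per byte and takes cycles per bit as L divided by 8. The class and method names are hypothetical, and the value 4 used for k_l below is purely illustrative, since measuring the properties of k is listed as future work.

```java
/** Hypothetical helper for the first-byte latency L_f = L + k_l * (cycles per bit). */
final class AdaptiveLatency {
    /**
     * @param byteLatency  L, cycles needed to send one byte
     * @param bitsPerByte  bits per transmitted byte (assumed 8 in the example)
     * @param kLow         k_l, the minimum number of leading bits lost
     */
    static long firstByteLatency(long byteLatency, int bitsPerByte, int kLow) {
        long cyclesPerBit = byteLatency / bitsPerByte;
        return byteLatency + kLow * cyclesPerBit;
    }
}
```

With L = 3072 cycles, 8 bits per byte, and an illustrative k_l of 4, `AdaptiveLatency.firstByteLatency(3072, 8, 4)` gives 3072 + 4 × 384 = 4608 cycles.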
Back to the Question Can we achieve simulation of entire sensor networks? --[1] And make it precise? yes --[2] And make it flexible? yes --[3] And make it fast? yes --[4] And make it scalable? yes
Future Work Performance Improvements –Sleeping nodes, M:N thread model –Single-node improvements Port to other mote platforms Co-simulation with real network Implement partial preamble loss –Measure properties of k
Acknowledgements NSF: money Jens Palsberg: patience Daniel Lee: device implementations Simon Han: testing, timing validation Olaf Landsiedel: AEON energy model CENS: access to a stupidly big Sun V880 machine Sun: for donating said machine