SimGate: Full-System, Cycle-Close Simulation of the Stargate Sensor Network Intermediate Node Ye Wen, Selim Gurun, Navraj Chohan, Chandra Krintz, Rich.


SimGate: Full-System, Cycle-Close Simulation of the Stargate Sensor Network Intermediate Node Ye Wen, Selim Gurun, Navraj Chohan, Chandra Krintz, Rich Wolski UC Santa Barbara IC-SAMOS 2006

2 Why Simulation?
- Sensor networks have unique characteristics
  - Resource-constrained, tiny devices
  - Heterogeneous, ad-hoc networks of thousands of nodes
  - Remote deployment locations
- Sensor network research requires substantial engineering, investment, and learning curve
  - Configuring/installing network devices is a hassle
  - Many bugs are not detected until run time
  - HW lacks a user interface; debugging requires HW modification
  - Analyzing erroneous behavior is not easy
- Simulation has significant advantages

3 Simulation
- (+) Provides a controlled environment
  - Explore new ideas with no physical deployment
  - Observe (and reproduce) hard-to-create behavior
- (+) Cost-effective solution
  - A single Mica-2 node ~ $125 (many needed in a real setup)
  - Sensors and sensor gateways significantly more expensive
- (-) Not the same as real-life execution
  - Simplifying model assumptions (e.g. in network, power models)
  - May not include all real-world scenarios
  - May require that applications be recompiled

4 Our goals
- Simulate heterogeneous sensor networks
  - Including both intermediate gateway nodes (like Stargate) and basic sensor nodes (like motes)
  - Model and simulate the interaction between different nodes
- Scalable full-system simulation that runs applications transparently
  - Must boot and run the OS and all device drivers
  - Must communicate with other simulated devices in a network
  - The application should not be able to distinguish whether it is running on a simulated or a real sensor net
- Simulated devices run real code and interact in the same way as physical, deployed devices
  - Requires a model of radio interaction (hard!)
  - Requires accurate simulation/emulation of each (possibly heterogeneous) device

5 Stargate Simulation
[Figure: Stargate block diagram. Source: Crossbow, Inc.]
- CPU: ARM v5TE instruction set with XScale DSP extension. No Thumb instruction set support yet
- Flash: memory-mapped I/O. State machine based on Intel Verilog model. Flash latency estimated from empirical data
- Boots and runs Familiar Linux

6 Functional vs. Cycle-Close Simulation
- MMU/pipeline simulation is expensive
  - 7-8 stages, 3 parallel pipelines
  - 32-entry TLB, 128-entry Branch Target Buffer, 32KB cache and 8-entry fill-write buffer
  - Important for cycle-close simulation
- Not needed when cycle accuracy is not a concern
  - Disable MMU simulation to improve simulation performance
  - Selectively turn off at compile time or run time
- MMU simulator monitors HW performance monitors
  - Enabling/disabling HPM turns MMU simulation on/off

7 Machine Code Interpreter
    while (!stop_sim) {
        instr = load_instr(cpc);            // fetch
        evtq->fire();                       // fire events
        mach()->get_sysIO()->do_cycle();    // IO cycle
        …
        switch (BITS(instr, 20, 27)) {
            // execute instructions
        }
        if (pipex_enable) {
            pipex->sim(instr);              // pipeline simulation enabled
        } else {
            evtq->advance_clock(3);         // otherwise advance the clock by a fixed 3
        }
    }
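The branch at the end of the interpreter loop is what separates functional from cycle-close simulation: the same instructions execute either way, and only the cycle accounting differs. The following is a minimal sketch of that idea, not SimGate's code. `PipelineModel`, `SimResult`, `run_program` and the toy timing rule are hypothetical names and values introduced for illustration; only the `pipex_enable` flag and the fixed advance of 3 come from the slide.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a pipeline timing model. The rule below is a toy,
// not XScale timing: instruction words with the low bit set cost 5 cycles.
struct PipelineModel {
    std::uint64_t cycles_for(std::uint32_t instr) const {
        return (instr & 1u) ? 5 : 1;
    }
};

struct SimResult {
    std::uint64_t instructions;
    std::uint64_t cycles;
};

// Walk a straight-line "program" and account for time in one of two ways:
// per-instruction cost from the pipeline model, or a fixed cost per instruction.
SimResult run_program(const std::vector<std::uint32_t>& program,
                      bool pipex_enable,
                      std::uint64_t fixed_cost = 3) {
    assert(fixed_cost > 0);
    PipelineModel pipex;
    SimResult result{0, 0};
    for (std::uint32_t instr : program) {
        // Functional execution of instr would happen here in both modes.
        result.instructions += 1;
        result.cycles += pipex_enable ? pipex.cycles_for(instr) : fixed_cost;
    }
    return result;
}
```

Calling `run_program(program, false)` and `run_program(program, true)` on the same input retires the same number of instructions but reports different cycle totals, which is the trade-off the two modes expose.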

8 Stargate-Mote Ensemble
- Mica-2 Mote simulation
  - Atmel processor
  - Serial interface
  - Packet radio
  - Boots and runs TinyOS
- Both simulators run applications transparently
- Currently implemented: a simple radio model
- Communication:
  - Stargate cannot communicate with Motes via radio
  - It instead uses a serial connection

9 Ensemble Architecture

10 Multi-Simulation Manager
- Couples device simulators
  - Provides create, join, start, stop
  - Forks a thread for each simulator
  - A configuration file specifies which binary to boot
- Provides a unified debug interface
  - Manager dispatches debug commands to simulator threads
  - Watch changes/control execution flow
- Supports check-pointing
  - Threads save/reload current state on manager's request
  - Improves booting time
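The check-pointing bullet can be illustrated with a small sketch: the manager asks every joined simulator to save its state, and later asks each one to reload it, so a booted system does not have to be booted again. This is a single-threaded illustration under stated assumptions, not the SimGate manager. `SimState`, `DeviceSim`, `Manager` and their members are hypothetical, and the per-simulator threads mentioned on the slide are deliberately left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical snapshot of a simulator: a real one would hold registers,
// memory and device state.
struct SimState {
    std::uint64_t cycles = 0;
    std::uint32_t pc = 0;
};

class DeviceSim {
public:
    explicit DeviceSim(std::string name) : name_(std::move(name)) {}

    // Pretend to execute n instructions of 4 bytes, one cycle each.
    void step(std::uint64_t n) {
        state_.cycles += n;
        state_.pc += static_cast<std::uint32_t>(4 * n);
    }

    SimState save() const { return state_; }
    void restore(const SimState& s) { state_ = s; }

    const std::string& name() const { return name_; }
    std::uint64_t cycles() const { return state_.cycles; }

private:
    std::string name_;
    SimState state_;
};

class Manager {
public:
    // Register a simulator; the caller keeps ownership.
    void join(DeviceSim* sim) {
        assert(sim != nullptr);
        sims_[sim->name()] = sim;
    }

    // Ask every simulator to save its current state.
    void checkpoint() {
        for (auto& entry : sims_) {
            saved_[entry.first] = entry.second->save();
        }
    }

    // Ask every simulator that has a saved state to reload it.
    void restore() {
        for (auto& entry : sims_) {
            auto it = saved_.find(entry.first);
            if (it != saved_.end()) {
                entry.second->restore(it->second);
            }
        }
    }

private:
    std::map<std::string, DeviceSim*> sims_;
    std::map<std::string, SimState> saved_;
};
```

A typical use would be to run each simulator through its boot sequence once, call `checkpoint()`, and start every later experiment from `restore()`.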

11 Ensemble Synchronization
- Clock synchronization
  - Execution rates of simulators should be proportional to real devices
  - Lock-step method: synchronize clocks on each serial byte transfer period
  - Serial transfer rate: 57.6 Kbits/second (128 Mote cycles)
- Ensemble simulation requires clock synchronization to the slowest simulation thread
  - Stargate simulator is the bottleneck (most complex)
- Communication
  - Packets assembled using the receiver's local clock
  - Packet rate: 19.2 Kbits/second
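A rough sketch of the lock-step method: each simulator may only run up to the cycle count its real clock would have reached at the end of the current synchronization period, and nobody starts the next period until everyone has finished this one. The clock rates here are assumptions, not taken from the slides: 7.3728 MHz for the mote (which gives 7,372,800 / 57,600 = 128 cycles per period, matching the figure on the slide) and 400 MHz for the Stargate. `LockstepSim`, `target_cycles` and `run_lockstep` are hypothetical names, and the loop is single-threaded where the real system uses one thread per simulator.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct LockstepSim {
    explicit LockstepSim(std::uint64_t hz) : clock_hz(hz) {}

    std::uint64_t clock_hz;    // assumed device clock rate
    std::uint64_t cycles = 0;  // local simulated clock

    // Advance the local clock to `target`. A real simulator would execute
    // instructions until its cycle counter reaches the target.
    void run_until(std::uint64_t target) {
        if (cycles < target) {
            cycles = target;
        }
    }
};

// Cycle count a device should have reached after `quanta` synchronization
// periods, with `sync_hz` periods per second. Computing from the running
// total avoids accumulating rounding error for clocks that do not divide evenly.
std::uint64_t target_cycles(std::uint64_t clock_hz,
                            std::uint64_t sync_hz,
                            std::uint64_t quanta) {
    assert(sync_hz > 0);
    return clock_hz * quanta / sync_hz;
}

// Run all simulators in lock step for the given number of periods.
void run_lockstep(std::vector<LockstepSim>& sims,
                  std::uint64_t sync_hz,
                  std::uint64_t quanta) {
    for (std::uint64_t q = 1; q <= quanta; ++q) {
        for (LockstepSim& sim : sims) {
            sim.run_until(target_cycles(sim.clock_hz, sync_hz, q));
        }
        // Implicit barrier: period q + 1 starts only after every simulator
        // has reached its target for period q.
    }
}
```

Because the slowest simulator determines when each period ends, the wall-clock speed of the whole ensemble follows the slowest thread, which the slide identifies as the Stargate simulator.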

12 Methodology
- Validation: simulation of Stargate and Mica-2 Motes working together
  - Standalone gateway scenario
    - A Stargate and a Mica-2 mote attached via UART
  - Packet forwarding engine scenario
    - A Stargate + Mica-2 gateway communicating with another Mica-2 via radio
- Benchmarks
  - Mediabench/Mibench to evaluate SimGate
  - Open source applications to evaluate SimGate/SimMote ensemble
  - Short/long forms

13 Full System Stargate Simulation
Application | Measured Cycles | Simulated Cycles | Error and Margin
Audio code demodulation (adpcm) | 3.07* | * | % +/- 0.36%
Jpeg Decode | 2.55* | * | % +/- 1.16%
Jpeg Encode | 5.41* | * | % +/- 0.41%
- Error rates are low: worst case 12.5%
- For many applications, measurements and simulations are statistically indistinguishable
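The error column compares cycles counted on the real device with cycles reported by the simulator. The slides do not define the formula, so the sketch below assumes the usual relative error, |simulated - measured| / measured, expressed as a percentage. `percent_error` is a hypothetical helper, and how the "+/-" margin was derived is not stated in the slides, so it is not modeled here.

```cpp
#include <cassert>
#include <cmath>

// Relative cycle-count error in percent, assuming the conventional definition.
// `measured` is the cycle count from real hardware, `simulated` from the simulator.
double percent_error(double measured, double simulated) {
    assert(measured > 0.0);
    return std::fabs(simulated - measured) / measured * 100.0;
}
```

With made-up inputs, a measured count of 1000 cycles against a simulated count of 1040 gives an error of 4%.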

14 Full System Stargate+Mote Simulation
Application | Measured Cycles | Simulated Cycles | Error and Margin
Ping | 9.41* | * | % +/- 6.6%
Localization | 2.13* | * | % +/- 1.5%
Send-and-Store | 1.63* | * | % +/- 1.25%
- Blocking (similar to RPC) and non-blocking communication benchmarks
- Worst case: 3.6%

15 Full System Stargate+2Motes Simulation
Application | Measured Cycles | Simulated Cycles | Error and Margin
Ping | 3.23* | * | % +/- 2.9%
Localization | 2.36* | * | % +/- 3.3%
Send-and-Store | 1.89* | * | % +/- 2.4%
- Worst case: 3.5%
- Error margins slightly larger than 1-Mote case on average: 2.1% vs. 2.4%

16 Execution Performance
- Slowdown with respect to execution time on a real device

17 Related Work
- Sensor network simulation
  - ATEMU, Avrora
    - Full system, multi-simulation, lock-step synchronization
    - No sensor network gateway support
  - EmTOS
    - A wrapper library for TOSSIM and EmStar
    - All applications must be recompiled to host machine code and linked to EmTOS
- Other simulation
  - SkyEye
    - Full-system ARM emulator including LCD and debugger
    - Not intended for sensor networks and multi-simulation

18 Summary
- Real sensor network environment not attractive during application development phase
  - Physical deployment challenges
  - Debugging difficulties
- SimGate provides full-system, functional and cycle-close simulation without any code modification
  - Cycle estimation error: 9%
  - SimGates and SimMotes: 4%
- 20X slower than real device
- 58X slower when cycle-close simulation enabled

19 Current and Future Work
- Scalability
  - Simulate large-scale networks using cluster computers
  - Partial results with only basic nodes (DiSenS); investigating support for Stargate
- Radio model
  - Under development but a really hard problem
  - If the community develops one first, we will incorporate it
- Power model
  - A significant requirement for sensor devices
  - Planned for near future
- Debugging and IDE
  - Ongoing work: S2DB to debug the complete network
  - An IDE for developing applications easily

20 Questions?

21 Stargate Simulation
- XScale processor
  - ARM v5TE instruction set with XScale DSP extensions
    - No Thumb instruction set support yet
  - MMU, GPIO, interrupt controller, real-time clock
- Flash memory
  - Memory-mapped I/O
  - State machine based on Intel Verilog model
  - Flash latency estimated from empirical data
- Wireless card
- Serial interface
- Boots and runs Familiar Linux