Programming Memory-Constrained Networked Embedded Systems

Programming Memory-Constrained Networked Embedded Systems Adam Dunkels PhD thesis defense February 15, 2007

Embedded systems Things with computers that are not computers themselves Refrigerators, toys, industrial robots, ... 98% of all microprocessors go into embedded systems Embedded systems are everywhere! 50% are much smaller than PC microprocessors 8-bit microprocessors 1024 bytes vs 1073741824 (~1 billion) bytes

Tiny microprocessors are huge

Networked, programming What if we could make them talk to each other? A wide range of new fascinating applications Memory constraints make programming the small embedded systems a challenge Typical example: 60k ROM, 2k RAM

Programming – programming in the small Individual program language statements Not programming in the large Not software engineering

What I’ve done TCP/IP networking for memory-constrained networked embedded systems Developed two embedded TCP/IP stacks: lwIP, uIP Simplifying event-driven programming for memory-constrained systems Protothreads, a novel programming mechanism Per-process multi-threading for event-driven systems Loadable modules for embedded operating systems Developed an embedded operating system with loadable module support: Contiki

Results of this thesis TCP/IP for embedded systems Now possible to use in systems an order of magnitude smaller Trade-off: memory for performance Protothreads – a novel programming abstraction Decrease program complexity Very small memory & performance overhead Dynamically loadable modules in the Contiki operating system First system in the community to have this Energy overhead of dynamic linking low Significant impact Software used by 100+ companies world-wide, in research projects, university courses; the papers are published at high-caliber conferences, ...

The details...

Networked embedded systems Some embedded systems already talk to each other Wireless car keys, the TV remote, mobile phones, ... The vision: wireless sensor networks Sensing, processing, radio on a single device Enable new applications

Wireless sensor networks – Applications Environmental monitoring Follow contamination flows Habitat observation Oceanography

Wireless sensor networks – Applications Health monitoring of buildings Cracks in bridges Mix sensors right into the concrete … etc

Wireless sensor networks may be just a vision ... ... but networked embedded systems are a reality!

The networked refrigerator Dave Hudson, principal software engineer for Ubicom Ltd, 26 September 2001: “Actually, refrigerators are probably one of the most network-connected appliances I know Not domestic refrigerators, but the commercial type that supermarkets use We’ve supplied tens of thousands of RS485-connected control and monitoring systems for such refrigerators This market is now headed towards Ethernet and TCP/IP connectivity because it has a tremendous benefits in terms of manageability and interoperability between different suppliers' equipment.”

1 TCP/IP for memory-constrained networked embedded systems

Traditional TCP/IP stacks are large Linux TCP/IP stack 100k code, 400k RAM µCLinux kernel 400k code, 1 megabyte RAM 60k ROM, 2k RAM...

µIP – Bottom-up approach Unconventional design Bottom-up design Single packet buffer Event-driven application interface Network IP ICMP UDP TCP Application

µIP results 5k code, 100 bytes – 2k RAM RFC compliant TCP, UDP, IP An order of magnitude smaller than existing work Possible, contrary to conventional wisdom Single-segment design of µIP has an unfortunate interaction with TCP’s delayed ACK mechanism

But: ability to communicate more important than throughput µIP trades memory for throughput Low memory usage, low throughput Small systems: not that much data Example – CubeSat: µIP with 100 bytes buffer 9600 bps RF link

Event-driven “In TinyOS, we have chosen an event model so that high levels of concurrency can be handled in a very small amount of space. A stack-based threaded approach would require that stack space be reserved for each execution context.” J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler, and K. Pister. System architecture directions for networked sensors. [ASPLOS 2000]

Problems with the event-driven model? “This approach is natural for reactive processing and for interfacing with hardware, but complicates sequencing high-level operations, as a logically blocking sequence must be written in a state-machine style.” P. Levis, S. Madden, D. Gay, J. Polastre, R. Szewczyk, A. Woo, E. Brewer, and D. Culler. The Emergence of Networking Abstractions and Techniques in TinyOS. [NSDI 2004]

2 Simplifying event-driven programming of memory-constrained systems

Threads vs events… Threads: sequential code flow Events: unstructured code flow Very much like programming with GOTOs

Explicit state machines for flow control The problem: using explicit state machines for flow control Created ad hoc by the programmer No formal specification Must be inferred from reading code Very much like using GOTOs
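To make the problem concrete, here is a minimal sketch of what an explicit flow-control state machine tends to look like in an event handler. It is a made-up example (send a request, wait for a reply, finish); the names `handle_event`, `struct machine`, and the states are hypothetical and not taken from any of the systems discussed in this talk.

```c
#include <stdbool.h>

/* Hypothetical example: send a request, wait for a reply, then finish.
   All names are illustrative and not taken from any real driver. */
enum state { STATE_IDLE, STATE_WAIT_REPLY, STATE_DONE };

struct machine {
  enum state state;
  int requests_sent;
};

/* Called once per event. The logical sequence "send, wait, done" is not
   visible as sequential code; it is encoded in the state variable. */
void handle_event(struct machine *m, bool reply_arrived) {
  switch(m->state) {
  case STATE_IDLE:
    m->requests_sent++;            /* stands in for sending the request */
    m->state = STATE_WAIT_REPLY;
    break;
  case STATE_WAIT_REPLY:
    if(reply_arrived) {
      m->state = STATE_DONE;
    }
    break;
  case STATE_DONE:
    break;                         /* nothing left to do */
  }
}
```

A reader has to reconstruct the intended order of operations from the transitions, which is the point the slide makes about inferring the state machine from the code.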

Contiki: Combining event-driven and threads Event-based kernel Low memory usage Single stack Multi-threading is a library For those applications that need it One thread, one extra stack The first system in the sensor network community to do this

However... Threads still require stack memory Unused stack space wastes memory 200 bytes out of 2048 bytes is a lot! A multi-threading library very difficult to port Requires use of assembly language Hardware specific Platform specific Compiler specific

Protothreads: A new programming abstraction A design point between events and threads Programming primitive: conditional blocking wait PT_WAIT_UNTIL(condition) Single stack Low memory usage, just like events Sequential flow of control No explicit state machine, just like threads Programming language helps us: if and while

An example protothread
int a_protothread(struct pt *pt) {
  PT_BEGIN(pt);
  /* … */
  PT_WAIT_UNTIL(pt, condition1);
  /* … */
  if(something) {
    /* … */
    PT_WAIT_UNTIL(pt, condition2);
    /* … */
  }
  PT_END(pt);
}

Proof-of-concept implementation of protothreads in ANSI C Implementation in pure ANSI C Uses the C preprocessor No need for a special preprocessor No assembly language Very portable Nothing is changed between platforms, C compilers However, two deviations from the mechanism Automatic variables not stored across blocking waits Limitations on the use of switch statements

How well do protothreads work?

Reduction of complexity
Columns: States before, States after, Transitions before, Transitions after, Reduction in lines of code
XNP: 25 20 32%
TinyDB DBBufferC: 23 24 24%
Mantis CC1000 driver: 15 19 23%
SOS CC1000 driver: 26 9 32 14 16%
Contiki TR1001 driver: 12 3 22 49%
uIP SMTP client: 10 45%
Contiki codeprop: 6 4 11 29%
Explicit flow-control state machines could be almost completely removed
Found state machine-related bugs in two of the programs when rewriting with protothreads

Execution time overhead is a few cycles
Compiler    State machine (CPU cycles)    Protothreads (CPU cycles)
gcc -Os     92                            98
gcc -O1     91                            94
Contiki TR1001 radio driver average execution time, MSP430 CPU cycles
Overhead: 3 – 6 CPU cycles
Protothreads useful even in time-critical code

Now we can program... But how do we get the programs onto the devices?

3 Loadable modules in the Contiki operating system

Traditional reprogramming Physically attach to the device Provide a special voltage to the chip Rewrite the memory of the chip Do this for all your devices out there What if we have 100 devices in 100 buildings? 10000 devices...

Transmitting programs over the network Load the software

Traditional systems: entire system a monolithic binary Most systems statically linked at compile-time Entire system is a monolithic binary Makes code smaller But: hard to change Must re-upload entire system

Contiki: run-time loadable program modules Core resident in memory Programs know the core The core does not know the programs Individual programs can be loaded/unloaded The first system in the sensor network community to do this

Can we use a standard mechanism for the dynamic loading? Can we do dynamic loading the "Linux" way in Contiki? Despite the resource constraints Run-time linking of ELF files Availability of tools, knowledge If we could, what would the overhead be? Compared to a tailored loading mechanism Compared to virtual machines

In comparison: two virtual machines CVM – Contiki VM A stack-based, typical virtual machine A compiler for a subset of Java The leJOS Java VM Adapted to run in ROM Executes Java byte code Bundled .class files

Memory footprint is small
ROM size of dynamic linker: ~2k code, ~4k symbol table (full Contiki system, automatically generated)
ELF loading feasible for memory-constrained systems
Module            ROM      RAM
Tailored loader   670
CVM               1344     8
ELF loader        5694     78
Java VM           13284    59
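The symbol table mentioned above is what lets a loaded program find functions in the core at run time. The following is only a rough sketch of the symbol-resolution step, assuming a table of name/address pairs searched linearly; `core_add`, `core_negate`, `struct symbol`, and `symbol_lookup` are hypothetical names, not the Contiki API. A real ELF loader would also parse the file and patch relocation entries, which is not shown here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical core functions that loaded programs may call. */
static int core_add(int a, int b) { return a + b; }
static int core_negate(int a) { return -a; }

/* Generic function pointer type; callers cast back to the real type. */
typedef void (*generic_fn)(void);

struct symbol {
  const char *name;
  generic_fn addr;
};

/* The core's symbol table: the core exports names, programs look them up. */
static const struct symbol core_symbols[] = {
  { "core_add",    (generic_fn)core_add },
  { "core_negate", (generic_fn)core_negate },
};

/* Resolve a symbol name to an address; returns NULL if it is unknown.
   A linear search keeps the sketch short. */
generic_fn symbol_lookup(const char *name) {
  size_t i;
  for(i = 0; i < sizeof(core_symbols) / sizeof(core_symbols[0]); i++) {
    if(strcmp(core_symbols[i].name, name) == 0) {
      return core_symbols[i].addr;
    }
  }
  return NULL;
}
```

A loader built along these lines would refuse to start a module if any of its undefined symbols resolves to NULL.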

Quantifying the energy consumption Measure the energy consumption: Radio reception, measured on CC2420, TR1001 Better estimate based on average Deluge overhead Storing data to EEPROM Linking, relocating object code Loading code into flash ROM Executing the code Two platforms: ESB, Telos Sky (both MSP430)

Energy consumption of the dynamic linker

Loading, linking native code vs virtual machine code
Energy consumption in mJ for loading an object tracking application
Columns: ELF (mJ), CVM (mJ), Java (mJ)
Reception: 29 2 22
Storing:
Linking: 3
Loading: 1 5
Total: 35 27

Execution time overhead
Computationally “heavy” code: 8x8 vector convolution
Energy per iteration (µJ): Native 0.75, CVM 65, Java 73
Code that uses a native code library: object tracking application
Most of the time is spent running native code
Energy per iteration (µJ): Native 0.54, CVM 0.95, Java 2.0

Break-even points, vector convolution “ELF16”

Break-even points, object tracking “ELF16”
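The break-even idea can be written down as a small calculation: native code costs more energy to load but less per iteration, so after some number of iterations it has paid for itself. The sketch below assumes the simple linear model total = load energy + iterations × energy per iteration; the function name and parameters are hypothetical, and any load energies passed in are the caller's own figures, not values from this talk.

```c
/* Number of iterations after which native code has used no more total
   energy than the virtual machine, under the linear model
   total = load + n * per_iteration (all energies in the same unit).
   Returns 0.0 if native code is not more expensive to load, and -1.0
   if native code saves nothing per iteration (no break-even point). */
double break_even_iterations(double load_native, double load_vm,
                             double iter_native, double iter_vm) {
  double extra_load = load_native - load_vm;
  double saved_per_iter = iter_vm - iter_native;

  if(saved_per_iter <= 0.0) {
    return -1.0;
  }
  if(extra_load <= 0.0) {
    return 0.0;
  }
  return extra_load / saved_per_iter;
}
```

With per-iteration energies such as those on the execution time slide, the larger the per-iteration gap between native and VM code, the fewer iterations are needed to reach break-even.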

Wrapping up

Future work Investigating the memory requirements/performance trade-off More memory = better performance? Single-buffer approach for other communication mechanisms Bottom-up approach to build other programming abstractions High-level sensor network programming

Conclusions Results TCP/IP for memory-constrained systems Protothreads: simplifies event-driven programming Dynamic loading/linking of code modules Low-complexity mechanisms for low-complexity systems Simple in hindsight! But it takes a lot of hard work to get there Some interesting future work ahead of us

The end of my part

Background – the TCP/IP stack Network IP ICMP UDP TCP Application UDP – best-effort datagrams TCP – connection oriented, reliable byte-stream, full-duplex Flow control, congestion control, etc IP – best-effort packet delivery Forwarding, fragmentation The hard parts are IP and TCP

The secrets of µIP Shared packet buffer Lower throughput Event-driven application programming interface

The secrets of µIP part I – A shared packet buffer All packets – both outbound and inbound – use the same buffer Size of buffer determines throughput Outbound packet Incoming packet Packet buffer

The secrets of µIP part I – A shared packet buffer II Implicit locking: single-threaded access Grab packet from network – put into buffer Process packet Put reply packet in the same buffer Send reply packet into network Packet buffer
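The four steps above can be sketched in a few lines of C. This is not uIP code: `packet_buf`, `net_receive`, `process_packet`, `net_send`, and `BUF_SIZE` are hypothetical stand-ins, and the "protocol processing" is a dummy transformation. The sketch only illustrates how a single buffer serves both the incoming packet and its reply.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 128   /* hypothetical size, not a uIP constant */

static unsigned char packet_buf[BUF_SIZE];  /* the one shared buffer */
static size_t packet_len;                   /* bytes currently in the buffer */

/* Step 1: grab a packet from the network and put it into the buffer.
   Returns the stored length, or 0 if the frame does not fit. */
size_t net_receive(const unsigned char *frame, size_t len) {
  if(len > BUF_SIZE) {
    return 0;
  }
  memcpy(packet_buf, frame, len);
  packet_len = len;
  return len;
}

/* Steps 2 and 3: process the packet and build the reply in the same
   buffer. The dummy "reply" is the request with each byte incremented. */
void process_packet(void) {
  size_t i;
  for(i = 0; i < packet_len; i++) {
    packet_buf[i] = (unsigned char)(packet_buf[i] + 1);
  }
}

/* Step 4: send the reply into the network (here: copy it to the caller)
   and mark the buffer as free for the next packet. */
size_t net_send(unsigned char *out, size_t out_size) {
  size_t n = packet_len < out_size ? packet_len : out_size;
  memcpy(out, packet_buf, n);
  packet_len = 0;
  return n;
}
```

Because the steps run one after another in a single thread of control, no lock is needed around the buffer, which is the implicit locking the slide refers to.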

The secrets of µIP part II – Throughput µIP trades throughput for RAM Low RAM usage = low throughput Small systems = not that much data! Ability to communicate more important than throughput!
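A back-of-the-envelope bound shows why a single outstanding segment limits throughput: the sender cannot transmit the next segment until the previous one is acknowledged, so at most one segment moves per round trip, plus any time the receiver holds back its ACK. The sketch below is an assumption-laden estimate, not a measurement from this thesis; the function name and the idea of modeling the delayed ACK as a fixed extra delay are simplifications introduced here.

```c
/* Upper bound on throughput, in bytes per second, for a sender that keeps
   only one segment in flight and waits for its ACK before sending more.
   rtt_s is the round-trip time and ack_delay_s the time the receiver
   holds back the ACK, both in seconds. Returns 0.0 for non-positive
   total delay, where the model does not apply. */
double single_segment_throughput(double segment_bytes, double rtt_s,
                                 double ack_delay_s) {
  double per_segment_time = rtt_s + ack_delay_s;
  if(per_segment_time <= 0.0) {
    return 0.0;
  }
  return segment_bytes / per_segment_time;
}
```

Under this model the bound scales directly with the segment size, which ties throughput to the size of the packet buffer.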

The smallest µIP configuration (that I know of) CubeSat kit by Pumpkin Inc Pico satellite construction kit 128 bytes of RAM for µIP

The secrets of µIP part III – Application Programming Interface I µIP does not have BSD sockets BSD sockets are built on threads Threads induce overhead (RAM) Instead – event-driven API Execution is always initiated by µIP Applications are called by µIP, call must return Protosockets – BSD socket-like API based on protothreads

The secrets of µIP part III – Application Programming Interface II
void example2_app(void) {
  struct example2_state *s = (struct example2_state *)uip_conn->appstate;
  if(uip_connected()) {
    s->state = WELCOME_SENT;
    uip_send("Welcome!\n", 9);
    return;
  }
  if(uip_acked() && s->state == WELCOME_SENT) {
    s->state = WELCOME_ACKED;
  }
  if(uip_newdata()) {
    uip_send("ok\n", 3);
  }
  if(uip_rexmit()) {
    switch(s->state) {
    case WELCOME_SENT:
      uip_send("Welcome!\n", 9);
      break;
    case WELCOME_ACKED:
      uip_send("ok\n", 3);
      break;
    }
  }
}

The secrets of µIP part III – Application Programming Interface III Event-driven API sometimes is problematic Not all programs are well-suited to it Programs are explicit state machines Protosockets: sockets-like API using protothreads Extremely lightweight stackless threads 2 bytes per-thread state, no stack Protothreads allow “blocking” functions, even when called from µIP

The secrets of µIP part III – Application Programming Interface IV
PT_THREAD(smtp_protothread(void)) {
  PSOCK_BEGIN(s);
  PSOCK_READTO(s, '\n');
  if(strncmp(inputbuffer, "220", 3) != 0) {
    PSOCK_CLOSE(s);
    PSOCK_EXIT(s);
  }
  PSOCK_SEND(s, "HELO ", 5);
  PSOCK_SEND(s, hostname, strlen(hostname));
  PSOCK_SEND(s, "\r\n", 2);
  if(inputbuffer[0] != '2') {
    /* … */
  }
  /* … */
  PSOCK_END(s);
}

The secrets of µIP part III – Application Programming Interface V API built from the bottom (network) and up Protothreads and protosocket API provides sequential programming Less overhead than “real” threads and the “real” socket API

Threads require per-thread stack memory Four threads, each with its own stack Thread 1 Thread 2 Thread 3 Thread 4

Events require one stack Four event handlers, one stack Stack is reused for every event handler Compare with threads: four threads, each with its own stack

Protothreads require one stack Four protothreads, one stack Just like events Compare with threads: four threads, each with its own stack

Six-line implementation
Protothreads implemented using the C switch statement
struct pt { unsigned short lc; };
#define PT_INIT(pt)          pt->lc = 0
#define PT_BEGIN(pt)         switch(pt->lc) { case 0:
#define PT_EXIT(pt)          pt->lc = 0; return 2
#define PT_WAIT_UNTIL(pt, c) pt->lc = __LINE__; case __LINE__: \
                             if(!(c)) return 0
#define PT_END(pt)           } pt->lc = 0; return 1
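To see the macros in action, here is a small sketch of a protothread that blocks on two conditions in sequence. The macros are copied from the slide with one change made here: the `pt` argument is parenthesized so that `PT_INIT(&pt)` expands correctly. The flags `data_ready`, `ack_received`, and `steps_done`, and the function `worker`, are hypothetical names for illustration only.

```c
/* Macros as on the slide, with (pt) parenthesized for safe expansion.
   Return values: 0 = still waiting, 1 = ended. */
struct pt { unsigned short lc; };
#define PT_INIT(pt)          (pt)->lc = 0
#define PT_BEGIN(pt)         switch((pt)->lc) { case 0:
#define PT_WAIT_UNTIL(pt, c) (pt)->lc = __LINE__; case __LINE__: \
                             if(!(c)) return 0
#define PT_END(pt)           } (pt)->lc = 0; return 1

/* Hypothetical flags, set elsewhere (for example by interrupt handlers). */
static int data_ready;
static int ack_received;
static int steps_done;

/* Each call resumes at the last PT_WAIT_UNTIL whose condition was false.
   The state lives in pt->lc, not on a per-thread stack. */
static int worker(struct pt *pt) {
  PT_BEGIN(pt);
  PT_WAIT_UNTIL(pt, data_ready);
  steps_done = 1;
  PT_WAIT_UNTIL(pt, ack_received);
  steps_done = 2;
  PT_END(pt);
}
```

The caller simply invokes `worker(&pt)` repeatedly, for instance once per event. Because the function returns at every blocking wait, automatic variables do not keep their values across waits, which is why the state here is held in static variables.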

Code footprint Average increase ~200 bytes Inconclusive

What’s wrong with using state machines? There is nothing wrong with state machines! State machines are a powerful tool Amenable to formal analysis, proofs But: state machines are typically used to control the logical program flow in many event-driven programs Like using gotos instead of structured programming The state machines are not formally specified Must be inferred from reading the code These state machines typically look like flow charts anyway We’re not the first to see this Protothreads: use language constructs for flow control

Why not just use multithreading? Multithreading the basis of (almost) all embedded OS/RTOSes! WSN community: Mantis, BTNut (based on multithreading); Contiki (multithreading on a per-application basis) Nothing wrong with multithreading Multiple stacks require more memory Networked = more concurrency than traditional embedded Can lead to more expensive hardware Preemption Threads: explicit locking; Protothreads: implicit locking Protothreads are a new point in the design space Between event-driven and multithreaded

Implementing protothreads Modify the compiler? There are many compilers to modify… (IAR, Keil, ICC, Microchip, GCC, …) Special preprocessor? Requires us to maintain the preprocessor software on all development platforms Within the C language? The best solution, if language is expressive enough Possible?

Are protothreads useful in practice? We know that at least thirteen different embedded developers have adopted them AVR, PIC, MSP430, ARM, x86 Portable: no changes when crossing platforms, compilers MPEG decoding equipment, real-time systems Others have ported protothreads to C++, Objective C Probably many more From mailing lists, forums, email questions Protothreads recommended twice in embedded “guru” Jack Ganssle’s Embedded Muse newsletter