DFuse and MediaBroker: System support for sensor-based distributed computing
Kishore Ramachandran, College of Computing, Georgia Tech
Colleagues: Rajnish Kumar, Bikash Agarwalla, Junsuk Shin, David Hilley, Dave Lillethun, Jin Nakazawa, Bin Liu, Xiang Song, Nova Ahmed, Seth Horrigan, Matt Wolenetz, Arnab Paul, Sameer Adhikari, Ilya Bagrak, Martin Modahl, Phil Hutto
Funding acknowledgments: The work has been funded in part by an NSF ITR grant CCR, an NSF grant CCR, the HP/Compaq Cambridge Research Lab, the Yamacraw project of the State of Georgia, and the Georgia Tech Broadband Institute. The equipment used in the experimental studies is funded in part by an NSF Research Infrastructure award EIA, and Intel Corp.
Abstract: That the future of information technology will be dominated by invisible or pervasive computing is a belief shared by several research groups. We focus on an important problem in this space: efficient system support for the distributed heterogeneous computing elements that make up this environment. We address the interactive, dynamic, and stream-oriented nature of this application class and develop appropriate system support. The DFuse framework provides a data fusion API along with an algorithm enabling application-directed, energy-aware role assignment to the nodes of a sensor network. The MediaBroker framework supports type-aware data transport with capabilities for data transformations and type extension. These abstractions have been implemented on top of the D-Stampede distributed programming system. In this talk I will present elements of the D-Stampede programming system and the DFuse and MediaBroker frameworks. I will also present preliminary results of the system support we have built so far, from both a programming-ease and a performance standpoint.
Computing/Communication Continuum
[Diagram: the continuum spans from sensor networks (cameras, sensor nodes; low connectivity / wireless) through the ambient computing infrastructure to High Performance Computing (HPC) resources (high connectivity).]
What does this enable?
Application context:
- Distributed sensors with varying capabilities
- Control loops involving sensors and actuators
- Rapid response time at computational perception speeds
Sample applications:
- Video-based surveillance
- Transportation
- Emergency response: collaborative search and rescue, evacuation management
- "Aware" environments
We address the interactive, dynamic, and stream-oriented nature of this application class and develop appropriate system support.
Application Characteristics
- Physically distributed heterogeneous devices
- Interfacing and integrating with the physical environment
- Diverse stream types (low to high bandwidth)
- Diverse computation, communication, and power capabilities (from embedded sensors to clusters)
- Stream fusion/transformation, with loadable code
- Resource scarcity
- Dynamic join/leave of application components
Two projects address these characteristics: DFuse primarily addresses the dynamic mapping of application fusion points to available resources (infrastructure adaptation), and MediaBroker primarily addresses data distribution for this set of characteristics.
Key Requirements
- Middleware: programming infrastructure (Stampede), distributed data fusion (DFuse), stream data management (MediaBroker)
- Network level: protocol stack, energy-efficient routing (SensorStack)
- Ambient HPC resources: grid scheduling (Streamline)
Stampede (TPDS 2003)
- Distributed programming system covering the continuum
- Temporal stream data transport through channels: put(ts, item), get(ts, item), consume(ts)
- Many-to-many connections, time-sequenced data, correlation of streams, automatic garbage collection
- Multilingual (C, C++, Java) program components sharing data abstractions
- Multiple platforms (x86-Linux, ARM-Linux, x86-Windows, x86-Solaris, Alpha-Tru64)
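To make the channel abstraction concrete, here is a minimal single-process sketch of a Stampede-style temporal channel using the put/get/consume operations named on the slide. The names, signatures, and in-memory layout are illustrative assumptions, not the actual Stampede API; real channels are distributed, many-to-many, and garbage-collected.

```c
/* Toy Stampede-style channel: items indexed by virtual timestamp.
 * consume(ts) marks an item reclaimable (Stampede does this GC
 * automatically across the distributed system). */
#include <stdio.h>

#define MAX_TS 128

typedef struct {
    char data[64];
    int  present;   /* has an item been put at this timestamp? */
    int  consumed;  /* consume(ts) marks the item reclaimable  */
} slot_t;

typedef struct { slot_t slots[MAX_TS]; } channel_t;

void put(channel_t *c, int ts, const char *item) {
    snprintf(c->slots[ts].data, sizeof c->slots[ts].data, "%s", item);
    c->slots[ts].present = 1;
}

/* Returns 0 on success; a real implementation could block, or return
 * the newest item, depending on the get semantics requested. */
int get(channel_t *c, int ts, char *out, size_t len) {
    if (!c->slots[ts].present || c->slots[ts].consumed) return -1;
    snprintf(out, len, "%s", c->slots[ts].data);
    return 0;
}

void consume(channel_t *c, int ts) { c->slots[ts].consumed = 1; }

int main(void) {
    channel_t ch = {0};
    char item[64];
    put(&ch, 42, "frame-42");                  /* producer side */
    if (get(&ch, 42, item, sizeof item) == 0)  /* consumer side */
        printf("got %s at ts=42\n", item);
    consume(&ch, 42);                          /* allow GC of ts=42 */
    return 0;
}
```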
Key Requirements
- Middleware: programming infrastructure (Stampede), distributed data fusion (DFuse), stream data management (MediaBroker)
- Network level: protocol stack, energy-efficient routing (SensorStack)
- Ambient HPC resources: grid scheduling (Streamline)
DFuse (ACM SenSys 2003)
[Diagram: a fusion task graph (cameras feeding filter and collage fusion points, ending in a sink) overlaid on a future sensor network of today's handhelds, gateways, and ubiquitous high-bandwidth sensors.]
Challenges:
- Overlaying the application onto the physical network
- Programming abstraction for data fusion
Fusion channel (a "virtual sensor"): applies a fusion function f() to streams from producers (sensors or other fusion channels) and delivers the result to consumers (actuators or other fusion channels).
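The "virtual sensor" composition can be illustrated with a toy sketch: a fusion channel applies a function f to its producers, and can itself serve as a producer for another fusion channel. The names, the struct layout, and the pull-based evaluation are illustrative assumptions, not the DFuse API.

```c
/* Toy fusion channel: combines m producer streams with a function f;
 * a leaf (f == NULL) stands in for a raw sensor. */
#include <stdio.h>

typedef double (*fusion_fn)(const double *inputs, int m);

typedef struct fusion_channel {
    fusion_fn f;
    int m;                                /* fan-in */
    struct fusion_channel *producers[4];  /* upstream channels */
    double leaf_value;                    /* reading for leaf sensors */
} fusion_channel;

/* Pull one fused item: recursively evaluate producers, then apply f. */
double pull(fusion_channel *c) {
    if (c->f == NULL) return c->leaf_value;  /* raw sensor */
    double in[4];
    for (int i = 0; i < c->m; i++) in[i] = pull(c->producers[i]);
    return c->f(in, c->m);
}

double average(const double *in, int m) {
    double s = 0;
    for (int i = 0; i < m; i++) s += in[i];
    return s / m;
}

int main(void) {
    fusion_channel cam1 = { NULL, 0, {0}, 20.0 };
    fusion_channel cam2 = { NULL, 0, {0}, 30.0 };
    fusion_channel filter  = { average, 2, { &cam1, &cam2 }, 0 };
    fusion_channel collage = { average, 1, { &filter }, 0 };  /* fusion of a fusion */
    printf("fused reading: %.1f\n", pull(&collage));
    return 0;
}
```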
DFuse Architecture
[Diagram: on each node, a fusion module and a placement module sit above the resource monitor / routing layer interface, the operating system, and the hardware. The application supplies the task graph (cameras feeding filter and collage stages into a sink), the fusion functions Fusion(Arg[]) {..}, and a cost function.]
DFuse functions:
- Placement of fusion and relay points
- Plumbing as required
- Dynamic migration of fusion points
Status of DFuse
- Fusion and placement modules implemented on top of Stampede
- A prototype iPAQ farm (simulating a future sensor network) runs DFuse
- Stampede and DFuse are available for download
- MSSN, a simulator for sensor networks, provides middleware design guidance (BaseNets'04, IJNM'05, Wolenetz's thesis) and is also available for download
Key Requirements
- Middleware: programming infrastructure (Stampede), distributed data fusion (DFuse), stream data management (MediaBroker)
- Network level: protocol stack, energy-efficient routing (SensorStack)
- Ambient HPC resources: grid scheduling (Streamline)
MediaBroker: An Architecture for Stream Management
- A clearinghouse for sensors and actuators in a given space
- Stream registry, discovery, plumbing, and sharing
- Dynamic connection of sources (producers) and sinks (consumers)
- Dynamic injection and safe execution of transformation code (feature extraction, fusion)
- Dynamic sharing of transformations and streams
Elements
- Type server: stores data types, relationships, and transformation code
- Transformation engine: allows safe execution of injected code on cluster nodes
- Scheduler: manages workload and allows prioritizing transformation requests
- Data brokers: manage connections between producers and consumers
[Diagram: data items flow from a producer through data brokers to a consumer; transformation engines sit between the brokers, the type server supplies transformation code, and the scheduler dispatches transformation requests across the transformation engines.]
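A minimal sketch of the type-aware transport these elements enable: a data broker consults a type-server table for a transformation from the producer's stream type to the type the consumer requested. The table-based lookup, the type names, and all function names are illustrative assumptions, not the real MediaBroker API.

```c
/* Sketch of type-aware transport: the "type server" is a table of
 * registered transformations; the "data broker" looks one up and
 * applies it on the path from producer to consumer. */
#include <stdio.h>
#include <string.h>

typedef void (*transform_fn)(const char *in, char *out, size_t len);

static void raw_to_gray(const char *in, char *out, size_t len) {
    snprintf(out, len, "gray(%s)", in);   /* stand-in for real image code */
}

typedef struct { const char *from, *to; transform_fn fn; } type_entry;

/* "Type server": registered transformations between stream types. */
static type_entry type_table[] = {
    { "video/raw", "video/gray", raw_to_gray },
};

transform_fn lookup(const char *from, const char *to) {
    for (size_t i = 0; i < sizeof type_table / sizeof *type_table; i++)
        if (!strcmp(type_table[i].from, from) && !strcmp(type_table[i].to, to))
            return type_table[i].fn;
    return NULL;
}

int main(void) {
    /* "Data broker": producer publishes video/raw, consumer wants video/gray. */
    transform_fn fn = lookup("video/raw", "video/gray");
    char out[64];
    if (fn) { fn("frame-7", out, sizeof out); printf("deliver %s\n", out); }
    return 0;
}
```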
What it enables: dynamic instantiation and sharing of transformations
MediaBroker Status
- MediaBroker v.1: a subset of the functionality; application example: the Family Intercom (IEEE PerCom 2004, PMC 2005)
- MediaBroker++: currently under development, with EventWeb built on top
Key Requirements
- Middleware: programming infrastructure (Stampede), distributed data fusion (DFuse), stream data management (MediaBroker)
- Network level: protocol stack, energy-efficient routing (SensorStack)
- Ambient HPC resources: grid scheduling (Streamline)
SensorStack: Adaptability vs. Stackability of Protocol Layers
- Adaptability deals with cross-layer data: a must for wireless sensor networks
- Stackability deals with cross-layer functionalities: a must for modular design
- Principle behind SensorStack: decouple data from functionalities
SensorStack without Cross-layering Support
[Diagram: the protocol stack (application, fusion layer, AODV/flood routing, time sync service, MAC) without cross-layering support. Each layer holds information the others need (neighborhood information and topology; data periodicity, size, and delay tolerance; fusion and data transmission requirements; link quality and neighborhood status changes; time synchronization accuracy and accuracy requirements), but there is no channel for sharing it across layers.]
Information Exchange Service (IES)
Design goals:
- Efficient use of limited memory
- A simple information-sharing interface
- Extensibility
- Asynchronous delivery of information
- Complex event notification
IES Design
Data management module:
- Stackability through a standard data interface
- Publish/subscribe-based shared memory system
- Fully associative cache for performance
Event management module:
- Adaptability by notifying layers when to adapt
- Complex event notification
- Reactive memory access
A minimal sketch of the publish/subscribe interface follows.
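Here is a minimal sketch of an IES-style publish/subscribe store: layers publish named cross-layer data, and other layers subscribe to be called back when a value changes. The interface and all names are assumptions for illustration, not the actual SensorStack IES API, and a real IES would deliver notifications asynchronously.

```c
/* Toy IES: a named-value store with change callbacks.
 * (No overflow checks in this toy; capacities are fixed.) */
#include <stdio.h>
#include <string.h>

#define MAX_KEYS 8
#define MAX_SUBS 4

typedef void (*ies_cb)(const char *key, int value);

typedef struct {
    char   key[24];
    int    value;
    ies_cb subs[MAX_SUBS];
    int    nsubs;
} ies_entry;

static ies_entry store[MAX_KEYS];
static int nkeys;

static ies_entry *find_or_add(const char *key) {
    for (int i = 0; i < nkeys; i++)
        if (!strcmp(store[i].key, key)) return &store[i];
    snprintf(store[nkeys].key, sizeof store[nkeys].key, "%s", key);
    return &store[nkeys++];
}

void ies_subscribe(const char *key, ies_cb cb) {
    ies_entry *e = find_or_add(key);
    e->subs[e->nsubs++] = cb;
}

void ies_publish(const char *key, int value) {
    ies_entry *e = find_or_add(key);
    e->value = value;
    for (int i = 0; i < e->nsubs; i++)   /* asynchronous in a real IES */
        e->subs[i](key, value);
}

/* The fusion layer reacts to routing-layer neighborhood changes. */
static void on_change(const char *key, int value) {
    printf("fusion layer notified: %s = %d\n", key, value);
}

int main(void) {
    ies_subscribe("neighbor_count", on_change);  /* fusion layer */
    ies_publish("neighbor_count", 5);            /* routing layer */
    return 0;
}
```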
[Figure legend: DRE = data request event, DAE = data available event, RSE = rule satisfied event, EMM = event management module.]
SensorStack with Cross-layering support
- Implemented in TinyOS and on iPAQ Linux
- Initial results (increased application lifetime) are very promising
Application Lifetime Improvement with Cross-layer Information
[Plots: DFuse performance without cross-layer information vs. DFuse performance with cross-layer information used for role migration.]
Key Requirements
- Middleware: programming infrastructure (Stampede), distributed data fusion (DFuse), stream data management (MediaBroker)
- Network level: protocol stack, energy-efficient routing (SensorStack)
- Ambient HPC resources: grid scheduling (Streamline)
[Diagram: MediaBroker, DFuse, Stampede, and SensorStack layered over the grid infrastructure.]
Streamline (MMNC 2006): the Scheduling Problem
Input:
- Computation and communication requirements of the stages of a coarse-grain dataflow graph
- Application-specified constraints
- Current resource (processing and bandwidth) availability
- Resource-specific constraints
Output:
- Placement of the stages of the pipeline on available HPC resources
Performance criteria: latency and throughput of the application
Streamline Scheduler
[Diagram: stages {S0, S1, S2, S3} flow through three steps: stage prioritization orders them as {S2, S0, S1, S3}; resource filtering narrows resources {R0, R1, R2, R3} to candidates {S2: {R0, R2, R3}}; resource selection makes the assignment {S2: R0}.]
- Expects to maximize throughput by assigning the best resource to the most needy stage
- Additional policies concerning resources, applications, and local schedulers can be incorporated into the cost of a particular assignment
A greedy sketch of the three-step loop appears below.
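This sketch walks the three steps greedily under simplified assumptions: a stage's "need" is its compute demand, a resource is feasible if its speed covers the demand, and cost is demand/speed. It illustrates the structure, not the published algorithm; with these toy numbers the priority order {S2, S0, S1, S3} and the assignment {S2: R0} happen to match the slide's example.

```c
/* Greedy three-step sketch: prioritize stages, filter resources,
 * select the lowest-cost feasible resource. */
#include <stdio.h>
#include <stdlib.h>

typedef struct { const char *name; double demand; } stage;
typedef struct { const char *name; double speed; int busy; } resource;

static int by_need_desc(const void *a, const void *b) {
    double d = ((const stage *)b)->demand - ((const stage *)a)->demand;
    return (d > 0) - (d < 0);
}

int main(void) {
    stage    st[] = { {"S0",2.0}, {"S1",1.0}, {"S2",4.0}, {"S3",0.5} };
    resource rs[] = { {"R0",8.0,0}, {"R1",2.0,0}, {"R2",4.0,0}, {"R3",6.0,0} };
    int ns = 4, nr = 4;

    /* Step 1: stage prioritization (most needy first). */
    qsort(st, ns, sizeof *st, by_need_desc);

    for (int i = 0; i < ns; i++) {
        int best = -1;
        double best_cost = 0;
        for (int j = 0; j < nr; j++) {
            /* Step 2: resource filtering (feasible, not yet taken). */
            if (rs[j].busy || rs[j].speed < st[i].demand) continue;
            /* Step 3: resource selection (lowest estimated cost). */
            double cost = st[i].demand / rs[j].speed;
            if (best < 0 || cost < best_cost) { best = j; best_cost = cost; }
        }
        if (best >= 0) {
            rs[best].busy = 1;
            printf("%s -> %s\n", st[i].name, rs[best].name);
        }
    }
    return 0;
}
```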
Streamline System Architecture
Results
- Outperforms Condor by an order of magnitude for both compute-bound and communication-bound kernels, particularly under non-uniform load conditions
- Performs close to simulated annealing, but with considerably lower scheduling time (by a factor of 1000)
Streaming Grid
- Streamline scheduler integrated into the Globus toolkit
- Example: a mock traffic-monitoring app built as a service composition using Web Services
- [Figure: the blue boxes are the Streaming Grid services.]
Demo
Demo Output (live if stars are aligned!)
What is the takeaway? Several technologies work together: service composition, Streamline scheduling, Web Services, and the Globus toolkit process the video stream and show the output in the browser, while the Streaming Grid instantiates, connects, and manages the streaming app.
Sample output: Timestamp (sequence number of the captured image): 147; Detection flag (number of desired objects found): 2.
This demo illustrates streaming data being processed through an application service pipeline that is instantiated, connected, and managed by the Streaming Grid. The camera image is the result of crawling the Georgia Navigator website and archiving the images (currently the crawler is offline and has archived 200 images, which are repeatedly processed through the services). The image on the left is processed by the services, and the result (identifying the number of cars with a given color and model) is shown above.
Computing/Communication Continuum
[Diagram: the continuum spans from sensor networks (cameras, sensor nodes; low connectivity / wireless) through the ambient computing infrastructure to High Performance Computing (HPC) resources (high connectivity).]
Conclusions
- MediaBroker: stream transformation and typed transport engine
- DFuse: data fusion architecture
- Stampede: distributed programming environment
- SensorStack: Information Exchange Service for cross-layer support
- Streamline: scheduling support for streaming apps on the grid
Ongoing Work
- Programming tools
- Adaptive resource management
- Marrying grid computing and ubiquitous computing
- Wireless networking considerations: SensorStack, energy-efficient protocols, mobility considerations
Web Links http://www.cc.gatech.edu/~rama
Applause!!!
Sensor Networks
Current sensor network nodes:
- Limited capabilities (Mica-2)
- Habitat monitoring, vineyard monitoring, and similar applications
Recent trends:
- iMotes: 8x radio, 4x CPU, increased power draw
- Telos motes: >3x radio, similar CPU and power
Future sensor network nodes:
- Today's handhelds and gateways
- Ubiquitous high-bandwidth sensors
Most current sensor networks assume an arrangement of limited nodes such as Berkeley motes (Mica-2 in the table). These networks have been used successfully in application domains such as habitat monitoring and agricultural monitoring. We believe future sensor network nodes will have communication and computation capabilities similar to today's handhelds. For example, recent Intel Motes provide more capability (roughly a 4x faster processor and Bluetooth radios), with a consequent increase in power usage relative to their Mica-2 predecessors, while other recent devices such as Telos motes consume less power than Mica-2 while providing increased communication bandwidth. Current sensor network "gateway" devices combine a standard OS with small form factors and power management capabilities; for example, Stargate devices are built with Intel's XScale package and run Linux, enabling contemporary sensor networks to be bridged to higher-capability backbones. Given these trends, we believe the entire continuum of devices will become increasingly energy-efficient, enabling future sensor networks composed of nodes with capabilities similar to today's Stargate devices. Similarly, higher-bandwidth sensors such as cameras are becoming ubiquitous and cheaper, a trend helped by the cell-phone industry. Source: CACM 47(6), "The platforms enabling wireless sensor networks", Hill, Horton, Kling, Krishnamurthy, 2004.
HPC resources Unix / Linux / XP cluster
Overlaying the fusion graph
[Diagram: a fusion application task graph with uncompress, filter, and collage stages overlaid onto the network.]
Fusion applications need hierarchical, in-network fusion support.
Family Intercom
- Clients connected via MediaBroker
- Type attributes include audio rate and buffer specs
- A client-tracking system is used for I-combo selection
- "Colocated" clients can perform mixing for N-way conferencing
Our intercom implementation consists of an Intercom Control Client and many Intercom Clients spread throughout the home. Each Intercom Client accepts commands from the Intercom Control Client telling it which source its sink should listen to.
Fusion Module: Structure Management
[Diagram: producers feeding a fusion channel that feeds consumers, with the plumbing between them.]
Based upon the fusion channel abstraction, the fusion module provides a rich set of functionalities to ease the distributed programming of fusion applications.
Fusion Module: Computation Management
- Dynamic embedding of user-specified fusion functions (f(), f1(), f2())
- Correlation and aggregation of input streams
- Fusion channel migration
Which data items on the input streams are to be fused? One simple way is to tag every data item generated at the end points and fuse based upon the tag value; the tag can be the timestamp. Fusion channel migration is supported to allow run-time changes in the task-graph mapping: the placement module may decide to move a fusion function from one node to another because the original node is getting overloaded.
Fusion Module: Memory Management
- Buffering: to synchronize the input streams
- Caching: to support migration
- Prefetching: to decrease latency (if the application desires)
A sketch of timestamp-based input synchronization follows.
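The timestamp-tag correlation described above might look like this minimal sketch: an item is fused only when every input stream has buffered an item carrying the same timestamp, and an incomplete timestamp falls back to partial fusion. The buffer layout and names are illustrative assumptions.

```c
/* Toy timestamp correlation across two buffered input streams. */
#include <stdio.h>

#define NSTREAMS 2
#define NBUF     8

/* buf[s][i] holds the timestamp of the i-th buffered item on
 * stream s; -1 marks an empty slot. */
static int buf[NSTREAMS][NBUF] = {
    { 1, 2, 3, 5, -1, -1, -1, -1 },   /* stream 0 */
    { 2, 3, 4, 5, -1, -1, -1, -1 },   /* stream 1 (ts=1 was lost) */
};

static int has_ts(int s, int ts) {
    for (int i = 0; i < NBUF; i++)
        if (buf[s][i] == ts) return 1;
    return 0;
}

int main(void) {
    /* Fuse every timestamp present on all input streams. */
    for (int ts = 1; ts <= 5; ts++) {
        int ready = 1;
        for (int s = 0; s < NSTREAMS; s++)
            if (!has_ts(s, ts)) { ready = 0; break; }
        if (ready)
            printf("fuse items at ts=%d\n", ts);   /* call f() here */
        else
            printf("ts=%d incomplete: partial fusion or skip\n", ts);
    }
    return 0;
}
```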
Fusion Module: Error and Status Management
- Failure / latency hiding
Data packet loss or node failures may make data items unavailable on some of the input streams. The module supports partial fusion, with a status update to the application, so that it can find out whether some of the sensors are really busted!
Placement Module
[Diagram: a task graph with sources S1, S2, S3 feeding collage and filter stages into a sink (display).]
The user inputs the task graph and a cost function. The placement module does the role assignment using a distributed algorithm; the sink node knows the task graph and its associated cost function.
A Simple Solution? Why not push the fusion points toward their sources?
Why it is not as simple as it looks:
- Data sources may be lying all around
- Fusion points may cause data expansion
- The network is dynamic
DFuse: Placement Module’s Algorithm
Three phases:
- Naïve role assignment: deploy the task graph into the network; start the app at a designated root node and delegate task-graph subtree roles to the richest neighbors, recursively
- Optimization: given anticipated application behavior (attributes in the task graph), perform rapid local decisions to adjust which node performs which role, guided by an application-provided cost function (see the sketch after this list)
- Maintenance: monitor actual application behavior and perform less frequent optimizations, again using the application-provided cost function
"Local decisions" involve only the current node hosting a fusion point in the task graph and its network-adjacent neighbors. A role transfer is triggered only if the target neighbor can perform the same role at lower cost. DFuse assumes an ideal routing layer that provides hop-count information and local neighbor lists.
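The local optimization step might look like this sketch: the node currently hosting a fusion role compares its own cost against each neighbor's and transfers the role only to a neighbor that can host it at lower cost. The node layout and the cost() stand-in (an MT1-like transmission cost that ignores power; MT2/MTP would use it) are illustrative assumptions.

```c
/* One local role-transfer decision, as described above. */
#include <stdio.h>

typedef struct { int id; double power; int hops_to_src, hops_to_sink; } node;

/* Stand-in cost: transmission load weighted by hop counts
 * (input rate 2 kbps, output rate 1 kbps). */
static double cost(const node *n) {
    return 2.0 * n->hops_to_src + 1.0 * n->hops_to_sink;
}

/* One optimization round: returns the id of the node that should
 * host the fusion role after considering all neighbors. */
int optimize_role(const node *self, const node *neighbors, int n) {
    const node *best = self;
    for (int i = 0; i < n; i++)
        if (cost(&neighbors[i]) < cost(best))
            best = &neighbors[i];
    return best->id;     /* == self->id means no role transfer */
}

int main(void) {
    node self = { 0, 1.0, 2, 1 };
    node nbrs[] = { { 1, 1.0, 1, 2 }, { 2, 1.0, 1, 1 } };
    printf("host role at node %d\n", optimize_role(&self, nbrs, 2));
    return 0;
}
```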
Example Cost Function: MT1 (Minimize Transmission Cost 1)
[Diagram: a source sends a 2 Kbps stream to the fusion function f(), whose 1 Kbps output goes to the sink; n1 and n2 are candidate hosts along the path. Hosting f at n2 costs c_MT1(n2, f) = 9 kbps; hosting it at n1 costs c_MT1(n1, f) = 6 kbps, so n1 is the cheaper placement.]
DFuse: Cost Functions
- MT1 (Minimize Transmission Cost 1): decreases the amount of data transmission required for running a fusion function with m input data sources (fan-in) and n output data consumers (fan-out); used in the intuitive illustration slides
- MPV (Minimize Power Variance): attempts to keep the power of network nodes at similar levels
- MTP (Minimize Ratio of Transmission Cost to Power): attempts to equalize how long nodes can run a fusion function
- MT2 (Minimize Transmission Cost 2): like MT1, but allows role transfer when a node's power is below a threshold
From the SenSys'03 paper: for MT1, input data must be transmitted from the sources to the fusion point, and the output data must be propagated to the consumer nodes (possibly across hops). Here, "data source" means the next upstream fusion point or source, and "data consumer" means the next downstream fusion point or sink, each possibly reached via multiple hops through relays. The intuition behind MTP is that the cost reflects how long a node can run the fusion function. MT2 behaves like a step function of the node's power level: for a powered node the cost is the same as MT1, but if the node's power level goes below a threshold, its cost for hosting any fusion function becomes infinite, so a role transfer happens even if the transfer worsens the transmission cost.
Symbols: c(k, f) = cost for node k to perform role f; t(x) = transmission rate of data source x; hopCount(i, k) = network hops between nodes i and k; power(k) = remaining power at node k.
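Given these symbol definitions, the following are plausible reconstructions of the cost functions, consistent with the prose above but not necessarily the paper's exact equations; here s_i denotes the i-th of the m data sources, c_j the j-th of the n data consumers, and t(f) the output rate of the fusion function:

```latex
% Hedged reconstruction from the symbol definitions above, not the
% verbatim SenSys'03 equations.
\begin{align*}
c_{\mathrm{MT1}}(k, f) &= \sum_{i=1}^{m} t(s_i)\,\mathrm{hopCount}(s_i, k)
                        + \sum_{j=1}^{n} t(f)\,\mathrm{hopCount}(k, c_j) \\[4pt]
c_{\mathrm{MTP}}(k, f) &= \frac{c_{\mathrm{MT1}}(k, f)}{\mathrm{power}(k)} \\[4pt]
c_{\mathrm{MT2}}(k, f) &=
  \begin{cases}
    c_{\mathrm{MT1}}(k, f) & \text{if } \mathrm{power}(k) \geq \text{threshold},\\
    \infty & \text{otherwise.}
  \end{cases}
\end{align*}
```

MPV would similarly penalize the deviation of power(k) from the network-wide average power level.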
Prototype DFuse Implementation
Goal: investigate the utility of fusion point migration.
Hypothesis: migration will increase application lifetime at constant QoS.
Implementation:
- Fusion module: ARM Stampede port
- Simulated placement module, with an interface for coupling and transmission monitoring
- Simple camera sources, fusion functions (filter, collage), and a sink
Experimental Setup
- 12 iPAQ 3870s in a "Familiar 0.6.1" Linux-based D-Stampede 802.11b "farm"; only directly adjacent iPAQs in the figure are considered mutually reachable in one hop
- The placement module runs as a simulation of the distributed algorithm, coupled to the farm via an extended fusion module interface
- Power usage is modeled as linear in the number of application-level bytes transmitted across a farm hop (a simple model)
Further details, such as application data rates, are in the paper.
Prototype DFuse Results: Transmission Cost over Time
All of MPV, MT2, and MTP achieve greater application lifetime, due to cost-function-directed fusion point migration. MT2 rapidly converges to, and maintains, minimal transmission cost.
Prototype DFuse Results: Variance, Role Transfers, Lifetime
MPV vs. MT2: 4x less variance, more migrations, 70% of the lifetime. MTP: good variance and good lifetime.
Pervasive Computing with MediaBroker
Family Intercom and Sign Post
- Being developed at Georgia Tech's Aware Home
- Replaces a legacy hardware mixing system with a much more scalable system
- Integrates with Sign Post, an existing RFID- and vision-based tracking system, to enable rapid call dispatching and mobility-aware communication
We are currently collaborating on the development of several applications that leverage MediaBroker for data distribution, discovery, resource sharing, and stream transformation.