The Active Streams approach to adaptive distributed systems Fabián E. Bustamante, Greg Eisenhauer, Karsten Schwan, and Patrick Widener Systems Research Group College of Computing Georgia Institute of Technology
Active Streams Motivation New networking technologies + increasing number of network-capable end devices novel computing environments like Grid and Pervasive computing Building applications for these environments is complicated –Heterogeneity of nodes and network connectivity –Dynamic changes in resource availability and demand –“Right” application distribution depends on the circumstances
Active Streams Approach Customizable services and dynamically extensible applications Dynamically adaptive applications and service Structured I/O – application-level strongly typed records Component-based model and event-based integration Dynamic code generation for heterogeneity and high-performance Continuous monitoring and adaptation
Active Streams Adaptation through attachment A B C Server Client Server Application/service evolution and/or a coarse form of adaptation are obtained through the attachment/detachment of streamlets that operate on and change data steams properties.
Active Streams Adaptation through attachment CA B S Active Stream Node Server Client A simple example of adaptation through streamlet attachment is the application-specific filtering of data-stream contents in order to handle a sudden drop on bandwidth availability.
Active Streams Adaptation through parameterization B A S B Client Active Stream Node Server Parameter block update Parameter block Parameterization examples: image resolution, zoom, atmospheric level, longitude/latitude, etc. Finer grain adaptation involves tuning a streamlet’s behavior through parameters updated remotely via a push-type operation.
Active Streams Adaptation through redeployment A B SS ServerActive Stream NodeClient Streamlet migration Finer grain adaptation can also be attained by dynamically re-deployment of streamlets to approach “optimal” placement according to monitoring information.
Active Streams Streamlet example This streamlet computes and returns the average of its input array: { int i; int j; double sum = 0.0; for (i = 0; i < MAXI; i = i + 1) { for (j = 0; j < MAXJ; j = j + 1) { sum = sum + input.array[i][j]; } } output.avg_array = sum / (MAXI * MAXJ); return 1; /* submit record */ } Streamlets are the basic units of composition in Active Streams. To deal with heterogeneity, portable streamlets can be written in ECL, a subset of a general procedural language (C), and a native version of a given streamlet code can be dynamically generated at the destination. We support dynamic code generation for MIPS, Alpha, Sparcs, and x86 processors.
Active Streams Active Streams framework Streamlet repository Active Resource Monitoring Proactive Directory Active Streams Node Streamlet ECho Provides the “glue” that holds the framework together. Participating nodes makes themselves available running as Active Streams Nodes (ASN), where each ASN provides a well- defined environment for streamlet execution. The framework relies on ECho, our high-performance event infrastructure for integration and general communication. Essential for introspection is a customizable active resource monitoring system. Streamlets can be obtained from a number of locations; they can be downloaded from clients or retrieved from a streamlet repository. Streamlet
Active Streams Latency effects of specialization End-to-end latency for 20k messages with different degrees of client specialization at the source. Each subfigure corresponds to a different message size: (clock-wise) 100KB, 10KB, 1KB, and 100B. The y axis represents different specialization scenarios: (1) w/o specialization, (2) with an “identity” streamlet, and (3), (4), and (5) “filtering out” 30, 60, and 90% of the stream content.
Active Streams Shares of server/ASN utilization System/User shares of server/ASN utilization when transmitting 20k 10KB-size messages with different of specialization. The y axis represents different specialization scenarios: (1) w/o specialization, (2) with an “identity” streamlet, and (3), (4), and (5) “filtering out” 30, 60, and 90% of the stream content. Effect of dropping messages: avoiding the TCP/IP stack, etc. Load increase from executing streamlet.
Active Streams Server/ASN load with specialization Percentage of server/ASN utilization when transmitting 20k messages with different degrees of specialization. Each subfigure corresponds to a different message size: (clock-wise) 100KB, 10KB, 1KB, and 100B. The y axis represents different specialization scenarios: (1) w/o specialization, (2) with an “identity” streamlet, and (3), (4), and (5) “filtering out” 30, 60, and 90% of the stream content.
Active Streams Intermediate nodes and re-deployment Adaptation at the end- points of a connection is not enough. Percentage of Server CPU utilization contrasted with end-to-end latency when transmitting 20k 1KB messages to 30 clients, with a varying number of them specializing the stream at the source. signals the point from where there are no additional benefits of placing streamlets at this node. indicates the point of diminishing returns.
Active Streams Status and ongoing work Design and prototype implementation finished First public release in December 2001 How to best decompose application and services with Active Streams? What are useful re-deployment algorithms and heuristics? What OS-type services should Active Streams Nodes provide?