A Stratified Approach for Supporting High Throughput Event Processing Applications. July 2009. Geetika T. Lakshmanan, Yuri G. Rabinovich, Opher Etzion. IBM T. J. Watson Research Center and IBM Haifa Research Lab.

Presentation transcript:

A Stratified Approach for Supporting High Throughput Event Processing Applications. July 2009. Geetika T. Lakshmanan (IBM T. J. Watson Research Center), Yuri G. Rabinovich, Opher Etzion (IBM Haifa Research Lab).

2 Outline
 Motivation and Goal
 Definitions
 Related Work
 Overview of our solution
– Credit-Card Scenario
– Profiling and initial assignment of nodes to strata
– Stratification
– Load Distribution Algorithm
 Algorithm optimizations and support for dynamic changes in the event processing graph
 Implementation and Results
 Conclusion

3 Our Goal
Devise a generic framework to maximize the overall input (and thus output) throughput of an event processing application represented as an EPN, given a specific set of resources (a cluster of nodes with varying computational power) and a traffic model. The framework should adapt to changes in either the configuration or the traffic model.
[Diagram: an EPN connecting event producers to event consumers through EPA engines and a repository]

4 Why is this an important problem?
 The quantity of events that a single application needs to process is constantly increasing (e.g., RFID events, massive online multiplayer games).
 Manual partitioning is difficult (due to semantic dependencies between event processing agents), particularly when it must be adaptive and dynamic.

5 Event Processing Agent
 An event processing agent has input and output event channels.
 In general it receives a collection of events as input, derives one or more events as output, and emits them on one or more output channels.
 The input channels are partitioned according to a context, which partitions the space of events into semantically relevant partitions.
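Context-based partitioning of an input channel can be sketched as follows; this is a hypothetical illustration, not the paper's implementation, and the `card_id` attribute and hash-based routing are assumptions:

```python
# Sketch: route events to partitions by a context attribute so that
# semantically related events (here, all events for one credit card)
# always land in the same partition.

def partition_by_context(events, context_key, num_partitions):
    """Assign each event to a partition derived from its context attribute."""
    partitions = [[] for _ in range(num_partitions)]
    for event in events:
        idx = hash(event[context_key]) % num_partitions
        partitions[idx].append(event)
    return partitions

events = [
    {"card_id": "A", "amount": 10},
    {"card_id": "B", "amount": 99},
    {"card_id": "A", "amount": 25},
]
parts = partition_by_context(events, "card_id", 4)
# Events sharing a card_id end up in the same partition.
```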

6 Related Work
 Scalability in event processing
– Large-scale event processing applications (e.g., Astrolabe, PIER, Siena)
– Kulkarni et al., Wu et al.
– More work needs to be done.
 Numerous centralized implementations arise due to interdependencies among event processing agents.
 Synergy between stream processing and event processing.
– Load distribution techniques proposed for streams: Shah et al., Mehta et al., Gu et al., Xing et al., Zhou et al.

7 Is this a solved problem?
[Diagram: a chart contrasting event-at-a-time vs. set-at-a-time and centralized vs. scalable implementations: scalable event processing systems (Astrolabe, PIER, Siena), centralized event processing implementations, centralized stream processing implementations, and load distribution algorithms for scalable stream processing (Shah et al., Mehta et al., Gu et al., Xing et al., Zhou et al., Liu et al.)]

8 Overview of Our Solution
1. Profiling
– Used to assign agents to nodes in order to maximize throughput
2. Stratification of the EPN
– Split the EPN into strata layers
– Based on semantic dependencies between agents
– Distributed implementation with an event proxy to relay events between strata
3. Load Distribution
– Distribute load among agents dynamically at runtime, respecting statistical load relationships between nodes

9 Distributed Event Processing Network Architecture
 Input: specification of an event processing application
 Output: stratified EPN (event processing operations mapped to event processing agents)
 The event proxy receives input events and routes them to nodes in a stratum according to the event context.
 The event proxy periodically collects performance statistics per node in a stratum.

10 Stratified Event Processing Graph
1. Define the event processing application in the form of an Event Processing Network dependency graph G=(V,E), with directed edges from event source to event target.
2. Overview of the Stratification Algorithm
 Create partitions by finding subgraphs that are independent in the dependency graph.
 For each subgraph, construct a network of EPAs.
 Push filters to the beginning of the network to filter out irrelevant events.
 Iterate through the graph and identify areas of strict interdependence (i.e., subgraphs with no edges connecting them to other subgraphs).
 For each subgraph, define stratum levels.
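One plausible way to realize the final step, assigning stratum levels, is longest-path layering of the dependency graph: each agent's stratum is one more than that of its deepest predecessor. This is a sketch under that assumption, not the paper's exact algorithm, and the agent names are illustrative:

```python
# Sketch: layer a dependency DAG into strata by longest dependency chain.
from collections import defaultdict

def stratify(edges, agents):
    """Return {agent: stratum_level}; edges point from source to target."""
    preds = defaultdict(list)
    for src, dst in edges:
        preds[dst].append(src)

    level = {}
    def depth(agent):
        if agent not in level:
            # A source agent gets level 1; others sit one past their
            # deepest predecessor.
            level[agent] = 1 + max((depth(p) for p in preds[agent]), default=0)
        return level[agent]

    for a in agents:
        depth(a)
    return level

agents = ["filter", "enrich", "detect", "alert"]
edges = [("filter", "enrich"), ("filter", "detect"),
         ("enrich", "alert"), ("detect", "alert")]
strata = stratify(edges, agents)
# strata == {"filter": 1, "enrich": 2, "detect": 2, "alert": 3}
```

Agents with equal levels (here `enrich` and `detect`) share a stratum and can run in parallel behind the event proxy.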

11 Credit Card Scenario
[Diagram: the stratification algorithm transforms the event processing dependency graph into a stratified event processing graph]

12 Initial Placement of Agents
 The goal is to maximize throughput.
 Assume agents in a single stratum are replicated on all nodes in that stratum.
 Overall strategy:
1. Profiling. Determine the maximum event processing capability of the available nodes.
– r_i: maximum possible event processing rate (events/sec)
– d_i: maximum possible derived event production rate (events/sec)
2. Assigning nodes to each stratum. Executing at a user-set percentage of their capacity, these nodes can process all of the incoming events at their stratum level in parallel under peak event traffic conditions.
– Compute the ratio by which events are split between nodes.
– Iterative calculation starting with the first stratum.

13 Assigning Nodes to Each Stratum
Assigning nodes to each stratum: executing at a user-set percentage t_i of their capacity, these nodes can process all of the incoming events at their stratum level in parallel under peak event traffic conditions.
– Compute the ratio by which events are split between nodes.
– Iterative calculation starting with the first stratum, since each stratum's input rate is the previous stratum's derived event rate.
Formulas:
– Percentage of the event stream directed to node n_i: ((t_i * r_i) / R) * 100, where R is the incoming event rate to the stratum.
– Derived event production rate of the nodes in stratum n (the input rate to stratum n+1): R * (d_i / r_i).
Example: incoming event rate R = 200,000 events/sec; t_i = 0.95; processing capacity of a node r_i = 36,000 events/sec.
– Percentage of the event stream directed to each node: ((0.95 * 36,000) / 200,000) * 100 = 17.1%. Thus, 6 nodes will be needed in this stratum.
– If d_i / r_i = 0.5, the derived event production rate is 200,000 * 0.5 = 100,000 events/sec.
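The slide's arithmetic can be sketched as follows, assuming homogeneous nodes within the stratum; the function names are illustrative:

```python
# Sketch of the node-assignment arithmetic: a node of capacity
# node_capacity events/sec, run at a fraction `utilization` of capacity,
# absorbs (utilization * node_capacity / incoming_rate) of the stream;
# the derived-event rate feeding the next stratum is
# incoming_rate * (d_i / r_i).
import math

def nodes_needed(incoming_rate, node_capacity, utilization):
    share = utilization * node_capacity / incoming_rate  # fraction per node
    return math.ceil(1.0 / share), share * 100           # node count, percent

def derived_rate(incoming_rate, d_over_r):
    return incoming_rate * d_over_r

count, pct = nodes_needed(200_000, 36_000, 0.95)
# pct is 17.1 percent per node, so count is 6, matching the slide's example
next_rate = derived_rate(200_000, 0.5)  # 100,000 events/sec into stratum n+1
```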

14 Dynamic Load Distribution Strategy
 Desirable qualities include:
– Dynamic
– Observes semantics of agent dependencies
– Observes link latency
– Can perform task splitting
– Observes load, average load, and load variance
– Observes state

15 Overview of Dynamic Load Distribution Algorithm
 The event proxy collects statistics, maintains a time series, and makes the following decisions:
1. Identify the most heavily loaded node in a stratum (the donor node).
2. Identify a heavy context to migrate from the donor node.
3. Identify a recipient node for the migrated load.
4. Estimate the post-migration utilization of the donor and recipient nodes. If the post-migration utilization of the recipient node is unsatisfactory, go back to step 3 and identify a new recipient node. If the post-migration utilization of the donor node is unsatisfactory, go back to step 2 and identify a new context to migrate.
5. Execute the migration and wait for a time interval of length x. Go to step 1.
[Diagram: an event proxy (EPProxy) routing events to AMiT engine/queue pairs across stratum n and stratum n+1]
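Steps 1-4 above can be sketched as a single decision round; the load metric, data shapes, and names here are assumptions, not the paper's exact definitions:

```python
# Sketch: one round of donor/context/recipient selection. A migration is
# admissible only if both nodes' post-migration utilizations stay under
# the threshold; otherwise we back off to the next candidate, mirroring
# the retry loop in steps 2-4.

def rebalance_once(loads, contexts, threshold):
    """loads: {node: utilization}; contexts: {node: {context: load}}."""
    donor = max(loads, key=loads.get)                       # step 1
    for ctx, ctx_load in sorted(contexts[donor].items(),
                                key=lambda kv: -kv[1]):     # step 2: heaviest first
        if loads[donor] - ctx_load > threshold:
            continue  # donor would remain overloaded; try another context
        for recipient in sorted(loads, key=loads.get):      # step 3: least loaded first
            if recipient == donor:
                continue
            if loads[recipient] + ctx_load <= threshold:    # step 4
                return donor, ctx, recipient
    return None  # no admissible migration this round

decision = rebalance_once(
    loads={"n1": 0.9, "n2": 0.4, "n3": 0.5},
    contexts={"n1": {"c1": 0.5, "c2": 0.2}, "n2": {}, "n3": {}},
    threshold=0.8,
)
# c1 is too heavy for any recipient, so the lighter context c2 moves to n2.
```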

16 Overview of Dynamic Load Distribution Algorithm
 Statistics collected by the event proxy:
– Number of input events processed by the execution of agents in a particular context
– Number of derived events produced by the execution of agents in this context
– Number of different agent executions evaluated in this context
– Total latency to evaluate all agents executed in this context
 For these statistics, the event proxy maintains a time series and computes statistics such as the mean, standard deviation, covariance, and correlation coefficient.
 These statistics dictate the choice of donor and recipient nodes.
 The definition of load is purposely generic, to accommodate different application priorities.
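The summary statistics the proxy derives from its time series can be sketched with the standard library; the series values and metric names are illustrative:

```python
# Sketch: per-context time-series statistics of the kind the proxy
# maintains (mean, standard deviation, correlation between input and
# derived-event rates).
from statistics import mean, stdev

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    varx = sum((x - mx) ** 2 for x in xs)
    vary = sum((y - my) ** 2 for y in ys)
    return cov / (varx * vary) ** 0.5

input_rate  = [100, 120, 110, 130, 125]  # events processed per interval
derived_out = [48, 61, 57, 66, 60]       # derived events per interval

m, s = mean(input_rate), stdev(input_rate)
r = pearson(input_rate, derived_out)  # how tightly output tracks input
```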

17 Post-Migration Utilization Calculation
 We need to determine whether a migration will lead to overload; if it triggers further migrations, the system will become unstable. Therefore we compute the post-migration utilization of the donor and recipient machines.
 Let u(t) be the utilization contributed by task t, and let n_d and n_r be the total number of tasks on the donor and recipient respectively. After migrating a task t1, the post-migration utilization U_d of the donor machine and U_r of the recipient machine are:
U_d = sum_{i=1..n_d} u(t_i) - u(t1)
U_r = sum_{i=1..n_r} u(t_i) + u(t1)
 The post-migration utilization of the donor must be less than a preset quality threshold.
 The post-migration utilization of the recipient must be less than a preset quality threshold.
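A minimal sketch of the admission check, assuming a node's utilization is the sum of its per-task utilizations; the task names and values are illustrative:

```python
# Sketch: compute post-migration utilizations of donor and recipient and
# test both against the quality threshold before committing a migration.

def post_migration(donor_tasks, recipient_tasks, migrated, threshold):
    """donor_tasks / recipient_tasks: {task: utilization}."""
    u_d = sum(donor_tasks.values()) - donor_tasks[migrated]
    u_r = sum(recipient_tasks.values()) + donor_tasks[migrated]
    return u_d, u_r, (u_d < threshold and u_r < threshold)

u_d, u_r, ok = post_migration(
    donor_tasks={"t1": 0.3, "t2": 0.4},
    recipient_tasks={"t3": 0.2},
    migrated="t1",
    threshold=0.8,
)
# u_d is about 0.4, u_r about 0.5, so the migration is admissible.
```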

18 Support for Dynamic Changes in the EP Graph
 Our algorithm supports:
– Addition of a new connected subgraph to the existing EPN
– Addition of an agent to a graph in the EPN
– Deletion of agents from the graph
– Failure of one or more nodes at a stratum level
 The algorithm is also amenable to agent-level optimizations (e.g., coalescing of neighboring agents).

19 Implementation
 Used nodes running IBM Active Middleware Technology (AMiT), a CEP engine that serves as a container for event processing agents.
 Event processing scenario: the credit card scenario.
 Node hardware characteristics:
– Type 1: Dual-Core AMD Opteron and 1 GB memory
– Type 2: Intel Pentium D, 3.6 GHz, and 2 GB memory
– Type 3: Intel Xeon, 2.6 GHz, and 2 GB memory

20 Goal of Implementation
 Explore the benefits of event processing on a stratified vs. a centralized vs. a partitioned network (a single stratum in which load is distributed according to context).
 Explore the benefit of the stratified approach under heavy load (when the number of incoming events that trigger the generation of derived events increases).
 Explore the effectiveness and scalability of the load distribution algorithm.

21 Results
Input event processing rate for stratified versus partitioned event processing networks.

22 Results
Derived event production rate for stratified versus partitioned event processing networks.

23 Results
Percentage improvement in performance of the stratified network relative to a partitioned network.

24 Results
Average input event processing rate per node in a stratified network with different configurations.

25 Results
Throughput results for the load distribution algorithm. Scalability of the load distribution algorithm.

26 Conclusion and Future Work
 Demonstrated stratification and load distribution for scalable event processing.
 Future work: investigate high availability.
 Future work: investigate other objectives in addition to scalability.
 Future work: execution of multiple strata within a single cluster of nodes.
 Future work: techniques for effective load migration between nodes.