Transparent Adaptive Resource Management for Middleware Systems

Slides:



Advertisements
Similar presentations
Performance Testing - Kanwalpreet Singh.
Advertisements

Scheduling in Web Server Clusters CS 260 LECTURE 3 From: IBM Technical Report.
Ch 11 Distributed Scheduling –Resource management component of a system which moves jobs around the processors to balance load and maximize overall performance.
Distributed Systems Major Design Issues Presented by: Christopher Hector CS8320 – Advanced Operating Systems Spring 2007 – Section 2.6 Presentation Dr.
Database Architectures and the Web
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 9 Distributed Systems Architectures Slide 1 1 Chapter 9 Distributed Systems Architectures.
Objektorienteret Middleware Presentation 2: Distributed Systems – A brush up, and relations to Middleware, Heterogeneity & Transparency.
Network Management Overview IACT 918 July 2004 Gene Awyzio SITACS University of Wollongong.
Distributed Systems Architectures
Revision Week 13 – Lecture 2. The exam 5 questions Multiple parts Read the question carefully Look at the marks as an indication of how much thought and.
Software Engineering and Middleware: a Roadmap by Wolfgang Emmerich Ebru Dincel Sahitya Gupta.
Cs238 CPU Scheduling Dr. Alan R. Davis. CPU Scheduling The objective of multiprogramming is to have some process running at all times, to maximize CPU.
Investigating Lightweight Fault Tolerance Strategies for Enterprise Distributed Real-time Embedded Systems Tech-X Corporation Boulder, Colorado Vanderbilt.
1 Introduction to Load Balancing: l Definition of Distributed systems. Collection of independent loosely coupled computing resources. l Load Balancing.
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 12 Slide 1 Distributed Systems Design 1.
23 September 2004 Evaluating Adaptive Middleware Load Balancing Strategies for Middleware Systems Department of Electrical Engineering & Computer Science.
H-1 Network Management Network management is the process of controlling a complex data network to maximize its efficiency and productivity The overall.
©Ian Sommerville 2006Software Engineering, 8th edition. Chapter 12 Slide 1 Distributed Systems Architectures.
CGW 2003 Institute of Computer Science AGH Proposal of Adaptation of Legacy C/C++ Software to Grid Services Bartosz Baliś, Marian Bubak, Michał Węgiel,
26 Sep 2003 Transparent Adaptive Resource Management for Distributed Systems Department of Electrical Engineering and Computer Science Vanderbilt University,
18 June 2001 Optimizing Distributed System Performance via Adaptive Middleware Load Balancing Ossama Othman Douglas C. Schmidt
Application-Layer Anycasting By Samarat Bhattacharjee et al. Presented by Matt Miller September 30, 2002.
Wireless Access and Terminal Mobility in CORBA Dimple Kaul, Arundhati Kogekar, Stoyan Paunov.
1 Performance Evaluation of Computer Systems and Networks Introduction, Outlines, Class Policy Instructor: A. Ghasemi Many thanks to Dr. Behzad Akbari.
Unit – I CLIENT / SERVER ARCHITECTURE. Unit Structure  Evolution of Client/Server Architecture  Client/Server Model  Characteristics of Client/Server.
Chapter 3 System Performance and Models. 2 Systems and Models The concept of modeling in the study of the dynamic behavior of simple system is be able.
Investigating Survivability Strategies for Ultra-Large Scale (ULS) Systems Vanderbilt University Nashville, Tennessee Institute for Software Integrated.
Thomas Dreibholz Institute for Experimental Mathematics University of Duisburg-Essen, Germany University of Duisburg-Essen, Institute.
OPERATING SYSTEMS CS 3530 Summer 2014 Systems with Multi-programming Chapter 4.
Presented By:- Sudipta Dhara Roll Table of Content Table of Content 1.Introduction 2.How it evolved 3.Need of Middleware 4.Middleware Basic 5.Categories.
K-Anycast Routing Schemes for Mobile Ad Hoc Networks 指導老師 : 黃鈴玲 教授 學生 : 李京釜.
Chapter 3 System Performance and Models Introduction A system is the part of the real world under study. Composed of a set of entities interacting.
OPERATING SYSTEMS CS 3530 Summer 2014 Systems and Models Chapter 03.
Distributed System Architectures Yonsei University 2 nd Semester, 2014 Woo-Cheol Kim.
CSC 480 Software Engineering Lecture 17 Nov 4, 2002.
FLARe: a Fault-tolerant Lightweight Adaptive Real-time Middleware for Distributed Real-time and Embedded Systems Dr. Aniruddha S. Gokhale
Chapter 4 CPU Scheduling. 2 Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Real-Time Scheduling Algorithm Evaluation.
Spark on Entropy : A Reliable & Efficient Scheduler for Low-latency Parallel Jobs in Heterogeneous Cloud Huankai Chen PhD Student at University of Kent.
Distributed Systems Architectures Chapter 12. Objectives  To explain the advantages and disadvantages of different distributed systems architectures.
Distributed Systems Architectures. Topics covered l Client-server architectures l Distributed object architectures l Inter-organisational computing.
Optimizing Distributed Actor Systems for Dynamic Interactive Services
Chapter 6: Securing the Cloud
OPERATING SYSTEMS CS 3502 Fall 2017
OPERATING SYSTEMS CS 3502 Fall 2017
SEDA: An Architecture for Scalable, Well-Conditioned Internet Services
Threads vs. Events SEDA – An Event Model 5204 – Operating Systems.
Introduction to Load Balancing:
Action Breakout Session
Similarities between Grid-enabled Medical and Engineering Applications
CSC 480 Software Engineering
Standards and Patterns for Dynamic Resource Management
Real-time Software Design
Chapter 6: CPU Scheduling
Chapter 6: CPU Scheduling
CPU Scheduling Basic Concepts Scheduling Criteria
CPU Scheduling G.Anuradha
Module 5: CPU Scheduling
The Globus Toolkit™: Information Services
3: CPU Scheduling Basic Concepts Scheduling Criteria
Chapter 6: CPU Scheduling
CPU SCHEDULING.
Load Balancing in Distributed Systems
Lecture 2 Part 3 CPU Scheduling
Transparent Adaptive Resource Management for Middleware Systems
Chapter 6: CPU Scheduling
Quality-aware Middleware
Module 5: CPU Scheduling
Chapter 6: CPU Scheduling
Module 5: CPU Scheduling
Presentation transcript:

Transparent Adaptive Resource Management for Middleware Systems Jaiganesh Balasubramanian Ossama Othman Dr. Douglas C. Schmidt {jai, ossama, schmidt}@dre.vanderbilt.edu Department of Electrical Engineering and Computer Science Vanderbilt University, Nashville

Distributed Systems Typical issues with distributed systems Heterogeneous environments Concurrency Large bursts of client requests 24/7 availability Stringent QoS requirements Examples of distributed systems E-commerce Online trading systems Mission critical systems Vanderbilt University

Motivation Development and maintenance of QoS-enabled distributed systems Non-trivial Requires expertise that application developers often lack Solution: Middleware (e.g. CORBA) Can shield distributed system developers from the complexities involved with developing distributed applications Can facilitate manipulation of QoS requirements and management of resources Vanderbilt University

Load Balancing Load balancing can improve the efficiency and scalability of a distributed system Load balancing service can be implemented in the following layers Network layer OS layer Middleware layer Why choose middleware-layer load balancing? Can take into account distributed system state Can take into account the system run-time behavior Application level control over load balancing policies Can take into account request content Vanderbilt University

Common Scenario Multiple clients making request invocations Potentially non-deterministic Duration called a session Members Multiple instances of the same object implementation Object groups Collections of members among which loads will be distributed equitably Logically a single object Load balancer Transparently distributes requests to members within an object group Vanderbilt University

TAO Load Balancer (Cygnus) Cygnus makes it easier to develop distributed applications by providing application transparency, scalability, flexibility, adaptability and interoperability. Has the following built-in strategies: RoundRobin Random LeastLoaded LoadMinimum EVENT SERVICE TRANSACTIONS SECURITY AV STREAMS DYNAMIC/STATIC SCHEDULING LOAD BALANCING RT-CORBA 1.0 Vanderbilt University

Cygnus Load Balancing Strategies Client binding granularity Per-session, client permanently forwarded to a replica Per-request, requests forwarded on client’s behalf On-demand, client can be rebound to another replica when necessary Vanderbilt University

Cygnus Load Balancing Architectures Load balancing architecture comprised of a combination of client binding granularity and balancing policy Given the strategies just described, there are six possible architectures Three common architectures Non-adaptive per-session Adaptive per-request Adaptive on-demand Vanderbilt University

Cygnus Load Balancer Components Load Monitor Provides load feedback Load Analyzer Determines location and member load conditions Member Locator Binds client to appropriate object group member Load Alert Facilitates load control (load shedding) Load Manager Mediates all interactions between load balancer components Vanderbilt University

Adaptive Load Balancing Interactions Vanderbilt University

Load Balancing Strategies (1/2) Adaptive Load Balancing Strategies Use run-time system information, example host load information to make load balancing decisions. Adaptive strategies integrated in Cygnus are LeastLoaded, which allows locations to receive loads until a threshold value is reached. Once a threshold is reached , subsequent requests are transferred to the location with the lowest load. LoadMinimum, which calculates the average load at all locations and if the load at a particular location is greater than the average load and greater than the least loaded location by a certain threshold, then the load at the particular location is transferred to the least loaded location. Vanderbilt University

Load Balancing Strategies (2/2) Non-adaptive load balancing strategies Makes load balancing decisions based on some static information. Do not use run-time system state information while making load balancing decisions. Non-adaptive load balancing strategies integrated into Cygnus are RoundRobin and Random. RoundRobin strategy keeps a list of locations and iterates through that list to select the next location that will receive requests from the client session. Random strategy keeps a list of locations and selects a random member to receive the requests from the next client session. Vanderbilt University

Load Metrics Cygnus supports the following load metrics Request-per-second, which calculates the average number of requests arriving per second at each server. CPU run queue length, which returns the load as the average number of processes waiting in the OS run time queue over a configurable time period in seconds, normalized over the number of processors. CPU utilization, which returns the load as the CPU usage percentage. Vanderbilt University

Evaluating load balancing strategies Load balancing strategies can be of the type adaptive or non-adaptive Load balancing strategies can be employed various run time parameters and load metrics. Determining the appropriate load balancing strategies for different classes of distributed applications is hard without the guidance of comprehensive performance evaluation models, systematic benchmarks and empirical results. Vanderbilt University

Introducing LBPerf LBPerf is an open source benchmarking tool suite for evaluating middleware load balancing strategies. Extensible C++ middleware framework. Supports range of adaptive and non-adaptive load balancing strategies. Tune different configurations of middleware load balancing , including choosing different load balancing strategies and run-time parameters associated with those strategies. Evaluate strategies using different metrics like throughput, latency, CPU utilization. Vanderbilt University

Workload Characterization (1/2) Empirical evaluations not useful, unless the workload is representative of the context in which the load balancer is deployed. Workload model could be of the type: Closed analytical network models Simulation models Executable models Our benchmarking experiments are based on the executable workload models. Vanderbilt University

Workload Characterization (2/2) Executable models can be classified as : Resource type, where workloads are characterized by the type of resource being consumed. Service type, where the workloads can be characterized by the type of service being performed on behalf of the clients. Session type, where the workloads are characterized by the type of requests initiated by a client and serviced by a server in the context of a session. Our benchmarking experiments focused on generating session type workloads. Vanderbilt University

Experiment Single threaded clients generating CPU intensive requests on the servers Experiments repeated for different strategies including RoundRobin, Random, LoadMinimum and LeastLoaded. Measure throughput, which is the number of requests processed per second by the server and the CPU utilization, which is the CPU usage percentage. Vanderbilt University

Run-time configurations Load reporting interval, which is the time interval between successive load information updates from the server to the load balancer. Reject threshold value of the LeastLoaded strategy, which is the load at which the servers will stop receiving extra loads. Critical threshold value of the LeastLoaded strategy, which is the load at which the servers will start shedding loads. Migration threshold value of the LoadMinimum strategy, which is the difference between the most loaded machine and the least loaded machine to trigger a migration of load from the most loaded machine to the least loaded machine. Dampening value, which is the fraction of the newly reported load that will be considered for fresh load balancing decisions. Vanderbilt University

Vanderbilt University

Vanderbilt University

Vanderbilt University

Vanderbilt University

Vanderbilt University

Results Analysis Our results indicate that: Adaptive load balancing strategies perform better than non-adaptive load balancing strategies in the presence of non-uniform loads. LeastLoaded strategy is better than the other three strategies under such loading conditions Reducing the number of client migrations is necessary for achieving maximum performance. In other words, client session migrations are not always effective. Need to maintain system utilization to enable more predictive behavior of the strategies. Vanderbilt University

Future Work Extend LBPerf to evaluate other types of workloads. Use the empirical results as a learning process to understand the nuances of the different run-time behaviors of these adaptive load balancing strategies. Use observations from the learning process to train an operational phase by developing self-adaptive load balancing strategies that can dynamically tune the run-time parameters according to the load being experienced. Define stateful load balancing model Decentralized load balancing Reduced network overhead Improved reliability Integrating load balancer service into the CCM model. Vanderbilt University

Concluding Remarks Cygnus load balancing architecture and model is Flexible Extensible load balancing strategies Allows both non-adaptive and adaptive (including self-adaptive) strategies to be used Freedom to implement in a variety of ways Centralized Decentralized / Federated / Cooperative Hierarchical Generic Supports multiple object groups Not application-specific Familiar Uses many of the same group management concepts found in existing fault tolerance models Vanderbilt University

References Ossama Othman, Jaiganesh Balasubramanian, Douglas C. Schmidt “ The Design of an Adaptive Load Balancing and Monitoring Service” http://www.dre.vanderbilt.edu/~jai/IWSAS_2003.pdf Jaiganesh Balasubramanian, Ossama Othman, Douglas C. Schmidt “Evaluating the performance of middleware load balancing strategies” http://www.dre.vanderbilt.edu/~jai/EDOC_2004.pdf Download Cygnus, LBPerf and TAO from http://deuce.doc.wustl.edu/Download.html Vanderbilt University