ANALYSIS OF TASK ASSIGNMENT POLICIES IN SCALABLE WEB SERVERS SYSTEMS

Presentation transcript:

ANALYSIS OF TASK ASSIGNMENT POLICIES in SCALABLE WEB SERVERS SYSTEMS 3-Dec-18 Ravindra Sudhindra

To be Discussed ……
- Introduction
- Model Description
- Algorithms
- Parameters
- Evaluation methods
- Results
- Conclusion

Introduction ….
Why Distributed Web Server
- More scalable and fault tolerant than centralized servers
- Exponential growth of usage
- Overloaded servers
- Increased usage of bandwidth

Introduction …. Our Model

Introduction ….
- What is a DNS? Domain Name Server
- How do Distributed Web Servers work?

Model Description ….
- LG – Local Gateway, used by clients to connect to the Internet
- LNS – Local Name Servers
- INS – Intermediate Name Servers
- TTL – Time To Live, the duration for which a name-to-address mapping is cached at LNSs and INSs
Obstacles for the DNS -
- Address caching limits the control of the DNS.
- During the TTL period, subsequent requests from an LG arrive at the same WS (Web server).
- If TTL ~ 0, the DNS will be a bottleneck.
- Zipf function – the client distribution consists of a small number of large values.
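The caching obstacle above can be illustrated with a minimal sketch. The class names `RoundRobinDNS` and `CachingNameServer` are hypothetical and not taken from the slides; the sketch only assumes a plain round-robin DNS and a single name server that caches one mapping for `ttl` seconds.

```python
class RoundRobinDNS:
    """Hypothetical DNS that hands out Web servers in round-robin order."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.next_index = 0
        self.resolutions = 0  # how many requests actually reached the DNS

    def resolve(self):
        server = self.servers[self.next_index]
        self.next_index = (self.next_index + 1) % len(self.servers)
        self.resolutions += 1
        return server


class CachingNameServer:
    """Hypothetical LNS/INS that caches one name-to-address mapping for ttl seconds."""

    def __init__(self, dns, ttl):
        self.dns = dns
        self.ttl = ttl
        self.cached_server = None
        self.expires_at = None

    def lookup(self, now):
        # Ask the DNS only when nothing is cached or the cached mapping expired.
        if self.cached_server is None or now >= self.expires_at:
            self.cached_server = self.dns.resolve()
            self.expires_at = now + self.ttl
        return self.cached_server


dns = RoundRobinDNS(["WS1", "WS2", "WS3"])
lns = CachingNameServer(dns, ttl=240)
answers = [lns.lookup(t) for t in (0, 60, 120, 239)]
```

With a TTL of 240 seconds, all four lookups are answered from the cache and land on the same WS, so the DNS sees only one of them. With `ttl=0` every lookup reaches the DNS, which is the bottleneck case mentioned on the slide.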

Algorithms ….
Scheduling Policies
(i) Information on LG
(ii) Information on WS Load
(iii) Combination of (i) & (ii)
Redirection Algorithms

Algorithms ….
Hidden Load Weight
- N_j – number of name-to-address mapping requests the DNS received from LG_j
- R_i,j – number of Web requests that WS_i received from LG_j
- Hidden load weight: η_j = (Σ_i R_i,j) / N_j
- It is called the hidden load weight because this load is not visible to the DNS.
1. Using Domain info …
- Two-tier Round Robin (RR2): Class Threshold Tc = 1 / #LG
- Dynamically Accumulated Load (DAL): Bin Value - weighted load given to each WS
2. Using Load info on Web Servers …
- Asynchronous Alarms
- Present load information
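The hidden load weight can be computed directly from the two counters defined above. The following is a minimal sketch: the function names are hypothetical, and the way the class threshold Tc = 1 / #LG is applied (a domain is treated as "hot" when its share of the total hidden load exceeds Tc) is an assumption, since the slide gives only the threshold value.

```python
def hidden_load_weights(requests, mappings):
    """Return eta_j = (sum over i of R_ij) / N_j for every domain j.

    requests[i][j] is R_ij, the Web requests WS_i received from LG_j.
    mappings[j] is N_j, the address-mapping requests received from LG_j.
    """
    weights = []
    for j, n_j in enumerate(mappings):
        total_requests = sum(row[j] for row in requests)
        weights.append(total_requests / n_j if n_j else 0.0)
    return weights


def classify_domains(weights):
    """Split domains into two classes using Tc = 1 / #LG (assumed usage)."""
    total = sum(weights)
    threshold = 1 / len(weights)
    if total == 0:
        return ["normal"] * len(weights)
    return ["hot" if w / total > threshold else "normal" for w in weights]


# Two Web servers (rows) and three local gateways (columns).
requests = [[120, 10, 5],
            [80, 20, 5]]
mappings = [4, 3, 2]
weights = hidden_load_weights(requests, mappings)
classes = classify_domains(weights)
```

Here the first gateway generates 200 Web requests from only 4 address resolutions, so each mapping handed to it carries a much larger hidden load than a mapping handed to the other two.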

Algorithms ….
- Asynchronous Alarms - Not sufficient to define a scheduling policy.
- Present Load Information - Least Utilized Node (LUN) - Considers only the latest load information in deciding the assignment (utilization is measured over a 15-second interval).
- Present and Past Load Information - Lowest Utilization (LU) - Same as LUN, but also uses the last hour as the interval for the utilization estimate.
- Lowest among Past and Present Utilizations (LPPU) - The DNS selects the WS with the lowest weighted ppu. [Formula not preserved in the transcript; its terms are the utilization samples of the WS evaluated in the past five intervals of 15 seconds each, and a factor G.]

Algorithms ….
Sequence of Previous Assignments
- LPPU Modified and LUN Modified - The policy looks for the first and second minimum among the WS utilization samples. If the first minimum belongs to the WS chosen in the previous assignment, the other WS is chosen.
Algorithms combining Domain and Server Information
- Single Threshold (Thr1)
- Double Threshold
- Temporal Threshold
- DAL set bin to top (DAL-ST)
- DAL set to actual number of requests (DAL-AN)
- Minimum Residual Load (MRL)
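The "Modified" rule described above can be sketched in a few lines. This is an illustration only: `select_server_modified` is a hypothetical name, and the input is assumed to be one utilization value per WS (the latest sample for LUN Modified, or a weighted value for LPPU Modified).

```python
def select_server_modified(utilization, previous):
    """Pick the least utilized WS, unless it was also the previous choice.

    utilization maps a server name to its utilization value.
    previous is the server chosen by the previous assignment, or None.
    """
    # Rank servers from the lowest to the highest utilization.
    ranked = sorted(utilization, key=utilization.get)
    if len(ranked) > 1 and ranked[0] == previous:
        return ranked[1]  # second minimum
    return ranked[0]      # first minimum


utilization = {"WS1": 0.4, "WS2": 0.7, "WS3": 0.5}
first_choice = select_server_modified(utilization, previous=None)
second_choice = select_server_modified(utilization, previous=first_choice)
```

Because the DNS sees stale load samples between updates, skipping the server that was just assigned avoids sending two consecutive mappings to the same WS.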

Redirection Algorithms for Load Sharing
An alternative architecture that integrates the DNS dispatching mechanism with a redirection technique carried out by the WS.
- Centralized Synchronous Redirection
- Distributed Asynchronous Redirection
Centralized Synchronous Redirection
- The decision is centralized at the DNS.
- Every t seconds each Web server sends some status information to the DNS.
- The DNS builds the Assignment Table.
- The DNS serves the address resolution requests by using this table.

Algorithms ….
The redirected entities can be –
- Entire Domains (DR)
- Individual Clients (CR)
- Both (CDR)
Since HTTP redirection works on an individual basis, domain redirection means that all the clients of the same domain are subject to the same redirection decision.
Domain Redirection Scheme
- The DNS estimates the domain hit rate for each connected domain and, on this basis, orders the domains from the most to the least popular.
- The DNS orders the servers from the least to the most loaded.
- The DNS builds the assignment table from the domain hit rate information.
- It determines the potential load assigned to each server through a bin, which contains the sum of the hit rates of the domains assigned to that server by the assignment table.
- The server bin is updated after each assignment.
- The assignment is done through a greedy approach.
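One way to read the domain redirection scheme is as a greedy bin-filling loop. The sketch below is an interpretation, not the authors' code: `build_assignment_table` is a hypothetical name, bins are assumed to start empty, and ties between equal bins are assumed to go to the less loaded server.

```python
def build_assignment_table(hit_rates, server_loads):
    """Greedily assign domains to servers.

    hit_rates maps a domain to its estimated hit rate.
    server_loads maps a server to its current load.
    Returns (table, bins): domain -> server, and server -> summed hit rate.
    """
    # Domains from the most to the least popular.
    domains = sorted(hit_rates, key=hit_rates.get, reverse=True)
    # Servers from the least to the most loaded.
    servers = sorted(server_loads, key=server_loads.get)

    bins = {server: 0.0 for server in servers}  # assumed to start empty
    table = {}
    for domain in domains:
        # min() returns the first minimal element, so ties fall to the
        # less loaded server because of the ordering above.
        target = min(servers, key=bins.get)
        table[domain] = target
        bins[target] += hit_rates[domain]  # bin updated after each assignment
    return table, bins


hit_rates = {"A": 50, "B": 30, "C": 20, "D": 10}
server_loads = {"WS1": 0.6, "WS2": 0.3}
table, bins = build_assignment_table(hit_rates, server_loads)
```

In this example the most popular domain goes to the less loaded WS2, the next two fill WS1 up to the same level, and the last domain breaks the tie in favor of WS2.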

Algorithms ….
Client Redirection
(i) The assignment table is used only in the first-level assignment, carried out by the DNS when it receives an address resolution request.
(ii) The second-level assignment, carried out by the Web servers, is instead based on the Server Percentage list, which indicates the percentage of client requests that need to be redirected.
(iii) The DNS calculates the average bin level of all the Web servers.
(iv) It broadcasts this value to all the Web servers.
(v) The Web server generates a random number between 0 and 1.
(vi) If the number generated is greater than the average bin number of the Web server, the request is redirected.
(vii) If the server percentage is zero, the request is not redirected.
(viii) From point (vi), there are three possibilities for the choice of the server that receives the redirected requests.

Algorithms ….
- CR_RR Policy - All redirected requests are reassigned in a round-robin manner among the servers whose percentage equals 0%.
- CR_PRR Policy – Client redirection is done in a probabilistic round-robin way, where the probability is based on the available server capacity using the latest load information.
- CR_LL Policy – The server with the least load is assigned the request.
Domain and Client Redirection
- Combination of the first and the second method.
Alarm Message
- Any of the above-mentioned synchronous algorithms can be combined with a feedback alarm mechanism.
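The three ways of choosing the target of a redirected request can be sketched as follows. All function names are hypothetical. In particular, the CR_PRR sketch approximates "probabilistic round-robin" with a capacity-weighted random draw, which is an assumption about how the probability is applied.

```python
import itertools
import random


def cr_rr_targets(percentages):
    """CR_RR: cycle over the servers whose redirection percentage is 0."""
    eligible = [server for server, pct in percentages.items() if pct == 0]
    return itertools.cycle(eligible)


def cr_prr_target(available_capacity, rng):
    """CR_PRR (assumed form): draw a server with probability proportional
    to its available capacity from the latest load information."""
    servers = list(available_capacity)
    weights = [available_capacity[server] for server in servers]
    return rng.choices(servers, weights=weights, k=1)[0]


def cr_ll_target(load):
    """CR_LL: choose the server with the least load."""
    return min(load, key=load.get)


percentages = {"WS1": 0.2, "WS2": 0.0, "WS3": 0.0}
targets = cr_rr_targets(percentages)
rr_sequence = [next(targets) for _ in range(3)]

least_loaded = cr_ll_target({"WS1": 0.9, "WS2": 0.4, "WS3": 0.6})
weighted_pick = cr_prr_target({"WS1": 0.0, "WS2": 1.0}, random.Random(0))
```

CR_RR needs no load data beyond the percentage list, CR_LL relies entirely on the latest load report, and CR_PRR sits between the two by spreading redirected requests in proportion to spare capacity.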

Algorithms ….
Asynchronous Redirection schemes -
- The feedback mechanism discussed in the previous topic can be used to activate the redirection process itself.
- The Web cluster remains a typical DNS-dispatcher-based system, where the DNS carries out the first-level assignment through an RR or RR2 scheme.
- This DNS assignment process is integrated with a second-level (re)assignment mechanism triggered by any overloaded server.
- The DNS maintains the so-called 'Available Server list', which is the list of servers that are not overloaded at that moment.
- This list is transmitted in reply to the server alarm.
- The overloaded server redirects client requests using this list.
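The alarm-triggered exchange above can be sketched as a small message flow. Everything named here is hypothetical: the `AlarmAwareDNS` class, the `clear` method (the slides do not say how a server leaves the overloaded state), the 0.95 alarm threshold, and the choice of the first available server as the redirection target are all assumptions made for illustration.

```python
class AlarmAwareDNS:
    """Hypothetical DNS that tracks which Web servers raised an alarm."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.overloaded = set()

    def available_servers(self):
        # The 'Available Server list': servers not overloaded right now.
        return [s for s in self.servers if s not in self.overloaded]

    def alarm(self, server):
        # An overloaded server raises an alarm; the list is sent in reply.
        self.overloaded.add(server)
        return self.available_servers()

    def clear(self, server):
        # Assumed counterpart of alarm(): the server is back to normal.
        self.overloaded.discard(server)


def handle_request(server, utilization, dns, threshold=0.95):
    """Return the server that ends up serving the request."""
    if utilization <= threshold:
        return server  # not overloaded: serve locally
    available = dns.alarm(server)
    # Assumed target choice: the first server in the returned list.
    return available[0] if available else server


dns = AlarmAwareDNS(["WS1", "WS2", "WS3"])
served_locally = handle_request("WS1", 0.50, dns)
redirected_to = handle_request("WS1", 0.97, dns)
```

The point of the design is that communication happens only when a server is in trouble, in contrast to the synchronous scheme where every server reports every t seconds.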

Parameters ….
Category            Parameter                        Values (default)
Web-Server System   Number of Servers                4-9 (7)
                    Average Utilization              0.5-0.8 (0.6667)
LG                  Number                           10-100 (20)
                    Client Distribution              Zipf, Geometric (p = 0.3)
                    TTL (seconds)                    0-360 (240)
Client              Number                           1500-5000 (1500)
                    Web page requests                Exponential (mean 20)
                    Hits per page                    Uniform (5-15)
                    Hit service time (milliseconds)  Exponential (mean 4.5)
                    Inter-request time (seconds)     Exponential (mean 15)
A good index for measuring the load level and overloaded instances is the utilization of the Web servers.

Evaluation ….
- Performance stability is the main concern for a distributed Web server system that is subject to non-uniform bursts of arrivals, of which only a small fraction is controlled by the DNS.
- The impact of the DNS algorithms on avoiding overload at any server is more important than equalizing the system workload.
- Utilization of the Web servers indicates the load level / overloaded situations.
- If the maximum utilization at an instant is low, no server is overloaded at that time.
- By tracking the period of time the maximum utilization is above or below a certain threshold, we can get an indication of how well the distributed system is running.
- Determine the percentage of time for which at least one of the servers is critically loaded, which is given by the cumulative frequency.
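The metric described above reduces to a simple count over utilization snapshots. The sketch below assumes the utilization of every server is sampled at regular instants; the function name and the sample data are hypothetical, and 0.95 is used as the threshold only because the results later refer to "exceeding 95% utilization".

```python
def prob_no_server_overloaded(snapshots, threshold):
    """Fraction of sampling instants at which the maximum server
    utilization does not exceed the threshold."""
    if not snapshots:
        raise ValueError("at least one snapshot is required")
    not_overloaded = sum(1 for snapshot in snapshots if max(snapshot) <= threshold)
    return not_overloaded / len(snapshots)


# Each inner list holds the utilization of every server at one instant.
snapshots = [
    [0.50, 0.60, 0.70],
    [0.97, 0.40, 0.50],  # one server overloaded
    [0.80, 0.90, 0.94],
    [0.30, 0.96, 0.20],  # one server overloaded
]
p_ok = prob_no_server_overloaded(snapshots, threshold=0.95)
p_overloaded = 1 - p_ok  # share of time at least one server is critically loaded
```

Evaluating the same function over a range of thresholds gives the cumulative frequency of the maximum utilization.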

Results…. The goal is to measure how effectively the DNS scheduling algorithms, which control only a very small percentage of the address resolution requests, can minimize overload situations in a distributed Web server system.

Results…. Algorithms using Domain Information - Distribution for RR Scheduling Policy

Results…. Comparison of 3 algorithms using Domain information

Results…. Algorithms combining server load information

Results….
Summary …
Rating scale: Worst, Poor, Good, Better, Best
Policies rated (in slide order): LU, Random, LPPU-Naive, LUN-Naive, DAL, -, LPPU-Modified, LUN-Modified, Thr1
The threshold policy using the state information provides the most substantial performance improvement. This motivates the research for policies that combine client and server information.

Results…. Algorithms combining domain and server load information
This section shows that all the algorithms with alarm feedback from the WS to the DNS perform better than their non-feedback counterparts.

Results…. Although no policy guarantees that the Web server system is never overloaded, the graphs indicate that, for the proposed policies, overload very rarely occurs in more than one server.

Results…. MRL-Thr1 and RR2-Thr1 can be considered the best policies, and DAL-Thr1 stands third.

Results…. Sensitivity of DNS Scheduling Algorithm: Sensitivity to TTL

Results…. For RR2-Thr1 and MRL-Thr1, the probability that no server is overloaded is around 90% for more than 40 domains.

Results…. The probability that no server is overloaded remains over 0.8 for more than 40 domains, which is significantly better than the 0.68 value achieved for the default value. Another interesting aspect is the sensitivity of the scheduling policy for a singleton domain.

Results…. Performance improves as the client distribution comes closer to the uniform distribution and the skew is reduced. Again, DAL-Thr1 and MRL-Thr1 perform the best.

Results of redirection algorithms ….

Results…. Sensitivity to the frequency of Assignment table updating - In fig 2 we compare the sensitivity to the assignment table update interval, using the probability that no server is overloaded (exceeding 95% utilization) as the performance metric.

Results…. Asynchronous redirection performance

Results…. Sensitivity…

Conclusion….
- Classic algorithms such as RR or least loaded server are not adequate.
- On the other hand, detailed load information also does not help.
- Extensive simulation results have shown that the strategies that take into account the domain of the requests and alarms from the servers are effective.
- The most promising algorithms use asynchronous alarms combined with the estimation of the hidden load weight.
- RR2-Thr1 is the simplest to implement.
- MRL-Thr1 and RR2-Thr1 stand out as the best algorithms, with DAL-Thr1 closely behind.

Conclusion….
- The scheduling policies combined with a redirection mechanism are effective.
- Asynchronous algorithms combined with redirection perform the best, even in the case of a highly skewed load.
- The intra-cluster communication overhead of the synchronous policies is typically higher than that introduced by the asynchronous policy.
- Comparing the 2 different categories, it can be concluded that a simple, best-performing algorithm using domain and server information, combined with the asynchronous redirection policy, gives the best results.

Thank You