Adaptive Overload Control for Busy Internet Servers. Matt Welsh and David Culler, USENIX Symposium on Internet Technologies and Systems (USITS) 2003. Presented by Alex Cheung.

Presentation transcript:

Adaptive Overload Control for Busy Internet Servers. Matt Welsh and David Culler, USENIX Symposium on Internet Technologies and Systems (USITS) 2003. Presented by Alex Cheung, Nov 13, 2006, ECE1747.

Outline
- Motivation
- Goal
- Methodology
  - Detection
  - Overload control
- Experiments
- Comments

Motivation
1. Internet services are becoming important to our daily lives: news, trading.
2. Services are becoming more complex: large dynamic content requires heavy computation and I/O, and the load requirements of requests are hard to predict.
3. Servers must withstand peak loads 1000x the norm without over-provisioning, i.e., solve CNN's problem on 9/11.

Goal
An adaptive overload control scheme at the node level that maintains:
- Response time
- Throughput
- QoS & availability

Methodology - Detection
1. Look at the 90th-percentile response time of requests served.
2. Compare it against a threshold and decide what to do.
Weaker alternatives:
- 100th percentile: does not capture the "shape" of the response-time curve.
- Throughput: does not capture the user-perceived performance of the system.
I ask: what makes the 90th percentile so great? Why not the 95th? 80th? 70th? There is no supporting micro-experiment.
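As a concrete sketch of this detector, here is a small Python class that tracks the 90th-percentile response time over a sliding window of recent requests and flags overload when it exceeds a threshold. The window size, threshold, and percentile computation are illustrative assumptions, not the paper's actual controller:

```python
import math
from collections import deque

class PercentileDetector:
    """Flag overload when the 90th-percentile response time over a
    sliding window of recent requests exceeds a threshold (sketch)."""

    def __init__(self, window=100, threshold_ms=1000.0, percentile=0.9):
        self.samples = deque(maxlen=window)  # recent response times (ms)
        self.threshold_ms = threshold_ms
        self.percentile = percentile

    def record(self, response_time_ms):
        self.samples.append(response_time_ms)

    def overloaded(self):
        if not self.samples:
            return False
        ordered = sorted(self.samples)
        # Nearest-rank percentile: the ceil(p*n)-th smallest sample.
        idx = max(int(math.ceil(self.percentile * len(ordered))) - 1, 0)
        return ordered[idx] > self.threshold_ms
```

Note how, unlike the 100th percentile, a single outlier in the window does not trip the detector; roughly a tenth of the window must be slow before it fires.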

Methodology – Overload Control
If the response time is higher than the threshold:
1. Limit the service rate by rejecting selected requests.
   Extension: differentiate requests into classes/priority levels and reject lower-class/priority requests first.
2. Degrade quality of service.
3. Apply back pressure:
   - Causes queue explosion at the 1st stage (they say); solved by rejecting requests at the 1st stage instead.
   - Breaks the loose-coupling modular design of SEDA with an out-of-band notification scheme (I say).

Methodology – Overload Control (continued)
4. Forward rejected requests to another "more available" server:
   - "More available" means the server with the most of a particular resource: CPU, network, I/O, hard disk.
   - The decision can be made with a centralized or a distributed algorithm.
   - Requires reliable state migration, possibly transactional.
My take: more complex and more interesting, and it actually solves CNN's problem with a cluster of servers!

Rate Limit
- Smoothed response-time measurement.
- Multiplicative decrease, additive increase: just like TCP!
- 10 fine-tuned parameters per stage.
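The additive-increase/multiplicative-decrease adjustment can be sketched as follows; the increment, decrease factor, and rate bounds here are illustrative assumptions, not the paper's tuned per-stage parameters:

```python
class AIMDRateLimiter:
    """Additive-increase / multiplicative-decrease admission-rate
    controller (sketch), driven by the measured 90th-percentile
    response time versus a target."""

    def __init__(self, rate=100.0, adjust_up=2.0, adjust_down=0.5,
                 min_rate=0.05, max_rate=5000.0):
        self.rate = rate                 # admitted requests per second
        self.adjust_up = adjust_up       # additive increment (req/s)
        self.adjust_down = adjust_down   # multiplicative decrease factor
        self.min_rate = min_rate
        self.max_rate = max_rate

    def update(self, observed_p90_ms, target_ms):
        if observed_p90_ms > target_ms:
            # Over target: cut the admission rate multiplicatively.
            self.rate = max(self.rate * self.adjust_down, self.min_rate)
        else:
            # Under target: probe upward additively, like TCP.
            self.rate = min(self.rate + self.adjust_up, self.max_rate)
        return self.rate
```

The asymmetry is deliberate: the rate collapses quickly under overload but recovers slowly, so the controller does not immediately re-enter overload after backing off.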

Rate Limit With Class/Priority
Class/priority assignment is based on: IP address, header information, HTTP cookies.
I ask:
- Where is the priority-assignment module implemented? Should priority assignment be a stage of its own? Is it not shown because it complicates the diagram and makes the stage design not "clean"?
- How do we classify which requests are potentially "bottleneck" requests? Is it application dependent?
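A toy classifier in the spirit of this slide might look like the following; the cookie name, subnet prefix, and header check are purely hypothetical examples of the IP/header/cookie criteria, not anything from the paper:

```python
def classify_priority(request):
    """Assign a priority class from IP address, headers, or cookies
    (sketch with made-up field values)."""
    if request.get("cookies", {}).get("tier") == "gold":
        return "high"    # paying customers keep service under overload
    if request.get("ip", "").startswith("10.0."):
        return "high"    # e.g. an internal/trusted subnet
    if request.get("headers", {}).get("User-Agent", "").startswith("Monitor"):
        return "low"     # health probes can be shed first
    return "normal"
```

Lower classes would then be rejected first when the admission rate must drop.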

Quality/Service Degradation
- SEDA notifies the application via a signal to perform service degradation.
- The application does the service degradation, not SEDA.
Questions:
- How is the signaling implemented? Out of band?
- Is it possible to signal previous stages in the pipeline? Will this break SEDA's loose-coupling design?
(Diagram: a signal delivered to the "Attach image" and "Send response" stages.)

Experiments

Experiments - Setup
- Arashi server (realistic experiment): real access workload, real content, admission control.
- Web server benchmark: service degradation + 1-class admission control.

Experiments – Admission Rate
The controller's response time is not as fast. (Plot annotations: additive increase, multiplicative decrease.)

Experiments – Response Time
(Plot annotation: Why?)

Experiments – Massive Load Spike
Not fair! SEDA's parameters were fine-tuned; Apache can be tuned to stay flat too.

Experiments – Service Degradation
Service degradation and admission control kick in at roughly the same time.

Experiments – Service Differentiation
Average reject rates without service differentiation:
- Low-priority: 55.5%
- High-priority: 57.6%
With service differentiation:
- Low-priority: 87.9% (+32.4%)
- High-priority: 48.8% (-8.8%)
Question: why is the drop rate for high-priority requests reduced so little with service differentiation? Is it workload dependent?

Comments

- We have no idea what the controller's overhead is.
- Overload control at the node level is not good:
  - It is inefficient: rejection happens late.
  - It is not user-friendly: all session state is gone if you get a reject out of the blue (it comes without warning).
- A global-level overload control scheme is needed.
- The idea/concept is explained in only 2.5 pages.

Comments
Rejected requests:
- Instead of a TCP timeout, send a static page. The paper says this is better; I say this is worse, because it leads to an out-of-memory crash down the road: the output bandwidth is saturated and the queue at the reject handler is unbounded.
Parameters:
- How are they tuned? How difficult is the tuning? Tuning each stage manually may be tedious: given a 1M-stage application, must all 1M stage thresholds be configured manually? Automated tuning with control theory?
- The methodology for adding extensions is not shown in any figure.

Comments
The experiment is not entirely realistic:
- Is a 20 ms inter-request think time realistic?
- Rejected users have to re-login after 5 minutes: all state information is gone, and users are frustrated.
Two drawbacks of using response time for load detection…

Comments
1. We have no idea which resource is the bottleneck: CPU? I/O? Network? Memory? SEDA can only either:
- do admission control, which reduces throughput, or
- tell the application to degrade overall service.

Comments
(Diagram: utilization of CPU, I/O, network, and memory against a resource-utilization threshold; one resource is OVERLOADED.)
Default admission control: reject requests before the "Attach image" / "Send response" stages… and piss off some users.

Comments
Service degradation WITH bottleneck intelligence: the network is the bottleneck, so expend some CPU and memory to reduce the fidelity and size of images, cutting bandwidth consumption WITHOUT reducing the admission rate.
(Diagram: utilization of CPU, I/O, network, and memory against a resource-utilization threshold.)
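The bottleneck-aware policy argued for here could be sketched as a simple decision over per-resource utilization; the resource names and the 0.8 threshold are illustrative assumptions, not part of SEDA:

```python
def choose_action(utilization, threshold=0.8):
    """Pick an overload response based on which resource is the
    bottleneck (sketch). `utilization` maps resource name -> fraction."""
    hot = [r for r, u in utilization.items() if u > threshold]
    if not hot:
        return "admit"
    if hot == ["network"]:
        # Trade spare CPU/memory for bandwidth: shrink the images
        # instead of turning users away.
        return "degrade-images"
    # CPU, I/O, or memory is saturated: shedding load is the only lever.
    return "reject"
```

The point is that knowing *which* resource is saturated lets the server degrade fidelity without reducing the admission rate, rather than always rejecting.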

Comments
2. The response-time index lags by at least the magnitude of the response time itself. Example: 50 requests come in all at once; nreq = 100, timeout = 10 s, target = 20 s, processing time per request = 1 s. Overload is detected only after 30 s.
Solution: compare the enqueue rate vs. the dequeue rate. Overload occurs when enqueue rate > dequeue rate; this detects the overload after 10 s.
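The proposed enqueue-vs-dequeue comparison can be sketched as a pair of per-interval counters; sampling the counters once per measurement interval is an illustrative assumption:

```python
class QueueRateDetector:
    """Flag overload when requests arrive faster than they are served
    over a measurement interval (sketch of the proposed alternative).
    Unlike a response-time index, this fires before slow responses
    have even completed."""

    def __init__(self):
        self.enqueued = 0
        self.dequeued = 0

    def on_enqueue(self, n=1):
        self.enqueued += n

    def on_dequeue(self, n=1):
        self.dequeued += n

    def overloaded(self):
        # The queue grows exactly when enqueue rate > dequeue rate.
        result = self.enqueued > self.dequeued
        self.enqueued = self.dequeued = 0  # reset for the next interval
        return result
```

In the slide's example, the burst of 50 enqueues against a 1 req/s service rate shows up in the very first interval, rather than after the slow responses finish.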

Questions?