Effects and Implications of File Size/Service Time Correlation on Web Server Scheduling Policies Dong Lu* + Peter Dinda* Yi Qiao* Huanyuan Sheng* *Northwestern.

Presentation transcript:

Effects and Implications of File Size/Service Time Correlation on Web Server Scheduling Policies Dong Lu* + Peter Dinda* Yi Qiao* Huanyuan Sheng* *Northwestern University + Ask Jeeves, Inc.

2 Outline Quick review of size-based scheduling Motivation and approach Correlation between file size and service time: a measurement study Performance of SRPT scheduling under real workload Domain-based scheduling

3 Quick Review of Size-based Scheduling SRPT –Shortest Remaining Processing Time –Assuming perfect knowledge of service times FSP –Fair Sojourn Protocol –Assuming perfect knowledge of service times Typical non-size-based scheduling –Processor Sharing (PS) –First Come First Serve (FCFS)

4 SRPT Always serve the job with the minimum remaining processing time first, preemptive scheduling –Performance: Minimum mean response time [Schrage, Operations Research, 1968] –Fairness: performance gains of SRPT over PS do not usually come at the expense of large jobs; in other words, it is fair for heavy-tailed job size distributions [Bansal and Harchol-Balter, Sigmetrics ‘01]
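
The SRPT rule on this slide can be sketched as a small single-server simulation. This is an illustrative sketch only, not the authors' simulator: the function names `srpt_mean_response` and `fcfs_mean_response` and the `(arrival_time, service_time)` job format are assumptions made for the example, and service times are assumed to be known exactly (ideal SRPT).

```python
import heapq


def srpt_mean_response(jobs):
    """Mean response time under preemptive SRPT on one server.

    jobs: list of (arrival_time, service_time) pairs (hypothetical format).
    """
    jobs = sorted(jobs)
    heap = []  # entries are (remaining_service_time, arrival_time)
    t = 0.0
    i = 0
    total_response = 0.0
    n = len(jobs)
    while i < n or heap:
        if not heap:
            # Server is idle: jump to the next arrival.
            t = max(t, jobs[i][0])
        # Admit every job that has arrived by time t.
        while i < n and jobs[i][0] <= t:
            heapq.heappush(heap, (jobs[i][1], jobs[i][0]))
            i += 1
        # Serve the job with the least remaining processing time.
        remaining, arrived = heapq.heappop(heap)
        next_arrival = jobs[i][0] if i < n else float("inf")
        if remaining <= next_arrival - t:
            # The job finishes before anything new arrives.
            t += remaining
            total_response += t - arrived
        else:
            # A new arrival may preempt: put the job back with reduced work.
            heapq.heappush(heap, (remaining - (next_arrival - t), arrived))
            t = next_arrival
    return total_response / n


def fcfs_mean_response(jobs):
    """Mean response time under non-preemptive FCFS, for comparison."""
    t = 0.0
    total_response = 0.0
    for arrival, service in sorted(jobs):
        t = max(t, arrival) + service
        total_response += t - arrival
    return total_response / len(jobs)
```

For example, with one long job followed closely by two short ones, `[(0, 10), (1, 2), (2, 1)]`, SRPT lets the short jobs preempt the long one, so its mean response time is lower than under FCFS.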

5 FSP Combines SRPT with PS, preemptive scheduling. [Friedman, et al, Sigmetrics ‘03] –SRPT + the longer a job stays in the queue, the higher its priority –Performance: Mean response time is close to that of SRPT –Fairness: Fairer than PS

6 Outline Quick review of size-based scheduling Motivation and approach Correlation between file size and service time: a measurement study Performance of SRPT scheduling under real workload Domain-based scheduling

7 Motivation Current implementations of SRPT and FSP –Use file size as service time (sorting jobs using file size) Is file size a good estimator of service time? What is the performance of SRPT and FSP using file size as service time, and how can it be improved? Service time: the time needed to send the requested data in the absence of other requests in the system

8 Trace-driven Simulation Simulator: –C++ –Supports G/G/n/m queuing model –Driven by enhanced web server traces –Validation: Little’s law; repeating the simulations in the FSP paper [Friedman, et al, Sigmetrics ‘03]; comparing with available theoretical results [Bansal and Harchol-Balter, Sigmetrics ‘01]
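
A Little's law check of the kind listed under validation can be sketched as follows. This is a minimal illustration, not the authors' C++ simulator: the function name `littles_law_check` and the `(arrival_time, departure_time)` record format are assumptions. It computes the time-average number of jobs in the system L by sweeping arrival and departure events, and compares it with λW computed separately from the response times.

```python
def littles_law_check(records):
    """Return (L, lambda * W) for completed jobs.

    records: list of (arrival_time, departure_time) pairs (hypothetical
    format); every job is assumed to finish inside the observed window.
    """
    start = min(a for a, _ in records)
    end = max(d for _, d in records)
    horizon = end - start

    # Time-average number in system: integrate N(t) over the event sweep.
    events = sorted([(a, 1) for a, _ in records] +
                    [(d, -1) for _, d in records])
    area = 0.0
    in_system = 0
    prev = start
    for time, delta in events:
        area += in_system * (time - prev)
        in_system += delta
        prev = time
    L = area / horizon

    # Arrival rate and mean response time from the same records.
    lam = len(records) / horizon
    W = sum(d - a for a, d in records) / len(records)
    return L, lam * W
```

When every job arrives and departs inside the window, the two values should agree up to rounding, so a mismatch points to a bookkeeping error in the simulator rather than to a property of the workload.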

9 Scheduling Policies Studied SRPT: Ideal SRPT SRPT-FS: File size as service time SRPT-D: Domain-estimated service time FSP: Ideal FSP FSP-FS: File size as service time FSP-D: Domain-estimated service time PS: Processor sharing

10 Outline Quick review of size-based scheduling Motivation and approach Correlation between file size and service time: a measurement study Performance of SRPT-FS and FSP-FS scheduling under real workload Domain-based scheduling

11 Correlation is Weak on a Typical Web Server Measurement on departmental web server, with requests from the whole Internet: R ≈ 0.14. (Figure: scatter plot of file size versus service time, log-log scale.)

12 Correlation is Weak on Web Cache Servers Measurement on 10 Squid web cache servers. (Figure: P[R>x] for the correlation coefficient R between file size and service time.)

13 Main reason for the weak correlation End-to-end path diversity. (Diagram: a web server and Clients 1 to 4.)
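
The effect of path diversity on the correlation can be illustrated with a short sketch. It assumes that R on these slides denotes the Pearson correlation coefficient, which the transcript does not state explicitly; the function name `pearson_r` is hypothetical, and the numbers in the usage note are made up for illustration, not measurements from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def service_times(file_sizes, throughputs):
    """Service time of each request if it were alone: size / path throughput."""
    return [size / rate for size, rate in zip(file_sizes, throughputs)]
```

If every client sees the same throughput, service time is proportional to file size and `pearson_r` returns 1. With made-up sizes `[1000, 2000, 4000, 8000]` and very different per-client throughputs such as `[10, 1000, 100, 2000]`, the ordering by file size no longer matches the ordering by service time, and the coefficient drops sharply.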

14 Outline Quick review of size-based scheduling Motivation and approach Correlation between file size and service time: a measurement study Performance of SRPT-FS and FSP-FS scheduling under real workload Domain-based scheduling

15 Mean Response Time Much Worse Than Expected Simulation driven by web server trace. G/G/1/m. Pareto arrivals (rate controlled to tune the load). (Figure: mean response time in millisec versus load on the queue; curves: PS, SRPT-FS, FSP-FS, ideal SRPT and FSP.)

16 Mean Queue Length Much Worse Than Expected Simulation driven by web server trace. G/G/1/m. Pareto arrivals (rate controlled to tune the load). (Figure: mean queue length versus load on the queue; curves: FSP-FS, SRPT-FS, PS, ideal SRPT and FSP.)

17 Requirements For A Better Service Time Estimator Low overhead –Passive measurement –Low computation complexity –Low / adjustable memory usage Effective –Approximate the correct ordering of the service times. High correlation.

18 Outline Quick review of size-based scheduling Motivation and approach Correlation between file size and service time: a measurement study Performance of SRPT-FS and FSP-FS scheduling under real workload Domain-based scheduling

19 Domain-based estimator Divide the Internet into smaller “domains” by leveraging CIDR (Classless Inter-Domain Routing). Hosts in the same domain are likely to share the same or similar routes to the web server, and thus similar throughput. (Diagram: web server.)

20 Supporting Facts Statistical Internet stability and locality –Routing stability [Paxson, Sigcomm 1996] –TCP throughput locality and stability [Balakrishnan, et al, Sigmetrics 1997]; [Seshan, et al, USITS 1997]; [Myers, et al, Infocom 1999] Classless Inter-domain Routing –implies that routes from machines in the domain to a server outside the domain will share many hops.

21 Algorithm Use the high-order k bits of the client IP address to classify clients into 2^k domains. For each domain, calculate R = F/S –R: representative service rate –F: sum of file sizes delivered to the domain –S: sum of the corresponding service times. For each request, first extract its domain; the service time can then be estimated as B/R –B: requested file size –R: representative service rate obtained above
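
The algorithm on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `DomainEstimator` and its method names are hypothetical, clients are assumed to have IPv4 addresses, and returning `None` for a domain with no history is a choice made for the example (the slides do not say what the fallback is).

```python
import ipaddress
from collections import defaultdict


class DomainEstimator:
    """Estimate service time as B / R, with R = F / S kept per domain."""

    def __init__(self, k):
        if not 1 <= k <= 32:
            raise ValueError("k must be between 1 and 32 for IPv4")
        self.k = k
        self.bytes_sent = defaultdict(float)  # F: file sizes delivered
        self.time_spent = defaultdict(float)  # S: corresponding service times

    def domain(self, client_ip):
        # The high-order k bits of the address identify one of 2**k domains.
        return int(ipaddress.IPv4Address(client_ip)) >> (32 - self.k)

    def observe(self, client_ip, file_size, service_time):
        # Passive measurement: update the totals after a request completes.
        d = self.domain(client_ip)
        self.bytes_sent[d] += file_size
        self.time_spent[d] += service_time

    def estimate(self, client_ip, file_size):
        d = self.domain(client_ip)
        F = self.bytes_sent.get(d, 0.0)
        S = self.time_spent.get(d, 0.0)
        if F <= 0 or S <= 0:
            return None  # assumption: no history, caller picks a fallback
        rate = F / S  # R: representative service rate of the domain
        return file_size / rate  # estimated service time B / R
```

With k = 8, a client at 10.1.2.3 and a client at 10.200.0.1 fall into the same domain (10), so a request from either is estimated from their combined history. Memory use is bounded by 2^k entries, which is how the choice of k trades estimator accuracy against the "low / adjustable memory usage" requirement stated earlier.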

22 Higher Correlation Can Be Achieved (Figure: correlation coefficient R versus bits used to define a domain.)

23 Much Lower Service Times Can Be Achieved (Figure: mean response time in millisec versus bits used to define a domain; curves: PS, FSP-D, SRPT-FS, FSP-FS, SRPT-D, SRPT and FSP.)

24 Much Lower Queue Lengths Can Be Achieved (Figure: mean queue length versus bits used to define a domain; curves: FSP-D, FSP-FS, SRPT-FS, PS, SRPT-D, SRPT and FSP.)

25 Conclusions File size may not be a good estimator of service time in many regimes. File size-based SRPT and FSP can perform worse than PS in these regimes. Domain-based scheduling brings the benefits of size-based scheduling to these regimes.

26 For more information Prescience Lab at Northwestern University –

27 Jeeves’ Invitation … Have you ever seen the whole Web at once? Did you ever wonder how to rein the power of thousands of machines? We are hiring talents for Internet Search –Software Engineer –Development Manager Send us your Resume: