1 Mor Harchol-Balter, Carnegie Mellon
with Nikhil Bansal
with Bianca Schroeder
with Mukesh Agrawal
2 "size" = service requirement; load < 1.
Q: Which scheduling policy minimizes mean response time?
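A minimal simulation sketch (not from the talk) of the comparison behind this question: run the same arrival sequence through FCFS, PS, and SRPT (the policies compared on the next slide) in a tiny time-stepped single-server queue and compare mean response times. All parameters (arrival rate, the truncated-Pareto job sizes, the time step) are illustrative.

import random

def simulate(policy, arrivals, dt=0.01):
    """Coarse time-stepped single-server queue; arrivals = [(arrival_time, size), ...]."""
    jobs = [{"t": t, "rem": s} for t, s in arrivals]
    active, done, total_resp, now, i = [], 0, 0.0, 0.0, 0
    while done < len(jobs):
        while i < len(jobs) and jobs[i]["t"] <= now:            # admit new arrivals
            active.append(jobs[i]); i += 1
        if active:
            if policy == "PS":                                  # share the quantum equally
                for j in active:
                    j["rem"] -= dt / len(active)
            else:                                               # serve one job for the whole quantum
                key = (lambda j: j["rem"]) if policy == "SRPT" else (lambda j: j["t"])
                min(active, key=key)["rem"] -= dt               # SRPT: least remaining; FCFS: earliest arrival
            for j in [j for j in active if j["rem"] <= 0]:      # record completions
                total_resp += now + dt - j["t"]; done += 1
            active = [j for j in active if j["rem"] > 0]
        now += dt
    return total_resp / len(jobs)

rng, t, arrivals = random.Random(1), 0.0, []
for _ in range(2000):
    t += rng.expovariate(0.2)                                   # Poisson arrivals, rate 0.2
    arrivals.append((t, min(rng.paretovariate(1.5), 300.0)))    # heavy-tailed sizes, mean ~3 -> load ~0.6
for pol in ("FCFS", "PS", "SRPT"):
    print(pol, round(simulate(pol, arrivals), 1))               # typically: FCFS worst, SRPT best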
3 "size" = service requirement; load < 1.
[Figure: the same jobs queued under SRPT, PS, and FCFS]
Q: Which best represents scheduling in web servers?
4 How about using SRPT in Web servers, as opposed to the traditional processor-sharing (PS) type scheduling?
5 Many servers receive mostly static web requests: "GET FILE".
For static web requests, the file size is known, so the service requirement of the request is approximately known.
Immediate objections:
1) Can't assume known job size.
2) But the big jobs will starve...
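A one-line sketch of why the job size is (approximately) known for static requests: the server can stat the file before sending any bytes. The path in the usage comment is hypothetical.

import os

def job_size(path):
    """For a static GET, the service requirement is roughly the file's size in bytes."""
    return os.stat(path).st_size

# e.g. job_size("/var/www/htdocs/index.html")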
6 Outline of Talk
1) "Analysis of SRPT Scheduling: Investigating Unfairness" with Nikhil Bansal [THEORY]
2) "Implementation of SRPT Scheduling in Web Servers" with Nikhil Bansal, Bianca Schroeder, and Mukesh Agrawal [IMPLEMENT]
7 THEORY: SRPT has a long history...
1966 Schrage & Miller derive the M/G/1/SRPT response time.
1968 Schrage proves optimality.
1979 Pechinkin & Solovyev & Yashkov generalize.
1990 Schassberger derives the distribution on queue length.
BUT WHAT DOES IT ALL MEAN?
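For reference (standard results, not spelled out on the slide): with arrival rate λ, job-size density f and distribution F, and ρ(x) = λ ∫₀ˣ t f(t) dt, the Schrage-Miller M/G/1/SRPT response time for a job of size x, and the M/G/1/PS response time it is later compared against, are

\[
E[T(x)]^{SRPT} \;=\; \frac{\lambda\left(\int_0^x t^2 f(t)\,dt \;+\; x^2\bigl(1-F(x)\bigr)\right)}{2\bigl(1-\rho(x)\bigr)^2}
\;+\; \int_0^x \frac{dt}{1-\rho(t)},
\qquad
E[T(x)]^{PS} \;=\; \frac{x}{1-\rho}.
\]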
8 THEORY: SRPT has a long history (cont.)
1990-97: 7-year-long study at Univ. of Aachen under Schreiber. SRPT WINS BIG ON MEAN!
1998, 1999: Slowdown for SRPT under adversary: Rajmohan, Gehrke, Muthukrishnan, Rajaraman, Shaheen, Bender, Chakrabarti, etc. SRPT STARVES BIG JOBS!
Various O.S. books (Silberschatz, Stallings, Tanenbaum) warn about starvation of big jobs...
Kleinrock's Conservation Law: "Preferential treatment given to one class of customers is afforded at the expense of other customers."
9 THEORY: Our Analytical Results (M/G/1), SRPT vs. PS:
All-Can-Win Theorem: Under workloads with the heavy-tailed (HT) property, ALL jobs, including the very biggest, prefer SRPT to PS, provided load is not too close to 1.
Almost-All-Win-Big Theorem: Under workloads with the HT property, 99% of all jobs perform orders of magnitude better under SRPT.
Counter-intuitive!
Many more such results in the Sigmetrics talk.
10 THEORY: Our Analytical Results (M/G/1), continued:
Moderate-Load Theorem: If load < .5, then for every job size distribution, ALL jobs prefer SRPT to PS.
Bounding-the-Damage Theorem: For any load, for every job size distribution, for every size x,
E[T(x)]^{SRPT} ≤ (1/(1−ρ)) · E[T(x)]^{PS}.
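A small numerical sketch (not the paper's code) of what these theorems say: plug a bounded Pareto job-size distribution into the M/G/1 formulas above and compare E[T(x)] under SRPT and PS for several sizes x, up to the very largest. All parameters (α = 1.1, sizes in [1e3, 1e7], load ρ = 0.8) are illustrative.

# Bounded Pareto(k, p, alpha) job sizes; closed forms for F, E[X; X<=x], E[X^2; X<=x].
alpha, k, p, rho = 1.1, 1.0e3, 1.0e7, 0.8                   # illustrative parameters
C = alpha * k**alpha / (1.0 - (k / p) ** alpha)             # density constant on [k, p]
F  = lambda x: C / alpha * (k**-alpha - x**-alpha)          # P{X <= x}
m1 = lambda x: C / (alpha - 1.0) * (k**(1 - alpha) - x**(1 - alpha))
m2 = lambda x: C / (2.0 - alpha) * (x**(2 - alpha) - k**(2 - alpha))
lam = rho / m1(p)                                           # arrival rate giving load rho
rho_x = lambda x: lam * m1(x)                               # load due to jobs of size <= x

def ET_srpt(x, n=5000):
    """E[T(x)] under SRPT = waiting time + residence time (formulas above)."""
    wait = lam * (m2(x) + x * x * (1.0 - F(x))) / (2.0 * (1.0 - rho_x(x)) ** 2)
    h = (x - k) / n                                         # midpoint rule for the residence integral
    res = k + sum(h / (1.0 - rho_x(k + (i + 0.5) * h)) for i in range(n))
    return wait + res

ET_ps = lambda x: x / (1.0 - rho)

# For these illustrative parameters: small x does far better under SRPT, the very
# largest x comes out roughly even (slightly better here), and every x stays well
# within the Bounding-the-Damage factor of 1/(1 - rho).
for x in (k, 1.0e4, 1.0e5, 1.0e6, p):
    print(f"x = {x:10.0f}   E[T]_SRPT = {ET_srpt(x):12.1f}   E[T]_PS = {ET_ps(x):12.1f}")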
11 What's the Heavy-Tail property?
Defn: heavy-tailed distribution: Pr{X > x} ~ x^(−α), 0 < α < 2.
Many real-world workloads are well-modeled by a truncated HT distribution.
Key property (HT Property): "Largest 1% of jobs comprise half the load."
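A quick empirical check (illustrative, not from the talk) of the HT property for a bounded Pareto distribution; the parameters (α = 1.1, sizes from 1 KB to 10 GB) are made up but in the spirit of Web file-size measurements.

import random

def bounded_pareto(rng, alpha=1.1, k=1.0e3, p=1.0e10):
    """Inverse-transform sample from Bounded Pareto(k, p, alpha)."""
    u = rng.random()
    return k * (1.0 - u * (1.0 - (k / p) ** alpha)) ** (-1.0 / alpha)

rng = random.Random(0)
sizes = sorted(bounded_pareto(rng) for _ in range(1_000_000))
top_1_percent = sizes[int(0.99 * len(sizes)):]              # the largest 1% of jobs
print(sum(top_1_percent) / sum(sizes))                      # their share of the load: roughly half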
12 IMPLEMENT: From theory to practice:
What does SRPT mean within a Web server?
Many devices: where to do the scheduling?
No longer one job at a time.
13 IMPLEMENT: Previous work on implementation:
1999 Crovella, Frangioso, Harchol-Balter: SRPT scheduling of requests; user-level scheduling of reads and writes.
1998 Almeida, Dabu, Manikutty, Cao: Prioritizing HTTP requests at Web servers; "nice" the low-priority process.
What is the problem with both of these?
14 IMPLEMENT: Network/O.S. insides of a traditional Web server
[Figure: Web Server -> Socket 1, 2, 3 -> Network Card (BOTTLENECK) -> Client 1, 2, 3]
Sockets take turns draining --- FAIR = PS.
15 IMPLEMENT: Network/O.S. insides of our improved Web server
[Figure: Web Server -> Socket 1, 2, 3 -> priority queues (1st: S, 2nd: M, 3rd: L) -> Network Card (BOTTLENECK) -> Client 1, 2, 3]
The socket corresponding to the file with the smallest remaining data gets to feed first.
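A user-level sketch (illustrative; the real implementation works inside the kernel's transmit path) of the rule on this slide: always send the next chunk on the connection with the least remaining data, instead of draining sockets round-robin. Connection state is modeled abstractly here; a real server would call socket.send() where indicated.

import heapq

CHUNK = 4096                                                # bytes sent per turn

def srpt_drain(remaining):
    """remaining: dict conn_id -> bytes left to send. Yields (conn_id, bytes_sent) in SRPT order."""
    heap = [(rem, cid) for cid, rem in remaining.items()]
    heapq.heapify(heap)
    while heap:
        rem, cid = heapq.heappop(heap)                      # least remaining data feeds first
        sent = min(CHUNK, rem)                              # here a real server would socket.send() a chunk
        yield cid, sent
        if rem - sent > 0:
            heapq.heappush(heap, (rem - sent, cid))         # re-queue with updated remaining size

# Example: a short (S), medium (M), and large (L) transfer in progress.
order = [cid for cid, _ in srpt_drain({"S": 3_000, "M": 20_000, "L": 900_000})]
print(order[:8])                                            # S finishes first, then M, then L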
16 IMPLEMENT: Experimental Setup
[Figure: 200 clients, direct connection at 10 Mb/s, Apache Web server on Linux O.S.]
Implement SRPT-based scheduling:
1) Modifications to the Linux O.S.: 6 priority levels.
2) Modifications to the Apache Web server.
3) Priority algorithm design.
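A sketch of the user-level half of item 1 (hypothetical cutoffs; the talk's actual kernel modifications, which make the network card drain higher-priority bands first, are not shown): map a response's remaining size to one of 6 bands and tag the socket with the standard Linux SO_PRIORITY option.

import socket

CUTOFFS = [1_000, 5_000, 20_000, 100_000, 500_000]          # bytes; hypothetical boundaries for 6 bands

def band_for(remaining_bytes):
    """Band 0 = smallest remaining data (highest priority), band 5 = largest."""
    for band, cutoff in enumerate(CUTOFFS):
        if remaining_bytes <= cutoff:
            return band
    return len(CUTOFFS)

def tag_socket(sock, remaining_bytes):
    # SO_PRIORITY (Linux) tags the socket's outgoing packets with a priority value;
    # how that value maps to transmit bands depends on the queueing discipline, so
    # treat the numeric value used here as a placeholder.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY, band_for(remaining_bytes))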
17 IMPLEMENT: Experimental Setup (cont.)
[Figure: 200 clients, Apache Web server on Linux O.S.]
Comparison of the SRPT implementation vs. the standard FAIR scheduling.
Comparison done under both the Apache and Flash web servers.
Comparisons done under a trace-based workload and under a Web workload generator.
Under the trace-based workload:
-- Number of requests made: 1,000,000
-- Size of files requested: 41 B -- 2 MB
-- Distribution of file sizes requested has the HT property.
Experiments run under a range of loads.
18 IMPLEMENT: Preliminary Comments on Results:
[Figure: 200 clients, Apache Web server on Linux O.S.]
Measured job throughput, byte throughput, and bandwidth utilization were the same under SRPT and FAIR scheduling.
The same set of requests complete.
No additional CPU overhead under SRPT scheduling; CPU utilization always 1% - 5%.
The network was the bottleneck in all experiments.
19 Results: Mean Response Time
[Figure: mean response time (s) vs. load, FAIR vs. SRPT]
20 Results: Mean Slowdown
[Figure: mean slowdown vs. load, FAIR vs. SRPT]
21 Results: Mean Response Time vs. Size
[Figure: mean response time (s) vs. requested file size (bytes), FAIR vs. SRPT, load = 0.8]
22 Results: Mean Response Time vs. Size Percentile
[Figure: mean response time (s) vs. percentile of request size, FAIR vs. SRPT, load = 0.8]
23 Summary so far...
SRPT scheduling yields significant improvements in mean response time at the server.
Negligible starvation. No CPU overhead. No drop in throughput.
24 More questions...
This study involved a LAN. Are the effects of SRPT in a WAN as strong?
So far we've only experimented with load < 1. What happens under SRPT vs. FAIR when the server runs under transient overload?
-> new analysis
-> implementation study
25 Question: Will the improvement of SRPT over FAIR scheduling appear greater in a LAN setting or a WAN setting?
26 Answer: Normally the LAN setting shows more improvement. But for very high load, the WAN setting shows more improvement. Why?
[Figure: WAN setting, load 0.7 vs. load 0.9]
27 Person under overload: Zzzzzzz zzz...
28 Web server under overload
[Figure: Clients -> Server: SYN-queue -> ACK-queue -> Apache processes]
When the SYN-queue limit is reached, the server drops all new connection requests.
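For orientation (standard Linux socket behavior, not specific to this talk): the ACK (accept) queue on this slide is bounded by the listen() backlog, while the SYN queue of half-open connections is bounded separately by the kernel (e.g. net.ipv4.tcp_max_syn_backlog); once these fill under overload, further connection attempts are dropped. A minimal sketch:

import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("0.0.0.0", 8080))                                 # port is arbitrary for the sketch
srv.listen(128)                                             # accept-queue limit; overflow -> new connections dropped
conn, addr = srv.accept()                                   # blocks; pulls one established connection off the accept queue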
29 FAIR vs. SRPT
Question: What will happen under FAIR vs. SRPT w.r.t. the number of connections at the server and w.r.t. response times?
(a) Given persistent overload
(b) Given transient overload
30 Persistent Overload: Time until the SYN-queue limit is hit
[Figure: FAIR vs. FAIR+ (tuned version) vs. SRPT]
31 Persistent Overload: Buildup in the number of connections
[Figure: FAIR+ vs. SRPT]
32 Transient Overload: Buildup of connections at the server
[Figure: SRPT vs. FAIR+]
33 Transient Overload: Mean response time (in msec)
[Figure: SRPT vs. FAIR+]
34 Transient Overload: Response time as a function of job size
[Figure: FAIR+ vs. SRPT]
Small jobs win big! Big jobs aren't hurt! WHY?
35 Conclusion
SRPT scheduling yields significant improvements in mean response time at the server: in LAN and WAN settings, under high load, and under overload.
Negligible or zero unfairness. Often better for all requests.
All results corroborated via implementation and analysis.