Providing Differentiated Levels of Service in Web Content Hosting
Jussara Almeida et al.
First Workshop on Internet Server Performance, 1998
Computer Architecture Lab., CS Dept., KAIST
2000/9/18
Kim, Sung-Wan
1/16 Contents
Introduction
Methodology
Design & Implementation
Experiment & Results
Limitations of the study
Conclusions
2/16 Introduction (1/2)
Increased Web content hosting
  –Many servers charge fees for the Web service
  –Customers expect quality of service proportional to the fee
Apache
  –Most widely used Web server
  –FCFS (first-come, first-served) scheduling does not support differentiated quality of service
Objective
  –To provide differentiated levels of service through priority-based request scheduling
3/16 Introduction (2/2)
Priority-based request scheduling
  –Performance metric
      QoS: response time to the request
  –Method of improving performance
      Restrict the maximum number of concurrent processes at each priority
  –Approaches
      User-level and kernel-level
4/16 Methodology
Priority of requests
  –Based on the paying customer, not on the requested document
Approaches
  –User-level
      Add a scheduler process to Apache
  –Kernel-level
      Modify both Apache and the Linux kernel, adding new system calls
      Map request priorities to process priorities
      Keep track of which processes are running at each priority level
Performance metric
  –Response time
      The average latency of the server, measured from accepting a connection until closing it
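As a rough illustration of the response-time metric, the sketch below averages per-request latency separately for each priority level. The class name `LatencyLog` and its methods are hypothetical, and the timestamps are assumed to be taken at connection accept and connection close; the slides do not show how the measurement was instrumented.

```python
from collections import defaultdict


class LatencyLog:
    """Collects per-priority latencies (hypothetical helper, not from the study)."""

    def __init__(self):
        # priority level -> list of observed latencies in seconds
        self.samples = defaultdict(list)

    def record(self, priority, accepted_at, closed_at):
        # Latency runs from accepting the connection until closing it.
        self.samples[priority].append(closed_at - accepted_at)

    def average(self, priority):
        # Average latency for one priority level, or None if nothing was recorded.
        values = self.samples[priority]
        return sum(values) / len(values) if values else None
```

Keeping the averages separate per priority level is what makes it possible to compare how high-priority and low-priority requests fare under each policy.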
5/16 Design & Implementation (1/3)
Scheduling policies
  –Sleep policy: applied upon receiving a request
  –Wakeup policy: applied when a request completes, to choose the request that takes its place
  –Implementation
      Maximum thresholds: a fixed number of slots for each priority level
      Queue for blocked requests
Conserving policies
  –Non-work conserving
      Requests may occupy only slots of their own priority level
  –Work conserving
      Does NOT impose the restriction above
[Figure: high-priority and low-priority slots, with a queue for blocked requests]
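The slot-and-queue mechanism on this slide can be sketched roughly as follows. This is a minimal model under stated assumptions, not the authors' implementation: the class `SlotScheduler` and its method names are hypothetical, a larger level number is assumed to mean higher priority, and the work-conserving variant is assumed to let a request use any free slot and to wake the highest-priority waiting request first.

```python
from collections import deque


class SlotScheduler:
    """Threshold-based admission: each priority level owns a fixed number of slots."""

    def __init__(self, thresholds, work_conserving=False):
        # thresholds: {level: maximum concurrent requests at that level}
        self.thresholds = dict(thresholds)
        self.work_conserving = work_conserving
        self.used = {level: 0 for level in self.thresholds}         # occupied slots per slot type
        self.queues = {level: deque() for level in self.thresholds}  # blocked requests, oldest first
        self.slot_of = {}                                            # request -> slot type it occupies

    def _free_slot(self, level):
        # Prefer a slot of the request's own type.
        if self.used[level] < self.thresholds[level]:
            return level
        # Assumption: work conserving allows borrowing a free slot of another type.
        if self.work_conserving:
            for other in self.thresholds:
                if self.used[other] < self.thresholds[other]:
                    return other
        return None

    def arrive(self, req, level):
        """Sleep policy: return True if the request runs now, False if it is blocked."""
        slot = self._free_slot(level)
        if slot is None:
            self.queues[level].append(req)
            return False
        self.used[slot] += 1
        self.slot_of[req] = slot
        return True

    def complete(self, req):
        """Wakeup policy: free the slot and return the request woken in its place, if any."""
        slot = self.slot_of.pop(req)
        self.used[slot] -= 1
        # Non-work conserving: only a request of the same type may take the freed slot.
        levels = sorted(self.thresholds, reverse=True) if self.work_conserving else [slot]
        for level in levels:
            if self.queues[level]:
                woken = self.queues[level].popleft()
                self.used[slot] += 1
                self.slot_of[woken] = slot
                return woken
        return None
```

With `SlotScheduler({1: 1, 0: 1})`, a second high-priority request is blocked even though the low-priority slot is idle; with `work_conserving=True` it would run in that idle slot instead.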
6/16 Design & Implementation (2/3)
User-level approach
  –A master process spawns a Scheduler process and a child process for each request
  –The child process determines its priority from the URL
      Maps the customer name to a priority value
[Figure: Master process, Scheduler, and child processes #1 to #3; labels: spawn, requests, request scheduling, sleep or wakeup policy]
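One way a child process might derive a priority from the URL is sketched below. The slide does not say where the customer name sits in the URL, so the first path segment is assumed here; the table `CUSTOMER_PRIORITY`, its entries, and the function `priority_from_url` are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical customer table: customer name -> priority value (larger = higher).
CUSTOMER_PRIORITY = {"gold-corp": 1, "basic-shop": 0}
DEFAULT_PRIORITY = 0  # assumed fallback for unknown customers


def priority_from_url(url):
    """Map a request URL to a priority, assuming the first path segment names the customer."""
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]
    customer = segments[0] if segments else ""
    return CUSTOMER_PRIORITY.get(customer, DEFAULT_PRIORITY)
```

Because the lookup is keyed on the customer rather than the document, every page hosted for the same customer receives the same priority.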
7/16 Design & Implementation (3/3)
Kernel-level approach
  –Parameters
      The number of priority levels
      The number of concurrent processes at each level
      The priority value assigned to a blocked process (SLEEPING_PRIORITY)
  –Roles of the kernel
      Maps request priority to a process priority
      Scheduling (sleep and wake-up policies)
      Wake-up
        –Decides the priority level of the process to be unblocked
        –Chooses the oldest process
New system calls
  –initialize_priority_scheme, my_set_priority, my_release_priority
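The wake-up decision can be modelled in user space roughly as below. This is only a sketch of the selection logic, not the kernel code or the signatures of the new system calls, which the slides do not give. The names `Proc` and `pick_to_wake` are hypothetical, and two assumptions are made: the highest level that has both a free slot and a waiting process is served first, and the woken process receives its request level as its process priority.

```python
from dataclasses import dataclass

SLEEPING_PRIORITY = -1  # priority given to blocked processes (value used in one reported experiment)


@dataclass
class Proc:
    pid: int
    level: int                         # request priority level (larger = higher, an assumption)
    blocked_at: float                  # time at which the process was put to sleep
    priority: int = SLEEPING_PRIORITY  # process priority while blocked


def pick_to_wake(blocked, free_slots):
    """Pick the level to serve, then the oldest blocked process at that level."""
    eligible = [level for level, free in free_slots.items()
                if free > 0 and any(p.level == level for p in blocked)]
    if not eligible:
        return None
    level = max(eligible)  # assumption: serve the highest eligible level first
    oldest = min((p for p in blocked if p.level == level), key=lambda p: p.blocked_at)
    oldest.priority = level  # hypothetical mapping from request level to process priority
    return oldest
```

Choosing the process with the smallest `blocked_at` keeps waiting requests within one level in arrival order.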
8/16 Experimental Setup
For user-level approach
  –Sun SparcStation
      Two 66 MHz CPUs, 64 MB RAM, Solaris 2.4, 100 Mbps Ethernet
For kernel-level approach
  –DEC 90 MHz Pentium
      32 MB RAM, Linux, 10 Mbps Ethernet
HTTP server: Apache 1.3b2, KeepAlive off
Client: WebStone benchmark
  –6 machines, 5 client processes per machine
  –2 different workloads
9/16 Results (1/6) - User-level approach (1/3) Non-work conserving
10/16 Results (2/6) - User-level approach (2/3) Non-work conserving
11/16 Results (3/6) - User-level approach (3/3) Work conserving
12/16 Results (4/6) - Kernel-level approach (1/3)
Average latency for requests of type A and B for both workloads with no policy
The configurations used in the experiments
13/16 Results (5/6) - Kernel-level approach (2/3)
Average latency for workload WA
Average latency for workload WB
14/16 Results (6/6) - Kernel-level approach (3/3) Average latency for workload WB using non-work conserving and SLEEPING_PRIORITY = -1
15/16 Limitations of the Study
For truly differentiated QoS
  –CPU scheduling
  –Replacement policy for buffer cache
  –Disk I/O scheduling to favor high-priority requests
  –Networking QoS
  –However, this study focused only on CPU scheduling
Various mixes of high-priority and low-priority requests
16/16 Conclusions
Implemented priority-based request scheduling
Restricting the number of concurrent processes is a simple and effective strategy
The work-conserving policy is not adequate when the thresholds are large
  –Non-work conserving is better for multiple levels of priority
Critique
  –Kernel modification required
  –Too little benefit for high-priority requests, too much loss for low-priority requests
  –Thresholds parameter
  –Request ratio: high-priority : low-priority = 1 : 1?
  –Separate server for high-priority requests