Admission Control and Request Scheduling in E-Commerce Web Sites. Sameh Elnikety (EPFL), Erich Nahum (IBM Watson), John Tracey (IBM Watson), Willy Zwaenepoel (EPFL)
Dynamic Content Is Important
Generating Dynamic Content. [Diagram: http requests reach the Web Server, which calls the Dynamic Content Generator, which queries the Database Server.] Several components. Examples of request types.
Objective: Stable Performance. [Figure: throughput vs. load, ideal and actual curves, with underload, saturation, and overload regions.] Prevent overload (admission control). Sustain peak throughput in the overload region.
Objective: Improve Response Time. Excess requests are queued. Improve response time (request scheduling). Sustain peak throughput.
Main Idea: Gatekeeper Proxy. [Diagram: http requests reach the Web Server, then the Dynamic Content Generator, then the Gatekeeper, then the Database Server.] Transparent: intercepts requests to the bottleneck. Maintains measurement-based estimates. Performs admission control and request scheduling. Transparently controls how requests are admitted (non-invasive).
Admission Control. Goal: sustain peak throughput during overload. Requires two estimates: the amount of work required by each request, and the capacity of the system.
Estimating Work by Request Type. [Diagram: Web Server, Dynamic Content Generator, Gatekeeper, Database Server.] Key observations: there is a finite number of request types; requests of the same type take about the same execution time; requests of different types differ greatly in execution time. Online measurements: the Gatekeeper maintains an estimate per request type.
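A minimal sketch of how per-type estimates could be maintained from online measurements. The slides mention a moving average but do not say which kind; the exponential form, the class name `ServiceTimeEstimator`, and the `alpha` and `default` parameters below are assumptions for illustration only.

```python
class ServiceTimeEstimator:
    """Per-request-type service time estimates, updated online (hypothetical sketch)."""

    def __init__(self, alpha=0.25, default=1.0):
        # alpha (smoothing weight) and default (estimate for unseen types)
        # are illustrative assumptions, not values from the talk.
        self.alpha = alpha
        self.default = default
        self.estimates = {}

    def record(self, request_type, measured_time):
        """Fold one measured execution time into the estimate for its type."""
        old = self.estimates.get(request_type)
        if old is None:
            # First measurement for this type becomes the initial estimate.
            self.estimates[request_type] = float(measured_time)
        else:
            self.estimates[request_type] = (
                (1 - self.alpha) * old + self.alpha * measured_time
            )

    def estimate(self, request_type):
        """Current estimate, or the default if the type was never measured."""
        return self.estimates.get(request_type, self.default)


est = ServiceTimeEstimator(alpha=0.5)
est.record("best_sellers", 400)
est.record("best_sellers", 600)
print(est.estimate("best_sellers"))  # 500.0
```

Keying the table by request type keeps it small, since the number of types is finite.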
Service Time Distributions. [Figure: service time distributions.] Request type is key! Online measurement.
Estimating System Capacity. Request load = execution time (units of work required). Database capacity = max # of work units before overload. To determine system capacity: binary search, offline. The unit approximates resource usage and is oblivious to the bottleneck resource.
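The offline binary search could look roughly like the sketch below. The function `find_capacity` and the probe `is_overloaded` are hypothetical names; in practice the probe would stand for a full benchmark run at a given admission limit, and the sketch assumes overload is monotone in that limit.

```python
def find_capacity(is_overloaded, low, high):
    """Return the largest limit in [low, high] that does not cause overload.

    Assumes is_overloaded(limit) is monotone (False up to some limit, True
    beyond it) and that `low` itself is not overloaded.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always progresses
        if is_overloaded(mid):
            high = mid - 1
        else:
            low = mid
    return low


def fake_probe(limit):
    # Stand-in for an offline benchmark run; the threshold 700 is made up.
    return limit > 700


print(find_capacity(fake_probe, 1, 10_000))  # 700
```

Each probe is expensive, so the logarithmic number of runs is the reason to search this way rather than sweep every limit.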
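Admission Control - Example. Table (maintained for each type: online estimates of service time, moving average), req type and units: R1 = 5, R2 = 500, R3 = 300, … [Animation residue: four numbered steps showing requests R1, R2, R3 and the values 700, 695, 195, 200; the rest of the layout is not recoverable.]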
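One way to read this example is as a budget of work units that admitted requests draw from. The sketch below uses the per-type units on the slide (R1 = 5, R2 = 500, R3 = 300); the capacity of 700, the class `AdmissionController`, and the FIFO handling of waiting requests are assumptions, not the talk's stated implementation.

```python
from collections import deque


class AdmissionController:
    """Admit requests while their estimated work fits in the capacity (sketch)."""

    def __init__(self, capacity, estimates):
        self.available = capacity   # work units still free
        self.estimates = estimates  # request type -> estimated work units
        self.waiting = deque()      # queued (request_id, request_type)

    def submit(self, request_id, request_type):
        """Admit the request if it fits; otherwise queue it. True if admitted."""
        cost = self.estimates[request_type]
        if not self.waiting and cost <= self.available:
            self.available -= cost
            return True
        self.waiting.append((request_id, request_type))
        return False

    def complete(self, request_type):
        """Release a finished request's units; return ids admitted as a result."""
        # Assumes the estimate did not change while the request was running.
        self.available += self.estimates[request_type]
        admitted = []
        while self.waiting and self.estimates[self.waiting[0][1]] <= self.available:
            request_id, waiting_type = self.waiting.popleft()
            self.available -= self.estimates[waiting_type]
            admitted.append(request_id)
        return admitted


gate = AdmissionController(700, {"R1": 5, "R2": 500, "R3": 300})
print(gate.submit("a", "R1"), gate.available)  # True 695
print(gate.submit("b", "R2"), gate.available)  # True 195
print(gate.submit("c", "R3"), gate.available)  # False 195
print(gate.complete("R1"), gate.available)     # [] 200
print(gate.complete("R2"), gate.available)     # ['c'] 400
```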
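Request Scheduling - Example. Two requests with service times 500 and 10. Order 500, 10: total response time = (0+500) + (500+10) = 1010, average 505. Order 10, 500: total response time = (0+10) + (10+500) = 520, average 260.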
Request Scheduling. Reduce average response time. Use shortest job first (SJF) policy. Reorder requests in admission queue. No preemption.
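The effect of reordering can be checked with a few lines of arithmetic. This sketch reuses the two service times from the preceding example (500 and 10) and assumes a single server with no preemption; the helper name `average_response_time` is illustrative.

```python
def average_response_time(service_times):
    """Average response time when jobs run back to back in the given order."""
    clock = 0
    total = 0
    for service in service_times:
        clock += service  # this job finishes after all earlier jobs plus itself
        total += clock    # response time = waiting time + service time
    return total / len(service_times)


jobs = [500, 10]                            # arrival order
fifo = average_response_time(jobs)          # serve in arrival order
sjf = average_response_time(sorted(jobs))   # shortest job first
print(fifo, sjf)  # 505.0 260.0
```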
Large Variability: TPC-W Requests. 95% require < 1000 ms. Scheduling has high impact. Large variability in many internet workloads.
Request Scheduling: Aging. Prevent starvation. Limit the delay due to scheduling. Limit is X times "expected service time". [Figure: a big request queued among requests of size 1, showing the delay due to scheduling.]
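A rough sketch of SJF with aging follows. It interprets the limit as a per-request deadline of arrival time plus X times the expected service time, after which the request is served ahead of shorter ones; that interpretation, the class `AgingSJFQueue`, and the caller-supplied `now` timestamps are assumptions rather than the talk's stated design.

```python
import heapq
import itertools


class AgingSJFQueue:
    """Shortest job first, except that overdue requests go first (sketch)."""

    def __init__(self, x):
        self.x = x                       # aging limit multiplier X
        self.by_size = []                # heap of (expected_time, seq)
        self.by_deadline = []            # heap of (deadline, seq)
        self.pending = {}                # seq -> request_id, not yet served
        self.counter = itertools.count()

    def push(self, request_id, expected_time, now):
        seq = next(self.counter)
        self.pending[seq] = request_id
        heapq.heappush(self.by_size, (expected_time, seq))
        heapq.heappush(self.by_deadline, (now + self.x * expected_time, seq))

    def pop(self, now):
        # Lazily drop heap entries that were already served via the other heap.
        self._discard_served(self.by_deadline)
        self._discard_served(self.by_size)
        if not self.pending:
            raise IndexError("pop from empty queue")
        if self.by_deadline[0][0] <= now:
            _, seq = heapq.heappop(self.by_deadline)  # overdue request first
        else:
            _, seq = heapq.heappop(self.by_size)      # otherwise shortest job
        return self.pending.pop(seq)

    def _discard_served(self, heap):
        while heap and heap[0][1] not in self.pending:
            heapq.heappop(heap)


q = AgingSJFQueue(x=3)
q.push("big", 100, now=0)
q.push("small-1", 1, now=0)
first = q.pop(now=1)          # nothing is overdue, so the shortest job runs
q.push("small-2", 1, now=300)
second = q.pop(now=300)       # "big" reached its deadline (0 + 3 * 100)
third = q.pop(now=300)
print(first, second, third)   # small-1 big small-2
```

With X set to infinity the deadline never triggers and the queue degenerates to plain SJF; smaller values of X pull the behavior toward arrival order.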
Outline. Motivation & Background. The Gatekeeper Proxy. Experimental Environment. Software & Hardware. Metrics & Methodology. Results. Summary and Conclusions.
TPC-W Benchmark. From the Transaction Processing Performance Council (TPC). Workload generator. Models a large e-commerce site: an online bookstore. Searching, browsing, buying, registration, … Persistent data: static images on web server; all others on back-end database.
TPC-W Benchmark - Snapshot. [Screenshot of a TPC-W page with callouts: image, promotion (ad), shopping cart, next interaction.]
TPC-W: Interactions. 14 interactions, e.g.: home (read-only query); best sellers (complex); secure payment (ssl); shopping cart (update query). Scale: 10,000 items; 288,000 clients; 350 MB database (fits in main memory); 183 MB images (in file system of web server).
Software. [Diagram: Web Server, Dynamic Content Generator, Gatekeeper, Database.] Gatekeeper: implemented in Java (JDBC driver). Web server: Apache 1.3.27. App server: Jakarta Tomcat 3.2.4. Database: MySQL 3.23.53. OS: RedHat 7.2, Linux 2.4.18.
Hardware. [Diagram: Apache, Tomcat, Gatekeeper, and MySQL, connected by http and sql links.] CPU: AMD Athlon 1.33 GHz. Memory: 768 MB. Disk: 60 GB, 12 msec, 5400 rpm. Network: 100 Mbps Ethernet.
Emulated Clients. [Diagram: emulated clients send http requests to Apache and Tomcat, which send sql through the Gatekeeper to MySQL.] Client emulator: session duration; think time; Markov model. Load is a function of the number of clients.
Experiments. Performance metrics: throughput (interactions/minute); response time (msec, submission to completion); examine each as a function of load (# of clients). Methodology: average of 5 runs; 100 second warm-up; 600 second measurement.
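As an illustration of how the two metrics could be computed from one run's log, here is a short sketch. The log format (pairs of submission and completion times in seconds), the function name `summarize`, and the choice to count an interaction by its completion time inside the measurement window are all assumptions; only the 100 second warm-up and 600 second measurement interval come from the slide.

```python
def summarize(interactions, warmup=100.0, measure=600.0):
    """Return (throughput in interactions/minute, mean response time in msec).

    interactions: list of (submit_sec, complete_sec) pairs for one run.
    Only interactions completing inside the measurement window are counted.
    """
    start, end = warmup, warmup + measure
    window = [(s, c) for s, c in interactions if start <= c < end]
    throughput = len(window) / (measure / 60.0)
    if not window:
        return throughput, 0.0
    mean_ms = sum((c - s) * 1000.0 for s, c in window) / len(window)
    return throughput, mean_ms


log = [(50.0, 60.0), (120.0, 120.5), (300.0, 301.5), (699.0, 701.0)]
print(summarize(log))  # (0.2, 1000.0)
```

Averaging the per-run results over five runs would then give one point on each curve.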
Admission Control - Throughput. [Figure: throughput vs. load, with three annotated points.] Point 1: db processes 49, used mem 275 MB. Point 2: db processes 233, used mem 450 MB. Point 3: db processes 345, used mem 509 MB.
Request Scheduling - Response Time
Request Scheduling - Analysis. Response time = waiting time + execution (service) time. Large variability: many short requests, few very large requests.
Request Scheduling - Explanation. Big effect on waiting times. There are many small requests, whose waiting time is affected greatly. There are few large requests; the effect on them is big in absolute terms but relatively small. Net result is … [Chart values as extracted, axis labels not recoverable: short job 8000, 100; average job 9000, 200; long job 13000, 16000; followed by 400, 430, 4800.]
Request Scheduling - Fairness. Response time = waiting time + execution (service) time. Fairness trade-off. FIFO: fair, all wait for the same amount of time. SJF: unfair, favors short requests, but gives better average response time. Control waiting time.
Aging: Prevent Starvation. [Figure: curves labeled 1, 3, 5, ∞.]
In The Paper. More results: different bottleneck (database lock contention); online vs. offline measurements; DB2. Related work: most other methods are invasive.
Summary. Presented the Gatekeeper proxy: transparent (non-invasive); intercepts requests; online measurements. Admission control: consistent performance during overload; improves throughput 10%. Request scheduling using SJF: improves response time 14 times; penalizes long jobs only 13%; aging controls penalty.
Thank You!