Profile Driven Component Placement for Cluster-based Online Services Christopher Stewart (University of Rochester) Kai Shen (University of Rochester) Sandhya Dwarkadas (University of Rochester) Michael Scott (University of Rochester) Jian Yin (IBM TJ Watson)
Large Distributed Online Services: Amazon, eBay, Google, Citrix, etc. Implemented via many distinct single-purpose components. Developers use a common interface. Service demands affect the bottom line: sustained throughput, response time, reliability, etc. Hardware costs affect the bottom line. How can we optimize transparently?
Component Placement Given a complex service divided into N components distributed among M machines: Is system performance affected by component placement? Can we determine the maximal throughput placement?
Our Solution 1. Build component profiles. 2. Ascertain workload and available resources. 3. Estimate throughput for all settings (or use a heuristic). 4. The setting with the largest throughput estimate is taken as optimal. [Figure: component resource consumption profiles (built offline) for the Web Server (WS), Business Logic A (A), Business Logic B (B), and Database (DB), together with the available resources and workload, feed a Placement Executive, which issues placement decisions to the runtime environment for a cluster-based online service.]
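Steps 3 and 4 can be sketched as an exhaustive search over all M^N assignments of N components to M machines. This is a minimal sketch, not the Placement Executive itself: `best_placement` and the `estimate_throughput` callback are hypothetical names, and the callback stands in for whatever profile-based estimator is available.

```python
from itertools import product


def best_placement(components, machines, estimate_throughput):
    """Try every assignment of components to machines and keep the best one.

    components: list of component names (N items)
    machines: list of machine names (M items)
    estimate_throughput: hypothetical callback that maps a placement
        (dict: component -> machine) to a predicted throughput
    Returns (placement, predicted_throughput).
    """
    best, best_tp = None, float("-inf")
    # product(machines, repeat=N) yields all M**N assignments.
    for assignment in product(machines, repeat=len(components)):
        placement = dict(zip(components, assignment))
        tp = estimate_throughput(placement)
        if tp > best_tp:  # strict comparison: the first of any tied placements wins
            best, best_tp = placement, tp
    return best, best_tp
```

The search space grows as M^N, which is why the slide leaves room for a heuristic when N or M is large.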
Component Profiles Characterize resource consumption per component. Acquired via offline examination. Contain information on the following resources: CPU consumption (average, peak), memory usage (peak), network consumption (average, peak). Derived from the proc file system. We hypothesize that resource consumption grows linearly with workload (requests per second).
Profile Validation Resulting component profile: [Figure: CPU usage (in percentage) and network usage (in Mb) versus workload (requests/second), showing average and peak measurements with their linear fittings.] Resource consumption is proportional to workload.
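A linear fitting of this kind can be reproduced with ordinary least squares. The sketch below assumes the profile is a list of (workload, usage) samples; `fit_linear` is a hypothetical helper, and the slides do not say which fitting method was actually used.

```python
def fit_linear(workloads, usages):
    """Least-squares fit of usage = per_request * workload + overhead.

    workloads: request rates (requests/second) at which usage was sampled
    usages: the resource usage observed at each of those rates
    Returns (per_request, overhead).
    """
    n = len(workloads)
    if n < 2 or n != len(usages):
        raise ValueError("need at least two paired samples")
    mean_x = sum(workloads) / n
    mean_y = sum(usages) / n
    sxx = sum((x - mean_x) ** 2 for x in workloads)
    if sxx == 0:
        raise ValueError("workload samples must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(workloads, usages))
    per_request = sxy / sxx                    # slope: usage added per request/second
    overhead = mean_y - per_request * mean_x   # intercept: usage at zero workload
    return per_request, overhead
```

Fitting the average samples and the peak samples separately gives the two lines per resource that the profile records.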
Predicting Throughput For any given placement: A server reaches maximum throughput when a resource saturates. Using component profiles, we predict resource consumption over all components on the server: CPU consumption = CPU per-request * Workload + CPU overhead. A workload is non-saturating iff CPU saturation > CPU per-request * Workload + CPU overhead. CPU max: largest non-saturating workload. TP = MIN[ CPU max, MEM max, NET max ] over all servers.
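The prediction rule can be sketched as follows. The sketch assumes that every resource follows the same linear form given for CPU, that all components on a server see the same request rate, and that per-component terms simply add up; `LinearUsage`, `max_workload`, and `predict_throughput` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearUsage:
    per_request: float  # usage added per unit of workload (requests/second)
    overhead: float     # usage at zero workload


def max_workload(capacity, usages):
    """Workload at which one resource on one server saturates.

    Solves capacity = sum(per_request) * W + sum(overhead) for W; any
    workload strictly below the result is non-saturating.
    """
    per_request = sum(u.per_request for u in usages)
    overhead = sum(u.overhead for u in usages)
    if overhead >= capacity:
        return 0.0            # saturated before any request arrives
    if per_request <= 0:
        return float("inf")   # this resource never saturates
    return (capacity - overhead) / per_request


def predict_throughput(servers):
    """TP = minimum saturation workload over all resources of all servers.

    servers: list of dicts mapping a resource name ("cpu", "mem", "net")
        to (capacity, [LinearUsage of each component placed on that server])
    """
    return min(
        max_workload(capacity, usages)
        for server in servers
        for capacity, usages in server.values()
    )
```

Running it once with average-based profiles and once with peak-based profiles gives an optimistic and a pessimistic estimate, respectively.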
Experimental Setup RUBiS [Amza et al., 2002]: auction benchmark modeled after eBay. 11 components: web server, database, and 9 Enterprise Java Beans. Two-machine setup: 1.26 GHz, 2 GB memory, JBoss, and MySQL. Business logic: no replication; static database and web server. Compare 4 placement strategies: All with Web, All with database, Writers with Web, and Profiler's choice.
Impact of Placement [Figure: throughput (requests/second) versus input workload (requests/second) for All with Web, All with database, Writers with Web, and Profiler's choice.] We Observe: Placement affects maximal throughput by 38%. Profiler's choice exceeds other strategies by %. We Conclude: Component placement can significantly affect performance. Component profiles can choose a good placement strategy.
Prediction Accuracy [Figure: throughput (requests/second) for the AllWeb, AllDB, WritersWeb, and Profiler placements, comparing pessimistic estimation, measurement result, and optimistic estimation.] We Observe: Throughput tends to fall between the peak and average predictions. The ranges are large. We Conclude: Component profile predictions are generally accurate. More accurate measurement tools are needed.
Future Work Improve prediction accuracy. Use profiles for other QoS metrics and service needs: response time, service differentiation, and capacity planning. Dynamic placement decisions: adjust to changing workloads online; support plug-and-play hardware modifications. Placement over wide area networks: extend edge servers for optimal performance.
Take Away Points 1. Component placement has a significant impact on performance in online services. 2. Component profiles capture resource consumption characteristics. 3. Resource consumption and throughput can be predicted via component profiles.
More Information