Automatic and Dynamic Accuracy Management and Resource Provisioning in a Cloud Environment
Smita Vijayakumar, Qian Zhu, Gagan Agrawal
Outline
  Background
    Data Streams
    Virtualization
    Dynamic Resource Allocation
    Accuracy Adaptation
  Research Goals
  CPU Resource Allocation
  Accuracy Management
  Experimental Evaluation
  Conclusion
Data Streams
  Sequence of data packets in transmission
  Examples: live camera captures, stock markets, video streaming applications, etc.
  Require real-time analysis
Data Stream Processing on Clouds and VMs
  Why clouds for streaming applications
    Pay-as-you-go model
    Meet dynamically varying demand
  Clouds are based on Virtual Machines (VMs)
    Software implementation of a machine that executes programs
    Hides hardware and software heterogeneity
    CPU can be shared between VMs
  Modes of operation
    Capped mode
    Non-capped mode
Resource Allocation: Guiding Principles
  Pay-as-you-go
  Automatic resource allocation
  Dynamic resource allocation
    Varying data rates and data characteristics both affect the resource requirement
    Amount of data and its complexity determine the CPU requirement
Research Goals
  Framework for providing accuracy and resource management in a cloud environment
  Accuracy Management
    Converge to the application-specific accuracy goal
    Maintain the user-specified accuracy requirement for the entire duration of the run
  CPU Management
    Converge to near-optimal resource allocation by constantly monitoring load characteristics
An Adaptive Application: Average of Input Stream
  Random stream of integers
  Application considers every third integer in the stream to compute the average
  Adaptive parameter: sample = 1/3
  If higher accuracy is desired, sample can be set to ½ or 1
  But that requires more CPU resources
  Example stream: 1 2 7 4 2 5 8 2 6 8 0 4 3 4 8 2 2 6 7 3 4 3 1 3 6 8 3 2 5 9 9 3 4 6 8 ..
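The sampling idea on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `sampled_average` and the choice of taking the last integer of each group of `step` are assumptions, since the slide does not say which element of every three is used.

```python
from typing import Sequence


def sampled_average(values: Sequence[int], step: int) -> float:
    """Average of every `step`-th value, i.e. adaptive parameter sample = 1/step.

    Assumption: the last element of each group of `step` values is the one kept.
    """
    picked = values[step - 1::step]
    if not picked:
        raise ValueError("batch is shorter than the sampling step")
    return sum(picked) / len(picked)


# First twelve integers of the example stream from the slide.
batch = [1, 2, 7, 4, 2, 5, 8, 2, 6, 8, 0, 4]

coarse = sampled_average(batch, 3)  # sample = 1/3: touches 4 of 12 values
full = sampled_average(batch, 1)    # sample = 1: touches all 12 values
```

The coarse setting reads a third of the data, which is where the CPU saving comes from; the full setting is more accurate but does three times the work.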
Accuracy in Data Stream Processing
  Accuracy-specific processing: the user-defined processing accuracy should be met
  Accuracy changes according to input data characteristics
  Changes require corresponding resource allocations
  Final cost is determined by the amount of resources allocated over time
Calculating Current Application Accuracy
  Application developer provides an accuracy function
  Many methods of calculating accuracy:
    Direct comparison with input data
      Not always viable
    Correlation with more fine-grained processing
      Process data with the current adaptive parameters
      Process the same data set with the adaptive parameter set to greater accuracy
      Compare results
Example of Accuracy in Adaptive Application
  Process batch with current value sample = 1/3
  For the same data set, set sample = 1 and find the new average
  Accuracy = f(avg, higher_avg)
  If Accuracy < Accuracy Goal, set sample = 1/2
  Repeat, adapting sample
  Example stream: 1 2 7 4 2 5 8 2 6 8 0 4 3 4 8 2 2 6 7 3 4 3 1 3 6 8 3 2 5 9 9 3 4 6 8 ..
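A minimal sketch of this adaptation step follows. The slide leaves f unspecified, so the accuracy function below (one minus the relative error against the sample = 1 result) is a hypothetical choice, as are the names `accuracy` and `adapt_sample`. For brevity the sketch tries the settings 1/3, 1/2, 1 on a single batch, whereas the slide repeats the adaptation over successive batches.

```python
def sampled_average(values, step):
    """Average of every `step`-th value (sample = 1/step)."""
    picked = values[step - 1::step]
    return sum(picked) / len(picked)


def accuracy(avg, higher_avg):
    """Hypothetical f(avg, higher_avg): 1 minus relative error, clipped at 0."""
    if higher_avg == 0:
        return 1.0 if avg == 0 else 0.0
    return max(0.0, 1.0 - abs(avg - higher_avg) / abs(higher_avg))


def adapt_sample(batch, goal, steps=(3, 2, 1)):
    """Return (step, accuracy) for the coarsest setting that meets the goal."""
    reference = sampled_average(batch, 1)  # finer-grained processing, sample = 1
    acc = 0.0
    for step in steps:
        acc = accuracy(sampled_average(batch, step), reference)
        if acc >= goal:
            break
    return step, acc


batch = [1, 2, 7, 4, 2, 5, 8, 2, 6, 8, 0, 4]
step, acc = adapt_sample(batch, goal=0.95)
```

With this batch and a goal of 0.95, sample = 1/3 falls short and the sketch settles on sample = 1/2, mirroring the step described on the slide.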
CPU Allocation Algorithm
  Monitor current load statistics
    Buffer write time
    Processing time
  Time-averaged rates
    Average data rate over a time window
  Update CPU allocation
    Time-averaged pattern indicates a decrease or increase in data flow
  Continuous monitoring and action
    Arrive at a near-optimal CPU allocation
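One way to picture the monitoring step is a sliding window over the two statistics named above. The sketch below is an assumption-laden illustration, not the authors' implementation: the class `LoadMonitor`, the window size, the tolerance, and the reading that "processing time longer than buffer write time means the application is falling behind" are all hypothetical.

```python
from collections import deque


class LoadMonitor:
    """Keeps a time window of per-block buffer write and processing times."""

    def __init__(self, window=5):
        self.write_times = deque(maxlen=window)
        self.proc_times = deque(maxlen=window)

    def record(self, write_time, proc_time):
        self.write_times.append(write_time)
        self.proc_times.append(proc_time)

    def load_ratio(self):
        """Time-averaged processing time divided by time-averaged write time."""
        total_write = sum(self.write_times)
        if total_write == 0:
            raise ValueError("no samples recorded yet")
        # Both windows have equal length, so the ratio of sums equals
        # the ratio of averages.
        return sum(self.proc_times) / total_write

    def trend(self, tolerance=0.1):
        """Classify the windowed load as needing more, less, or the same CPU."""
        ratio = self.load_ratio()
        if ratio > 1 + tolerance:
            return "increase"
        if ratio < 1 - tolerance:
            return "decrease"
        return "hold"


monitor = LoadMonitor(window=3)
for _ in range(3):
    monitor.record(write_time=1.0, proc_time=1.5)  # processing lags arrival
first_trend = monitor.trend()
```

Averaging over a window rather than reacting to each block keeps a single slow block from triggering a reallocation.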
CPU Allocation Algorithm
  Resource allocation adjustments:
    Coarse multiplicative increase
    Fine linear increase
    Fine linear decrease
    Coarse linear decrease
  Inspired by TCP congestion control
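The four adjustment types can be sketched as a single update rule. Everything numeric here is an assumption made for illustration: the slide does not give thresholds, step sizes, or the multiplicative factor, and `adjust_cap` and `load_ratio` are hypothetical names (a ratio above 1 is taken to mean the current allocation is too small).

```python
def adjust_cap(cap, load_ratio, *, coarse_band=0.25, fine_band=0.02,
               factor=1.5, fine_step=1.0, coarse_step=5.0,
               min_cap=1.0, max_cap=100.0):
    """Return a new CPU cap (in percent) given the observed load ratio."""
    if load_ratio > 1 + coarse_band:
        new_cap = cap * factor        # coarse multiplicative increase
    elif load_ratio > 1 + fine_band:
        new_cap = cap + fine_step     # fine linear increase
    elif load_ratio < 1 - coarse_band:
        new_cap = cap - coarse_step   # coarse linear decrease
    elif load_ratio < 1 - fine_band:
        new_cap = cap - fine_step     # fine linear decrease
    else:
        new_cap = cap                 # close enough: leave the cap alone
    return min(max_cap, max(min_cap, new_cap))


cap = 20.0
cap = adjust_cap(cap, load_ratio=1.5)  # badly under-provisioned: 20 -> 30
cap = adjust_cap(cap, load_ratio=1.1)  # slightly under-provisioned: 30 -> 31
```

As in TCP congestion control, the large multiplicative step closes a big gap quickly, while the linear steps avoid overshooting once the allocation is near the target.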
CPU Allocation Algorithm
  [Flowchart: decision "Met Accuracy Goal?" (Yes/No); decision "Met Allocation Needs?" (Yes/No); actions "Adjust CPU Allocation" and "Sleep and awaken periodically"]
Accuracy Management
  Checks the accuracy level periodically
  Re-computes application accuracy
  If it is less than the specified value:
    Adjust adaptive parameters
    Repeat
  Once the target accuracy is achieved, wakes up after every 500 rounds of processing
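The check-then-sleep behaviour described above might look like the following sketch. The class `AccuracyManager` and its callback interface are hypothetical; only the 500-round re-check interval comes from the slide. Accuracy is held as an integer percentage in the usage example to keep the comparison exact.

```python
class AccuracyManager:
    """Checks accuracy every round until the goal is met, then every N rounds."""

    def __init__(self, goal, measure, adjust, recheck_interval=500):
        self.goal = goal
        self.measure = measure    # callable returning current accuracy
        self.adjust = adjust      # callable that raises the adaptive parameters
        self.recheck_interval = recheck_interval
        self.met = False
        self.rounds_since_check = 0

    def on_round(self):
        """Call once per round of processing; returns True if a check ran."""
        self.rounds_since_check += 1
        if self.met and self.rounds_since_check < self.recheck_interval:
            return False          # goal met recently: keep sleeping
        self.rounds_since_check = 0
        self.met = self.measure() >= self.goal
        if not self.met:
            self.adjust()
        return True


# Toy application state: accuracy in percent, improved by 10 per adjustment.
app = {"accuracy": 70}
manager = AccuracyManager(
    goal=90,
    measure=lambda: app["accuracy"],
    adjust=lambda: app.update(accuracy=app["accuracy"] + 10),
)
checks_run = sum(manager.on_round() for _ in range(503))
```

Here the manager checks on rounds 1 to 3 until the goal is reached, then stays idle for 499 rounds and checks again on round 503.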
Accuracy Adaptation: Design
  [Flowchart: Get Current Application Accuracy, then decision "Met Accuracy Goal?"; Yes: Sleep and awaken periodically; No: Adjust Adaptive Parameters]
Interaction between Components
  Process data block
  If baseline accuracy is not met
    Accuracy Module adapts until accuracy is met
    State: Accuracy Met
  Else, periodically monitor accuracy
  Periodically, the CPU Manager wakes up
    Checks if the accuracy goal is met
    Checks the CPU resource allocation
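The interplay of the two components can be illustrated with a toy driver loop. This is a sketch under assumptions: the function `run_pipeline`, its callbacks, and the rule that the CPU manager only adjusts the allocation once the accuracy goal has been met are a reading of the slides, not a confirmed design, and the periodic accuracy re-check after the goal is met is left out for brevity.

```python
def run_pipeline(num_blocks, goal, measure_accuracy, adapt_accuracy,
                 cpu_ok, adjust_cpu, cpu_period=10):
    """Process blocks and return a log of (block, action) pairs."""
    log = []
    accuracy_met = False
    for block in range(1, num_blocks + 1):
        # A real system would process the data block here.
        if not accuracy_met:
            if measure_accuracy() >= goal:
                accuracy_met = True
                log.append((block, "accuracy met"))
            else:
                adapt_accuracy()
                log.append((block, "adapt accuracy"))
        if block % cpu_period == 0:
            # CPU manager wakes up: acts only when accuracy is already met.
            if accuracy_met and not cpu_ok():
                adjust_cpu()
                log.append((block, "adjust cpu"))
    return log


# Toy state: accuracy in percent, CPU cap in percent, 12% cap is sufficient.
state = {"accuracy": 70, "cap": 10}
log = run_pipeline(
    num_blocks=8,
    goal=90,
    measure_accuracy=lambda: state["accuracy"],
    adapt_accuracy=lambda: state.update(accuracy=state["accuracy"] + 10),
    cpu_ok=lambda: state["cap"] >= 12,
    adjust_cpu=lambda: state.update(cap=state["cap"] + 1),
    cpu_period=2,
)
```

Holding CPU changes back until accuracy has settled avoids tuning the allocation for a workload that is about to change as the adaptive parameters move.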
Outline
  Research Objectives
  Introduction to Cost Framework
  CPU Resource Allocation
  Accuracy Management
  Experimental Evaluation
    Static Experiments
    Dynamic Experiments
  Conclusion
Experimental Focus
  Experiments
    Static experiments: constant data rate and characteristics
    Dynamic experiments: varying data rates and/or characteristics
  Process
    Accuracy adaptation to the user-specified accuracy
    CPU convergence to near-optimal allocation
Streaming Applications
  Multi-staged pipelined processing
  Two streaming applications considered:
    CluStream: intermediate micro-clustering of data
    Approx-Freq-Counts: mining the most frequently seen itemsets within a permissible error
Experimental Setup
  Virtualization technology: Xen
  Ideal CPU usage: Xentop
  Applications initialized to values corresponding to least accuracy
  Communication between the management node and processing nodes uses UDP
CluStream Static Accuracy Adaptation
  [Figure: accuracy adaptation for 1.2 MBps and 6 MBps data rates]
CluStream Static Accuracy Adaptation
  [Figure: accuracy and CPU allocation adaptation for 1.2 MBps and 6 MBps data rates]
  Panel annotations:
    Ideal CPU load: 74%; average CPU allocated: 76.0%
    Ideal CPU load: 54%; average CPU allocated: 55.4%
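As a quick check on the two annotations, the gap between allocated and ideal CPU can be computed directly. The sketch assumes the overhead is meant as a difference in percentage points; the helper name `overhead_points` is hypothetical.

```python
# (ideal CPU load %, average CPU allocated %) as reported on the slide.
runs = [(74.0, 76.0), (54.0, 55.4)]


def overhead_points(ideal, allocated):
    """Over-allocation in percentage points, rounded to one decimal place."""
    return round(allocated - ideal, 1)


overheads = [overhead_points(ideal, allocated) for ideal, allocated in runs]
```

Both runs come out at no more than 2 percentage points above the ideal load.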
Approx-Freq-Counts Dynamic Accuracy Adaptation
  [Figure: input phases in order: spread distribution, sharp distribution, spread distribution]
Approx-Freq-Counts Dynamic Accuracy Adaptation
  [Figure: input phases in order: sharp distribution with slow data rate, spread distribution with fast data rate, sharp distribution with slow data rate]
Conclusion
  A framework for automatically and dynamically managing resource allocations in cloud environments
  Eliminates manual intervention
  Ensures the user-specified accuracy is maintained
  Converges to near-optimal resource allocation
  Adapts to varying data stream characteristics
  Low overheads: within 2% of ideal resource allocation
Thank You!