1 Measuring Quality of Service on Worker Node in Cluster
CHEP 06, 15/02/2006
Rohitashva Sharma, R S Mundada, Sonika Sachdeva, P S Dhekne, Computer Division, BARC, Mumbai, India
Helge Mainhard, Tony Cass, Olof Barring, CERN, Geneva, Switzerland

2 INTRODUCTION
- Quality of Service (QoS)
  - Defines goodness of a node for a type of task
  - Needed for better/optimum utilization of resources
- Computer Division, BARC and IT Division, CERN collaborated to explore ways to predict QoS

3 QoS – Definition
- QoS defines how well suited a node is for a given task
- QoS relates the two execution times defined below (the equation image is not part of this transcript)
- QoS varies between 0 and 1
  T_execution = wall clock execution time of the task on the node under its current load
  T_noload = wall clock execution time of the task on the same node without load
  QoS = Quality of Service
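A minimal sketch of how the two times could be related. The ratio form QoS = T_noload / T_execution is an assumption inferred from the definitions and the stated 0 to 1 range, not a copy of the slide's equation; the function names are hypothetical.

```python
def qos(t_noload: float, t_execution: float) -> float:
    """Assumed relation: ratio of no-load to loaded wall clock time."""
    if t_noload <= 0 or t_execution <= 0:
        raise ValueError("execution times must be positive")
    # Clamp to 1.0 so measurement noise cannot push QoS above the stated range.
    return min(1.0, t_noload / t_execution)


def predicted_execution_time(t_noload: float, qos_value: float) -> float:
    """Invert the assumed relation to estimate the loaded wall clock time."""
    if not 0 < qos_value <= 1:
        raise ValueError("QoS must be in (0, 1]")
    return t_noload / qos_value


# Hypothetical numbers: a task that needs 10 s unloaded and 40 s under load.
example_qos = qos(10.0, 40.0)
example_time = predicted_execution_time(10.0, example_qos)
```

Under this assumption a QoS of 1 means the node runs the task as fast as it would when idle, and smaller values mean proportionally longer runs.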

4 Methodology
- Three task categories
  - CPU intensive
  - Disk IO intensive
  - Network IO intensive
- Representative probe programs for each category
- Load generating program for each category

5 Methodology
- Monitor system metrics
  - Load average, CPU utilization, memory utilization, disk utilization, swap utilization, etc.
- Execute probe programs in different load conditions (generated using load generating programs)
- Correlate probe execution time, system metrics and no-load execution time of probe

6 Probe Selection
- Probe should
  - Represent real world applications
  - Have a short execution time
  - Be non-interactive
- Selected probes are
  - Linpack for CPU intensive
  - Bonnie for disk IO intensive
  - Network IO intensive (not considered)

7 Load Generating Programs
- Generate load in a given category
- Should have a long execution time
- Feature for varying the load
- Two types of disk IO load
  - Block IO (IO in large data blocks)
  - Character IO (IO in small data blocks)

8 SETUP
- 32 node cluster
- Each node consists of
  - P4 @ 1.6 GHz
  - 640 MB memory
  - 40 GB HDD
  - Redhat Linux version 7.3
- EDG Fabric Monitoring System for gathering system metrics

9 CPU Probe
- CPU probe in different loading conditions
- Correlation using load average
- Execution time varies linearly with load average
- Problem in block IO load
  (Equation 1)
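The linear dependence of execution time on load average can be checked with an ordinary least-squares fit. The sketch below is illustrative only: the sample values are made up, `fit_line` is a hypothetical helper, and it does not reproduce Equation 1.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical probe runs: load average vs. wall clock time in seconds.
load_average = [0.0, 1.0, 2.0, 3.0]
execution_time = [10.0, 20.0, 30.0, 40.0]
slope, intercept = fit_line(load_average, execution_time)
```

In such a fit the intercept estimates the no-load execution time of the probe, and the slope gives the extra time per unit of load average.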

10 CPU Probe

11 CPU Probe
- Load average represents combined CPU and IO load
- CPU probe depends only on CPU load
- Two ways to account for this
  - Average CPU load (VmStatR)
  - Calculate available CPU to probe

12 CPU Probe
- Average CPU load
  - 1 minute running average of the run queue
  - Called VmStatR
  - Predicted QoS will be (Equation 2)
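A running average of this kind can be sketched as a fixed-size window over periodic run-queue samples. This is an assumption about how such a metric might be computed, not the monitoring system's implementation: the class name, the 5 second sampling interval, and the idea of feeding it the `r` column of `vmstat` are all illustrative. Equation 2 itself is not reproduced here.

```python
from collections import deque


class RunQueueAverage:
    """Windowed mean of run-queue length samples (hypothetical helper)."""

    def __init__(self, window_seconds: int = 60, sample_interval_seconds: int = 5):
        if sample_interval_seconds <= 0 or window_seconds < sample_interval_seconds:
            raise ValueError("window must span at least one sample interval")
        # deque(maxlen=...) silently drops the oldest sample once the window is full.
        self._samples = deque(maxlen=window_seconds // sample_interval_seconds)

    def add(self, run_queue_length: float) -> None:
        self._samples.append(run_queue_length)

    def value(self) -> float:
        if not self._samples:
            raise ValueError("no samples recorded yet")
        return sum(self._samples) / len(self._samples)


# Hypothetical samples with a 3-sample window (60 s window, 20 s interval).
avg = RunQueueAverage(window_seconds=60, sample_interval_seconds=20)
for sample in (1, 2, 3):
    avg.add(sample)
first_value = avg.value()
avg.add(7)  # the oldest sample (1) leaves the window
second_value = avg.value()
```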

13 CPU Probe

14 CPU Probe
- Available CPU to probe
  - Calculated using the CPU utilization metric
  - Probe is eligible for
    - Available idle time
    - A share of system and user time
  (Equation 3)
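One way to picture "idle time plus a share of busy time" is a fair-share estimate. The sketch below is not Equation 3: it assumes the busy (user + system) time is split equally between the probe and the competing runnable tasks, and the function and parameter names are hypothetical.

```python
def available_cpu_fraction(idle: float, user: float, system: float,
                           competing_tasks: int) -> float:
    """Estimate the CPU fraction a new probe could obtain.

    Assumption: the probe gets all idle time plus an equal share of the
    busy time, which it splits with `competing_tasks` other runnable tasks.
    Inputs are fractions of total CPU time in the range 0 to 1.
    """
    for name, value in (("idle", idle), ("user", user), ("system", system)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if idle + user + system > 1.0 + 1e-9:
        raise ValueError("fractions must not sum to more than 1")
    if competing_tasks < 0:
        raise ValueError("competing_tasks must be non-negative")
    busy = user + system
    return idle + busy / (competing_tasks + 1)


# Hypothetical node: 20% idle, 60% user, 20% system, three competing tasks.
estimate = available_cpu_fraction(0.2, 0.6, 0.2, competing_tasks=3)
```

Because the estimate is built from CPU utilization only, time that other tasks spend waiting on disk IO does not lower it, which is the property the slide asks for.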

15 CPU Probe
- Table shows the comparison between QoS predicted using Equations 1 and 3 in block IO load
- QoS using Equation 3 shows the correct characteristic

QoS using Equation 1   QoS using Equation 3   Execution Time (Sec)
0.2433                 0.4300487              32
0.1605                 0.4375441              31
0.1329                 0.4624468              32
0.1136                 0.415                  30
0.1042                 0.4536079              31
0.0952                 0.4290476              30
0.0869                 0.4430435              31

16 Comparison of results
- Compare the QoS results obtained using the three equations for the CPU probe in different loads
  - Equation 1 does not give correct prediction in block IO load conditions
  - Equations 2 and 3 give acceptable results in any load condition

17 CPU Probe – Comparison of results
  LC – CPU Load
  LC + LB – CPU + Block IO Load
  LC + LCh – CPU + Character IO Load
  LCh + LB – Character + Block IO Load

18 Disk IO Probe
- Modified 'Bonnie' to act as both a block IO and a character IO probe
- Considered the block IO probe, as most of the applications were block IO intensive
- Correlated probe execution time under different loading conditions
- Predicted QoS using the three equations and compared results

19 Disk IO Probe – Comparison of results
  LC – CPU Load
  LC + LB – CPU + Block IO Load
  LC + LCh – CPU + Character IO Load
  LCh + LB – Character + Block IO Load

20 CMSIM Results
- Predicted execution time using QoS from Equation 2
- % error against the measured one is acceptable

Measured Execution Time (Sec)   Predicted Execution Time (Sec)   % Error
585                             610.8687                         4.422
739                             744.3209                         0.720017
929                             934.466                          0.588377
1082                            1080.702                         -0.11999
1230                            1216.43                          -1.10328
1413                            1381.166                         -2.25294
1687                            1707.317                         1.204332
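The % error column can be recomputed from the other two columns. The sketch below uses the tabulated values; the sign convention (predicted minus measured, divided by measured) is inferred from the numbers in the table rather than stated on the slide, and the variable names are illustrative.

```python
# Measured and predicted CMSIM execution times in seconds, from the table.
measured = [585, 739, 929, 1082, 1230, 1413, 1687]
predicted = [610.8687, 744.3209, 934.466, 1080.702, 1216.43, 1381.166, 1707.317]


def percent_error(measured_s: float, predicted_s: float) -> float:
    """Signed prediction error as a percentage of the measured time."""
    return (predicted_s - measured_s) / measured_s * 100.0


errors = [percent_error(m, p) for m, p in zip(measured, predicted)]
max_abs_error = max(abs(e) for e in errors)

for m, p, e in zip(measured, predicted, errors):
    print(f"{m:>6} {p:>10.3f} {e:>8.3f}")
```

The largest deviation is the first row, at a little under 4.5%; all other rows are within about 2.3% of the measured time.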

21 Problem Areas
- Effect of swapping
  - Occurs if available memory is less than the size of the task
  - Linux kernel dynamically changes the priorities of tasks and swaps tasks accordingly
  - Difficult to predict QoS

22 Problem Areas – Swapping

23 Problem Areas
- Metric sampling frequency of monitoring system
  - An up-to-date metric value gives better QoS prediction
  - At higher sampling frequency, monitoring itself loads the node
- Change in state after submission of task
  - QoS can't account for load changes after submission of the task
  - Submission/removal of other tasks may change QoS

24 Conclusion
- Equations 2 and 3 provide better QoS predictions for CPU-bound applications
- Equation 1 can be used for IO-bound applications
- Execution time successfully predicted for CMSIM, a mostly CPU-bound job
- Load balancing programs can use the derived equations for job submissions
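As an illustration of the last point, a load balancer could rank worker nodes by predicted QoS and submit to the best one. This is a hypothetical sketch: the node names and QoS values are made up, `pick_node` is not part of any scheduler, and how each QoS value is obtained is left to whichever equation fits the job type.

```python
from typing import Mapping


def pick_node(qos_by_node: Mapping[str, float]) -> str:
    """Return the node with the highest predicted QoS."""
    if not qos_by_node:
        raise ValueError("no candidate nodes")
    for node, value in qos_by_node.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"QoS for {node} must be between 0 and 1")
    return max(qos_by_node, key=lambda node: qos_by_node[node])


# Hypothetical predictions for three worker nodes.
candidates = {"node01": 0.43, "node02": 0.81, "node03": 0.25}
chosen = pick_node(candidates)
```

Since QoS can change after submission, a real scheduler would likely refresh these predictions shortly before each placement decision.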

25 Thanks