15/02/2006, CHEP 06: Measuring Quality of Service on Worker Node in Cluster. Rohitashva Sharma, R S Mundada, Sonika Sachdeva, P S Dhekne, Computer Division,

Presentation transcript:

Slide 1: Measuring Quality of Service on Worker Node in Cluster
CHEP 06, 15/02/2006
Rohitashva Sharma, R S Mundada, Sonika Sachdeva, P S Dhekne, Computer Division, BARC, Mumbai, India
Helge Mainhard, Tony Cass, Olof Barring, CERN Geneva, Switzerland

Slide 2: INTRODUCTION
- Quality of Service (QoS)
  - Defines the goodness of a node for a type of task
  - Needed for better/optimum utilization of resources
- Computer Division, BARC, and IT Division, CERN, collaborated to explore ways to predict QoS

Slide 3: QoS – Definition
- QoS defines how good a node is for a given task
- QoS relates the two execution times defined below
- QoS varies between 0 and 1
Definitions:
  T_execution = wall-clock execution time for any task
  T_noload = wall-clock execution time of the task on a given node without load
  QoS = Quality of Service
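Read together with these definitions, and assuming QoS is the ratio of the no-load time to the execution time (an assumption, chosen because it stays within the stated 0-to-1 range whenever the task runs no faster under load than without it), the relation could be written as:

```latex
\[
\mathrm{QoS} = \frac{T_{\text{noload}}}{T_{\text{execution}}},
\qquad 0 < \mathrm{QoS} \le 1 \quad \text{when } T_{\text{execution}} \ge T_{\text{noload}}
\]
```

Under this reading, a task that takes 100 s on an idle node and 125 s on a loaded one would see QoS = 100 / 125 = 0.8.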

Slide 4: Methodology
- Three task categories
  - CPU intensive
  - Disk IO intensive
  - Network IO intensive
- Representative probe programs for each category
- Load-generating program for each category

Slide 5: Methodology
- Monitor system metrics
  - Load average, CPU utilization, memory utilization, disk utilization, swap utilization, etc.
- Execute the probe programs under different load conditions (generated using the load-generating programs)
- Correlate the probe's execution time, the system metrics, and the probe's no-load execution time

Slide 6: Probe Selection
- A probe should
  - Represent real-world applications
  - Have a short execution time
  - Be non-interactive
- Selected probes
  - Linpack for CPU intensive
  - Bonnie for disk IO intensive
  - Network IO intensive: not considered

Slide 7: Load Generating Programs
- Generate load in a given category
- Should have a long execution time
- Should allow the load to be varied
- Two types of disk IO load
  - Block IO (IO in large data blocks)
  - Character IO (IO in small data blocks)

Slide 8: SETUP
- 32-node cluster
- Each node consists of
  - ... GHz
  - 640 MB memory
  - 40 GB HDD
  - Redhat Linux version 7.3
- EDG Fabric Monitoring System for gathering system metrics

Slide 9: CPU Probe
- CPU probe run under different loading conditions
- Correlation using load average
- Execution time varies linearly with load average
- Problem in block IO load
(Equation 1)
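The linear relation between execution time and load average can be illustrated with a small least-squares fit. This is only a sketch under stated assumptions, not the slide's Equation 1: the sample measurements are invented, the helper names `fit_line` and `predict_qos` are hypothetical, and QoS is assumed to be the ratio of no-load time to predicted time.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_qos(load_avg, intercept, slope):
    """Assumed QoS: fitted no-load time divided by fitted time at load_avg."""
    t_noload = intercept                      # fitted time at load average 0
    t_predicted = intercept + slope * load_avg
    return t_noload / t_predicted


# Hypothetical probe measurements (load average, execution time in seconds).
loads = [0.0, 1.0, 2.0, 3.0]
times = [10.0, 20.0, 30.0, 40.0]

intercept, slope = fit_line(loads, times)
qos_at_1 = predict_qos(1.0, intercept, slope)
```

With real data, the fit would be repeated per load type, which is where a metric mixing CPU and IO load could be expected to mislead for a CPU-only probe.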

Slide 10: CPU Probe (figure)

Slide 11: CPU Probe
- Load average represents the combined CPU and IO load
- The CPU probe depends only on CPU load
- Two ways to isolate the CPU load
  - Average CPU load (VmStatR)
  - Calculate the CPU available to the probe

Slide 12: CPU Probe
- Average CPU load
  - 1-minute running average of the run queue
  - Called VmStatR
  - Predicted QoS will be
(Equation 2)
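A 1-minute running average of the run queue can be sketched as a fixed-size sliding window. The assumptions here are one run-queue sample per second (so a 60-sample window spans one minute) and a hypothetical class name `RunQueueAverage`; how the monitoring system actually sampled the run queue is not stated.

```python
from collections import deque


class RunQueueAverage:
    """Sliding-window mean of run-queue length samples."""

    def __init__(self, window=60):
        # deque with maxlen drops the oldest sample automatically
        self.samples = deque(maxlen=window)

    def add(self, run_queue_length):
        """Record one sample and return the current windowed average."""
        self.samples.append(run_queue_length)
        return sum(self.samples) / len(self.samples)


# Small window used only to keep the example short.
avg = RunQueueAverage(window=3)
readings = [avg.add(r) for r in [1, 2, 3, 7]]
```

Because only runnable processes are counted, tasks blocked on disk IO would not inflate this metric the way they can inflate the load average.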

Slide 13: CPU Probe (figure)

Slide 14: CPU Probe
- CPU available to the probe
  - Calculated using the CPU utilization metric
  - The probe is eligible for
    - The available idle time
    - A share of the system and user time
(Equation 3)
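One way to picture "idle time plus a share of system and user time" is shown below. This is not the slide's Equation 3: it assumes the busy time is split equally between the probe and the tasks already competing for the CPU, and the function name and parameters are hypothetical.

```python
def available_cpu_fraction(user_pct, system_pct, idle_pct, competing_tasks):
    """Estimate the CPU fraction a new probe could obtain.

    Assumption: the probe gets all idle time plus an equal share of the
    currently busy (user + system) time among competing_tasks + 1 tasks.
    """
    if competing_tasks < 0:
        raise ValueError("competing_tasks must be non-negative")
    busy_pct = user_pct + system_pct
    probe_share_pct = busy_pct / (competing_tasks + 1)
    return (idle_pct + probe_share_pct) / 100.0


# Hypothetical utilization: 60% user, 20% system, 20% idle, one competing task.
fraction = available_cpu_fraction(60.0, 20.0, 20.0, competing_tasks=1)
```

Under the same assumption, this fraction could serve directly as a QoS estimate for a CPU-bound task, since such a task slows down roughly in proportion to the CPU it is denied.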

Slide 15: CPU Probe
- The table compares the QoS predicted using Equations 1 and 3 under block IO load
- QoS using Equation 3 shows the correct characteristic
Table columns: QoS using Equation 1 | QoS using Equation 3 | Execution Time (sec)

Slide 16: Comparison of results
- Compare the QoS results obtained using the three equations for the CPU probe under different loads
  - Equation 1 does not give a correct prediction under block IO load
  - Equations 2 and 3 give acceptable results under any load condition

Slide 17: CPU Probe – Comparison of results (figure)
Legend:
  LC = CPU load
  LC + LB = CPU + block IO load
  LC + LCh = CPU + character IO load
  LCh + LB = Character + block IO load

Slide 18: Disk IO Probe
- Modified 'Bonnie' to act as both a block IO probe and a character IO probe
- Considered the block IO probe, as most of the applications were block IO intensive
- Correlated the probe's execution time under different loading conditions
- Predicted QoS using the three equations and compared the results

Slide 19: Disk IO Probe – Comparison of results (figure)
Legend:
  LC = CPU load
  LC + LB = CPU + block IO load
  LC + LCh = CPU + character IO load
  LCh + LB = Character + block IO load

Slide 20: CMSIM Results
- Predicted the execution time using the QoS from Equation 2
- The percentage error against the measured execution time was acceptable
Table columns: Measured Execution Time (sec) | Predicted Execution Time (sec) | % Error
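Turning a QoS value into a predicted execution time, and scoring it against a measurement, might look like the sketch below. It assumes QoS is the ratio of no-load time to execution time, so the prediction is the no-load time divided by QoS. The numbers are invented for illustration and are not CMSIM results; both function names are hypothetical.

```python
def predicted_time(t_noload, qos):
    """Predicted wall-clock time, assuming QoS = t_noload / t_execution."""
    if not 0 < qos <= 1:
        raise ValueError("qos must be in the interval (0, 1]")
    return t_noload / qos


def percent_error(measured, predicted):
    """Absolute prediction error as a percentage of the measured time."""
    return abs(measured - predicted) / measured * 100.0


# Hypothetical job: 100 s on an idle node, predicted QoS of 0.8.
t_pred = predicted_time(100.0, 0.8)
err = percent_error(measured=125.0, predicted=100.0)
```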

Slide 21: Problem Areas
- Effect of swapping
  - Occurs when the available memory is less than the size of the task
  - The Linux kernel dynamically changes the priorities of tasks and swaps tasks accordingly
  - This makes QoS difficult to predict

Slide 22: Problem Areas – Swapping (figure)

Slide 23: Problem Areas
- Metric sampling frequency of the monitoring system
  - Up-to-date metric values give a better QoS prediction
  - At a higher sampling frequency, the monitoring itself loads the node
- Change in state after submission of a task
  - The QoS prediction cannot account for load changes that occur after the task is submitted
  - Submission or removal of other tasks may change the QoS

Slide 24: Conclusion
- Equations 2 and 3 provide better QoS predictions for CPU-bound applications
- Equation 1 can be used for IO-bound applications
- Execution time was successfully predicted for CMSIM, which is a mostly CPU-bound job
- Load balancing programs can use the derived equations for job submission
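As a sketch of how a load balancer might use per-node QoS predictions at submission time, the snippet below simply picks the node with the highest predicted QoS. The node names, QoS values, and the function `pick_node` are hypothetical; a real scheduler would also have to handle stale metrics and load changes after submission.

```python
def pick_node(qos_by_node):
    """Return the name of the node with the highest predicted QoS."""
    if not qos_by_node:
        raise ValueError("no candidate nodes")
    return max(qos_by_node, key=qos_by_node.get)


# Hypothetical predictions for a CPU-bound job on three worker nodes.
predictions = {"node01": 0.40, "node02": 0.90, "node03": 0.70}
best = pick_node(predictions)
```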

Slide 25: Thanks