Grid performance analysis: directions, issues and open problems
Zsolt Németh
MTA SZTAKI Computer and Automation Research Institute

Outline
• What is the grid?
• What is grid performance?
• Elementary problems of grid performance evaluation
  - Directions
  - Issues
  - Open questions

Distributed applications
• A set of cooperative processes

Distributed applications
• Processes require resources
  - CPU, memory, network, printer, storage, database, libraries, I/O devices

Distributed applications
• Resources can be found on computational nodes
[Figure: resources (CPU, memory, network, printer, storage, database, libraries, I/O devices) mapped onto computational nodes]

Distributed applications
• Application processes are mapped onto computational nodes
• Computational nodes
  - Form a loosely coupled computer system
  - Interact via messages

Distributed applications
• Application: cooperative processes
• Open questions between the two layers: process control? Security? Naming? Communication? Input/output? File access?
• Physical layer: computational nodes

Distributed applications
• Application: cooperative processes
• Virtual machine: process control, security, naming, communication, input/output, file access
• Physical layer: computational nodes

Conventional distributed environments and grids
• Distributed resources are virtually unified by a software layer
  - A virtual machine is introduced between the application and the physical layer
  - Provides a single system image to the application
• Types
  - “Conventional” (PVM, some implementations of MPI)
  - Grid (Globus, Legion)

Conventional distributed environments and grids
• What is the essential difference?

Conventional distributed environments and grids
• Geographical extent?

Conventional distributed environments and grids
• Performance?

Conventional distributed environments and grids
• Tools and services?

Conventional distributed environments and grids
• How is the virtual machine built up?
• What does execution mean?
• What are the semantics of execution?

Modeling the semantics
• Abstract State Machines (ASM)
  1. Model for a distributed application assuming a conventional environment
  2. The same model (with minimal modifications) assuming a grid
• If the latter model works, there are no real differences
• If it does not work, what are the fundamental differences?
• The semantic differences derived from the formal model are presented

Conventional environments
• Physical level
  - Set of nodes (node = collection of resources)
  - Login access
  - Static
• Virtual machine
  - Constructed on a priori information
• Processes
  - Have resource requests
• Mapping
  - Processes are mapped onto nodes
  - Resource assignment is implicit

Description of grid
• “flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources” (The anatomy of the grid)
• “single, seamless, computational environment in which cycles, communication and data are shared” (Legion: the Next Step Toward a Nationwide Virtual Computer)
• “wide-area environment that transparently consists of workstations, personal computers, graphic rendering engines, supercomputers and nontraditional devices” (Legion - A View from 50,000 Feet)
• “collection of geographically separated resources connected by a high speed network”, “a software layer which transforms a collection of independent resources into a single, coherent virtual machine” (Metacomputing - What’s in it for me)

Grid
• Physical layer
  - Set of resources
  - Shared
  - Dynamic
• Virtual machine
  - Consists of the selected resources
• Processes
  - Have resource requirements
• Mapping
  - Resources are assigned to processes
  - Assign nodes to resources?

Grid: the resource abstraction
• Physical layer
• Processes
  - Have resource needs
• Resource abstraction
  - Explicit mapping between virtual and physical resources
  - Cannot be solved at user/application level

Grid: the user abstraction
• Physical layer
  - Local, physical users (user accounts)
• Processes
  - Belong to a user
• User of the virtual machine
  - Is authorised to use the constituent resources
  - Has no login access to the node the resource belongs to
• User abstraction
  - User of the virtual machine is temporarily mapped onto some local accounts
  - Cannot be solved at user/application level

Fundamental grid functionalities
• By formal modeling the essential functionalities can be identified
• Resource abstraction
  - Physical resources can be assigned to virtual resource needs (matched by properties)
  - Grid provides a mapping between virtual and physical resources
• User abstraction
  - User of the physical machine may be different from the user of the virtual machine
  - Grid provides a temporary mapping between virtual and physical users
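
The two abstractions can be illustrated with a minimal sketch. It is not taken from the presentation or from any grid middleware: the class names, the `kind`/`capacity` properties and the local account names are all hypothetical, and a real grid would perform these steps through its information system and security services.

```python
from dataclasses import dataclass


@dataclass
class PhysicalResource:
    """A resource offered by some node (all fields are illustrative)."""
    name: str
    node: str
    kind: str
    capacity: float
    in_use: bool = False


@dataclass
class ResourceNeed:
    """A virtual resource requested by a process, described only by properties."""
    kind: str
    min_capacity: float


def map_resources(needs, pool):
    """Resource abstraction: bind each virtual need to a free matching resource."""
    mapping = {}
    for index, need in enumerate(needs):
        for res in pool:
            if (not res.in_use and res.kind == need.kind
                    and res.capacity >= need.min_capacity):
                res.in_use = True
                mapping[index] = res
                break
        else:
            raise LookupError(f"no free resource matches need {index}: {need}")
    return mapping


def map_user(grid_user, nodes, local_accounts):
    """User abstraction: temporarily bind the grid user to a local account per node."""
    return {node: (grid_user, local_accounts[node]) for node in nodes}


pool = [
    PhysicalResource("cpu0", "nodeA", "cpu", 2.0),
    PhysicalResource("disk0", "nodeB", "storage", 500.0),
    PhysicalResource("cpu1", "nodeB", "cpu", 3.0),
]
needs = [ResourceNeed("cpu", 2.5), ResourceNeed("storage", 100.0)]

mapping = map_resources(needs, pool)
nodes = {res.node for res in mapping.values()}
accounts = map_user("smith", nodes, {"nodeA": "grid001", "nodeB": "guest17"})
print({i: res.name for i, res in mapping.items()}, accounts)
```

The process never names a node: it states properties, and both the physical resource and the local account are chosen for it.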

Conventional distributed environments and grids
[Figure: user Smith requesting 4 nodes; 4 CPUs, memory and storage; 1 CPU]

Performance analysis
• Instrumentation
• Monitoring
• Data reduction
• Analysis and presentation
• Optimisation

The scope of this presentation
• Instrumentation
• Monitoring
• Data reduction
• Analysis and presentation
• Optimisation

What is grid performance at all?
• Traditionally ‘performance’ is
  - Speed
  - Throughput
  - Bandwidth, etc.
• Reasons for using grids
  - Quantitative reasons
  - Qualitative reasons: QoS
  - Economic aspects

Grid performance analysis
1. Performance is not characteristic of an application itself but rather of the interaction of the application and the infrastructure.
2. The more complex and dynamic nature of a grid introduces more possible performance flaws.
3. Usual metrics and characteristic parameters are not necessarily applicable to grids.
4. The larger event data volume needs careful reduction, feature extraction and intelligent presentation.
5. Due to the permanently changing environment, on-line and semi-on-line techniques are advantageous over post-mortem methods.
6. Performance tuning is more difficult due to the dynamic environment and changing infrastructure.
7. Observation, comparison and analysis are more complex due to the diversity and heterogeneity of resources.

Interaction of application and the infrastructure
• Performance = application perf. × infrastructure perf.
• Signature model (Pablo group)
  - Application signature, e.g. instructions/FLOPs
  - Scaling factor (capabilities of the resources), e.g. FLOPs/second
  - Execution signature: application signature * scaling factor
  - E.g. instructions/second = instructions/FLOPs * FLOPs/second
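
The unit arithmetic of the signature model can be checked with a small sketch. The numbers and node names below are invented for illustration; only the product rule (execution signature = application signature * scaling factor) comes from the slide.

```python
def execution_signature(application_signature, scaling_factor):
    """instructions/second = instructions/FLOP * FLOPs/second."""
    return application_signature * scaling_factor


# Hypothetical application: 4 instructions issued per floating-point operation.
app_signature = 4.0

# Hypothetical resources with different capabilities, in FLOPs/second.
scaling_factors = {"nodeA": 2.0e9, "nodeB": 5.0e8}

for node, factor in scaling_factors.items():
    rate = execution_signature(app_signature, factor)
    print(f"{node}: {rate:.2e} instructions/second")
```

The same application signature yields a different execution signature on each resource, which is why performance cannot be attributed to the application alone.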

Possible performance problems in grids
• All that may occur in a distributed application
• Plus
  - Effectiveness of resource brokering
  - Synchronous availability of resources
  - Resources may change during execution
  - Various local policies
  - Shared use of resources
  - Higher costs of some activities
• The corresponding symptoms must be characterised

Grid performance metrics
• Abstract representation of measurable quantities
• M = R_1 × R_2 × ... × R_n
• Usual metrics
  - Speedup, efficiency
  - Queue length
• Such strict values are not characteristic in a grid
  - Cannot be interpreted
  - Cannot be compared
• New metrics
  - Local metrics and grid metrics
  - Symbolic description / metrics
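
For reference, the usual metrics named above can be written down as follows. This sketch uses the textbook definitions (speedup as serial time over parallel time, efficiency as speedup per processor); the `MetricPoint` tuple is a hypothetical way of representing one element of the product space M, not a structure from the presentation.

```python
from typing import NamedTuple


class MetricPoint(NamedTuple):
    """One observation in M = R_1 x R_2 x ... x R_n (here n = 3)."""
    speedup: float
    efficiency: float
    queue_length: int


def speedup(t_serial, t_parallel):
    """Classical speedup: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Classical efficiency: speedup divided by the number of processors."""
    return speedup(t_serial, t_parallel) / processors


point = MetricPoint(
    speedup=speedup(100.0, 25.0),
    efficiency=efficiency(100.0, 25.0, 8),
    queue_length=3,
)
print(point)
```

Both definitions presuppose a fixed set of identical processors and a repeatable serial baseline, which is the assumption the slide questions for grids.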

Processing monitoring information
• Trace data reduction
  - Trace volume is proportional to time t, number of processes P and metrics dimension n
• Statistical clustering (reducing P)
  - Similar temporal behaviours are classified
  - Representative processes are recorded for each class
  - Questionable whether it works for grids
• Statistical projection pursuit (reducing n)
  - Reduces the dimension by identifying significant metrics
• Sampling frequency (reducing t)
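
Reducing P and t can be sketched as below. This is a simplified stand-in, not the statistical clustering method the slide refers to: it uses a plain leader clustering with a Euclidean distance threshold, and the process names, traces and threshold are invented.

```python
import math


def distance(a, b):
    """Euclidean distance between two equally long metric traces."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_processes(traces, threshold):
    """Group processes with similar traces; the first member represents the class.

    traces maps a process id to its list of samples.
    Returns a list of (representative_id, member_ids) pairs.
    """
    clusters = []
    for pid, trace in traces.items():
        for representative, members in clusters:
            if distance(traces[representative], trace) <= threshold:
                members.append(pid)
                break
        else:
            clusters.append((pid, [pid]))
    return clusters


def downsample(trace, step):
    """Reduce t by keeping every step-th sample."""
    return trace[::step]


traces = {
    "p0": [1.0, 1.0, 1.0, 1.0],
    "p1": [1.1, 0.9, 1.0, 1.0],
    "p2": [5.0, 5.0, 5.0, 5.0],
}
clusters = cluster_processes(traces, threshold=0.5)
# Only the representative of each class needs to be recorded in full.
recorded = {rep: downsample(traces[rep], 2) for rep, _ in clusters}
print(clusters, recorded)
```

On a grid the processes of one application may run on very different resources, so traces that would cluster together on a homogeneous machine may not, which is one reading of the doubt raised above.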

Processing monitoring information
• On-line, semi-on-line techniques are preferred
  - Off-line techniques assume that runs can be reproduced
• Event ordering
  - No global clock can be assumed
  - Only partial ordering is possible
• Automatic analysis instead of human observation
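
Partial ordering without a global clock is commonly obtained with vector clocks. The slide does not name a technique, so the following is only an illustrative sketch of that standard approach, with hypothetical function names.

```python
def local_event(clock, i):
    """Process i performs an internal or send event: advance its own component."""
    updated = list(clock)
    updated[i] += 1
    return updated


def receive_event(clock, i, message_clock):
    """Process i receives a message stamped with the sender's clock."""
    merged = [max(a, b) for a, b in zip(clock, message_clock)]
    merged[i] += 1
    return merged


def happened_before(a, b):
    """True if the event stamped a causally precedes the event stamped b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a, b):
    """True if neither event causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


# Two processes, both starting at [0, 0].
send = local_event([0, 0], 0)            # process 0 sends a message
work = local_event([0, 0], 1)            # process 1 does unrelated local work
recv = receive_event(work, 1, send)      # process 1 receives the message

print(happened_before(send, recv))  # the send precedes the receive
print(concurrent(send, work))       # send and local work cannot be ordered
```

Events that are concurrent in this sense have no defined order, so a trace viewer should not present them as if one preceded the other.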

Performance tuning, optimisation
• The execution cannot be reproduced
  - Post-mortem optimisation is not viable
  - On-line steering is necessary, though hard to realise
    - Sensors and actuators
    - Application and implementation dependent
    - E.g. Autopilot, Falcon
• Average behaviour of applications can be improved
  - Post-mortem tuning of the infrastructure (if possible)
    - Brokering decisions
    - Supporting services
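
The sensor/actuator idea can be sketched as a simple feedback loop. This is not the interface of Autopilot or Falcon: the function names, the proportional rule and the toy application (throughput proportional to the number of workers) are all assumptions made for illustration.

```python
def steer(read_sensor, apply_actuator, setting, target, gain, steps):
    """Proportional on-line steering loop (illustrative only).

    read_sensor(setting) returns the metric measured under the current setting;
    apply_actuator(setting) pushes the new setting into the running application.
    """
    history = []
    for _ in range(steps):
        measured = read_sensor(setting)
        error = target - measured
        setting = setting + gain * error  # move the knob towards the target
        apply_actuator(setting)
        history.append((measured, setting))
    return setting, history


# Toy application: throughput grows linearly with the number of workers.
applied = []
final_setting, log = steer(
    read_sensor=lambda workers: 10.0 * workers,
    apply_actuator=applied.append,
    setting=1.0,
    target=100.0,
    gain=0.05,
    steps=20,
)
print(round(final_setting, 3))
```

Both callbacks must be supplied by the application, which is why the slide calls steering application and implementation dependent.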

Performance visualisation
• Static 2D, 3D
  - Representing statistical data ‘off-line’
• Dynamic 2D, 3D
  - Better suited to grid environment
  - Co-visualisation of application and infrastructure, rendering symbolic values, etc.?
• Virtual reality, immersive environments
  - Puts the real-world user into the virtual world of grid
  - Allows steering
  - Not widespread

Grid performance prediction
• Past cannot be replayed, present is volatile, future?
• Application behaviour (temporal and spatial patterns)
  - Markov models
• Infrastructure behaviour (e.g. NWS)
  - Mean based methods
  - Median based methods
  - Autoregressive methods
  - Assumes a priori knowledge about the resources
• Multivariate methods
  - Correlated metrics can be estimated in order to reduce intrusion
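
Mean based and median based forecasting over a measurement history can be sketched as below. The code is not NWS and does not reproduce its algorithms: the window size, the last-value baseline and the rule of picking the forecaster with the lowest accumulated absolute error are assumptions chosen to keep the example small.

```python
from statistics import mean, median


def window_mean(history, window=3):
    """Mean based forecast: average of the most recent measurements."""
    return mean(history[-window:])


def window_median(history, window=3):
    """Median based forecast: robust against isolated spikes."""
    return median(history[-window:])


def last_value(history):
    """Baseline forecast: the next value equals the latest measurement."""
    return history[-1]


def best_forecaster(series, forecasters):
    """Replay the series and accumulate each forecaster's absolute error."""
    errors = {name: 0.0 for name in forecasters}
    for t in range(1, len(series)):
        past = series[:t]
        for name, forecast in forecasters.items():
            errors[name] += abs(forecast(past) - series[t])
    return min(errors, key=errors.get), errors


# Hypothetical bandwidth measurements with one transient spike.
bandwidth = [10, 10, 10, 50, 10, 10, 10, 10]
forecasters = {"mean": window_mean, "median": window_median, "last": last_value}

name, errors = best_forecaster(bandwidth, forecasters)
print(name, errors)
```

Which method wins depends on the series: a spike favours the median, a steady drift would favour the mean or an autoregressive model.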

Some initial thoughts about performance analysis
• First steps
  - Grid metrics
    - How to derive metrics from monitored data?
  - Possible grid performance problems
    - How to detect them from metrics?
  - What exactly should be monitored?

Technical differences
[Figure: stepwise refinement of the user abstraction and the resource abstraction. What? Security, information system. How? Resource management, information provider. Further refinement steps follow.]