Towards a High Performance Extensible Grid Architecture Klaus Krauter Muthucumaru Maheswaran {krauter,


Towards a High Performance Extensible Grid Architecture
Klaus Krauter, Muthucumaru Maheswaran {krauter,
Computer Science Department, University of Manitoba, Winnipeg, Manitoba

Outline
Grid Computing Issues
  - Network computing environment
  - Scalability, Extensibility, and Adaptability
  - Quality of Service
Grid Models
  - Resource Management Techniques
  - Application Execution Models
Grid Architecture
Example Applications
  - Compiling, Numerical Processing, Grid Aware Application
Related Work

Grid Computing Issues

Network Computing Environment
Heterogeneous Nodes
  - Autonomous administration domains with different resource management policies
  - Servers, network devices, workstations, PDAs, etc.
Connected by Communication Links
  - Support differentiated service levels
Use native operating system services
  - Does not replace existing scheduling and resource control mechanisms
  - Native operating system is a Grid device driver

Scalability
Target Size
  - Hundreds to Millions of nodes
  - Different platforms for different scale Grids
Global resource management protocols
  - Fixed format messages
  - Ability to locally tune protocol performance parameters to match local infrastructure and administrative policy
Local policies for resource management
  - Scheduling, Quality of Service, Tolerance to faults
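The combination of fixed format messages and locally tuned protocol parameters can be illustrated with a minimal Python sketch. The header layout, the field names, and the tuning parameters below are hypothetical; the presentation does not specify a wire format.

```python
import struct
from dataclasses import dataclass

# Hypothetical fixed header in network byte order:
# version (1 byte), message type (1 byte), payload length (2 bytes),
# sender node id (4 bytes).
HEADER = struct.Struct("!BBHI")


@dataclass
class LocalTuning:
    """Per-site protocol parameters (illustrative names, not from the slides)."""
    heartbeat_seconds: float = 30.0
    max_retries: int = 3


def pack_message(version: int, msg_type: int, sender: int, payload: bytes) -> bytes:
    """Prefix the payload with the fixed-size header."""
    return HEADER.pack(version, msg_type, len(payload), sender) + payload


def unpack_message(data: bytes):
    """Split a message into its header fields and payload."""
    version, msg_type, length, sender = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    return version, msg_type, sender, payload
```

Every node parses the same header, while each site can construct its own `LocalTuning` (for example a shorter heartbeat on a fast local network) without changing the message format.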

Extensibility and Adaptability
Extensible resource protocol content
  - Fixed message framework with structured extensibility (XML-like)
Extensible resource management protocol processing
  - Message content extensions are processed by extension modules
  - Modules are dynamically loaded and register content identifiers
Variability
  - Multiple different implementations of the resource protocols
Adaptability
  - Nodes and resources enter and leave the grid continuously
  - Fault tolerance by resource replication
  - Operate in an actively hostile environment
  - Try to survive Byzantine failures
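One way to picture extension modules that register content identifiers is a small dispatch table keyed by XML tag. This is only a sketch under stated assumptions: the tag names, the `register` and `process` functions, and the example message are all hypothetical, and the presentation does not say that tags serve as content identifiers.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Registry mapping a content identifier (here, an XML tag) to its handler.
_handlers: Dict[str, Callable[[ET.Element], str]] = {}


def register(content_id: str, handler: Callable[[ET.Element], str]) -> None:
    """Called by an extension module when it is loaded."""
    _handlers[content_id] = handler


def process(message_xml: str) -> List[str]:
    """Dispatch each extension element to its registered module.

    Elements with no registered handler are skipped, so nodes running
    different implementations can still exchange messages.
    """
    root = ET.fromstring(message_xml)
    results = []
    for element in root:
        handler = _handlers.get(element.tag)
        if handler is not None:
            results.append(handler(element))
    return results


# A hypothetical extension module registering its content identifier.
register("cpu", lambda e: f"cpu count = {e.get('count')}")

message = '<resource><cpu count="4"/><gpu model="x"/></resource>'
print(process(message))
```

Here the `gpu` element is ignored because no module has registered for it. In a fuller version, modules could be loaded at run time with `importlib.import_module` and call `register` on import.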

Quality of Service
Not restricted to end-to-end network
  - Processor, memory, I/O also need to support QoS specifications
Co-allocation and Co-reservation
  - Allocation and scheduling need to take into account QoS given to other jobs already in the Grid
Providing Service Level Agreements
  - Aggregate performance levels or on a per job basis?
  - Site autonomy and resource control restrict the ability to provide guarantees
Applications should be able to negotiate QoS with the Grid
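The point that allocation must account for QoS already given to other jobs can be sketched as a simple admission check on a single node. The `Node` class and the resource names are assumptions made for illustration; real co-allocation across several sites would need more than this.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Capacity per resource type, e.g. {"cpu": 8, "mem_gb": 32}."""
    capacity: Dict[str, float]
    reserved: List[Dict[str, float]] = field(default_factory=list)

    def free(self, resource: str) -> float:
        """Capacity left after the reservations already granted."""
        used = sum(r.get(resource, 0.0) for r in self.reserved)
        return self.capacity.get(resource, 0.0) - used

    def try_reserve(self, request: Dict[str, float]) -> bool:
        """Admit the request only if every resource fits alongside
        the reservations already granted to other jobs."""
        if all(self.free(res) >= amount for res, amount in request.items()):
            self.reserved.append(dict(request))
            return True
        return False
```

A request is rejected as a whole if any one resource (processor, memory, or I/O) cannot be covered, which reflects the slide's point that QoS is not restricted to the network.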

Grid Models

Resource Management Techniques
Super Scheduler
  - Hierarchy of cooperating schedulers
  - Issues: Co-allocation
Market Based
  - Auctioning for resources
  - Issues: Price management and co-allocation
Resource Discovery
  - Resource attribute and status in a distributed database
  - Centralized, Agent based, or Hybrid
  - Issues: devise highly distributed, scalable, fault tolerant schemes
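As an illustration of the market based approach, the sketch below auctions a single resource among competing jobs. The presentation does not say which auction format is intended; a sealed-bid second-price auction is used here purely as one possibility, and the function name and the reserve price parameter are hypothetical.

```python
from typing import Dict, Optional, Tuple


def second_price_auction(
    bids: Dict[str, float], reserve: float = 0.0
) -> Optional[Tuple[str, float]]:
    """Return (winner, price), or None if no bid meets the reserve price.

    The highest bidder wins and pays the second-highest eligible bid,
    or the reserve price when there is no other eligible bid.
    """
    eligible = {job: bid for job, bid in bids.items() if bid >= reserve}
    if not eligible:
        return None
    ranked = sorted(eligible.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else reserve
    return winner, price
```

The price management issue named on the slide shows up even in this toy version: the owner of the resource has to choose the reserve price, and a job that needs several resources at once would have to win several independent auctions.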

Application Execution Models
Legacy application
  - Native OS resource and scheduling, implicit QoS
  - Use external resource description language
  - Modify native OS and service libraries and infer resource requirements and QoS
  - Recompile with Grid aware compiler that inserts specialized Grid code
Grid Aware application
  - Use specialized Grid API
  - First “applications” will be compilers, service libraries (MPI, PVM), Grid workbenches and monitoring tools

Grid Aware Applications

Non-Grid Aware Applications

Grid Architecture

Design Approach
Layered
  - Grid Kernel
  - Grid Core Services
  - Grid toolkits, workbenches, and user interfaces
Fully distributed peer-to-peer model
  - No centralized information servers
  - Implementations free to use specialized servers
Minimal configuration
  - Use a Service Location Protocol-like service

Grid Kernel Architectural Principles
Functions that use the services are aware of the distributed environment
No guarantees made about reliability of nodes or links
Operate on all types of heterogeneous nodes using minimal resources
Services will be implemented using native OS with minimal changes to trusted computing base
Provide uniform extensible API and services across all nodes
Provide resource management mechanisms but do not implement resource management policies
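The last principle, mechanisms without policies, can be sketched as a queue that knows how to hold and dispatch jobs but leaves the choice of job to a function supplied by the site. The class and function names are hypothetical and are not part of the architecture described in the presentation.

```python
from typing import Callable, List, Optional

# A policy is any function that picks one job from the pending list.
Policy = Callable[[List[dict]], dict]


class GridKernelQueue:
    """Mechanism only: holds jobs and dispatches whichever one the policy picks."""

    def __init__(self, policy: Policy) -> None:
        self._policy = policy
        self._pending: List[dict] = []

    def submit(self, job: dict) -> None:
        self._pending.append(job)

    def dispatch(self) -> Optional[dict]:
        """Remove and return the job chosen by the policy, or None if empty."""
        if not self._pending:
            return None
        job = self._policy(self._pending)
        self._pending.remove(job)
        return job


# Two site-local policies supplied from outside the kernel.
def fifo(jobs: List[dict]) -> dict:
    return jobs[0]


def shortest_first(jobs: List[dict]) -> dict:
    return min(jobs, key=lambda j: j["runtime"])
```

Two sites can run the same `GridKernelQueue` code and still schedule differently, one with `fifo` and another with `shortest_first`, which is the kind of site autonomy the earlier slides call for.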

Grid Architecture

Grid Layers and Core Services

Grid Example Applications

Applications
Compiling
  - Ensure similar compiler and libraries are used on all nodes
  - Compute how long to transfer and compile
  - Perform deadline scheduling
Legacy Numerical Processing
  - Dynamic linking of Grid code, variable QoS for job steps
  - Describe network QoS requirements or infer dynamically
  - Much further research required
Collaborative Research Workbench
  - Negotiate video bandwidth required
  - Query if a simulation can be run and completed quickly, or schedule it later
  - Different GUI depending on resources near the researcher
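For the compiling example, estimating transfer and compile time and then checking a deadline might look like the following sketch. The linear cost model, the `CompileNode` fields, and the function names are assumptions for illustration only; this is a feasibility check rather than a full deadline scheduler.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CompileNode:
    name: str
    bandwidth_mb_s: float  # link bandwidth to this node
    compile_mb_s: float    # amount of source this node compiles per second


def estimated_seconds(node: CompileNode, source_mb: float) -> float:
    """Transfer time plus compile time for the given amount of source."""
    return source_mb / node.bandwidth_mb_s + source_mb / node.compile_mb_s


def pick_node(
    nodes: List[CompileNode], source_mb: float, deadline_s: float
) -> Optional[CompileNode]:
    """Return the fastest node that meets the deadline, or None."""
    feasible = [n for n in nodes if estimated_seconds(n, source_mb) <= deadline_s]
    return min(
        feasible, key=lambda n: estimated_seconds(n, source_mb), default=None
    )
```

A node with a fast processor but a slow link can lose to a slower node nearby, which is why the slide lists transfer time alongside compile time.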

Related Work

Related Work
Application Enabling Systems
  - Provide tools to allow applications to access globally distributed resources in an integrated fashion
  - ATLAS, Globe, Globus/GUSTO, Legion, ParaWeb, Symera
User Access Systems
  - Provide end users of the Grid transparent access to geographically distributed systems in a location independent manner
  - CCS, MOL, NetSolve, PUNCH

Questions?