An Introduction to Grid Computing
Richard Fujimoto
Reference: The Grid 2, ch. 1-4, 7, Ian Foster & Carl Kesselman (eds.)


Outline
What is Grid Computing?
Why are we interested in Grids?
Grid Architecture from 10,000 feet

Evolution of Technology
Phase I: Developmental Stage
–Concerned with development of the technology
–Focus is the technology itself: how it is built, how it works
–Users of the technology are experts
–If successful, technology grows in popularity, standards develop, costs decline, widespread use
–Examples: automobile, electric power
Phase II: Post-Technology Phase
–Technology is taken for granted, except when it fails
–Main issues are application of technology, ease of use, reliability, availability, cost
–Experts behind the scenes make it work, transparent to users

Information Technology
Fast approaching post-technology phase (mass adoption)
–Increasing commoditization (processors, memory, storage, communications)
–More complex, more powerful systems
–Possible to have systems with billions of devices and the sophistication to hide them from users
Issues
–Integration and standards
–Efficiency while maintaining transparency: virtualization seen as key approach to allow transparent, shared resource usage
–Quality of Service: sophisticated, end-to-end resource management needed to ensure high quality at low price

Virtual Organizations
“… mutually distrustful participants with varying degrees of prior relationships (perhaps none at all) want to share resources in order to perform some task.” [Foster/Kesselman, p. 39]
Coordinated, controlled resource sharing among dynamic multi-institutional virtual organizations
–Resources: computational facilities; software; data; sensors, instruments, actuators
–Control over what is shared, who has access, conditions under which sharing occurs
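The last point (control over what is shared, who has access, and under which conditions) can be made concrete with a small sketch. Everything here is a hypothetical illustration: `SharingRule`, `is_allowed`, the member names, and the off-peak condition are assumptions, not part of any Grid toolkit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class SharingRule:
    """One resource a site offers to a virtual organization (hypothetical model)."""
    resource: str                          # what is shared
    allowed_members: Set[str]              # who has access
    condition: Callable[[Dict], bool]      # conditions under which sharing occurs


def is_allowed(rule: SharingRule, member: str, request: Dict) -> bool:
    """Grant access only to listed members whose request meets the condition."""
    return member in rule.allowed_members and rule.condition(request)


# Assumed example: a cluster shared with two outside users, off-peak only.
rule = SharingRule(
    resource="site1/cluster",
    allowed_members={"alice@site2", "bob@site3"},
    condition=lambda request: request.get("hour", 0) >= 20,
)

print(is_allowed(rule, "alice@site2", {"hour": 22}))
print(is_allowed(rule, "alice@site2", {"hour": 9}))
```

The resource owner keeps all three decisions local, which is the sense in which sharing is "controlled" even among mutually distrustful participants.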

What is a Grid?
Includes three essential elements:
Coordinates sharing of distributed resources
–Resources and users live within different control domains
–Issues such as security, policy, payment, membership, etc.
Uses standard, open, general-purpose protocols and interfaces
–Address issues such as authentication, authorization, resource discovery, resource access
Delivers non-trivial qualities of service
–Throughput, response time, availability, security
But what does this mean? Main elements:
Distributed computing using standard interfaces, APIs, and tools in order to virtualize resources, people, and applications, to support virtual organizations

Virtual Observatory Application
Multiple archives of astronomical data stored at geographically distributed sites
–Each covers part of the electromagnetic spectrum for a certain period of time for certain celestial objects
–Desire to do multi-spectral or temporal studies of specific objects by combining data from different archives
–Terabytes to petabytes of data; data growing at an exponential rate
–Peer-reviewed data!
Virtual data [Grid Physics Network Project - GriPhyN]
–Pipelined processing of data typical
–Data used by analysis packages might be generated dynamically, e.g., query distributed data, then process the data in a pipeline specified by the user (e.g., recalibration followed by object detection)
–Moving data vs. moving computation? Large data sets and operations involving much reduction suggest moving the computation to the data
Reference: Grid 2, Chapter 7 (Szalay, Gray)

Hierarchical Architecture
Archives
–Text, images, raw data
–Data mining tools to search and subset data objects
–Metadata (units, provenance)
Web services
–Queries
–File transfer
–Data format standards (VOTable), similar to HLA OMT
Registries
–Record the kinds of information stored in each archive: sky coverage, temporal coverage, spectral coverage, resolution
Portals
–Process user queries by integrating data from different archives
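As a rough sketch of how a portal might use a registry, the following matches a query against each archive's recorded coverage. The record fields, archive names, and coverage values are assumptions made up for illustration; a real registry would also record sky coverage and resolution.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ArchiveRecord:
    """Registry entry describing what one archive holds (hypothetical fields)."""
    name: str
    wavelength_nm: Tuple[float, float]   # spectral coverage (min, max)
    years: Tuple[int, int]               # temporal coverage (first, last)


def overlaps(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def find_archives(registry: List[ArchiveRecord],
                  wavelength_nm: Tuple[float, float],
                  years: Tuple[int, int]) -> List[str]:
    """Return the archives whose coverage intersects the query on both axes."""
    return [r.name for r in registry
            if overlaps(r.wavelength_nm, wavelength_nm) and overlaps(r.years, years)]


# Made-up registry contents.
registry = [
    ArchiveRecord("optical-survey", (380.0, 750.0), (1998, 2004)),
    ArchiveRecord("infrared-survey", (1200.0, 2200.0), (1997, 2001)),
    ArchiveRecord("xray-archive", (0.01, 10.0), (1999, 2004)),
]

matches = find_archives(registry, wavelength_nm=(500.0, 1500.0), years=(2000, 2002))
print(matches)
```

The portal would then send queries only to the matching archives and integrate what they return.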

Issues
Economics of database queries
–Empirical costs for computation, disk space, network bandwidth, DB access; use to compute most economical approach to processing query
–Most queries data intensive (<10K instructions per byte), suggesting it is usually better to move computation near data
–Either provide cluster near data, or move database to user (Internet or sneakernet)
Compute-intensive tasks
–Raw data must be converted to calibrated, cataloged data
–Must reprocess data ~annually due to s/w improvements
–Currently, about instructions (15 TB data) - 10 CPU years
–Clusters can do it in about 6 weeks
–Exploit grid computing
Data mining and statistical calculations
–Amount of computation for large data sets a major impediment
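The arithmetic behind two of these points can be sketched as follows. The first function works out how many CPUs are needed to finish 10 CPU-years of work in about 6 weeks. The second compares transfer time with compute time to decide where a query should run. The link and CPU rates are illustrative assumptions (picked so that break-even falls near the 10K instructions-per-byte figure), not measured costs, and both function names are hypothetical.

```python
import math


def cpus_needed(cpu_years: float, wall_clock_weeks: float) -> int:
    """CPUs required to finish `cpu_years` of work in `wall_clock_weeks`,
    assuming perfect parallel speedup and 52 weeks per year."""
    cpu_weeks = cpu_years * 52
    return math.ceil(cpu_weeks / wall_clock_weeks)


def better_to_move_computation(data_bytes: float,
                               instructions_per_byte: float,
                               link_bytes_per_s: float,
                               cpu_instr_per_s: float) -> bool:
    """True when shipping the data would take longer than processing it,
    i.e. the job is data intensive and should run near the data."""
    transfer_s = data_bytes / link_bytes_per_s
    compute_s = data_bytes * instructions_per_byte / cpu_instr_per_s
    return transfer_s > compute_s


print(cpus_needed(cpu_years=10, wall_clock_weeks=6))

# Assumed rates: 100 KB/s effective wide-area link, 1e9 instructions/s CPU.
print(better_to_move_computation(15e12, 1_000, 1e5, 1e9))
print(better_to_move_computation(15e12, 100_000, 1e5, 1e9))
```

With these assumed rates the break-even point is 1e9 / 1e5 = 10,000 instructions per byte: below it the transfer dominates, above it the computation does.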

Sample VO Grid Workflow
Locate suitable sites (data archives)
Authenticate access to these sites
Allocate resources on those computers
Select, configure and initiate computations at those sites
Automatically and transparently adapt to changes in resource availability, changes in user requirements
Display output to user
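The workflow steps can be sketched as a single function over simulated sites. This is a toy model under stated assumptions: `Site`, `run_vo_workflow`, and the site data are hypothetical, authentication is reduced to a membership check, and "adapting" is reduced to skipping sites that cannot supply the requested CPUs.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Site:
    """Simulated data archive site (hypothetical model)."""
    name: str
    has_data: bool
    trusted_users: Set[str]
    free_cpus: int


def run_vo_workflow(user: str, sites: List[Site], cpus_per_site: int) -> Dict[str, str]:
    # 1. Locate suitable sites (those holding relevant data).
    located = [s for s in sites if s.has_data]
    # 2. Authenticate access to these sites.
    authenticated = [s for s in located if user in s.trusted_users]
    # 3. Allocate resources; 5. adapt by skipping sites that are too busy.
    allocated = []
    for site in authenticated:
        if site.free_cpus >= cpus_per_site:
            site.free_cpus -= cpus_per_site
            allocated.append(site)
    # 4. Initiate the computation at each allocated site (simulated here).
    return {site.name: f"partial result from {site.name}" for site in allocated}


sites = [
    Site("archive-1", True, {"alice"}, 8),
    Site("archive-2", True, {"alice"}, 1),   # too few free CPUs
    Site("archive-3", True, {"bob"}, 8),     # alice is not authorized
    Site("archive-4", False, {"alice"}, 8),  # no relevant data
]

results = run_vo_workflow("alice", sites, cpus_per_site=4)
# 6. Display output to user.
for name, text in results.items():
    print(name, "->", text)
```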

Grid Architecture
[Figure: layered Grid architecture shown beside the Internet protocol architecture]
Grid layers (top to bottom)
–Application
–Collective, “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
–Resource, “Sharing single resources”: negotiating access, controlling use
–Connectivity, “Talking to things”: communication (Internet protocols) & security
–Fabric, “Controlling things locally”: access to, & control of, resources
Internet protocol architecture (top to bottom): Application, Transport, Internet, Link
Slide courtesy of C. Kesselman, Cal(IT)2 Presentation

Fabric Layer
Two types of basic services for individual resources
–Introspection mechanisms: determination of structure, state, capability of resource
–Resource management mechanisms: control over delivered quality of service
Resource types and example services
Computational resources
–Characteristics of hardware/software resources available, status (e.g., load, job queue length)
–Starting programs, monitoring and controlling execution of processes
–Control over resources allocated to processes, advance reservations
Storage resources
–File access (read, write)
–Check availability of memory or disk space
–Control of resources allocated for data transfer (e.g., disk bandwidth)
Network resources
–Control over prioritization, bandwidth allocation
–Interrogate for network characteristics and load
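The two kinds of fabric service (introspection and resource management) map naturally onto a two-method interface. The sketch below is an assumption about how such an interface could look, not the API of any Grid toolkit; `FabricResource`, `ComputeNode`, and their methods are hypothetical names.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class FabricResource(ABC):
    """Hypothetical minimal interface a fabric-layer resource might expose."""

    @abstractmethod
    def describe(self) -> Dict:
        """Introspection: report structure, state, and capability."""

    @abstractmethod
    def reserve(self, amount: int) -> bool:
        """Resource management: try to set aside capacity for a request."""


class ComputeNode(FabricResource):
    def __init__(self, cpus: int) -> None:
        self.cpus = cpus
        self.reserved = 0
        self.job_queue: List[str] = []

    def describe(self) -> Dict:
        return {
            "type": "compute",
            "cpus": self.cpus,
            "free_cpus": self.cpus - self.reserved,
            "queue_length": len(self.job_queue),
        }

    def reserve(self, amount: int) -> bool:
        # Refuse requests that exceed what is still unreserved.
        if amount > self.cpus - self.reserved:
            return False
        self.reserved += amount
        return True


node = ComputeNode(cpus=16)
print(node.reserve(12), node.reserve(8))
print(node.describe())
```

Storage and network resources would implement the same two methods with their own state (disk space, bandwidth), which lets higher layers treat all resource types uniformly.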

Connectivity Layer
Communication services between fabric layer resources
–Basically, Internet protocols (TCP, UDP, DNS, RSVP, etc.)
Authentication protocols
–Single sign-on to access multiple resources
–Delegation: give program ability to access resources user is authorized to access
–Integration with local security mechanisms
–User-based trust relationships: if user can access A and B, should be able to access both without requiring A’s and B’s security administrators to interact
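Delegation can be illustrated with a toy model in which a user hands a program a restricted, short-lived copy of their rights. This sketch uses plain data records with no cryptography at all, so it only shows the bookkeeping idea; `Credential`, `delegate`, and `may_access` are hypothetical names.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Credential:
    """Toy credential: who it names, what it may access, when it expires."""
    subject: str
    rights: FrozenSet[str]
    expires_at: float


def delegate(parent: Credential, program: str, rights: Iterable[str],
             lifetime_s: float, now: float) -> Credential:
    """Issue a credential for `program` that never exceeds the parent's
    rights or outlives the parent."""
    granted = frozenset(rights) & parent.rights
    expires = min(parent.expires_at, now + lifetime_s)
    return Credential(f"{parent.subject}/{program}", granted, expires)


def may_access(cred: Credential, resource: str, now: float) -> bool:
    return now < cred.expires_at and resource in cred.rights


# Single sign-on: the user authenticates once and holds rights to A and B.
user = Credential("alice", frozenset({"A", "B"}), expires_at=1000.0)
# The job asks for A and C, but C is not the user's to give.
proxy = delegate(user, "job1", {"A", "C"}, lifetime_s=100.0, now=0.0)

print(may_access(proxy, "A", now=50.0))
print(may_access(proxy, "C", now=50.0))
print(may_access(proxy, "A", now=150.0))
```

Intersecting with the parent's rights and capping the lifetime are the two properties that keep a delegated program from gaining more access than the user had.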

Resource Layer: Sharing Single Resources
Protocols for secure negotiation, initiation, monitoring, control, accounting, and payment of sharing operations on individual resources
Envisioned to be a small set of protocols
Use fabric level functions to access and control local resources
Information protocols: obtain information on structure and state of resource (e.g., loading, configuration, cost of use)
Management protocols: negotiate access to resource, e.g., for QoS
–Check usage against policy
–Accounting and payment
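A minimal sketch of the two protocol families, assuming a single resource with a per-user quota policy and a flat price. `ResourceService`, its methods, and the numbers are hypothetical; real resource-layer protocols are network protocols, not in-process calls.

```python
from typing import Dict, Optional


class ResourceService:
    """Toy resource-layer endpoint for one resource (hypothetical model)."""

    def __init__(self, name: str, price_per_cpu_hour: float,
                 quota_cpu_hours: Dict[str, float]) -> None:
        self.name = name
        self.price_per_cpu_hour = price_per_cpu_hour
        self.quota_cpu_hours = quota_cpu_hours      # local usage policy
        self.used_cpu_hours: Dict[str, float] = {}  # accounting records

    def info(self) -> Dict:
        """Information protocol: report state and cost of use."""
        return {
            "name": self.name,
            "price_per_cpu_hour": self.price_per_cpu_hour,
            "cpu_hours_in_use": sum(self.used_cpu_hours.values()),
        }

    def request(self, user: str, cpu_hours: float) -> Optional[float]:
        """Management protocol: check the request against policy, record
        the usage, and return the charge (None if the request is refused)."""
        used = self.used_cpu_hours.get(user, 0.0)
        if used + cpu_hours > self.quota_cpu_hours.get(user, 0.0):
            return None
        self.used_cpu_hours[user] = used + cpu_hours
        return cpu_hours * self.price_per_cpu_hour


service = ResourceService("cluster-A", 0.5, {"alice": 100.0})
first = service.request("alice", 60.0)    # within quota
second = service.request("alice", 60.0)   # would exceed the 100-hour quota
print(first, second, service.info())
```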

Collective: Coordinating Multiple Resources
Discovery services to allow discovery of resources and queries of status
Coallocation, scheduling, and brokering services to utilize multiple resources for a specific purpose
Monitoring and diagnostic services
Data replication services: manage storage resources to achieve acceptable performance
Programming models and tools
Grid-enabled programming systems, e.g., grid-MPI
Workflow specification and management
Software discovery services
Collaborative work services
Security, policy, accounting issues
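Discovery and co-allocation can be sketched over a simple table of free CPUs per site. The all-or-nothing rule shown here is one possible co-allocation policy, chosen for illustration; the function names and site data are assumptions.

```python
from typing import Dict, List


def discover(free_cpus: Dict[str, int], min_cpus: int) -> List[str]:
    """Discovery: list the sites currently offering at least `min_cpus`."""
    return sorted(site for site, free in free_cpus.items() if free >= min_cpus)


def coallocate(free_cpus: Dict[str, int], request: Dict[str, int]) -> bool:
    """Co-allocation: grant the whole multi-site request or none of it."""
    if any(free_cpus.get(site, 0) < cpus for site, cpus in request.items()):
        return False
    for site, cpus in request.items():
        free_cpus[site] -= cpus
    return True


free_cpus = {"site-a": 16, "site-b": 4, "site-c": 32}

candidates = discover(free_cpus, min_cpus=8)
ok = coallocate(free_cpus, {"site-a": 8, "site-c": 8})
rejected = coallocate(free_cpus, {"site-a": 8, "site-b": 8})
print(candidates, ok, rejected, free_cpus)
```

Checking every site before changing any of them is what keeps a failed request from leaving partial allocations behind.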

Final Comments
Current trends
–Merging of Grid and Web services
–Has much momentum: substantial industry support
–Universally embraced by scientific computing community
–Enterprise computing in commercial sector
Ideas have been around for a while (e.g., meta-computing)
–Standardization perhaps most important aspect