Introduction to Grid Computing Ann Chervenak and Ewa Deelman USC Information Sciences Institute.

Presentation transcript:

Introduction to Grid Computing Ann Chervenak and Ewa Deelman USC Information Sciences Institute

2 Outline
- Motivation
- Definition and characteristics of Grids
- Example Grid applications
- Grid Architecture
- How a Grid Is Assembled
- Overview of the Globus Toolkit
  - Security Tools
  - Monitoring and Discovery System
  - Computing/Execution Tools
  - Data Tools
- A more detailed example: The Earth System Grid

3 Motivation: Supporting Scientific Applications
- Computation intensive
  - Large-scale simulation and analysis (climate modeling, galaxy formation, gravity waves, event simulation)
  - Engineering (parameter studies, linked models)
- Data intensive
  - Experimental data analysis (high energy physics)
  - Image & sensor analysis (astronomy, climate)
- Distributed collaboration
  - Online instrumentation (microscopes, x-ray)
  - Remote visualization (climate studies, biology)
  - Engineering (large-scale structural testing)
- Large, complex scientific problems
  - Require people in several organizations to collaborate
  - Share computing resources, data, instruments

4 The Grid Problem
- Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources (From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”)
- Enable communities (“Virtual Organizations”) to share geographically distributed resources as they pursue common goals
- Assuming the absence of…
  - central location
  - central control
  - omniscience
  - existing trust relationships

5 An Old Idea …
- “The time-sharing computer system can unite a group of investigators …. one can conceive of such a facility as an … intellectual public utility.”
  - Fernando Corbato and Robert Fano, 1966
- “We will perhaps see the spread of ‘computer utilities’, which, like present electric and telephone utilities, will service individual homes and offices across the country.”
  - Len Kleinrock, 1967

A Few Grid Application Examples

7 Earth System Grid objectives
To support the infrastructural needs of the national and international climate community, ESG is providing crucial technology to securely access, monitor, catalog, transport, and distribute data in today’s Grid computing environment.
Figure labels: HPC hardware running climate models; ESG Sites; ESG Portal
Slide courtesy of Dave Bernholdt, ORNL

8 ESG Facts and Figures
ESG Portal at NCAR
- 130 TB of data at four locations
  - 840,331 files
  - Includes the past 6 years of joint DOE/NSF climate modeling experiments
- 3,200 registered users
- Downloads to date
  - 25 TB
  - 91,000 files
IPCC AR4 ESG Portal
- 28 TB of data at one location
  - 68,400 files
  - Generated by a modeling campaign coordinated by the Intergovernmental Panel on Climate Change
  - Model data from 11 countries
- 818 registered analysis projects
- Downloads to date
  - 123 TB
  - 543,500 files
  - 300 GB/day (average)
- 300 scientific papers published to date based on analysis of IPCC AR4 data
Figure captions: Worldwide ESG user base; Nov 2004 – Oct 2006 IPCC Downloads (10/12/06)
Slide courtesy of Dave Bernholdt, ORNL

9 NSF’s TeraGrid *
A National Science Foundation Investment in Cyberinfrastructure
$100M 3-year construction ( )
$150M 5-year operation & enhancement ( )
Map labels: UCSD, UT, UC/ANL, NCSA, PSC, ORNL, PU, IU
- TeraGrid DEEP: Integrating NSF’s most powerful computers (60+ TF)
  - 2+ PB Online Data Storage
  - National data visualization facilities
  - World’s most powerful network (national footprint)
- TeraGrid WIDE Science Gateways: Engaging Scientific Communities
  - 90+ Community Data Collections
  - Growing set of community partnerships spanning the science community
  - Leveraging NSF ITR, NIH, DOE and other science community projects
  - Engaging peer Grid projects such as Open Science Grid in the U.S., as well as peer Grids in Europe and Asia-Pacific
- Base TeraGrid Cyberinfrastructure: Persistent, Reliable, National
  - Coordinated distributed computing and information environment
  - Coherent User Outreach, Training, and Support
  - Common, open infrastructure services
* Slide courtesy of Ray Bair, Argonne National Laboratory

10 Data Grids for High Energy Physics
Image courtesy Harvey Newman, Caltech
Diagram labels:
- Tiers: Tier 0, Tier 1, Tier 2, Tier 4
- Nodes: Online System; Offline Processor Farm (~20 TIPS); CERN Computer Centre; FermiLab (~4 TIPS); France Regional Centre; Italy Regional Centre; Germany Regional Centre; Tier2 Centres (~1 TIPS each); Caltech (~1 TIPS); Institutes (~0.25 TIPS); Physics data cache; Physicist workstations
- Link rates: ~PBytes/sec; ~100 MBytes/sec; ~622 Mbits/sec or Air Freight (deprecated); ~622 Mbits/sec; ~1 MBytes/sec
Notes:
- There is a “bunch crossing” every 25 nsecs.
- There are 100 “triggers” per second.
- Each triggered event is ~1 MByte in size.
- Physicists work on analysis “channels”.
- Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
- 1 TIPS is approximately 25,000 SpecInt95 equivalents.
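The rates quoted on this slide can be cross-checked with a little arithmetic. The sketch below is an illustration only: the constant names are invented here, the inputs are the slide's approximate figures, and decimal units (1 TB = 1,000,000 MB) are an assumption.

```python
# Approximate figures quoted on the slide (each carries a "~" in the original).
BUNCH_CROSSING_INTERVAL_NS = 25
TRIGGERS_PER_SECOND = 100
EVENT_SIZE_MBYTES = 1
WAN_LINK_MBITS_PER_SEC = 622

NS_PER_SECOND = 1_000_000_000
SECONDS_PER_DAY = 86_400
MBYTES_PER_TBYTE = 1_000_000  # assumption: decimal units

# One crossing every 25 ns means 40 million crossings per second.
crossings_per_second = NS_PER_SECOND // BUNCH_CROSSING_INTERVAL_NS

# Only a tiny fraction of crossings is kept by the trigger.
trigger_fraction = TRIGGERS_PER_SECOND / crossings_per_second

# Data rate leaving the trigger: events per second times event size.
triggered_mbytes_per_sec = TRIGGERS_PER_SECOND * EVENT_SIZE_MBYTES
triggered_tbytes_per_day = (
    triggered_mbytes_per_sec * SECONDS_PER_DAY / MBYTES_PER_TBYTE
)

# Wide-area link capacity converted from megabits to megabytes per second.
wan_link_mbytes_per_sec = WAN_LINK_MBITS_PER_SEC / 8

print(f"crossings/sec:        {crossings_per_second:,}")
print(f"fraction triggered:   {trigger_fraction:.1e}")
print(f"triggered data rate:  {triggered_mbytes_per_sec} MBytes/sec")
print(f"triggered data/day:   {triggered_tbytes_per_day:.2f} TBytes")
print(f"622 Mbits/sec link:   {wan_link_mbytes_per_sec} MBytes/sec")
```

The triggered rate works out to the ~100 MBytes/sec shown in the diagram, while a ~622 Mbits/sec link carries a little under 78 MBytes/sec.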

11 Elements of a Grid
- Resource sharing
  - Computers, storage systems, sensors, networks, …
  - This sharing is always conditional: issues of trust, policy, negotiation, payment, etc.
- Coordinated problem solving
  - Distributed data analysis, computation, simulation, collaboration, …
- Dynamic, multi-institutional virtual organizations
  - Community overlays on classic organizational structures
  - May be large or small, static or dynamic

12 Two Rules or Principles of the Grid
- Can’t rely on homogeneity of resources
  - In practice, resources in a large, distributed environment will be heterogeneous
  - STRATEGY - Plan for diverse systems and use mechanisms to manage heterogeneity
- Can’t rely on trust among participants
  - Sites will not be willing to share their resources if they cannot trust clients from other sites
  - STRATEGY - Provide a security model that can express complicated social networks
  - STRATEGY - Use full disclosure when making requests (who is requesting, authorizing, and authenticating the request) and give service owners tools to enforce local policies.

13 Relation to Other Technologies
- Grid Computing has much in common with major industrial thrusts
  - Service-Oriented Architecture (SOA), Business-to-business, Peer-to-peer, Application Service Providers, Storage Service Providers, etc.
- Sharing issues not adequately addressed by existing technologies
  - Complicated requirements: “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q”
  - High performance: unique demands of advanced and high-performance systems
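The "complicated requirements" sentence on this slide combines two independent policies: community policy P and data policy Q. A minimal sketch of that idea follows, assuming a request is allowed only when every applicable policy agrees. The class, the function names, and the rules inside each policy are hypothetical and do not come from any Grid toolkit.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class JobRequest:
    """Hypothetical request: run program X at site Y using data at Z."""
    user: str
    program: str        # X
    site: str           # Y
    data_location: str  # Z


Policy = Callable[[JobRequest], bool]


def community_policy_p(request: JobRequest) -> bool:
    # Assumed rule: the community permits approved programs at member sites.
    approved_programs = {"climate_model"}
    member_sites = {"site-a", "site-b"}
    return request.program in approved_programs and request.site in member_sites


def data_policy_q(request: JobRequest) -> bool:
    # Assumed rule: the owner of the data at Z lists who may read it.
    readers_by_location = {"archive-z": {"alice"}}
    return request.user in readers_by_location.get(request.data_location, set())


def is_permitted(request: JobRequest, policies: Sequence[Policy]) -> bool:
    """A request proceeds only if every applicable policy allows it."""
    return all(policy(request) for policy in policies)


policies = [community_policy_p, data_policy_q]
allowed = is_permitted(
    JobRequest("alice", "climate_model", "site-a", "archive-z"), policies
)
denied = is_permitted(
    JobRequest("mallory", "climate_model", "site-a", "archive-z"), policies
)
print(allowed, denied)
```

Keeping each policy as a separate function mirrors the slide's point that the community and the data owner set their rules independently.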

14 Grid Infrastructure
- Provides distributed management
  - Of physical resources
  - Of software services
  - Of communities and their policies
- Unified treatment
  - Build on Web Services framework
  - Use Web Services Resource Framework (WS-RF), Web Services Notification (WS-Notification), etc. to represent and access state associated with a service
  - Common management abstractions & interfaces

15 Elements of the End-to-End Problem Include …
- Massively parallel petascale simulation
- High-performance parallel I/O
- Remote visualization
- High-speed reliable data movement
- Terascale local analysis
- Data access and analysis by external users
- Troubleshooting problems in end-to-end system
- Security
- Orchestration of these various activities
Slide courtesy of Ian Foster

Layered Grid Architecture

17 Layered Grid Architecture (By Analogy to Internet Architecture)
Grid layers, top to bottom:
- Application
- Collective: “Coordinating multiple resources” (ubiquitous infrastructure services, app-specific distributed services)
- Resource: “Sharing single resources” (negotiating access, controlling use)
- Connectivity: “Talking to things” (communication (Internet protocols) & security)
- Fabric: “Controlling things locally” (access to, & control of, resources)
Internet Protocol Architecture layers, top to bottom: Application, Transport, Internet, Link
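One way to hold the layer stack in code is an ordered enumeration. The sketch below is illustrative only: the names `GridLayer`, `LAYER_ROLE`, and `layers_below` are invented here, the role strings are the slide's own quoted phrases, and the bottom-to-top numbering (Fabric lowest, Application highest) follows the usual presentation of this architecture.

```python
from enum import IntEnum


class GridLayer(IntEnum):
    """Grid layers from the slide, numbered bottom (1) to top (5)."""
    FABRIC = 1
    CONNECTIVITY = 2
    RESOURCE = 3
    COLLECTIVE = 4
    APPLICATION = 5


# Quoted one-line roles from the slide; the Application layer has none.
LAYER_ROLE = {
    GridLayer.FABRIC: "Controlling things locally",
    GridLayer.CONNECTIVITY: "Talking to things",
    GridLayer.RESOURCE: "Sharing single resources",
    GridLayer.COLLECTIVE: "Coordinating multiple resources",
}


def layers_below(layer: GridLayer) -> list:
    """Return the layers beneath the given one, lowest first."""
    return [lower for lower in GridLayer if lower < layer]


# Print the stack top to bottom, as it is drawn on the slide.
for layer in sorted(GridLayer, reverse=True):
    print(f"{layer.name.title():<13} {LAYER_ROLE.get(layer, '')}")
```

The ordering is only a conceptual aid; nothing in the sketch restricts which layer may call which.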

18 Protocols, Services, and APIs Occur at Each Level
Diagram labels:
- Applications
- Languages/Frameworks
- Collective Service APIs and SDKs; Collective Service Protocols; Collective Services
- Resource APIs and SDKs; Resource Service Protocols; Resource Services
- Connectivity APIs; Connectivity Protocols
- Local Access APIs and Protocols
- Fabric Layer

19 Important Points
- Built on Internet protocols & services
  - Communication, routing, name resolution, etc.
- “Layering” here is conceptual, does not imply constraints on who can call what
  - Protocols/services/APIs/SDKs will, ideally, be largely self-contained
  - Some things are fundamental: e.g., communication and security
  - But, advantageous for higher-level functions to use common lower-level functions

20 The Hourglass Model
- Focus on architecture issues
  - Propose set of core services as basic infrastructure
  - Use to construct high-level, domain-specific solutions
- Design principles
  - Keep participation cost low
  - Enable local control
  - Support for adaptation
  - “IP hourglass” model
Figure labels: Applications; Diverse global services; Core services; Local OS

21 Layered Grid Architecture (By Analogy to Internet Architecture)
Grid layers, top to bottom:
- Application
- Collective: “Coordinating multiple resources” (ubiquitous infrastructure services, app-specific distributed services)
- Resource: “Sharing single resources” (negotiating access, controlling use)
- Connectivity: “Talking to things” (communication (Internet protocols) & security)
- Fabric: “Controlling things locally” (access to, & control of, resources)
Internet Protocol Architecture layers, top to bottom: Application, Transport, Internet, Link

22 GSI: Connectivity Layer Protocols & Services
- Communication protocols
  - Internet protocols: IP, DNS, routing, etc.
- Security protocols and infrastructure
  - Uniform authentication, authorization, and message protection mechanisms in multi-institutional setting
  - Single sign-on, delegation, identity mapping
  - E.g., Public key technology, SSL, X.509, GSS-API
  - Supporting infrastructure: Certificate Authorities, certificate & key management, …
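Of the security functions listed here, identity mapping is the simplest to show in code: a site translates an already-authenticated global identity (for example, an X.509 certificate subject name) into a local account. The sketch below assumes authentication has happened elsewhere. The table contents and function names are hypothetical and are not the Globus Toolkit API.

```python
from typing import Dict, Optional

# Hypothetical table: certificate subject name -> local account at this site.
LOCAL_ACCOUNTS: Dict[str, str] = {
    "/O=ExampleGrid/OU=Site A/CN=Alice Example": "alice",
    "/O=ExampleGrid/OU=Site B/CN=Bob Example": "bexample",
}


def map_identity(
    subject_name: str, table: Dict[str, str] = LOCAL_ACCOUNTS
) -> Optional[str]:
    """Return the local account for a global identity, or None if unmapped."""
    return table.get(subject_name)


def local_account_or_refuse(
    subject_name: str, table: Dict[str, str] = LOCAL_ACCOUNTS
) -> str:
    """Enforce the local rule that unmapped identities get no access."""
    account = map_identity(subject_name, table)
    if account is None:
        raise PermissionError(f"no local account mapped for {subject_name!r}")
    return account


print(local_account_or_refuse("/O=ExampleGrid/OU=Site A/CN=Alice Example"))
```

Because the table lives at the site, the resource owner keeps control over who is admitted, which matches the emphasis on local policy elsewhere in the deck.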

23 Resource Layer Protocols & Services
- Job submission and management tools
  - Remote allocation, advance reservation, control of compute resources
- Data Transport Tools
  - High-performance data access & transport
- Information Provider
  - Collects information about the current state of a resource, makes available to higher-level service

24 Collective Layer Protocols & Services
- Information Services
  - Aggregate and publish information about resource characteristics
  - Monitor current status of resources
- Resource brokers
  - Resource discovery and allocation
- Metadata and Replica Catalogs
- Data Management Services (e.g., replication)
- Co-reservation and co-allocation services
- Workflow management services
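Two of these collective services, a resource broker and a replica catalog, can be sketched in a few lines. This is a toy illustration under stated assumptions: the broker simply picks the online resource with the most free CPUs, the catalog is a plain dictionary from a logical file name to physical copies, and every name and value below is made up.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ResourceStatus:
    """Hypothetical record published by an information service."""
    name: str
    free_cpus: int
    online: bool


def discover(published: List[ResourceStatus], cpus_needed: int) -> List[ResourceStatus]:
    """Resource discovery: keep online resources with enough free CPUs."""
    return [r for r in published if r.online and r.free_cpus >= cpus_needed]


def allocate(published: List[ResourceStatus], cpus_needed: int) -> Optional[ResourceStatus]:
    """Resource allocation: assumed rule is 'most free CPUs wins'."""
    candidates = discover(published, cpus_needed)
    return max(candidates, key=lambda r: r.free_cpus, default=None)


# Hypothetical replica catalog: logical file name -> physical copies.
REPLICA_CATALOG: Dict[str, List[str]] = {
    "climate/run42.nc": ["site-a:/archive/run42.nc", "site-c:/cache/run42.nc"],
}


def replicas_at(logical_name: str, site: str,
                catalog: Dict[str, List[str]] = REPLICA_CATALOG) -> List[str]:
    """Return the copies of a logical file that are stored at the given site."""
    return [loc for loc in catalog.get(logical_name, []) if loc.startswith(site + ":")]


published = [
    ResourceStatus("site-a", 16, True),
    ResourceStatus("site-b", 64, False),
    ResourceStatus("site-c", 32, True),
]
chosen = allocate(published, cpus_needed=8)
if chosen is not None:
    print(chosen.name, replicas_at("climate/run42.nc", chosen.name))
```

With this sample data the broker skips the offline site-b and selects site-c, which already holds a copy of the file. A real broker would weigh many more factors, such as policy, queue length, and network cost.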

25 Example: High-Throughput Computing System
- App: High Throughput Computing System
- Collective (App): Dynamic checkpoint, job management, failover, staging
- Collective (Generic): Brokering, certificate authorities
- Resource: Access to data, access to computers, access to network performance data
- Connect: Communication, service discovery (DNS), authentication, authorization, delegation
- Fabric: Storage systems, schedulers

26 Example: Grid Services for Data-Intensive Applications
- App: Discipline-Specific Data Grid Application
- Collective (App): Coherency control, replica selection, task management, data placement services, …
- Collective (Generic): Replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs, …
- Resource: Access to data, access to computers, access to network performance data, …
- Connect: Communication, service discovery (DNS), authentication, authorization, delegation
- Fabric: Storage systems, clusters, networks, network caches, …