Published by Cory Owens. Modified over 9 years ago.
1
Introduction to Grid Computing
Ann Chervenak and Ewa Deelman
USC Information Sciences Institute
2
Outline
- Motivation
- Definition and characteristics of Grids
- Example Grid applications
- Grid Architecture
- How a Grid Is Assembled
- Overview of the Globus Toolkit
  - Security Tools
  - Monitoring and Discovery System
  - Computing/Execution Tools
  - Data Tools
- A more detailed example: The Earth System Grid
3
Motivation: Supporting Scientific Applications
- Computation intensive
  - Large-scale simulation and analysis (climate modeling, galaxy formation, gravity waves, event simulation)
  - Engineering (parameter studies, linked models)
- Data intensive
  - Experimental data analysis (high energy physics)
  - Image & sensor analysis (astronomy, climate)
- Distributed collaboration
  - Online instrumentation (microscopes, x-ray)
  - Remote visualization (climate studies, biology)
  - Engineering (large-scale structural testing)
- Large, complex scientific problems
  - Require people in several organizations to collaborate
  - Share computing resources, data, instruments
4
The Grid Problem
- Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources (From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”)
- Enable communities (“Virtual Organizations”) to share geographically distributed resources as they pursue common goals
- Assuming the absence of…
  - central location
  - central control
  - omniscience
  - existing trust relationships
5
An Old Idea …
- “The time-sharing computer system can unite a group of investigators …. one can conceive of such a facility as an … intellectual public utility.”
  - Fernando Corbato and Robert Fano, 1966
- “We will perhaps see the spread of ‘computer utilities’, which, like present electric and telephone utilities, will service individual homes and offices across the country.”
  - Len Kleinrock, 1967
6
A Few Grid Application Examples
7
Earth System Grid objectives
To support the infrastructural needs of the national and international climate community, ESG is providing crucial technology to securely access, monitor, catalog, transport, and distribute data in today’s Grid computing environment.
[Figure labels: HPC hardware running climate models; ESG Sites; ESG Portal]
Slide Courtesy of Dave Bernholdt, ORNL
8
ESG Facts and Figures
ESG Portal at NCAR
- 130 TB of data at four locations
  - 840,331 files
  - Includes the past 6 years of joint DOE/NSF climate modeling experiments
- 3,200 registered users
- Downloads to date
  - 25 TB
  - 91,000 files
IPCC AR4 ESG Portal
- 28 TB of data at one location
  - 68,400 files
  - Generated by a modeling campaign coordinated by the Intergovernmental Panel on Climate Change
  - Model data from 11 countries
- 818 registered analysis projects
- Downloads to date
  - 123 TB
  - 543,500 files
  - 300 GB/day (average)
- 300 scientific papers published to date based on analysis of IPCC AR4 data
[Figure panels: Worldwide ESG user base; IPCC Downloads, Nov 2004 – Oct 2006 (10/12/06)]
Slide Courtesy of Dave Bernholdt, ORNL
9
NSF’s TeraGrid *
A National Science Foundation Investment in Cyberinfrastructure
$100M 3-year construction (2001-2004)
$150M 5-year operation & enhancement (2005-2009)
[Figure: site map; labels: UCSD, UT, UC/ANL, NCSA, PSC, ORNL, PU, IU]
- TeraGrid DEEP: Integrating NSF’s most powerful computers (60+ TF)
  - 2+ PB Online Data Storage
  - National data visualization facilities
  - World’s most powerful network (national footprint)
- TeraGrid WIDE Science Gateways: Engaging Scientific Communities
  - 90+ Community Data Collections
  - Growing set of community partnerships spanning the science community
  - Leveraging NSF ITR, NIH, DOE and other science community projects
  - Engaging peer Grid projects such as Open Science Grid in the U.S. as well as peer Grids in Europe and Asia-Pacific
- Base TeraGrid Cyberinfrastructure: Persistent, Reliable, National
  - Coordinated distributed computing and information environment
  - Coherent User Outreach, Training, and Support
  - Common, open infrastructure services
* Slide courtesy of Ray Bair, Argonne National Laboratory
10
Data Grids for High Energy Physics
[Figure: tiered data grid diagram. Image courtesy Harvey Newman, Caltech]
Figure labels:
- Tiers: Tier 0, Tier 1, Tier 2, Tier 4
- Sites: Online System; Offline Processor Farm (~20 TIPS); CERN Computer Centre; FermiLab (~4 TIPS); France Regional Centre; Italy Regional Centre; Germany Regional Centre; Tier2 Centres (~1 TIPS); Caltech (~1 TIPS); Institutes (~0.25 TIPS); Physics data cache; Physicist workstations
- Link rates: ~PBytes/sec; ~100 MBytes/sec; ~622 Mbits/sec or Air Freight (deprecated); ~622 Mbits/sec; ~1 MBytes/sec
Notes:
- There is a “bunch crossing” every 25 nsecs.
- There are 100 “triggers” per second.
- Each triggered event is ~1 MByte in size.
- Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
- 1 TIPS is approximately 25,000 SpecInt95 equivalents.
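The trigger figures quoted on this slide can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope calculation; the function names are illustrative and not part of any Grid software. It uses the slide's numbers (one bunch crossing every 25 ns, 100 triggers per second, ~1 MByte per event).

```python
# Back-of-the-envelope check of the trigger numbers quoted on the slide.
# All function names are illustrative assumptions, not an existing API.

NANOSECONDS_PER_SECOND = 1_000_000_000


def crossings_per_second(interval_ns: int) -> float:
    """Bunch crossings per second, given the interval between crossings in ns."""
    return NANOSECONDS_PER_SECOND / interval_ns


def triggered_data_rate_mbytes_per_s(triggers_per_s: int, event_mbytes: float) -> float:
    """Data rate produced by triggered events, in MBytes per second."""
    return triggers_per_s * event_mbytes


def fraction_of_crossings_kept(triggers_per_s: int, interval_ns: int) -> float:
    """Fraction of bunch crossings that end up recorded as triggered events."""
    return triggers_per_s / crossings_per_second(interval_ns)
```

With the slide's values, `crossings_per_second(25)` gives 40 million crossings per second and `triggered_data_rate_mbytes_per_s(100, 1.0)` gives 100 MBytes/sec, which is consistent with the ~100 MBytes/sec link rate shown in the figure. Only about 2.5 in every million crossings is kept.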
11
Elements of a Grid
- Resource sharing
  - Computers, storage systems, sensors, networks, …
  - This sharing is always conditional: issues of trust, policy, negotiation, payment, etc.
- Coordinated problem solving
  - Distributed data analysis, computation, simulation, collaboration, …
- Dynamic, multi-institutional virtual organizations
  - Community overlays on classic organizational structures
  - May be large or small, static or dynamic
12
Two Rules or Principles of the Grid
- Can’t rely on homogeneity of resources
  - In practice, resources in a large, distributed environment will be heterogeneous
  - STRATEGY: Plan for diverse systems and use mechanisms to manage heterogeneity
- Can’t rely on trust among participants
  - Sites will not be willing to share their resources if they cannot trust clients from other sites
  - STRATEGY: Provide a security model that can express complicated social networks
  - STRATEGY: Use full disclosure when making requests (who is requesting, authorizing, and authenticating the request) and give service owners tools to enforce local policies
13
Relation to Other Technologies
- Grid Computing has much in common with major industrial thrusts
  - Service-Oriented Architecture (SOA), Business-to-business, Peer-to-peer, Application Service Providers, Storage Service Providers, etc.
- Sharing issues not adequately addressed by existing technologies
  - Complicated requirements: “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q”
  - High performance: unique demands of advanced and high-performance systems
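The "complicated requirements" example on this slide (run program X at site Y under community policy P, with data at Z under policy Q) can be pictured as a request that must satisfy several independent policies at once. The following is a minimal sketch under that reading. The `Request` fields, both policy functions, and the allowed users and programs are hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Request:
    """A hypothetical job request: who wants to run what, where, on which data."""
    user: str
    program: str       # "program X"
    compute_site: str  # "site Y"
    data_site: str     # "data at Z"


# A policy is modelled as any function that accepts or rejects a request.
Policy = Callable[[Request], bool]


def community_policy_p(req: Request) -> bool:
    """Hypothetical community policy P: only approved programs may run."""
    return req.program in {"climate_model"}


def data_policy_q(req: Request) -> bool:
    """Hypothetical data policy Q: only listed users may read data at site Z."""
    return req.data_site != "siteZ" or req.user in {"alice"}


def is_allowed(req: Request, policies: Iterable[Policy]) -> bool:
    """A request is allowed only if every applicable policy accepts it."""
    return all(policy(req) for policy in policies)
```

For example, `is_allowed(Request("alice", "climate_model", "siteY", "siteZ"), [community_policy_p, data_policy_q])` is `True`, while the same request from a user who is not listed in policy Q is rejected even though policy P accepts it.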
14
Grid Infrastructure
- Provides distributed management
  - Of physical resources
  - Of software services
  - Of communities and their policies
- Unified treatment
  - Build on Web Services framework
  - Use Web Services Resource Framework (WS-RF), Web Services Notification (WS-Notification), etc. to represent and access state associated with a service
  - Common management abstractions & interfaces
15
Elements of the End-to-End Problem Include …
- Massively parallel petascale simulation
- High-performance parallel I/O
- Remote visualization
- High-speed reliable data movement
- Terascale local analysis
- Data access and analysis by external users
- Troubleshooting problems in end-to-end system
- Security
- Orchestration of these various activities
Slide Courtesy of Ian Foster
16
Layered Grid Architecture
17
Layered Grid Architecture (By Analogy to Internet Architecture)
Grid layers, top to bottom:
- Application
- Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
- Resource: “Sharing single resources”: negotiating access, controlling use
- Connectivity: “Talking to things”: communication (Internet protocols) & security
- Fabric: “Controlling things locally”: Access to, & control of, resources
Internet Protocol Architecture, top to bottom:
- Application
- Transport
- Internet
- Link
18
Protocols, Services, and APIs Occur at Each Level
[Figure: layered stack diagram]
Figure labels:
- Applications
- Languages/Frameworks
- Collective Service APIs and SDKs
- Collective Service Protocols
- Collective Services
- Resource APIs and SDKs
- Resource Service Protocols
- Resource Services
- Connectivity APIs
- Connectivity Protocols
- Local Access APIs and Protocols
- Fabric Layer
19
Important Points
- Built on Internet protocols & services
  - Communication, routing, name resolution, etc.
- “Layering” here is conceptual, does not imply constraints on who can call what
  - Protocols/services/APIs/SDKs will, ideally, be largely self-contained
  - Some things are fundamental: e.g., communication and security
  - But, advantageous for higher-level functions to use common lower-level functions
20
The Hourglass Model
- Focus on architecture issues
  - Propose set of core services as basic infrastructure
  - Use to construct high-level, domain-specific solutions
- Design principles
  - Keep participation cost low
  - Enable local control
  - Support for adaptation
  - “IP hourglass” model
[Figure: hourglass diagram; labels, top to bottom: Applications; Diverse global services; Core services; Local OS]
21
Layered Grid Architecture (By Analogy to Internet Architecture)
Grid layers, top to bottom:
- Application
- Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
- Resource: “Sharing single resources”: negotiating access, controlling use
- Connectivity: “Talking to things”: communication (Internet protocols) & security
- Fabric: “Controlling things locally”: Access to, & control of, resources
Internet Protocol Architecture, top to bottom:
- Application
- Transport
- Internet
- Link
22
Connectivity Layer Protocols & Services
- Communication protocols
  - Internet protocols: IP, DNS, routing, etc.
- Security protocols and infrastructure
  - Uniform authentication, authorization, and message protection mechanisms in multi-institutional setting
  - Single sign-on, delegation, identity mapping
  - E.g., Public key technology, SSL, X.509, GSS-API
  - Supporting infrastructure: Certificate Authorities, certificate & key management, …
GSI: www.gridforum.org/security
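Of the security functions listed on this slide, identity mapping is the simplest to illustrate: a global identity (here, an X.509 distinguished name) is translated to a local account at each site. The sketch below is a hypothetical, simplified version. The file format is loosely modelled on a quoted-name-then-account map file. The distinguished names, account names, and function names are made up, and the code does not reproduce any particular toolkit's implementation.

```python
import shlex
from typing import Dict, Optional

# Hypothetical map: quoted distinguished name, then the local account name.
EXAMPLE_MAP = '''
# distinguished name                          local account
"/O=Example Grid/OU=People/CN=Alice Smith" asmith
"/O=Example Grid/OU=People/CN=Bob Jones" bjones
'''


def parse_identity_map(text: str) -> Dict[str, str]:
    """Parse lines of the form: "<distinguished name>" <local account>."""
    mapping: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the name
        if len(parts) != 2:
            raise ValueError(f"malformed map entry: {raw_line!r}")
        distinguished_name, account = parts
        mapping[distinguished_name] = account
    return mapping


def local_account_for(distinguished_name: str, mapping: Dict[str, str]) -> Optional[str]:
    """Return the local account for an authenticated identity, or None if unmapped."""
    return mapping.get(distinguished_name)
```

An identity with no entry maps to `None`, which a site could treat as "authenticated but not authorized here". This keeps the decision about who may use local accounts under local control.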
23
Resource Layer Protocols & Services
- Job submission and management tools
  - Remote allocation, advance reservation, control of compute resources
- Data Transport Tools
  - High-performance data access & transport
- Information Provider
  - Collects information about the current state of a resource, makes available to higher-level service
24
Collective Layer Protocols & Services
- Information Services
  - Aggregate and publish information about resource characteristics
  - Monitor current status of resources
- Resource brokers
  - Resource discovery and allocation
- Metadata and Replica Catalogs
- Data Management Services (e.g., replication)
- Co-reservation and co-allocation services
- Workflow management services
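To make the relationship between information services and resource brokers concrete, here is a minimal sketch of broker-style matchmaking. It assumes the information service has already published a list of resource descriptions. The `ResourceInfo` fields and the "most free CPUs" ranking rule are illustrative assumptions, not a description of any real broker.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ResourceInfo:
    """Hypothetical record published by an information service for one resource."""
    name: str
    free_cpus: int
    free_disk_gb: int
    online: bool


def select_resource(
    resources: Iterable[ResourceInfo],
    cpus_needed: int,
    disk_gb_needed: int,
) -> Optional[ResourceInfo]:
    """Discover resources that satisfy the request, then pick one to allocate.

    Discovery: keep resources that are online and have enough free capacity.
    Allocation choice: prefer the candidate with the most free CPUs
    (an arbitrary ranking chosen for this sketch).
    """
    candidates = [
        r for r in resources
        if r.online and r.free_cpus >= cpus_needed and r.free_disk_gb >= disk_gb_needed
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.free_cpus)
```

A real broker would also have to respect the policy and trust constraints discussed earlier. This sketch covers only the discovery and selection step.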
25
Example: High-Throughput Computing System
- App: High Throughput Computing System
- Collective (App): Dynamic checkpoint, job management, failover, staging
- Collective (Generic): Brokering, certificate authorities
- Resource: Access to data, access to computers, access to network performance data
- Connect: Communication, service discovery (DNS), authentication, authorization, delegation
- Fabric: Storage systems, schedulers
26
Example: Grid Services for Data-Intensive Applications
- App: Discipline-Specific Data Grid Application
- Collective (App): Coherency control, replica selection, task management, data placement services, …
- Collective (Generic): Replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs, …
- Resource: Access to data, access to computers, access to network performance data, …
- Connect: Communication, service discovery (DNS), authentication, authorization, delegation
- Fabric: Storage systems, clusters, networks, network caches, …
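The replica catalog and replica selection services named on this slide can be pictured as a lookup from a logical file name to its physical copies, followed by a choice among those copies. The sketch below is a hypothetical in-memory version. The function names are invented for illustration, and the cost function is supplied by the caller rather than taken from any real monitoring service.

```python
from typing import Callable, Dict, List, Optional

# Logical file name -> physical locations holding a copy (hypothetical structure).
ReplicaCatalog = Dict[str, List[str]]


def register_replica(catalog: ReplicaCatalog, logical_name: str, physical_url: str) -> None:
    """Record that a copy of the logical file exists at the given location."""
    locations = catalog.setdefault(logical_name, [])
    if physical_url not in locations:
        locations.append(physical_url)


def select_replica(
    catalog: ReplicaCatalog,
    logical_name: str,
    estimated_cost: Callable[[str], float],
) -> Optional[str]:
    """Pick the replica with the lowest estimated access cost, or None if unknown.

    The cost function stands in for whatever network performance data the
    Resource layer exposes; here it is simply passed in by the caller.
    """
    locations = catalog.get(logical_name, [])
    if not locations:
        return None
    return min(locations, key=estimated_cost)
```

A short usage example with made-up locations:

```python
catalog: ReplicaCatalog = {}
register_replica(catalog, "run1.nc", "gsiftp://site-a.example.org/data/run1.nc")
register_replica(catalog, "run1.nc", "gsiftp://site-b.example.org/data/run1.nc")

# Hypothetical per-location transfer-time estimates, in seconds.
costs = {
    "gsiftp://site-a.example.org/data/run1.nc": 120.0,
    "gsiftp://site-b.example.org/data/run1.nc": 45.0,
}
best = select_replica(catalog, "run1.nc", lambda url: costs[url])
```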