Grid Computing and the Open Grid Service Architecture Ian Foster Argonne National Laboratory University of Chicago 2nd IEEE Intl Symp. on Network Computing & Applications, Boston, April 17, 2003
2 ARGONNE CHICAGO Abstract In both e-business and e-science, we often need to integrate services across distributed, heterogeneous, dynamic "virtual organizations“ formed from the disparate resources within a single enterprise and/or via external resource sharing relationships. This integration can be technically challenging due to the need to achieve various qualities of service in heterogeneous environments. I introduce this "Grid opportunity," discuss the origins and applications of Grid technologies in the world of science, and present recent work on an Open Grid Services Architecture that seeks to generalize Grid computing concepts to create a powerful framework for distributed resource sharing and management.
3 ARGONNE CHICAGO Partial Acknowledgements l Open Grid Services Architecture design –Carl Kesselman, Karl USC/ISI –Steve –Jeff Nick, Steve Graham, Jeff IBM l Grid services collaborators at ANL –Kate Keahey, Gregor von Laszewski –Thomas Sandholm, Jarek Gawor, John Bresnahan l Globus Toolkit R&D also involves many fine scientists & engineers at ANL, USC/ISI, and elsewhere (see l Strong links with many EU, UK, US Grid projects l Support from DOE, NASA, NSF, IBM, Microsoft
4 ARGONNE CHICAGO Overview l Grid: why and what l Evolution of Grid technology –Open Grid Services Architecture l Future directions –Towards lightweight VOs: dynamic trust relationships –Towards global knowledge communities: virtual data and dynamic workspaces
5 ARGONNE CHICAGO Why the Grid? (1) Revolution in Science l Pre-Internet –Theorize &/or experiment, alone or in small teams; publish paper l Post-Internet –Construct and mine large databases of observational or simulation data –Develop simulations & analyses –Access specialized devices remotely –Exchange information within distributed multidisciplinary teams
6 ARGONNE CHICAGO Why the Grid? (2) Revolution in Business l Pre-Internet –Central data processing facility l Post-Internet –Enterprise computing is highly distributed, heterogeneous, inter-enterprise (B2B) –Business processes increasingly computing- & data-rich –Outsourcing becomes feasible => service providers of various sorts
7 ARGONNE CHICAGO New Opportunities Demand New Technology “ Resource sharing & coordinated problem solving in dynamic, multi- institutional virtual organizations” “When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder)
8 ARGONNE CHICAGO Grid Communities & Technologies l Yesterday –Small, static communities, primarily in science –Focus on sharing of computing resources –Globus Toolkit as technology base l Today –Larger communities in science; early industry –Focused on sharing of data and computing –Open Grid Services Architecture emerging l Tomorrow –Large, dynamic, diverse communities that share a wide variety of services, resources, data –New issues: Trust, distributed RM, knowledge
9 ARGONNE CHICAGO NSF TeraGrid l NCSA, SDSC, Argonne, Caltech l Unprecedented capability –13.6 trillion flop/s –600 terabytes of data –40 gigabits per second –Accessible to thousands of scientists working on advanced research l
10 ARGONNE CHICAGO
11 ARGONNE CHICAGO Data Grids for High Energy Physics l Enable international community of 1000s to access & analyze petabytes of data l Harness computing & storage worldwide l Virtual data concepts: manage programs, data, workflow l Distributed system management
12 ARGONNE CHICAGO NEESgrid Earthquake Engineering Collaboratory U.Nevada Reno
13 ARGONNE CHICAGO Grid Computing By M. Mitchell Waldrop May 2002 Hook enough computers together and what do you get? A new kind of utility that offers supercomputer processing on tap. Is Internet history about to repeat itself?
14 ARGONNE CHICAGO “Gridified” Infrastructure Industrial Perspective on Grids: A Wide Range of Applications Financial Services Derivatives Analysis Statistical Analysis Portfolio Risk Analysis Derivatives Analysis Statistical Analysis Portfolio Risk Analysis Manufacturing Mechanical/ Electronic Design Process Simulation Finite Element Analysis Failure Analysis Mechanical/ Electronic Design Process Simulation Finite Element Analysis Failure Analysis LS / Bioinformatics Cancer Research Drug Discovery Protein Folding Protein Sequencing Cancer Research Drug Discovery Protein Folding Protein Sequencing Other Web Applications Weather Analysis Code Breaking/ Simulation Academic Web Applications Weather Analysis Code Breaking/ Simulation Academic Sources: IDC, 2000 and Bear Stearns- Internet /01 Analysis by SAI Grid Services Market Opportunity 2005 Unique by Industry with Common Characteristics Energy Seismic Analysis Reservoir Analysis Seismic Analysis Reservoir Analysis Entertainment Digital Rendering Massive Multi-Player Games Massive Multi-Player Games Streaming Media
15 ARGONNE CHICAGO Overview l Grid: why and what l Evolution of Grid technology –Open Grid Services Architecture l Future directions –Towards lightweight VOs: dynamic trust relationships –Towards global knowledge communities: virtual data and dynamic workspaces
16 ARGONNE CHICAGO Open Grid Services Architecture l Service-oriented architecture –Key to virtualization, discovery, composition, local-remote transparency l Leverage industry standards –Internet, Web services l Distributed service management –A “component model for Web services” l A framework for the definition of composable, interoperable services “The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, Foster, Kesselman, Nick, Tuecke, 2002
17 ARGONNE CHICAGO Web Services l XML-based distributed computing technology l Web service = a server process that exposes typed ports to the network l Described by the Web Services Description Language, an XML document that contains –Type of message(s) the service understands & types of responses & exceptions it returns –“Methods” bound together as “port types” –Port types bound to protocols as “ports” l A WSDL document completely defines a service and how to access it
18 ARGONNE CHICAGO OGSA Structure l A standard substrate: the Grid service –Standard interfaces and behaviors that address key distributed system issues –A refactoring and extension of the Globus Toolkit protocol suite l … supports standard service specifications –Resource management, databases, workflow, security, diagnostics, etc., etc. –Target of current & planned GGF efforts l … and arbitrary application-specific services based on these & other definitions
19 ARGONNE CHICAGO Open Grid Services Infrastructure Implementation Service data element Other standard interfaces: factory, notification, collections Hosting environment/runtime (“C”, J2EE,.NET, …) Service data element Service data element GridService (required) Data access Lifetime management Explicit destruction Soft-state lifetime Introspection: What port types? What policy? What state? Client Grid Service Handle Grid Service Reference handle resolution
20 ARGONNE CHICAGO Open Grid Services Infrastructure GWD-R (draft-ggf-ogsi- gridservice-23) Editors: Open Grid Services Infrastructure (OGSI) S. Tuecke, ANL K. Czajkowski, USC/ISI I. Foster, ANL J. Frey, IBM S. Graham, IBM C. Kesselman, USC/ISI D. Snelling, Fujitsu Labs P. Vanderbilt, NASA February 17, 2003 Open Grid Services Infrastructure (OGSI)
21 ARGONNE CHICAGO Example: Reliable File Transfer Service Performance Policy Faults service data elements Pending File Transfer Internal State Grid Service Notf’n Source Policy interfaces Query &/or subscribe to service data Fault Monitor Perf. Monitor Client Request and manage file transfer operations Data transfer operations
22 ARGONNE CHICAGO Open Grid Service Architecture: Next Steps Technical specifications –Open Grid Services Infrastructure is complete –Security, data access, Java binding, common resource models, etc., etc., in the pipeline Implementations and compliant products –Here: OGSA-based Globus Toolkit v3, … –Announced: IBM, Avaki, Platform, Sun, NEC, HP, Oracle, UD, Entropia, Insors, …, … Rich set of service defns & implementations
23 ARGONNE CHICAGO Globus Toolkit v3 (GT3) Open Source OGSA Technology l Implements OGSI interfaces l Supports primary GT2 interfaces –High degree of backward compatibility l Multiple platforms & hosting environments –J2EE, Java, C,.NET, Python l New services –SLA negotiation, service registry, community authorization, data management, … l Rapidly growing adoption and contributions: “Linux for the Grid”
24 ARGONNE CHICAGO Overview l Grid: why and what l Evolution of Grid technology –Open Grid Services Architecture l Future directions –Towards lightweight VOs: dynamic trust relationships –Towards global knowledge communities: virtual data and dynamic workspaces
25 ARGONNE CHICAGO Future Directions l Grids are about computers, certainly –“On-demand” access to computing, etc. –Challenging future issues here: e.g., scale
26 ARGONNE CHICAGO CMS Event Simulation Production l Production Run on the Integration Testbed –Simulate 1.5 million full CMS events for physics studies: ~500 sec per event on 850 MHz processor –2 months continuous running across 5 testbed sites –Managed by a single person at the US-CMS Tier 1
27 ARGONNE CHICAGO CMS Event Simulation Production l Production Run on the Integration Testbed –Simulate 1.5 million full CMS events for physics studies: ~500 sec per event on 850 MHz processor –2 months continuous running across 5 testbed sites –Managed by a single person at the US-CMS Tier Million Events Delivered to CMS Physicists! (nearly 30 CPU years)
28 ARGONNE CHICAGO Future Directions l Grids are about computers, certainly –“On-demand” access to computing, etc. –Challenging future issues here: e.g., scale l But they are ultimately about people, their activities, and their interactions –New interaction modalities supported by on- demand formation of lightweight VOs –New technologies needed: e.g., trust, security, data and knowledge integration l Convergence of interest between “Compute” and “Collaboration” Grids?
29 ARGONNE CHICAGO Global Knowledge Communities
30 ARGONNE CHICAGO Example Issue: Trust and Security l Effective VO operation depends critically on –Trust: can I rely on you? –Protection mechanisms to govern actions l Suffers from VO-organization policy mismatch l Goal: collaborations no longer defined by slow centralized mechanisms but can –form spontaneously; –be managed in a distributed manner; and –be protected by an infrastructure that maintains and enforces trust relationships
31 ARGONNE CHICAGO Grid Security Services Requestor Application VO Domain Credential Validation Service Authorization Service Requestor's Domain Service Provider's Domain Audit/ Secure-Logging Service Attribute Service Trust Service Provider Application Bridge/ Translation Service Privacy Service Credential Validation Service Authorization Service Audit/ Secure-Logging Service Attribute Service Trust Service Privacy Service Credential Validation Service Authorization Service Attribute Service Trust Service Credential Validation Service Authorization Service Attribute Service Trust Service WS-Stub Secure Conversation
32 ARGONNE CHICAGO trust ( Tr, Te, As, L ) <- Cs; recommend ( Rr, Re, As, L ) <- Cs; VOTA, PKI, VPN, etc. Establishment, enhancement, maintenance, verification Feasibility analysis wrt cost, legality, etc. Social network analysis Other analyses Workflow analysis Risk analysis allowed (S, O, A, C) Factoring wrt environment Monitoring for reputation, compliance, intrusion detection, etc. Mechanism Policy Trust Community Usability analysis Understanding and Enhancing VO Trust and Security
33 ARGONNE CHICAGO Virtual Data for Collaborative Science l Much collaboration is concerned with the development & use of knowledge, whether –Programs for data analysis and generation –Computations involving those programs –Metadata concerning data, programs, computations—and their interrelationships l In a distributed, heterogeneous, fractal (?) environment with widely varying –Data and analysis program formats –Degrees of formality and scale –Scientific goals and sharing policies
34 ARGONNE CHICAGO Sloan Digital Sky Survey Production System
35 ARGONNE CHICAGO Virtual Data Concept l Capture and manage information about relationships among –Data (of widely varying representations) –Programs (& their execution needs) –Computations (& execution environments) l Apply this information to, e.g. –Discovery: Data and program discovery –Workflow: Structured paradigm for organizing, locating, specifying, & requesting data –Explanation: provenance –Planning and scheduling –Other uses we haven’t thought of
36 ARGONNE CHICAGO TransformationDerivation Data created-by execution-of consumed-by/ generated-by “I’ve detected a calibration error in an instrument and want to know which derived data to recompute.” “I’ve come across some interesting data, but I need to understand the nature of the corrections applied when it was constructed before I can trust it for my purposes.” “I want to search an astronomical database for galaxies with certain characteristics. If a program that performs this analysis exists, I won’t have to write one from scratch.” “I want to apply an astronomical analysis program to millions of objects. If the results already exist, I’ll save weeks of computation.” Motivations
37 ARGONNE CHICAGO Galaxy cluster size distribution DAG Example: Sloan Galaxy Cluster Analysis Sloan Data Jim Annis, Steve Kent, Vijay Sehkri, Fermilab; Michael Milligan, Yong Zhao, Chicago
38 ARGONNE CHICAGO Integrating Provenance Data
39 ARGONNE CHICAGO Summary l Yesterday –Small, static communities, primarily in science –Focus on sharing of computing resources –Globus Toolkit as technology base l Today –Larger communities in science; early industry –Focused on sharing of data and computing –Open Grid Services Architecture emerging l Tomorrow –Large, dynamic, diverse communities that share a wide variety of services, resources, data –New issues: Trust, distributed RM, knowledge
40 ARGONNE CHICAGO l The Globus Project™ – l Technical articles – l Open Grid Services Arch. – l Chimera – l Global Grid Forum – For More Information