Ian Foster Computation Institute Argonne National Lab & University of Chicago Service-Oriented Science: Scaling eScience Impact.

1 Service-Oriented Science: Scaling eScience Impact
Ian Foster, Computation Institute, Argonne National Lab & University of Chicago

2 Acknowledgements
- Carl Kesselman, with whom I developed many ideas (& slides)
- Bill Allcock, Charlie Catlett, Kate Keahey, Jennifer Schopf, Frank Siebenlist, Mike Wilde @ ANL/UC
- Ann Chervenak, Ewa Deelman, Laura Pearlman @ USC/ISI
- Karl Czajkowski, Steve Tuecke @ Univa
- Numerous other fine colleagues in NESC, EGEE, OSG, TeraGrid, etc.
- NSF & DOE for research support

3 Context: System-Level Science
Problems too large and/or too complex to tackle alone …

4 Two Perspectives on System-Level Science
- System-level problems require integration
  - Of expertise
  - Of data sources ("data deluge")
  - Of component models
  - Of experimental modalities
  - Of computing systems
- The Internet enables decomposition
  - "When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special-purpose appliances" (George Gilder)

5 Integration & Decomposition: A Two-Dimensional Problem
- Decompose across the network
- Clients integrate dynamically
  - Select & compose services
  - Select "best of breed" providers
  - Publish results as new services
- Decouple resource & service providers
[Figure: users draw on function and resource layers spanning data archives, analysis tools, and discovery tools. Fig: S. G. Djorgovski]

6 A Unifying Concept: The Grid
"Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations"
1. Enable integration of distributed resources
2. Using general-purpose protocols & infrastructure
3. To deliver required quality of service
("The Anatomy of the Grid", Foster, Kesselman, Tuecke, 2001)

7 [Figure: Grid technology decomposes a system-level problem across facilities (computers, storage, networks), services, software, and people; experimental models at U. Colorado and UIUC are coordinated with a computational model at NCSA.]

8 Provisioning Service-Oriented Systems: Applications vs. Infrastructure
- Service-oriented applications
  - Wrap applications as services
  - Compose applications into workflows
- Service-oriented Grid infrastructure
  - Provision physical resources to support application workloads
[Figure: users invoke application services through composition and invocation of workflows, running atop provisioned infrastructure.]
("The Many Faces of IT as Service", ACM Queue, Foster, Tuecke, 2005)

9 Scaling eScience: Forming & Operating Communities
- Define membership & roles; enforce laws & community standards
  - I.e., policy for service-oriented architecture
  - Addressing dynamics of membership & policy
- Build, buy, operate, & share infrastructure
  - Decouple consumer & provider
  - For data, programs, services, computing, storage, instruments
  - Address dynamics of community demand

10 Defining Community: Membership and Laws
- Identify VO participants and roles
  - For people and services
- Specify and control actions of members
  - Empower members via delegation
  - Enforce restrictions by federating policy
[Figure: access granted by the community to a user is combined with site admission-control policies to yield the effective access policy of the site to the community.]
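The figure's idea, that a user's effective rights at a site are the intersection of what the community grants the user and what the site's admission control grants the community, can be sketched in a few lines. This is a minimal illustration; the rights names are hypothetical, not drawn from any real VO policy.

```python
# Sketch: effective access = (rights the VO grants the user)
# intersected with (rights the site grants the VO).
# All rights names here are hypothetical illustrations.

def effective_access(vo_grant_to_user, site_grant_to_vo):
    """Return the operations the user may actually perform at the site."""
    return vo_grant_to_user & site_grant_to_vo

vo_grant = {"read", "write", "submit-job"}          # community policy for this user
site_policy = {"read", "submit-job", "stage-data"}  # site admission control for the VO

print(sorted(effective_access(vo_grant, site_policy)))
# A user never gains a right at a site that the site has not granted the community.
```

The intersection captures the slide's key point: neither party can unilaterally expand access, so sites retain ultimate control while the VO manages its own membership.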

11 Evolution of Grid Security & Policy
1. Grid security infrastructure
  - Public key authentication & delegation
  - Access control lists ("gridmap" files)
  - Only a limited set of policies can be expressed
2. Utilities to simplify operational use, e.g.
  - MyProxy: online credential repository
  - VOMS, ACL/gridmap management
  - Broader set of policies, but still ad hoc
3. General, standards-based framework for authorization & attribute management

12 Core Security Mechanisms
- Attribute assertions
  - C asserts that S has attribute A with value V
- Authentication and digital signature
  - Allow the signer to assert attributes
- Delegation
  - C asserts that S can perform O on behalf of C
- Attribute mapping
  - {A1, A2, …, An}VO1 → {A'1, A'2, …, A'm}VO2
- Policy
  - An entity with attributes A asserted by C may perform operation O on resource R
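The four mechanisms above can be sketched in plain Python to make their interplay concrete. Real systems sign assertions cryptographically (e.g. SAML or X.509 attribute certificates); here a namedtuple stands in for a signed statement, and all issuer, subject, and attribute names are illustrative.

```python
# Minimal sketch of attribute assertion, attribute mapping, and policy
# evaluation. A namedtuple stands in for a cryptographically signed
# assertion; all names are hypothetical.
from collections import namedtuple

# C asserts that subject S has attribute A with value V
Assertion = namedtuple("Assertion", "issuer subject attribute value")

def map_attributes(attrs, mapping):
    """Attribute mapping: translate VO-1 attributes into VO-2 attributes."""
    return {mapping[a] for a in attrs if a in mapping}

def authorize(assertions, trusted_issuer, required_attr, value):
    """Policy: permit the operation only if a trusted issuer asserted
    the required attribute with the required value."""
    return any(a.issuer == trusted_issuer
               and a.attribute == required_attr
               and a.value == value
               for a in assertions)

asserted = [Assertion("VO-Admin", "userB", "role", "analyst")]
print(authorize(asserted, "VO-Admin", "role", "analyst"))   # trusted issuer
print(authorize(asserted, "Unknown", "role", "analyst"))    # untrusted issuer
print(map_attributes({"vo1:member"}, {"vo1:member": "vo2:guest"}))
```

Note that authorization hinges on who issued the assertion, not just its content, which is exactly why the digital signature mechanism on the slide matters.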

13 Security Services for VO Policy
- Attribute Authority (ATA)
  - Issues signed attribute assertions (incl. identity, delegation & mapping)
- Authorization Authority (AZA)
  - Makes decisions based on assertions & policy
[Figure: VO users A and B interact with a VO service via the VO's ATA and AZA.]

16 Security Services for VO Policy
- Attribute Authority (ATA)
  - Issues signed attribute assertions (incl. identity, delegation & mapping)
- Authorization Authority (AZA)
  - Makes decisions based on assertions & policy
[Figure: VO A's ATA issues VO-member attributes to users A and B, plus a delegation assertion ("User B can use Service A"); a Mapping ATA translates VO-A attributes into VO-B attributes so that VO B's service, holding a resource-admin attribute, can authorize the request via the AZA.]

17 Closing the Loop: GT4 Security Toolkit
- Users obtain VO rights from a CAS or VOMS server issuing SAML or X.509 attribute certificates
- Requests travel over SSL/WS-Security with proxy certificates
- Compute-center access services (running on the user's behalf) apply local policy based on the VO identity or attribute authority
- Authorization callout: SAML, XACML
- Related components: KCA, MyProxy, Shibboleth

18 Security Needn't Be Hard: Earth System Grid
- Purpose
  - Access to large data
- Policies
  - Per-collection control
  - Different user classes
- Implementation (GT)
  - Portal-based User Registration Service (PURSE), with optional review
  - PKI, SAML assertions
- Experience
  - >2000 users
  - >100 TB downloaded
www.earthsystemgrid.org
See also: GAMA (SDSC), Dorian (OSU)

19 Scaling eScience: Forming & Operating Communities
- Define membership & roles; enforce laws & community standards
  - I.e., policy for service-oriented architecture
  - Addressing dynamics of membership & policy
- Build, buy, operate, & share infrastructure
  - Decouple consumer & provider
  - For data, programs, services, computing, storage, instruments
  - Address dynamics of community demand

20 Bootstrapping a VO by Assembling Services
1. Integrate services from other sources
  - Virtualize external services as VO services
2. Coordinate & compose
  - Create new services from existing ones
[Figure: a community assembles content, services, and capacity from external providers.]
("Service-Oriented Science", Science, 2005)

21 Providing VO Services: (1) Integration from Other Sources
- Negotiate service-level agreements
- Delegate and deploy capabilities/services
- Provision to deliver defined capability
- Configure environment
- Host layered functions
[Figure: communities A through Z layered over shared providers.]

22 Virtualizing Existing Services into a VO
- Establish a service agreement with the service
  - E.g., WS-Agreement
- Delegate use to VO users
[Figure: a VO admin establishes agreements with existing services; VO users A and B then use them.]

23 Deploying New Services
The client drives an activity lifecycle through an interface exposed by the resource provider:
- Allocate/provision
- Configure
- Initiate activity
- Monitor activity
- Control activity
Realized via WSRF (or WS-Transfer/WS-Man, etc.), Globus GRAM, Virtual Workspaces.
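The lifecycle above can be sketched as a client-side driver. The `ResourceProvider` class and its method names are hypothetical illustrations of the pattern, not the Globus GRAM or WSRF API.

```python
# Sketch of the deploy-new-services lifecycle: allocate/provision,
# configure, initiate, monitor, control. The class and method names
# are hypothetical, standing in for a real provider interface.
class ResourceProvider:
    def allocate(self, spec):
        """Provision an execution environment per the request spec."""
        return {"id": "env-1", "spec": spec}

    def configure(self, env, cfg):
        """Prepare the provisioned environment (software, data, policy)."""
        env["config"] = cfg

    def initiate(self, env, task):
        """Start the activity in the configured environment."""
        env["activity"] = {"task": task, "state": "running"}
        return env["activity"]

    def monitor(self, activity):
        """Observe the activity's current state."""
        return activity["state"]

    def control(self, activity, op):
        """Apply a control operation, e.g. suspend or cancel."""
        activity["state"] = op

provider = ResourceProvider()
env = provider.allocate({"cpus": 4})
provider.configure(env, {"image": "analysis-v1"})
activity = provider.initiate(env, "run-model")
print(provider.monitor(activity))   # "running"
provider.control(activity, "cancelled")
```

The point of the uniform interface is that the same five verbs apply whether the "resource" is a physical machine, a virtual machine, or a hosted service container.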

24 Available in High-Quality Open Source Software: Globus Toolkit v4
- Data Mgmt: GridFTP, Reliable File Transfer, Data Access & Integration, Data Replication, Replica Location
- Security: Authentication & Authorization, Community Authorization, Delegation, Credential Mgmt
- Execution Mgmt: Grid Resource Allocation & Management, Community Scheduling Framework, Workspace Management, Grid Telecontrol Protocol
- Info Services: Index, Trigger, WebMDS
- Common Runtime: Java Runtime, C Runtime, Python Runtime
www.globus.org
(Globus Toolkit Version 4: Software for Service-Oriented Systems, LNCS 3779, 2-13, 2005)

25 http://dev.globus.org
- Guidelines (Apache-style)
- Infrastructure (CVS, email, bugzilla, Wiki)
- Projects include …

26 Virtual Workspaces (Kate Keahey et al.)
- GT4 service for the creation, monitoring, & management of virtual workspaces
- High-level workspace description
- Web Services interfaces for monitoring & managing
- Multiple implementations
  - Dynamic accounts
  - Xen virtual machines
  - (VMware virtual machines)
- Virtual clusters as a higher-level construct

27 How do Grids and VMs Play Together?
[Figure: a client request goes to a VM Factory, which either uses an existing VM image from the VM Repository or creates a new one; the VM Manager deploys or suspends the VM on a resource; the client receives the VM's EPR to inspect & manage it and to start programs.]

28 Virtual OSG Clusters
[Figure: an OSG cluster runs as Xen hypervisor-hosted virtual machines on a TeraGrid cluster.]
("Virtual Clusters for Grid Communities," Zhang et al., CCGrid 2006)

29 Dynamic Service Deployment (Argonne + China Grid)
- Interface
  - Upload-push
  - Upload-pull
  - Deploy
  - Undeploy
  - Reload
("HAND: Highly Available Dynamic Deployment Infrastructure for GT4," Li Qi et al., 2006)

30 Providing VO Services: (2) Coordination & Composition
- Take a set of provisioned services …
- … & compose them to synthesize new behaviors
- This is traditional service composition
  - But must also be concerned with emergent behaviors and autonomous interactions
  - See the work of the agent & PlanetLab communities
("Brain vs. Brawn: Why Grids and Agents Need Each Other," Foster, Kesselman, Jennings, 2004)
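The "compose provisioned services to synthesize new behaviors" idea can be sketched as function composition, with each service modeled as a function whose output feeds the next. The service names (locate, transfer, analyze) are illustrative, not real Globus components.

```python
# Sketch of service composition: chain provisioned services into a
# pipeline whose behavior none of them provides alone. The three
# services here are hypothetical stand-ins.
from functools import reduce

def compose(*services):
    """Chain services so each one's output feeds the next."""
    return lambda data: reduce(lambda acc, svc: svc(acc), services, data)

def locate(dataset):
    """Discovery service: find replicas of a dataset."""
    return [f"{dataset}@site{i}" for i in (1, 2)]

def transfer(replicas):
    """Movement service: fetch the first available replica."""
    return f"local-copy-of-{replicas[0]}"

def analyze(local_file):
    """Analysis service: run a computation over the local copy."""
    return f"results({local_file})"

pipeline = compose(locate, transfer, analyze)
print(pipeline("survey-data"))  # results(local-copy-of-survey-data@site1)
```

The slide's caveat applies: real compositions are not this clean, because autonomous services can interact in ways the composer did not plan for.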

31 The Globus-Based LIGO Data Grid
- Replicating >1 terabyte/day to 8 sites
- >40 million replicas so far
- MTBF = 1 month
[Figure: map of LIGO Data Grid sites, including Birmingham, Cardiff, and AEI/Golm.]
LIGO Gravitational Wave Observatory
www.globus.org/solutions

34 Data Replication Service
- Pull "missing" files to a storage system
[Figure: the Data Replication Service takes a list of required files, consults the Replica Location Index and Local Replica Catalogs for data location, and drives the Reliable File Transfer Service over GridFTP for data movement.]
("Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System," Chervenak et al., 2005)
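The pull-missing-files pattern can be sketched in a few lines: diff the required file list against the local catalog, look up remote locations for the gaps, and queue transfers. The catalog contents and file names below are illustrative stand-ins for what RLS, RFT, and GridFTP would handle in a real deployment.

```python
# Sketch of the Data Replication Service pattern: find files that are
# required but absent from the local replica catalog, then plan
# transfers from known remote replicas. All data here is illustrative.
def missing_files(required, local_catalog):
    """Files on the required list that the local catalog lacks."""
    return [f for f in required if f not in local_catalog]

def plan_transfers(missing, replica_index):
    """Pair each missing file with its first known remote replica.
    replica_index maps logical file name -> list of replica URLs."""
    return [(f, replica_index[f][0]) for f in missing if f in replica_index]

required = ["frame-001.gwf", "frame-002.gwf", "frame-003.gwf"]
local = {"frame-001.gwf"}
index = {"frame-002.gwf": ["gsiftp://siteA/frame-002.gwf"],
         "frame-003.gwf": ["gsiftp://siteB/frame-003.gwf"]}

for name, source in plan_transfers(missing_files(required, local), index):
    print(f"fetch {name} from {source}")
```

Separating "what is missing" (location) from "how to move it" (transfer) mirrors the layering on the slide: the replication service composes the location and movement services rather than reimplementing them.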

36 Composing Resources … Composing Services
- Procure hardware (physical machine)
- Deploy hypervisor/OS
- Deploy virtual machine (VM)
- Deploy container (JVM)
- Deploy services (GridFTP, DRS, LRC) as VO services
- State is exposed & accessed uniformly at all levels
- Provisioning, management, and monitoring at all levels

37 Decomposition Enables Separation of Concerns & Roles
- User to service provider: "Provide access to data D at S1, S2, S3 with performance P"
- Service provider to resource provider: "Provide storage with performance P1, network with P2, …"
[Figure: data D replicated at sites S1, S2, S3 via mechanisms such as a replica catalog and user-level multicast.]

38 Another Example: Astro Portal Stacking Service
- Purpose
  - On-demand "stacks" of random locations within a ~10 TB dataset
- Challenge
  - Rapid access to 10-10K "random" files
  - Time-varying load
- Solution
  - Dynamic acquisition of compute & storage
[Figure: cutouts from the Sloan data are summed into a single stacked image, served via a Web page or Web Service.]
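The "stack" operation itself is simple: co-add equally sized cutouts around the requested sky locations by summing them pixel by pixel (the hard part, per the slide, is fetching thousands of random files quickly). A minimal sketch, using plain Python lists as stand-ins for image data; the cutout values are made up for illustration.

```python
# Sketch of image stacking: element-wise sum of equally sized 2-D
# cutouts. Lists of lists stand in for real image arrays; the pixel
# values below are illustrative.
def stack(cutouts):
    """Return the pixel-wise sum of a list of equally sized 2-D cutouts."""
    rows, cols = len(cutouts[0]), len(cutouts[0][0])
    out = [[0] * cols for _ in range(rows)]
    for cutout in cutouts:
        for i in range(rows):
            for j in range(cols):
                out[i][j] += cutout[i][j]
    return out

cutouts = [[[1, 2], [3, 4]],
           [[0, 1], [2, 3]]]
print(stack(cutouts))  # [[1, 3], [5, 7]]
```

Stacking boosts signal-to-noise for sources too faint to see in any single image, which is why on-demand stacks over a ~10 TB dataset are worth the data-access challenge the slide describes.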

39 Astro Portal Stacking Performance (LAN GPFS)
[Performance chart]

40 Summary
- Community-based science will be the norm
  - Requires collaborations across sciences, including computer science
- Many different types of communities
  - Differ in coupling, membership, lifetime, size
- Must think beyond science stovepipes
  - Community infrastructure will increasingly become the scientific observatory
- Scaling requires a separation of concerns
  - Providers of resources, services, content
- A small set of fundamental mechanisms is required to build communities

41 For More Information
- Globus Alliance: www.globus.org
- Dev.Globus: dev.globus.org
- Open Science Grid: www.opensciencegrid.org
- TeraGrid: www.teragrid.org
- Background: www.mcs.anl.gov/~foster
- The Grid, 2nd Edition: www.mkp.com/grid2

