Ian Foster Argonne National Laboratory University of Chicago Univa Corporation Service-Oriented Science Scaling eScience Application & Impact.


1 Service-Oriented Science: Scaling eScience Application & Impact
Ian Foster
Argonne National Laboratory, University of Chicago, Univa Corporation

2 Acknowledgements
- Carl Kesselman, with whom I developed many of these slides
- Bill Allcock, Charlie Catlett, Kate Keahey, Jennifer Schopf, Frank Siebenlist, Mike Wilde @ ANL/UC
- Ann Chervenak, Ewa Deelman, Laura Pearlman @ USC/ISI
- Karl Czajkowski, Steve Tuecke @ Univa
- Numerous other fine colleagues
- NSF, DOE, IBM for research support

3 Context: System-Level Science
Problems too large &/or complex to tackle alone …

4 Seismic Hazard Analysis (T. Jordan & SCEC)
[Figure labels: Seismic Hazard Model; Seismicity; Paleoseismology; Local site effects; Geologic structure; Faults; Stress transfer; Crustal motion; Crustal deformation; Seismic velocity structure; Rupture dynamics]

5 SCEC Community Model
[Figure labels: Intensity Measures; Earthquake Forecast Model; Attenuation Relationship; Standardized Seismic Hazard Analysis; Ground motion simulation; Physics-based earthquake forecasting; Ground-motion inverse problem; Structural Simulation; AWM; Ground Motions; SRM; Unified Structural Representation (Faults, Motions, Stresses, Anelastic model); RDM; FSM; Invert; Other Data (Geology, Geodesy)]
[Legend: AWP = Anelastic Wave Propagation; SRM = Site Response Model; FSM = Fault System Model; RDM = Rupture Dynamics Model]

6 [Figure labels: System-Level Problem; Decomposition; Implementation; Grid technology; Facilities; Computers; Storage; Networks; Services; Software; People; U. Colorado Experimental Model; NCSA Computational Model; UIUC Experimental Model; COORD.]

7 Science Takes a Village …
- Teams organized around common goals
  - People, resource, software, data, instruments…
- With diverse membership & capabilities
  - Expertise in multiple areas required
- And geographic and political distribution
  - No location/organization possesses all required skills and resources
- Must adapt as a function of the situation
  - Adjust membership, reallocate responsibilities, renegotiate resources

8 Virtual Organizations
- From organizational behavior/management:
  - "a group of people who interact through interdependent tasks guided by common purpose [that] works across space, time, and organizational boundaries with links strengthened by webs of communication technologies" (Lipnack & Stamps, 1997)
- The impact of cyberinfrastructure
  - People → computational agents & services
  - Communication technologies → IT infrastructure, i.e. Grid
“The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001

9 Beyond Science Silos: Service-Oriented Architecture
- Decompose across network
- Clients integrate dynamically
  - Select & compose services
  - Select “best of breed” providers
  - Publish result as a new service
- Decouple resource & service providers
[Figure labels: Function; Resource; Data Archives; Analysis tools; Discovery tools; Users. Fig: S. G. Djorgovski]

10 Provisioning Service-Oriented Systems: The Role of Grid Infrastructure
- Service-oriented Grid infrastructure
  - Provision physical resources to support application workloads
- Service-oriented applications
  - Wrap applications as services
  - Compose applications into workflows
[Figure labels: Appln Service; Users; Workflows; Composition; Invocation]
“The Many Faces of IT as Service”, Foster, Tuecke, 2005

11 Forming & Operating (Scientific) Communities
- Define VO membership and roles, & enforce laws and community standards
  - I.e., policy
- Build, buy, operate, & share community infrastructure
  - Data, programs, services, computing, storage, instruments
  - Service-oriented architecture
- Define and perform collaborative work
  - Use shared infrastructure, roles, & policy
  - Manage community workflow

12 Forming & Operating (Scientific) Communities
- Define VO membership and roles, & enforce laws and community standards
  - I.e., policy
- Build, buy, operate, & share community infrastructure
  - Data, programs, services, computing, storage, instruments
  - Service-oriented architecture
- Define and perform collaborative work
  - Use shared infrastructure, roles, & policy
  - Manage community workflow

13 Defining Community: Membership and Laws
- Identify VO participants and roles
  - For people and services
- Specify and control actions of members
  - Empower members → delegation
  - Enforce restrictions → federate policy

14 Security Services Objectives
- It’s all about “policy”
  - Define a VO’s operating rules
  - Security services facilitate the enforcement
- Policy facilitates “business objectives”
  - Related to goals/purpose of the VO
- Security policy often delicate balance
  - Legislation may mandate minimum security
  - More security → Higher costs
  - Less security → Higher exposure to loss
  - Risk versus Rewards

15 Policy Challenges in VOs
- Restrict VO operations based on characteristics of requestor
  - VO dynamics create challenges
- Intra-VO
  - VO specific roles
  - Mechanisms to specify/enforce policy at VO level
- Inter-VO
  - Entities/roles in one VO not necessarily defined in another VO
[Figure labels: Access granted by community to user; Site admission-control policies; Effective Access; Policy of site to community]
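
The figure on this slide suggests that a user's effective access at a site is the overlap between what the community grants the user and what the site grants the community. A minimal sketch of that reading, assuming rights can be modelled as (operation, resource) pairs; the function name and the example values are hypothetical, not taken from the talk:

```python
# Hypothetical model: a right is an (operation, resource) pair.
Right = tuple[str, str]


def effective_access(community_grant: set[Right], site_policy: set[Right]) -> set[Right]:
    """Rights a user actually holds at a site: only those both parties allow."""
    return community_grant & site_policy


# What the VO lets this user do (assumed example values).
community_grant = {("read", "dataset"), ("write", "dataset"), ("submit", "cluster")}
# What the site lets the VO as a whole do (assumed example values).
site_policy = {("read", "dataset"), ("submit", "cluster"), ("admin", "cluster")}

allowed = effective_access(community_grant, site_policy)
```

Modelling the result as a set intersection makes the inter-VO difficulty visible: a right named in only one of the two policies never appears in the result.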

16 Core Security Mechanisms
- Attribute Assertions
  - C asserts that S has attribute A with value V
- Authentication and digital signature
  - Allows signer to assert attributes
- Delegation
  - C asserts that S can perform O on behalf of C
- Attribute mapping
  - {A1, A2 … An}vo1 → {A’1, A’2 … A’m}vo2
- Policy
  - Entity with attributes A asserted by C may perform operation O on resource R
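
The mechanisms listed above can be sketched as plain data structures. This is an illustrative model only: digital signatures are omitted entirely, and every class, field, and function name is an assumption rather than part of any Globus or SAML API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    issuer: str     # C: who makes the assertion
    subject: str    # S: who the assertion is about
    attribute: str  # A
    value: str      # V


@dataclass(frozen=True)
class Policy:
    trusted_issuer: str  # C whose assertions this rule accepts
    attribute: str
    value: str
    operation: str       # O
    resource: str        # R


def map_attributes(assertions, mapping, mapper):
    """Attribute mapping: translate VO1 attributes into VO2 ones, re-issued by `mapper`."""
    mapped = []
    for a in assertions:
        key = (a.attribute, a.value)
        if key in mapping:
            attribute, value = mapping[key]
            mapped.append(Assertion(mapper, a.subject, attribute, value))
    return mapped


def is_permitted(subject, operation, resource, assertions, policies):
    """Policy check: some trusted issuer must assert the attribute a matching rule requires."""
    for p in policies:
        if p.operation != operation or p.resource != resource:
            continue
        for a in assertions:
            if (a.subject == subject and a.issuer == p.trusted_issuer
                    and a.attribute == p.attribute and a.value == p.value):
                return True
    return False


# Assumed example: alice is an analyst in VO1; VO2 only understands groups.
vo1_assertions = [Assertion("vo1-ata", "alice", "role", "analyst")]
mapping = {("role", "analyst"): ("group", "readers")}
vo2_assertions = map_attributes(vo1_assertions, mapping, "mapping-ata")
vo2_policies = [Policy("mapping-ata", "group", "readers", "read", "archive")]
```

With these values, `is_permitted("alice", "read", "archive", vo2_assertions, vo2_policies)` succeeds, while the same call with the unmapped `vo1_assertions` fails, which is the inter-VO problem attribute mapping is meant to address.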

17 Trust in VOs
- Do I “believe” an attribute assertion?
  - Used to evaluate cost vs. benefit of performing an operation
- E.g., perform untrusted operation with extra auditing
- Look at attributes of assertion signer
- Rooting trust
  - Externally recognized source, e.g., CA
  - Dynamically via VO structure → delegation
  - Dynamically via alternative sources, e.g., reputation

18 Security Services for VO Policy
- Attribute Authority (ATA)
  - Issue signed attribute assertions (incl. identity, delegation & mapping)
- Authorization Authority (AZA)
  - Decisions based on assertions & policy
- Use with message/transport level security
[Figure labels: VO A Service; VO ATA; VO AZA; Mapping ATA; VO B Service; VO User A; VO User B; Delegation Assertion: User B can use Service A; VO-A Attr → VO-B Attr; Resource Admin Attribute; VO Member Attribute]

19 Closing the Loop
[Figure labels: VO; Rights; Users; Rights’; Compute Center; Access; Services (running on user’s behalf); Local policy on VO identity or attribute authority; CAS or VOMS issuing SAML or X.509 ACs; SSL/WS-Security with Proxy Certificates; Authz Callout: SAML, XACML; KCA; MyProxy]

20 Forming & Operating Scientific Communities
- Define VO membership and roles, & enforce laws and community standards
  - I.e., policy
- Build, buy, operate, & share community infrastructure
  - Data, programs, services, computing, storage, instruments
  - Service-oriented architecture
- Define and perform collaborative work
  - Use shared infrastructure, roles, & policy
  - Manage community workflow

21 Bootstrapping a VO by Assembling Services
1) Integrate services from other sources
  - Virtualize external services as VO services
2) Coordinate & compose
  - Create new services from existing ones
[Figure labels: Community; Services Provider; Content; Services; Capacity; Capacity Provider]
“Service-Oriented Science”, Foster, 2005

22 Providing VO Services: (1) Integration from Other Sources
- Negotiate service level agreements
- Delegate and deploy capabilities/services
- Provision to deliver defined capability
- Configure environment
- Host layered functions
[Figure labels: Community A … Community Z]

23 Virtualizing Existing Services into a VO
- Establish service agreement with service
  - E.g., WS-Agreement
- Delegate use to VO user
[Figure labels: User A; VO Admin; User B; VO User; Existing Services]

24 Deploying New Services
[Figure labels: Policy; Client; Environment; Activity; Interface; Resource provider]
[Steps shown: Allocate/provision; Configure; Initiate activity; Monitor activity; Control activity]

25 Activities Can Be Nested
[Figure labels: Policy; Client; Environment; Interface; Resource provider; Client]
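
One way to read the two deployment slides together: a client asks a resource provider to allocate an environment, initiates an activity in it, and monitors it, and that environment can in turn act as a provider for a further client. A minimal sketch under that assumption; the `Environment` class, its abstract "capacity units", and the example names are all hypothetical:

```python
class Environment:
    """Hypothetical environment provisioned by a resource provider."""

    def __init__(self, name, capacity, parent=None):
        self.name = name
        self.capacity = capacity  # abstract capacity units
        self.used = 0             # capacity handed out to nested environments
        self.parent = parent
        self.activities = []

    def allocate(self, name, capacity):
        """Provision a nested environment out of this one's remaining capacity."""
        if self.used + capacity > self.capacity:
            raise ValueError(f"{self.name}: not enough capacity for {name}")
        self.used += capacity
        return Environment(name, capacity, parent=self)

    def initiate(self, activity):
        """Start an activity inside this environment."""
        self.activities.append(activity)

    def monitor(self):
        """Report the current state of this environment."""
        return {"name": self.name, "used": self.used,
                "capacity": self.capacity, "activities": list(self.activities)}

    def depth(self):
        """Number of providers above this environment."""
        return 0 if self.parent is None else 1 + self.parent.depth()


# Assumed example: a site hosts a VO service, which hosts a user job.
site = Environment("site", capacity=100)
vo_service = site.allocate("vo-service", capacity=40)
user_job = vo_service.allocate("user-job", capacity=10)
user_job.initiate("analysis")
```

Because each nested environment is carved out of its parent's capacity, a VO service can never hand out more than the site provisioned for it.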

26 Open Science Grid
- 50 sites (15,000 CPUs) & growing
- 400 to >1000 concurrent jobs
- Many applications + CS experiments; includes long-running production operations
- Up since October 2003; few FTEs central ops
[Figure: Jobs (2004)]
www.opensciencegrid.org

27 Embedded Resource Management
- VO admin delegates credentials to be used by downstream VO services.
- VO admin starts the required services.
- VO jobs come in directly from the upstream VO users.
- VO job gets forwarded to the appropriate resource using the VO credentials.
- Computational job started for VO.
[Figure labels: VO User; VO Admin; VO Job; Deleg; Client-side; VO Scheduler; Other Services; Monitoring and control; Headnode Resource Manager; Cluster Resource Manager; GRAM]

28 Providing VO Services: (2) Coordination & Composition
- Take a set of provisioned services … & compose to synthesize new behaviors
- This is traditional service composition
  - But must also be concerned with emergent behaviors, autonomous interactions
  - See the work of the agent & PlanetLab communities
“Brain vs. Brawn: Why Grids and Agents Need Each Other”, Foster, Kesselman, Jennings, 2004

29 The Globus-Based LIGO Data Grid
- Replicating >1 Terabyte/day to 8 sites
- >40 million replicas so far
- MTBF = 1 month
[Figure labels: LIGO Gravitational Wave Observatory; Birmingham; Cardiff; AEI/Golm]
www.globus.org/solutions

30 Data Replication Service
- Pull “missing” files to a storage system
[Figure labels: List of required Files; Data Replication Service; Reliable File Transfer Service; GridFTP; Local Replica Catalog; Replica Location Index; Data Movement; Data Location; Data Replication]
“Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System,” Chervenak et al., 2005
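
The "pull missing files" idea can be sketched as a planning step: compare the list of required files with the local replica catalog, then look up a source for each missing file in the replica location index. This is a simplified illustration, not the Data Replication Service interface; the function name, the data shapes, and the site names are assumptions, and the transfer itself (RFT/GridFTP in the figure) is not modelled.

```python
def plan_replication(required, local_catalog, replica_index):
    """Return (transfers, unresolved) for files missing from the local catalog.

    required:      list of logical file names needed at this site
    local_catalog: set of logical file names already stored locally
    replica_index: dict mapping logical file name -> list of sites holding a copy
    """
    transfers = []   # (logical file name, source site) pairs to hand to a transfer service
    unresolved = []  # missing files with no known replica anywhere
    for lfn in required:
        if lfn in local_catalog:
            continue  # already present, nothing to pull
        sources = replica_index.get(lfn, [])
        if sources:
            transfers.append((lfn, sources[0]))  # naive choice: first listed source
        else:
            unresolved.append(lfn)
    return transfers, unresolved


# Assumed example values.
required = ["f1.dat", "f2.dat", "f3.dat", "f4.dat"]
local_catalog = {"f1.dat"}
replica_index = {"f2.dat": ["site-a", "site-b"], "f3.dat": ["site-b"]}

transfers, unresolved = plan_replication(required, local_catalog, replica_index)
```

A real service would also register each file in the local catalog once its transfer completes, and would likely choose among sources by more than list order.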

31 Composing Resources … Composing Services
[Figure layers, with the action that creates each: Physical machine (Procure hardware); Hypervisor/OS (Deploy hypervisor/OS); VM (Deploy virtual machine); JVM (Deploy container); DRS (Deploy service); GridFTP; LRC; VO Services]
Provisioning, management, and monitoring at all levels

32 Decomposition Enables Separation of Concerns & Roles
- User
- Service Provider: “Provide access to data D at S1, S2, S3 with performance P”
- Resource Provider: “Provide storage with performance P1, network with P2, …”
[Figure labels: D; S1; S2; S3; Replica catalog, User-level multicast, …]

33 Community Commons
- What capabilities are available to VO?
  - Membership changes, state changes
- Require mechanisms to aggregate and update VO information
[Figure labels: VO-specific indexes; Information; FRESH; MORE; The age of information]
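
The slide asks for mechanisms that aggregate and update VO information and points at the age of that information. A minimal sketch of one such mechanism, assuming a VO-specific index that timestamps each registration and ignores entries older than a configured maximum age; the class and its parameters are hypothetical and not the MDS interface:

```python
class VOIndex:
    """Hypothetical VO-specific index that tracks how old each entry is."""

    def __init__(self, max_age):
        self.max_age = max_age  # seconds after which an entry counts as stale
        self._entries = {}      # service name -> (info, time registered)

    def register(self, name, info, now):
        """Add or refresh an entry; `now` is passed in so behaviour is deterministic."""
        self._entries[name] = (info, now)

    def fresh(self, now):
        """Return only the entries registered within the last `max_age` seconds."""
        return {name: info
                for name, (info, registered) in self._entries.items()
                if now - registered <= self.max_age}


# Assumed example: two services register at different times.
index = VOIndex(max_age=60)
index.register("gridftp", {"endpoint": "site-a"}, now=0)
index.register("gram", {"endpoint": "site-b"}, now=50)
visible = index.fresh(now=70)
```

Services that re-register periodically stay visible, while one that has left the VO or failed drops out of query results without any explicit removal step.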

34 Monitoring and Discovery Services
[Figure labels: GT4 Container; MDS-Index; RFT; GRAM; User; GridFTP adapter; Registration & WSRF/WSN Access; Custom protocols for non-WSRF entities; Clients (e.g., WebMDS); Automated registration in container; WS-ServiceGroup]

35 Provisioning Service-Oriented Systems: The Role of Grid Infrastructure
- Service-oriented Grid infrastructure
  - Provision physical resources to support application workloads
- Service-oriented applications
  - Wrap applications as services
  - Compose applications into workflows
[Figure labels: Appln Service; Users; Workflows; Composition; Invocation]
“The Many Faces of IT as Service”, Foster, Tuecke, 2005

36 Forming & Operating Scientific Communities
- Define VO membership and roles, & enforce laws and community standards
  - I.e., policy
- Build, buy, operate, & share community infrastructure
  - Data, programs, services, computing, storage, instruments
  - Service-oriented architecture
- Define and perform collaborative work
  - Use shared infrastructure, roles, & policy
  - Manage community workflow

37 Collaborative Work
[Figure labels: Executed; Executing; Executable; Not yet executable; Query; Edit; Schedule; Execution environment; What I Did; What I Am Doing; What I Want to Do; … Time]

38 Managing Collaborative Work
- Process as “workflow,” at different scales, e.g.:
  - Run 3-stage pipeline
  - Process data flowing from expt over a year
  - Engage in interactive analysis
- Need to keep track of:
  - What I want to do (will evolve with new knowledge)
  - What I am doing now (evolve with system config.)
  - What I did (persistent; a source of information)
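
Keeping track of "what I did", "what I am doing", and "what I want to do" amounts to assigning each workflow task a state. The sketch below uses the four state names from the preceding Collaborative Work figure and the 3-stage pipeline mentioned above; the function name and the dependency representation are assumptions for illustration.

```python
def classify(dependencies, executed, executing):
    """Assign each task one of four states based on its dependencies.

    dependencies: dict mapping task -> list of tasks it depends on
    executed:     set of tasks that have finished ("what I did")
    executing:    set of tasks currently running ("what I am doing")
    """
    states = {}
    for task, parents in dependencies.items():
        if task in executed:
            states[task] = "executed"
        elif task in executing:
            states[task] = "executing"
        elif all(parent in executed for parent in parents):
            states[task] = "executable"          # every input is ready
        else:
            states[task] = "not yet executable"  # still waiting on a parent
    return states


# A 3-stage pipeline in which the first stage has finished.
pipeline = {"stage1": [], "stage2": ["stage1"], "stage3": ["stage2"]}
states = classify(pipeline, executed={"stage1"}, executing=set())
```

Here `stage2` becomes executable and `stage3` remains not yet executable; recording the executed set persistently is what turns past work into a queryable source of information.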

39 Trident: The GriPhyN Virtual Data System
[Figure labels: VDL Program; Virtual Data catalog; Virtual Data Workflow Generator; Abstract workflow; Workflow spec; Create Execution Plan; Local planner; Job Planner; Job Cleanup; Statically Partitioned DAG; Dynamically Planned DAG; DAGman DAG; DAGman & Condor-G; Grid Workflow Execution]

40 Functional MRI Analysis
Workflow courtesy James Dobson, Dartmouth Brain Imaging Center

41 Functional MRI – Mapping Brain Function using Grid Workflows

42 Functional MRI Virtual Data Queries
Which transformations can process a “subject image”?
- Q: xsearchvdc -q tr_meta dataType subject_image input
- A: fMRIDC.AIR::align_warp
List anonymized subject-images for young subjects:
- Q: xsearchvdc -q lfn_meta dataType subject_image privacy anonymized subjectType young
- A: 3472-4_anonymized.img
Show files that were derived from patient image 3472-3:
- Q: xsearchvdc -q lfn_tree 3472-3_anonymized.img
- A: 3472-3_anonymized.img 3472-3_anonymized.sliced.hdr atlas.hdr atlas.img … atlas_z.jpg 3472-3_anonymized.sliced.img
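
The last query asks which files were derived from a given image, which is a reachability question over recorded derivations. The sketch below illustrates that idea with a breadth-first search; it is not how `xsearchvdc` is implemented, and the record format and file names are hypothetical.

```python
from collections import deque


def derived_files(source, derivations):
    """List every file reachable from `source` through recorded derivations.

    derivations: list of (inputs, outputs) pairs, one per transformation run.
    """
    found = []
    seen = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for inputs, outputs in derivations:
            if current not in inputs:
                continue
            for out in outputs:
                if out not in seen:  # report each derived file once
                    seen.add(out)
                    found.append(out)
                    queue.append(out)
    return found


# Hypothetical derivation records (not the provenance behind the slide's answer).
derivations = [
    (["raw.img"], ["sliced.img", "sliced.hdr"]),
    (["sliced.img", "reference.img"], ["atlas.img"]),
    (["atlas.img"], ["atlas_z.jpg"]),
]

descendants = derived_files("raw.img", derivations)
```

The same records answer the reverse question, which inputs a result depends on, by following each pair from outputs back to inputs instead.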

43 QuarkNet: Leveraging Trident for Science Education

44 PUMA: Analysis of Metabolism
- PUMA Knowledge Base: information about proteins analyzed against ~2 million gene sequences
- Analysis on Grid: involves millions of BLAST, BLOCKS, and other processes
Natalia Maltsev et al. http://compbio.mcs.anl.gov/puma2

45 Astronomy: A Small Montage Workflow
- ~1200 node workflow, 7 levels
- Mosaic of M42 created on TeraGrid

46 Summary (1): Community Services
- Community roll, city hall, permits, licensing & police force
  - Assertions, policy, attribute & authorization services
- Directories, maps
  - Information services
- City services: power, water, sewer
  - Deployed services
- Shops, businesses
  - Composed services
- Day-to-day activities
  - Workflows, visualization
- Tax board, fees, economic considerations
  - Barter, planned economy, eventually markets

47 Summary (2)
- Community based science will be the norm
  - Requires collaborations across sciences, including computer science
- Many different types of communities
  - Differ in coupling, membership, lifetime, size
- Must think beyond science stovepipes
  - Increasingly the community infrastructure will become the scientific observatory
- Scaling requires a separation of concerns
  - Providers of resources, services, content
- Small set of fundamental mechanisms required to build communities

48 For More Information
- Globus Alliance
  - www.globus.org
- NMI and GRIDS Center
  - www.nsf-middleware.org
  - www.grids-center.org
- Infrastructure
  - www.opensciencegrid.org
  - www.teragrid.org
- Background
  - www.mcs.anl.gov/~foster
2nd Edition: www.mkp.com/grid2

