Slide 1: The Global Storage Grid
Or, Managing Data for "Science 2.0"
Ian Foster, Computation Institute, Argonne National Laboratory & University of Chicago
Slide 2: "Web 2.0"
- Software as services
  - Data- & computation-rich network services
- Services as platforms
  - Easy composition of services to create new capabilities ("mashups"), which may themselves be made accessible as new services
- Enabled by massive infrastructure buildout
  - Google projected to spend $1.5B on computers, networks, and real estate in 2006
  - Dozens of others are spending substantially
- Paid for by advertising
(Source: Declan Butler, Nature)
Slide 3: Science 2.0, e.g., Cancer Bioinformatics Grid (caBIG)
A BPEL engine takes a BPEL workflow document and workflow inputs, invokes a data service at uchicago.edu and analytic services at osu.edu and duke.edu, and returns the workflow results (a sketch of the composition pattern follows).
caBIG: https://cabig.nci.nih.gov/  BPEL work: Ravi Madduri et al.
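The slide expresses the orchestration as a BPEL workflow document; as a rough illustration of the same composition pattern, here is a minimal Python sketch in which the endpoints and payload formats are hypothetical:

    # Minimal sketch of the composition pattern on the caBIG slide: fetch data
    # from a data service, feed it through two analytic services in sequence,
    # and return the combined result. Endpoints and payloads are hypothetical;
    # the real system drives this from a BPEL workflow document.
    import json
    from urllib import request

    DATA_SERVICE = "https://data.example.uchicago.edu/query"    # hypothetical
    ANALYSIS_1   = "https://analytic.example.osu.edu/run"       # hypothetical
    ANALYSIS_2   = "https://analytic.example.duke.edu/run"      # hypothetical

    def call(url, payload):
        """POST a JSON payload to a service and return the decoded JSON reply."""
        req = request.Request(url, data=json.dumps(payload).encode(),
                              headers={"Content-Type": "application/json"})
        with request.urlopen(req) as resp:
            return json.load(resp)

    def run_workflow(workflow_inputs):
        data = call(DATA_SERVICE, workflow_inputs)   # <Workflow Inputs> -> data
        stage1 = call(ANALYSIS_1, data)              # first analytic service
        stage2 = call(ANALYSIS_2, stage1)            # second analytic service
        return stage2                                # <Workflow Results>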
Slide 4: Science 2.0, e.g., Virtual Observatories
Users reach data archives through a gateway that provides discovery tools and analysis tools. (Figure: S. G. Djorgovski)
Slide 5: Science 1.0 vs. Science 2.0: For Example, Digital Astronomy
- Science 1.0: "Tell me about this star"
- Science 2.0: "Tell me about these 20K stars"; support 1000s of users
- E.g., Sloan Digital Sky Survey, ~40 TB; others much bigger soon
Slide 6: Global Data Requirements
- Service consumer
  - Discover data
  - Specify compute-intensive analyses
  - Compose multiple analyses
  - Publish results as a service
- Service provider
  - Host services enabling data access/analysis
  - Support remote requests to services
  - Control who can make requests
  - Support time-varying, data- and compute-intensive workloads
Slide 7: Analyzing Large Data: "Move Computation to the Data"
- But:
  - Amount of computation can be enormous
  - Load can vary tremendously
  - Users want to compose distributed services, so data must sometimes be moved anyway
- Fortunately:
  - Networks are getting much faster (in parts)
  - Workloads can have significant locality of reference
Slide 8: "Move Computation to the Core"
A poorly connected "periphery" surrounds a highly connected "core".
Slide 9: Highly Connected "Core": For Example, TeraGrid
- Sites include SDSC, TACC, PU, IU, ORNL, NCSA, ANL, and PSC, linked through Starlight, LA, and Atlanta hubs
- 16 supercomputers: 9 different types, multiple sizes
- World's fastest network: 30 Gigabits/s to large sites = 20-30 times major university links = 30,000 times my home broadband = 1 full-length feature film per second
- 75 Teraflops (trillion calculations/s) = 12,500 times faster than all 6 billion humans on Earth each doing one calculation per second (checked below)
- Globus Toolkit and other middleware provide single login, application management, data movement, and web services
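The comparisons on this slide are back-of-the-envelope arithmetic; assuming the slide's own figures of 6 billion people and a 30 Gigabit/s link, they check out:

    # Back-of-the-envelope check of the TeraGrid comparisons on this slide.
    humanity_ops = 6e9 * 1              # 6 billion people, one calculation per second
    teragrid_flops = 75e12              # 75 Teraflops
    print(teragrid_flops / humanity_ops)   # 12500.0, matching the "12,500 times" claim

    link_bits_per_s = 30e9              # 30 Gigabits/s to large sites
    print(link_bits_per_s / 8 / 1e9)    # 3.75 GB/s, roughly one compressed feature film per second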
Slide 10: Open Science Grid
Snapshot for May 7-14, 2006: ~4000 jobs.
Slide 11: The Two Dimensions of Science 2.0
- Decompose across the network: clients integrate dynamically
  - Select & compose services
  - Select "best of breed" providers
  - Publish results as new services
- Decouple resource & service providers (function vs. resource)
(Figure: users, data archives, analysis tools, discovery tools; S. G. Djorgovski)
Slide 12: Technology Requirements: Integration & Decomposition
- Service-oriented applications
  - Wrap applications & data as services (a sketch follows below)
  - Compose services into workflows: users drive composition and invocation of application services
- Service-oriented Grid infrastructure
  - Provision physical resources to support application workloads
"The Many Faces of IT as Service", ACM Queue, Foster & Tuecke, 2005
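To make "wrap applications & data as services" concrete, here is a stdlib-only sketch that exposes a command-line tool over HTTP. GT4 does this with WSRF and Java service containers rather than raw HTTP; the "analyze" binary, port, and payload format below are hypothetical:

    # Illustrative only: wrapping a command-line analysis tool as a network
    # service. Shows the "wrap an application as a service" idea, nothing more.
    import json
    import subprocess
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class AnalysisHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            body = self.rfile.read(int(self.headers["Content-Length"]))
            params = json.loads(body)
            # Run the wrapped legacy application (hypothetical "analyze" binary).
            result = subprocess.run(["analyze", params["input_file"]],
                                    capture_output=True, text=True)
            reply = json.dumps({"stdout": result.stdout,
                                "returncode": result.returncode}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(reply)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8080), AnalysisHandler).serve_forever()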
Slide 13: Technology Requirements: Within the Core
- Provide "service hosting services" that allow consumers to negotiate the hosting of arbitrary data analysis services
- Dynamically manage resources (compute, storage, network) to meet diverse computational demands
- Provide strong internal security, bridging to diverse external sources of attributes
Slide 14: Globus Software Enables Grid Infrastructure
- Web service interfaces for behaviors relating to integration and decomposition
  - Primitives: resources, state, security
  - Services: execution, data movement, ...
- Open source software that implements those interfaces
  - In particular, the Globus Toolkit (GT4)
- All standard Web services
  - "Grid is a use case for Web services, focused on resource management"
Slide 15: Open Source Grid Software: Globus Toolkit v4 (www.globus.org)
- Data Management: GridFTP, Reliable File Transfer, Data Access & Integration, Data Replication, Replica Location
- Security: Authentication & Authorization, Community Authorization, Delegation, Credential Management
- Execution Management: Grid Resource Allocation & Management, Community Scheduling Framework, Workspace Management, Grid Telecontrol Protocol
- Information Services: Index, Trigger, WebMDS
- Common Runtime: Java, C, and Python runtimes
"Globus Toolkit Version 4: Software for Service-Oriented Systems", LNCS 3779, 2-13, 2005
Slide 16: GT4 Data Services
- Data movement (a transfer sketch follows below)
  - GridFTP: secure, reliable, performant
  - Reliable File Transfer: managed transfers
  - Data Replication Service: managed replication
- Replica Location Service
  - Scales to 100s of millions of replicas
- Data Access & Integration services
  - Access to, and server-side processing of, structured data
(Figure: disk-to-disk transfer performance on TeraGrid)
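As one concrete illustration of the data-movement pieces, a script can drive a third-party GridFTP transfer by shelling out to globus-url-copy, the client shipped with GT4. The hostnames and paths here are hypothetical; -p requests parallel TCP streams and -vb reports throughput:

    # Sketch: trigger a server-to-server GridFTP transfer from a script.
    import subprocess

    src = "gsiftp://gridftp.site-a.example.org/data/run42/output.dat"   # hypothetical
    dst = "gsiftp://gridftp.site-b.example.org/replica/output.dat"      # hypothetical

    # -vb: report transfer performance; -p 4: use four parallel streams.
    subprocess.run(["globus-url-copy", "-vb", "-p", "4", src, dst], check=True)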
Slide 17: Security Services
- Attribute Authority (ATA)
  - Issues signed attribute assertions (including identity, delegation & mapping)
- Authorization Authority (AZA)
  - Makes decisions based on assertions & policy
(Figure: users in VO A and VO B carry VO member attributes; a delegation assertion, "User B can use Service A", together with a mapping ATA, bridges VO-A, VO-B, and resource-admin attributes so each VO's services can authorize the other's users. An illustrative sketch follows.)
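A toy rendering of the ATA/AZA split may help: an attribute authority signs an attribute assertion, and an authorization authority verifies it and applies policy. Real deployments use X.509 proxies and SAML rather than this illustrative HMAC scheme, and every name below is hypothetical:

    # Illustrative ATA/AZA split: issue a signed attribute assertion, then make
    # an allow/deny decision from the assertion plus local policy.
    import hmac, hashlib, json

    ATA_KEY = b"attribute-authority-secret"   # stands in for the ATA's signing key

    def issue_assertion(subject, attribute):
        """Attribute Authority: bind an attribute to a subject and sign it."""
        claim = json.dumps({"subject": subject, "attribute": attribute})
        sig = hmac.new(ATA_KEY, claim.encode(), hashlib.sha256).hexdigest()
        return {"claim": claim, "sig": sig}

    POLICY = {"analysis-service": {"VO-A:member"}}   # attributes allowed per service

    def authorize(assertion, service):
        """Authorization Authority: verify the signature, then apply policy."""
        expected = hmac.new(ATA_KEY, assertion["claim"].encode(),
                            hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, assertion["sig"]):
            return False
        attribute = json.loads(assertion["claim"])["attribute"]
        return attribute in POLICY.get(service, set())

    token = issue_assertion("UserB", "VO-A:member")
    print(authorize(token, "analysis-service"))      # True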
Slide 18: Service Hosting
A client interacts with a resource provider through an interface, subject to policy, to allocate/provision an environment, configure it, initiate an activity, and then monitor and control that activity (an interface sketch follows). Relevant interfaces include WSRF (or WS-Transfer/WS-Management, etc.), Globus GRAM, and Virtual Workspaces.
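The lifecycle named on the slide (allocate/provision, configure, initiate, monitor, control) can be written down as an abstract interface. This is only a schematic of the interaction pattern, not the WSRF or GRAM API:

    # Schematic hosting-lifecycle interface; method names mirror the slide.
    from abc import ABC, abstractmethod

    class ServiceHost(ABC):
        @abstractmethod
        def allocate(self, cpus: int, storage_gb: int) -> str:
            """Provision an execution environment; return an environment ID."""

        @abstractmethod
        def configure(self, env_id: str, settings: dict) -> None:
            """Install software / apply configuration in the environment."""

        @abstractmethod
        def initiate(self, env_id: str, activity: str) -> str:
            """Start an activity (e.g., deploy a data-analysis service)."""

        @abstractmethod
        def status(self, activity_id: str) -> str:
            """Monitor: return the activity's current state."""

        @abstractmethod
        def control(self, activity_id: str, action: str) -> None:
            """Control: suspend, resume, or terminate the activity."""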
Slide 19: Virtual OSG Clusters
A virtual OSG cluster is instantiated on Xen hypervisors running on a TeraGrid cluster. ("Virtual Clusters for Grid Communities", Zhang et al., CCGrid 2006)
Slide 20: Managed Storage: GridFTP with NeST (Demoware)
A GT4 GridFTP server uses a NeST module to talk, via chirp, to a NeST server managing disk storage; clients perform file transfers over GSI-FTP, while a custom application uses chirp directly for lot operations and the like. (Bill Allcock, Nick LeRoy, Jeff Weber, et al.)
Slide 21: Community Services Provider
1) Integrate services from other sources
  - Virtualize external services as VO services
2) Coordinate & compose
  - Create new services from existing ones
(Figure: content services plus hosting capacity from a capacity provider are combined into science services.)
"Service-Oriented Science", Science, 2005
Slide 22: Virtualizing Existing Services
- Establish a service agreement with the service
  - E.g., WS-Agreement
- Delegate use to community ("VO") users
(Figure: a VO admin establishes agreements with existing services on behalf of VO users A and B.)
Slide 23: The Globus-Based LIGO Data Grid
- LIGO Gravitational Wave Observatory
- Replicating >1 Terabyte/day to 8 sites (including Birmingham, Cardiff, AEI/Golm)
- >40 million replicas so far; MTBF = 1 month
- www.globus.org/solutions
Slide 24: Data Replication Service
- Pull "missing" files to a storage system (sketched below)
- Given a list of required files, the Data Replication Service consults the Replica Location Index and Local Replica Catalogs to locate copies (data location), drives the Reliable File Transfer Service and GridFTP to move them (data movement), and registers the new copies locally (data replication).
"Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System", Chervenak et al., 2005
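The pull-missing-files logic the slide describes can be sketched generically; the catalogs, file names, and transfer function below are in-memory stand-ins, not the GT4 RLS or RFT APIs:

    # Schematic of "pull missing files": compare the required list against the
    # local replica catalog, look up remote copies in a replica location index,
    # transfer what is missing, and register the new replicas locally.

    def replicate(required_files, local_catalog, location_index, transfer):
        """Ensure every required logical file has a local replica."""
        for logical_name in required_files:
            if logical_name in local_catalog:                 # already present locally
                continue
            sources = location_index.get(logical_name, [])    # remote replica URLs
            if not sources:
                raise LookupError(f"no known replica of {logical_name}")
            local_url = transfer(sources[0], logical_name)    # e.g., a GridFTP pull
            local_catalog[logical_name] = local_url           # register the new replica

    # In-memory stand-ins for the catalogs and a fake transfer function:
    local_catalog = {"frame-0001.dat": "file:///storage/frame-0001.dat"}
    location_index = {"frame-0002.dat": ["gsiftp://remote.example.org/frames/frame-0002.dat"]}
    replicate(["frame-0001.dat", "frame-0002.dat"], local_catalog, location_index,
              transfer=lambda src, name: f"file:///storage/{name}")
    print(local_catalog)   # now lists a local replica of both logical files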
Slide 25: Data Replication Service: Dynamic Deployment
A stack of deployment steps: procure hardware (physical machine), deploy the hypervisor/OS, deploy a virtual machine (VM), deploy a container (JVM), and deploy the service (DRS, alongside GridFTP and the Local Replica Catalog as VO services). State is exposed and accessed uniformly at all levels; provisioning, management, and monitoring happen at all levels.
Slide 26: Decomposition Enables Separation of Concerns & Roles
- User to service provider: "Provide access to data D at S1, S2, S3 with performance P"
- Service provider to resource provider: "Provide storage with performance P1, network with P2, ..."
- The service provider realizes the request with mechanisms such as a replica catalog and user-level multicast.
Slide 27: Example: Biology
- Public PUMA Knowledge Base: information about proteins analyzed against ~2 million gene sequences
- Back-office analysis on the Grid: millions of BLAST, BLOCKS, etc., runs on OSG and TeraGrid
- Natalia Maltsev et al., http://compbio.mcs.anl.gov/puma2
Slide 28: Genome Analysis & Database Update (GADU) on OSG
Snapshot for April 24, 2006: GADU running ~3,000 jobs.
Slide 29: Example: Earth System Grid
- Climate simulation data
  - Per-collection control
  - Different user classes
  - Server-side processing
- Implementation (Globus Toolkit)
  - Portal-based User Registration (PURSE)
  - PKI, SAML assertions
  - GridFTP, GRAM, SRM
- >2000 users; >100 TB downloaded
www.earthsystemgrid.org (DOE OASCR)
Slide 30: "Inside the Core" of ESG
Slide 31: Example: Astro Portal Stacking Service
- Purpose
  - On-demand "stacks" of random locations within a ~10 TB dataset (Sloan data, via a web page or web service); many cutouts are co-added into one stacked image (sketched below)
- Challenge
  - Rapid access to 10-10K "random" files
  - Time-varying load
- Solution
  - Dynamic acquisition of compute & storage
Joint work with Ioan Raicu & Alex Szalay
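For intuition about what the stacking service computes, here is a minimal sketch that co-adds image cutouts around many sky locations so noise averages down. The cutout size, loader, and coordinates are hypothetical stand-ins for access to the ~10 TB Sloan dataset:

    # Sketch of image stacking: cut out small images around many sky locations
    # and co-add them so faint sources rise above the noise.
    import numpy as np

    CUTOUT = 32   # pixels on a side (assumed)

    def load_cutout(location):
        """Stand-in for fetching a CUTOUT x CUTOUT image around one sky location."""
        rng = np.random.default_rng(abs(hash(location)) % (2**32))
        return rng.normal(loc=0.0, scale=1.0, size=(CUTOUT, CUTOUT))

    def stack(locations):
        """Co-add cutouts for many locations; noise averages down as 1/sqrt(N)."""
        total = np.zeros((CUTOUT, CUTOUT))
        for loc in locations:
            total += load_cutout(loc)
        return total / len(locations)

    stacked = stack([(185.0 + i * 0.01, 15.0) for i in range(1000)])
    print(stacked.std())   # roughly 1/sqrt(1000) of the single-cutout noise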
Slide 32: Preliminary Performance (TeraGrid, LAN GPFS)
Joint work with Ioan Raicu & Alex Szalay
Slide 33: Example: CyberShake
Calculate hazard curves by generating synthetic seismograms from an estimated rupture forecast. The pipeline runs from the rupture forecast and strain Green tensors through synthetic seismograms and spectral acceleration to hazard curves and hazard maps (sketched below).
Tom Jordan et al., Southern California Earthquake Center
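To make the data flow explicit, the stages named on the slide can be chained as placeholder functions; these stubs only illustrate the shape of the pipeline, not the SCEC codes, and all inputs and site names are hypothetical:

    # Placeholder pipeline: rupture forecast + strain Green tensors ->
    # synthetic seismograms -> spectral acceleration -> hazard curve.

    def strain_green_tensor(site):
        return {"site": site, "sgt": "..."}          # stub SGT volume for one site

    def synthetic_seismograms(sgt, rupture_forecast):
        return [{"rupture": r, "seismogram": "..."} for r in rupture_forecast]

    def spectral_acceleration(seismograms):
        return [0.1 * (i + 1) for i, _ in enumerate(seismograms)]  # placeholder SA values

    def hazard_curve(sa_values):
        # Fraction of ruptures exceeding each shaking level (placeholder levels).
        return {level: sum(v > level for v in sa_values) / len(sa_values)
                for level in (0.1, 0.2, 0.5)}

    forecast = ["ruptureA", "ruptureB", "ruptureC"]
    curve = hazard_curve(spectral_acceleration(
            synthetic_seismograms(strain_green_tensor("siteX"), forecast)))
    print(curve)   # one site's hazard curve; a hazard map aggregates many sites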
Slide 34: Enlisting TeraGrid Resources
20 TB of data, 1.8 CPU-years of computation. A workflow scheduler/engine and a VO scheduler draw on VO service, provenance, and data catalogs to run jobs on TeraGrid compute, staging data between SCEC storage and TeraGrid storage.
Ewa Deelman, Carl Kesselman, et al., USC Information Sciences Institute
Slide 35: dev.globus: Community-Driven Improvement of Globus Software (NSF OCI)
- http://dev.globus.org
- Guidelines modeled on Apache
- Infrastructure: CVS, email, Bugzilla, wiki
- Projects include ...
Slide 36: Summary
- "Science 2.0" (science as service, and service as platform) demands
  - New infrastructure: service hosting
  - New technology: hosting & management
  - New policy: hierarchically controlled access
- Data & storage management cannot be separated from computation management
  - And these increasingly become community roles
- A need for new technologies, skills, & roles
  - Creating, publishing, hosting, discovering, composing, archiving, explaining ... services
Slide 37: Science 1.0 vs. Science 2.0
  Science 1.0               ->  Science 2.0
  Gigabytes                 ->  Terabytes
  Tarballs                  ->  Services
  Journals                  ->  Wikis
  Individuals               ->  Communities
  Community codes           ->  Science gateways
  Supercomputer centers     ->  TeraGrid, OSG, campus
  Makefile                  ->  Workflow
  Computational science     ->  Science as computation
  Physical sciences         ->  All sciences (& humanities)
  Computational scientists  ->  All scientists
  NSF-funded                ->  NSF-funded
Slide 38: Science 2.0 Challenges
- A need for new technologies, skills, & roles
  - Creating, publishing, hosting, discovering, composing, archiving, explaining ... services
- A need for substantial software development
  - "30-80% of modern astronomy projects is software" (S. G. Djorgovski)
- A need for more & different infrastructure
  - Computers & networks to host services
- Can we leverage commercial spending?
  - To some extent, but not straightforwardly
Slide 39: Acknowledgements
- Carl Kesselman for many discussions
- Many colleagues, including those named on slides, for research collaborations and/or slides
- Colleagues involved in the TeraGrid, Open Science Grid, Earth System Grid, caBIG, and other Grid infrastructures
- Globus Alliance members for Globus software R&D
- DOE OASCR, NSF OCI, & NIH for support
Slide 40: For More Information
- Globus Alliance: www.globus.org
- dev.globus: dev.globus.org
- Open Science Grid: www.opensciencegrid.org
- TeraGrid: www.teragrid.org
- Background: www.mcs.anl.gov/~foster
- The Grid, 2nd Edition: www.mkp.com/grid2
Thanks to DOE, NSF, and NIH for research support!