TeraGrid Information Services: Building on Globus MDS4 NSF TeraGrid Review January 10, 2006 TeraGrid Information Services: Building on Globus MDS4 John-Paul “JP” Navarro TeraGrid Grid Infrastructure Group “GIG” Area Co-Director for Software Integration University of Chicago, Argonne National Laboratory GlobusWorld May 13, 2008 Good afternoon, I’m JP Navarro I’m a member of the TeraGrid’s Grid Infrastructure Group where I focus on Software Integration and Information Services activities This talk will introduce the TG’s information services motivation, goals, architecture, current capabilities, and plans and how these build on Globus MDS4 Charlie Catlett (cec@uchicago.edu)
What is the TeraGrid? 1 NSF funded facility with NSF TeraGrid Review What is the TeraGrid? January 10, 2006 1 NSF funded facility with 11 resource providers and several other partners Grid Infrastructure Group (UChicago) UW UC/ANL PSC NCAR PU NCSA UNC/RENCI Caltech IU ORNL Tennessee USC/ISI SDSC LSU TACC Resource Provider (RP) Software Integration Partner Network Hub May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
Grid Infrastructure Group (UChicago) NSF TeraGrid Review What is the TeraGrid? January 10, 2006 Operating a coordinated high-performance compute, network, storage, and visualization infrastructure Grid Infrastructure Group (UChicago) UW UC/ANL PSC NCAR PU NCSA UNC/RENCI Caltech IU ORNL Tennessee USC/ISI SDSC LSU TACC Resource Provider (RP) Software Integration Partner Network Hub May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
Open Scientific Discover NSF TeraGrid Review What is the TeraGrid? January 10, 2006 To enable Open Scientific Discover Grid Infrastructure Group (UChicago) UW UC/ANL PSC NCAR PU NCSA UNC/RENCI Caltech IU ORNL Tennessee USC/ISI SDSC LSU TACC Resource Provider (RP) Software Integration Partner Network Hub May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
One facility in what sense? NSF TeraGrid Review One facility in what sense? January 10, 2006 Common People Interfaces, examples: Request allocations & accounts “POPS” Identify available resources Learn how to use resources Find resource status Ask for help Community events Events Helpdesk User Documentation & Knowledge Base What does the TeraGrid provide as a Grid that is different from what independent facilities would offer 1st were providing COMMON interfaces intended for PEOPLE POPS: single interface to request allocations on any or all resources “roaming” User Portal: standard user aware interface User Documentation: public documentation and information Proposal System “POPS” User Portal May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
(Tera)Grid in what sense? NSF TeraGrid Review (Tera)Grid in what sense? January 10, 2006 Coordinated/Standard (User) Software Interfaces Coordinated software Coordinated development & runtime Unix environment Standardized (Grid) service interfaces Grid Services Remote Login (GSI-OpenSSH) Coordinated Software Unix Environment Data Movement (GridFTP, RFT) Grid clients & tools Data Management (SRB) Development tools & languages What else does the TeraGrid provide as a Grid that is different from what independent facilities would offer The 2nd major thing we provide providing coordinated/standard software interface Remote Execution (GRAM) Communication, Data, and Math tools & libraries Information/Discovery (MDS4, Tomcat, Apache2) May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
How can users keep it all straight? NSF TeraGrid Review Challenges January 10, 2006 Integrating many more types of infrastructure resources, Including more services providers, Providing more types of services, With more resources, providers, and services diversity, Supporting specialization. The TG started as 4 institutions running the same hardware, OS, and software Since then we’ve diversified to 11 resource providers, ~dozen platforms (Linux and vendor Unix), 20 machines, traditional and community/gateway users Expanding variety of service types, service providers, and specialization amongst the service providers How can users keep it all straight? May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
Meeting the Challenge 1 Re-architect CTSS NSF TeraGrid Review Meeting the Challenge 1 Re-architect CTSS January 10, 2006 Took Coordinated TeraGrid Software and Services “CTSS” v3: Monolithic coordinated capabilities In production since mid 2006 Re-architected into CTSS v4 “capability kits”: Small required core integration kit with minimum needed to integrate resources Plus ~10 optional user capability kits (at least 1 required) CTSS v3 (with exceptions) CTSS v4 TeraGrid Core Integration Remote Login Data Movement Data Management To address this challenge the TeraGrid took a two (2) prong approach. First we took the Coordinated TeraGrid Software and Services Application Development & Runtime Science Workflow Remote Compute Wide Area GPFS Parallel Application (MPI) Visualization May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
Meeting the Challenge 2 Information Services NSF TeraGrid Review Meeting the Challenge 2 Information Services January 10, 2006 Information Services Vision: Create a coordinated way for TeraGrid participants to publish about the services they offer, Create a way for the TeraGrid to aggregate and index the information from TeraGrid participants, and to publish this information to the public in a form that can easily be used by user software, user interfaces, and TeraGrid service providers themselves to discover capabilities and how to access them Our motivating vision major improvements to how TeraGrid Service Providers communicate information about their service offerings to the User Community May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
Information Services Design Goals NSF TeraGrid Review Information Services Design Goals January 10, 2006 Applies Grid concepts to original information publishing Publishing is the responsibility of the information owner Publishing is done using standard (content) schemas Publishing thru standard interfaces regardless of content and where the data comes from Publishing services should be available globally (subject to authentication/authorization) Information owners publish to EVERYONE, not just the TeraGrid Publishing is a grid service Applies Grid concepts to aggregated information publishing Aggregation uses standard information services interfaces to retrieve information Publishing aggregated information is done exactly like original information publishing This is how a collaboration, such as the TeraGrid, aggregates participant information Applies Grid concepts to querying information Querying can use standard interfaces regardless of content Querying can use standard interfaces for original or aggregated information Our motivating vision major improvements to how TeraGrid Service Providers communicate information about their service offerings to the User Community May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
High-Level Architecture NSF TeraGrid Review High-Level Architecture January 10, 2006 TeraGrid Wide Information Services Apache 2.0 WS/REST HTTP GET Clients Cache Tomcat WebMDS TeraGrid Wide Respositories WS/SOAP Clients WS MDS4 Service Provider Information Services WS/SOAP WS MDS4 Clients Adapter Local Info May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
Information Services Tooling NSF TeraGrid Review Information Services Tooling January 10, 2006 WS/SOAP (Globus 4.0.x MDS4) Benefits Indexing, Trigger Registration, Publish, Subscribe Security/Authorization Robust WSRF interface Content XML WS/* (Tomcat 5.0, Apache 2.0) Very common web services platform Supports several web service interfaces (including simple) Supports multiple styles like REST, Web 2.0 Can be highly scalable Many formats: HTML, XHTML/XML, XML, RSS/Atom, … WebMDS (Globus 4.0.x) Live MDS4 content access XPath support XSLT transforms Many formats: HTML, XHTML/XML, XML, RSS/Atom May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
Service Provider vs TG Wide Services NSF TeraGrid Review Service Provider vs TG Wide Services January 10, 2006 Content: Locally owned and published information Can come from existing local systems Services: 1 general purpose MDS service 1 remote execution MDS services May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
Service Provider vs TG Wide Services NSF TeraGrid Review Service Provider vs TG Wide Services January 10, 2006 Content: Aggregate/index service provider information Plus central information (TeraGrid databases) Cached Authenticated registrations Services: Several redundant servers (>99.5% availability) Each server: Information caching programs MDS4 index services (WS/SOAP) WebMDS/Tomcat, Apache 2.0, … services (WS/REST) Services publish in: HTML XML CSV (Atom, JSON, RSS being prototyped) May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
High-Availability Design NSF TeraGrid Review High-Availability Design January 10, 2006 TeraGrid Wide Information Services Clients info.teragrid.org Service Provider Information Services (commercial XEN server) info.dyn.teragrid.org TeraGrid Dynamic DNS This is both a high-availability and high-throughput design Primary is a commercially hosted XEN image (TeraGrid physical server) Server failover propagates globally in 15 minutes … Static paths Dynamic paths May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
Caching Design Goal MDS configuration: Caching programs: NSF TeraGrid Review Caching Design January 10, 2006 Goal Information persistence when service providers are down and when central information services go down MDS configuration: Service provider register their existence Central MDS do not aggregate automatically Central MDS publishes from local file-system cache Caching programs: External and asynchronous to MDS Query registered Service provider MDSs, and Other configured MDSs Store query results in file-system Replaces previously cached data only with new data Not if downstream is down Or if downstream returns an error Or if downstream doesn’t return valid data May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
Primary MDS4 features used NSF TeraGrid Review Primary MDS4 features used January 10, 2006 Registration: Parallel upstream registration to multiple central servers Authenticated/authorized upstream registration Aggregation: Some information is automatically aggregated (WS services registration and default GLUE) Information providers: Useful RP custom information providers Scheduling load and queue contents (User Portal) CTSS 4 Capability kit registration TeraGrid ID and description cross-reference (TGCDB) Secure MDS: Authenticated/authorized queries (non-anonymous) Multiple index services in same container DefaultIndexService AND SecureIndexService WebMDS XSLT, Xpath May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
Information Services Users NSF TeraGrid Review Information Services Users January 10, 2006 User Documentation http://www.teragrid.org/ User Portal http://portal.teragrid.org/ Inca Testing Harness Gateways We our eating our own dogfood, publishing for internal consumption Users and developers like yourself are our also our target Peer Grids User Applications info.teragrid.org May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
CTSS Capability Kit Availability NSF TeraGrid Review CTSS Capability Kit Availability January 10, 2006 Charlie Catlett (cec@uchicago.edu)
Where are the GridFTP services? NSF TeraGrid Review Where are the GridFTP services? January 10, 2006 Charlie Catlett (cec@uchicago.edu)
Queue Contents in User Portal NSF TeraGrid Review Queue Contents in User Portal January 10, 2006 Charlie Catlett (cec@uchicago.edu)
Inca Capability Kit Testing NSF TeraGrid Review Inca Capability Kit Testing January 10, 2006 Charlie Catlett (cec@uchicago.edu)
CTSS 4 Capability Kits For each capability kit on each resource NSF TeraGrid Review CTSS 4 Capability Kits January 10, 2006 For each capability kit on each resource Current support level, and target support level Development, Testing, Production Support organization and contact Inca status URL Multiple version of a kit with different support levels Generic capabilities, web and non-web, local, central, peer grid… May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
CTSS 4 Capability Kit Software NSF TeraGrid Review CTSS 4 Capability Kit Software January 10, 2006 For each kit software component on each resource Name, version, how to access it Multiple versions of a single component May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
CTSS 4 Capability Kit Services NSF TeraGrid Review CTSS 4 Capability Kit Services January 10, 2006 For each kit service on each resource Name, type, version, and Endpoint (contact location) GSI OpenSSH, GridFTP, SRB servers, PreWS & WS GRAM, MDS4 Multiple services of the same type May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
What’s in Development? Expanded content NSF TeraGrid Review What’s in Development? January 10, 2006 Expanded content TeraGrid Gateways list Extended GridFTP service information (striping, bandwidth, etc) Local HPC Software (Meta)Scheduling support information Information Services Metadata Enhanced caching, aggregation, and storage Improved information access tginfo, universal command line query tool WS/REST, Web 2.0 style information access Multiple formats: CSV TEXT, XML, JSON, RSS/Atom, … GLUE 2.0 Usage metrics Coordinated software and services -> CTSS Local (uncoordinated) HPC software May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
The Team Information Services design, development, and support team NSF TeraGrid Review The Team January 10, 2006 Information Services design, development, and support team Eric Blau Jason Brechin Lee Liming JP Navarro Laura Pearlman Information Services downstream developers and consumers End-to-end data transfer optimization team @ PSC (Derek Simmel, others) INCA (Kate Ericson) Operations (Jason Brechin, Tony Rimovsky) User Documentation (Mike Dwyer, Diana Diehl) User Portal (Maytal Dahan) Other lurkers… May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)
More Information Information Services main page: NSF TeraGrid Review More Information January 10, 2006 Information Services main page: http://info.teragrid.org/ (links to content and documentation) User Documentation (CTSS 4 kits, software, services): http://www.teragrid.org/userinfo/software/ctss.php User Portal (scheduler load & queue contents): https://portal.teragrid.org:443/gridsphere/gridsphere?cid=resources Inca Monitoring Framework http://www.teragrid.org/userinfo/software/ctss.php E-mail: <navarro@mcs.anl.gov> May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)