1
Using Globus to Locate Services
Case Study 1: A Distributed Information Service for TeraGrid
John-Paul Navarro, Lee Liming
2
OSGCC 2008 Globus Primer: An Introduction to Globus Software

NSF’s TeraGrid *
A National Science Foundation Investment in Cyberinfrastructure
$100M 3-year construction (2001-2004)
$150M 5-year operation & enhancement (2005-2009)
Sites: UCSD, UT, UC/ANL, NCSA, PSC, ORNL, PU, IU
- TeraGrid DEEP: Integrating NSF’s most powerful computers (60+ TF)
  - 2+ PB online data storage
  - National data visualization facilities
  - World’s most powerful network (national footprint)
- TeraGrid WIDE Science Gateways: Engaging Scientific Communities
  - 90+ community data collections
  - Growing set of community partnerships spanning the science community
  - Leveraging NSF ITR, NIH, DOE and other science community projects
  - Engaging peer Grid projects such as Open Science Grid in the U.S. as well as peer Grids in Europe and Asia-Pacific
- Base TeraGrid Cyberinfrastructure: Persistent, Reliable, National
  - Coordinated distributed computing and information environment
  - Coherent user outreach, training, and support
  - Common, open infrastructure services
* Slide courtesy of Ray Bair, Argonne National Laboratory
3
The Challenge
- Provide a mechanism that lets resource providers, users, and partners publish and discover information about available capabilities
  - What are the TG compute resources?
  - What capabilities does resource X provide?
  - Where are the login services?
  - Where can I get data collection Y?
  - Who has a queue prediction service?
  - Who has a weather forecasting service?
- Provide a mechanism that is suitable for TeraGrid’s open community
  - Publishers register information (as opposed to turning it over to a central database)
  - Central index (like Google) enables aggregation and discovery
  - Multiple access interfaces (WS/SOAP, WS/REST, browser)
4
TG Information Services…

…IS NOT…                             | …IS…
A central database (Data Warehouse)  | A central index/aggregation (Google)
A new user interface                 | A way user interfaces access information
A single implementation/tool         | Includes several tools
A single software interface          | Accessed using several useful interfaces
A specific set of data               | Phased growing set of data
Changed data ownership               | Ownership maintained as appropriate
Way to manage scientific information | Way to manage Grid meta-data
A data management system (database)  | An information publishing system

…is a coordinated way to publish, index, and access public [Tera]Grid information using software interfaces.
5
Issues - Technical
- Information is stored in many legacy systems
  - Databases (several types, restricted access)
  - Static & dynamic web browser interfaces
- Schemas are many and diverse
  - Impractical to design a relational database that supports all of these data types and relations
- Many kinds of clients (browsers, SOAP, REST)
- High availability is critical
  - Both TG operations (testing, documentation, planning) and many TG users and partners will depend on the service, so it must be available all the time and very stable
  - Goal is 99.5% availability
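To put the 99.5% goal in perspective, the downtime it permits can be worked out directly. The sketch below is illustrative arithmetic only: the function name and the choice of a 365-day year and a 30-day month are assumptions, not part of the TeraGrid design.

```python
def allowed_downtime_hours(availability: float, period_hours: float) -> float:
    """Hours of downtime permitted over a period at a given availability."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * period_hours


HOURS_PER_YEAR = 365 * 24      # assumes a non-leap year
HOURS_PER_30_DAYS = 30 * 24    # assumes a 30-day month

yearly = allowed_downtime_hours(0.995, HOURS_PER_YEAR)
monthly = allowed_downtime_hours(0.995, HOURS_PER_30_DAYS)
print(f"per year: {yearly:.1f} h, per 30 days: {monthly:.1f} h")
```

Under these assumptions the goal leaves roughly 43.8 hours of downtime per year, or about 3.6 hours per 30 days.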
6
Issues - Social
- TG is a community of independent service providers
  - Independence is prized
  - Ownership of information (and its quality control) is important
  - Participation in other grids is typical
- Publishers have a low threshold for tech hassles
  - Publishing mechanism must be simple
- The solution must add to (not replace) existing interfaces
7
MDS4 Overview
- Components
  - Index Service: aggregates information and provides a query interface
  - Trigger Service: aggregates information and takes actions when conditions are met
  - WebMDS: subsets and transforms XML based on XPath queries, XSLT transforms and style sheets
  - Information provider APIs: integration with legacy systems
  - APIs and command-line clients for developers
- Implemented as Web services
- Uses WSRF (lifecycle, resource properties, etc.)
- Included in the Globus Toolkit 4.0
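The idea of subsetting aggregated XML with an XPath query can be illustrated without MDS4 itself. The sketch below uses Python's standard xml.etree.ElementTree (which supports only a limited XPath subset) as a stand-in for WebMDS. The XML document, its element and attribute names, the example.org endpoints, and the find_endpoints helper are all hypothetical; they do not reflect the GLUE schema or real MDS4 output.

```python
import xml.etree.ElementTree as ET

# Hypothetical aggregated index content, invented for illustration only.
INDEX_XML = """
<Index>
  <Resource site="NCSA" name="cobalt">
    <Service type="gridftp" endpoint="gsiftp://example.org:2811"/>
    <Service type="login" endpoint="ssh://login.example.org"/>
  </Resource>
  <Resource site="PSC" name="example-host">
    <Service type="gridftp" endpoint="gsiftp://other.example.org:2811"/>
  </Resource>
</Index>
"""


def find_endpoints(xml_text, service_type):
    """Return the endpoint of every Service element of the given type."""
    root = ET.fromstring(xml_text)
    # Attribute predicates such as [@type='...'] are within ElementTree's
    # supported XPath subset.
    path = f".//Service[@type='{service_type}']"
    return [element.get("endpoint") for element in root.findall(path)]


gridftp = find_endpoints(INDEX_XML, "gridftp")
print(gridftp)
```

The same query shape answers questions such as "Where are GridFTP services?" against whatever schema the index actually holds.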
8
The MDS4 Hourglass
[Diagram labels: Information Users (Schedulers, Portals, Warning Systems, etc.); Schemas; WS standard interfaces for subscription, registration, notification; Services (GRAM, RFT, RLS); Queueing Systems (PBS, Torque, etc.); Other sources of information]
9
TeraGrid’s IS Architecture
[Diagram labels: Clients; Cache; WS/REST (HTTP GET); WS/SOAP; Apache 2.0; Tomcat, WebMDS; WS MDS4; TeraGrid Central Services; TeraGrid Repositories; Partners; Resource Provider Services (WS/SOAP, WS MDS4)]
10
Central vs. Distributed Services
- Publisher Content
  - Publisher-owned and maintained information
  - Data probably originates somewhere in the local system
- Publisher Code
  - An MDS4 index service
  - Or: any Web service that has WS-ResourceProperties
- Central Content
  - Aggregated publisher content
- Central Code
  - Redundant servers
  - Information caching (persistence)
  - MDS4 index services (WS/SOAP)
  - WebMDS/Tomcat, Apache 2.0, … (WS/REST)
  - Content published in: HTML, XHTML/XML, XML, Atom, RSS, …
11
Registration
- Publisher registers available content
  - Local service maintains a registration with the central indices
  - Registration expires automatically, so refresh is needed periodically
  - Publishers retain ownership and operation of their own information service (can be registered with other grids!)
- Index services pull content
  - Registrations are subject to access control
  - Uses registration data to contact service and get latest content
  - Caches content locally, subject to purge policy
  - Cache allows for service outages, etc.
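The registration cycle described above (expiring registrations, periodic refresh, pulled content, a cache that rides out outages) can be sketched in a few lines. This is a minimal in-memory model, not the MDS4 API: the CentralIndex and Registration classes, the ttl value, the injectable clock, and the purge-on-expiry policy are all assumptions made for illustration, and access control is left out.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Registration:
    fetch: Callable[[], str]   # pulls the latest content from the publisher
    expires_at: float          # registration lapses unless refreshed


@dataclass
class CentralIndex:
    ttl: float = 600.0                            # lifetime in seconds (assumed)
    clock: Callable[[], float] = time.monotonic   # injectable for testing
    registrations: Dict[str, Registration] = field(default_factory=dict)
    cache: Dict[str, str] = field(default_factory=dict)

    def register(self, name: str, fetch: Callable[[], str]) -> None:
        """Create or refresh a registration; publishers call this periodically."""
        self.registrations[name] = Registration(fetch, self.clock() + self.ttl)

    def poll(self) -> None:
        """Drop expired registrations, then pull content from the rest."""
        now = self.clock()
        expired = [n for n, r in self.registrations.items() if r.expires_at <= now]
        for name in expired:
            del self.registrations[name]
            self.cache.pop(name, None)   # assumed purge policy: purge on expiry
        for name, registration in self.registrations.items():
            try:
                self.cache[name] = registration.fetch()
            except Exception:
                pass   # publisher outage: keep serving the cached copy

    def lookup(self, name: str) -> Optional[str]:
        return self.cache.get(name)


# Usage with a fake clock so that expiry can be shown without waiting.
now = [0.0]
index = CentralIndex(ttl=10.0, clock=lambda: now[0])
index.register("site-a", lambda: "queue: 3 jobs")
index.poll()
print(index.lookup("site-a"))   # content pulled and cached
now[0] = 11.0                   # no refresh arrived within the lifetime
index.poll()
print(index.lookup("site-a"))   # registration expired, cache purged
```

Because the publisher only holds a registration and the index pulls on its own schedule, the publisher keeps ownership of its service and can register the same service with other indices.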
12
High-Availability Design
TeraGrid Dynamic DNS
- Dynamically directs clients to one or more servers
- Set by Information Services administrators
- Changes propagate globally fast (TTL = 15 minutes)
(Patrick Dorn & NCSA NetEng)
[Diagram labels: Clients; info.teragrid.org; info.dyn.teragrid.org; TG-wide servers; RP/partner services; …; legend: "Dynamically Changes", "Doesn’t Change"]
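When DNS publishes more than one server for a name, a client can try each address in turn. The sketch below shows one way a client might do that with Python's socket module. Client-side fallback is an assumption added here for illustration, since the slide only describes the DNS side; the resolve and first_reachable helpers and the injectable connect parameter are hypothetical names.

```python
import socket
from typing import Callable, Iterable, List, Optional


def resolve(host: str, port: int) -> List[str]:
    """Return the distinct IP addresses currently published for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses: List[str] = []
    for info in infos:
        address = info[4][0]   # sockaddr is the fifth field; its first item is the IP
        if address not in addresses:
            addresses.append(address)
    return addresses


def first_reachable(
    addresses: Iterable[str],
    port: int,
    connect: Callable = socket.create_connection,
    timeout: float = 5.0,
) -> Optional[str]:
    """Return the first address that accepts a TCP connection, or None."""
    for address in addresses:
        try:
            connection = connect((address, port), timeout)
        except OSError:
            continue   # this server is down or unreachable; try the next one
        connection.close()
        return address
    return None
```

A call such as `first_reachable(resolve("info.teragrid.org", 80), 80)` would combine the two steps; port 80 is an assumption, and the hostname is taken from the slide and may no longer resolve. With a 15-minute TTL, a client that re-resolves the name picks up an administrator's change within about that time.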
13
Information Services Users
[Diagram labels: info.teragrid.org; User Documentation; User Portal; Database?; Gateways; Peer Grids; User Applications; Others]
14
Queue Contents in User Portal
15
Where are GridFTP services?
16
Where Can I Log In?
To log in to cobalt at NCSA, ssh to grid-co.ncsa.teragrid.org!
Don’t use this one (yet).
17
Results - TeraGrid
- Considerable excitement from information owners…
  - A way to raise awareness for their information & capabilities
  - Doesn’t require them to replace legacy systems or turn information over to someone else
- …and information consumers
  - Simple, consistent access mechanisms for lots of information types
  - A mediating agency for independent service operators
- Integration to date:
  - Compute service descriptions and queue status
  - Software & service availability registry
  - Central documentation
  - Verification & validation testing service
18
MDS4 has…
- WS/WSRF interface
- WS/REST interface (browser-accessible)
- XSLT/XPath support
- Registration, polling, subscription, notification capabilities
- Index & trigger service
- GLUE CE providers
- Plug-in API for custom info providers

MDS4 doesn’t have…
- Your own custom info providers
- Schema validation
- Many clients (unless you count browsers!)
- XSLT style examples
- High-availability deployment
19
Other Uses of MDS4
- Directory of service deployments
  - E.g., caGrid service registry
- Monitoring/alert service
  - Trigger service notifies when an expected service registration isn’t there anymore
- Monitoring/recording service
  - Subscriber periodically records value of a registered resource property (e.g., free space, services registered, system load)
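The monitoring/alert use can be reduced to a set comparison between the registrations that are expected and those currently present. The sketch below shows that condition in isolation. It is not the MDS4 Trigger Service API: the function names, the notify callback, and the message text are hypothetical.

```python
from typing import Callable, Iterable, List


def missing_registrations(expected: Iterable[str], registered: Iterable[str]) -> List[str]:
    """Return, in sorted order, the expected names absent from the index."""
    present = set(registered)
    return sorted(name for name in set(expected) if name not in present)


def run_trigger(
    expected: Iterable[str],
    registered: Iterable[str],
    notify: Callable[[str], None],
) -> List[str]:
    """Call notify once per missing registration and return the missing names."""
    missing = missing_registrations(expected, registered)
    for name in missing:
        notify(f"expected registration missing: {name}")
    return missing


# Usage: collect alerts in a list instead of sending mail or paging someone.
alerts: List[str] = []
missing = run_trigger(["gridftp", "login"], ["login"], alerts.append)
print(missing, alerts)
```

Because registrations expire unless refreshed, a service that stops refreshing eventually drops out of the registered set, which is what makes this check useful as a liveness signal.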