TeraGrid Information Services: Building on Globus MDS4

Slides:



Advertisements
Similar presentations
TeraGrid Deployment Test of Grid Software JP Navarro TeraGrid Software Integration University of Chicago OGF 21 October 19, 2007.
Advertisements

Cross-site data transfer on TeraGrid using GridFTP TeraGrid06 Institute User Introduction to TeraGrid June 12 th by Krishna Muriki
1 US activities and strategy :NSF Ron Perrott. 2 TeraGrid An instrument that delivers high-end IT resources/services –a computational facility – over.
USING THE GLOBUS TOOLKIT This summary by: Asad Samar / CALTECH/CMS Ben Segal / CERN-IT FULL INFO AT:
Data Grids: Globus vs SRB. Maturity SRB  Older code base  Widely accepted across multiple communities  Core components are tightly integrated Globus.
Using Globus to Locate Services Case Study 1: A Distributed Information Service for TeraGrid John-Paul Navarro, Lee Liming.
4b.1 Grid Computing Software Components of Globus 4.0 ITCS 4010 Grid Computing, 2005, UNC-Charlotte, B. Wilkinson, slides 4b.
Mike Smorul Saurabh Channan Digital Preservation and Archiving at the Institute for Advanced Computer Studies University of Maryland, College Park.
Simo Niskala Teemu Pasanen
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Kate Keahey Argonne National Laboratory University of Chicago Globus Toolkit® 4: from common Grid protocols to virtualization.
Network, Operations and Security Area Tony Rimovsky NOS Area Director
TeraGrid’s Integrated Information Service “IIS” Grid Computing Environments 2009 Lee Liming, JP Navarro, Eric Blau, Jason Brechin, Charlie Catlett, Maytal.
TeraGrid Gateway User Concept – Supporting Users V. E. Lynch, M. L. Chen, J. W. Cobb, J. A. Kohl, S. D. Miller, S. S. Vazhkudai Oak Ridge National Laboratory.
GIG Software Integration: Area Overview TeraGrid Annual Project Review April, 2008.
TeraGrid Information Services December 1, 2006 JP Navarro GIG Software Integration.
GIG Software Integration Project Plan, PY4-PY5 Lee Liming Mary McIlvain John-Paul Navarro.
TeraGrid Information Services John-Paul “JP” Navarro TeraGrid Grid Infrastructure Group “GIG” Area Co-Director for Software Integration and Information.
TeraGrid Information Services JP Navarro, Lee Liming University of Chicago TeraGrid Architecture Meeting September 20, 2007.
GRAM: Software Provider Forum Stuart Martin Computational Institute, University of Chicago & Argonne National Lab TeraGrid 2007 Madison, WI.
CTSS 4 Strategy and Status. General Character of CTSSv4 To meet project milestones, CTSS changes must accelerate in the coming years. Process –Process.
GT Components. Globus Toolkit A “toolkit” of services and packages for creating the basic grid computing infrastructure Higher level tools added to this.
Grids and Portals for VLAB Marlon Pierce Community Grids Lab Indiana University.
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES Data Replication Service Sandeep Chandra GEON Systems Group San Diego Supercomputer Center.
1 Geospatial and Business Intelligence Jean-Sébastien Turcotte Executive VP San Francisco - April 2007 Streamlining web mapping applications.
TeraGrid CTSS Plans and Status Dane Skow for Lee Liming and JP Navarro OSG Consortium Meeting 22 August, 2006.
Rochester Institute of Technology Cyberaide Shell: Interactive Task Management for Grids and Cyberinfrastructure Gregor von Laszewski, Andrew J. Younge,
Evolving Interfaces to Impacting Technology: The Mobile TeraGrid User Portal Rion Dooley, Stephen Mock, Maytal Dahan, Praveen Nuthulapati, Patrick Hurley.
1 NSF/TeraGrid Science Advisory Board Meeting July 19-20, San Diego, CA Brief TeraGrid Overview and Expectations of Science Advisory Board John Towns TeraGrid.
TeraGrid Gateway User Concept – Supporting Users V. E. Lynch, M. L. Chen, J. W. Cobb, J. A. Kohl, S. D. Miller, S. S. Vazhkudai Oak Ridge National Laboratory.
Biomedical and Bioscience Gateway to National Cyberinfrastructure John McGee Renaissance Computing Institute
Data, Visualization and Scheduling (DVS) TeraGrid Annual Meeting, April 2008 Kelly Gaither, GIG Area Director DVS.
Network, Operations and Security Area Tony Rimovsky NOS Area Director
CTSS Version 4 User Support Documentation Mike Dwyer, Kerry Hagan, Diana Diehl.
TeraGrid’s Common User Environment: Status, Challenges, Future Annual Project Review April, 2008.
Software Integration Highlights CY2008 Lee Liming, JP Navarro GIG Area Directors for Software Integration University of Chicago, Argonne National Laboratory.
TeraGrid Capability Discovery John-Paul “JP” Navarro TeraGrid Area Co-Director for Software Integration University of Chicago/Argonne National Laboratory.
Data Infrastructure in the TeraGrid Chris Jordan Campus Champions Presentation May 6, 2009.
Integrated Information Services “IIS” JP Navarro, U. of Chicago/ANL OGF 30 October 28, 2010.
Parallel Computing Globus Toolkit – Grid Ayaka Ohira.
System Software Laboratory Databases and the Grid by Paul Watson University of Newcastle Grid Computing: Making the Global Infrastructure a Reality June.
TeraGrid Software Integration: Area Overview (detailed in 2007 Annual Report Section 3) Lee Liming, JP Navarro TeraGrid Annual Project Review April, 2008.
TeraGrid User Portal and Online Presence David Hart, SDSC Area Director, User-Facing Projects and Core Services TeraGrid Annual Review April 6, 2009.
NSF TeraGrid Review January 10, 2006
TeraGrid Information Services
GPIR GridPort Information Repository
Computing Clusters, Grids and Clouds Globus data service
Scalable Web Apps Target this solution to brand leaders responsible for customer engagement and roll-out of global marketing campaigns. Implement scenarios.
NSF TeraGrid Review January 10, 2006
z/Ware 2.0 Technical Overview
TeraGrid Information Services Developer Introduction
Data Bridge Solving diverse data access in scientific applications
Information Services Discussion TeraGrid ‘08
Open Source distributed document DB for an enterprise
Globus —— Toolkits for Grid Computing
Middleware independent Information Service
CUAHSI HIS Sharing hydrologic data
Shaowen Wang1, 2, Yan Liu1, 2, Nancy Wilkins-Diehr3, Stuart Martin4,5
TeraGrid’s GLUE 2 Implementation
Grid Computing.
Grid Portal Services IeSE (the Integrated e-Science Environment)
Viet Tran Institute of Informatics Slovakia
Scalable Web Apps Target this solution to brand leaders responsible for customer engagement and roll-out of global marketing campaigns. Implement scenarios.
University of Technology
Cyberinfrastructure and PolarGrid
Chapter 7 –Implementation Issues
Enterprise Integration
Data Management Components for a Research Data Archive
Grid Computing Software Interface
Presentation transcript:

TeraGrid Information Services: Building on Globus MDS4 NSF TeraGrid Review January 10, 2006 TeraGrid Information Services: Building on Globus MDS4 John-Paul “JP” Navarro TeraGrid Grid Infrastructure Group “GIG” Area Co-Director for Software Integration University of Chicago, Argonne National Laboratory GlobusWorld May 13, 2008 Good afternoon, I’m JP Navarro I’m a member of the TeraGrid’s Grid Infrastructure Group where I focus on Software Integration and Information Services activities This talk will introduce the TG’s information services motivation, goals, architecture, current capabilities, and plans and how these build on Globus MDS4 Charlie Catlett (cec@uchicago.edu)

What is the TeraGrid? 1 NSF funded facility with NSF TeraGrid Review What is the TeraGrid? January 10, 2006 1 NSF funded facility with 11 resource providers and several other partners Grid Infrastructure Group (UChicago) UW UC/ANL PSC NCAR PU NCSA UNC/RENCI Caltech IU ORNL Tennessee USC/ISI SDSC LSU TACC Resource Provider (RP) Software Integration Partner Network Hub May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

Grid Infrastructure Group (UChicago) NSF TeraGrid Review What is the TeraGrid? January 10, 2006 Operating a coordinated high-performance compute, network, storage, and visualization infrastructure Grid Infrastructure Group (UChicago) UW UC/ANL PSC NCAR PU NCSA UNC/RENCI Caltech IU ORNL Tennessee USC/ISI SDSC LSU TACC Resource Provider (RP) Software Integration Partner Network Hub May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

Open Scientific Discover NSF TeraGrid Review What is the TeraGrid? January 10, 2006 To enable Open Scientific Discover Grid Infrastructure Group (UChicago) UW UC/ANL PSC NCAR PU NCSA UNC/RENCI Caltech IU ORNL Tennessee USC/ISI SDSC LSU TACC Resource Provider (RP) Software Integration Partner Network Hub May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

One facility in what sense? NSF TeraGrid Review One facility in what sense? January 10, 2006 Common People Interfaces, examples: Request allocations & accounts “POPS” Identify available resources Learn how to use resources Find resource status Ask for help Community events Events Helpdesk User Documentation & Knowledge Base What does the TeraGrid provide as a Grid that is different from what independent facilities would offer 1st were providing COMMON interfaces intended for PEOPLE POPS: single interface to request allocations on any or all resources “roaming” User Portal: standard user aware interface User Documentation: public documentation and information Proposal System “POPS” User Portal May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

(Tera)Grid in what sense? NSF TeraGrid Review (Tera)Grid in what sense? January 10, 2006 Coordinated/Standard (User) Software Interfaces Coordinated software Coordinated development & runtime Unix environment Standardized (Grid) service interfaces Grid Services Remote Login (GSI-OpenSSH) Coordinated Software Unix Environment Data Movement (GridFTP, RFT) Grid clients & tools Data Management (SRB) Development tools & languages What else does the TeraGrid provide as a Grid that is different from what independent facilities would offer The 2nd major thing we provide providing coordinated/standard software interface Remote Execution (GRAM) Communication, Data, and Math tools & libraries Information/Discovery (MDS4, Tomcat, Apache2) May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

How can users keep it all straight? NSF TeraGrid Review Challenges January 10, 2006 Integrating many more types of infrastructure resources, Including more services providers, Providing more types of services, With more resources, providers, and services diversity, Supporting specialization. The TG started as 4 institutions running the same hardware, OS, and software Since then we’ve diversified to 11 resource providers, ~dozen platforms (Linux and vendor Unix), 20 machines, traditional and community/gateway users Expanding variety of service types, service providers, and specialization amongst the service providers How can users keep it all straight? May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

Meeting the Challenge 1 Re-architect CTSS NSF TeraGrid Review Meeting the Challenge 1 Re-architect CTSS January 10, 2006 Took Coordinated TeraGrid Software and Services “CTSS” v3: Monolithic coordinated capabilities In production since mid 2006 Re-architected into CTSS v4 “capability kits”: Small required core integration kit with minimum needed to integrate resources Plus ~10 optional user capability kits (at least 1 required) CTSS v3 (with exceptions) CTSS v4 TeraGrid Core Integration Remote Login Data Movement Data Management To address this challenge the TeraGrid took a two (2) prong approach. First we took the Coordinated TeraGrid Software and Services Application Development & Runtime Science Workflow Remote Compute Wide Area GPFS Parallel Application (MPI) Visualization May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

Meeting the Challenge 2 Information Services NSF TeraGrid Review Meeting the Challenge 2 Information Services January 10, 2006 Information Services Vision: Create a coordinated way for TeraGrid participants to publish about the services they offer, Create a way for the TeraGrid to aggregate and index the information from TeraGrid participants, and to publish this information to the public in a form that can easily be used by user software, user interfaces, and TeraGrid service providers themselves to discover capabilities and how to access them Our motivating vision major improvements to how TeraGrid Service Providers communicate information about their service offerings to the User Community May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

Information Services Design Goals NSF TeraGrid Review Information Services Design Goals January 10, 2006 Applies Grid concepts to original information publishing Publishing is the responsibility of the information owner Publishing is done using standard (content) schemas Publishing thru standard interfaces regardless of content and where the data comes from Publishing services should be available globally (subject to authentication/authorization) Information owners publish to EVERYONE, not just the TeraGrid Publishing is a grid service Applies Grid concepts to aggregated information publishing Aggregation uses standard information services interfaces to retrieve information Publishing aggregated information is done exactly like original information publishing This is how a collaboration, such as the TeraGrid, aggregates participant information Applies Grid concepts to querying information Querying can use standard interfaces regardless of content Querying can use standard interfaces for original or aggregated information Our motivating vision major improvements to how TeraGrid Service Providers communicate information about their service offerings to the User Community May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

High-Level Architecture NSF TeraGrid Review High-Level Architecture January 10, 2006 TeraGrid Wide Information Services Apache 2.0 WS/REST HTTP GET Clients Cache Tomcat WebMDS TeraGrid Wide Respositories WS/SOAP Clients WS MDS4 Service Provider Information Services WS/SOAP WS MDS4 Clients Adapter Local Info May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

Information Services Tooling NSF TeraGrid Review Information Services Tooling January 10, 2006 WS/SOAP (Globus 4.0.x MDS4) Benefits Indexing, Trigger Registration, Publish, Subscribe Security/Authorization Robust WSRF interface Content XML WS/* (Tomcat 5.0, Apache 2.0) Very common web services platform Supports several web service interfaces (including simple) Supports multiple styles like REST, Web 2.0 Can be highly scalable Many formats: HTML, XHTML/XML, XML, RSS/Atom, … WebMDS (Globus 4.0.x) Live MDS4 content access XPath support XSLT transforms Many formats: HTML, XHTML/XML, XML, RSS/Atom May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

Service Provider vs TG Wide Services NSF TeraGrid Review Service Provider vs TG Wide Services January 10, 2006 Content: Locally owned and published information Can come from existing local systems Services: 1 general purpose MDS service 1 remote execution MDS services May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

Service Provider vs TG Wide Services NSF TeraGrid Review Service Provider vs TG Wide Services January 10, 2006 Content: Aggregate/index service provider information Plus central information (TeraGrid databases) Cached Authenticated registrations Services: Several redundant servers (>99.5% availability) Each server: Information caching programs MDS4 index services (WS/SOAP) WebMDS/Tomcat, Apache 2.0, … services (WS/REST) Services publish in: HTML XML CSV (Atom, JSON, RSS being prototyped) May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

High-Availability Design NSF TeraGrid Review High-Availability Design January 10, 2006 TeraGrid Wide Information Services Clients info.teragrid.org Service Provider Information Services (commercial XEN server) info.dyn.teragrid.org TeraGrid Dynamic DNS This is both a high-availability and high-throughput design Primary is a commercially hosted XEN image (TeraGrid physical server) Server failover propagates globally in 15 minutes … Static paths Dynamic paths May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

Caching Design Goal MDS configuration: Caching programs: NSF TeraGrid Review Caching Design January 10, 2006 Goal Information persistence when service providers are down and when central information services go down MDS configuration: Service provider register their existence Central MDS do not aggregate automatically Central MDS publishes from local file-system cache Caching programs: External and asynchronous to MDS Query registered Service provider MDSs, and Other configured MDSs Store query results in file-system Replaces previously cached data only with new data Not if downstream is down Or if downstream returns an error Or if downstream doesn’t return valid data May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

Primary MDS4 features used NSF TeraGrid Review Primary MDS4 features used January 10, 2006 Registration: Parallel upstream registration to multiple central servers Authenticated/authorized upstream registration Aggregation: Some information is automatically aggregated (WS services registration and default GLUE) Information providers: Useful RP custom information providers Scheduling load and queue contents (User Portal) CTSS 4 Capability kit registration TeraGrid ID and description cross-reference (TGCDB) Secure MDS: Authenticated/authorized queries (non-anonymous) Multiple index services in same container DefaultIndexService AND SecureIndexService WebMDS XSLT, Xpath May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

Information Services Users NSF TeraGrid Review Information Services Users January 10, 2006 User Documentation http://www.teragrid.org/ User Portal http://portal.teragrid.org/ Inca Testing Harness Gateways We our eating our own dogfood, publishing for internal consumption Users and developers like yourself are our also our target Peer Grids User Applications info.teragrid.org May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

CTSS Capability Kit Availability NSF TeraGrid Review CTSS Capability Kit Availability January 10, 2006 Charlie Catlett (cec@uchicago.edu)

Where are the GridFTP services? NSF TeraGrid Review Where are the GridFTP services? January 10, 2006 Charlie Catlett (cec@uchicago.edu)

Queue Contents in User Portal NSF TeraGrid Review Queue Contents in User Portal January 10, 2006 Charlie Catlett (cec@uchicago.edu)

Inca Capability Kit Testing NSF TeraGrid Review Inca Capability Kit Testing January 10, 2006 Charlie Catlett (cec@uchicago.edu)

CTSS 4 Capability Kits For each capability kit on each resource NSF TeraGrid Review CTSS 4 Capability Kits January 10, 2006 For each capability kit on each resource Current support level, and target support level Development, Testing, Production Support organization and contact Inca status URL Multiple version of a kit with different support levels Generic capabilities, web and non-web, local, central, peer grid… May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

CTSS 4 Capability Kit Software NSF TeraGrid Review CTSS 4 Capability Kit Software January 10, 2006 For each kit software component on each resource Name, version, how to access it Multiple versions of a single component May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

CTSS 4 Capability Kit Services NSF TeraGrid Review CTSS 4 Capability Kit Services January 10, 2006 For each kit service on each resource Name, type, version, and Endpoint (contact location) GSI OpenSSH, GridFTP, SRB servers, PreWS & WS GRAM, MDS4 Multiple services of the same type May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

What’s in Development? Expanded content NSF TeraGrid Review What’s in Development? January 10, 2006 Expanded content TeraGrid Gateways list Extended GridFTP service information (striping, bandwidth, etc) Local HPC Software (Meta)Scheduling support information Information Services Metadata Enhanced caching, aggregation, and storage Improved information access tginfo, universal command line query tool WS/REST, Web 2.0 style information access Multiple formats: CSV TEXT, XML, JSON, RSS/Atom, … GLUE 2.0 Usage metrics Coordinated software and services -> CTSS Local (uncoordinated) HPC software May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

The Team Information Services design, development, and support team NSF TeraGrid Review The Team January 10, 2006 Information Services design, development, and support team Eric Blau Jason Brechin Lee Liming JP Navarro Laura Pearlman Information Services downstream developers and consumers End-to-end data transfer optimization team @ PSC (Derek Simmel, others) INCA (Kate Ericson) Operations (Jason Brechin, Tony Rimovsky) User Documentation (Mike Dwyer, Diana Diehl) User Portal (Maytal Dahan) Other lurkers… May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)

More Information Information Services main page: NSF TeraGrid Review More Information January 10, 2006 Information Services main page: http://info.teragrid.org/ (links to content and documentation) User Documentation (CTSS 4 kits, software, services): http://www.teragrid.org/userinfo/software/ctss.php User Portal (scheduler load & queue contents): https://portal.teragrid.org:443/gridsphere/gridsphere?cid=resources Inca Monitoring Framework http://www.teragrid.org/userinfo/software/ctss.php E-mail: <navarro@mcs.anl.gov> May 13, 2008 Globus World 2008 Charlie Catlett (cec@uchicago.edu)