NSF TeraGrid Review January 10, 2006

Presentation transcript:

TeraGrid’s Integrated Information Service “IIS”
Grid Computing Environments 2009
Lee Liming, JP Navarro, Eric Blau, Jason Brechin, Charlie Catlett, Maytal Dahan, Diana Diehl, Rion Dooley, Michael Dwyer, Kate Ericson, Ian Foster, Ed Hanna, David L. Hart, Chris Jordan, Rob Light, Stuart Martin, John McGee, Laura Pearlman, Jason Reilly, Tom Scavo, Michael Shapiro, Shava Smallen, Warren Smith, Nancy Wilkins-Diehr
TeraGrid Grid Infrastructure Group (GIG)
University of Chicago, Argonne National Laboratory
November 2009
Charlie Catlett (cec@uchicago.edu)

Outline
- Introduction: conceived in 2006; production in 2007; presented at GCE’07
- IIS Vision
- 1st: IIS System Architecture
  - Distributed, CI-provider-operated local information services
  - Centralized federation-wide information services
  - Registries -> XML document entries
- 2nd: IIS Information Architecture
  - Registry architecture and data format
  - The Capability Kit meta-registry
  - Current information registries
- Leveraging IIS Examples: Providers and Consumers
- Conclusion and Future Work
We are presenting the many changes over the last 2 years.
November 20, 2009 GCE09 Charlie Catlett (cec@uchicago.edu)

Vision
Provide an Authoritative Integrated Information Service enabling:
- Human discovery of cyber-infrastructure
  - Science Gateways, Portals, Documentation, CLIs
- Software discovery of cyber-infrastructure
  - For automated resource, service, and software selection and access
  - For auto-configuration (applications, gateways, workflow engines)
- Providers to advertise their cyber-infrastructure offerings
  - Advertise any information about any CI capability
  - Providers own their data and independently control publishing
- Streamlined operations
  - Change integration and management
  - Automated testing and monitoring
Enabling discovery both by humans and by software requires a formal information model and repository driving both. To be authoritative, both must draw on the same information, with the human-readable information based on the software information. The goal, of course, is to make it as easy as possible for everyone to discover and/or advertise. Note that we don’t consider it in IIS scope to develop human interfaces or to manage the source information (we’re not a database). IIS is the infrastructure information pipeline (Grid).

Vision: Easy to Advertise and Discover
The same vision, with the emphasis on making it easy to advertise and discover. This directly impacted our data format, tool selection, and interface design.

Distributed Architecture Components
Diagram labels:
- Federation Wide Integrated Information Service: WS/REST, WS/SOAP, Apache 2.0, Tomcat WebMDS, WS MDS4, XML Repository, TeraGrid Wide Databases, Clients
- Service Provider Local Information Service: WS MDS4, HTTPD, Clients
Distributed design. Aggregation is flexible by information type. The top-level implementation requirement was to make it easy to discover and advertise.

High-Availability Architecture
Diagram labels: Clients; info.teragrid.org; Dynamic DNS; High-Level Aggregation; Service Provider Publishing
- This is both a high-availability and a high-throughput design
- Dynamic DNS provides a 15-minute TTL in failover situations
- Automated difference detection, manual synchronization …

Information Architecture
- Registry Architecture
  - Named Registries, with schema-compliant Registry Entries, each of which is an XML document
  - The Capability Deployment Meta-Registry
- Universal Identifiers
  - Site and Resource Identifiers
  - Capability Identifier
  - Registry entry cross-references
- Extensibility
  - Meta-Registry Extensions
  - New Registries
Diagram: XML Registries A, B, and C, each holding Entry 1, Entry 2, Entry 3, …
Example registry entry:
    <Reg1.Entry>
      <id>entry1</id>
      <foo>bar</foo>
    </Reg1.Entry>
Universal identifiers enable the “Integrated” vision of IIS. Note that the Meta-Registry is part of the information architecture; nothing in the system architecture requires its existence.
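As an illustration of the registry model on this slide, here is a minimal Python sketch that loads a registry of XML entries and indexes them by their `id` element. The `<Registry>` wrapper element, the second entry, and the function name `index_entries` are assumptions made for the example; only the `<Reg1.Entry>` shape comes from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry document: the <Registry> wrapper and entry2 are
# assumptions; the <Reg1.Entry> element shape is taken from the slide.
REGISTRY_XML = """
<Registry name="Reg1">
  <Reg1.Entry><id>entry1</id><foo>bar</foo></Reg1.Entry>
  <Reg1.Entry><id>entry2</id><foo>baz</foo></Reg1.Entry>
</Registry>
"""


def index_entries(xml_text):
    """Return a dict mapping each entry's <id> to its child fields."""
    root = ET.fromstring(xml_text)
    entries = {}
    for entry in root:
        # Flatten the entry's direct children into a tag -> text mapping.
        fields = {child.tag: (child.text or "").strip() for child in entry}
        entries[fields["id"]] = fields
    return entries


if __name__ == "__main__":
    index = index_entries(REGISTRY_XML)
    print(index["entry1"]["foo"])
```

Keying entries by identifier is what lets one registry entry cross-reference another, which is the role the slide assigns to universal identifiers.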

TeraGrid Capability Meta-Registry
Each Capability Deployment:
- Where (site and resource)
- What (name, class, and description)
- Support information
- Status information
- Software and services component information
- Extensions
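The fields listed for each capability deployment map naturally onto a record type. The following Python sketch is one possible in-memory shape, not the IIS schema: all field names, the `by_class` helper, and the sample values for site, description, and status are assumptions (the kit name and resource are borrowed from a registration URL shown later in the talk).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CapabilityDeployment:
    # Where
    site_id: str
    resource_id: str
    # What
    name: str
    capability_class: str
    description: str
    # Support and status information (free text in this sketch)
    support: str = ""
    status: str = ""
    # Software and services components, plus open-ended extensions
    components: List[str] = field(default_factory=list)
    extensions: Dict[str, str] = field(default_factory=dict)


def by_class(deployments, capability_class):
    """Select the deployments registered under one capability class."""
    return [d for d in deployments if d.capability_class == capability_class]


# Illustrative data only; site, description, and status are assumed values.
DEPLOYMENTS = [
    CapabilityDeployment(
        site_id="iu.teragrid.org",
        resource_id="bigred.iu.teragrid.org",
        name="remote-compute.teragrid.org-4.0.2",
        capability_class="Remote Computation",
        description="Remote job submission",
        status="production",
    ),
    CapabilityDeployment(
        site_id="psc.teragrid.org",
        resource_id="pople.psc.teragrid.org",
        name="login.teragrid.org",
        capability_class="Login",
        description="Interactive login",
    ),
]

if __name__ == "__main__":
    for d in by_class(DEPLOYMENTS, "Remote Computation"):
        print(d.resource_id, d.name)
```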

Capabilities Kit Registry by Class
- CTSS: Application Development & Runtime; TeraGrid Core Integration (local info service); Co-scheduling, meta-scheduling; Common Client; Computation & Scheduling Clients; Data Collections; Data Management; Data Movement servers, Clients; Distributed Parallel Application Support; Distributed Programming Systems; Local Compute; Login; Nimbus/Cloud Computing; Parallel Application Support; Remote Computation; Science Gateway Support; Visualization Software (VTSS); WAN GPFS, WAN Lustre file-systems; Workflow Support
- Gateways: Renci Portal, …
- Local: Local HPC Software
- Central: Credential Server (MyProxy); Integrated Information Services; User Portal
A single capability meta-registry enables a universal capability discovery interface. CTSS are the capabilities coordinated across HPC and Storage resources.

Other Registries
- Gateways Registries
  - Science Gateway Web Services Application Registry
- Local RP Registries
  - Local HPC Software Catalog
- CTSS Extension Registries
  - Batch System Load (%)
  - Batch Queue Contents (requires authorization)
  - OGF GLUE2
- TeraGrid Central Database Registries
  - Site/Organization and Resource identifiers (IDs) and descriptions
  - Project/Allocation to Resource authorization list
  - TeraGrid Science Gateway Catalog
  - TeraGrid System Outages

Leveraging IIS Examples
- Resource Description Repository Publishing
- TeraGrid User Portal Batch Load & Queue Data
- TeraGrid User Documentation
- Software Discovery
  - CTSS Software
  - Local HPC Software
  - Science Gateway Software
  - Science Gateways Web Services “WS” Application Registry
- Advanced Scheduling Information
- Inca Verification & Validation
- User Profile Service
- Discovery CLI Interface
Significant progress over the last 2 years, and our first paper. At GCE’07 I had an opportunity to present TeraGrid Information Services.

Resource Description Repository “RDR”
Kits: TeraGrid Core Integration; Local Compute; Data Collections (Storage)
TeraGrid Core Services uses RDR to collect and store validated, current and historical resource description information:
- Common Resource Information
- Compute Resource Information
- Data Collections Information
- Storage Information

TGUP Batch Load & Queue Data
Kits: Remote Computation -> Local Compute
http://portal.teragrid.org/
IIS provides queue and batch load information from all RP sites for TGUP to use in its system monitor.
XML excerpt (truncated on the slide):
    <LoadRP xmlns="">
      <ComputeResourceLoad xmlns="">
        <ResourceID>pople.psc.teragrid.org</ResourceID>
        <SiteID>psc.teragrid.org</SiteID>
        <LoadInfo hostname="tg-login1.pople.psc.teragrid.org" timestamp="2009-11-11T13:46:19Z">
          <Load>
            <Type>queue</Type>
            <Value>98</Value>
          </Load>
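To show how a consumer such as a portal might read this load data, here is a small Python sketch that flattens a `LoadRP` document into rows. It is an illustration under stated assumptions: the closing tags in the sample are added so the fragment parses, the blank `xmlns` attributes from the slide are omitted, and the function name and row keys are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample based on the slide excerpt. The closing tags are an assumption
# (the slide is truncated) and the blank xmlns attributes are omitted.
SAMPLE_LOAD_XML = """
<LoadRP>
  <ComputeResourceLoad>
    <ResourceID>pople.psc.teragrid.org</ResourceID>
    <SiteID>psc.teragrid.org</SiteID>
    <LoadInfo hostname="tg-login1.pople.psc.teragrid.org"
              timestamp="2009-11-11T13:46:19Z">
      <Load>
        <Type>queue</Type>
        <Value>98</Value>
      </Load>
    </LoadInfo>
  </ComputeResourceLoad>
</LoadRP>
"""


def extract_loads(xml_text):
    """Flatten each <Load> element into one dict per measurement."""
    root = ET.fromstring(xml_text)
    rows = []
    for resource in root.iter("ComputeResourceLoad"):
        resource_id = resource.findtext("ResourceID")
        site_id = resource.findtext("SiteID")
        for info in resource.iter("LoadInfo"):
            for load in info.iter("Load"):
                rows.append({
                    "resource": resource_id,
                    "site": site_id,
                    "hostname": info.get("hostname"),
                    "timestamp": info.get("timestamp"),
                    "type": load.findtext("Type"),
                    "value": float(load.findtext("Value")),
                })
    return rows


if __name__ == "__main__":
    for row in extract_loads(SAMPLE_LOAD_XML):
        print(row["resource"], row["type"], row["value"])
```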

TeraGrid User Documentation
We are eating our own dogfood, publishing for internal consumption. Users and developers like yourself are also our target.
http://www.teragrid.org/
http://www.teragrid.org/userinfo/software/ctss.php

Software Discovery
TeraGrid context:
- >650 CTSS software package deployments
- >1600 Local HPC software package deployments
- >40 Science Gateways offering software packages
Problems:
- How can users discover what software is available, and how to access it?
- How can Science Gateways or Web Applications discover what software is available through web service interfaces, and invoke it?

Software Discovery
Solutions:
- A single IIS interface to multiple software repositories, including 3rd-party HPC software and Science Gateway software
- A custom Gateway web services registry
Which enables, for example:
- Scientists to discover that Gaussian is available both from the command line and through a full-service gateway such as GridChem (www.gridchem.org)
- Science Gateways and Applications to discover and invoke Gaussian web services automatically

Software Discovery Design
Diagram labels:
- Kit Registry: CTSS Kit Software; Gateways Kit Software; Local HPC Kit Software
- Local HPC Software Registry
- Gateway Web Services Registry
- Comprehensive Software Discovery
- WS Enabled Software Discovery

Gateway WS Application Registry (GAWSR)
- Each Gateway hosts a service (RESTful or otherwise) that publishes local web service metadata.
- Information Services aggregates the GAWSR metadata hosted by all configured Gateways, creating a central registry.
- The GAWSR metadata is rich enough to allow a user or client to launch jobs dynamically via web services.
- The following slides demonstrate two clients using the GAWSR.
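The aggregation step described on this slide can be sketched in a few lines of Python. This is only a toy model: the metadata fields (`application`, `endpoint`, `inputs`), the gateway keys, the example.org endpoints, and both function names are hypothetical, and a real aggregator would fetch each gateway's document over the network rather than read it from a dict.

```python
# Hypothetical per-gateway metadata documents; every field name and URL
# below is an assumption made for illustration.
GATEWAY_DOCS = {
    "gridchem": [
        {
            "application": "Gaussian",
            "endpoint": "https://gateway-a.example.org/ws/gaussian",
            "inputs": ["input_file"],
        },
    ],
    "other-gateway": [
        {
            "application": "ExampleApp",
            "endpoint": "https://gateway-b.example.org/ws/exampleapp",
            "inputs": ["sequence"],
        },
    ],
}


def aggregate(gateway_docs):
    """Merge the per-gateway service lists into one central registry."""
    central = []
    for gateway, services in sorted(gateway_docs.items()):
        for service in services:
            entry = dict(service)       # copy so the source stays untouched
            entry["gateway"] = gateway  # record which gateway published it
            central.append(entry)
    return central


def find_services(central, application):
    """Case-insensitive lookup of services by application name."""
    wanted = application.lower()
    return [e for e in central if e["application"].lower() == wanted]


if __name__ == "__main__":
    registry = aggregate(GATEWAY_DOCS)
    for hit in find_services(registry, "gaussian"):
        print(hit["gateway"], hit["endpoint"], hit["inputs"])
```

A client that finds an entry this way has the endpoint and the expected inputs in hand, which is the sense in which the metadata is rich enough to launch a job.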

Dynamic execution of web services written in Java

RIA Flex application showing the available metadata

Local HPC Software
What value-add does IIS provide as an infrastructure?

Advanced Scheduling Information
- CTSS kits: Co-scheduling; Meta-scheduling; Computation & Scheduling Clients; Local Compute; Remote Computation; Science Gateway Support; Workflow Support
- GLUE2 Registry

Inca Verification & Validation
- Running on TeraGrid since 2003
- Verifies IIS published information through automated, user-level testing
- Total of ~2200 tests running on 18 login nodes, 2 grid nodes, and 3 servers
- Email notifications for critical services
- Status views from detailed test information to summary and historical reports
- Data published as XML, HTML, or graphed
- IIS compatible REST interface
Diagram labels: info.teragrid.org; XSL; XML; CTSS kit registrations
http://info.teragrid.org/web-apps/HTML/kit-reg-v1/remote-compute.teragrid.org-4.0.2/bigred.iu.teragrid.org/
http://inca.teragrid.org/inca/HTML/kit-status-v1/remote-compute.teragrid.org-4.0.2/bigred.iu.teragrid.org/
http://inca.teragrid.org/
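The two example URLs share a path suffix of the form kit name, version, and resource, which suggests how a kit registration and its Inca status view line up. The Python sketch below generalizes from those two URLs; that the pattern holds for other kits and resources is an assumption, and `kit_urls` is a hypothetical helper name.

```python
# Base paths copied from the two example URLs on the slide.
REGISTRATION_BASE = "http://info.teragrid.org/web-apps/HTML/kit-reg-v1/"
STATUS_BASE = "http://inca.teragrid.org/inca/HTML/kit-status-v1/"


def kit_urls(kit, version, resource):
    """Build the registration and status URLs for one kit deployment.

    Assumes the "<kit>-<version>/<resource>/" suffix seen in the slide's
    examples applies to every kit and resource.
    """
    suffix = f"{kit}-{version}/{resource}/"
    return {
        "registration": REGISTRATION_BASE + suffix,
        "status": STATUS_BASE + suffix,
    }


if __name__ == "__main__":
    urls = kit_urls("remote-compute.teragrid.org", "4.0.2",
                    "bigred.iu.teragrid.org")
    print(urls["registration"])
    print(urls["status"])
```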

User Profile Service
- Provide authenticated users with user-centric information
- HTTPS with Basic Authentication
- In html, csv, json, perl, and xml formats
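For readers unfamiliar with HTTPS plus Basic Authentication, this Python sketch builds (but does not send) such a request using only the standard library. The slide does not give the service URL or say how the output format is selected, so the example.org URL, the `format` query parameter, and the function name are all assumptions.

```python
import base64
import urllib.parse
import urllib.request

# Output formats named on the slide.
FORMATS = {"html", "csv", "json", "perl", "xml"}


def build_profile_request(base_url, fmt, username, password):
    """Prepare a Basic-authenticated request without sending it.

    Selecting the format through a "format" query parameter is an
    assumption; the slide only lists the available formats.
    """
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if not base_url.startswith("https://"):
        # Basic credentials are only base64-encoded, so require TLS.
        raise ValueError("Basic Authentication should only be used over HTTPS")
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    url = base_url + "?" + urllib.parse.urlencode({"format": fmt})
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


if __name__ == "__main__":
    # Hypothetical endpoint and credentials, for illustration only.
    req = build_profile_request(
        "https://profile.example.org/user", "json", "alice", "secret")
    print(req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the call; it is left out here because the real endpoint is not given on the slide.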

Discovery CLI Interface
The tginfo CLI: http://info.teragrid.org/tginfo/

Conclusion
- Federation Wide Standards
  - Information Integration Identifiers
  - Information Discovery REST APIs
  - Standard Capability Naming and Description Schemas
- Federation Wide Information Discovery
  - Using a Central Federation Wide Index
  - Using a DNS/WWW model
  - Central Discovery -> Distributed Information Access
- Enable User Interfaces
  - Web 2.0, Science Gateways, and traditional Web servers
  - ** IIS does not develop those interfaces
Our motivating vision: major improvements to how TeraGrid Service Providers communicate information about their service offerings to the User Community.

Conclusion & Future Work
- Information Architecture
  - Capability Definition Meta-Registry (BioMedical Informatics -- BIRN)
  - Capability Implementation Registry
  - More Capabilities and Capability Classes: Clouds/IaaS, SaaS, Distributed Programming Environments (SAGA), Data Collections
  - Science Gateway Security Configuration Information (SAML)
- System Architecture
  - Fully REST based registration services (Apache CXF, Globus CRUX)
  - Fully REST based aggregation services
  - More REST based discovery interfaces (with XPATH, XSLT support)
  - More custom REST services, some providing custom user services
- Separate IIS project
  - Packaged, documented, and distributed for other projects

More Information
Web Sites:
- http://info.teragrid.org/
- http://www.teragrid.org/gateways/
- http://info.teragrid.org/web-apps/html/index/ (REST APIs)
People:
- JP Navarro, Lee Liming (IIS Architecture and Coordination)
- Nancy Wilkins-Diehr (Gateway Information)
- Warren Smith (Execution and Scheduling Information)
- Ed Hanna (Resource Description Information)
- Kate Ericson (Monitoring and Validation Information)
- Rion Dooley (Authenticated User Custom Information)