USGS GRID Exploratory Status Review
Stuart Doescher, Mike Neiers
USGS/EDC, May
GRID Exploratory
Preliminary investigation to explore the feasibility of using GRID technologies to better meet a variety of business needs within the USGS and to improve its communications with external users.
The first focus area is the delivery and reception of data, which will primarily employ GridFTP and the certificate authority services.
The second focus area will explore the utility of the GRID for scaling data sharing.
Current Data Delivery – FTP based
Pull – semi-anonymous FTP
–Product ready: instructions and a password are sent to the user
–User connects by FTP as “anonymous” with the provided password
–FTP daemon positions the user in the appropriate directory
–User pulls the data
Push – routine data flows to high-volume users
–Account provided on the remote system
–When data is available, it is pushed to the remote system
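The pull steps above can be sketched with Python's standard `ftplib` module. This is a minimal illustration, not the production procedure: the host name, password, and file name in the usage comment are hypothetical, and the `ftp_factory` parameter is an assumption added only so the function can be exercised without a live server.

```python
from ftplib import FTP


def pull_product(host, password, filename, local_path, ftp_factory=FTP):
    """Fetch one product file using the semi-anonymous FTP pull flow."""
    # ftplib.FTP connects immediately when it is given a host name.
    ftp = ftp_factory(host)
    try:
        # Log in as "anonymous", but with the password issued for this product.
        ftp.login(user="anonymous", passwd=password)
        # The FTP daemon is expected to have placed the session in the
        # product directory already, so no directory change is made here.
        with open(local_path, "wb") as out:
            ftp.retrbinary(f"RETR {filename}", out.write)
    finally:
        ftp.quit()
    return local_path


# Hypothetical usage (requires a reachable server):
# pull_product("ftp.example.gov", "issued-password", "product.tar", "product.tar")
```

The only per-product secret in this flow is the password, which is part of what the certificate-based approach on the following slides is meant to replace.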
Potential Future Data Delivery – GRIDftp based
For routine multiple-usage customers
–Establish a “certificate process” with the customer
    Self-signed certificate authority
    Customer generates a private/public key pair
    Generate a user certificate with the public key
    Add the user certificate to the list of trusted users
–Customer must install a GridFTP client
    Globus Toolkit data management client bundle
    Gsincftp
    Java Commodity Grid Kit for Windows
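The last step of the certificate process, adding a user certificate to the list of trusted users, can be pictured as a lookup table from certificate subject names to local accounts. The sketch below is loosely modeled on the Globus grid-mapfile layout (a quoted subject name followed by an account); the exact file format is an assumption here, and every subject name and account shown is invented.

```python
import shlex


def parse_trusted_users(text):
    """Map certificate subject names to local accounts.

    Each non-comment line holds a quoted subject name followed by an account.
    """
    trusted = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # shlex.split keeps the quoted subject name together as one token.
        subject, account = shlex.split(line)
        trusted[subject] = account
    return trusted


def account_for(trusted, subject):
    """Return the local account for a subject, or None if it is not trusted."""
    return trusted.get(subject)


# Hypothetical entries; the subject names and accounts are invented.
EXAMPLE = '''
# subject name                                      local account
"/O=Example Grid/OU=example.org/CN=Data Customer"  customer1
"/O=Example Grid/OU=example.gov/CN=Data Provider"  provider
'''

trusted = parse_trusted_users(EXAMPLE)
print(account_for(trusted, "/O=Example Grid/OU=example.org/CN=Data Customer"))  # customer1
print(account_for(trusted, "/O=Example Grid/CN=Unknown User"))                  # None
```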
Potential Future Data Delivery – GRIDftp based
For routine multiple-usage customers
–Pull
    When the product is ready, the user is notified that the data is available
    The user, authenticated through GRIDftp with the user certificate, is given access and pulls the data
–Push
    Account provided on the remote system, with its host certificate and our user certificate
    These GRID certificates establish a Virtual Organization between the two parties
    When data is available, GRIDftp is used to push it to the remote system
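As a rough illustration of the pull and push cases, the sketch below assembles command lines for `globus-url-copy`, the Globus Toolkit GridFTP client, which takes a source URL followed by a destination URL. The host name and paths are hypothetical, and the sketch assumes the caller already holds a valid proxy credential (for example one created with `grid-proxy-init`); it only builds the commands and does not run them.

```python
def gridftp_url(host, path):
    """gsiftp: URL for a path on a GridFTP server."""
    return f"gsiftp://{host}/{path.lstrip('/')}"


def local_url(path):
    """file: URL for an absolute local path."""
    return f"file://{path}"


def pull_command(remote_host, remote_path, local_path):
    """Customer side: copy a ready product from our server to local disk."""
    return ["globus-url-copy",
            gridftp_url(remote_host, remote_path),
            local_url(local_path)]


def push_command(local_path, remote_host, remote_path):
    """Provider side: copy a ready product to the customer's server."""
    return ["globus-url-copy",
            local_url(local_path),
            gridftp_url(remote_host, remote_path)]


# Hypothetical host and paths.
print(" ".join(pull_command("gridftp.example.gov", "/products/scene.tar", "/tmp/scene.tar")))
print(" ".join(push_command("/data/out/scene.tar", "gridftp.example.org", "/incoming/scene.tar")))
```

The argument lists could be handed to `subprocess.run` once the client and certificates are in place; authentication comes from the certificates rather than from anything on the command line.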
Potential Future Data Delivery – GRIDftp based
For single-usage customers
–Process to establish a “certificate process” with the customer
–Customer must install a GridFTP client
–Currently seems too complex (not worth the effort)
–Would like a simplified method, such as
    A one-time-use “user certificate”
    A GRIDftp client built into and integrated with the browser
Exploratory #2
A project on Calibration and Validation (Cal/Val) is currently underway. The current approach is to establish a Web-based mechanism that promotes and eases the sharing of data among the Cal/Val collaborators. Phase 1 of the project has been completed and can be found at (user code = calval99 password = wgiss03). The project is beginning to move into Phase 2. The manual mechanics of building the web site and coordinating the data and data storage sites raise the question of scalability as the project moves toward an operational state.
Cal/Val WTF Strategy
Columns: Phase 1, Phase 2, Operational
Row labels: Data Coverage Sites Data Type User sites Data Provider Sites
GRID Opportunities
Explore GRID services to identify opportunities that will improve the ability to scale Cal/Val.
In parallel with the transition to Phase 2, explore and evaluate catalog service methods. The catalog manager services to be examined are the Metadata Catalog Service (MCS) and the Storage Resource Broker (SRB) Metadata Catalog (MCAT).
Evaluate the results and propose a scenario for GRID services to support future phases of Cal/Val.
Opportunities with the GRID
The catalog manager services to be examined are the Metadata Catalog Service (MCS) and the Storage Resource Broker (SRB) Metadata Catalog (MCAT).
Chose MCS from the standpoint of “simplicity”
Concept
MCS Example – Stand-alone Java
Web-based JavaServer Pages (JSP)
Observations for MCS 2
Installation easier than the Globus Toolkit
Required installation of other packages
–MySQL
–Java Virtual Machine
–Tomcat (Apache Java servlet and JSP container)
–Apache Axis (SOAP web services engine)
Need to write code to load metadata
Observations for MCS 2 (cont.)
No tools to coordinate metadata from multiple sites
Easy-to-use Java API
–Writing a simple query application was quick
–Both stand-alone Java and web-based (JSP)
Authentication not incorporated
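To make the idea of a "simple query application" concrete, here is a small in-memory sketch of an attribute-based metadata catalog. It is written in Python and does not use or imitate the MCS Java API; the class names, attribute names, and entries are all hypothetical and serve only to show what loading metadata and querying by attribute involve.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    logical_name: str
    attributes: dict = field(default_factory=dict)


class MetadataCatalog:
    """Toy catalog: logical data set names tagged with descriptive attributes."""

    def __init__(self):
        self._entries = {}

    def add(self, logical_name, **attributes):
        """Load metadata for one data set (this step had to be hand-coded)."""
        self._entries[logical_name] = CatalogEntry(logical_name, dict(attributes))

    def query(self, **criteria):
        """Return the logical names whose attributes match every criterion."""
        return sorted(
            name
            for name, entry in self._entries.items()
            if all(entry.attributes.get(key) == value for key, value in criteria.items())
        )


# Hypothetical Cal/Val-style entries.
catalog = MetadataCatalog()
catalog.add("site-a-scene-001", site="Site A", data_type="image")
catalog.add("site-a-ground-001", site="Site A", data_type="ground measurement")
catalog.add("site-b-scene-001", site="Site B", data_type="image")

print(catalog.query(data_type="image"))                 # ['site-a-scene-001', 'site-b-scene-001']
print(catalog.query(site="Site A", data_type="image"))  # ['site-a-scene-001']
```

A single catalog like this says nothing about keeping several sites' metadata in step, which is the coordination gap noted in the observations.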
Next Steps
Move to version 3 of the Globus Toolkit
–Includes backward compatibility with version 2
–Web services may reduce firewall issues
Explore additional possible GRID opportunities
–RLS – Replica Location Service
–MDS – Monitoring and Discovery Service
–Retry MCS with version 3
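For orientation on the RLS item, a replica location service maps a logical file name to the physical locations where copies are held. The sketch below shows only that mapping idea in plain Python; it is not the Globus RLS interface, and the names and URLs are hypothetical.

```python
from collections import defaultdict


class ReplicaIndex:
    """Toy mapping from logical file names to physical replica URLs."""

    def __init__(self):
        self._replicas = defaultdict(list)

    def register(self, logical_name, physical_url):
        """Record that a copy of the logical file exists at physical_url."""
        if physical_url not in self._replicas[logical_name]:
            self._replicas[logical_name].append(physical_url)

    def lookup(self, logical_name):
        """Return all known replica URLs, in registration order."""
        return list(self._replicas.get(logical_name, []))


# Hypothetical logical name and storage sites.
index = ReplicaIndex()
index.register("calval/scene-001", "gsiftp://storage-a.example.gov/calval/scene-001.tar")
index.register("calval/scene-001", "gsiftp://storage-b.example.org/mirror/scene-001.tar")

print(index.lookup("calval/scene-001"))
print(index.lookup("calval/unknown"))  # []
```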