iRODS at the ASDC – Performance Results and Lessons Learned
Federation between Atmospheric Science Data Center and NASA Center for Climate Simulation

Mike Little, Andrei Vakhnin, Tiffany Mathews, Beth Huffer, & Brandi Quam, NASA Langley Research Center, Hampton, VA
Dan Duffy, Scott Sinno, NASA Goddard Space Flight Center, Greenbelt, MD

Abstract

The Atmospheric Science Data Center (ASDC) is upgrading the mechanisms by which its data can be discovered, accessed, and understood. One mechanism that shows particular promise is the Integrated Rule-Oriented Data System (iRODS). The ASDC, in conjunction with the NASA Center for Climate Simulation (NCCS), has tested iRODS as a data discovery and delivery mechanism and found excellent performance. The ASDC then federated its implementation with the NCCS and established a production-level presence, making all data products available through this mechanism. We present lessons learned from its deployment, including the automated population of the directory system (iCAT) from the ASDC's ontology.

ASDC-NCCS Federation Performance Testing

The ASDC, seeking more efficient high-performance data delivery tools, experimented with federating iRODS with the NCCS and consulted with RENCI, among others. Testing consisted of functionality checks and file transfer performance measurements. Client software at the remote NCCS site, connected to its local iRODS server, was able to use the same functionality as clients connected directly to the ASDC server.

Goal #1
The ASDC will strive to expand beyond its existing customer base by increasing accessibility to a broader, worldwide market; through the use of innovative technologies, the ASDC will enhance data access capabilities and develop plans to share data with new user communities.
Goal #4
The ASDC will continue to foster innovation by actively assessing emerging technologies and their applicability to existing and projected customer needs and requirements in order to mitigate gaps in capability.

The 2013 ASDC strategic plan defines six goals that emphasize the vision and support the mission and values of the ASDC. The ASDC's adoption of iRODS supports two of them: Goal #1 and Goal #4.

Characteristic | Local File System | ftp Data Transfer | iRODS Data Transfer
Latency | 120 ms | 400 ms
Time to copy 9 GB file | 2 min | 10 min
Time to copy 10 9 GB files | 20 min | 40 min

iRODS Federation Between ASDC and NCCS

ASDC-NCCS Federation Lessons Learned

Planning
Federation across computer security domains must engage a significant number of infrastructure managers, including local and Agency CIO offices, all of the relevant computer security managers, and local and Agency network managers.
Coordinating the infrastructure owners and debugging obstacles was the main challenge.
A precise ontology, while not essential, made information sharing across knowledge domains far easier than vague metadata, which invariably means different things to different communities.
A use case that drives the removal of obstacles is essential to creating a broad information-sharing capability.
A local technical expert must be identified to take full advantage of iRODS functionality.

Implementing
Infrastructure managers must clearly understand the requirements that iRODS federation places on their respective components.
The iRODS redesign between versions 2.x and 3.x precludes multi-generational federations.

Operating and Maintaining
Continuous monitoring and evaluation of connectivity is necessary to detect unannounced infrastructure changes.
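The connectivity monitoring described under Operating and Maintaining could be sketched as a periodic TCP probe that reports only when an endpoint's reachability changes. This is a minimal illustration, not the tooling used at the ASDC or NCCS: the host names are hypothetical, the port is assumed to be the iRODS default (1247), and the function names are invented for this example.

```python
import socket
import time

IRODS_PORT = 1247  # iRODS default port; an assumption, confirm for each zone

# Hypothetical endpoints, not the real ASDC or NCCS hosts.
ENDPOINTS = [("irods.asdc.example", IRODS_PORT), ("irods.nccs.example", IRODS_PORT)]


def port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_all(endpoints, probe=port_reachable):
    """Probe every (host, port) pair and return {(host, port): reachable}."""
    return {(host, port): probe(host, port) for host, port in endpoints}


def changed_endpoints(previous, current):
    """Return (endpoint, was, now) for each endpoint whose state changed."""
    return [
        (endpoint, previous.get(endpoint), state)
        for endpoint, state in current.items()
        if previous.get(endpoint) != state
    ]


def monitor(endpoints, interval_s=300.0, rounds=None, probe=port_reachable):
    """Probe repeatedly and print a line whenever reachability changes.

    The first round reports every endpoint (previous state is None).
    rounds=None runs until interrupted.
    """
    previous = {}
    done = 0
    while rounds is None or done < rounds:
        current = probe_all(endpoints, probe)
        for (host, port), was, now in changed_endpoints(previous, current):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} {host}:{port} reachable: {was} -> {now}")
        previous = current
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)
```

Calling `monitor(ENDPOINTS, interval_s=60)` would run the loop against the placeholder hosts. A TCP check of this kind only shows that the port accepts connections; it would not by itself reveal protocol-level filtering or proxy interception, so a fuller check would likely also exercise an actual iRODS client operation across the federation.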
[Federation diagram labels: ODISEES (Ontology Driven Interactive Search Environment for Earth Science), Semantic Web Tool; iRODS Clients: Assimilation & Climate Modeling (via NCCS), Climate Modeling, LIS Modeling Support, Weather Modeling; NCCS; NCCS File System; iCAT, Rules Engine, and iRODS 3.3 (shown twice); Internet; Center Firewall; Center Firewall and Computer Security Appliances]

ASDC-NCCS Federation Conclusions

The use of iRODS is a highly effective way to expose information across knowledge domains.
It provides a useful interface that software can use to access data without creating a local repository.
A federation through iRODS is highly sensitive to changes in connectivity, protocol filtering, proxy filters, and other interception-type computer security tools.

ASDC-NCCS Federation Future Work

Development of iRODS micro-services to interface ASDC access tools and clients to NCCS data, including the ODISEES client.
Identification of other potential collaborators in sharing ASDC data via iRODS.
Conversion of the ODISEES ontology interface from batch upload to a dynamic link to the AllegroGraph RDF-triple database.
Testing of Registered vs. Ingested data products to determine scaling factors.

ASDC Support

The ASDC's Data Products Online (DPO) GPFS file system consists of 12 x IBM DC4800 and 6 x IBM DCS3700 storage subsystems, 144 Intel 2.4 GHz cores, and 1,400 TB of usable storage.

[Diagram labels: DPO (Data Products On-line) file system; ECS Data Pool]

Remote Sensing Data Products
This is not an inclusive list; these are eight featured data products from a list of over forty.

Acknowledgements & Resources

Thanks to John Kusterer, Phil Webster, Matthew Tisdale, and Al Settell for all their shared knowledge, as well as their insights into the lessons learned regarding each of the mentioned technologies. Their collaboration helped to make this poster possible.

Resources
iRODS:
ODISEES: Beth Huffer (Developer)
ASDC: