May 6, 2002 Earth System Grid - Williams
The Earth System Grid
  Presented by Dean N. Williams
  PIs: Ian Foster (ANL); Don Middleton (NCAR); and Dean Williams (LLNL)
  Presented at: The “EO GRID” Workshop, Frascati, Italy
  UCRL-PRES
Earth System Grid (ESG): Overview
  Funded by the Scientific Discovery through Advanced Computing (SciDAC) program, this project seeks a new paradigm for the climate change community, moving from centralized data sharing to distributed data sharing.
  It will enable geographically distributed teams of researchers to acquire knowledge and understanding from massive climate data holdings effectively and rapidly.
  Multiple interfaces to ESG will allow researchers to focus on science rather than on issues of data receipt, format, and data set manipulation.
ESG: Why Is ESG Important to the U.S. Climate Change Program?
  Climate model output and quality observations are vital to providing timely assessments of climate change and its impacts.
  Recent U.S. and IPCC assessment efforts made it clear that the lack of accessibility to model simulations is a major problem for future assessments.
  Access to retrospective climate data (input and output) is needed to enable a feedback mechanism that ties researchers directly back to quality control and diagnostics of models.
  Researchers require access to “format independent” climate and observational data for case studies and training.
  In the U.S., climate simulation can be viewed as a systems problem, requiring a team of multiple agencies and institutions working in collaboration.
ESG: U.S. Collaborations & Development
  ORNL: Climate storage & computational resources
  LANL: Next-generation coupled models & computing
  ANL: Computational grids & grid-based applications
  USC/ISI: Computational grids & grid-based applications
  NCAR: Climate change prediction and scenarios
  LBNL: Climate storage facility
  LLNL: Model diagnostics & inter-comparison
ESG: Requirements & Priority Matrix
ESG: U.S. Department of Energy (DOE) Next Generation Internet (NGI) Project
  ESG-I (past):
    Focused on developing techniques for high-speed data movement between sites and users (e.g., the secure, highly efficient file transfer service called GridFTP, developed by ANL (i.e., Globus))
    Developed replica catalogs for keeping track of data locations
    Developed request managers for coordinating multiple transfers
    Developed a grid-enabled version of LLNL’s data analysis package
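The replica catalog and request manager ideas from ESG-I can be illustrated with a small sketch. This is not the ESG-I or Globus code: the class `ReplicaCatalog`, the function `plan_transfers`, the cost table, and the host names are all hypothetical, chosen only to show how a logical file name might be resolved to one of several physical copies before a transfer is scheduled.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class ReplicaCatalog:
    """Hypothetical catalog: logical file name -> URLs holding a copy."""
    entries: dict = field(default_factory=dict)

    def register(self, logical_name, url):
        """Record that a copy of logical_name is stored at url."""
        self.entries.setdefault(logical_name, []).append(url)

    def locate(self, logical_name):
        """Return every known physical location (possibly empty)."""
        return list(self.entries.get(logical_name, []))


def plan_transfers(catalog, logical_names, site_cost):
    """Choose one source URL per requested file.

    site_cost maps a host name to a relative cost (lower is better).
    Hosts missing from site_cost are treated as infinitely expensive.
    """
    plan = {}
    for name in logical_names:
        replicas = catalog.locate(name)
        if not replicas:
            raise KeyError(f"no replica registered for {name!r}")
        plan[name] = min(
            replicas,
            key=lambda url: site_cost.get(urlparse(url).hostname, float("inf")),
        )
    return plan


# Example values only; the hosts and paths are made up.
catalog = ReplicaCatalog()
catalog.register("pcm/ts_1990.nc", "gsiftp://site-a.example.org/data/ts_1990.nc")
catalog.register("pcm/ts_1990.nc", "gsiftp://site-b.example.org/archive/ts_1990.nc")
plan = plan_transfers(
    catalog,
    ["pcm/ts_1990.nc"],
    {"site-a.example.org": 5.0, "site-b.example.org": 2.0},
)
```

Separating "where are the copies" from "which copy to fetch" keeps the catalog a plain lookup table, so the selection policy (here a static cost per host) can change without touching the catalog.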
ESG: ESG-I Architecture
ESG: ESG-I Team Presented their work at Supercomputing 2001
  [Diagram labels: ANL; LLNL; LBNL; SC ‘01; Network; parallel disk system; tape system; LDAP/Server Metadata Catalog; RAID; Local Disks; CLOUD; TERRAIN; U & V]
ESG: DOE SciDAC Project
  ESG-II (present): Building upon the substantial work of ESG-I
    Grid-wide services supporting authentication, authorization, data discovery, and user-specified analysis
    Metadata services supporting remote data browsing, querying, accessing, displaying, etc.
    Filtering services performing intelligent, model-specific analysis before delivering the results to the user
    Integration of next-generation data analysis and visualization applications (such as ongoing work at LLNL and NCAR), web-based data portals and other thin clients supporting the Distributed Oceanographic Data System (DODS), and collaborative problem-solving environments
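The point of a filtering service is that the reduction runs where the data lives, so only a small result crosses the network. A minimal sketch of that idea follows, assuming gridded values are available as simple records; the function name `filter_and_average`, the record layout, and the sample numbers are illustrative assumptions, and the mean is a plain unweighted average rather than an area-weighted one.

```python
def filter_and_average(records, lat_range, lon_range):
    """Average the values that fall inside a latitude/longitude box.

    records is an iterable of dicts with "lat", "lon", and "value" keys
    (a hypothetical layout). Returns None if nothing falls in the box.
    """
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    selected = [
        r["value"]
        for r in records
        if lat_min <= r["lat"] <= lat_max and lon_min <= r["lon"] <= lon_max
    ]
    if not selected:
        return None
    return sum(selected) / len(selected)


# Made-up sample points; a real service would read them from model output.
sample = [
    {"lat": 10.0, "lon": 20.0, "value": 280.0},
    {"lat": 12.0, "lon": 22.0, "value": 290.0},
    {"lat": 60.0, "lon": 20.0, "value": 250.0},
]
box_mean = filter_and_average(sample, (0.0, 30.0), (0.0, 40.0))
```

Here three records are reduced to one number before delivery; with full model output the same pattern would turn a very large field into a result small enough to send to a thin client.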
ESG: ESG-II Architecture
ESG: Metadata Services
  [Diagram labels, in source order, duplicates removed]
  METADATA EXTRACTION; METADATA DISPLAY; METADATA BROWSING; METADATA QUERY
  ESG CLIENTS API & USER INTERFACES
  Data & Metadata Catalog; Dublin Core Database; COARDS Database; mirror; Dublin Core XML Files; COMMENTS XML Files
  METADATA HOLDINGS
  METADATA ANNOTATION; METADATA VALIDATION; METADATA ACCESS (update, insert, delete, query); SERVICE TRANSLATION LIBRARY
  CORE METADATA SERVICES
  METADATA AGGREGATION; METADATA DISCOVERY; METADATA & DATA REGISTRATION; PUBLISHING
  HIGH LEVEL METADATA SERVICES
  SEARCH & DISCOVERY; ADMINISTRATION; BROWSING & DISPLAY; ANALYSIS & VISUALIZATION
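The slide lists Dublin Core XML files among the metadata holdings. As a rough illustration of what registering and querying such a record could look like, the sketch below builds a Dublin Core record with Python's standard `xml.etree.ElementTree` module. The namespace URI is the standard Dublin Core elements namespace; the wrapper element `metadata`, the helper names `build_dc_record` and `query_dc`, and the field values are assumptions, not the ESG schema.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element set namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build an XML record with one dc:<name> element per field.

    The <metadata> wrapper is a hypothetical container element.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


def query_dc(root, name):
    """Return the text of every dc:<name> element in the record."""
    return [el.text for el in root.findall(f"{{{DC_NS}}}{name}")]


# Illustrative values only.
record = build_dc_record(
    {
        "title": "NCEP reanalysis, monthly means",
        "creator": "PCMDI",
        "format": "netCDF",
    }
)
xml_text = ET.tostring(record, encoding="unicode")
titles = query_dc(record, "title")
```

Keeping descriptive metadata in a small, standard vocabulary like this is what lets browsing, query, and discovery services work across datasets that differ in their internal file formats.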
ESG: Collaboration Network
  [Diagram labels: Data producers; Data consumers; ESG services: information (?), replica (R), metadata (M), community authorization (CAS); Grid and network infrastructure; Online storage systems; Computational resources]
ESG: Example of a Web-based Data Portal (currently serving 40+ simulations of AMIP, CMIP, and PCM data, and growing)
ESG: Example of a Client Application
ESG: Example of Script Access
  The Python scripting language is used to access the Earth System Grid at LLNL:
    import cdms
    # Open the demo database through its LDAP address
    db = cdms.open("ldap://localhost:389/database=demo,ou=PCMDI,o=LLNL,c=US")
    # Open the NCEP reanalysis monthly dataset by name
    f = db.open("ncep_reanalysis_mo")
    # Read the variable 'ts' from the dataset
    ds = f('ts')
ESG: Concluding Statements
  ESG is a highly collaborative effort that will allow users to quickly access data storage facilities holding petabytes of raw or processed data in an application-independent manner.
  Payoffs of this distributed collaborative infrastructure include:
    Distributed data sharing
    Simplified discovery of climate data
    Large-scale climate data processing and analysis
    Increased collaboration among climate research scientists
    Aid in climate assessments and estimates of future climate variability and trends
  For more information on ESG, visit our website at: