AOLI 2015 The NMME Experience: A Research Community Archive Lessons learned from Climate Model data archive and use AOLI Meeting 2015 Eric Nienhouse NCAR.


1 AOLI 2015 The NMME Experience: A Research Community Archive Lessons learned from Climate Model data archive and use AOLI Meeting 2015 Eric Nienhouse NCAR

2 AOLI 2015 We build data archives and curate data for diverse communities.
Sample of NCAR Data Archives
ESG-NCAR (Climate Data at NCAR)
- Climate models (CESM)
- NMME Archive
- RCMs (NARCCAP)
- Large data volume
- Heavily accessed
RDA (Research Data Archive)
- Reanalysis + obs products
- Subset and re-format services
- ECMWF, ICOADS, JRA-55
- Actively curated
ACADIS (Advanced Collaborative Arctic Data Information Service)
- NSF Arctic projects
- Self publishing tools
- Many disciplines
- Highly varied data
- Long term preservation
20K annual users, 300 data providers, 10K collections, 4.5PB, 2PB+ yearly downloads.

3 AOLI 2015 Community Use and Access
Data products are growing in popularity among non-traditional disciplines
- Over 2000 users monthly
- Diverse and growing user base
- Data reduction increasingly utilized
- Seeking more ways to access data
Diversity of dataset disciplines

4 AOLI 2015 NCAR flops and bytes, 2000-2030

5 AOLI 2015 North American Multi-Model Ensemble
The goal of NMME is to improve intra-seasonal operational weather prediction
- Based on leading North American climate models: CCCMA-CanCM3/4, NOAA-GFDL-FLORB, NASA-GEOS-5, UM-RSMAS-CCSM4, NCAR-CESM1, NOAA-CFSV2
- Supported by NOAA/CPO with contributions from NSF, DOE, NASA
Project objectives:
- Continued real-time forecasts which incorporate updated model refinements.
- Coordinated predictability research that identifies the benefit of the multi-model approach and guides model development and applications.
- Development of an intra-seasonal protocol for model evaluation.
- Enhanced data distribution to facilitate use of NMME operational and model evaluation data.

6 AOLI 2015 NMME Data Publication
Data publication, standard interfaces and federated discovery
Data publication at NCAR with THREDDS Data Services for Data Access
- Daily and monthly hindcasts since Jan 1981
- Over 300TB, 8K datasets, 800K files
- 12TB served monthly
- Enhanced discovery through ESGF
- DAP/NCSS services on some datasets
CoG hosting NMME project website
- Community space for project
- Search access to ESGF NMME data
- Data access documentation
- NMME news and other dissemination
North American Multi-Model Ensemble

7 AOLI 2015 Data Management Highlights and Challenges
Data requirements brought consistency and challenges to providers
Preparation: Data Management Planning
- Project included solid data management plan and set expectations
- Intercomparison was a primary use case
- Data Provider Guidelines developed collaboratively
- Priority output variables identified by expected community needs
- Guidelines leveraged existing convention profiles (CMIP, CHAP)
- Based on Climate and Forecast (CF) conventions
- NetCDF (4) format chosen based on community needs & data volume
Takeaway: Consistency goal was difficult without tools and training…
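The takeaway on slide 7, that consistency is hard without tools, suggests giving providers a small self-check to run before delivery. The following is a minimal sketch only: it assumes the file's attributes have already been read into plain dictionaries, and the rules it applies (a `Conventions` attribute naming a CF version, `units` on every variable, and either `standard_name` or `long_name`) are a simplified, illustrative subset of the CF conventions, not the actual NMME Data Provider Guidelines. The function name `check_metadata` is hypothetical.

```python
def check_metadata(global_attrs, variables):
    """Return a list of readable problems; an empty list means the checks passed.

    global_attrs: dict of global attributes read from a file.
    variables: dict mapping variable name -> dict of that variable's attributes.
    The rules below are an illustrative subset, not a full CF checker.
    """
    problems = []

    # CF asks files to declare the convention version in 'Conventions'.
    conventions = str(global_attrs.get("Conventions", ""))
    if "CF-" not in conventions:
        problems.append("global attribute 'Conventions' does not name a CF version")

    for name, attrs in sorted(variables.items()):
        if "units" not in attrs:
            problems.append(f"{name}: missing 'units'")
        if "standard_name" not in attrs and "long_name" not in attrs:
            problems.append(f"{name}: missing 'standard_name' or 'long_name'")

    return problems


if __name__ == "__main__":
    # Hypothetical metadata for two variables; 'pr' lacks units.
    example_globals = {"Conventions": "CF-1.6"}
    example_vars = {
        "tas": {"units": "K", "standard_name": "air_temperature"},
        "pr": {"long_name": "precipitation"},
    }
    for problem in check_metadata(example_globals, example_vars):
        print(problem)
```

A check of this kind reports every problem at once rather than stopping at the first, which fits the iterative correction cycle described in the talk.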

8 AOLI 2015 Data Management Highlights and Challenges
Data consumer needs implied an iterative enhancement approach
Data Publication Pipeline at NCAR
1. Transfer to archive (Globus, shipped disk, scripted FTP)
2. Create inventory manifest
3. QC (NetCDF, Metadata)
4. File correction (Naming, CF Metadata)
5. Archive (Tape/HPSS)
6. Publish to Data Service
7. Publish to Data Portal
8. Update & Refine (Recall)
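The "Create inventory manifest" step in the pipeline above could look something like the sketch below. The slides do not describe the manifest format or tooling used at NCAR, so everything here is an assumption: the CSV layout, the choice of SHA-256, and the function names `sha256_of`, `build_manifest`, and `write_manifest` are all hypothetical.

```python
import csv
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Walk root and return one record per file, sorted by relative path."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            records.append({
                # Forward slashes keep manifests comparable across platforms.
                "path": os.path.relpath(full, root).replace(os.sep, "/"),
                "bytes": os.path.getsize(full),
                "sha256": sha256_of(full),
            })
    return sorted(records, key=lambda record: record["path"])


def write_manifest(records, out_path):
    """Write the records as CSV with a header row."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["path", "bytes", "sha256"])
        writer.writeheader()
        writer.writerows(records)


if __name__ == "__main__":
    # Demonstration on a throwaway directory with one hypothetical file.
    with tempfile.TemporaryDirectory() as incoming:
        os.makedirs(os.path.join(incoming, "model_a"))
        with open(os.path.join(incoming, "model_a", "sample.nc"), "wb") as f:
            f.write(b"placeholder bytes, not a real NetCDF file")
        for record in build_manifest(incoming):
            print(record["path"], record["bytes"], record["sha256"][:12])
```

Recording size and checksum at transfer time gives later steps (QC, archive, recall) something to verify against, whichever transfer route the data arrived by.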

9 AOLI 2015 Data Management Highlights and Challenges
Data requirements brought consistency and challenges
Some Lessons learned
- Plan to publish multiple versions of datasets
- Plan to re-process and iteratively improve metadata
- Plan for modeling groups to re-run at least some simulations
- Plan for diverse networking capabilities and limitations
- Use version control on all scripts used for processing and publication
- Use an issue tracker for data publication management and workflow
- Plan for extra provider resources for data processing, iteration and re-runs
- Automate whenever possible

10 AOLI 2015 Data Management Highlights and Challenges
Data requirements brought consistency and challenges
Greatest Challenges
- Workflow and automated processing tools for providers
- Storage resources for provider processing and transfer
- Archive gap identification and analysis tools
- Institutional security policies made collaboration difficult
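One of the challenges listed above is archive gap identification. As a rough sketch of the idea, assuming the archive inventory can be reduced to (model, forecast start month) pairs, the expected set can be generated and compared against what is present. The model labels and function names here are hypothetical, and real hindcast archives have more dimensions (ensemble member, variable, frequency) than this illustration covers.

```python
from datetime import date


def monthly_starts(first, last):
    """Yield the first day of each month from first through last, inclusive."""
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def find_gaps(models, first, last, present):
    """Return sorted (model, start_date) pairs that are expected but absent.

    present is an iterable of (model, start_date) pairs found in the archive.
    """
    expected = {(m, d) for m in models for d in monthly_starts(first, last)}
    return sorted(expected - set(present))


if __name__ == "__main__":
    # Hypothetical inventory: MODEL-B is missing its February 1981 start.
    inventory = [
        ("MODEL-A", date(1981, 1, 1)),
        ("MODEL-A", date(1981, 2, 1)),
        ("MODEL-A", date(1981, 3, 1)),
        ("MODEL-B", date(1981, 1, 1)),
        ("MODEL-B", date(1981, 3, 1)),
    ]
    gaps = find_gaps(["MODEL-A", "MODEL-B"], date(1981, 1, 1), date(1981, 3, 1), inventory)
    for model, start in gaps:
        print(model, start.isoformat())
```

Because the comparison is a set difference, the same approach extends to extra dimensions by widening the tuple, for example (model, start, variable).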

11 AOLI 2015 Data Management Highlights and Challenges
Coordinating efforts and sharing information is critical
Wins identified
- Globus Online made transferring high volume data reasonable
- Open Data and zero click through data access enables use
- CMIP conventions and (ESGF-style) data catalog enables tool and script re-use
- NetCDF4 with compression reduces storage and maximizes network bandwidth
Opportunities & Ideas
- End user Script repository
- Data Services (OpenDAP, WMS, LAS) leveraged by consistent archive
- OpenExchange user self help system with expert input
- Potential for additional services (Jupyter Hub)

12 AOLI 2015 Removing Barriers to Scientific Data Use
Common Problems:
- Finding and preparing data for analysis is expensive.
- Search, download, evaluate, repeat is slow.
- Scientifically related data is hard to find.
- Tools for data evaluation are lacking in workflows.
- Human experts cannot scale to meet growing needs.

13 AOLI 2015 Removing Barriers to Scientific Data Use Imparting knowledge to inform data consumers is a growing need. Knowledge Published Data Analysis

14 AOLI 2015 Challenges of obtaining data for analysis
Big data challenges include increasing efficiency of obtaining data and information
Diagram labels: Discover, Published Data, Evaluate, Access, Analysis
Discovery is improving
- Metadata federation
- Search engines
- Schema.org
Evaluation & Access is Challenging
- Download often required
- Little guidance in workflow
- Human experts fill in gaps

15 AOLI 2015 How do we improve the path to analysis?
- Open data really helps (services, ease of access)
- Connect information to data workflows (wikis, experts)
- Focus on usability with user centered, iterative design
- Increase access to information throughout access workflow
- Separate repositories, from workflows, from applications
- Enable third party innovation with API access
- Build in metrics to measure and guide improvements
Community building, open services, user experience driven tools for enabling innovation
A good starting point: Develop communities around data projects and products… and enable building systems of systems

16 AOLI 2015 Data Management Highlights and Challenges
Data requirements brought consistency and challenges to providers
Wins identified
- Globus Online made transferring high volume data reasonable
- Open Data and zero click through data access enables use
- CMIP conventions and (ESGF-style) data catalog enables tool and script re-use
- NetCDF4 with compression reduces storage and maximizes network bandwidth
Opportunities
- End user Script repository
- Data Services (OpenDAP, WMS, LAS) leveraged by consistent archive
- OpenExchange user self help system with expert input
- Potential for additional services (Jupyter Hub)

17 AOLI 2015 Thank you! ejn@ucar.edu https://www.earthsystemgrid.org/search.html?Project=NMME https://www.earthsystemcog.org/projects/nmme/


