A User’s Perspective on Acquisition and Management of CMIP5 Data
Jennifer Miletta Adams, George Mason University / COLA
ESGF2F, December 2014

COLA’s CMIP5 Data Collection

Workflow Requirements
- No [...], et al.
- Script-based
- Flexible
- Automated
- Runs in a UNIX environment

Workflow Elements
1. Create list of desired data: “All available models and ensembles for a subset of experiments, realms, frequencies, and variables”
2. Keep track of what has already been acquired
3. Identify what data are available
4. Get needed data
5. Make data user-friendly

Programmatic View of Workflow

    while (1) {
        list(acquired);
        for (desired) {
            search(available);
            for (available) {
                if (!acquired) needed;
            }
            download(needed);
        }
    }
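The loop above can be sketched as one pass in Python. This is only an illustration, not the scripts used at COLA: `plan_downloads` and the `search_available` callback are hypothetical names, and datasets are represented as plain strings.

```python
from typing import Callable, Iterable, List, Set


def plan_downloads(
    desired: Iterable[str],
    acquired: Set[str],
    search_available: Callable[[str], Iterable[str]],
) -> List[str]:
    """One pass of the workflow: return available datasets not yet acquired.

    `desired` holds request descriptions, `acquired` holds dataset ids that
    are already on disk, and `search_available` (hypothetical) maps a request
    to the dataset ids currently published for it.
    """
    needed = []
    for request in desired:
        for dataset in search_available(request):
            if dataset not in acquired:
                needed.append(dataset)
    return needed
```

The outer `while (1)` corresponds to calling this function repeatedly, for example from a scheduled job, and passing the result to whatever performs the download.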

Keep Track of Acquired Data
11 keywords are required:
cmip5 /data /Experiment /Realm /Frequency /MIP-Table /Variable /Institute.Model /Ensemble /Version /datafiles.nc
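A minimal sketch of how such a directory layout could be read back into keywords. It assumes that the eight directory levels between a local root and the data file follow the order on the slide; the root path, the `DatasetKey` type, and `parse_local_path` are hypothetical names introduced for illustration.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Tuple


class DatasetKey(NamedTuple):
    """Directory levels in the order listed on the slide (assumed)."""
    experiment: str
    realm: str
    frequency: str
    mip_table: str
    variable: str
    institute_model: str
    ensemble: str
    version: str


def parse_local_path(path: str, root: str) -> Tuple[DatasetKey, str]:
    """Split a file path under `root` into its dataset key and file name."""
    parts = PurePosixPath(path).relative_to(root).parts
    if len(parts) != 9:
        raise ValueError(f"expected 8 directories plus a file name: {path}")
    return DatasetKey(*parts[:8]), parts[8]
```

Walking the local tree and collecting the resulting keys gives the "acquired" set that the workflow compares against the search results.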

Discovery of Available Data
Build a Dataset search URL:
&latest=true
&replica=false
&facets=id
&limit=0
&project=CMIP5
&experiment=piControl
&realm=atmos
&time_frequency=mon
&cmor_table=Amon
&variable=clt&variable=hfls….&variable=vas
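The query string above could be assembled along these lines. The parameter names and values are the ones shown on the slide; the endpoint host is a placeholder and the function name is hypothetical, so treat this as a sketch rather than the actual script.

```python
from typing import Iterable
from urllib.parse import urlencode

# Placeholder endpoint: substitute the search URL of a real index node.
SEARCH_ENDPOINT = "https://esgf-node.example.org/esg-search/search"


def dataset_search_url(
    experiment: str,
    realm: str,
    frequency: str,
    table: str,
    variables: Iterable[str],
    endpoint: str = SEARCH_ENDPOINT,
) -> str:
    """Build a dataset search URL with one `variable` parameter per variable."""
    params = [
        ("latest", "true"),
        ("replica", "false"),
        ("facets", "id"),
        ("limit", "0"),
        ("project", "CMIP5"),
        ("experiment", experiment),
        ("realm", realm),
        ("time_frequency", frequency),
        ("cmor_table", table),
    ]
    # Repeating the key yields variable=clt&variable=hfls&... as on the slide.
    params += [("variable", name) for name in variables]
    return endpoint + "?" + urlencode(params)
```

Passing the parameters as a list of pairs, rather than a dictionary, is what allows the `variable` key to appear more than once.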

Download Needed Data
1. Build a file search URL to determine the number of files for each data set
2. Build a wget URL to download wget scripts; then give them unique names
3. Keep authentication certificates up-to-date
4. Monitor execution of wget scripts in a staging area
5. Put files in place under local directory structure
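Step 4 can be sketched as a bounded retry loop around a single script. The function name, the default attempt count, and the delay are assumptions chosen for illustration; the `runner` argument exists only so that the logic can be exercised without launching a real process.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence


def run_with_retries(
    command: Sequence[str],
    max_attempts: int = 5,
    delay_seconds: float = 60.0,
    runner: Callable = subprocess.run,
) -> Optional[int]:
    """Run `command` until it exits with status 0 or attempts run out.

    Returns the attempt number that succeeded, or None if every attempt
    failed.
    """
    for attempt in range(1, max_attempts + 1):
        result = runner(command)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # Wait before trying again; the cause of the failure is ignored.
            time.sleep(delay_seconds)
    return None
```

A call such as `run_with_retries(["bash", "wget-script.sh"])`, where the script name is hypothetical, would rerun the script up to five times. A `None` result marks the data set as one to revisit on a later pass.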

Make Data User-Friendly
- Create GrADS descriptor files
- Aggregate files over time dimension
- Make use of ensemble dimension when appropriate
- Identify missing or overlapping time periods
- Assign non-standard dimensions (e.g. basin averages)
- Handle 365-day calendars
- Interpolate data on non-rectilinear grids
- For ocean and sea ice realms
- ESMF’s RegridWeightGen generates the interpolation weights
- Rotate vector fields from grid-relative to Earth-relative coordinates before interpolation
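Identifying missing or overlapping time periods can be illustrated for monthly data. This sketch assumes file names that end in `_YYYYMM-YYYYMM.nc`; the function names are hypothetical, and months are counted as `year * 12 + (month - 1)` so that consecutive months differ by one.

```python
import re
from typing import Iterable, List, Tuple

# Assumed file name ending for monthly data: _YYYYMM-YYYYMM.nc
RANGE = re.compile(r"_(\d{4})(\d{2})-(\d{4})(\d{2})\.nc$")


def time_range(filename: str) -> Tuple[int, int]:
    """Return (first, last) month covered by a file as month counts."""
    match = RANGE.search(filename)
    if match is None:
        raise ValueError(f"no monthly time range in file name: {filename}")
    y0, m0, y1, m1 = map(int, match.groups())
    return y0 * 12 + (m0 - 1), y1 * 12 + (m1 - 1)


def check_coverage(
    filenames: Iterable[str],
) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Return (gaps, overlaps) as lists of inclusive month-count ranges."""
    ranges = sorted(time_range(name) for name in filenames)
    gaps, overlaps = [], []
    if not ranges:
        return gaps, overlaps
    covered_end = ranges[0][1]
    for start, end in ranges[1:]:
        if start > covered_end + 1:
            gaps.append((covered_end + 1, start - 1))
        elif start <= covered_end:
            overlaps.append((start, min(end, covered_end)))
        covered_end = max(covered_end, end)
    return gaps, overlaps
```

Reporting gaps explicitly, instead of silently skipping them, keeps the aggregated time axis linear, which matches the talk's advice not to hide missing data.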

Complications | Solutions
Version number not with data | Retained during wget script acquisition
1000 file limit per wget script | Please minimize file granularity!
User authentication | Automated with MyProxyClient
Errors from wget | Never mind why, just keep trying. Failure is an option.
Some data nodes are friendlier than others | Data node blacklist
Missing or overlapping data | DO NOT hide missing data with a non-linear time axis!
Rotation of grid-relative vectors | Please publish gridspec files!
Data on wacky grids | ESMF’s RegridWeightGen

Special thanks to: Luca Cinquini, Estani Gonzalez, Gavin Bell, Lawson Hanson, and the CMIP5 Helpdesk!
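Two of the complications listed for this talk, the 1000-file limit per wget script and the data node blacklist, can be combined in a small planning step. The function name and the idea of filtering by URL host name are assumptions made for this sketch.

```python
from typing import Iterable, List, Set
from urllib.parse import urlparse


def plan_batches(
    file_urls: Iterable[str],
    blacklist: Set[str],
    limit: int = 1000,
) -> List[List[str]]:
    """Drop files hosted on blacklisted nodes, then split into batches.

    Each batch holds at most `limit` URLs, so that one batch fits in one
    wget script.
    """
    allowed = [
        url for url in file_urls
        if urlparse(url).hostname not in blacklist
    ]
    return [allowed[i:i + limit] for i in range(0, len(allowed), limit)]
```

For example, 2500 files on an allowed node plus one file on a blacklisted node yield three batches of 1000, 1000, and 500 URLs.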