Climate Analytics on Global Data Archives
Aparna Radhakrishnan (1), Venkatramani Balaji (2)
(1) DRC/NOAA-GFDL, (2) Princeton University/NOAA-GFDL


The rapidly growing climate modeling enterprise poses challenges on several fronts: the availability and utilization of computing resources, data storage, usability and maintenance, and analysis capabilities. With the availability of high-resolution climate data, data volumes grow significantly, making it a challenge to apply user-developed climate analytics. The CMIP5 project is a significant example of such a scenario. This scientific project was designed by a team spanning twenty or more modeling centers across the planet; many of the largest supercomputers in the world were given over for many months to running the experiments; the data is now stored in a distributed archive of nodes governed by the Earth System Grid Federation (ESGF), with a core measuring more than 1 PB and a total of about 20 PB.

With the global availability of the data archives, there is an explosion of interest in climate analytics and research. Often, a specific analysis suite needs to be replicated to analyze the behavior of different climate datasets, to compare experiments across models, and to evaluate the specifics. In this presentation, we provide an innovative solution for deploying user-developed climate analytics on the CMIP5 ESGF federated archive in the form of a web service. Originally, climate analytics were applied to GFDL datasets using the Curator database to locate the internal resources, variables, etc. Later, the approach was significantly transformed so that climate analytics can be applied to ESGF's global data archives.

1. Introduction

"Climate analytics on global data archives" is ongoing work under the auspices of the ExArch Project. This project is principally a framework for the scientific interpretation of multi-model ensembles at the peta- and exa-scale. It applies a strategy, a prototype infrastructure and demonstration usage examples in the context of the imminent CMIP5 archive. The work is sponsored by a coordinated effort among science agencies of the G8 countries, including NSF.

2. Use-case

Say a user has developed an innovative analysis script using local analysis resources and a small subset of data downloaded from the CMIP5 federated archive. Her analysis is widely cited, and there is interest worldwide in replicating her study on other datasets from the archive. How is this to be achieved?

Fig. 1. CMIP5 Tree
Participating institutions: BCC, BNU, CCMA, CMCC, CNRM-CERFACS, COLA-CFS, CSIRO-QCCCE, INM, IPSL, LASG-CESS, LASG-IAP, MIROC, MPI-M, MRI, NASA-GISS, NASA-GMAO, NCAR, NCC, NOAA-GFDL, NOAA-NCEP, ...
CMIP5 models: ≈ 52 climate models
Experiments: long-term, near-term, atmosphere-only (total ≈ 116 experiments). Don't forget the ensemble members!
Frequency: 3-hourly, 6-hourly, climatology monthly mean, daily, fixed, monthly, sub-hourly, yearly
Realms: aerosol, atmosphere, land, land-ice, ocean, sea-ice, ocean biogeochemistry
Climate fields: total ≈ 550

2.1 Bring analysis to data:

Input: Get dataset identifiers (D1, D2) for the experiments E1, E2 being compared. (This includes model name, experiment, ensemble_member, frequency, realm, CMIP table.) E.g.: NOAA-GFDL.GFDL-ESM2G.historical.mon.atmos.Amon.r1i1p1
Input: Get start_time (t0 for E1, t0 for E2) and end_time (t1 for E1, t1 for E2) for the experiments E1, E2 being compared.
Input: Get the CMIP5 variable name (V) to be analyzed.
Input: Get the climate analytics plot type to be applied to datasets (D1, D2).

THREDDS CATALOG FEEDER (python-based)
1. Crawls through the ESGF root THREDDS catalogs and locates datasets D1, D2.
2. Fetches the OPeNDAP aggregation URL for variable "V" in datasets D1, D2.
3. Prepares the arguments to be passed to the analysis script templates, along with the start_time(s) (t0 for E1, t0 for E2) and end_time(s) (t1 for E1, t1 for E2).
4. Runs the analysis scripts server-side (any language; currently tested with Ferret).
5. Sends the analysis products back to the THREDDS catalog feeder.
6. Throws exceptions if the time ranges are not available for the specified experiments, or if the specified variables are not part of a given experiment.

Fig. 2. Bring analysis to data (diagram elements: THREDDS catalog feeder, NetCDF files, analysis products)

3. User Interface

Step 1: Select the data sets to be compared
Step 2: Select the plot type
Step 3: Select the variable and/or region
Step 4: Select the year range to be compared
Step 5: View output

Fig. 3. Analysis output from LAS

4. Future Work

Analysis products bundled with Live Access Server:
4.1 2D map comparisons for tropical oceanic fields
4.2 Statistical downscaling
4.3 And many more

Fig. 4. Precipitation map comparison
Fig. 5. Statistical downscaling overview

Acknowledgement: This work was partly funded by the international ExArch project under the G8 initiative by National Science Foundation Award 1119308. Many thanks to: ExArch, Andrew Wittenberg (GFDL), Roland Schweitzer (Weathertop Consulting/PMEL)
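
The input handling and the exception behavior described in Section 2.1 (the dataset identifier, step 3 and step 6) can be illustrated with a minimal Python sketch. This is not the poster's actual feeder code. The field order in FIELDS is inferred from the single example identifier given in the poster, and the names DatasetId, parse_dataset_id, prepare_arguments, VariableNotAvailable, TimeRangeNotAvailable, and the "available" mapping are all hypothetical. The year coverage used in the usage lines is a made-up illustration value, standing in for what a catalog crawl would report.

```python
from dataclasses import dataclass

# Field order inferred from the poster's one example identifier (an assumption).
FIELDS = ("institute", "model", "experiment", "frequency",
          "realm", "cmip_table", "ensemble_member")


@dataclass(frozen=True)
class DatasetId:
    """Hypothetical container for the parts of a dataset identifier."""
    institute: str
    model: str
    experiment: str
    frequency: str
    realm: str
    cmip_table: str
    ensemble_member: str


class VariableNotAvailable(LookupError):
    """Raised when the variable is not part of the given experiment."""


class TimeRangeNotAvailable(ValueError):
    """Raised when the requested years fall outside the available range."""


def parse_dataset_id(identifier: str) -> DatasetId:
    """Split a dot-separated identifier into its named fields."""
    parts = identifier.split(".")
    if len(parts) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} dot-separated fields, "
            f"got {len(parts)}: {identifier!r}"
        )
    return DatasetId(*parts)


def prepare_arguments(dataset, variable, start_year, end_year, available):
    """Validate a request and build the arguments for an analysis script.

    `available` is a hypothetical mapping of variable name to
    (first_year, last_year); no catalog is contacted in this sketch.
    """
    if variable not in available:
        raise VariableNotAvailable(
            f"{variable!r} is not part of {dataset.model} {dataset.experiment}"
        )
    first, last = available[variable]
    if start_year > end_year or start_year < first or end_year > last:
        raise TimeRangeNotAvailable(
            f"requested {start_year}-{end_year}, available {first}-{last}"
        )
    return {
        "dataset": dataset,
        "variable": variable,
        "start_year": start_year,
        "end_year": end_year,
    }


if __name__ == "__main__":
    d1 = parse_dataset_id(
        "NOAA-GFDL.GFDL-ESM2G.historical.mon.atmos.Amon.r1i1p1"
    )
    # Illustrative coverage only; real values would come from the catalog.
    args = prepare_arguments(d1, "pr", 1980, 2000, {"pr": (1861, 2005)})
    print(args["dataset"].model, args["variable"])
```

Raising distinct exception types for a missing variable and an unavailable time range is one possible design; it lets a user interface report which of the two conditions in step 6 caused the failure.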

