1 Review of IRI DL and Important DL Functions & Code
Steve Solecki & Ángel Munoz

2 Review of DL layout and options
1. Review of DL layout and options
2. Useful DL functions and code: averaging, anomalies, correlations, binning data (histograms), EOFs and PCs
3. Practice problems
4. Your input
With an emphasis on hands-on practice! This is not intended to be all that formal, so ask questions!

3 Navigating the data library:
Ultimately, the method you use depends on what YOU are looking for. However, it is often easiest to scroll down to the bottom of the Data Library homepage and, under Finding Data, select datasets by source or by category.

4 Data by Category: This option provides a sorted listing of nearly all the datasets in the Data Library based on the type of data they contain – not all of it is climate related, and we will likely use some of these eventually.
Along with each dataset, a brief description, the spatial and temporal resolutions, and the spatial and temporal limits are included.
Sometimes this option is best if you're unsure of what dataset you want/need or just want to explore.

5 Data by Source: This is a complete list of the datasets in the Data Library, organized by their source.
This method is a bit more advanced than the previous one, but in time, once you become more familiar with dataset names and the organizations that produce them, this approach should become more practical (note: the short descriptions next to the datasets are helpful for determining their use).
Some datasets that may be useful can be found under NOAA, NASA, JONES, KAPLAN… of course this is just a few among many, and we will eventually explore more varied sets.

6 Useful DL Functions… What do you think they do? Are there others that you think we need?
T 3 boxAverage
[X Y]regridAverage
[X Y]average
{Y cosd}[X Y][T]sdv
T 12 splitstreamgrid
RANGE / STEP
maskgt and masklt
T 15 runningAverage
Others are easier to figure out…
yearly-anomalies
[T]correlate
RANGE / RANGEEDGES
T (SEASON) seasonalAverage
[X Y]weighted-average
Remember that 'T' specifies time and X, Y are used when dealing with space! (A quick sketch of what a couple of these do follows below.)

7 What does each line of this tell the DL to do?
expert
SOURCES .NOAA .NCEP .EMC .CMB .GLOBAL .Reyn_SmithOIv2 .monthly .sst
T (Jan 1982) (Dec 2000) RANGE
T 3 boxAverage
T 12 STEP

8 Ways to Average data (precip., temp., etc.)
Review: Ways to average data (precip., temp., etc.). There are three ways to average in space in the IRI DL (see the sketch below for option 1):
1. Select a latitude/longitude domain
2. Select a complex boundary, such as a country, an administrative region, or more generally any polygon (a "shape file" in GIS)
3. Select stations in a region of interest
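For option 1, the calculation the DL ends up doing is essentially an area-weighted (cosine-of-latitude) mean over the selected box. A minimal Python/NumPy sketch of that idea, with a made-up precipitation array and the approximate Aruba box used on the next slide, might look like this; it is only a conceptual stand-in for [X Y]weighted-average / {Y cosd}[X Y]average, not the DL's implementation.

import numpy as np

# Hypothetical gridded field: precip(time, lat, lon) on a 2.5-degree grid
lats = np.arange(-88.75, 90, 2.5)
lons = np.arange(1.25, 360, 2.5)
precip = np.random.default_rng(1).gamma(2.0, 1.5, size=(12, lats.size, lons.size))

def box_weighted_average(field, lats, lons, lat_bounds, lon_bounds):
    """Cosine-of-latitude weighted mean over a lat/lon box, one value per time step."""
    yi = (lats >= lat_bounds[0]) & (lats <= lat_bounds[1])
    xi = (lons >= lon_bounds[0]) & (lons <= lon_bounds[1])
    sub = field[:, yi][:, :, xi]                       # (time, lat_box, lon_box)
    w = np.cos(np.deg2rad(lats[yi]))[:, None]          # weight rows by grid-cell area
    w2 = np.broadcast_to(w, sub.shape[1:])
    return (sub * w2).sum(axis=(1, 2)) / w2.sum()

# Example box: roughly 10N-25N and 80W-60W (i.e. 280E-300E)
series = box_weighted_average(precip, lats, lons, (10, 25), (280, 300))
print(series.shape)   # one area-mean value per time step: (12,)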

9 1. To average over a longitude/latitude domain: this approach requires using only one data source as we play with the boundaries – precipitation in this case. We can select the range manually or enter it under Data Selection.
a. Go to Data by Source and use the dataset we worked with in class (NASA > GPCP > v2p1 > satellite-gauge…)
b. Go into "Data Selection" and set longitude (X) and latitude (Y) ranges – you must restrict the range!
c. Restrict ranges – this will change the boundaries; try something like 25N to 10N and 60W to 80W. Let's see what we can get for in and around Aruba just for fun… or somewhere else that you prefer.
d. Stop Selecting – this finishes the process of setting the boundaries and updates the expert mode box to reflect what you want.
e. Then go into "Filters" and select "average over X Y"; or you can just type [X Y]average – if you type it directly into expert mode, you don't need to bother with the filters.
*We need to include this last step, otherwise we will not get the average (datasets are just data; we usually need to manipulate them to get what we want).*
Now you can click on any of the different views to see how the data shows up: line chart, color bars, scatterplot…

10 Finding our country ID and coding:
Go to Data Sources, scroll down to Features, and click on it.
Select Political, since we are interested in that type of feature (country boundaries, to be exact)… other stuff is here too that we may or may not ever use.
Select 'World', then 'Countries', then 'Country Names', since we first need our country ID.
Select Tables, then columnar table, and find the country ID for….
Now, make a note of that ID and then go back to the 'Countries' option.
Select country geometry, click on the expert mode box, and after the code that is already in the expert mode box, type in: objectid X VALUE
This is all the code we need, so now copy this entire code and paste it into the expert mode box where we have our prcp code. Expert mode should now look like this:
expert
SOURCES .NASA .GPCP .V2p1 .satellite-gauge .prcp
SOURCES .Features .Political .World .Countries .the_geom
objectid X VALUE
Finally, we need to get a weighted average, so we can type [X Y]weighted-average after the objectid line of code. This yields a set of country-averaged precipitation values that depend only upon time (T) and the country ID (objectid). (A conceptual sketch of this kind of masked average follows below.)
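Conceptually, the country-geometry step masks out all grid cells outside the country before the area-weighted mean is taken. A small Python/NumPy sketch of that idea, using a made-up rectangular "country" mask in place of the real .the_geom polygon (so the numbers here are purely illustrative):

import numpy as np

# Hypothetical field and a boolean "inside the country" mask standing in for the
# country geometry the DL gets from .the_geom / objectid (here just a made-up box).
lats = np.arange(-88.75, 90, 2.5)
lons = np.arange(1.25, 360, 2.5)
prcp = np.random.default_rng(2).gamma(2.0, 1.5, size=(24, lats.size, lons.size))
lat2d, lon2d = np.meshgrid(lats, lons, indexing="ij")
country_mask = (lat2d > -25) & (lat2d < -12) & (lon2d > 43) & (lon2d < 51)  # fake polygon

def country_weighted_average(field, lats, mask):
    """Area-weighted mean over the masked grid cells, one value per time step."""
    w = np.cos(np.deg2rad(lats))[:, None] * mask       # zero weight outside the country
    return (field * w).sum(axis=(1, 2)) / w.sum()

series = country_weighted_average(prcp, lats, country_mask)
print(series.shape)   # (24,) - country-mean precipitation as a function of T only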

11 3. Averaging over stations…
Click on the “Searches” link to the right of the map, and search on the archive of stations, e.g. by station name (sometimes country name is included) by longitude and latitude ranges You will be given the option to further select the list of stations that responds to the criteria you specified. OR Go to: This link shows us all station data collection points: From this point, zoom in on the map to an area where there is a station you’re interest in…lets try one in Japan using the SENDAI station

12 Practice Using DL Functions:
Locate a dataset that provides combined satellite-gauge precipitation data and:
1a. Determine the annual average rainfall over New Zealand between 1970 and 2005.
1b. Do the same, but for the Jun-Aug season over the same time period.

13 Anomalies
Earth science data is commonly viewed in terms of anomalies (i.e., the difference between observations and climatology) rather than as raw values. Anomalies can be produced with Ingrid by first calculating a climatology and then calculating the difference between it and the observed data. However, Ingrid also has a single command that does all of these calculations: yearly-anomalies is the function we used to remove climatology values so that we get a better idea of relationships…
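As a rough sketch of what yearly-anomalies amounts to, assuming a monthly series that starts in January and covers whole years: compute the 12 climatological monthly means, then subtract the appropriate one from every time step. In Python/NumPy (illustrative only, not Ingrid's code):

import numpy as np

# Hypothetical monthly series, 30 years long (e.g., SST at one grid point)
rng = np.random.default_rng(3)
months = np.arange(360)
sst = 25 + 3 * np.sin(2 * np.pi * months / 12) + 0.01 * months + rng.normal(0, 0.3, 360)

def yearly_anomalies(x, period=12):
    """Subtract the mean annual cycle (monthly climatology) from a monthly series."""
    x = np.asarray(x, dtype=float)
    clim = x.reshape(-1, period).mean(axis=0)        # 12 climatological monthly means
    return x - np.tile(clim, len(x) // period)       # anomaly for every time step

anoms = yearly_anomalies(sst)
print(anoms.mean().round(3))   # close to zero once the seasonal cycle is removed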

14 Find an SST dataset that includes the time period you need
Let's say we're tired of looking at the ENSO region and have suddenly become interested in finding out more about SST anomalies in the N. Atlantic… Where would we begin?
Find an SST dataset that includes the time period you need (maybe consider bookmarking the page or saving the link)
Restrict the range to be over the N. Atlantic (you can estimate)
Restrict the time period if need be
Call for the DL to show you the anomalies

15 Practice Using DL Functions:
2a. Determine the three-month average precipitation over Madagascar.
2b. Do the same, but this time consider anomalies.
Now, locate a monthly SST dataset that is still updated (hint: we may have used it for a HW).
3. Split the data so that it gives monthly values, and determine the average value over the area where the PDO signal is typically found (try between 20N to 65N and 150E to 140W). Make a map that shows the average values for 1983.

16 IRI’s definition of the correlation function:
The correlate command calculates the Pearson product moment correlation for the two latest items on the stack over the indicated grid. For the correlation to be computed, the gridding of the two items on the stack must match – this last point is important to remember! The datasets used must have the same ranges, in terms of time and space.
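In other words, with two fields on the same (T, Y, X) grid, [T]correlate produces a map with one Pearson correlation per grid point, computed along time. A conceptual Python/NumPy sketch with made-up arrays (not the DL's implementation):

import numpy as np

# Two hypothetical fields on the SAME grid (as correlate requires): shape (time, lat, lon)
rng = np.random.default_rng(4)
a = rng.normal(size=(48, 36, 72))
b = 0.6 * a + 0.8 * rng.normal(size=(48, 36, 72))   # built to correlate with a

def correlate_over_time(x, y):
    """Pearson correlation along the time axis at every grid point."""
    xa = x - x.mean(axis=0)
    ya = y - y.mean(axis=0)
    num = (xa * ya).sum(axis=0)
    den = np.sqrt((xa ** 2).sum(axis=0) * (ya ** 2).sum(axis=0))
    return num / den

r = correlate_over_time(a, b)
print(r.shape, r.mean().round(2))   # (36, 72) map of correlations, mean near 0.6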

17 Try the following for practice:
Create a correlation map that shows the relationship between surface temperature anomalies and Palmer Drought Index values for Jun-Aug and then Dec-Feb between 1980 and 1995, using the following coordinates as boundaries: 10N – 40N and 30W – 40E. For surface temperature anomalies, find an air temperature (over land) anomalies dataset, and then find the Palmer drought indices dataset.
What are the relationships? Do they differ by season? Is the relationship significant?
If you already did this, try correlating precipitation anomalies with the NAO dataset and see what shows up…
See the subsequent slides if you want to do it step by step; otherwise, try it out without looking first!
(Shown: December 1983 Palmer Drought Index map)

18 Entering commands into expert mode:
Now we need to combine the datasets in order to get what we want…
1. Copy and paste the Palmer Drought Severity Index source into the temperature anomaly expert mode window (below it):
expert
SOURCES .UEA .CRU .Jones .CRUTEM3 .tanom
SOURCES .NCAR .CGD .CAS .Indices .PDSI
Based on this ordering, the PDSI data will be gridded onto the resolution used for the temperature anomaly dataset; if we don't like it, we can just switch the order.
2. Restrict the ranges; I provided them, but sometimes you may need to select your own using the Data Selection tab or Data Views:
X RANGE
Y RANGE

19 3. Restrict the data to show data for the season we are interested in:
expert
SOURCES .UEA .CRU .Jones .CRUTEM3 .tanom
X RANGE
Y RANGE
T (Jun 1980) (Aug 1995) RANGE
SOURCES .NCAR .CGD .CAS .Indices .PDSI
4. Have a seasonal average determined:
expert
SOURCES .UEA .CRU .Jones .CRUTEM3 .tanom
X RANGE
Y RANGE
T (Jun 1980) (Aug 1995) RANGE
T (Jun-Aug) seasonalAverage
SOURCES .NCAR .CGD .CAS .Indices .PDSI

20 5. Regrid the second dataset with respect to the first dataset's resolution:
expert
SOURCES .UEA .CRU .Jones .CRUTEM3 .tanom
X RANGE
Y RANGE
T (Jun 1980) (Aug 1995) RANGE
T (Jun-Aug) seasonalAverage
SOURCES .NCAR .CGD .CAS .Indices .PDSI
[X Y]regridAverage
6. Find the correlation between the two datasets:
expert
SOURCES .UEA .CRU .Jones .CRUTEM3 .tanom
X RANGE
Y RANGE
T (Jun 1980) (Aug 1995) RANGE
T (Jun-Aug) seasonalAverage
SOURCES .NCAR .CGD .CAS .Indices .PDSI
[X Y]regridAverage
[T]correlate
7. Look at a correlation map created from this data (use the "colors with coasts" data view).
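For reference, [X Y]regridAverage is essentially an area-average regridding: values of the finer grid that fall inside each coarse grid cell are averaged together. A highly simplified Python/NumPy sketch of that idea, assuming an evenly divisible grid (the DL handles the general case, including missing data and partial overlaps):

import numpy as np

# Hypothetical fine-resolution field (0.5 degree) to be averaged onto a 2.5-degree grid
fine = np.random.default_rng(5).normal(size=(180 * 2, 360 * 2))

def regrid_average(field, factor):
    """Average non-overlapping factor x factor blocks of grid cells."""
    ny, nx = field.shape
    return field.reshape(ny // factor, factor, nx // factor, factor).mean(axis=(1, 3))

coarse = regrid_average(fine, 5)                 # 0.5 degree -> 2.5 degree
print(fine.shape, "->", coarse.shape)            # (360, 720) -> (72, 144)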

21 Jun-Aug: one map with temperature anomalies first, then Palmer Drought indices; the other with Palmer Drought indices first, then temperature anomalies. The Palmer Drought indices grid is a bit finer… maybe we prefer this one?

22 Dec-Feb: one map with temperature anomalies first, then Palmer Drought indices; the other with Palmer Drought indices first, then temperature anomalies.

23 Let's try a linear data interpolation for the last example, to make the resolution a bit better for our maps! Grid to a higher resolution (e.g., 1° x 1°):
expert
SOURCES .UEA .CRU .Jones .CRUTEM3 .tanom
X RANGE
Y RANGE
T (Jun 1980) (Aug 1995) RANGE
T (Jun-Aug) seasonalAverage
X GRID
Y GRID
SOURCES .NCAR .CGD .CAS .Indices .PDSI
[X Y]regridAverage
[T]correlate
The X GRID and Y GRID lines tell the DL to regrid between -13 (13W) and 32 to a 1-degree resolution, and to do the same for points between 35N and 60N.
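Linear interpolation just fills in the target grid by interpolating between the original grid points; it smooths the map but adds no new information. A tiny Python/NumPy sketch of the idea along one dimension, with made-up values (not the DL's code):

import numpy as np

# Hypothetical coarse values along latitude (5-degree spacing) interpolated to 1 degree
coarse_lats = np.arange(35.0, 61.0, 5.0)
coarse_vals = np.array([0.2, 0.5, 1.1, 0.9, 0.4, -0.1])      # one value per coarse point

fine_lats = np.arange(35.0, 60.0 + 0.001, 1.0)                # target 1-degree grid
fine_vals = np.interp(fine_lats, coarse_lats, coarse_vals)    # linear interpolation

print(coarse_vals.size, "->", fine_vals.size)                 # 6 -> 26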

24 grid to a higher resolution (e.g., 1° x 1°)
JJA season – we can do this with all the others too if we want.

25 DJF Season

26 Is the relationship significant for either season?
Need to recall how to do this based on last year's stats class…
First, determine what correlation value needs to be surpassed in order to qualify as significant. We can do this with a simple t-test:
Number of years: 1980 to 1995 gives a 16-year span (to be significant at α = 0.05, the t value must be > ~1.75).
So then… the standard error of the correlation is roughly 1/√15 ≈ 0.258.
t = (sample corr − 0) / 0.258: 0.60/0.258 ≈ 2.33 – YES; 0.50/0.258 ≈ 1.94 – YES; 0.40/0.258 ≈ 1.55 – NO
…therefore any correlations smaller in magnitude than about ±0.45 fail to be significant.
So from this we can say that, generally, the relationship is significant over several regions including portions of Great Britain and Russia in JJA, and portions of the north coast of Africa and Russia, among others, during DJF – not the greatest resolution in either case!
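A small Python sketch of this significance check, assuming SciPy is available; it mirrors the simple t = r / (1/√(n−1)) approach used above rather than the more standard t = r·√((n−2)/(1−r²)) formula:

import numpy as np
from scipy import stats

def correlation_significant(r, n, alpha=0.05):
    """Two-step check like the slide: t = r / (1/sqrt(n-1)) against a t critical value."""
    se = 1.0 / np.sqrt(n - 1)                      # ~0.258 for a 16-year span
    t = r / se
    t_crit = stats.t.ppf(1 - alpha, df=n - 1)      # one-sided critical value (~1.75)
    return t, t_crit, abs(t) > t_crit

for r in (0.60, 0.50, 0.40):
    t, t_crit, ok = correlation_significant(r, n=16)
    print(f"r={r:.2f}: t={t:.2f}, critical={t_crit:.2f}, significant={ok}")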

27 Result of correlating NAO w/ precipitation anomalies

28 What does each line of this tell the DL to do?
expert
SOURCES .NOAA .NCEP .CPC .PRECL .prcp
SOURCES .Features .Political .World .Countries .the_geom
objectid 24 VALUE
[X Y]weighted-average
T (May 1961) (Oct 2009) RANGE
T (May-Oct) seasonalAverage
T differences
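For the last two lines: my reading is that T (May-Oct) seasonalAverage collapses each year's May-Oct months into one seasonal value, and T differences then takes the change between consecutive time steps. A tiny Python/NumPy sketch of that interpretation, with made-up numbers:

import numpy as np

# Hypothetical May-Oct seasonal means, one value per year (e.g., country-mean precip)
rng = np.random.default_rng(6)
seasonal_means = 5 + rng.normal(0, 1, size=49)        # 1961..2009 -> 49 seasons

# Year-to-year change of the seasonal mean: difference between consecutive time steps
year_to_year_change = np.diff(seasonal_means)

print(seasonal_means.size, "seasonal values ->", year_to_year_change.size, "differences")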

29 Practice Using DL Functions:
4. Find the NOAA NCDC DAILY precipitation dataset, then zoom in to Alaska and find the NOME station (use the first one you find). After selecting this station, create a plot that shows a running average of precipitation for every 30 days.

30 Binning data: As per the IRI, the distrib1D command is what we need to use: distrib1D returns the frequency distribution (as binned counts) of data from an input variable, based upon a user-specified binning interval and range limits defined in the DATA lower upper step RANGESTEP command. In doing this, distrib1D creates a new grid of bins, defined by the RANGESTEP command, that has the same name as the input variable.
For our case, let's take Gulf Coast station data and use it to find the frequency of rainfall amounts:
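Conceptually, distrib1D with DATA lower upper step RANGESTEP is just a histogram with fixed-width bins between the two limits. A small Python/NumPy sketch; the rainfall values and bin limits here are illustrative, not the ones from the station used below:

import numpy as np

# Hypothetical seasonal rainfall totals (mm) for one station
rng = np.random.default_rng(7)
rain = rng.gamma(shape=4.0, scale=60.0, size=80)     # 80 seasons of OND totals

def distrib1d(values, lower, upper, step):
    """Binned counts over [lower, upper] with a fixed bin width."""
    edges = np.arange(lower, upper + step, step)
    counts, _ = np.histogram(values, bins=edges)
    return edges[:-1], counts                         # left edge of each bin, its count

bins, counts = distrib1d(rain, lower=0, upper=600, step=25)
print(list(zip(bins[:4], counts[:4])))                # first few (bin, count) pairs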

31 Use the Pensacola/Forest Sherman station

32 At first, you probably get something that looks like this:
expert
SOURCES .NOAA .NCDC .GHCN .v2beta IWMO VALUES .prcp
T (Oct 1980) (Dec 2000) RANGEEDGES
T (Oct-Dec) seasonalAverage
DATA 0 50 2 RANGESTEP
distrib1D
For this station, are these values ok? Note: 0 = minimum value, 50 = maximum value, and 2 = bin size.

33 NO! Based on the OND averages, precipitation goes way beyond 50 mm, and the increase of 2 mm per bin is too small – the range is much greater! We need to change: DATA 0 50 2 RANGESTEP. Let's try this: DATA … RANGESTEP

34 expert
SOURCES .NOAA .NCDC .GHCN .v2beta IWMO VALUES .prcp
T (Oct 1980) (Dec 2000) RANGEEDGES
T (Oct-Dec) seasonalAverage
DATA … RANGESTEP
distrib1D
Looks more normally distributed?

35 expert
SOURCES .NOAA .NCDC .GHCN .v2beta IWMO VALUES .prcp
T (Oct 1980) (Dec 2000) RANGEEDGES
T (Oct-Dec) seasonalAverage
DATA … RANGESTEP
distrib1D
Looks more normally distributed too?

36 5. Take any station in Peru and use it to find the frequency of rainfall amounts. Compare your results with someone else's. How do your results differ? Why might this be?

37 Interpreting EOFs and Time Series
- Singular value decomposition (SVD) reduces the number of values in a dataset
- SVD analysis results in a more compact representation of correlations
- Provides insight into the spatial and temporal variations exhibited in the fields of data being analyzed
- Highlights certain patterns with respect to a time series
- Each eigenvector explains a certain percentage of the variability
- Usually the first 1-2 explain the most variance

38 {Y cosd}[X Y][T]svd
The svd function computes the singular value decomposition of the SST dataset, weighted by the cosine of the latitude. Spatial data are often weighted by the cosine of latitude to account for the change in grid-cell area between meridians at different latitudes. Five new variables appear under the Datasets and Variables subheading: normalized eigenvalues, structures, singular values, time series, and weights.
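As a rough illustration of what {Y cosd}[X Y][T]svd is doing, here is a Python/NumPy sketch that weights an anomaly field by √cos(latitude), reshapes it to (time, space), and takes the SVD to get spatial patterns (structures), their time series, and the variance fraction each mode explains. It is a conceptual stand-in with made-up data, not Ingrid's implementation:

import numpy as np

# Hypothetical anomaly field: shape (time, lat, lon), already climatology-removed
rng = np.random.default_rng(8)
nt, ny, nx = 120, 20, 40
lats = np.linspace(-45, 45, ny)
anoms = rng.normal(size=(nt, ny, nx))

def eof_svd(field, lats, n_modes=2):
    """EOFs/PCs via SVD with sqrt(cos(lat)) area weighting."""
    w = np.sqrt(np.cos(np.deg2rad(lats)))[:, None]            # area weights
    x = (field * w).reshape(field.shape[0], -1)               # (time, space)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    var_frac = s**2 / (s**2).sum()                            # variance explained per mode
    patterns = vt[:n_modes].reshape(n_modes, *field.shape[1:])
    pcs = u[:, :n_modes] * s[:n_modes]                        # time series of each mode
    return patterns, pcs, var_frac[:n_modes]

eofs, pcs, frac = eof_svd(anoms, lats)
print(eofs.shape, pcs.shape, frac.round(3))   # (2, 20, 40) (120, 2) variance fractions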

39 Example of an EOF in DL – under svd results

40 How would we interpret these? For this analysis, Kenya was considered…
The leading mode shows negative values over Kenya (and other portions of the Horn of Africa) and over the western Indian and western Pacific Oceans (near the area where ENSO forms). Positive values are found in the eastern Indian Ocean and over most of Indonesia. Based on the time series, it appears there are many interannual fluctuations in the signal. PC1 explains about 40.99% of the variance.
The second mode is markedly different from the first. There are positive values situated over the western Indian Ocean and Kenya, while negative values appear over portions of Indonesia. Negative values, interestingly enough, remain over the western Pacific. This mode accounts for only about 14.51% of the variance, less than half that of PC1. In terms of the time series, it seems there is less interannual variability and more of a potential positive trend present (particularly from roughly the mid-1980s onward).
Climatological interpretation: Based on EOF1, the spatial pattern shown appears to be consistent with the interannual variability caused by ENSO. EOF2 appears to be more complicated. There does not appear to be as much interannual variability, and the spatial pattern may be showing the regional warming that has taken place over the Indian Ocean. This warming has impacted some countries around the Horn of Africa, with more convergence occurring over the Indian Ocean. As a result, it rains more over the Indian Ocean, leaving less moisture available to be transported toward the eastern coast of Africa (which is likely causing shorter rainy seasons and/or droughts and a longer dry season). Alternatively, it is possible that the second EOF is also showing a pattern related to the monsoon season, where low pressure centers become concentrated around the eastern Indian Ocean and zones of subsidence (high pressure) form over the western Indian Ocean, resulting in drier conditions for countries like Kenya.

41 Sample correlation maps of SST with EOFs from previous slide
Based on the SST correlation maps, it is evident that PC1 is most strongly correlated with ENSO (particularly the La Niña phase) in the equatorial Pacific. Smaller pockets of SSTs over the North Atlantic (maybe associated with the NAO), the North Pacific (possibly linked with the PDO) and the Indian Ocean appear to be significantly correlated with PC1 as well. Based on this map, it is evident that SSTs in a variety of regions impact rainfall across Kenya and other Horn of Africa countries.
PC2 does not seem to be as strongly correlated with as many ocean-basin SSTs. There do seem to be significant correlations between PC2 and SSTs of the eastern equatorial Pacific (off the coast of Peru), portions of the Indian Ocean, and small areas of the North Atlantic. It is possible that PC1 is strongly correlated with SSTs related to interannual-to-decadal variability, while PC2 is more closely related to the warming of certain portions of different ocean basins, resulting in warmer SSTs.

42 What does each line of this tell the DL to do?
SOURCES
SOURCES .NCAR .CGD .CAS .Indices .PDSI
Y (10N) (50N) RANGEEDGES
X (150W) (30W) RANGEEDGES
T ( ) VALUES
T (Jun-Aug) seasonalAverage
{Y cosd}[X Y][T]svd
.Ts
SOURCES .KAPLAN .EXTENDED .ssta
[T]correlate
X Y fig: colors | contours coasts :fig

43 Practice Using DL Functions:
6. Construct a correlation map between Mauna Loa CO2 values and the UEA's precipitation anomalies dataset for … How does it look over each of the seven continents? Where is the strongest correlation? (Hint: there may not be an anomaly dataset, but you can still use a precipitation one.)

