Christine Laney Ken Ramsey Mark Servilla Information management issues and the Trends project: A drawing board for making cross-site comparisons feasible.

1 Christine Laney Ken Ramsey Mark Servilla Information management issues and the Trends project: A drawing board for making cross-site comparisons feasible

2 THANK YOU!!  LTER Information Managers – you know who you are  LTER Network Office – Mark, James B., Bob, Duane, Marshall, Inigo, James V.  NCEAS – Mark, Callie, Will, Jim, Rick  Jornada staff – Ken, Justin & technicians

3 Trends in Long-Term Ecological Data: a multi-agency synthesis project Objectives: (1) to create a platform for synthesis by producing a compendium of easily accessible long-term graphs and data from long-term ecological research sites; (2) to illustrate the utility of this platform in addressing important within-site and network-level scientific questions

4 Products Folio-sized book to be published by Oxford Univ. Press Website (data, metadata, graphs) for synthesis and analysis

5 Book organization  Introduction: value and importance of long-term research  Within-site graphs/tables arranged by four themes in the LTER Planning Process Climate and variability in the physical environment, including disturbance characteristics Human population and economy Biogeochemistry (e.g., atmospheric deposition, surface water chemistry) Biotic structure (e.g., ANPP, plant biomass, species richness)  Among-site comparison graphs: e.g., atmospheric chemistry, N fertilization, climate variability, ENSO signal responses  Site descriptions and photos organized by biomes

6 Website – Trends Data Store www.ecotrends.info  Initial design: Static datasets, metadata & book graphs listed by chapter and figure number, some search capability Metadata for static data provide access to raw data and the script used to generate the derived product Prototype near completion  Final design: Routinely harvested and derived data from ongoing projects Metadata & links back to sources and revisions Search, sort, analysis & graphing tools Prototype in development

7 Coming soon: www.ecotrends.info

8 Participating Sites

9 Process Selecting variables  Submitted broad request for long-term data  Downloaded data from other online compilations  Examined submitted data for consistent variables across sites (e.g., precipitation, nitrogen, etc.)  Refined data request and requested additional data from sites for variables that should exist, but may not have been submitted (e.g., ANPP, species richness, etc.)  Generated “wish list” of variables that may be important for cross-site and network-level questions, but long-term data don’t exist yet at very many sites (e.g., soil respiration, foliar nutrients). This will be used in planning grant activities. Selecting variables for the webpage  Use variables from book in static form first  Update data sets with time and include additional variables

10 Progress to date Contributors: 26 LTER (84%), 13 FS (12%), 6 ARS sites (3%) & Santa Rita ER (<1%) Climate datasets ~300 Biogeochemistry datasets ~150 Biotic datasets ~100 Others ~50 Total: over 600 datasets, plus 190 illustrative graphs Human population and economy: collected for all LTER sites from census data (funded by NSF supplement) Metadata: Most data have at least rudimentary metadata; few have full EML with attribute-level description of the datasets.

11 What we’re doing with the data  Downloading and storing data & documentation  Writing R or SAS scripts to generate: Datasets containing monthly or annual averages or totals, depending on the variable Strict time plots with simple linear regression Tables that record all derived statistics Plots that show change over time among different sites for each variable Anomaly plots of monthly climate data  Generating metadata with EML for each derived product. Metadata contains links to original data and associated scripts.  Recording each product (data, metadata, graphs), along with links between products, in a multi-purpose database.
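As a rough illustration of the derived products listed on this slide, the sketch below sums daily values into monthly totals and then computes monthly anomalies against each calendar month's long-term mean. The project scripts were written in R or SAS; this is a Python sketch only, and the function names, the tuple layout, and the sample precipitation values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def monthly_totals(daily):
    """Sum daily values into {(year, month): total}.

    `daily` is an iterable of (year, month, day, value) tuples (assumed layout).
    """
    totals = defaultdict(float)
    for year, month, _day, value in daily:
        totals[(year, month)] += value
    return dict(totals)


def monthly_anomalies(totals):
    """Subtract each calendar month's mean over all years from that month's total."""
    by_month = defaultdict(list)
    for (_year, month), total in totals.items():
        by_month[month].append(total)
    climatology = {month: mean(values) for month, values in by_month.items()}
    return {(y, m): t - climatology[m] for (y, m), t in totals.items()}


# Hypothetical daily precipitation in mm, not project data.
daily = [
    (2000, 1, 1, 2.0), (2000, 1, 2, 4.0),
    (2001, 1, 1, 10.0),
    (2000, 2, 1, 1.0), (2001, 2, 1, 3.0),
]
totals = monthly_totals(daily)
anomalies = monthly_anomalies(totals)
```

For variables such as temperature, the same structure would use a mean instead of a sum, which is the "averages or totals, depending on the variable" distinction made on the slide.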

12 Step 1. Graph similar data through time for sites with those data. Step 2. Determine trend line by site. Nitrate in precipitation MULTI-SITE ANALYSES

13 Step 3. Compare slopes of trend lines among sites.

14 Step 4. Compare slopes spatially. Mean change in total deposition of N in nitrate form in precipitation
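Steps 2 and 3 of the multi-site analysis can be sketched as a per-site least-squares slope followed by a simple ranking. This is a minimal Python sketch under assumed inputs: the site codes, the yearly values, and the function names are hypothetical, and the slides do not say how slopes were actually compared (for example, whether significance tests were used). The spatial comparison in step 4 is not covered here.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slopes_by_site(records):
    """Map each site to the slope of value against year.

    `records` is {site: [(year, value), ...]} (assumed layout).
    """
    return {
        site: ols_slope([year for year, _ in obs], [value for _, value in obs])
        for site, obs in records.items()
    }


# Hypothetical annual values for two made-up sites.
records = {
    "SITE_A": [(1990, 1.0), (1991, 1.5), (1992, 2.0)],
    "SITE_B": [(1990, 3.0), (1991, 2.0), (1992, 1.0)],
}
slopes = slopes_by_site(records)
# Order sites from the steepest decline to the steepest increase.
ranked = sorted(slopes.items(), key=lambda item: item[1])
```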

15 Challenges, solutions & opportunities  Obtaining data  Quality and quantity of data and documentation  Utilizing data toward specific goals  Properly documenting received data and products derived from the data  Making final products accessible to editorial committee and available on website

16 Obtaining Data: time-intensive and inconsistent process on both sides!  Located data on individual websites Few had their long-term data separated out from short-term data Unable to search for long-term data  Utilized metacat via LTER, KNB, Morpho Slow search engine Unable to search for particular record lengths Unable to sort filtered records by time Metadata often available without attached data files  No pre-knowledge of types of available long-term data beyond basics (precip, temp, etc).  Result: a lot of emails and phone calls!

17 Challenges, opportunities & solutions  Obtaining data  Quality and quantity of data and documentation  Utilizing data toward specific goals  Properly documenting received data and products derived from the data  Making final products accessible to editorial committee and available on website

18 Quality and quantity of data and documentation  Lots of great data, varied level of detailed metadata in text or EML format  Small problems with single datasets → large problems with many datasets  Online data sometimes not quality-checked or ready for use – but no markers to say so  Examples:

19 Looks nice…but….

20

21 The nit-picky details  Dates as an example: 2-digit years range of dates in single cell (e.g., 02/01-03/2006 or 02/01/2006,02/03/2006) date with a letter appended to the end (ex: 02/01/1999A) single digit day and month, especially when there are no delimiters between month, day, year.
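The date problems listed above can be handled with a small normalizer. The sketch below is one possible approach, not the project's method: it assumes month/day/year order, uses a hypothetical pivot of 30 for 2-digit years, expands a day range or comma-separated pair into separate dates, splits off a trailing letter as a flag, and refuses undelimited strings rather than guessing.

```python
import re
from datetime import date

PIVOT = 30  # assumption: 2-digit years below 30 are 20xx, otherwise 19xx

SINGLE = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{2}|\d{4})")
DAY_RANGE = re.compile(r"(\d{1,2})/(\d{1,2})-(\d{1,2})/(\d{2}|\d{4})")
TRAILING_LETTER = re.compile(r"(.*\d)([A-Za-z])")


def _year(text):
    """Expand a 2-digit year using PIVOT; pass 4-digit years through."""
    year = int(text)
    if len(text) == 2:
        return 2000 + year if year < PIVOT else 1900 + year
    return year


def parse_cell(cell):
    """Return (dates, flag) for one date cell, assuming month/day/year order."""
    cell = cell.strip()
    flag = ""
    match = TRAILING_LETTER.fullmatch(cell)
    if match:  # e.g. 02/01/1999A: keep the letter as a flag
        cell, flag = match.group(1), match.group(2)
    match = DAY_RANGE.fullmatch(cell)
    if match:  # e.g. 02/01-03/2006: first and last day of the range
        month, first, last, year = match.groups()
        y, m = _year(year), int(month)
        return [date(y, m, int(first)), date(y, m, int(last))], flag
    dates = []
    for part in cell.split(","):  # e.g. 02/01/2006,02/03/2006
        match = SINGLE.fullmatch(part.strip())
        if not match:
            # Undelimited strings such as 2199 cannot be split reliably.
            raise ValueError(f"ambiguous or unsupported date: {part!r}")
        month, day, year = match.groups()
        dates.append(date(_year(year), int(month), int(day)))
    return dates, flag
```

Raising an error on undelimited dates is a deliberate choice: a string with single-digit day and month and no separators has more than one valid reading, so it is safer to send it back to the data provider than to pick one.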

22 Preferred data formats for synthesis  Simple ascii delimited with commas, spaces, tabs, etc. with headers, or very simple excel spreadsheets. If fixed-width, give widths and spaces.  Metadata in separate file  All data in single file, not separated by year. If not possible, each file in exactly the same format. Complex formatting systems, like multisheets & several tables in one sheet, are more difficult to interpret and extract information.
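When data do arrive split by year, the requirement that every file share exactly the same format can be checked mechanically before merging. The following Python sketch assumes comma-delimited text with a single header row; the function name and the sample column names are hypothetical.

```python
import csv
import io


def combine_tables(texts):
    """Concatenate comma-delimited tables that share one header row.

    Raises ValueError if any table's header differs from the first one.
    """
    header = None
    rows = []
    for index, text in enumerate(texts):
        reader = csv.reader(io.StringIO(text))
        file_header = next(reader)
        if header is None:
            header = file_header
        elif file_header != header:
            raise ValueError(f"table {index} has a different header: {file_header}")
        rows.extend(row for row in reader if row)  # skip blank lines
    return header, rows


# Hypothetical per-year files, shown as in-memory strings.
year_2000 = "date,precip_mm\n01/01/2000,2.0\n01/02/2000,4.0\n"
year_2001 = "date,precip_mm\n01/01/2001,10.0\n"
header, rows = combine_tables([year_2000, year_2001])
```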

23 Challenges, opportunities & solutions  Obtaining data  Quality and quantity of data and documentation  Utilizing data toward specific goals  Properly documenting received data and products derived from the data  Making final products accessible to editorial committee and available on website

24 Utilizing data toward specific goals  Selected variables with specified summary time spans (monthly or yearly) with specified units.  Converting short time scales to longer time scales – OK  Converting long time scales to shorter – Impossible  Unit conversion – often simple: F → C, W/m^2 → MJ/m^2  Can be really difficult: Flow in m from a weir → m^3/s using weir dimensions; Raw shield count data without calibrations given → % moist: impossible.  Missing data – leads to bias in particular months/years especially with totals. Lots of consultation with metadata and PIs. What happens when metadata is incomplete & PIs are unavailable?
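The simple conversions and the missing-data concern on this slide can be sketched in a few lines of Python. Two assumptions are made here that the slide does not state: the W/m^2 value is taken to be a daily mean flux, so the result is MJ/m^2 per day, and a monthly total is only reported when at least 90% of days are present. Both the threshold and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def fahrenheit_to_celsius(deg_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


def daily_mean_wm2_to_mjm2(watts_per_m2):
    """Convert a daily mean flux in W/m^2 to a daily total in MJ/m^2.

    1 W = 1 J/s, so the daily energy is flux * 86,400 s, divided by 1e6 for MJ.
    """
    return watts_per_m2 * SECONDS_PER_DAY / 1e6


def monthly_total_or_none(values, days_in_month, min_fraction=0.9):
    """Sum daily values, or return None if too many days are missing.

    `values` holds one entry per recorded day, with None for missing days.
    A total built from too few days would be biased low, so it is withheld.
    """
    present = [value for value in values if value is not None]
    if len(present) < min_fraction * days_in_month:
        return None
    return sum(present)
```

For example, `monthly_total_or_none([1.0] * 28 + [None] * 2, 30)` still returns a total because 28 of 30 days are present, while a month with only 10 recorded days returns `None` instead of an underestimate.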

25 Challenges, opportunities & solutions  Obtaining data  Quality and quantity of data and documentation  Utilizing data toward specific goals  Properly documenting received data and products derived from the data  Making final products accessible to editorial committee and available on website

26 Properly documenting received data and products derived from the data  Morphing system Hierarchical folder system with emails Attempted EML documentation. Help from NCEAS. Concurrent Versions System (CVS) & multipurpose SQL Server & MySQL database. Documentation of deriving data and graphs  EML template  Scripts Metacat (versioning)

27 Challenges, opportunities & solutions  Obtaining data  Quality and quantity of data and documentation  Utilizing data toward specific goals  Properly documenting received data and products derived from the data  Making final products accessible to editorial committee and available on website

28 Trends editorial page jornada-www.nmsu.edu

29 Voting page

30 Trends IM meeting, 15 min breakout  Site involvement/commitment to Trends Within site:  Percentage of IM time/resources spent compared to PIs  Percentage of time/resources spent on Trends compared to time spent on site needs  Too much, enough, too little? Among sites:  Has there been communication between sites about Trends data requests?  Has Trends triggered any new collaborations or strengthened old ones?  Communication Progress reports: frequent and/or adequate enough? Recommendations for further communications

31 Trends IM meeting, 15 min breakout  Keeping track of data use & proper citation Now (by the trends project itself) In the future via the website

32 Trends IM meeting, 15 min breakout  International site involvement Interest in Trends project – how can ILTER sites use the current set of data in their own research Reasons pro and con for initiating a similar effort among ILTER sites What would it take to do a Trends-like project at the international level? List of contacts

