1 of 53 Lecture 3 Metadata Steve Burian Hydroinformatics Fall 2013 This work was funded by National Science Foundation Grants EPS 1135482 and EPS 1208732.

Presentation transcript:

1 of 53 Lecture 3 Metadata Steve Burian Hydroinformatics Fall 2013 This work was funded by National Science Foundation Grants EPS 1135482 and EPS 1208732.

2 of 53 Objectives Define science metadata. Identify the types of information included in metadata records for environmental datasets. Determine the dimensionality of a dataset, including the scale triplet of support, spacing, and extent for both space and time. Generate metadata and describe datasets to support data sharing.

3 of 53 The Data Life Cycle Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze

4 of 53 Metadata is “Information about Data” – WHO created the data? – WHAT is the content of the data? – WHEN were the data created? – WHERE is it geographically? – WHY were the data developed? – HOW were the data developed? What is Metadata? Content, quality, condition, and other characteristics

5 of 53 The Purpose of Metadata Support discovery of scientific data Facilitate acquisition, comprehension, and use of data by HUMANS Enable automated discovery, ingestion, processing and analysis by MACHINES

6 of 53 Metadata is All Around CC image by USDAgov on Flickr Details about the songs in your MP3 library. Details about the cereal you ate for breakfast this morning.

7 of 53 Data vs. Metadata Data Metadata 15.9 Little Bear River at Mendon Road Latitude = Longitude = Water temperature Degrees Celsius 9/30/2011 5:00 PM
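The distinction on this slide can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the `describe` helper are assumptions, and the latitude and longitude are left as `None` because the slide text does not give their values.

```python
# A bare value (the data) paired with the metadata that gives it meaning.
# Field names are illustrative; latitude/longitude are not given in the
# slide text, so they are left unset rather than guessed.
observation = {
    "value": 15.9,
    "metadata": {
        "site_name": "Little Bear River at Mendon Road",
        "latitude": None,
        "longitude": None,
        "variable": "Water temperature",
        "units": "Degrees Celsius",
        "timestamp": "9/30/2011 5:00 PM",
    },
}


def describe(obs):
    """Render the value together with its metadata as one readable sentence."""
    md = obs["metadata"]
    return (
        f'{md["variable"]} = {obs["value"]} {md["units"]} '
        f'at {md["site_name"]} ({md["timestamp"]})'
    )


print(describe(observation))
```

Without the metadata block, the number 15.9 could be a temperature, a depth, or a flow rate at any site and time.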

8 of 53 Sharing Data When you provide data to someone else, what types of information would you want to include with the data? When you receive a dataset from an external source, what types of details do you want to know about the data?

9 of 53 Providing data: – Why were the data created? – What limitations do the data have? – What do the data mean? – How should the data be cited if they are re-used in a new study? Receiving data: – What are the data gaps? – What processes were used for creating the data? – Are there any fees associated with the data? – At what scale were the data created? – What do the values in the tables mean? – What software do I need in order to read the data? – What projection are the data in? – Can I give these data to someone else? Sharing Data

10 of 53 Necessary Meta/data Structure The degree of metadata format and structure necessary for different levels of projected secondary data utilization. (adapted from Michener et al., 1997).

11 of 53 Information Entropy Example of the normal degradation in information content associated with data and metadata over time (“information entropy”). (Figure taken from Michener, 2006).

12 of 53 What if instead? Paper using data is published Curated data published in a data repository Data annotated by additional users Data synthesized and leads to another publication Time Information Content of Data and Metadata

13 of 53 Metadata to Support Understanding and Using Data

14 of 53 Metadata for Data Use Research context – Hypotheses, site characteristics, experimental design, research methods Status of the dataset (e.g., raw? processed?) Spatial and temporal domain of the dataset Physical structure of the data

15 of 53 Scale Issues in Interpretation of Measurements and Modeling Results The Scale Triplet of Measurements Adapted from: Blöschl (1996) Spatial extent represented by grid Grid cell size Average over grid cell? Sample value at grid cell center? Interpretation for Geospatial Data

16 of 53 Issues in Data Interpretation Adapted from: Blöschl (1996) Spacing too large - aliasing

17 of 53 Issues in Data Interpretation Adapted from: Blöschl (1996) Extent too small - trend

18 of 53 Issues in Data Interpretation Adapted from: Blöschl (1996) Support too large - smoothing
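For a regularly sampled time series, the scale triplet discussed on the preceding slides can be derived as in the sketch below. The function name `temporal_scale_triplet` is hypothetical, and the support is passed in by the caller on the assumption that it comes from the sensor documentation, since it cannot be recovered from timestamps alone.

```python
from datetime import datetime, timedelta
from statistics import median


def temporal_scale_triplet(timestamps, support):
    """Return (support, spacing, extent) for a series of observation times.

    support: averaging time of each measurement, supplied by the caller.
    spacing: median gap between consecutive observations.
    extent:  time between the first and the last observation.
    """
    ts = sorted(timestamps)
    if len(ts) < 2:
        raise ValueError("need at least two timestamps")
    gaps = [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]
    spacing = timedelta(seconds=median(gaps))
    extent = ts[-1] - ts[0]
    return support, spacing, extent


# Illustrative series: 48 instantaneous readings taken every 30 minutes.
start = datetime(2011, 9, 30, 0, 0)
series = [start + timedelta(minutes=30 * i) for i in range(48)]
support, spacing, extent = temporal_scale_triplet(series, support=timedelta(0))
print(support, spacing, extent)
```

Recording these three values in a metadata record lets a later user judge whether aliasing, trend, or smoothing effects are likely for their application.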

19 of 53 Another Example: Raster Spatial Resolution Higher resolution Higher spatial accuracy Slower display Slower processing Larger file size Lower resolution Lower spatial accuracy Faster display Faster processing Smaller file size Decreasing Cell Size Increasing Cell Size 100 m 30 m 10 m 1 m Slide from David Tarboton
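The trade-off between cell size and file size can be made concrete with a rough count of grid cells. The sketch below assumes a hypothetical 10 km by 10 km study area and 4 bytes per cell with no compression; both figures are illustrative and not from the slides.

```python
import math


def raster_cell_count(extent_x_m, extent_y_m, cell_size_m):
    """Number of cells needed to cover a rectangular extent at a given cell size."""
    cols = math.ceil(extent_x_m / cell_size_m)
    rows = math.ceil(extent_y_m / cell_size_m)
    return rows * cols


BYTES_PER_CELL = 4  # assumption: 32-bit values, uncompressed

for cell_size in (100, 30, 10, 1):
    n_cells = raster_cell_count(10_000, 10_000, cell_size)
    size_mb = n_cells * BYTES_PER_CELL / 1e6
    print(f"{cell_size:>3} m cells: {n_cells:>11,} cells, about {size_mb:,.2f} MB")
```

Because the cell count grows with the inverse square of the cell size, going from 10 m to 1 m cells multiplies storage and processing by roughly one hundred.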

20 of 53 Data Use Case – NRCS SNOTEL

21 of 53 Data Use Case – NRCS SNOTEL “I am trying to obtain a record of hourly precipitation. Your sites have data for precipitation accumulation, but I cannot use that to back calculate precipitation for each hour because there are frequently losses in precipitation. For example, the accumulated precipitation might go from 7.5 to 7.4 to 7.3 and then increase again to 7.5. Can you offer any advice on obtaining the incremental rather than accumulated precipitation?”

22 of 53 An Interesting Response from NRCS “The short answer to your question is: don't use it. its all crap. The long answer is: the sensor used to detect precipitation and snow water equivalent is a pressure transducer modified from reading pressures of up to 2000 psi down to reading pressures of 0 to 5 psi and does not have the necessary stability in either accuracy or precision to measure hourly values of 0.1 inches especially within the environment of its setting with varying temperatures, barometric pressure, expansion and contraction of the precip gage itself, frost heave and the list goes on. These data on an hourly incremental basis are not stable. On a daily basis measured from midnight to midnight are sufficiently accurate for our purposes of water supply forecasting and hydrologic modeling - but even then can be plus/minus 0.5 inches at some of the more variable sites. As the time increment increases, the accuracy of the gage increases as well. daily values are decent, weekly more so and monthly pretty close as we edit the variability out. Hourly are crap. Bottom line - don't use hourly pcp data for anything other than amusement or instructional value of knowing/teaching what your data are, how they are collected and what they represent.”
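The problem described in the question can be reproduced by differencing the accumulated values it quotes. This is a minimal sketch; the `increments` helper is hypothetical and is not an NRCS procedure.

```python
def increments(accumulated):
    """Differences between consecutive accumulated readings, rounded to 0.01."""
    return [round(b - a, 2) for a, b in zip(accumulated, accumulated[1:])]


# Accumulated precipitation values quoted in the question (inches).
hourly_accumulated = [7.5, 7.4, 7.3, 7.5]

hourly = increments(hourly_accumulated)
print(hourly)

# Negative "precipitation" is physically impossible, so these steps reflect
# sensor noise rather than rainfall.
negative_steps = [x for x in hourly if x < 0]
print(len(negative_steps), "of", len(hourly), "hourly increments are negative")

# Over the whole window the net change is zero.
net_change = round(hourly_accumulated[-1] - hourly_accumulated[0], 2)
print(net_change)
```

Nothing in the data values themselves warns a user about this behavior; only the description of the sensor and its accuracy, which is metadata, explains why hourly differences are unreliable.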

23 of 53 Metadata Format and Standards

24 of 53 Web Quest…the good, the bad, the ugly • Work in teams of 2-3 • Seek two examples of scientific metadata – one you find good and one you find to be not so good – identify specifically why • Make a list of the elements/information you find in the metadata • You have 10 minutes, send me the link to your best example for both categories – good and bad to

25 of 53 What Does Scientific Metadata Look Like?

26 of 53 A structure to describe data with: – Common terms to allow consistency between records – Common definitions for easier interpretation – Common language for ease of communication – Common structure to quickly locate information Encoding – structured text or Extensible Markup Language (XML) In search and retrieval, standards provide: – Documentation structure in a reliable and predictable format for computer interpretation – A uniform summary description of the dataset What is a Metadata Standard?

27 of 53 General Metadata Organization Information for data discovery – Title, keywords, spatial and temporal domain, abstract Information for interpretation and appropriate use – Research objectives, experimental design, sampling procedures, site selection, variables and units, data processing Information for automated use – Structural attributes of the data (schema) and format of the data (syntax)
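The three groups of information on this slide can be treated as a checklist. The sketch below is an assumption-laden illustration: the field names are paraphrased from the slide, and `missing_fields` is a hypothetical helper rather than part of any metadata standard or tool.

```python
# Field names paraphrase the slide; they do not follow any particular standard.
REQUIRED = {
    "discovery": ["title", "keywords", "spatial_domain", "temporal_domain", "abstract"],
    "interpretation": [
        "research_objectives",
        "sampling_procedures",
        "variables_and_units",
        "data_processing",
    ],
    "automated_use": ["schema", "format"],
}


def missing_fields(record):
    """List 'group.field' names that are absent or empty in a metadata record."""
    missing = []
    for group, fields in REQUIRED.items():
        section = record.get(group, {})
        for field in fields:
            if not section.get(field):
                missing.append(f"{group}.{field}")
    return missing


# A deliberately incomplete example record.
draft = {
    "discovery": {
        "title": "Water temperature, Little Bear River at Mendon Road",
        "keywords": ["water temperature"],
        "abstract": "Placeholder abstract.",
    },
    "automated_use": {"format": "CSV"},
}

for name in missing_fields(draft):
    print("missing:", name)
```

A check like this is most useful early in a project, when the missing details are still easy to recover.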

28 of 53 Dublin Core Element Set – Emphasis on web resources, publications. Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM) – Emphasis on geospatial data – Commonly used by federal agencies. International Organization for Standardization (ISO) 19115/19139 Geographic information: Metadata – Emphasis on geospatial data and services – standards#fgdcendorsedisostandards Examples of Metadata Standards
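As an illustration of XML encoding, the sketch below writes a few Dublin Core elements with Python's standard library. The element names (`title`, `creator`, `subject`, `description`) and the namespace URI belong to the Dublin Core Element Set; the `metadata` wrapper element, the `dublin_core_record` helper, and the field values are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core Element Set namespace
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialize a dict of Dublin Core element names and values to an XML string."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = dublin_core_record(
    {
        "title": "Water temperature, Little Bear River at Mendon Road",
        "creator": "Example Creator",  # placeholder value
        "subject": "water temperature",
        "description": "Placeholder description of the dataset.",
    }
)
print(xml_text)
```

Because every record uses the same element names, a catalog can index the title or subject of any record without knowing who wrote it.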

29 of 53 Ecological Metadata Language (EML) – Focus on ecological data. Water Markup Language (WaterML) – Emphasis on time series of hydrologic observations – More of a data encoding language. Examples of Metadata Standards

30 of 53 Scientific Metadata Comparison

31 of 53 Many standards collect similar information. Factors to consider: – Your data type: Are you working mainly with GIS data? Raster/vector or point data? Do you work for a federal agency? – Consider the FGDC Content Standard for Digital Geospatial Metadata. Are you working with data retrieved from instruments such as monitoring stations or satellites? Are you using geospatial data services such as web-mapping applications or data modeling? – Consider using the ISO standard. Are you mainly working with ecological data? – Consider Ecological Metadata Language (EML). Choosing a Metadata Standard

32 of 53 More Factors to consider: – Your organization’s policies – What resources ($$$) are available to create metadata? – What tools are available? – Availability of human support – Instructional materials – Use of controlled vocabularies – How many standards or output formats? Choosing a Metadata Standard

33 of 53 Tools for Creating Metadata FGDC CSDGM: – Mermaid (NOAA) – Metavist (Forest Service) – TKME (USGS) – ESRI ArcCatalog ISO: – ESRI ArcCatalog – XML Spy, Oxygen, CatMD EML: – Morpho (KNB)

34 of 53 Metadata Editor Example – ArcCatalog Customizable metadata styles ISO ISO FGDC CSDGM INSPIRE

35 of 53 Metadata Standards for Non-Data Objects Community Surface Dynamics Modeling System (CSDMS) Metadata for model components Under development Image from Jon Goodall and Mostafa Elag, Phyllis Mbewe, University of South Carolina

36 of 53 Concerns About Creating Metadata
Concern: Workload required to capture accurate, robust metadata. Solution: Incorporate metadata creation into data development process – distribute the effort.
Concern: Time and resources to create, manage, and maintain metadata. Solution: Include in grant budget and schedule.
Concern: Readability / usability of metadata. Solution: Use a standardized metadata format.
Concern: Discipline specific information and vocabularies. Solution: ‘Profile’ standard to require specific information and use specific values.

37 of 53 The Value of Metadata

38 of 53 The descriptive content of the metadata file can be used to identify, assess, and access available data resources Data Discovery and Reuse online access order process contacts use constraints access constraints data quality availability/pricing keywords geographic location time period attributes

39 of 53 Find data by: – themes / attributes – geographic location – time ranges – analytical methods used – sources and contributors – data quality Data Discovery and Reuse
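Searching by theme, location, and time range amounts to filtering metadata records. The sketch below assumes a tiny in-memory catalog whose records, coordinates, and dates are invented for illustration; `matches` and `search` are hypothetical helpers, not the interface of any real data portal.

```python
from datetime import date


def matches(record, keyword=None, bbox=None, period=None):
    """True if a record satisfies every search criterion that was supplied.

    bbox is (west, south, east, north); period is (start, end). Spatial and
    temporal criteria match when the ranges overlap.
    """
    if keyword is not None:
        if keyword.lower() not in [k.lower() for k in record["keywords"]]:
            return False
    if bbox is not None:
        west, south, east, north = bbox
        r_west, r_south, r_east, r_north = record["bbox"]
        if r_west > east or r_east < west or r_south > north or r_north < south:
            return False
    if period is not None:
        start, end = period
        r_start, r_end = record["period"]
        if r_start > end or r_end < start:
            return False
    return True


def search(catalog, **criteria):
    """Titles of all catalog records that match the criteria."""
    return [r["title"] for r in catalog if matches(r, **criteria)]


# Invented records for illustration only.
catalog = [
    {
        "title": "Stream temperature",
        "keywords": ["water temperature", "stream"],
        "bbox": (-112.0, 41.5, -111.8, 41.8),
        "period": (date(2005, 1, 1), date(2011, 12, 31)),
    },
    {
        "title": "Land cover grid",
        "keywords": ["land cover"],
        "bbox": (-125.0, 24.0, -66.0, 50.0),
        "period": (date(2006, 1, 1), date(2006, 12, 31)),
    },
]

print(search(catalog, keyword="land cover"))
print(search(catalog, bbox=(-112.1, 41.4, -111.7, 41.9),
             period=(date(2010, 1, 1), date(2010, 12, 31))))
```

A search of this kind only works when every record carries keywords, a bounding box, and a time period in a consistent form, which is what a metadata standard provides.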

40 of 53 Example of How Metadata is Used

41 of 53 Data.gov – Federal e-gov geospatial data portal. DataONE – Repository for data and metadata. US Geological Survey – USGS Core Science Metadata Clearinghouse. Other Data Portals

42 of 53 Metadata allows you to repeat scientific process if: – methodologies are defined – variables are defined – analytical parameters are defined Metadata allows you to defend your scientific process: – demonstrate process – increasingly GIS/data-savvy public requires metadata for consumer information Data Accountability INPUT RESULTS

43 of 53 Metadata can be a declaration of: – Purpose: the originator’s intended application of the data – Use Constraints: inappropriate applications of the data – Completeness: features or geographies excluded from the data – Distribution liability: explicit liability of the data producer and assumed liability of the consumer Data Liability

44 of 53 Metadata can be a means to improve communications among project participants using common: – descriptions & parameters – keywords, vocabularies, thesauri – contact information – attributes – distribution information If reviewed regularly by all participants, metadata created early and updated during the project improves opportunity for coordinating: – source data – analytical methods – new information Project Coordination

45 of 53 Value of Metadata to Data Producers Avoid data duplication Share reliable information Publicize efforts – promote the work of a scientist and his/her contributions to a field of study

46 of 53 Value of Metadata to Data Users Search, retrieve, and evaluate data set information from both inside and outside an organization Find data: Determine what data exists for a geographic location and/or topic Determine applicability: Decide if a data set meets a particular need Discover how to acquire the dataset you identified Process and use the dataset

47 of 53 Value of Metadata to Organizations Metadata helps ensure an organization’s investment in data – Documentation of data processing steps, quality control, definitions, data uses, and restrictions – Ability to use data after initial intended purpose Transcends people and time – Offers data permanence – Creates institutional memory Advertises an organization’s research – Creates possible new partnerships and collaborations through data sharing

48 of 53 Summary (1) Metadata is documentation of data A metadata record captures critical information about the content of a dataset – e.g., spatial and temporal support, spacing, extent Metadata allows data to be discovered, accessed, and re-used

49 of 53 Summary (2) Metadata standards provide structure and consistency to data documentation Standards and tools vary – Select according to defined criteria such as data type, organizational guidance, and available resources Metadata is of critical importance to data developers, data users, and organizations

50 of 53 References
Michener, W.K. (2006). Meta-information concepts for ecological data management, Ecological Informatics, 1(1), 3-7.
Michener, W.K., J.W. Brunt, J.J. Helly, T.B. Kirchner, S.G. Stafford (1997). Nongeospatial metadata for the ecological sciences, Ecological Applications, 7(1).
Blöschl, G. (1996). Scale and Scaling in Hydrology, Habilitationsschrift, Wiener Mitteilungen Wasser Abwasser Gewässer, Wien, 346 p.
Credits: Many ideas and some slides in this presentation were taken from: Henkel, H., V. Hutchison, S. Strasser, S. Rebich Hespanha, K. Vanderbilt, L. Wayne (2012). DataONE education modules, DataONE Project, University of New Mexico, Albuquerque, NM.

51 of 53 Assignment 1. Metadata and the Data Life Cycle Your employer is developing a hydrologic model for the Little Bear River in Cache Valley and wants to model the impact of changes in land cover on hydrology in this watershed between 2002 and Your boss has asked you whether s/he can use the United States Geological Survey (USGS) National Land Cover Dataset (available for 1992, 2001, and 2006) in the study.

52 of 53 National Land Cover Dataset GIS gridded data product Nation-wide coverage Data available for 1992, 2001, 2006 Vegetation/land cover types Used for model inputs and parameterization

53 of 53 For your recommendation, consider: 1. What do the data represent? 2. How were the data created, collected, and/or observed? 3. What was the source of the data? 4. What is the format or syntax of the data? 5. What manipulations, transformations, or derivations have been performed to produce the data? 6. What are the spatial and temporal support, spacing, and extent for these datasets? 7. What are appropriate uses for the dataset that you have selected? 8. What are the limitations to the data? 9. Are there differences in the way the data for the different years were produced that make them incompatible?