GRAD 521, Research Data Management Winter 2014 – Lecture 9 Amanda L. Whitmire, Asst. Professor.


1 GRAD 521, Research Data Management Winter 2014 – Lecture 9 Amanda L. Whitmire, Asst. Professor

2 Lesson topics: 1. Definition of metadata 2. Examine information included in a metadata record 3. Examples of metadata standards and how to choose 4. Illustrate the value of metadata to data users, data providers, and organizations 5. Describe the utility of metadata for a variety of scenarios beyond discovery

3 The data lifecycle

4 Data collection CC image by Justin See on Flickr CC image by CIMMYT on Flickr CC image by acordova on Flickr CC image by kukkurovaca on Flickr CC image by SEDAC on Flickr CC image by ISAS on Flickr

5 From field notes to datasets
Average temperature of observation for each species
Species | Average Temperature | Temperature Standard Deviation | Number of Observations | Minimum Temperature | Maximum Temperature
Northern Red-legged Frog | 4.4 | --- | 1 | 4.4 |
Tailed Frog | 7.0 | 3.0 | 3 | 4 | 10
Arizona Toad | 10.0 | --- | 1 | 10 |
Strecker's Chorus Frog | 10.5 | 2.0 | 11 | 9 | 16
Oregon Spotted Frog | 11.0 | 15.5 | 2 | 0 | 22
New Jersey Chorus Frog | 11.5 | 4.5 | 17 | 3 | 22
Wood Frog | 12.5 | 5.5 | 897 | 0 | 28.8
Spring Peeper | 13.2 | 5.6 | 56 | 9 | 32
Red-legged Frog | 13.3 | 5.9 | 16 | 4 | 27
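
The step from field notes to a summary table like this one can be sketched in a few lines of Python. This is only an illustration: the observations in `field_notes` are hypothetical values, not the lecture's data, and `summarize` is a made-up helper name. It also shows why a species with a single observation has no standard deviation (the "---" entries in the table).

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical field notes: (species, temperature at time of observation).
field_notes = [
    ("Tailed Frog", 4.0),
    ("Tailed Frog", 7.0),
    ("Tailed Frog", 10.0),
    ("Arizona Toad", 10.0),
]


def summarize(notes):
    """Group observations by species and compute the table's five columns."""
    by_species = defaultdict(list)
    for species, temperature in notes:
        by_species[species].append(temperature)

    summary = {}
    for species, temps in by_species.items():
        summary[species] = {
            "average": mean(temps),
            # A sample standard deviation needs at least two observations.
            "std_dev": stdev(temps) if len(temps) > 1 else None,
            "n": len(temps),
            "minimum": min(temps),
            "maximum": max(temps),
        }
    return summary


summary = summarize(field_notes)
for species, row in summary.items():
    print(species, row)
```

Without accompanying metadata, a reader of the resulting table cannot tell the temperature units, the instrument, or the observation period, which is the gap the rest of the lecture addresses.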

6 From datasets to published papers CC image by Heather Kennedy on Flickr

7 Working with data: When you provide data to someone else, what types of information would you want to include with the data? When you receive a dataset from an external source, what types of details do you want to know about the data?

8 Working with data Providing data: Why were the data created? What limitations, if any, do the data have? What do the data mean? How should the data be cited if they are re-used in a new study? Receiving data: What are the data gaps? What processes were used for creating the data? Are there any fees associated with the data? In what scale were the data created? What do the values in the tables mean? What software do I need in order to read the data? What projection are the data in? Can I give these data to someone else?

9 What is metadata? “Data about data” “Structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.” NISO, Understanding Metadata

10 Metadata “The metadata accompanying your data should be written for a user 20 years into the future -- what does that person need to know to use your data properly? Prepare the metadata for a user who is unfamiliar with your project, methods, or observations.” Oak Ridge National Laboratory Distributed Active Archive Center for Biogeochemical Dynamics (ORNL DAAC)

11 What is metadata? Metadata is: Data ‘reporting’ WHO created the data? WHAT is the content of the data? WHEN were the data created? WHERE is it geographically? HOW were the data developed? WHY were the data developed? Photo by Michelle Chang. All Rights Reserved
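
One way to make the six "reporting" questions concrete is to treat them as fields of a record and check which ones are still unanswered. The sketch below is a minimal illustration under assumptions: `MetadataRecord`, its field names, and the example values are all hypothetical and do not come from any metadata standard.

```python
from dataclasses import dataclass, asdict


@dataclass
class MetadataRecord:
    """Hypothetical record with one field per 'reporting' question."""
    who: str    # WHO created the data?
    what: str   # WHAT is the content of the data?
    when: str   # WHEN were the data created?
    where: str  # WHERE is it geographically?
    how: str    # HOW were the data developed?
    why: str    # WHY were the data developed?

    def missing(self) -> list[str]:
        """Return the names of the questions that have no answer yet."""
        return [name for name, value in asdict(self).items() if not value.strip()]


# Example values are placeholders for illustration only.
record = MetadataRecord(
    who="Example Lab, Example University",
    what="Air temperature at each frog observation",
    when="2013-03-01/2013-06-30",
    where="",
    how="Handheld thermometer, recorded in field notebooks",
    why="Relate species activity to temperature",
)

print(record.missing())  # ['where']
```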

12 Levels of metadata PROJECT LEVEL Descriptive information DATA LEVEL Granular information

13 Metadata in real life You use it all the time…

14 Metadata standards: Dublin Core (DC), Darwin Core (DwC), EML, DDI, NBII, FGDC/CSDGM, ISO 19139, ISO 19115, DIF, LDIF, e-GMS, AGLS, METS, MODS, PREMIS, OAI-PMH, MARC, CDWA, CIDOC/CRM, DACS, DIG35, GILS, GML, ISBD, LCSH, KML, MARCXML, MEI, MIX, OAIS, ANSI/NISO Z39.88, PB Core, PRISM, QDC, RDF, SGML, VSO, XML, XMP

15 What is a metadata standard? A Standard provides a structure to describe data with: o Common terms to allow consistency between records o Common definitions for easier interpretation o Common language for ease of communication o Common structure to quickly locate information In search and retrieval, standards provide: o Documentation structure in a reliable and predictable format for computer interpretation o A uniform summary description of the dataset CC image by ccarlstead on Flickr
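
As an illustration of "common terms" and "common structure", the sketch below writes a few Dublin Core elements as XML using Python's standard library. The `dc` namespace URI is the Dublin Core element set namespace; the `metadata` wrapper element, the `dublin_core_record` helper, and the `subject` value are assumptions made for this example, while the title and publisher reuse a dataset named later in the lecture.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict[str, str]) -> ET.Element:
    """Build an XML element holding one dc:* child per field.

    The 'metadata' wrapper element is a hypothetical container chosen
    for this sketch; real repositories define their own wrappers.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


record = dublin_core_record({
    "title": "Mountain hemlock tree-ring width chronologies from the "
             "western Oregon Cascade Mountains",
    "publisher": "USFS Research Data Archive",
    "subject": "tree rings",  # placeholder keyword, assumed for illustration
})

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because every record uses the same element names, a program can locate the title or publisher of any record without knowing who wrote it.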

16 What does a metadata record look like? Ocean Currents and Biogeochemistry: Nearshore Water Profiles (Monthly CTD and Chemistry; SBC-LTER) [web link]; New York City Community Health Survey, 2009 (ICPSR) [web link]; Mountain hemlock tree-ring width chronologies from the western Oregon Cascade Mountains (USFS Research Data Archive) [web link]

17 Muddiest point… What did you find unclear about the concept of metadata?

18 Concerns about creating metadata: Even if the value of data documentation is recognized, concerns remain as to the effort required to create metadata that effectively describe the data.

19 Concern | Solution
workload required to capture accurate, robust metadata | incorporate metadata creation into data development process – distribute the effort
time and resources to create, manage, and maintain metadata | include in grant budget and schedule
readability / usability of metadata | use a standardized metadata format
discipline specific information and ontologies | ‘profile’ standard to require specific information and use specific values
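
The idea of a 'profile' that requires specific information and restricts some fields to specific values can be sketched as a small validation routine. Everything named here (`PROFILE`, its field names, the allowed `access` values, and `check_against_profile`) is hypothetical and only illustrates the concept.

```python
# Hypothetical profile: which fields are mandatory and which values are allowed.
PROFILE = {
    "required": ["title", "creator", "date", "keywords"],
    "allowed_values": {"access": {"public", "restricted"}},
}


def check_against_profile(record: dict, profile: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the record conforms."""
    problems = []
    for field in profile["required"]:
        if not record.get(field):  # missing key, empty string, or empty list
            problems.append(f"missing required field: {field}")
    for field, allowed in profile["allowed_values"].items():
        value = record.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field}={value!r} is not one of {sorted(allowed)}")
    return problems


draft = {
    "title": "Frog observation temperatures",
    "creator": "",
    "date": "2014-02-01",
    "keywords": ["frogs", "temperature"],
    "access": "open",
}

problems = check_against_profile(draft, PROFILE)
for problem in problems:
    print(problem)
```

Running such a check while the data are being developed spreads the documentation effort out, in line with the first solution in the table.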

20 The value of metadata: Metadata helps… data creators, data users, organizations

21 What is the value to data creators? Metadata allows data creators to: o Avoid data duplication o Share reliable information o Publicize efforts – promote the work of a scientist and his/her contributions to a field of study CC image by US Embassy Guyana on Flickr

22 What is the value to data users? Metadata gives a user the ability to: o Search, retrieve, and evaluate data set information from both inside and outside an organization o Find data: Determine what data exists for a geographic location and/or topic o Determine applicability: Decide if a data set meets a particular need o Discover how to acquire the dataset you identified; process and use the dataset CC image by ASEE on Flickr

23 What is the value to organizations? Metadata helps ensure an organization’s investment in data o Documentation of data processing steps, quality control, definitions, data uses, and restrictions o Ability to use data after initial intended purpose Transcends people & time o Offers data permanence o Creates institutional memory Advertises an organization’s research o Creates possible new partnerships and collaborations through data sharing

24 Information Entropy. Figure (from Michener et al. 1997): DATA DETAILS plotted against TIME, starting at the time of data development. Annotations: specific details about problems with individual items or specific dates are lost relatively rapidly; general details about datasets are lost through time; accident or technology change may make data unusable; retirement or career change makes access to “mental storage” difficult or unlikely; loss of data developer leads to loss of remaining information.

25 Information Entropy. Figure: DATA DETAILS plotted against TIME. Sound information management, including metadata development, can arrest the loss of dataset detail.

26 A closer look: the utility of metadata Metadata can support: o data distribution o data management o [project management] If it is: o considered a component of the data o created during data development o populated with rich content Diagram labels: collect, derive, classify, planimetric, imagery, analysis, alternative, committee review, plan, charette, meta

27 Data distribution via metadata: data discovery, metadata publication, data portals

28 Distribution: data discovery The descriptive content of the metadata file can be used to identify, assess, and access available data resources: online access, order process, contacts, use constraints, access constraints, data quality, availability/pricing, keywords, geographic location, time period, attributes
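
A discovery query over keywords, geographic location, and time period can be sketched as a filter over a small in-memory catalog. This is an assumption-laden illustration: the `Record` fields, the `discover` function, and both catalog entries are hypothetical, and real portals use dedicated search services rather than a Python list.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    """Hypothetical metadata record holding only the fields used for discovery."""
    title: str
    keywords: set[str]
    bbox: tuple[float, float, float, float]  # west, south, east, north (decimal degrees)
    start: date
    end: date


def overlaps_bbox(a, b) -> bool:
    """True if two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def discover(records, keyword=None, bbox=None, period=None):
    """Return records matching every criterion that was supplied."""
    hits = []
    for r in records:
        if keyword and keyword.lower() not in {k.lower() for k in r.keywords}:
            continue
        if bbox and not overlaps_bbox(r.bbox, bbox):
            continue
        if period and not (r.start <= period[1] and period[0] <= r.end):
            continue
        hits.append(r)
    return hits


# Placeholder catalog entries, invented for this example.
catalog = [
    Record("Example stream temperature survey", {"temperature", "streams"},
           (-124.0, 44.0, -122.0, 45.5), date(2009, 1, 1), date(2011, 12, 31)),
    Record("Example coastal CTD profiles", {"temperature", "salinity"},
           (-120.5, 34.0, -119.5, 34.6), date(2012, 6, 1), date(2013, 5, 31)),
]

for hit in discover(catalog, keyword="temperature", bbox=(-125.0, 43.0, -123.0, 46.0)):
    print(hit.title)
```

None of these queries would be possible if keywords, extent, and time period had not been recorded in the metadata in the first place.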

29 Distribution: metadata publication A metadata collection can be published to the internet via: website catalog, web accessible folder (WAF), Z39.50 metadata clearinghouse, metadata service, geospatial data portal. Diagram labels: User Query, Internet, Metadata Collection, Internet / Intranet, Dataset

30 Distribution: data portals Examples of metadata search portals: Data.gov, Federal e-gov geospatial data portal: http://www.geo.data.gov; Metacat, repository for data and metadata: http://knb.ecoinformatics.org/index.jsp; US Geological Survey, USGS Core Science Metadata Clearinghouse: http://mercury.ornl.gov/clearinghouse; ICPSR, Political and Social Science data portal

31 Data management via metadata: Data Accountability, Discovery & Re-use, Maintenance & Update, Data Liability

32 Management: maintenance & update Metadata records can be used to track data provenance and accuracy. Data Maintenance: Are the data current? o Do we have data older than ten years? o Were the data created before some political or geophysical event that resulted in significant change? Are the data valid? o Were they created prior to the most current source data? o Were they created prior to the most current methodologies? Data Update: o Contact information o Distribution policies, availability, pricing, URLs o New derivations of the dataset
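
The "older than ten years" maintenance question can be answered automatically when each record carries a date. The sketch below assumes a hypothetical `last_updated` date per dataset and a made-up helper, `needs_review`; the ten-year threshold comes from the slide, everything else is illustrative.

```python
from datetime import date


def needs_review(last_updated: date, today: date, max_age_years: int = 10) -> bool:
    """True if the dataset was last updated more than max_age_years before today."""
    try:
        cutoff = today.replace(year=today.year - max_age_years)
    except ValueError:
        # today is 29 February and the cutoff year is not a leap year
        cutoff = today.replace(year=today.year - max_age_years, day=28)
    return last_updated < cutoff


# Hypothetical inventory: dataset name -> date taken from its metadata record.
inventory = {
    "county_roads": date(2003, 1, 15),
    "stream_gauges": date(2010, 7, 1),
}

today = date(2014, 2, 1)
stale = [name for name, updated in inventory.items() if needs_review(updated, today)]
print(stale)  # ['county_roads']
```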

33 Discovery: data reuse If you create metadata, other people can discover your data If you create metadata, you can find your own data CC image by Oceanit Daily Photo on Flickr

34 Management: data discovery & reuse Find your data by: o themes / attributes o geographic location o time ranges o analytical methods used o sources & contributors o data quality Discoverable data is usable data! CC image by NASA Goddard Space Flight Center on Flickr

35 Management: data accountability Metadata allows you to repeat a scientific process if: o methodologies are defined o variables are defined o analytical parameters are defined Metadata allows you to defend your scientific process: o demonstrate process o an increasingly GIS-savvy public requires metadata for consumer information Diagram labels: INPUT, RESULTS

36 Management: data accountability Metadata is an exercise in data accountability. It requires you to assess: What do you know about the dataset? What don’t you know about the dataset? What should you know about the dataset? Are you willing to associate yourself with the metadata record?

37 Management: data liability Metadata is a declaration of: Purpose o the originator’s intended application of the data Use Constraints o inappropriate applications of the data Completeness o features or geographies excluded from the data Distribution Liability o explicit liability of the data producer and assumed liability of the consumer What to do… What not to do…

38 Review: the utility of metadata Metadata can support: Data distribution o discovery o metadata publication o data portals Data management o maintenance & update o discovery & reuse o data accountability o data liability [Project management]

39 Choosing Metadata Standards Image courtesy of Viv Hutchinson

40 Multiple standards exist: Darwin Core | biological diversity, taxonomy; Dublin Core | general; DDI (Data Documentation Initiative) | social & behavioral sci.; DIF (Directory Interchange Format) | environmental sci.; EML (Ecological Metadata Language) | ecology, biology; ISO 19115 | geographic data. Browse by discipline: http://www.dcc.ac.uk/resources/metadata-standards

41 Comparing metadata standards
EML | FGDC
Title | Title
Abstract | Abstract
Entity Description | Entity Type Definition
Intellectual Rights | Use Constraints
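
Because the two standards collect similar information under different names, a record can be translated with a simple crosswalk. The mapping below uses the human-readable labels from the comparison above, not the exact XML element names of either standard; the `crosswalk` helper and the example record are hypothetical.

```python
# Crosswalk built from the slide's comparison (labels, not schema element names).
EML_TO_FGDC = {
    "Title": "Title",
    "Abstract": "Abstract",
    "Entity Description": "Entity Type Definition",
    "Intellectual Rights": "Use Constraints",
}


def crosswalk(record: dict, mapping: dict):
    """Rename fields using the mapping; report fields with no counterpart."""
    converted, unmapped = {}, []
    for field, value in record.items():
        if field in mapping:
            converted[mapping[field]] = value
        else:
            unmapped.append(field)
    return converted, unmapped


# Placeholder record using EML-style labels.
eml_style = {
    "Title": "Example frog observation temperatures",
    "Intellectual Rights": "Cite the data creator when re-using",
    "Sampling Notes": "Opportunistic observations",  # no entry in the crosswalk
}

fgdc_style, unmapped = crosswalk(eml_style, EML_TO_FGDC)
print(fgdc_style)
print(unmapped)  # ['Sampling Notes']
```

Fields that end up in `unmapped` are the ones to examine when deciding whether a target standard fits the data.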

42 Choosing a metadata standard Many standards collect similar information. Factors to consider: 1. Your data type: raster/vector GIS data, images, surveys/text, etc. 2. Organization [funder] policies 3. Future preservation/sharing location 4. Tools to support creation & distribution 5. Other factors: availability of human support; instructional materials; use of controlled vocabularies; output formats

43 Summary o Metadata is documentation of data o A metadata record captures critical information about the content of a dataset o Metadata allows data to be discovered, accessed, and re-used o A metadata standard provides structure and consistency to data documentation o Standards and tools vary – select according to defined criteria such as data type, organizational guidance, and available resources o Metadata is of critical importance to data developers, data users, and organizations o Metadata can be effectively used for: data distribution, data management, project management o Metadata completes a dataset. Creating robust metadata is in your OWN best interest!

44 On Thursday: Barnard Classroom, 5th Floor

