Metadata, an Introduction 09/09/99 Patty Frontiera * Slides will be made available online
Metadata, an Introduction Objectives for this lecture: What is metadata? Why is it important? Efforts to standardize metadata: –FGDC –NSDI –CSDGM How would you use or create metadata?
Metadata, an Introduction What is Metadata: –Data about data What is Meant by Data? –Broad sense: any information resource or object –Here: geospatial data, in digital format
Metadata, an Introduction Familiar Metadata: The bibliographic catalog record for a book Author: Aronoff, Stanley. Title: Geographic information systems : a management perspective / Stan Aronoff. Ottawa : WDL Publications, c1989. xvi, 294 p. : ill., maps ; 25 cm. Notes: Includes bibliographical references. Subjects: Geography--Data processing. Information storage and retrieval systems--Geography.
Metadata, an Introduction Familiar Metadata for Geospatial Data A Map Legend –Who made the map –Date of Map (reference date) –Scale of the map –Description
Metadata, an Introduction Geospatial Metadata “...data about spatial data...” identifies and describes datasets, coverages, images, etc. provides information about data quality, lineage, source materials, spatial reference, subject themes contains the “data dictionary” defining attributes and relationships
Metadata, an Introduction Simple Metadata for Geospatial Data Originator:REGIS, UC Berkeley Title: Roads in Alameda County Date Created: 10/20/97 Ground Date:11/06/94 Filename: rds197.shp Filesize: 1MB Fileformat: ArcView Shapefile Source Scale: 1:24K Projection/Coordinate Info: UTM Zone 10
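As a minimal sketch, the simple record above could be held in a small Python structure so that it can be searched and reused. The class name, field names, and the `matches` helper are illustrative assumptions, not part of any metadata standard; the values are the ones shown on the slide.

```python
from dataclasses import dataclass, asdict


@dataclass
class SimpleMetadata:
    # Field names are hypothetical; they mirror the labels on the slide.
    originator: str
    title: str
    date_created: str
    ground_date: str
    filename: str
    filesize: str
    fileformat: str
    source_scale: str
    projection: str


record = SimpleMetadata(
    originator="REGIS, UC Berkeley",
    title="Roads in Alameda County",
    date_created="10/20/97",
    ground_date="11/06/94",
    filename="rds197.shp",
    filesize="1MB",
    fileformat="ArcView Shapefile",
    source_scale="1:24K",
    projection="UTM Zone 10",
)


def matches(rec, term):
    """Return True if the term occurs in any field (case-insensitive)."""
    term = term.lower()
    return any(term in str(value).lower() for value in asdict(rec).values())
```

A data seeker could then call `matches(record, "alameda")` to decide whether this dataset is worth a closer look.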
Metadata, an Introduction Objectives for Metadata Identification - inventory data holdings; facilitate browsing/searching for relevant information Evaluation - determining “fitness for use” based on application requirements Interpretation - extracting and utilizing data correctly in terms of schema, accuracy/precision, reference
Metadata, an Introduction Why Create / Use Metadata? The Data Creator: –Create metadata to document data; to record data processing information; to facilitate re-use of data The Data Manager: –Wants to know what data has been created and how. –Wants an inventory of existing data (to avoid costly duplication, to advertise, to promote) –Wants documentation of data to facilitate re-use. The Data Seeker: –Metadata provides text fields and info that can be searched –Metadata helps evaluate usefulness of / quality of data found
Metadata, an Introduction Standardizing Metadata GIS data expensive to create. Want to protect that investment. Large organizations that create/use GIS data have greatest stake in creating / obtaining metadata. The biggest of these organizations is the US Government. Recognizing the need to promote coordinated development, use, sharing, and dissemination of geospatial data, the Federal Geographic Data Committee (FGDC) was formed in 1990.
Metadata, an Introduction The FGDC - Federal Geographic Data Committee An organization of representatives from 16 Federal agencies (DOA, DOC, DOD, EPA, FEMA, HUD, LOC, NASA, ...) Works in cooperation with organizations from state, local and tribal governments, the academic community, and the private sector Coordinates the development of the National Spatial Data Infrastructure (NSDI).
Metadata, an Introduction The NSDI - National Spatial Data Infrastructure Policies, standards, and procedures for organizations to cooperatively produce and share geographic data. The 16 federal agencies that make up the FGDC are developing the NSDI in cooperation. Recognition at the highest level of the importance of geospatial data.
Metadata, an Introduction The FGDC and Metadata Recognized the need for formal metadata. From FGDC worked on metadata issues and developed a standard by which geospatial metadata should be documented. The FGDC Content Standard for Digital Geospatial Metadata The CSDGM
Metadata, an Introduction FGDC Metadata Defined by Executive Order in April 1994 as formal format for Federal use To be applied to all new federal data sets, effective January 1995; all legacy data on a schedule Current Version: CSDGM Version
Metadata, an Introduction What’s in the FGDC Metadata Standard? The standard establishes a common set of terminology and definitions for concepts related to metadata, including: –the names of data elements to be used (e.g. title, originator, progress, ...) –definitions of these elements (progress = state of data set) –information about valid values for these elements (progress = completed or in work or planned) The standard contains approximately 300 data elements, many of which can be repeated (multiple keywords, for example), organized into 7 main sections.
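The three things the standard specifies for each element (a name, a definition, and valid values) can be sketched as a small lookup table with a validator. This is an illustration only: the `ELEMENTS` table, the `keyword` element name, and every definition other than the one for progress are assumptions made for the example, and the progress values are the ones quoted on the slide rather than the official wording of the standard.

```python
# Hypothetical element table: name -> definition, allowed values, repeatability.
ELEMENTS = {
    "title": {
        "definition": "name of the data set (illustrative wording)",
        "domain": None,          # free text
        "repeatable": False,
    },
    "progress": {
        "definition": "state of data set",
        "domain": {"completed", "in work", "planned"},
        "repeatable": False,
    },
    "keyword": {
        "definition": "word describing the subject (illustrative wording)",
        "domain": None,
        "repeatable": True,      # e.g. multiple keywords
    },
}


def validate(record):
    """Return a list of problems found in a {element: value or list} record."""
    errors = []
    for name, value in record.items():
        spec = ELEMENTS.get(name)
        if spec is None:
            errors.append(f"{name}: not a defined element")
            continue
        values = value if isinstance(value, list) else [value]
        if len(values) > 1 and not spec["repeatable"]:
            errors.append(f"{name}: element cannot be repeated")
        if spec["domain"] is not None:
            for v in values:
                if v not in spec["domain"]:
                    errors.append(f"{name}: '{v}' is not a valid value")
    return errors
```

For example, `validate({"progress": "done"})` reports one problem, because "done" is not among the listed values.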
Metadata, an Introduction The 7 Sections of the CSDGM: Identification: –Title, Extent, Purpose/Abstract, Keywords, Currentness, Use Restrictions Data Quality: –Accuracy, Completeness, Lineage, Source Materials Spatial Data Organization: –Point, Vector, or Raster data Spatial Reference: –Coordinate System, Projection Information Entity and Attribute Information: –Features, Attributes, Possible Attribute Values (data dictionary) Distribution Information: –How to obtain data, formats available, fees Reference: –Who made the metadata record? In accordance with what standard?
Metadata, an Introduction FGDC Metadata Standard The size of the FGDC Metadata Content Standard reflects the diversity of the users of the Standard (different users have different needs for documenting their datasets). Only Sections 1 and 7 need to be completed in order to have a metadata record that meets the minimum FGDC metadata standard requirements.
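A minimal sketch of the "Sections 1 and 7" rule, assuming the seven sections are numbered in the order they are listed in this lecture and that a record is a plain dictionary keyed by section name. Both the numbering and the record layout are assumptions for illustration; a real check would follow the standard's own element-level rules.

```python
# Section names as given in this lecture, numbered in listing order (assumed).
SECTIONS = {
    1: "Identification",
    2: "Data Quality",
    3: "Spatial Data Organization",
    4: "Spatial Reference",
    5: "Entity and Attribute Information",
    6: "Distribution Information",
    7: "Reference",
}
REQUIRED = (1, 7)


def missing_required(record):
    """Return the names of required sections that are absent or empty."""
    return [SECTIONS[n] for n in REQUIRED if not record.get(SECTIONS[n])]


minimal_record = {
    "Identification": {"Title": "Roads in Alameda County"},
    "Reference": {"Standard": "CSDGM"},
}
```

Here `missing_required(minimal_record)` returns an empty list, while a record holding only Data Quality information would be reported as missing both required sections.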
Metadata, an Introduction FGDC Metadata Standard - Recent News FGDC participating in the ISO Metadata Standard Committee Essential Metadata: 1. Title 2. Abstract 3. Bounding Geospatial Coordinates 4. Metadata Language 5. Metadata Characterset 6. Dataset Language 7. Dataset Characterset 8. Responsible Party Name 9. Responsible Party Address (or Phone or Postal) 10. Themecode 11. Reference Date
Metadata, an Introduction What the FGDC Metadata Standard Does: Details a format for creating a metadata record for a geospatial data set. What it Doesn’t Do: –Doesn't specify how to collect metadata or organize your metadata –Doesn't impose a standard for data storage or transfer –Doesn't specify how to present or communicate metadata
Metadata, an Introduction Why Should you care about the CSDGM? –if you want to use or participate in the NSDI –as a method for learning about metadata –soon to be an ISO standard –more and more automated / commercial metadata tools implement this standard (e.g., ArcInfo document.aml)
Metadata, an Introduction Should you implement it if you do Metadata? –What are the alternatives? –Cost / benefit analysis Can you afford to? Can you afford not to? How to implement it if you choose to? –Tools and information See the FGDC website Existing free software tools too specific or too general Commercial software is available
Metadata, an Introduction How much metadata is enough? Metadata for internal documentation should be gathered at a level that meets local needs and budget. Basic documentation will suffice for discovery of information (inventory/search) Detailed documentation desired to provide end-users with adequate information for re-use
Metadata, an Introduction Metadata Implementation Tips Organizations that use but don’t create GIS Data: Organizational commitment to gathering metadata Gain familiarity with the FGDC Metadata Standard Try to obtain metadata in as complete compliance with the FGDC Standard as possible Obtain metadata in digital form Consider how you might want to store and use metadata so you can collect it in a form that facilitates these goals
Metadata, an Introduction Organizations that Create GIS Data Think about how your organization will want to use metadata Determine min. to max. levels of metadata collection that will meet your needs and budget Inventory your data holdings and determine what data will be catalogued (only newly created data? legacy data?) Select software for metadata creation and storage that meet your local needs, budget, staffing Don’t re-invent the wheel (see/use what’s out there)
Metadata, an Introduction The CSDGM is just one of the building blocks of the NSDI
Metadata, an Introduction Building Blocks of the NSDI Standards: Such as the CSDGM Framework: Identify data themes and formats for framework datasets to be created and distributed by federal agencies. Partnerships: Managed by the FGDC working with public and private stakeholders in GIS data worldwide. Clearinghouse: Geospatial data is documented in accordance with the CSDGM and made searchable and accessible on the Internet using free software developed by the FGDC.
Metadata, an Introduction The National Geospatial Data Clearinghouse Managed by the FGDC Part of the NSDI Made up of a collection of distributed data+metadata “nodes” on the Internet
Metadata, an Introduction What is the Clearinghouse Model? A clearinghouse provides... –discovery of spatial data –distributed search worldwide –uniform interface for spatial data searches –advertising for your data holdings A clearinghouse operates as… –a server of servers –entry point to constellation of servers –collection of distributed Z39.50 servers –a virtual “Altavista” for geospatial data
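The "server of servers" idea can be sketched as a gateway that forwards one query to every node and merges the answers. This is an in-memory simulation only: it does not speak Z39.50, and the `Node` and `Gateway` classes, the node contents, and the record titles are hypothetical stand-ins chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    """Stand-in for one clearinghouse node holding its own metadata records."""

    def __init__(self, name, records):
        self.name = name
        self.records = records

    def search(self, term):
        term = term.lower()
        return [r for r in self.records if term in r["title"].lower()]


class Gateway:
    """Single entry point that fans a query out to all registered nodes."""

    def __init__(self, nodes):
        self.nodes = nodes

    def search(self, term):
        with ThreadPoolExecutor() as pool:
            # pool.map preserves node order, so results are deterministic.
            per_node = pool.map(lambda n: (n.name, n.search(term)), self.nodes)
            return [(name, r["title"]) for name, hits in per_node for r in hits]


# Hypothetical node contents, for illustration only.
gateway = Gateway([
    Node("NOAA", [{"title": "Bay Area Shoreline"}]),
    Node("California", [
        {"title": "Roads in Alameda County"},
        {"title": "Alameda County Streams"},
    ]),
])
```

The user issues a single search against the gateway and sees which node holds each matching record, which is the "uniform interface" and "distributed search" described above.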
Metadata, an Introduction [Diagram: a Web client connects through a gateway to the Clearinghouse “nodes” or servers (NOAA, California, USGS NMD, NRCS); all of this together is the “Clearinghouse”]
Metadata, an Introduction Clearinghouse approach [Diagram: Client, Metadata, Spatial Data Set] –Use international voluntary-consensus standards –Develop free reference implementations and software for public and commercial re-use –Promote a common vocabulary for geospatial data discovery on the Internet
The Clearinghouse Architecture Distributed data producers and users Heterogeneous data Key components: –Data documentation (metadata in CSDGM format) –Networking (Internet) –Serving, searching, and accessing software Z39.50 Search and Retrieve Protocol WWW - World Wide Web
Metadata, an Introduction Current Clearinghouse Status Over 100 registered sites--state, national and global collections FGDC supporting work on Z39.50 server and client solutions using Web and Java Training available from FGDC Tutorial materials available on-line
Metadata, an Introduction REGIS, UC Berkeley Clearinghouse Node First NGDC Node in California Describes the spatial dataset developed for NOAA/BCDC Bay Area GIS Demonstration Project Provides entry point to UC Berkeley REGIS collection
Metadata, an Introduction SF Bay Metadata at REGIS Inventoried our datasets / focused on data being created for a related project Reviewed existing software and info Found one (FGDC Metadata Entry System) & rewrote to meet our needs Ongoing effort to create metadata, especially for legacy data Created an NSDI Clearinghouse Node to make metadata accessible via the Internet
Metadata, an Introduction Creating a Clearinghouse Node 1. Create metadata in CSDGM format 2. Use USGS MP software to: –proof metadata –create text, html, sgml formats of the metadata 3. Use the CNIDR software Isite to index the metadata records 4. Use the CNIDR server software Iserve to serve the metadata on the Internet 5. Notify the FGDC that node is up and searchable If possible, get an FGDC grant to help finance project!
Metadata, an Introduction Clearinghouse Node Requirements: A multi-user host computer (UNIX or Windows-NT) Full-time Internet connectivity at 56 kbps or greater Z39.50 server software (e.g. Isite) Metadata in CSDGM format, one set (text, SGML, and HTML files) per dataset
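The "one set per dataset" requirement lends itself to a quick check before indexing. The sketch below assumes that the three formats are stored as files sharing a base name with the extensions `.txt`, `.sgml`, and `.html`; that naming convention and the `incomplete_sets` helper are assumptions for illustration, not something the Clearinghouse software prescribes.

```python
from pathlib import PurePath

# Assumed file extensions for the text, SGML, and HTML versions.
REQUIRED_SUFFIXES = {".txt", ".sgml", ".html"}


def incomplete_sets(filenames):
    """Map each dataset base name to the formats it is still missing."""
    found = {}
    for name in filenames:
        path = PurePath(name)
        found.setdefault(path.stem, set()).add(path.suffix.lower())
    return {
        stem: sorted(REQUIRED_SUFFIXES - suffixes)
        for stem, suffixes in found.items()
        if REQUIRED_SUFFIXES - suffixes
    }
```

Given a listing such as `["rds197.txt", "rds197.sgml", "rds197.html", "parcels.txt"]`, only `parcels` is reported, along with the formats it lacks.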
Metadata, an Introduction California Environmental Data Catalog Combined federal and state government / non-profit effort: –CERES - California Environmental Resource Evaluation System –CGIA - California Geographic Information Association –FGDC More than just GIS data: spatial and non-spatial data, documents, programs, and other resources Browse and Search mechanism Users can catalog their own collections online –design work has already been done –supporting many sites and collection types –will link to NSDI Clearinghouse