Successful Data Sharing Part I (Publishing Great Metadata) Tanya Haddad, Oregon Coastal Management Program Anna Verrill, GISP, NOAA Office for Coastal Management Prepared for: West Coast Governor’s Alliance Network Meeting November 3, 2014
Overview Data Sharing Metadata – Types – Training & Resources – Tools Metadata Workflows – Metadata best practices – Editing Tools and Tips Sharing and publishing Implementing at your organization – Catalog overview – Available software – Levels of sharing – Connecting to communities Helpful Stuff For useful resources available externally, pay attention to the links in blue boxes, e.g.: /resources/
Data Sharing So you want to share your geospatial data: Do you know all your customers? Can you predict what they will all want, now and in the future? Will you always be around to answer questions? If you can’t answer yes to all of these questions, then your sharing system needs to be flexible and reusable, and you should take steps to make it easy for users to find what is relevant to them (when you are not around).
Metadata How is metadata relevant to Data Sharing? Most flexible data sharing systems are built upon some form of metadata Metadata contains information that helps a user understand the contents of a data set, compare similar data sets, decide which data fit their needs, etc. Metadata helps users “discover” your data
Types of Metadata Human readable – MS Word, Adobe PDF, HTML pages – Project reports, Grant reports, Journal articles Machine readable – Software generated – XML or JSON Guess which of these is easiest to search and compare across many projects? cc-by-sa Aikzhobi
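To illustrate why machine readable metadata is easier to search and compare, here is a minimal sketch in Python. The records, field names (`title`, `keywords`, `year`), and the helper `find_by_keyword` are all hypothetical and not taken from any metadata standard; the point is only that structured records can be filtered with a few lines of code, which is not true of a folder of PDF reports.

```python
import json

# Hypothetical records; the field names are illustrative assumptions,
# not part of Dublin Core, CSDGM, or ISO.
RECORDS_JSON = """
[
  {"title": "Oregon Estuary Habitat", "keywords": ["estuary", "habitat"], "year": 2012},
  {"title": "Coastal Erosion Transects", "keywords": ["erosion", "shoreline"], "year": 2009},
  {"title": "Eelgrass Survey", "keywords": ["habitat", "eelgrass"], "year": 2014}
]
"""


def find_by_keyword(records, keyword):
    """Return titles of records whose keyword list contains `keyword` (case-insensitive)."""
    wanted = keyword.lower()
    return [r["title"] for r in records
            if wanted in [k.lower() for k in r["keywords"]]]


records = json.loads(RECORDS_JSON)
habitat_titles = find_by_keyword(records, "Habitat")
print(habitat_titles)
```

The same search against human readable documents would require opening and reading each one.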
Machine Readable? T-shirt Template: Encodings Content
Metadata Complexity Metadata can be brief, or very verbose Different metadata standards: – Dublin Core Schema – FGDC Content Standard for Digital Geospatial Metadata – International Organization for Standardization 19139, 19115, : Geographic information - Metadata Most geospatial metadata you encounter will be encoded as XML Metadata Basics Christine White provides a nice overview in her Metadata Basics unit: meetings/ocmdnV/OCMDN_V_Part_I_Intro_to_Data_Catalogs_Slides.pdf
Dublin Core (1995) A small set of vocabulary terms that can be used to describe web resources (video, images, web pages, etc.), as well as physical resources such as books or CDs, and objects like artworks Title Creator Subject Description Publisher Contributor Date Type Format Identifier Source Language Relation Coverage Rights
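As a rough sketch of what a small Dublin Core record can look like when encoded as XML, the Python below builds one with the standard library. The namespace URI is the Dublin Core element set namespace; the wrapper element name `metadata`, the helper `build_dc_record`, and all field values are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set (element names are lowercase).
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build an XML element holding one dc:* child per (name, value) pair.

    The "metadata" wrapper element is an arbitrary choice for this sketch.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Hypothetical values for illustration.
record = build_dc_record({
    "title": "Example Shoreline Dataset",
    "creator": "Example Organization",
    "date": "2014-11-03",
    "format": "Shapefile",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Only four of the fifteen terms are used here; the others can be added to the dictionary in the same way.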
FGDC CSDGM (1998) This standard was developed from the perspective of defining the information required by a prospective user to determine the availability of a set of geospatial data; to determine the fitness of the set of geospatial data for an intended use; to determine the means of accessing the set of geospatial data; and to successfully transfer the set of geospatial data FGDC
ISO – 19139, 19115, (2005) Modular, flexible system Very customizable Depicts relationships between datasets and collection level (parent/child relationships) Standardizes descriptors through the use of codelists Accommodates new technologies (such as documenting services) Accommodates international scope Undergoes revision/review in 5-year cycles XML schema implementation MI_ MD_ FC_ Services SV_ ISO → Core Information ISO → Extensions for Instrumentation and Gridded Data ISO → Entities and Attributes ISO → Services ISO → Data Quality J. Mize
Metadata Resources & Training FGDC has a great “Metadata Quick Guide” for CSDGM: – MetadataQuickGuide.pdf Try googling “Metadata Bob” NOAA’s Jaci Mize provides regular online training for both CSDGM and ISO metadata. Recordings and materials are available online here: – ftp://ftp.ncddc.noaa.gov/pub/Metadata/Online_ISO_Training/Intro_to_CSDGM/ – ftp://ftp.ncddc.noaa.gov/pub/Metadata/Online_ISO_Training/Intro_to_ISO/ NOAA CSC
Metadata Resources & Training NOAA has created workbooks for learning the ISO formats. Each: – parallels the standard, – provides FAQs, – implementation guide data-standards/documents/MD-Metadata.pdf data-standards/documents/MI-Metadata.pdf J. Mize
Metadata Tools ArcMap users all have access to ArcCatalog There are also many other standalone tools for generating metadata: – EPA Metadata Editor – CatMDEdit – ISOMorph – MERMAid – GeoNetwork – GeoPortal For the XML geeks: – XMLSpy – oXygen Metadata Tool Pros and Cons Jaci Mize reviews these tools in her metadata training: ftp://ftp.ncddc.noaa.gov/pub/Metadata/Online_ISO_Training/Intro_to_ISO/presentations/5_ToolsforISOMetadata.pptx
Making a plan for Metadata Plan to document the data you use most and that is most important to your organization Use common sense for guidelines – follow standards as appropriate, but you only need to be as complete as is necessary for the intended purpose: – Discovery, or Documentation, or both, or other? Use templates when possible! Review your work after creating a few records, adjust your processes accordingly Submit some records to a search system and see how your records look Complete metadata = Good Discovery Experience
Best Practices – Use Templates! When you need to create metadata for many items, it helps to streamline the task by creating a metadata template. Like an MS Word document template, a metadata template contains information that will be used again and again. Consider creating a template for your organization to use, and then make your organization template more specific for individual projects. ArcGIS can automatically update properties of an item and any connected metadata template, resulting in much less effort to complete an item's metadata. With metadata templates, you can focus on documenting important information like the sources and quality of your data, and any special processes you performed.
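The idea of an organization template that is made more specific per project can be sketched as layered dictionaries. This is a generic illustration in Python, not a description of how ArcGIS implements templates; the helper `apply_template` and every field name and value are hypothetical.

```python
def apply_template(*layers):
    """Merge metadata layers left to right; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical organization-wide defaults, reused for every record.
org_template = {
    "publisher": "Example Agency",
    "contact_email": "gis@example.org",
    "constraints": "None",
}

# Hypothetical project-level template that refines the organization defaults.
project_template = {
    "theme_keywords": ["shoreline"],
    "constraints": "Not for navigation",
}

# Item-specific information: the part that still has to be written by hand.
item_fields = {
    "title": "Shoreline Change 2014",
    "abstract": "Digitized shoreline positions for the study area.",
}

record = apply_template(org_template, project_template, item_fields)
print(record)
```

Only the item-specific fields need to be filled in for each new dataset; everything else comes from the templates.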
Best Practices – For Good Discovery Certain metadata items are critical for discovery to work well (or at all): Identification Information: – Title – Abstract (Description) – Publication date – Point of Contact Info – Resource URL (If data is downloadable or available as a service) – Website URL – Constraints Location Information: – West Bounding Longitude – East Bounding Longitude – North Bounding Latitude – South Bounding Latitude – Browse Graphic URL Descriptor Information: – Theme Keywords – Resource Description If you do nothing else, try to do these items well!
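A checklist like the one above lends itself to an automated completeness check before records are submitted to a catalog. The following is a minimal sketch in Python under stated assumptions: the record is a plain dictionary, and the key names (`title`, `west`, and so on) and the function `check_record` are hypothetical, not part of any metadata tool.

```python
# Hypothetical key names standing in for the discovery-critical items.
REQUIRED_FIELDS = [
    "title", "abstract", "publication_date", "point_of_contact",
    "constraints", "west", "east", "north", "south", "theme_keywords",
]


def check_record(record):
    """Return a list of problems found in a metadata record (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, "", []):
            problems.append(f"missing: {field}")

    bbox = [record.get(k) for k in ("west", "east", "north", "south")]
    if all(isinstance(v, (int, float)) for v in bbox):
        west, east, north, south = bbox
        # West may exceed east when a box crosses the antimeridian,
        # so only the valid range is checked for longitudes.
        if not (-180 <= west <= 180 and -180 <= east <= 180):
            problems.append("longitude out of range")
        if not (-90 <= south <= north <= 90):
            problems.append("latitude out of range or south > north")
    return problems


good = {
    "title": "Example Estuary Habitat",
    "abstract": "Habitat polygons for an example estuary.",
    "publication_date": "2014-11-03",
    "point_of_contact": "gis@example.org",
    "constraints": "None",
    "west": -124.6, "east": -123.3, "north": 46.3, "south": 42.0,
    "theme_keywords": ["habitat"],
}
bad = dict(good, title="", north=95.0)

print(check_record(good))
print(check_record(bad))
```

Optional items such as the browse graphic URL are left out of the required list here; an organization could add them if its catalog depends on them.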
ArcCatalog – Identification Info Title Abstract Publication date Resource URL Website URL Point of Contact Info Rempel, McCune, OSDL
ArcCatalog – Location Info West Bounding Longitude East Bounding Longitude North Bounding Latitude South Bounding Latitude Browse Graphic URL (you can make a browse graphic however you like, and store it in any web accessible location referenced by URL) Rempel, McCune, OSDL
ArcCatalog – Descriptor Info Theme Keywords – Theme Reference: ISO – Theme Topics Distribution Information – Resource Description: Select “Downloadable Data” if downloadable and add Resource URL Rempel, McCune, OSDL
Sharing & Publishing OK, so you know how to make data, and metadata, now what? Just like any other product, if you want people to use it, you have to share it, and they have to know about it This means licensing! It also means advertising!
Sharing Best Practices Decide you want to share your data (license) Document it with: – Great Titles – Informative Abstracts – Credit to your Organization – Resource URLs – Any Caveats Partner with a friendly existing catalog in your network, or Host your own
Levels of Sharing (Data) – Available on the web, in whatever format (e.g. image scan or PDF), but with an open license – Available as machine readable structured data (e.g. Excel instead of image scan of a table) – Available as above, plus in a non-proprietary format (e.g. CSV instead of Excel) – All the above, plus using open standards from W3C to identify things with URIs so that people can link to your stuff – All the above, plus linked: link your data to other people’s data to provide context Tim Berners-Lee, W3C
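Because each level builds on all the levels before it, a dataset's level is the number of consecutive criteria it meets, counting from the first. A minimal Python sketch of that rule follows; the function name `sharing_level` and its parameter names are hypothetical labels for the five criteria listed above.

```python
def sharing_level(open_license_on_web, machine_readable, non_proprietary,
                  uses_uris, linked_to_other_data):
    """Return 0-5: how many criteria are met in order, stopping at the first gap."""
    criteria = [
        open_license_on_web,
        machine_readable,
        non_proprietary,
        uses_uris,
        linked_to_other_data,
    ]
    level = 0
    for met in criteria:
        if not met:
            break
        level += 1
    return level


# A scanned PDF published under an open license.
pdf_level = sharing_level(True, False, False, False, False)
# An Excel workbook published under an open license.
excel_level = sharing_level(True, True, False, False, False)
# The same table published as CSV.
csv_level = sharing_level(True, True, True, False, False)
print(pdf_level, excel_level, csv_level)
```

Note that a later criterion does not count if an earlier one is missed: data that uses URIs but has no open license still scores 0 in this sketch.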
Building your own Catalog What are my sharing options? How do different catalog options compare? How do I pick a path? What are the other questions I should be asking? Are you cataloging one source of data or multiple? Will other catalogs want to harvest from you? Do you need to harvest? Do you need to add additional attributes to the resources you harvest? Do you need to customize your catalog, or are out-of-the-box features good enough? T. Welch
Catalog Options 'Simple' Catalog ArcGIS Online Geoportal Server (ESRI) GeoNetwork (OSGeo) OpenGeoPortal CKAN Catalog Options Pros and Cons Tim Welch reviews these catalog options in his OCMDN overview: net/meetings/ocmdnIII/Welch_ Catalog_Tech_ pdf
Connecting to Communities The first step is to know your audience(s) Try to anticipate needs, but be open to access options and applications you may not be aware of Build good documentation habits into all your processes – your future self will be grateful!
Questions? centralasian