1 Lola M. Olsen Global Change Master Directory NASA’s Goddard Space Flight Center The Value of Controlled Vocabularies Beyond the DRM 2.0: The Importance.

Presentation transcript:

1 Lola M. Olsen Global Change Master Directory NASA’s Goddard Space Flight Center The Value of Controlled Vocabularies Beyond the DRM 2.0: The Importance of Normalization For the Data Search SiCOP Conference 6 February 2007

2 Presentation Guide
➢ The Global Change Master Directory (GCMD) and the DRM 2.0
  ▪ Vision and Mission
  ▪ Data Context, Data Descriptions, and Data Sharing
  ▪ Content and Usage
  ▪ Value (the aspects users report appreciating most)
➢ Design Evolution
  ▪ MD2 - MD9.7
➢ The Data Reference Model 3.0

3 The GCMD Vision and Mission
▪ Strategic Vision:
  ➢ To serve as a trusted source for Earth (and space) science metadata and related services.
  ➢ To contribute to scientific discovery.
▪ Mission:
  ➢ To provide for the creation of “unique”, high-quality data, services, and ancillary descriptions of data.
  ➢ To design enabling authoring tools (with the ability to tag entries) and robust scientific search software. [The ability to “tag” data at the time of writing is key. Tagging is more difficult when this ability is not integrated through tools, and is less likely to be accurate and normalized. It is best to gather information when the “getting is good.”]
  ➢ To assist the global community in the discovery of the scientific resources within the directory.

4 Does the GCMD Follow the DRM 2.0?

5 Data Context
▪ Facilitates discovery of data through an approach to the categorization of data according to taxonomies.
▪ Enables the definition of authoritative data assets within the COI (using unique identifiers).
▪ Provides linkages to the data described, thereby managing the “info glut”, through:
  ➢ Open API links, such as OPeNDAP (an open source framework that simplifies aspects of science data networking).
  ➢ Related_URL “controlled keyword” links to data.
  ➢ New “use” metadata associated with detailed variables within the data sets.
  ➢ ~20 petabytes of data represented through the GCMD.

6 Number of Science Keywords by Topic

7 Number of Services Keywords by Topic

8 Ancillary Keywords Coming Soon: Orbit Types, Spectral/Frequency Domain, Launch Sites and 4 Level Taxonomy for Models.

9 Data Description: “How do we understand what data are available?”
▪ Provides a means to uniformly describe data, thereby supporting its discovery, harmonization, categorization, sharing, and rapid coordination/communication.
▪ GCMD uses the DIF “standard”. There are many advantages.
▪ Descriptions must be identified UNIQUELY.

10 The Evolving DIF
Major steps in evolution through modification to a multilevel Earth science hierarchy: Category > Topic > Term > Variable > Detailed Variable
Two important trends were emerging that would affect evolution:
▪ FGDC and concept of “metadata” for geospatial and other data initiated.
▪ Web taking shape.
DIF evolves from 23 to 34 fields.
▪ Compatible with mandated FGDC and Dublin Core.
▪ Era of metadata initiated. Other “standards” emerging: ANZLIC.
▪ Web expanding: search interfaces abound; GCMD ready for this revolution.
DIF evolves to 35 fields in MD7. [3 added; 2 deleted]
▪ DIF creation date and revision history added.
▪ New field for paleoclimate data: paleo-temporal coverage.
▪ Personnel subfields modified.
▪ FGDC mandated, but DIF compatible with all required fields, serving users with added benefits of unique ID. Conversion tools available: FGDC <=> DIF.
DIF acquires new sibling: the SERF, allowing cross linkages between services & data.
▪ Redesign of query language; XML syntax; separation of presentation from business/application logic, with unexpected gifts: SOA architecture; querying multiple data sources for spatial, temporal, RDF and RDBMS databases, full-text; Struts facilitated creation of customized portals.
▪ LDA experiment.
MD9 ISO compliance and evolves to 36 fields:
▪ 3 new fields added: new address; 2 data resolution subfields.
**International Interoperability Forum functions at the international level through CEOS.
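
The five-level hierarchy named on this slide (Category > Topic > Term > Variable > Detailed Variable) can be pictured as a fixed-depth path. The following is only a minimal sketch under stated assumptions: the `KeywordPath` type, the `parse_keyword` and `is_under` helpers, and the " > " separator are hypothetical names chosen for illustration, not part of the GCMD software, and "CLOUDS" is an illustrative keyword value.

```python
from typing import NamedTuple, Optional

# Level names taken from the slide; everything else here is illustrative.
LEVELS = ("Category", "Topic", "Term", "Variable", "Detailed Variable")


class KeywordPath(NamedTuple):
    category: str
    topic: Optional[str] = None
    term: Optional[str] = None
    variable: Optional[str] = None
    detailed_variable: Optional[str] = None


def parse_keyword(path: str, sep: str = ">") -> KeywordPath:
    """Split 'A > B > C' into at most five named levels."""
    parts = [p.strip() for p in path.split(sep)]
    if not all(parts):
        raise ValueError(f"empty level in keyword path: {path!r}")
    if len(parts) > len(LEVELS):
        raise ValueError(f"more than {len(LEVELS)} levels: {path!r}")
    return KeywordPath(*parts)


def is_under(child: KeywordPath, parent: KeywordPath) -> bool:
    """True when child sits at or below parent in the hierarchy."""
    parent_levels = [v for v in parent if v is not None]
    return list(child[: len(parent_levels)]) == parent_levels


# Hypothetical keyword used only to exercise the helpers.
kw = parse_keyword("EARTH SCIENCE > ATMOSPHERE > CLOUDS")
print(kw.topic, is_under(kw, parse_keyword("EARTH SCIENCE > ATMOSPHERE")))
```

Keeping the levels positional makes "everything under a Topic" a simple prefix comparison, which is one plausible reason a controlled hierarchy helps normalized search.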

11 Data Set (DIF) Population by Topic

12 Data Services (SERF) Population by Topic

13 Data Sharing
Supports access to data, enabled by capabilities provided by both the Data Context and Data Description standardization areas, through:
1. Ad hoc requests (such as a query of a data asset). An Open API supports ad hoc requests. Example: OPeNDAP.
2. Exchange of data (such as those that consist of fixed, recurring transactions among parties). Examples:
   GeoConnections (Canada)
   OAI with NCAR and NOAA
   Data centers that use docBuilder tools to submit metadata descriptions.

14 JCADM/AMD Collaborations: 18 NADC’s

15

16 GCMD Hits Recorded Jan 2001-Dec 2006

17 Evolution: Project Development Drivers
▪ Maintenance reduction
▪ Improving the Discovery of and Access to Data and Services
  ➢ Ease of use, such as web site navigation.
  ➢ Accuracy of Results
▪ Content Requirements
  ➢ Quality control
  ➢ Integration with metadata authoring tools that allow real-time updating by data set holders/producers.
  ➢ Integrated keyword and free-text search, with both as “refinements”.
  ➢ Bidirectional linkages between data sets and data set services.
  ➢ Providing virtual subsets of the directory
▪ Standards: ISO 19115/19139; OpenGIS; XML; RDF.
▪ NASA needs
▪ Science User Working Group Recommendations; user and partner requests.
▪ Evolving coding languages and databases [e.g., C to Perl to Java].

18 MD History
Releases: MD2 10/94; MD4 04/96; MD5 04/; MD6 04/98; MD7 10/99; MD8 OPS 06/01; MD8 08/03.
Chart series: DIFs and SERFs (axis values: 200; 500; 2,000; 5,000; 10,000).
Callout dates: 10/96; 05/00; 08/00; 12/00; 09/01; 11/01; 07/02; 5/04.
Features (version labels shown in brackets, in transcript order): [MD7] [MD5] Science keyword hierarchy; FGDC compatibility; Isite free-text search; [MD2] [MD4] DIFmorph for translating between FGDC and DIF; PC-based DIF Writing Tool; transitioned space science DIFs to NSSDC; first web client distributed; → DIFWEB tools; X-Windows client; JAM client; first use of Oracle; DIFmacs Authoring Tool; [MD] [MD7] switched code base from C to Perl; Conference Calendar; Personnel "Role" field; Paleo_Temporal Coverage; → 1st request in FGDC; → MD8; → DIFbuilder tools; [MD8 OPS] switched code base from Perl to Java; XML syntax for metadata; OPS for managing metadata; → Services Prototype launched; → first time the coordinators were able to load their own DIFs/SERFs; → docBUILDER tools; upgraded Isite free-text; Java applet for geospatial search and "Advanced" search interface; first web page to use Science keyword Topics to search; Parent/Child display; Related_URL field added; HCIL and Matrix interface; [MD6]

19 MD History (MD 8 and beyond)
Releases: MD8; MD; MD9.3; MD9.4; MD9.5; MD9.6. Dates: 08/03; 06/04; 03/05; /05; 08/05; 02/06; 07/06.
Chart series: DIFs and SERFs (axis values: 500; 1,000; 1,200; 15,; 17,000).
Features (version labels shown in brackets, in transcript order): New Home Page; Portals; DTD for DIF and SERF; Open API; [MD8] Struts; compatible with ISO metadata standard; geographic coverage map added to record; [MD] Lucene search engine; search term highlighted in records; refinement search by keywords or full-text search; User Comment form; [MD9.3] spatial search with Google map; refinement option by data resolution for NASA portals; support for foreign characters in record display; subscription service for science keywords; → docBUILDER tools available for public; [MD9.5] Relative Temporal coverage added to accommodate data pools; added two-level hierarchy for Related_URL (e.g., support Get Data); [MD9.6] [MD9.4] → 08/05; Location and data center hierarchy; increased number of characters for fields; spatial and temporal resolution range keywords; docBUILDER tool personalized templates; [MD9.4]

20 Following the “Hype” or Listening Too Intently to the “Wilderness Request”.
The Hype: Distributed Systems
> Check if application is appropriate for needs.
> Determine its ROI.
> Offered LDA, as “Local Database Agents” - not “Latent Dirichlet Allocation”.
> Be vigilant for change.
> Know when to cut losses.
> Scope the future.
The Out-In-The-Wilderness Request: Example: Offline Authoring Tool
> Check longevity to assure usefulness when development complete.
> Determine the ROI in advance.
> Know when to cut your losses.

21 “The Web doesn’t have a single, comprehensive clearinghouse where you can find all of the data and domains of knowledge covering all geographies …. Instead there are hundreds of …”
“Very few geospatial information scientists are working on the challenge beyond the GCMD (Global Change Master Directory), whose database holds more than 15,000 [actually this number is 17,300+] descriptions of data sets and services covering all aspects of earth and environmental sciences.”
Finding Ourselves, Liebhold (May 2005), O’Reilly Network

22 Internal View of GCMD Value (2007)
▪ Controlled Keywords (& definitions) to reference and retrieve a record or sets of records.
▪ Authoring Tools with Update Capability. {Heavy use of controlled keywords.}
▪ Keyword & Full Text Search to Data and Services with ordered “Result Set”. (No need to build a client to query, although the option to do so is available through Open API.)
▪ Customized Portals - virtual subsets of the directory, created through use of controlled keywords.
▪ The “Get Data” feature, which takes the user directly to the data.
▪ Unique data set and services entries.
▪ Easy compliance with related standards through XML.
▪ Results available through Google.
▪ Well-designed home page, with access to full set of services provided.
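
One way to read "virtual subsets of the directory, created through use of controlled keywords" is as a saved keyword filter over uniquely identified entries. The sketch below illustrates that reading only; the `Entry` class, the `index_by_id` and `portal_view` functions, and the entry identifiers are all hypothetical and are not taken from the GCMD code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    entry_id: str        # unique identifier (values below are made up)
    keywords: frozenset  # controlled keywords tagged on the entry


def index_by_id(entries):
    """Build an ID lookup, rejecting duplicate identifiers."""
    index = {}
    for entry in entries:
        if entry.entry_id in index:
            raise ValueError(f"duplicate entry_id: {entry.entry_id}")
        index[entry.entry_id] = entry
    return index


def portal_view(entries, portal_keywords):
    """Return the entries tagged with every keyword that defines the portal."""
    required = frozenset(portal_keywords)
    return [e for e in entries if required <= e.keywords]


# Illustrative records only.
ENTRIES = [
    Entry("EXAMPLE_DIF_1", frozenset({"ATMOSPHERE", "AVHRR"})),
    Entry("EXAMPLE_DIF_2", frozenset({"OCEANS", "AVHRR"})),
    Entry("EXAMPLE_SERF_1", frozenset({"ATMOSPHERE"})),
]

for entry in portal_view(ENTRIES, {"ATMOSPHERE"}):
    print(entry.entry_id)
```

Because the portal is defined by keywords and not by a copied list of records, a newly tagged entry would appear in the view without any portal-side maintenance, provided the tags come from the controlled vocabulary.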

23 MD Software Version 9.7
▪ Support for 2 additional levels of Science Keyword hierarchy.
▪ Improved Features for docBUILDER Authoring Tools
  ➢ Support for writing Platform and Instrument descriptions & new keywords
  ➢ Support for “GET DATA” tab.
  ➢ “Text Only” display for 508 compliance (in docBuilder).
  ➢ Improved multimedia sample.
  ➢ Improved spatial coverage selection.
  ➢ Ability to change entry identifier.
  ➢ Reference Guide for use of international characters and symbols.
▪ RSS Feed, in addition to Keyword Subscription Service, to signal new directory entries.
▪ Upgrades to Java 1.5 and Tomcat 5.
▪ Location Keywords & “Chronostratigraphic Units” recreated as true taxonomies.

24 MD Software Version 9.7
▪ Keyword Functionality Upgrade.
  ➢ Functionality “abstracted” to use a SKOS data model for navigating arbitrary taxonomies.
  ➢ Integrated SKOS query into query language.
  ➢ Backed by Berkeley DB XML for querying.
  ➢ Example: [skos:Parameters=‘EARTH SCIENCE|ATMOSPHERE’] AND [skos:Instrument=‘AVHRR’]
▪ New Platform/Instrument Display Reflects Taxonomic Changes.
  ➢ Support for loading, extracting, querying.
  ➢ Support for navigating through new taxonomies.
  ➢ Support for full text search.
  ➢ Support for creating these descriptions in docBUILDER.
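
The example query on this slide can be read as a conjunction of bracketed clauses, each naming a keyword scheme and a value whose hierarchy levels are separated by "|". The sketch below works under exactly those assumptions: it is not the GCMD query engine, the semantics (AND-only, prefix match on hierarchy levels) are guessed from the single example, and `parse_query`, `matches`, and the record layout are hypothetical.

```python
import re

# One clause: [skos:<Scheme>='<LEVEL|LEVEL|...>'], with straight or curly quotes.
CLAUSE = re.compile(r"\[skos:(\w+)=['‘]([^'’]*)['’]\]")


def parse_query(query):
    """Return [(scheme, [levels...]), ...] for clauses joined by AND."""
    clauses = CLAUSE.findall(query)
    connectors = CLAUSE.sub(" ", query).split()
    if not clauses or connectors != ["AND"] * (len(clauses) - 1):
        raise ValueError(f"unsupported query: {query!r}")
    return [(scheme, value.split("|")) for scheme, value in clauses]


def matches(record, query):
    """True when every clause is a prefix of one of the record's keyword paths."""
    for scheme, levels in parse_query(query):
        paths = record.get(scheme, [])
        if not any(path[: len(levels)] == levels for path in paths):
            return False
    return True


# Illustrative record: scheme name -> list of keyword paths.
record = {
    "Parameters": [["EARTH SCIENCE", "ATMOSPHERE", "CLOUDS"]],
    "Instrument": [["AVHRR"]],
}
query = "[skos:Parameters='EARTH SCIENCE|ATMOSPHERE'] AND [skos:Instrument='AVHRR']"
print(matches(record, query))
```

Matching on a path prefix lets a query at the Topic level also retrieve records tagged further down the taxonomy, which is the behavior a hierarchical controlled vocabulary is usually expected to provide.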

25 SKOS Application

26

27

28

29 Page 1 of SERF

30 Page 2 of SERF

31 Page 1 of DIF

32 Page 2 of DIF

33 MD Software Version 9.8
▪ docBUILDER Enhancements
  ➢ Option for public vs private view.
  ➢ Automated reminders to metadata authors.
▪ Initial testbed for multilingual capabilities using SKOS.
▪ Variable Keyword extensions for “use” metadata.
▪ Client to ECHO for metadata sharing using web services.

34 The Data Reference Model 3.0, Web 3.0 & SOAs
[Figure 3-1: DRM standardization areas. Labels: Data Resource Awareness Agent; Data & Information & Knowledge Repository; static; dynamic; Language; Logic.]