Published by Joy Thornton. Modified over 8 years ago.
1
Location Keyword Analysis and Approach to Fixing Records: A Case Study
GCMD Science Coordinators
CGSync Technical Tag Up Meeting
2016-02-18
2
Outline
  Overview
  Introduction to GCMD Location Keywords
  Location Keyword Analysis
  Plan Moving Forward
  Reference Links
  Additional Report Details
3
Overview
  Location keywords in the collection-level metadata are inconsistent.
  The ECHO ‘Spatialkeyword’ field is free-text, uncontrolled, and needs the most attention.
    o Ex: ‘Spatialkeyword’ HAWAI
  The DIF ‘Location’ keyword field is controlled and well defined.
    o Ex: Continent > North America > United States of America > Hawaii
  UMM-C Recommends:
    o Location data should be reconciled between both systems and controlled within the CMR through a keyword management system.
    o Additionally, this element should utilize controlled vocabulary and ISO 19115 keyword type codes.
  GCMD Recommends:
    o Adopt the GCMD Location keyword hierarchy and the common keyword management/life-cycle approach.
    o Perform analysis of the quality of the location values in the records before enabling the Location keyword facet.
4
Introduction to GCMD Location Keywords
  A superset of locations that aligns with ISO, the World Factbook, and keyword recommendations from collaborations.
    o Ex: The Antarctic Master Directory needed more extensive location keywords for the polar regions.
    o Ex: The Oak Ridge National Laboratory (ORNL) DAAC needed additional region names.
  A highly recommended, normalized, and science-vetted set of hierarchical keywords used for describing the location of Earth science data.
  A set of keywords that has evolved over 20 years and is very stable, though requests for additional location keywords occasionally arrive.
5
Introduction to GCMD Location Keywords
  There are 5 controlled levels and one uncontrolled level in the location keywords.

  Syntax                 | Example                  | Example
  (R) Location Category  | Continent                | Ocean
  (R) Location Type      | North America            | Atlantic Ocean
  Location Subregion 1   | United States of America | North Atlantic Ocean
  Location Subregion 2   | Maryland                 | Bay of Fundy
  Location Subregion 3   | Baltimore                | (None)
  Detailed Location      | (Uncontrolled Keyword)   | (Uncontrolled Keyword)

  Keywords are also expressed as a UUID (5e64ca14-42f3-4222-8f2c-5db3c7b71d8f)
  docBUILDER uses the GCMD location keywords.
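The level structure described on this slide can be sketched in Python. This is a minimal illustration, not GCMD code: the names `LEVELS`, `LocationKeyword`, and `parse_location` are hypothetical, and the " > " separator is taken from the examples shown elsewhere in the deck.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# The five controlled levels, in the order listed on the slide.
LEVELS = (
    "Location Category",
    "Location Type",
    "Location Subregion 1",
    "Location Subregion 2",
    "Location Subregion 3",
)


@dataclass(frozen=True)
class LocationKeyword:
    """Hypothetical container: controlled levels plus the uncontrolled one."""
    levels: Tuple[str, ...]                   # one to five controlled values
    detailed_location: Optional[str] = None   # uncontrolled free text

    def path(self) -> str:
        return " > ".join(self.levels)

    def as_dict(self) -> dict:
        # Pair each value with its level name; unused levels are omitted.
        return dict(zip(LEVELS, self.levels))


def parse_location(path: str, detailed_location: Optional[str] = None) -> LocationKeyword:
    """Split an 'A > B > C' string into controlled levels."""
    parts = tuple(p.strip() for p in path.split(">") if p.strip())
    if not parts:
        raise ValueError("a location keyword needs at least a Location Category")
    if len(parts) > len(LEVELS):
        raise ValueError(f"at most {len(LEVELS)} controlled levels are allowed")
    return LocationKeyword(parts, detailed_location)


baltimore = parse_location(
    "Continent > North America > United States of America > Maryland > Baltimore"
)
fundy = parse_location("Ocean > Atlantic Ocean > North Atlantic Ocean > Bay of Fundy")
```

The sketch does not enforce the (R) markers on the first two levels, since the slide does not define them.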
6
Location Keyword Analysis: Summary
  DIF Records:
    o 23,899 with ‘Location’ Keywords (89% of Total Records)
    o 23,899 Pass GCMD KMS Validation (At Time of Migration to CMR)
    o 26,918 DIF Records Total
  ECHO Records:
    o 2,494 with ‘Spatialkeywords’ Values (55% of Total Records)
    o 2,494 Fail GCMD KMS Validation (Missing Keyword Hierarchy)
    o 4,495 ECHO Records Total
  ISO Records:
    o Pending, but there are a small number of ISO records in CMR
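The percentages on this slide follow directly from the record counts. A short check, using only the counts quoted above (the helper name `coverage` is illustrative):

```python
def coverage(with_keywords: int, total: int) -> float:
    """Percentage of records that carry a location value."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * with_keywords / total


# Counts copied from the summary slide.
dif_pct = coverage(23_899, 26_918)
echo_pct = coverage(2_494, 4_495)

print(f"DIF records with 'Location' keywords:   {round(dif_pct)}%")
print(f"ECHO records with 'Spatialkeywords':    {round(echo_pct)}%")
```

Rounded to whole percentages, these reproduce the 89% and 55% figures reported on the slide.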
7
Location Keyword Analysis: Details
  Analysis Steps:
    o ‘Spatialkeywords’ were extracted from ECHO records
    o ‘Location’ keywords were extracted from GCMD KMS
    o Comparative analysis was performed in Excel
  * Snapshot of records was taken on 2016-01-25

  Plan To Bring Records in Alignment With The GCMD Location Keywords | # of Keywords | # of Unique Records
  (1) Add Full Location Keyword Hierarchy To Unique ECHO ‘Spatialkeywords’ Values | 237 | 1,801
  (2) Add/Update KMS With Unique ECHO ‘Spatialkeywords’ Values Not Represented in GCMD Location Keywords; Add Keyword Value to ECHO ‘Spatialkeywords’ | 26 | 693
  (3) Review ECHO Records Without ECHO ‘Spatialkeywords’ Values And Add Location Keywords Where Appropriate (Some Previously as DIF Prior to Reconciliation) | NA | 2,001
  (4) Review DIF Records Without Location Keyword Values And Add Location Keywords Where Appropriate | NA | 3,019
  Total Records To Review | | 7,514
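The comparative step was done in Excel; the sketch below shows one way the same comparison could be scripted. It assumes a case-insensitive match between a free-text value and the last term of a KMS hierarchy path, which is an assumption and not necessarily the rule the analysts applied. The function names and the sample value "HAWAII" are illustrative.

```python
from collections import Counter


def leaf_index(kms_paths):
    """Map the lowercased last term of each hierarchy path to the full path."""
    index = {}
    for path in kms_paths:
        leaf = path.split(">")[-1].strip().lower()
        index.setdefault(leaf, path)  # keep the first path seen for a leaf
    return index


def compare(spatial_keywords, kms_paths):
    """Split free-text values into those that match a KMS leaf and those that do not.

    Returns (matched, unmatched): matched maps each value to its full
    hierarchy; unmatched counts occurrences of values with no match.
    """
    index = leaf_index(kms_paths)
    matched, unmatched = {}, Counter()
    for value in spatial_keywords:
        key = value.strip().lower()
        if key in index:
            matched[value] = index[key]
        else:
            unmatched[value] += 1
    return matched, unmatched


# Illustrative inputs; real inputs would be the extracted ECHO and KMS values.
kms_paths = [
    "Continent > North America > United States of America > Hawaii",
    "Continent > North America > Canada > Ontario",
]
echo_values = ["HAWAII", "HAWAI", "HAWAI", "Ontario"]

matched, unmatched = compare(echo_values, kms_paths)
```

Matched values correspond to task (1), where the full hierarchy can be added. Unmatched values such as "HAWAI" correspond to task (2) and need a manual decision.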
8
Plan Moving Forward
  Assuming That The GCMD Location Keyword Hierarchy Is Accepted, Here Is What We Plan To Do:
  Fix the Location Keywords and Records (As Described in Previous Slide)
    o Start with providers that have completed reconciliation
    o Manually review a sample of the records
    o Work with providers to modify their source metadata
  Three Sprint Effort (GCMD Work)
    o Sprint 1: Identify All Records Affected with Invalid Location Keywords and Share Report with Providers
    o Sprint 2: Tasks 1 and 2: Fix the Location Keywords (For Providers That Request Us to Change)
    o Sprint 3: Task 3: Review Records Without Location Keywords and Update Where Appropriate
9
Plan to Adopt a Common Management Approach
  Follow The Location Keyword Approach For All GCMD Keywords and Records (Compiled in QA Reports)
  Recommend MMT Supply Keyword Pick-lists Sourced from the GCMD Keywords
  Enable Location Keyword Facet in the Search Interfaces (GCMD and EDSC)
  Communicate the Keyword Process to Metadata Providers (via the Keyword Community Guide Document)
  Recommend that CMR Phase In Enforcement of the GCMD Keywords
10
Reference Links
  Keyword Community Guide Document ‘Draft’ (2015-07-23)
    https://wiki.earthdata.nasa.gov/display/CMR/GCMD+Keyword+Documents
  GCMD Location Keywords (KMS)
    http://gcmdservices.gsfc.nasa.gov/static/kms/locations/locations.csv
  World Factbook
    https://www.cia.gov/library/publications/the-world-factbook/
  ISO Locations
    http://www.iso.org/iso/home/standards/country_codes.htm
  UMM - Collections (UMM-C)
    https://wiki.earthdata.nasa.gov/download/attachments/49448405/UMM-C_22sept%202015.docx?version=1&modificationDate=1442930358897&api=v2
11
Additional Report Details
12
Location Keyword Analysis: ECHO Details
  26 Unique ECHO ‘Spatialkeywords’ Values

  ECHO SpatialKeywords | # of ECHO Records | GCMD Location Keywords
  Republic of the Congo | 142 | Continent > Africa > Central Africa > Congo, Republic
  UNITED STATES | 126 | Continent > North America > United States of America
  Russia | 81 | Continent > Europe > Eastern Europe > Russian Federation
  Syria | 78 | Continent > Asia > Western Asia > Middle East > Syrian Arab Republic
  Democratic Republic of the Congo | 72 | Continent > Africa > Central Africa > Congo, Democratic Republic
  Laos | 69 | Continent > Asia > Southeastern Asia > Lao People's Democratic Republic
  GLOBAL LAND | 51 | Geographic Region > Global Land
  ONTARIO, CANADA | 30 | Continent > North America > Canada > Ontario
  N.Atlantic Salinity maximum region | 15 | Ocean > Atlantic Ocean > North Atlantic Ocean > > N.Atlantic Salinity maximum region (uncontrolled)
  SUGAR LOAF KEY | 6 | Continent > North America > United States of America > Florida > Florida Keys > Sugar Loaf Key (uncontrolled)
  HAWAI | 5 | Continent > North America > United States of America > Hawaii
  POINT LOCATIONS | 3 | Ancillary Keyword
  Republic of Moldova | 2 | Continent > Europe > Eastern Europe > Moldova
  CANADA TO ATLANTIC | 1 | Ancillary Keyword
  ONTARIO, CANADA CARE | 1 | Continent > North America > Canada > Ontario
  BURLINGTON, VERMONT | 1 | Continent > North America > United States of America > Vermont > Burlington
  GAYLORD, MICHIGAN | 1 | Continent > North America > United States of America > Michigan > Gaylord
  BOSTON, MASS | 1 | Continent > North America > United States of America > Massachusetts > Boston
  PORTLAND, MAINE | 1 | Continent > North America > United States of America > Maine > Portland
  LA CROSSE, WISCONSIN | 1 | Continent > North America > United States of America > Wisconsin > La Crosse
  DES MOINES, IOWA | 1 | Continent > North America > United States of America > Iowa > Des Moines
  DAVENPORT, IOWA | 1 | Continent > North America > United States of America > Iowa > Davenport
  CENTRAL PACIFIC KWAJ | 1 | Ocean > Pacific Ocean > Western Pacific Ocean > Micronesia > Central Pacific Kwaj
  ALASKAN NORTH SLOPE | 1 | Continent > North America > United States of America > Alaska > > Alaskan North Slope (uncontrolled)
  FLORIDA KEYS | 1 | Continent > North America > United States of America > Florida > Florida Keys
  MidAtlantic to East Indian Ocean Baltic Sea to below Africa | 1 | Ancillary Keyword
  Total | 693 |

  * Keywords in Red Are Additions to KMS
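A mapping like the one in this table could be applied to records with a simple lookup. The sketch below is illustrative only: `CROSSWALK`, `ANCILLARY`, and `reconcile` are hypothetical names, only a few rows of the table are copied in, and matching on case-folded, whitespace-collapsed text is an assumption about how values would be normalized.

```python
# A few rows copied from the table above (free-text value -> GCMD hierarchy).
CROSSWALK = {
    "HAWAI": "Continent > North America > United States of America > Hawaii",
    "UNITED STATES": "Continent > North America > United States of America",
    "Russia": "Continent > Europe > Eastern Europe > Russian Federation",
    "ONTARIO, CANADA": "Continent > North America > Canada > Ontario",
    "GLOBAL LAND": "Geographic Region > Global Land",
}

# Values the table classifies as ancillary keywords instead of locations.
ANCILLARY = {"POINT LOCATIONS", "CANADA TO ATLANTIC"}


def _norm(value: str) -> str:
    """Collapse runs of whitespace and ignore case (assumed normalization)."""
    return " ".join(value.split()).casefold()


_LOOKUP = {_norm(k): v for k, v in CROSSWALK.items()}
_ANCILLARY = {_norm(k) for k in ANCILLARY}


def reconcile(value: str):
    """Return (kind, result) for one free-text 'Spatialkeywords' value.

    kind is 'location' with the mapped hierarchy, 'ancillary' with the
    original value, or 'unmapped' when the value needs manual review.
    """
    key = _norm(value)
    if key in _LOOKUP:
        return ("location", _LOOKUP[key])
    if key in _ANCILLARY:
        return ("ancillary", value)
    return ("unmapped", value)


for raw in ["hawai ", "Point  Locations", "ONTARIO, CANADA", "SOMEWHERE ELSE"]:
    print(raw.strip(), "->", reconcile(raw))
```

Returning an explicit "unmapped" result keeps values outside the table visible for manual review.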