Procedures to Develop and Register Data Elements in Support of Data Standardization September 2000.

2 Based on: ISO/IEC Draft Technical Report 20943, Information Technology – Procedures for Achieving Metadata Registry (MDR) Content Consistency – Data Elements

3 Metadata Registry
EPA’s metadata registry is the Environmental Data Registry (EDR):
The EDR is based on an international standard for metadata registries.

4 International Standard for Metadata Registries
ISO/IEC 11179: Information Technology - Data Management and Interchange - Metadata Registries (MDR)

5 Parts of the Standard
- Part 1: Framework for the Specification and Standardization of Data Elements
- Part 2: Classification for Data Elements
- Part 3: Registry Metamodel (MDR3)
- Part 4: Rules and Guidelines for the Formulation of Data Definitions
- Part 5: Naming and Identification Principles for Data Elements
- Part 6: Registration of Data Elements

6 Data Element Registration
- Characteristics of the data element are recorded as metadata attributes
- Registration depends on the amount and quality of information available
- Data elements might range from:
  - Standard data elements: complete, with good quality
  - Application data elements: incomplete, with questionable quality

7 Steps to Follow When Registering a Data Element
1. Understanding the data element
2. Content research
3. Definition and permissible values
4. Names and identifiers
5. Administrative and miscellaneous attributes
6. Data element concepts
7. Classification schemes
8. Quality control

8 Example of Registration
Registration of a data element for the code used by the United States Postal Service (USPS) to represent a state or state equivalent.

9 Understanding the Data Element (Step 1)
- What kind of data will be stored in this data element?
- Is there a definition or description of data values?
- What will the data values look like: names, descriptions, numerals to be calculated, character strings, or identifiers?
- Are the data values determined by an arithmetic or statistical procedure?

10 Understanding the Data Element - Example
The USPS standard format for preparing a domestic mail piece requires that the last line of the address contain city name, state code, and ZIP code. The data element to be registered must represent the list of data values for state code that are acceptable to the USPS for mail delivery.

11 Content Research (Step 2)
- Is this data element described in an existing standard?
- Does a data element that could be reused already exist in this registry or in a federation of registries?

12 Content Research - Example
National Standards:
- FIPS PUB 5-2, 6-4, 55-3
  - Contain 2-letter state codes
  - Include a code for U.S. Minor Outlying Islands, which is not recognized by the USPS
  - U.S. does not intend to continue maintaining FIPS codes
- National Supercomputer Centers Usage Database
  - Contains only 4 of the 8 outlying territories
  - Omits all 4 freely associated states

13 Content Research - Example (continued)
National Standards:
- U.S. Postal Service standards
  - Include codes for all states, outlying territories, and freely associated states of the United States
  - Do not recognize a code for U.S. Minor Outlying Islands, which must be identified on mail pieces by name
  - Include codes for military “States”
International Standards:
- ISO 3166-Part 2, Country subdivision code
  - Identifies U.S. outlying territories and freely associated states as Countries in Part 1

14 Content Research - Example (continued)
Existing Data Elements in the EDR:
- State USPS Code
- Mailing Address State Code
- Geographic Address State Code
All of the Above Include:
- The code for U.S. Minor Outlying Islands, which is not acceptable for mail delivery
- Codes for the 12 Canadian provinces

15 Decision - Preferred Data Source
The preferred data source for a standard data element for state code for mail delivery within the U.S. for states and state equivalents is the USPS standard, available at:

16 Definition and Permissible Values (Step 3)
A definition must capture the essential semantic content of a data element. Definitions are recorded in context (where did the definition originate or how is it applied?).
Permissible values are the domains of acceptable values for the data element:
- Enumerated by a specific list of values?
- Defined by a description, procedure, or range?

17 Permissible Values - Value Domain (Step 3)
- How are values represented (e.g., name, code, text, date)?
- When did each value become valid/invalid?
- What are the name and definition/description of the value domain?
- How many characters are required in the database to store the value?
- Is the data value recorded as a character string, numerals, integer, or other?
- Are the data values formatted?

18 Definition - Example
The code that represents a United States state or state equivalent in a mailing address.
Context: USPS Standard

19 Permissible Values - Example
- Representation: Code
- Value Domain Name: The state codes for states and state equivalents of the United States
- Definition: All codes recognized by the U.S. Postal Service on a mail piece for identification of a state or state equivalent of the United States
- Field length: 2
- Datatype: alphabetic
- Format: character string
- List of values: 62 values representing the 50 states, the District of Columbia, the 8 outlying territories and freely associated states, and the 3 codes for military states
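The value domain above can be sketched as an enumerated set plus the field-length and datatype checks. This is only an illustration: the slides give the count of 62 but not the codes themselves, so the code list below is an assumption based on the commonly published USPS abbreviations, and the names `PERMISSIBLE_VALUES` and `is_permissible` are hypothetical.

```python
# Sketch of the slide-19 value domain. The individual codes are NOT in the
# slides; they are assumed from the commonly published USPS abbreviations.
STATES = (
    "AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD "
    "MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC "
    "SD TN TX UT VT VA WA WV WI WY"
).split()
DISTRICT = ["DC"]
# Outlying territories and freely associated states (assumed list of 8).
TERRITORIES_AND_ASSOCIATED_STATES = "AS FM GU MH MP PW PR VI".split()
# Military "states" (assumed list of 3).
MILITARY = "AA AE AP".split()

PERMISSIBLE_VALUES = frozenset(
    STATES + DISTRICT + TERRITORIES_AND_ASSOCIATED_STATES + MILITARY
)
FIELD_LENGTH = 2  # "Field length: 2" on the slide


def is_permissible(value: str) -> bool:
    """Check field length, alphabetic datatype, and membership in the list."""
    return (
        len(value) == FIELD_LENGTH
        and value.isalpha()
        and value in PERMISSIBLE_VALUES
    )
```

Under these assumptions the set has the 62 members the slide describes, and a code such as UM (U.S. Minor Outlying Islands in FIPS PUB 5-2) is rejected, which matches the content research on the earlier slides.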

20 Names and Identifiers (Step 4)
A name is a term or phrase that describes the data element: something to call it. Names are recorded in context (where did the name originate or how is it applied?).
Identifiers are unique. They identify the Registration Authority, the organization, the data element, and the version of the data element if information about the data element changes.

21 Names and Identifiers - Example
- Name: State or State Equivalent Code
- Context: USPS Standard
- Identifier:
  - Registration Authority: EPA
  - Organization: OEI
  - Sub-organization: OIC
  - Data Element ID: 29324
  - Version: 1
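One way to picture the identifier is as an immutable record whose parts together make it unique, with a new version issued when information about the data element changes. The class name, field names, and the slash-separated display format below are assumptions made for illustration; the slides do not prescribe a storage or display format.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DataElementIdentifier:
    """Hypothetical record for the identifier parts listed on slide 21."""
    registration_authority: str
    organization: str
    sub_organization: str
    data_element_id: int
    version: int

    def next_version(self) -> "DataElementIdentifier":
        # A change to the data element yields a new identifier that differs
        # only in its version number.
        return replace(self, version=self.version + 1)

    def as_string(self) -> str:
        # Display format is an assumption, not part of the source.
        return "/".join(
            str(part)
            for part in (
                self.registration_authority,
                self.organization,
                self.sub_organization,
                self.data_element_id,
                self.version,
            )
        )


state_code_id = DataElementIdentifier("EPA", "OEI", "OIC", 29324, 1)
```

Because the dataclass is frozen, two identifiers compare equal only when every part matches, so version 1 and version 2 of the same data element remain distinct.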

22 Administrative and Miscellaneous Attributes (Step 5)
- Submitting organization: the organization that has submitted the data element for registration
- Stewardship contact: the organization delegated the responsibility for maintaining the data element
- Data element comment: provides remarks about usage, procedure, and other explanatory information that is not appropriate to include in the definition

23 Administrative and Miscellaneous Attributes (Step 5, continued)
- Data element example: an example of a value that is permissible for the data element
- Data element origin: source of information about the data element, including document, standard, system, group, form, or message set
- Creation/last change date: the system date when a data element was created or updated in the registry

24 Administrative & Miscellaneous Attributes - Example
- Submitting organization: Office of Environmental Information
- Stewardship contact: Data Standards Branch
- Data element comment: this data element is used to identify states and state equivalents for all United States mailing addresses, including military addresses
- Data element example: NJ (New Jersey)
- Data element origin: EPA data standard workgroup
- Creation/last change date: system date

25 Data Element Concept (Step 6)
- Provides conceptual information
- May relate data elements that convey the same concept with different representations
- Singular: refers to only one concept
- Must have a name and definition, recorded in context
- Specified through a conceptual domain, i.e., the set of possible valid values for a data element concept, expressed without representation

26 Data Element Concept - Example
- Name: U.S. State or State Equivalent
- Definition: An identifier for a primary political subdivision of the United States, including an outlying territory or an associated state
- Data elements that might share this data concept include:
  - United States State Name: New Jersey
  - State Common Name: Garden State
  - Facility Location State Abbreviation: NJ
- This data element concept uses a subset of the values in the conceptual domain:
  - Primary Geopolitical Subdivisions of Countries

27 Conceptual Domain - Example
- Name: Primary Geopolitical Subdivisions of Countries
- Definition: Identifiers for the primary geopolitical subdivisions of the countries of the world
- Value meanings might include:
  - The U.S. state of Alabama
  - The Canadian province of Alberta
  - The Malaysian state of Sabah
  - The U.S. state equivalent of District of Columbia
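The relationship between a data element concept and its data elements can be sketched as one value meaning that several data elements represent differently. The structure below is a simplified illustration, not the ISO/IEC 11179 metamodel; the class, the function, and the value meaning "The U.S. state of New Jersey" (patterned on the value meanings listed above) are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# A value meaning is expressed without representation (assumed wording).
NEW_JERSEY = "The U.S. state of New Jersey"
CONCEPT = "U.S. State or State Equivalent"


@dataclass
class DataElement:
    """Hypothetical data element: a concept plus one representation."""
    name: str
    concept: str
    # Maps a value meaning to the way this data element represents it.
    representations: Dict[str, str] = field(default_factory=dict)


ELEMENTS: List[DataElement] = [
    DataElement("United States State Name", CONCEPT, {NEW_JERSEY: "New Jersey"}),
    DataElement("State Common Name", CONCEPT, {NEW_JERSEY: "Garden State"}),
    DataElement("Facility Location State Abbreviation", CONCEPT, {NEW_JERSEY: "NJ"}),
]


def representations_of(meaning: str, concept: str) -> Dict[str, str]:
    """Collect how each data element sharing the concept represents a meaning."""
    return {
        element.name: element.representations[meaning]
        for element in ELEMENTS
        if element.concept == concept and meaning in element.representations
    }
```

Calling `representations_of(NEW_JERSEY, CONCEPT)` returns the three representations from slide 26, which is the sense in which a data element concept relates data elements that convey the same concept.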

28 Classification Schemes (Step 7)
Data elements might be classified according to any of the following types of groups where the data element might be listed:
- Usage
- Data standard
- Application system
- Data collection form
- Keywords
- Object class

29 Classification Schemes - Example
- Mailing address group
- U.S. Postal Service Address Standard
- Form R for Toxic Release Inventory
- Keywords: State, Geopolitical
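As a rough sketch, the classifications of a data element can be held as a mapping from scheme type to the groups it belongs to, which makes keyword lookup straightforward. The slide does not say which scheme type each example belongs to, so the assignment below (for instance, "Mailing address group" as a usage) is an assumption, as are the names `CLASSIFICATIONS` and `find_by_keyword`.

```python
from typing import Dict, List

# Scheme-type assignments are assumed; the slide lists only the groups.
CLASSIFICATIONS: Dict[str, Dict[str, List[str]]] = {
    "State or State Equivalent Code": {
        "usage": ["Mailing address group"],
        "data_standard": ["U.S. Postal Service Address Standard"],
        "data_collection_form": ["Form R for Toxic Release Inventory"],
        "keywords": ["State", "Geopolitical"],
    },
}


def find_by_keyword(keyword: str) -> List[str]:
    """Return names of data elements classified under the keyword."""
    wanted = keyword.lower()
    return [
        name
        for name, schemes in CLASSIFICATIONS.items()
        if wanted in (k.lower() for k in schemes.get("keywords", []))
    ]
```

A lookup such as `find_by_keyword("geopolitical")` then finds the state code data element regardless of letter case.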

30 Quality Control (Step 8)
- Registration status: records the position of the data element in the registration life cycle, which indicates its stage of quality review
  - Incomplete: not all metadata are entered
  - Recorded: all metadata are entered
  - Certified: metadata are valid
  - Standard: the preferred data element for Agency use

31 Quality Control - Example
Quality Assurance Registration Status
- All data have been entered: Recorded
- Data are certified to be accurate: Certified
- After becoming Agency standard: Standard
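The registration life cycle can be sketched as an ordered enumeration with a function that derives the status from what is known about a data element. The list of required attributes and the rule that a data element must be certified before it can become the Agency standard are assumptions made for this illustration; the slides state only what each status means.

```python
from enum import IntEnum
from typing import Mapping


class RegistrationStatus(IntEnum):
    """Statuses from slide 30, ordered by stage of quality review."""
    INCOMPLETE = 1
    RECORDED = 2
    CERTIFIED = 3
    STANDARD = 4


# Hypothetical set of attributes that must be filled in; not from the source.
REQUIRED_ATTRIBUTES = ("name", "definition", "permissible_values", "identifier")


def derive_status(
    metadata: Mapping[str, object],
    certified: bool = False,
    agency_standard: bool = False,
) -> RegistrationStatus:
    """Derive the registration status of a data element."""
    all_entered = all(metadata.get(attr) for attr in REQUIRED_ATTRIBUTES)
    if not all_entered:
        return RegistrationStatus.INCOMPLETE
    if not certified:
        return RegistrationStatus.RECORDED
    if not agency_standard:
        return RegistrationStatus.CERTIFIED
    # Assumed rule: only a certified data element can be the Agency standard.
    return RegistrationStatus.STANDARD
```

Using an `IntEnum` keeps the statuses comparable, so a check such as `status >= RegistrationStatus.CERTIFIED` expresses "has passed quality review" directly.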