Published by Hester Shelton. Modified over 9 years ago.
1
Reducing Metadata Objects
Dan Gillman
November 14, 2014
2
Focus
Metadata describing data
Conforming to a standard may imply
  Creating too many objects
  Lack of meaningful roles
  Generating nightmares for
    – Discovery
    – Efficiency
    – Management
    – Semantic interoperability
3
Focus
Can this be helped?
Problems
  Illustrated by ISO/IEC 11179
Potential solution
  Incorporated into DDI-4
4
Preliminaries
Metadata
Definition:
  – Data used to describe some objects
Metadata are data first
  – No data are always metadata (a “relative” concept)
  – Descriptive relationship is key
5
Preliminaries
Re-use
Power of metadata management
Write once – Link many
Similar to normalizing database schemas
Allows for
  – Sharing meanings
  – Comparison
  – Targeted search
  – Efficient storage / retrieval
6
Preliminaries
Problem
Dependencies
Many-to-One relationships
Let B’ be new version of B
But A can’t be related to both
[Diagram: A 1 0..* B’; A 1 0..* B]
7
ISO/IEC 11179
About – description of data
Title – Metadata registries
Mechanism – organize semantics
6 part standard
  Framework (1)
  Classification (2)
  Metamodel (3)
  Definitions (4)
  Naming (5)
  Registration (6)
8
ISO/IEC 11179
Basic model –
[Diagram: DATA ELEMENT CONCEPT and CONCEPTUAL DOMAIN at the CONCEPTUAL LEVEL; DATA ELEMENT and VALUE DOMAIN at the REPRESENTATIONAL LEVEL; associations carry multiplicities 1 and 0..*]
9
ISO/IEC 11179
Plus –
[Diagram: DATA ELEMENT CONCEPT associated with PROPERTY and with OBJECT CLASS; multiplicities shown: 0..*, 0..1, 0..*]
10
ISO/IEC 11179
New Object Class or Property
  Implies new Data Element Concept
    – Implies new Data Element
Change in Permissible Values
  Implies new Value Domain
    – Implies new Data Element
Similarly for change in Value Meanings
  Implies new Permissible Values
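The cascade on this slide can be sketched in Python. This is a minimal illustration, not part of ISO/IEC 11179: the class names mirror the standard's terms, but the frozen dataclasses, the field names, and the `revise_object_class` helper are assumptions made for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ObjectClass:
    name: str


@dataclass(frozen=True)
class Property:
    name: str


@dataclass(frozen=True)
class ValueDomain:
    permissible_values: frozenset


@dataclass(frozen=True)
class DataElementConcept:
    object_class: ObjectClass
    prop: Property


@dataclass(frozen=True)
class DataElement:
    concept: DataElementConcept
    value_domain: ValueDomain


def revise_object_class(element: DataElement, new_class: ObjectClass) -> list:
    """Return every new object needed when the object class changes.

    Hypothetical helper: objects are immutable here, so the change cannot
    be made in place and each dependent object has to be re-created.
    """
    new_concept = replace(element.concept, object_class=new_class)
    new_element = replace(element, concept=new_concept)
    return [new_class, new_concept, new_element]


sex_of_patient = DataElement(
    DataElementConcept(ObjectClass("patient"), Property("sex")),
    ValueDomain(frozenset({"m", "f"})),
)
created = revise_object_class(sex_of_patient, ObjectClass("person"))
print([type(obj).__name__ for obj in created])
```

One changed object class yields three new registry objects, while the value domain is shared unchanged between the old and new data elements.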
11
Problems
11179
One kind of data element
  – No abstract vs application
One kind of value domain
  – Processing codes not separated
      Processing steps
      Sentinel values – Missing, etc.
      Software and application dependent
12
Problems
Dimensional data
Tables
  – Many cells
  – Each cell its own data element?
  No means to differentiate cells
Time series
  – Similar problem
13
Data Documentation Initiative (DDI)
Social Science data libraries and archives
Since 1995
Consortium based since 2005
  DDI Alliance
  University of Michigan
14
DDI
2 development threads
Codebook
  – From earlier work
  – Latest version 2.5
Lifecycle
  – Includes processing
  – Latest version 3.2
Both rendered in XML-Schema
Complex to read and use
15
DDI Modernization (DDI-4)
Upgrade for Lifecycle
Rendered in UML
Built in sections
Following Generic Statistical Information Model
  – Built under UNECE Statistical Division
  – DDI is Profile (ISO/IEC TR 10000-1)
16
DDI Variables
Differs from 11179 Data Element
Types
  – Conceptual
      No object class
      Only has Conceptual Domain
  – Represented
      Inherits from Conceptual Variable
      Has object class (called Unit Type)
        E.g., People, Establishment
      Has Value Domain
        Substantive – subject matter related
17
DDI Variables
  – Instance
      Inherits from Represented Variable
      Has Universe – specialized Object Class
        E.g., Patients, Hospitals
      Has second Value Domain
        Sentinel – processing related
No DEC – implied
Specificity cascade
  – For 11179 Property (DDI Variable)
  – For 11179 Object Class (DDI Unit Type)
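The three variable types can be read as an inheritance chain in which each level adds attributes. The sketch below is one possible Python rendering, not the DDI-4 model itself: the class and field names are assumptions, and the sample codes (m, f, .d, .r) are borrowed from the deck's later sex-of-patient example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualVariable:
    name: str
    conceptual_domain: frozenset  # the only domain at this level


@dataclass(frozen=True)
class RepresentedVariable(ConceptualVariable):
    unit_type: str                 # plays the role of the 11179 object class
    substantive_domain: frozenset  # subject-matter codes


@dataclass(frozen=True)
class InstanceVariable(RepresentedVariable):
    universe: str                  # specialization of the unit type
    sentinel_domain: frozenset     # processing codes, e.g. missing values


iv = InstanceVariable(
    name="sex",
    conceptual_domain=frozenset({"male", "female"}),
    unit_type="Person",
    substantive_domain=frozenset({"m", "f"}),
    universe="Patient",
    sentinel_domain=frozenset({".d", ".r"}),
)

# An instance variable is also a represented and a conceptual variable.
print(isinstance(iv, RepresentedVariable), isinstance(iv, ConceptualVariable))
# The two value domains stay separate objects; codes are merged only on demand.
print(sorted(iv.substantive_domain | iv.sentinel_domain))
```

Keeping the sentinel domain as its own field is what lets an application swap its missing-value codes without touching the substantive domain.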
18
DDI Variables
Value Domain growth
  – Due to changing codes
  – 11179: Substantive * Sentinel
  – DDI: Substantive + Sentinel
Data Element growth
  – About the same
  – DDI is much more specific
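The product-versus-sum contrast can be made concrete with a short calculation. The function names below are hypothetical, and the sketch assumes one sentinel code set per application; it counts only the value domains that carry sentinel codes (11179) or that are kept separate (DDI).

```python
def combined_value_domains_11179(substantive: int, sentinel: int) -> int:
    """Every substantive code set is merged with every sentinel code set."""
    return substantive * sentinel


def separate_value_domains_ddi(substantive: int, sentinel: int) -> int:
    """Substantive and sentinel code sets are stored side by side."""
    return substantive + sentinel


# Two substantive value domains, one sentinel code set per application.
for applications in range(1, 6):
    print(
        applications,
        combined_value_domains_11179(2, applications),
        separate_value_domains_ddi(2, applications),
    )
```

With two substantive value domains and three applications this gives 6 combined value domains against 5 separate ones, and the gap widens with each further application.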
19
DDI Variables
[Diagram: Conceptual, Represented, and Instance variables with Conceptual Domain, Substantive Value Domain, and Sentinel Value Domain]
20
DDI Variables
[Diagram: Conceptual, Represented, and Instance variables with Unit Type and Universe]
21
Example
DDI
Sex of a patient
Conceptual variable (CV) = sex
  – CD = {male, female}
Represented variables (RV1 and RV2)
  – Inherit from CV
  – Unit type = Person
  – VD1 = {, } for RV1
  – VD2 = {, } for RV2
22
Example
DDI
For 3 applications: SAS, SPSS, Excel
  – Sentinel CD = {Don’t know, Refused}
  – Universe = Patient (specialization of Person)
Instance variable (IV) – for SAS
  – Two – inherit from RV1 or RV2
  – SenVD = {, }
23
Example
DDI
Instance variable (IV) – for SPSS
  – Two – inherit from RV1 or RV2
  – SenVD = {, – }
Instance variable (IV) – for Excel
  – Two – inherit from RV1 or RV2
  – user defined sentinel codes
  – SenVD = {, }
24
Example
DDI
Total objects (18)
  – 1 Unit Type
  – 1 Universe
  – 1 CV
  – 2 RV
  – 6 IV
  – 2 CD (sub & sen)
  – 5 VD
  – Including much inheritance
25
Example
11179
Sex of patient
Object class = patient
Property = sex
DEC = sex of patient
CD = {male, female}
VD1 = {, }
VD2 = {, }
Two DE’s, one for each VD
26
Example
11179
2 more abstract DE’s
Correspond to CV in DDI
Sex of patient
Object class = person
Property = sex
DEC = sex of person
CD = {male, female}
Need VD1 and VD2, too
27
Example
11179
DE’s for processing?
  – Missing sentinels for each application
  – Need 6 VD’s, one CD, 6 DE’s
CD = {male, female, don’t know, refused}
VD3 = {m, f, .d, .r} (SAS)
VD4 = {0, 1, .d, .r} (SAS)
VD5 = {m, f, -998, -999}
Etc.
28
Example
11179
Total objects (25)
  – 2 Object Class
  – 1 Property
  – 2 DEC
  – 2 CD
  – 8 VD
  – 10 DE
  – Little inheritance
  – Each new application -> twice the VD’s
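The two totals (18 for DDI, 25 for 11179) can be re-derived from the example's parameters: two substantive value domains and three applications. The tally functions below are a hypothetical generalization of the slide counts, not formulas stated in either standard; the fixed counts (unit type, universe, object classes, and so on) are copied from the slides.

```python
def ddi_tally(substantive_vds: int, applications: int) -> dict:
    """Object counts for the DDI version of the sex-of-patient example."""
    return {
        "Unit Type": 1,
        "Universe": 1,
        "CV": 1,
        "RV": substantive_vds,                 # one per substantive VD
        "IV": substantive_vds * applications,  # one per RV per application
        "CD": 2,                               # substantive and sentinel
        "VD": substantive_vds + applications,  # sentinel VDs kept separate
    }


def iso_11179_tally(substantive_vds: int, applications: int) -> dict:
    """Object counts for the 11179 version of the same example."""
    combined = substantive_vds * applications  # VDs with sentinel codes merged in
    return {
        "Object Class": 2,                     # patient and person
        "Property": 1,
        "DEC": 2,
        "CD": 2,
        "VD": substantive_vds + combined,
        "DE": 2 * substantive_vds + combined,  # patient DEs, person DEs, processing DEs
    }


ddi = ddi_tally(2, 3)
iso = iso_11179_tally(2, 3)
print(sum(ddi.values()), sum(iso.values()))  # 18 25
```

Under these assumptions a fourth application would add three objects on the DDI side (two instance variables and one sentinel value domain) and four on the 11179 side (two value domains and two data elements).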
29
Example
11179
Less specificity
More objects
Lack of constructs
30
Contact Information
Dan Gillman
Information Scientist
Office of Survey Methods Research
www.bls.gov/osmr
202-691-7523
Gillman.Daniel@bls.gov