Metadata Common Vocabulary a journey from a glossary to an ontology of statistical metadata, and back Sérgio Bacelar


Metadata Common Vocabulary a journey from a glossary to an ontology of statistical metadata, and back Sérgio Bacelar Statistics Portugal Joint UNECE/Eurostat/OECD Work Session on Statistical Metadata (METIS) Lisbon, 11 – 13 March, 2009

2 Definitions SDMX and SDMX Content-Oriented Guidelines (COG) Metadata Common Vocabulary (MCV): concepts and related definitions used in structural and reference metadata of international organizations and national data-producing agencies. Content-Oriented Guidelines = MCV + Cross-Domain Concepts (subset of MCV) + Statistical Subject-Matter Domains. Latest version (2009): 397 terms. Goal: uniform understanding of standard metadata concepts.

3 ESSnet on SDMX Objective: further development of SDMX. Further development and improvement of the SDMX Content-Oriented Guidelines. Metadata Task Force on SDMX (Statistics Portugal). WP proposal: MCV Ontology. Metadata Common Vocabulary (MCV): semantic univocity → design of a conceptual model of the domain. Detecting possible inconsistencies, redundancies or incompleteness in the glossary. Lack of structure: flat list, non-hierarchical relations between terms. No semantic relations between terms.

4 Conceptual system Building a glossary usually presupposes a prior conceptual model of the domain. Proposal for a revision of MCV: start with the existing terms and definitions, then create semantic relations between terms based on the definitions of the MCV terms (bottom-up or middle-out strategy). Goal: reveal the latent conceptual system, detecting possible structural incongruities or redundancies.

5 Conceptual system and Concept Map Main goals: find redundancies, inconsistencies, omissions, and terms that belong to domains other than statistical metadata (their presence is explained by the complex and interdisciplinary nature of metadata). To find omitted terms that are important and relevant, it is necessary to analyze the definitions of the concepts. Bearing this in mind, we built a “Concept Map” representing about 20% of the terms in MCV (draft version). A concept map is a diagram showing the relationships among terms/concepts. Concepts are connected with labeled arrows in a downward-branching hierarchical structure. Graphical visualization is difficult because of the large number of terms and relations.

6 Concept Map (partial view)

7

8 Terms and relations between MCV terms/concepts

9 Using the Resource Description Framework (RDF) RDF is a framework for representing information on the Web. RDF is particularly concerned with meaning. An RDF description is a collection of triples, each consisting of a subject, a predicate and an object, e.g. “MetadataExchange is-a DataAndMetadataExchange”
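The triple structure described on this slide can be sketched in plain Python. This is only an illustration, not the representation used in the project: the `Triple` type and the helper functions are assumptions, and only the first triple comes from the slide (the other two are hypothetical).

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Only the first triple is taken from the slide; the others are hypothetical.
graph = {
    Triple("MetadataExchange", "is-a", "DataAndMetadataExchange"),
    Triple("DataExchange", "is-a", "DataAndMetadataExchange"),  # hypothetical
    Triple("DataAndMetadataExchange", "is-a", "Exchange"),      # hypothetical
}


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return {t.obj for t in graph
            if t.subject == subject and t.predicate == predicate}


def subjects(graph, predicate, obj):
    """Return every subject linked to `obj` by `predicate`."""
    return {t.subject for t in graph
            if t.predicate == predicate and t.obj == obj}


broader = objects(graph, "MetadataExchange", "is-a")
narrower = subjects(graph, "is-a", "DataAndMetadataExchange")
```

Because each statement is an independent triple, new relations between terms can be added without changing any existing statement.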

10 Middle-range solution: using SKOS (Simple Knowledge Organization System), currently being developed within the W3C framework. A bridging technology between “chaos” and the more rigorous logical formalism of ontology languages (like OWL). It is an application of the Resource Description Framework (RDF) that provides a model for expressing the basic structure and content of concept schemes such as thesauri.

11 SKOS example: concept -data <rdf:RDF Characteristics or information, usually numerical, that are collected through observation data Data is the physical representation of information in a manner suitable for communication, interpretation, or processing by human beings or by automatic means (Economic Commission for Europe of the United Nations (UNECE), "Terminology on Statistical Metadata", Conference of European Statisticians Statistical Standards and Studies, No. 53, Geneva, 2000).
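The markup of this example did not survive in the transcript, so the following is a hedged sketch of how a concept such as “data” might be serialized as SKOS in RDF/XML, using Python's standard `xml.etree.ElementTree`. The concept URI is hypothetical, and the mapping of the surviving text to `skos:prefLabel`, `skos:definition` and `skos:scopeNote` is an assumption, not a reconstruction of the original slide.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)


def skos_concept(uri, pref_label, definition, scope_note=None):
    """Build an rdf:RDF element holding a single skos:Concept."""
    root = ET.Element(f"{{{RDF}}}RDF")
    concept = ET.SubElement(root, f"{{{SKOS}}}Concept",
                            {f"{{{RDF}}}about": uri})
    ET.SubElement(concept, f"{{{SKOS}}}prefLabel").text = pref_label
    ET.SubElement(concept, f"{{{SKOS}}}definition").text = definition
    if scope_note is not None:
        ET.SubElement(concept, f"{{{SKOS}}}scopeNote").text = scope_note
    return root


# Hypothetical URI; the label and definition text are quoted from the slide.
root = skos_concept(
    "http://example.org/mcv#data",
    "data",
    "Characteristics or information, usually numerical, "
    "that are collected through observation",
)
xml_text = ET.tostring(root, encoding="unicode")
```

Printing `xml_text` shows the serialized concept; relations such as `skos:broader` could be added in the same way to link concepts to one another.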

12 Ontologies Ontology = explicit formal specification of the terms in a domain (here, statistical metadata) and of the relations among them. It is a model of reality, created through iterative design, using an ontology editing and modeling system like Protégé (open source software).

13 Ontology reasoning It is essential to provide tools and services (reasoners) that help users answer queries over ontologies, classes and instances, e.g.: find more general/specific classes; retrieve individuals matching a query. Example: is there any survey with quarterly frequency that uses a classification system and has an on-line database as a dissemination format?
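The two kinds of query named on this slide can be illustrated with a small Python sketch. It is plain lookup code, not a description-logic reasoner such as those used with OWL, and every class name, individual and attribute in it is hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical taxonomy: each class maps to its direct superclass.
SUBCLASS_OF = {
    "Survey": "StatisticalActivity",
    "Census": "StatisticalActivity",
    "StatisticalActivity": "Thing",
}


def superclasses(cls):
    """More general classes, nearest first."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def subclasses(cls):
    """More specific classes (direct and indirect), sorted by name."""
    return sorted(c for c in SUBCLASS_OF if cls in superclasses(c))


@dataclass
class Individual:
    name: str
    cls: str
    frequency: str
    classifications: list = field(default_factory=list)
    dissemination_formats: list = field(default_factory=list)


# Hypothetical individuals.
individuals = [
    Individual("SurveyA", "Survey", "quarterly",
               ["ClassificationX"], ["on-line database"]),
    Individual("SurveyB", "Survey", "annual",
               ["ClassificationX"], ["on-line database"]),
    Individual("SurveyC", "Survey", "quarterly", [], ["paper"]),
]


def quarterly_online_surveys(individuals):
    """Surveys with quarterly frequency that use at least one classification
    and are disseminated as an on-line database."""
    return [i.name for i in individuals
            if i.cls == "Survey"
            and i.frequency == "quarterly"
            and i.classifications
            and "on-line database" in i.dissemination_formats]


matches = quarterly_online_surveys(individuals)
```

A real reasoner would also infer class membership and relations that are not stated explicitly; this sketch only filters on facts that are already recorded.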

14 Ontologies - methodology Developing an ontology: 1. Defining classes 2. Arranging classes in a taxonomic hierarchy (classes and subclasses) 3. Defining slots (same as roles or properties) 4. Describing allowed values for these slots (facets, role restrictions) 5. Filling in the values for slots for instances (individuals)
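The five steps above can be sketched in a few lines of Python. The class names, the `dimension` slot and its allowed values are illustrative assumptions (only “relevance” as a quality dimension appears elsewhere in these slides), and the dictionaries stand in for what an editor such as Protégé would manage.

```python
# Steps 1-2: classes arranged in a taxonomy (class -> direct superclass).
# All names are hypothetical.
CLASSES = {
    "Metadata": None,
    "QualityMetadata": "Metadata",
}

# Steps 3-4: slots per class, each with a facet listing its allowed values.
# "accuracy" is an assumed extra value added for illustration.
SLOTS = {
    "QualityMetadata": {"dimension": {"relevance", "accuracy"}},
}


def slots_for(cls):
    """Collect the slots of `cls`, including those inherited from superclasses."""
    merged = {}
    while cls is not None:
        for slot, allowed in SLOTS.get(cls, {}).items():
            merged.setdefault(slot, allowed)
        cls = CLASSES[cls]
    return merged


def make_instance(cls, **values):
    """Step 5: fill in slot values for an individual, checking the facets."""
    allowed = slots_for(cls)
    for slot, value in values.items():
        if slot not in allowed:
            raise KeyError(f"{cls} has no slot {slot!r}")
        if value not in allowed[slot]:
            raise ValueError(f"{value!r} is not allowed for slot {slot!r}")
    return {"class": cls, **values}


instance = make_instance("QualityMetadata", dimension="relevance")
```

Rejecting values outside the facet is what distinguishes this from a flat glossary: the structure itself constrains what an instance may state.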

15 Ontology - Classes A first attempt at building an ontology of statistical metadata: main classes created from MCV (according to SDMX Content-Oriented Guidelines: Framework, Draft March 2006, p. 6): 1. General metadata (derived from ISO, UNECE and UN documents); 2. Metadata describing statistical methodologies; 3. Metadata describing quality assessment; 4. Terms referring to data and metadata exchange (SDMX information model, data structure definitions, etc.).

16 Classes and subclasses (Protégé)

17 Classes and subclasses

18 Classes and subclasses Quality

19 Properties Property, Class (e.g. “Quality, according to Eurostat, has a dimension called relevance”): relevance

20 Codification - Web Ontology Language (OWL) ………………….. <rdfs:comment > Metadata Common Vocabulary (MCV) ontology. ……………………… // Object Properties ……………………….. // Classes
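Most of the OWL listing is elided in the transcript. As a hedged sketch of what an OWL file with this layout (ontology comment, object properties, classes) might contain, the Python below builds a small RDF/XML document with the standard library. The base URI, the `hasDimension` property and the `Quality` and `QualityDimension` classes are hypothetical; only the `rdfs:comment` text comes from the slide.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, local):
    """Qualified tag or attribute name in ElementTree's {uri}local form."""
    return f"{{{NS[prefix]}}}{local}"


BASE = "http://example.org/mcv-ontology"  # hypothetical base URI

root = ET.Element(q("rdf", "RDF"))

# Ontology header; the comment text is the one shown on the slide.
ontology = ET.SubElement(root, q("owl", "Ontology"), {q("rdf", "about"): BASE})
ET.SubElement(ontology, q("rdfs", "comment")).text = (
    "Metadata Common Vocabulary (MCV) ontology.")

# Object properties (hypothetical example).
prop = ET.SubElement(root, q("owl", "ObjectProperty"),
                     {q("rdf", "about"): f"{BASE}#hasDimension"})
ET.SubElement(prop, q("rdfs", "domain"),
              {q("rdf", "resource"): f"{BASE}#Quality"})
ET.SubElement(prop, q("rdfs", "range"),
              {q("rdf", "resource"): f"{BASE}#QualityDimension"})

# Classes (hypothetical examples).
for name in ("Quality", "QualityDimension"):
    ET.SubElement(root, q("owl", "Class"),
                  {q("rdf", "about"): f"{BASE}#{name}"})

owl_text = ET.tostring(root, encoding="unicode")
```

In practice an editor such as Protégé generates this serialization, so the sketch is only meant to make the structure of the file visible.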

21 Conclusion Since an ontology is a strict, rigorous and formal way of representing knowledge, mapping a glossary like the Metadata Common Vocabulary into a Statistical Metadata Ontology can help to reduce possible inconsistencies, incompleteness and lack of structure. This may facilitate the harmonization of concepts describing data (semantic univocity) for SDMX users.