Published by Frank Booth. Modified over 9 years ago.
1
The Data Cube Vocabulary: Statistics in the Web of Linked Data
Arofan Gregory, Open Data Foundation
WICS, Geneva, 5-7 May 2015
2
Outline
RDF Fundamentals
The Data Cube Vocabulary (QB)
Implications for Modernization
Conclusions
3
What is RDF?
RDF – the Resource Description Framework
A family of specifications published by the W3C
  – Championed by (Sir) Tim Berners-Lee
  – Very popular with the Open Data movement
  – Basis of the “Semantic Web” and the “Web of Linked Data”
Allows for machine-actionable, semantically rich linking of things found on the Web
4
RDF Fundamentals (1)
RDF is based on a simple construct: the Triple
Subject → Predicate → Object
5
RDF Fundamentals (2)
The Subject and Object can be anything on the Web that can be addressed by a URL
  – This includes portions of files (a paragraph in an HTML document, for example)
The Object can also be a literal value (string, number, etc.)
The Predicate describes the relationship between Subject and Object
  – Predicates (relationships) and properties are described for the important objects within a domain
  – Such a set of descriptions is termed a “vocabulary” or an “ontology”
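The triple structure described above can be sketched in plain Python. This is an illustration only, not an RDF library: the `URIRef`, `Literal` and `Triple` types are hypothetical helpers, and the `example.org` addresses are made up. The two predicate URIs are Dublin Core terms, chosen here simply as a familiar vocabulary.

```python
from typing import NamedTuple, Union


class URIRef(str):
    """A Web identifier (hypothetical helper type, not a library class)."""


class Literal(NamedTuple):
    """A literal value such as a string or a number."""
    value: Union[str, int, float]


class Triple(NamedTuple):
    subject: URIRef
    predicate: URIRef
    obj: Union[URIRef, Literal]


# Hypothetical resources: a document and one paragraph inside it.
DOC = URIRef("http://example.org/report.html")
PARA = URIRef("http://example.org/report.html#para-3")  # fragment = portion of a file

triples = [
    # Here the Object is another Web resource.
    Triple(PARA, URIRef("http://purl.org/dc/terms/isPartOf"), DOC),
    # Here the Object is a literal value.
    Triple(DOC, URIRef("http://purl.org/dc/terms/title"), Literal("Annual Report")),
]

for t in triples:
    kind = "literal" if isinstance(t.obj, Literal) else "resource"
    print(t.subject, t.predicate, t.obj, f"({kind})")
```

Keeping literals in a separate type mirrors the rule on the slide: a Subject is always addressable, while an Object may be either a resource or a plain value.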
6
RDF Fundamentals (3)
Triples can be published anywhere on the Web, to refer to Subjects and Objects anywhere on the Web
  – On-going addition of information
  – Users can add to the information set as well as publishers
Web resources are usually identified with URLs, but can be identified with URIs
  – URLs are a type of URI
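A minimal sketch of why statements from different publishers combine so easily, assuming a graph is modelled as a plain Python set of (subject, predicate, object) tuples. All `example.org` and `urn:example:` identifiers are hypothetical, and `is_locator` is a deliberately rough simplification that treats only http(s) URIs as URLs.

```python
from urllib.parse import urlparse

# Statements published by the data producer.
publisher_graph = {
    ("http://example.org/dataset/1",
     "http://purl.org/dc/terms/title",
     "Population by region"),
}

# A third party, publishing elsewhere, adds a statement about the same Subject.
user_graph = {
    ("http://example.org/dataset/1",
     "http://www.w3.org/2000/01/rdf-schema#seeAlso",
     "urn:example:related-study"),
}

# Each triple is a self-contained statement, so merging is just set union.
merged = publisher_graph | user_graph


def is_locator(uri: str) -> bool:
    """Rough check (an assumption of this sketch): http(s) URIs locate a
    resource, while other schemes such as urn: only name it."""
    return urlparse(uri).scheme in {"http", "https"}


for subject, predicate, obj in sorted(merged):
    print(subject, predicate, obj)
```

Both identifiers used as Objects here are URIs, but only the http one is also a URL in the sense of this sketch.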
7
The Web of Linked Data
The sum total of all RDF-linked data on the Web is termed the “Web of Linked Data” (or “Linked Data on the Web” [LDOW])
This replaces the earlier term, the “Semantic Web”
Open Government movements are exerting strong pressure to add data to the Web of Linked Data
  – Seems to be successful, but still in early stages
8
Data Cube
The main RDF vocabulary for describing statistical data is the Data Cube Vocabulary (QB)
  – Published by the W3C
  – SDMX experts were involved in its development
It is a simplified version of the SDMX model
  – It only covers data structures and data sets
  – No data flows, no provision agreements, no reference metadata
  – No need for web services or a registry: RDF uses different mechanisms
It has become very widely used over the past few years:
  – http://wiki.planet-data.eu/web/Datasets
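To make the “data structures and data sets” scope concrete, here is a minimal sketch that emits a QB data set, its structure definition and one observation as N-Triples lines. The QB and SDMX-RDF namespace URIs are the published ones; everything under `http://example.org/stats/` (the data set name, area and period codes, and the value 8.2) is hypothetical, and the helper functions are assumptions of this sketch rather than part of any library.

```python
QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SDMX_DIM = "http://purl.org/linked-data/sdmx/2009/dimension#"
SDMX_MEASURE = "http://purl.org/linked-data/sdmx/2009/measure#"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"
EX = "http://example.org/stats/"  # hypothetical publisher namespace


def iri(value: str) -> str:
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def observation_triples(obs_id, dataset, area, period, value):
    """Return N-Triples lines for one qb:Observation (illustrative helper)."""
    s = iri(EX + obs_id)
    return [
        f"{s} {iri(RDF_TYPE)} {iri(QB + 'Observation')} .",
        f"{s} {iri(QB + 'dataSet')} {iri(EX + dataset)} .",
        f"{s} {iri(SDMX_DIM + 'refArea')} {iri(EX + 'area/' + area)} .",
        f"{s} {iri(SDMX_DIM + 'refPeriod')} {iri(EX + 'period/' + period)} .",
        f'{s} {iri(SDMX_MEASURE + "obsValue")} "{value}"^^{iri(XSD_DECIMAL)} .',
    ]


# The data set and a pointer to its data structure definition.
lines = [
    f"{iri(EX + 'population')} {iri(RDF_TYPE)} {iri(QB + 'DataSet')} .",
    f"{iri(EX + 'population')} {iri(QB + 'structure')} {iri(EX + 'population-dsd')} .",
    f"{iri(EX + 'population-dsd')} {iri(RDF_TYPE)} {iri(QB + 'DataStructureDefinition')} .",
]
# One observation with made-up dimension values and a made-up figure.
lines += observation_triples("obs1", "population", "CH", "2014", 8.2)

print("\n".join(lines))
```

A complete structure definition would also declare its dimension and measure components; they are left out here to keep the sketch short.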
9
Data Cube Vocabulary
10
Who is Using Data Cube?
Most statistical data in Data Cube form is published by third parties
  – Not by the producers of the data
  – Usually done by well-intentioned Open Data adherents
Some activity within statistical agencies
  – INSEE (France), ISTAT, Eurostat, CSO (Ireland), OECD, others
  – Mostly exploratory
11
Open Cube Project
One interesting project which uses Data Cube is the “Open Cube Project”
EC-funded project
Developed open-source tools for visualizing and working with Data Cube RDF:
  – Alpha-level release
  – Will be some on-going work
  – http://opencube-project.eu/
12
Related Vocabularies
XKOS
  – Statistical classifications
  – Published by the DDI Alliance
  – Aligned with QB
DDI Discovery Vocabulary (DISCO)
  – Logical description of microdata
  – Published by the DDI Alliance
  – Aligned with QB
Physical Data Description
  – Physical description of microdata
  – Published by the DDI Alliance
  – Aligned with DISCO, QB
13
Implications for Modernization
RDF technology provides potential for new functionality
  – On-going enhancement of linkages with data published on the Web
  – Users can enhance what we know about published statistics
  – Experience of data is static today, but could become interactive
New statistical products can be imagined
  – Cost of producing QB is low, benefits could be significant
14
Conclusions
Data producers should publish QB
  – Retain ownership of their data
  – Easy to do if SDMX can be supported
Presents the possibility of new types of statistical products at low cost
  – This is yet to be explored