Open Provenance Model Tutorial Session 4: Use cases from data.gov.uk.

Outline
Background about data.gov.uk
The use cases
– XML serialization
– Data transformation on the fly
– Complex and nested processes

data.gov.uk
Linking UK government data
Aims:
– Provide a set of best practices for government agencies
– Provide the minimum set of tooling and specification to facilitate the publication of data
– Encourage “responsible” data publishing

XML -> RDF
[Diagram labels: XSLT Processor; XSLT Parameter Binding (twice); XSLT Stylesheet; XSLT Template; input; output; RDF File]
Who, when, which version, how
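The elements of this diagram can be written down as a small OPM-style graph. The following is a minimal Python sketch, not the data.gov.uk implementation. The edge names `used`, `wasGeneratedBy`, `wasControlledBy` and `wasDerivedFrom` are OPM edges; everything else (the `ProvenanceGraph` class, `record_xslt_run`, the file names, roles, agent and version string) is hypothetical and chosen only for illustration. No XSLT is actually executed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProvenanceGraph:
    """Tiny in-memory graph: nodes plus (effect, edge, cause, role) tuples."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id, kind, **attrs):
        self.nodes[node_id] = {"kind": kind, **attrs}
        return node_id

    def add_edge(self, effect, edge, cause, role=None):
        self.edges.append((effect, edge, cause, role))


def record_xslt_run(graph, source, stylesheet, bindings, output,
                    agent, version, when=None):
    """Record who, when, which version and how for one XML -> RDF run."""
    when = when or datetime.now(timezone.utc).isoformat()
    # "when" and "which version" are kept as attributes of the process.
    run = graph.add_node("run:xslt-1", "process",
                         processor_version=version, started=when)
    for node_id in (source, stylesheet, output):
        graph.add_node(node_id, "artifact")
    graph.add_node("bindings:1", "artifact", values=dict(bindings))
    graph.add_node(agent, "agent")
    # "how": the inputs the process used, each with a (hypothetical) role.
    graph.add_edge(run, "used", source, role="input")
    graph.add_edge(run, "used", stylesheet, role="stylesheet")
    graph.add_edge(run, "used", "bindings:1", role="parameters")
    graph.add_edge(output, "wasGeneratedBy", run, role="output")
    # "who": the agent controlling the process.
    graph.add_edge(run, "wasControlledBy", agent, role="operator")
    graph.add_edge(output, "wasDerivedFrom", source)
    return run


graph = ProvenanceGraph()
record_xslt_run(
    graph,
    source="data.xml",                 # hypothetical file names
    stylesheet="to-rdf.xsl",
    bindings={"baseUri": "http://example.org/"},
    output="data.rdf",
    agent="agent:publisher",           # hypothetical agent
    version="hypothetical-1.0",
    when="2010-01-01T00:00:00+00:00",
)
for edge in graph.edges:
    print(edge)
```

Keeping the roles on the edges makes it possible to tell the stylesheet apart from the input document when both are artifacts used by the same process.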

[Diagram labels: XSLT Processor; input; output; RDF File; XSLT Parameter Binding (twice); XSLT Stylesheet; XSLT Template; Downloaded from; Unzipped from, etc; Made accessible]
Who, when, which version, how

On-the-fly Transformation
[Diagram label: Data transformation wrapper]
Who, when, which version, how
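One way to picture a data transformation wrapper is as a decorator that records provenance each time a transformation runs. This is a sketch under stated assumptions, not the wrapper used on data.gov.uk: `with_provenance`, `PROVENANCE_LOG`, the record keys, the agent name and the stand-in transformation are all hypothetical.

```python
import functools
import hashlib
from datetime import datetime, timezone

# Hypothetical in-memory store; a real service would persist these records.
PROVENANCE_LOG = []


def with_provenance(agent, version):
    """Wrap a bytes -> bytes transformation so each call is recorded."""
    def decorate(transform):
        @functools.wraps(transform)
        def wrapper(data: bytes) -> bytes:
            started = datetime.now(timezone.utc).isoformat()
            result = transform(data)
            PROVENANCE_LOG.append({
                "process": transform.__name__,    # how
                "controlledBy": agent,            # who
                "version": version,               # which version
                "started": started,               # when
                # Content hashes identify the artifacts used and generated.
                "used": hashlib.sha256(data).hexdigest(),
                "generated": hashlib.sha256(result).hexdigest(),
            })
            return result
        return wrapper
    return decorate


@with_provenance(agent="agent:web-service", version="hypothetical-0.1")
def uppercase_transform(data: bytes) -> bytes:
    # Stand-in for a real transformation such as an XSLT run.
    return data.upper()


output = uppercase_transform(b"<row>value</row>")
print(output)
print(PROVENANCE_LOG[0]["process"], PROVENANCE_LOG[0]["controlledBy"])
```

Because the record is produced inside the wrapper, the transformation itself needs no knowledge of provenance, which suits data that is generated on request rather than stored.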

Complex Data Creation Pipeline
[Diagram labels: GATE Pipeline; GateXMLRegressionTransformation; GateXMLRdfaTransformation; RdfaRdfXmlTransformation]
Courtesy of Paul Appleby from TSO (Data Enrichment Service)

Complex Data Creation Pipeline
[Diagram labels: GATE Pipeline; GateXMLRegressionTransformation; GateXMLRdfaTransformation; RdfaRdfXmlTransformation]
GATE Pipeline components:
– Document Reset PR
– ANNIE English Tokeniser
– ANNIE English Splitter
– ANNIE POS Tagger
– Data.gov.uk Morphological Analyzer
– Data.gov.uk Flexible Roof Gazetteer
– Data.gov.uk Generic Gazetteer
– GATE Noun Phrase Chunker
– Data.gov.uk Generic Transducer
– TSO Coreference
Courtesy of Paul Appleby from TSO (Data Enrichment Service)

[Diagram labels]
Level 1: Provenance of execution at higher level
Level 0: Provenance of execution at detailed level
Other labels: wasGeneratedBy; hasParentProcess; iterationOfProcess; Services used by executions; Artifacts followed wasDerivedFrom; A data collection; wasTriggeredBy; accessedService
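The two levels can be related by a simple graph walk. The sketch below assumes that a detailed (Level 0) process points to its higher-level (Level 1) process through `hasParentProcess`; the slide names the edge but not its direction, so that direction is an assumption. The node identifiers and the helper functions `targets` and `high_level_provenance` are hypothetical.

```python
# (source, edge label, target) triples; identifiers are illustrative only.
EDGES = [
    ("artifact:rdf", "wasGeneratedBy", "proc:RdfaRdfXmlTransformation"),
    ("artifact:rdf", "wasDerivedFrom", "artifact:rdfa"),
    ("proc:RdfaRdfXmlTransformation", "wasTriggeredBy",
     "proc:GateXMLRdfaTransformation"),
    # Assumed direction: Level 0 process -> Level 1 process.
    ("proc:RdfaRdfXmlTransformation", "hasParentProcess", "proc:pipeline-run"),
    ("proc:GateXMLRdfaTransformation", "hasParentProcess", "proc:pipeline-run"),
]


def targets(edges, source, label):
    """Return every target reachable from `source` over one `label` edge."""
    return [t for s, l, t in edges if s == source and l == label]


def high_level_provenance(edges, artifact):
    """Find the Level 1 processes behind the process that made an artifact."""
    parents = []
    for process in targets(edges, artifact, "wasGeneratedBy"):
        for parent in targets(edges, process, "hasParentProcess"):
            if parent not in parents:
                parents.append(parent)
    return parents


print(high_level_provenance(EDGES, "artifact:rdf"))
```

A consumer who only needs the coarse view can stop at the Level 1 process, while the `wasTriggeredBy` and `wasDerivedFrom` edges remain available for the detailed view.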

Non-digital Data Objects
Organizations
– Organizational structure changes over time
– Origin organization, resulting organization
Boundary
Legislation
An organization ontology:

The Challenges
Data of different representations, physical forms, and granularity
No tooling support
Provenance across different types of systems
– Identification
– Different terminologies

The Gaps
A vocabulary able to describe the provenance of all types of data, from different systems
A vocabulary that still provides enough terms to describe provenance accurately

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License (