Making Linked Data Diachronic Vassilis Christophides University of Crete & FORTH-ICS Heraklion, Crete.

Data as an asset!
One of the most significant changes of the past decade has been the widespread recognition of data as an asset
  – Data is the new “raw material of business” (The Economist)
Data Products

Emerging Data Ecosystem
Big Data has blurred the distinction between public and private
[Figure labels: Public; Volunteered Data; Curated Data; Observed Data]

Emerging Data Subjects
[Figure labels: data marketers; data brokers; data aggregators]
A series of data stewards, custodians, and curators are producing, consuming, and brokering data products, forming a far more complex value-making chain than in traditional enterprise or scientific contexts

What to Do with this Data?
Search:
  – Find structured data when it’s relevant to search queries
Visualize, enhance, communicate to relevant audiences
  – Support communities [biodiversity, climate, water, …]
Relate data across sources
Fuse data from multiple sources
  – Data integration!
Microsoft’s Approach to Big Data

Emerging Data Life-cycle

Data as a Service (DaaS)
[Figure labels: Data as a Service; Software as a Service; Platform as a Service; Infrastructure as a Service]
DaaS promises that data products can be provided on demand to the user regardless of geographic or organizational separation of provider and consumer
DaaS brings the notion that data-related services can happen in a centralized place: aggregation, quality control, cleansing, and enrichment of data, which is then offered to different systems, applications, or mobile users, irrespective of where they are
  – Virtualized
  – On-demand
  – Self-service
  – Scalable
  – Pay as you go

Data Marketplaces
Services that make it easy to find data from a range of secondary data sources, then consume the data in a usable and unified format
  – Several of these services are trying to create marketplaces for data, envisioning that data providers can offer their data sets for sale to data seekers (DataMarket.com)
[Figure labels: Data Aggregation and Curation Layer; Data Connection Layer; Data Visualization and Analysis Layer; Data Hosted by Third Party; Data Hosted by Data Provider; Data Hosted in Marketplace; Data as a Service; Preservation Service]

Vertical Data Markets
Source: François Bancilhon, Data Publica, “de data rerum”, WOD Tutorials 2013, Paris
Vertical                        Example            Size (M€)
Financial                       Reuters            300
Press                           Press Index        250
Legal                           Francis Lefebvre   240
Solvability                     Altarès            160
Scientific Technical Medical    Meteo France       160
Image                           Sipa               60
Economy                         Société.com        55
Marketing                       Acxiom             55
Patents                         Reuters            25

Only a Small Portion of Big Data! idgknowledgehub.com/idc-releases-first-worldwide-big-data-technology-and-services-market-forecast-shows-big-data-as-the-next-essential-capability-and-a-foundation-for-the-intelligent-economy/2012/05/07/

Data Hub for Market Intelligence
Source: Hjalmar Gislason, DataMarket, Inc., “Emerging DaaS business models: A case study”, European Data Forum (EDF), Dublin 2013

hortonworks.com/blog/7-key-drivers-for-the-big-data-market

Potential Benefits of Linked Data for Data Marketplaces
Abstraction layer for virtualized data access across sources
  – Basis for automating dataset discovery, linking, and fusion
Flexible data representation model (RDF) and global identifiers for all objects (URIs)
  – Eases incremental data integration, interactive exploration, and ad hoc analysis of data
Interlinked datasets
  – Newly added data can be integrated with existing data in the marketplace
  – Network effects
Data marketplace interoperability
  – Data from different marketplaces can be easily federated
Derived knowledge / facts
  – RDF inference of additional implicit facts
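
Two of the benefits listed above, incremental integration through shared URIs and inference of implicit facts, can be illustrated with a minimal sketch. It is not part of the original slides: it models triples as plain Python tuples rather than using an RDF library, ignores blank nodes, and every `http://example.org/` URI is a hypothetical placeholder. The only inference rule applied is the RDFS subclass rule.

```python
# Triples are (subject, predicate, object) tuples; URIs are plain strings.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Two independently published datasets that reuse the same URI for one object.
dataset_a = {
    (EX + "station/42", RDF_TYPE, EX + "WeatherStation"),
    (EX + "WeatherStation", RDFS_SUBCLASS, EX + "Sensor"),
}
dataset_b = {
    (EX + "station/42", EX + "locatedIn", EX + "city/Heraklion"),
}


def merge(*graphs):
    """Integrate graphs incrementally: without blank nodes, merging is set union."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def infer_types(graph):
    """Apply the RDFS subclass rule repeatedly until no new triple is derived."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        subclass_pairs = {(s, o) for s, p, o in inferred if p == RDFS_SUBCLASS}
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                derived = (s, RDF_TYPE, sup)
                if o == sub and derived not in inferred:
                    inferred.add(derived)
                    changed = True
    return inferred


marketplace = infer_types(merge(dataset_a, dataset_b))
```

Because both datasets name the station with the same URI, the union alone links its type to its location, and the subclass rule adds the implicit fact that the station is also a `Sensor`.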

Web Data of Increasing Standardization
Not all linked data is open and not all open data is linked!
★ Available on the web (whatever format) but with an open license, to be Open Data
★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)
★★★ As (2), plus non-proprietary format (e.g. CSV instead of Excel)
★★★★ As (3), plus using open standards from W3C (RDF and SPARQL) to identify things through dereferenceable HTTP URIs, to ensure effective access
★★★★★ As all the above, plus establishing links between data of different sources
File format recommendations (on a scale of 0-5):
csv    ★★★
xls    ★
pdf    ★
doc    ★
xml    ★★★★
rdf    ★★★★★
shp    ★★★
ods    ★★
tiff   ★
jpeg   ★
json   ★★★
txt    ★
html   ★★
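
The star scheme above is cumulative: each level requires all the previous ones. A minimal sketch of that rule follows, assuming a hypothetical `Dataset` record with one boolean per criterion. The field names are illustrative and do not come from the slides or from any standard vocabulary.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset, one flag per star level."""
    open_license_on_web: bool
    machine_readable: bool
    non_proprietary_format: bool
    uses_rdf_and_http_uris: bool
    links_to_other_sources: bool


def star_rating(dataset: Dataset) -> int:
    """Count stars cumulatively: stop at the first criterion that is not met."""
    criteria = [
        dataset.open_license_on_web,
        dataset.machine_readable,
        dataset.non_proprietary_format,
        dataset.uses_rdf_and_http_uris,
        dataset.links_to_other_sources,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


open_csv = Dataset(True, True, True, False, False)
scanned_pdf = Dataset(True, False, False, False, False)
linked_rdf = Dataset(True, True, True, True, True)
```

Under these assumptions an openly licensed CSV file rates three stars, a scanned PDF one star, and interlinked RDF five stars. A dataset without an open license rates zero regardless of its format.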

Key Players Offers Classification Data Cube +

DIACHRON Objectives & Approach
[Figure labels: Appraising; Integrating; Archiving; Producing; Publishing; Cleaning]
Preserve (semi-)structured, interrelated, evolving data by keeping them constantly accessible and reusable from an open framework such as the Data Web
Calls for effective and efficient techniques to manage the lifecycle of web data involving data producers, curators, brokers, and consumers
  – Pay-as-you-go data preservation, spreading costs among key players in a community of interest
Diachronic Data: Enhance data with temporal and provenance annotations as data products are re-used through complex value-making chains
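
One way to picture data enhanced with temporal and provenance annotations is a record that carries its validity interval and the chain of parties that handled it. The sketch below is an illustration under stated assumptions, not the DIACHRON data model: the class `DiachronicFact`, the function `reuse`, all URIs, host names, and the value are hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class DiachronicFact:
    """A statement annotated with a validity interval and a provenance chain."""
    subject: str
    predicate: str
    value: str
    valid_from: date
    valid_to: Optional[date] = None      # None means the fact is still valid
    provenance: Tuple[str, ...] = ()     # parties that produced or re-used the fact

    def holds_on(self, day: date) -> bool:
        """True if the fact is valid on the given day (end date is exclusive)."""
        if day < self.valid_from:
            return False
        return self.valid_to is None or day < self.valid_to


def reuse(fact: DiachronicFact, new_holder: str) -> DiachronicFact:
    """Return a copy for a new holder, keeping the temporal annotation
    and appending the holder to the provenance chain."""
    return replace(fact, provenance=fact.provenance + (new_holder,))


# Made-up example data.
original = DiachronicFact(
    subject="http://example.org/city/Heraklion",
    predicate="http://example.org/population",
    value="100000",
    valid_from=date(2011, 1, 1),
    valid_to=date(2021, 1, 1),
    provenance=("producer.example.org",),
)
brokered = reuse(original, "broker.example.org")
```

Keeping the record immutable means each step in the value chain yields a new annotated copy, so the original producer's version stays intact while the provenance chain grows.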

DIACHRON Research Agenda
How can we assess the quality of harvested datasets in order to decide which versions (the data quality dimensions problem) and how many of them deserve to be preserved for future use (the appraisal problem)?
How can we understand dependencies between datasets (the provenance problem), and how can metadata (temporal, spatial, thematic) be smoothly represented alongside the data (the annotation problem)?
How can we monitor changes to third-party datasets (the evolution tracking problem), and how can local/remote data imperfections (e.g., due to change propagation) be repaired (the curation problem)?
How do we cite particular versions of a dataset (the citation problem), and how will we be able to retrieve them when looking up a reference (the long-term accessibility problem)?
How do we maintain the consistency of multiple versions of dependent datasets (the archiving problem), and how will we access the datasets along their evolution history (the longitudinal querying problem)?
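
To make the evolution tracking and longitudinal querying problems concrete, here is a minimal sketch that stores full snapshots of a triple set, computes the delta between two versions, and answers "as of" and history queries. It is an assumption-laden illustration, not the DIACHRON archive design: `delta`, `VersionArchive`, and the example data are hypothetical, and a real archive would likely store deltas rather than full snapshots.

```python
from datetime import date


def delta(old, new):
    """Evolution tracking: the triples added and removed between two versions."""
    old, new = set(old), set(new)
    return {"added": new - old, "removed": old - new}


class VersionArchive:
    """Keeps dated snapshots of a dataset, committed in chronological order."""

    def __init__(self):
        self._versions = []  # list of (date, frozenset of triples)

    def commit(self, day, triples):
        if self._versions and day <= self._versions[-1][0]:
            raise ValueError("versions must be committed in chronological order")
        self._versions.append((day, frozenset(triples)))

    def as_of(self, day):
        """Return the latest snapshot committed on or before the given day."""
        snapshot = None
        for version_day, triples in self._versions:
            if version_day > day:
                break
            snapshot = triples
        if snapshot is None:
            raise LookupError("no version exists on or before that date")
        return snapshot

    def history(self, triple):
        """Longitudinal query: the commit dates of versions containing the triple."""
        return [day for day, triples in self._versions if triple in triples]


EX = "http://example.org/"  # hypothetical namespace
v1 = {(EX + "item/1", EX + "label", "alpha")}
v2 = {(EX + "item/1", EX + "label", "alpha v2"),
      (EX + "item/2", EX + "label", "beta")}

archive = VersionArchive()
archive.commit(date(2013, 1, 1), v1)
archive.commit(date(2014, 1, 1), v2)
changes = delta(v1, v2)
```

A citation that records the dataset identifier together with a date can then be resolved with `as_of`, which is one simple reading of the citation and long-term accessibility problems.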

DIACHRON Data Services & Work Plan
[Figure labels: work packages WP2, WP3, WP4, WP5, WP6, WP7, WP8, WP9]

Diachronic Data Services Lifecycle
[Figure labels: Data Repurposing; Data Archiving; Data Evolution; Data Appraisal; Data Citation]

Concluding Remarks
The integrated DIACHRON platform and services aim to support long-term usability of open and/or linked data published on the Web and within enterprise intranets
The concept of diachronic data intends to foster self-preserving data that embed an understanding of their evolving semantics, use contexts, and interpretations
DIACHRON is expected to:
  – Improve our understanding of how linked/open data evolves
  – Reduce the maintenance costs when integrating linked/open data
  – Foster data accountability and transparency in open dynamic data spaces
  – Address sustainability issues for preserving Big Data
Fix Overall Data Preservation Effort

Business Models for Linked Data Publishers

Business Webs as Types of Value Creation
Agora: Open electronic marketplaces with regard to pricing and offered products (e.g. Android marketplace)
Aggregation: Closed, controlled electronic marketplaces (e.g. Apple App Store)
Distributed Network: Value Network
Value Chain: ICT-enabled Value Chains
Alliance: Loosely cooperating market players (e.g. Open Source projects)

Data-Driven Business Models
Source: Michalis Vafopoulos