Published by Brice Smith. Modified over 6 years ago.
1
YourDataStories: Transparency and Corruption Fighting through Data Interlinking and Visual Exploration
Georgios Petasis (1), Anna Triantafillou (2), Eric Karstens (3)
(1) National Centre for Scientific Research (NCSR) “Demokritos”, Athens, Greece
(2) Athens Technology Center (ATC), Athens, Greece
(3) European Journalism Centre (EJC), Maastricht, The Netherlands
2
Overview
  Motivation
  The YourDataStories approach
  Implementation
  Evaluation
  Conclusions and future directions
3
Open Data
4
Open Data…
5
Open Data…
6
Open Data: Current Status
7
Open Data: Current Status
We have data!
  Many datasets available
  Many areas covered
Open data has the potential to spur:
  economic innovation;
  social transformation; and
  fresh forms of political and government accountability
But… heterogeneous and immature
8
Motivation
Economic open data: current status
  Diverse and incompatible
  Lack of standardization; plethora of vocabularies in use
  Lack of visibility of existing data
  Citizens/users do not seem to be attracted by the existing solutions and tools
With great potential! We need tools and infrastructure to alleviate these problems.
Data is diverse and incompatible, which leads to difficult and tedious procedures for processing and consumption:
  combining, aligning and merging different datasets for a more complete picture;
  processing different time-spans of the same dataset;
  concentrating on specific aspects of the data through filtering.
The lack of standardization and the plethora of vocabularies in use lead to:
  less effective usage and consumption;
  production of low-quality results and conclusions;
  data that is less suitable for machine consumption.
The visibility of existing data is quite limited: tools and public services providing access to open data are not massively popular, and they do not seem to answer anything beyond the simple questions that can be answered by an indexed search engine.
Citizens and users do not seem to be attracted by the existing solutions and tools. There is hardly any interaction or integration between open data and social media, due to a lack of social characteristics.
9
YourDataStories Solution
A platform for data exploration focused on the financial flows that are critical for transparency, collaboration and participation
10
Our Approach (1)
The starting point is a triple store
  We assume that economic data have already been crawled, cleaned, converted to RDF following a common ontology, and stored in a triple store
A SPARQL endpoint is analysed
  Through a set of predefined queries
  Aiming to retrieve the underlying data model: classes, properties, types, cardinality
  Analysis at the RDFS level
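As a rough illustration of what one "predefined query" and its post-processing might look like, the sketch below pairs a generic SPARQL query (which properties are used on instances of which class) with a helper that groups the answer by class. The query text, the function name `group_properties_by_class` and the `example.org` IRIs are assumptions made for this example; the slides do not show the actual queries used by YourDataStories. The helper expects the standard SPARQL 1.1 JSON results layout and does not contact any endpoint.

```python
from collections import defaultdict

# Hypothetical model-discovery query (not taken from the slides):
# list every property used on instances of each class, ignoring rdf:type itself.
CLASS_PROPERTY_QUERY = """
SELECT DISTINCT ?class ?property WHERE {
  ?instance a ?class ;
            ?property ?value .
  FILTER(?property != <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>)
}
"""


def group_properties_by_class(results):
    """Group property IRIs by class IRI.

    `results` is a parsed SPARQL 1.1 JSON results document, i.e. a dict
    with results -> bindings -> {variable: {"type": ..., "value": ...}}.
    """
    model = defaultdict(set)
    for binding in results["results"]["bindings"]:
        model[binding["class"]["value"]].add(binding["property"]["value"])
    # Sort the properties so the output is deterministic.
    return {cls: sorted(props) for cls, props in model.items()}


# Hand-written sample response with invented IRIs, for illustration only.
SAMPLE_RESULTS = {
    "head": {"vars": ["class", "property"]},
    "results": {
        "bindings": [
            {"class": {"type": "uri", "value": "http://example.org/Contract"},
             "property": {"type": "uri", "value": "http://example.org/buyer"}},
            {"class": {"type": "uri", "value": "http://example.org/Contract"},
             "property": {"type": "uri", "value": "http://example.org/amount"}},
            {"class": {"type": "uri", "value": "http://example.org/Organisation"},
             "property": {"type": "uri", "value": "http://example.org/name"}},
        ]
    },
}

if __name__ == "__main__":
    for cls, props in group_properties_by_class(SAMPLE_RESULTS).items():
        print(cls, props)
```

Data types and cardinality would need further queries of the same kind; they are left out here to keep the sketch short.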
11
Our Approach (2)
The result of the analysis is a set of graphs
  “Top” nodes, representing classes
  Nodes representing properties and data types
Nodes have:
  Scale
  Role
  Cardinality
12
Our Approach (3)
The “top” concepts are heuristically identified
  Using information like graph centrality
The information contained in the graphs is used to automatically support operations
  Data selection
  Data visualisation
  Analytics
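The slides mention graph centrality only as one kind of information used by the heuristic, without naming a specific measure. A minimal sketch, assuming plain degree centrality over the class graph, could look as follows; the function `top_concepts`, the edge format and the sample class names are all hypothetical.

```python
from collections import Counter


def top_concepts(edges, k=2):
    """Return the k classes with the highest degree.

    `edges` is an iterable of (source_class, property, target_class)
    triples describing links between classes in the extracted model.
    Degree centrality is an assumption of this sketch; the slides only
    say that information "like graph centrality" is used.
    """
    degree = Counter()
    for source, _prop, target in edges:
        degree[source] += 1
        degree[target] += 1
    # Highest degree first; ties are broken alphabetically for determinism.
    ranked = sorted(degree, key=lambda node: (-degree[node], node))
    return ranked[:k]


# Invented model edges, for illustration only.
SAMPLE_EDGES = [
    ("Contract", "buyer", "Organisation"),
    ("Contract", "seller", "Organisation"),
    ("Contract", "amount", "Amount"),
    ("Project", "contract", "Contract"),
]

if __name__ == "__main__":
    print(top_concepts(SAMPLE_EDGES))
```

In this toy model, Contract touches four edges and Organisation two, so these two classes are ranked first.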
13
Data Selection (1)
Graphs are used to extract an indexing schema
  Currently Apache Solr is supported
  Instances of top concepts are converted into JSON-LD objects and indexed
  JSON-LD objects can contain instances of non-top concepts
The YDS “advanced search” application is configured
  For querying indexed resources
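To make the conversion step more concrete, the sketch below builds a JSON-LD object for an instance of a top concept and nests an instance of a non-top concept inside it. The helper `to_jsonld`, the context and all IRIs are invented for this example; the actual YDS document layout and the Solr indexing call are not shown in the slides and are not reproduced here.

```python
import json


def to_jsonld(instance_iri, type_iri, properties, context):
    """Build a JSON-LD object for one instance.

    `properties` maps term names to plain values or to nested JSON-LD
    objects (for example an instance of a non-top concept).
    """
    doc = {"@context": context, "@id": instance_iri, "@type": type_iri}
    doc.update(properties)
    return doc


# Hypothetical vocabulary and data, for illustration only.
CONTEXT = {"ex": "http://example.org/"}

buyer = {
    "@id": "ex:org/42",
    "@type": "ex:Organisation",   # non-top concept nested in the parent
    "ex:name": "Example Ministry",
}

contract = to_jsonld(
    "ex:contract/1",
    "ex:Contract",                # top concept: indexed as its own document
    {"ex:amount": 1500.0, "ex:buyer": buyer},
    CONTEXT,
)

if __name__ == "__main__":
    # The serialised string is what would be handed to the search index.
    print(json.dumps(contract, indent=2))
```

Nesting the buyer inside the contract keeps related values in a single indexed document, which is one plausible reading of "JSON-LD objects can contain instances of non-top concepts".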
14
Data Selection (2)
15
Data Visualisation (1)
Graphs are used to extract visualisation information
  What properties can be used as x/y axes, in which plot types
The YDS “Workbench” application is configured
  For generating custom plots of various types
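The slides do not state the rules that map properties to axes and plot types. As a sketch under the assumption that the decision is driven by each property's data type, the following pairs numeric properties (y axis) with temporal or categorical properties (x axis). The datatype groupings, the function `suggest_plots` and the sample properties are hypothetical.

```python
# Assumed datatype groupings; not taken from the slides.
NUMERIC_TYPES = {"xsd:decimal", "xsd:integer", "xsd:double"}
TEMPORAL_TYPES = {"xsd:date", "xsd:dateTime", "xsd:gYear"}


def suggest_plots(properties):
    """Suggest (plot_type, x_property, y_property) combinations.

    `properties` maps a property name to its datatype. Numeric
    properties become y axes; temporal properties give line plots and
    all remaining (categorical) properties give bar plots.
    """
    numeric = sorted(p for p, t in properties.items() if t in NUMERIC_TYPES)
    temporal = sorted(p for p, t in properties.items() if t in TEMPORAL_TYPES)
    categorical = sorted(
        p for p, t in properties.items()
        if t not in NUMERIC_TYPES and t not in TEMPORAL_TYPES
    )
    suggestions = []
    for y in numeric:
        suggestions.extend(("line", x, y) for x in temporal)
        suggestions.extend(("bar", x, y) for x in categorical)
    return suggestions


# Invented properties of a contract-like class, for illustration only.
SAMPLE_PROPERTIES = {
    "amount": "xsd:decimal",
    "date": "xsd:date",
    "buyer": "xsd:string",
}

if __name__ == "__main__":
    for suggestion in suggest_plots(SAMPLE_PROPERTIES):
        print(suggestion)
```

A real configuration would presumably also use the scale, role and cardinality attached to each node, which this sketch ignores.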
16
Data Visualisation (2)
17
Architecture (diagram)
Components shown: Input Data, Model Analysis, Model Specification, Search Configuration, Search, SPARQL Views, SPARQL Cache (MySQL), JSON-LD API, JSON Cache (MySQL), Web Applications, Analytics, Developers, …
18
Components and Applications
19
Components and Applications
20
Components and Applications
21
Components and Applications
22
Evaluation (1)
The proposed approach has been used to create a set of applications
  On some preselected datasets
Several development cycles were evaluated
  More than 60 users
  Primarily journalists, public sector employees, and representatives of NGOs and business
Two scenarios:
  Complete half-open scenarios
  Explore the solution on their own
23
Evaluation (2) Evaluation Results
24
Conclusions and Future Work
Evaluation results suggest that:
  Non-experts can access and analyse data with minimum time and effort
  Integrating data from different sources, along with the powerful navigation and visualisations, enables insights that cannot be gleaned from the original sources
Future work:
  Test our approach on more datasets
  On domains other than economic data
  Assess the development of analytics for a new domain
25
http://www.yourdatastories.eu
http://platform.yourdatastories.eu
Thank you!