YourDataStories: Transparency and Corruption Fighting through Data Interlinking and Visual Exploration
Georgios Petasis, Anna Triantafillou, Eric Karstens

YourDataStories: Transparency and Corruption Fighting through Data Interlinking and Visual Exploration
Georgios Petasis (1), Anna Triantafillou (2), Eric Karstens (3)
(1) National Centre for Scientific Research (NCSR) “Demokritos”, Athens, Greece
(2) Athens Technology Center (ATC), Athens, Greece
(3) European Journalism Centre (EJC), Maastricht, The Netherlands

Overview
- Motivation
- The YourDataStories approach
- Implementation
- Evaluation
- Conclusions and future directions

Open Data

Open Data…

Open Data: Current Status
- We have data! Many datasets are available and many areas are covered.
- Open data has the potential to spur economic innovation, social transformation, and fresh forms of political and government accountability.
- But… the data is heterogeneous and immature.

Motivation
Economic open data: current status
- Great potential, but:
  - Data is diverse and incompatible, so processing and consuming it is difficult and tedious: combining, aligning and merging different datasets for a more complete picture; processing different time spans of the same dataset; filtering to concentrate on specific aspects of the data.
  - Standardization is lacking and a plethora of vocabularies is in use, which leads to less effective usage and consumption, low-quality results and conclusions, and data that is less suitable for machine consumption.
  - Visibility of existing data is quite limited: tools and public services providing access to open data are not massively popular, and they do not seem to answer anything beyond the simple questions that an indexed search engine can answer.
  - Citizens and users do not seem to be attracted by the existing solutions and tools; there is hardly any interaction or integration between open data and social media, due to the lack of social characteristics.
- We need tools and infrastructure to alleviate these problems.

YourDataStories Solution
A platform for data exploration, focused on the financial flows that are critical for transparency, collaboration, and participation.

Our Approach (1)
- The starting point is a triple store.
  - We assume that the economic data has already been crawled, cleaned, converted to RDF following a common ontology, and stored in a triple store.
- A SPARQL endpoint is analysed
  - through a set of predefined queries,
  - aiming to retrieve the underlying data model: classes, properties, types, cardinality.
  - The analysis is performed at the RDFS level.
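The slide does not show the predefined queries themselves. As a rough sketch of what RDFS-level model analysis could look like, the Python below keeps one illustrative SPARQL query as a string and then derives a class-to-property map from a handful of schema triples. The `ex:` names, `PROPERTIES_QUERY`, `SCHEMA_TRIPLES` and `extract_model` are all assumptions made for illustration; they are not taken from the YDS code base, and no real endpoint is contacted.

```python
RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"

# Illustrative only: the kind of query a "predefined query" set might contain.
PROPERTIES_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?property ?domain ?range WHERE {
  ?property rdfs:domain ?domain ;
            rdfs:range  ?range .
}
"""

# Hypothetical schema triples standing in for the answers of an endpoint.
SCHEMA_TRIPLES = [
    ("ex:Contract", RDF_TYPE, RDFS_CLASS),
    ("ex:Organisation", RDF_TYPE, RDFS_CLASS),
    ("ex:buyer", RDFS_DOMAIN, "ex:Contract"),
    ("ex:buyer", RDFS_RANGE, "ex:Organisation"),
    ("ex:amount", RDFS_DOMAIN, "ex:Contract"),
    ("ex:amount", RDFS_RANGE, "xsd:decimal"),
]


def extract_model(triples):
    """Return {class: {property: range}} built from rdfs:domain/rdfs:range."""
    classes = {s for s, p, o in triples if p == RDF_TYPE and o == RDFS_CLASS}
    domains, ranges = {}, {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains[s] = o
        elif p == RDFS_RANGE:
            ranges[s] = o
    # Every declared class gets an entry, even if no property points at it.
    model = {cls: {} for cls in classes}
    for prop, domain in domains.items():
        model.setdefault(domain, {})[prop] = ranges.get(prop)
    return model


MODEL = extract_model(SCHEMA_TRIPLES)
```

Here `MODEL["ex:Contract"]` maps `ex:buyer` to a class and `ex:amount` to a data type, which is the distinction the later slides rely on when they separate class nodes from property and data-type nodes.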

Our Approach (2)
- The result of the analysis is a set of graphs:
  - “top” nodes representing classes;
  - nodes representing properties and data types.
- Nodes have a scale, a role, and a cardinality.

Our Approach (3)
- The “top” concepts are identified heuristically, using information such as graph centrality.
- The information contained in the graphs is used to automatically support operations:
  - data selection;
  - data visualisation;
  - analytics.
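The slides name graph centrality as one input to the heuristic but do not say which measure is used. A minimal sketch, assuming plain degree (the number of property edges touching a class) as the centrality score, might look like this; `EDGES`, `degree` and `top_concepts` are hypothetical names, and the real heuristic may combine further signals.

```python
from collections import Counter

# Hypothetical class-level edges: (domain class, property, range class).
EDGES = [
    ("ex:Contract", "ex:buyer", "ex:Organisation"),
    ("ex:Contract", "ex:seller", "ex:Organisation"),
    ("ex:Contract", "ex:payment", "ex:Payment"),
    ("ex:Project", "ex:contract", "ex:Contract"),
]


def degree(edges):
    """Count incoming plus outgoing property edges per class."""
    counts = Counter()
    for source, _prop, target in edges:
        counts[source] += 1
        counts[target] += 1
    return counts


def top_concepts(edges, k=1):
    """Return the k classes with the highest degree, ties broken by name."""
    scores = degree(edges)
    return sorted(scores, key=lambda node: (-scores[node], node))[:k]


TOP = top_concepts(EDGES)
```

With these toy edges `ex:Contract` touches four edges and is ranked first, matching the intuition that a "top" concept is the class most other classes hang off.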

Data Selection (1)
- Graphs are used to extract an indexing schema. Currently Apache Solr is supported.
- Instances of top concepts are converted into JSON-LD objects and indexed. JSON-LD objects can contain instances of non-top concepts.
- The YDS “advanced search” application is configured for querying the indexed resources.
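To make the conversion step concrete, the sketch below wraps an instance of a top concept in a JSON-LD object that embeds an instance of a non-top concept, then flattens it into dotted field names of the sort a search index could store. The context URL, the field naming scheme and both helper functions are assumptions for illustration; the slides do not describe the actual YDS schema.

```python
import json


def to_jsonld(instance_id, rdf_type, properties,
              context_url="http://example.org/context.jsonld"):
    """Wrap an instance as a JSON-LD object (context URL is a placeholder)."""
    doc = {"@context": context_url, "@id": instance_id, "@type": rdf_type}
    doc.update(properties)
    return doc


def flatten_for_index(doc, prefix=""):
    """Flatten nested objects into dotted field names, dropping @context."""
    flat = {}
    for key, value in doc.items():
        if key == "@context":
            continue
        name = f"{prefix}{key.lstrip('@')}"
        if isinstance(value, dict):
            flat.update(flatten_for_index(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


# A top-concept instance (Contract) embedding a non-top one (Organisation).
contract = to_jsonld("ex:contract/1", "Contract", {
    "amount": 125000.0,
    "buyer": {"@id": "ex:org/7", "@type": "Organisation",
              "name": "Ministry of Example"},
})
INDEX_DOC = flatten_for_index(contract)
# A JSON array of documents is one common shape for a bulk index request.
PAYLOAD = json.dumps([INDEX_DOC])
```

Flattening is only one possible design; an index could equally store the nested object as a child document, which the slides neither confirm nor rule out.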

Data Selection (2)

Data Visualisation (1)
- Graphs are used to extract visualisation information: which properties can be used as x/y axes, and in which plot types.
- The YDS “Workbench” application is configured for generating custom plots of various types.
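One way to read "which properties can be used as x/y axes, in which plot types" is as a lookup from the scale attached to each graph node to the axes a plot type accepts. The sketch below assumes the classic nominal/ordinal/interval/ratio scales; the rule table `PLOT_RULES`, the property names and `axis_options` are hypothetical and only illustrate the idea.

```python
# Hypothetical rules: which measurement scales each axis of a plot accepts.
PLOT_RULES = {
    "bar": {"x": {"nominal", "ordinal"}, "y": {"interval", "ratio"}},
    "line": {"x": {"ordinal", "interval"}, "y": {"interval", "ratio"}},
    "scatter": {"x": {"interval", "ratio"}, "y": {"interval", "ratio"}},
}

# Hypothetical scale annotations, as they might be read from graph nodes.
PROPERTY_SCALES = {
    "ex:buyerName": "nominal",
    "ex:year": "interval",
    "ex:amount": "ratio",
}


def axis_options(plot_type, scales=PROPERTY_SCALES, rules=PLOT_RULES):
    """List, per axis, the properties whose scale the plot type allows."""
    return {
        axis: sorted(prop for prop, scale in scales.items() if scale in allowed)
        for axis, allowed in rules[plot_type].items()
    }


BAR_OPTIONS = axis_options("bar")
```

For a bar chart this offers `ex:buyerName` on the x axis and the two numeric properties on the y axis, which is the kind of choice a plot-building user interface could present.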

Data Visualisation (2)

Architecture
[Architecture diagram. Labels, in extraction order: Input Data Model Analysis, Search, SPARQL Cache (MySQL), JSON-LD API, JSON Cache (MySQL), Model Specification, Web Applications, Analytics, Developers, …, Search Configuration, SPARQL Views]
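The architecture slide labels its caches with MySQL but does not say how they are keyed or invalidated. As a rough sketch of the general idea, the snippet below caches query results under a hash of the query text. It uses Python's built-in `sqlite3` purely as a stand-in for MySQL, and `QueryCache` and `fake_endpoint` are hypothetical names; none of this is taken from the YDS implementation.

```python
import hashlib
import sqlite3


class QueryCache:
    """Cache query results by a hash of the query text (sqlite3 stand-in)."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable that actually runs the query
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE cache (key TEXT PRIMARY KEY, result TEXT)")

    def get(self, query):
        key = hashlib.sha256(query.encode("utf-8")).hexdigest()
        row = self._db.execute(
            "SELECT result FROM cache WHERE key = ?", (key,)).fetchone()
        if row is not None:
            return row[0]  # cache hit: the endpoint is not contacted
        result = self._fetch(query)
        self._db.execute(
            "INSERT INTO cache (key, result) VALUES (?, ?)", (key, result))
        return result


CALLS = []


def fake_endpoint(query):
    """Pretend endpoint that records each call and returns an empty result."""
    CALLS.append(query)
    return '{"results": []}'


cache = QueryCache(fake_endpoint)
first = cache.get("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
second = cache.get("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
```

The second lookup is served from the table, so `fake_endpoint` is called only once. A production cache would also need an expiry or invalidation policy, which this sketch leaves out.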

Components and Applications

Evaluation (1)
- The proposed approach has been used to create a set of applications on some preselected datasets.
- Several development cycles were evaluated
  - with more than 60 users,
  - primarily journalists, public sector employees, and representatives of NGOs and business.
- Two scenarios:
  - complete half-open scenarios;
  - explore the solution on their own.

Evaluation (2) Evaluation Results

Conclusions and Future Work
- Evaluation results suggest that:
  - non-experts can access and analyse data with minimum time and effort;
  - integrating data from different sources, along with the powerful navigation and visualisations, enables insights that cannot be gleaned from the original sources.
- Future work:
  - test our approach on more datasets, and on domains other than economic data;
  - assess the development of analytics for a new domain.

Thank you!
http://www.yourdatastories.eu
http://platform.yourdatastories.eu