Evaluating Health-Care Disparity Employing Linked Data and Data-driven Discovery. Amrapali Zaveri, AKSW, Institut für Informatik


Presentation transcript:

1 Evaluating Health-Care Disparity Employing Linked Data and Data-driven Discovery. Amrapali Zaveri, AKSW, Institut für Informatik

2 Outline
- Motivation
- Methodology
  o Datasets
  o CSV to RDF Conversion
  o Interlinking using SILK
  o Validation by Linked Data Querying
- Conclusions
- Limitations
- Future Work

3 Motivation
According to the World Health Organization (WHO), more than one billion people (i.e. one sixth of the world's population) suffer from one or more neglected tropical diseases. This shows a significant imbalance between the research intensity invested in investigating certain diseases and their prevalence.
Reason: the current absence of accurate, interlinked data and information.

4 Methodology

5 Datasets
DATASET                                 LINKED DATA VERSION    NUMBER OF TRIPLES
ClinicalTrials.gov                      LinkedCT               9.8 million
PubMed                                  Bio2RDF's PubMed       797 million
WHO's Global Health Observatory (GHO)   Not yet available      -

6 CSV to RDF Conversion
WHO's GHO dataset: published as Excel sheets.
Advantage:
- Readable by humans
Disadvantages:
- Cannot be queried efficiently
- Difficult to integrate with other data (in different formats)
Our approach:
- Converting the data into a single data model, RDF
- Using SCOVO (Statistical Core Vocabulary)*, designed particularly to represent multidimensional statistical data using RDF
* Michael Hausenblas et al.: SCOVO: Using Statistics on the Web of Data. In ESWC,

7 What is SCOVO?

8 Semi-automated approach
Transforming CSV to RDF in a fully automated way is not feasible: dimensions are often encoded in the heading or label of a sheet rather than in the data cells.
Our semi-automatic approach:
- Implemented as a plug-in for OntoWiki#, a semantic collaboration platform developed by the AKSW research group
- A CSV file is converted into RDF using SCOVO
# Sören Auer et al.: OntoWiki: A Tool for Social Semantic Collaboration. In: Proceedings of the Workshop on Social and Collaborative Construction of Structured Knowledge (CKC 2007) at the 16th International World Wide Web Conference (WWW2007), Banff, Canada, 2007
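The semi-automatic idea can be sketched in a few lines of Python. This is only an illustration, not the OntoWiki plug-in: the function names, the `ex:` namespace and the `ex:rN` item naming are hypothetical, and the SCOVO namespace is given as it is commonly published. The manual step is the caller stating which columns are dimensions and which column holds the value.

```python
import csv
import io

# Hypothetical namespace for the converted data (assumption, not from the slides).
EX = "http://example.org/gho/"
# SCOVO namespace as commonly published (assumption; verify before use).
SCV = "http://purl.org/NET/scovo#"


def slug(label):
    """Turn a cell label into a local name usable in a prefixed name."""
    return "".join(ch if ch.isalnum() else "_" for ch in label.strip())


def literal(text):
    """Quote a string as a Turtle literal, escaping backslashes and quotes."""
    return '"' + text.strip().replace("\\", "\\\\").replace('"', '\\"') + '"'


def csv_to_scovo(csv_text, dimension_columns, value_column):
    """Convert CSV text to Turtle. The caller names the dimension columns by hand;
    the value column is assumed to hold plain numbers."""
    lines = [
        f"@prefix ex: <{EX}> .",
        f"@prefix scv: <{SCV}> .",
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "@prefix dc: <http://purl.org/dc/elements/1.1/> .",
        "",
    ]
    # One dimension class per user-selected column.
    for col in dimension_columns:
        lines.append(f"ex:{slug(col)} rdfs:subClassOf scv:Dimension ; dc:title {literal(col)} .")
    seen = set()
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        dims = []
        for col in dimension_columns:
            name = slug(row[col])
            dims.append(name)
            # Declare each dimension value only once.
            if (col, name) not in seen:
                seen.add((col, name))
                lines.append(f"ex:{name} rdf:type ex:{slug(col)} ; dc:title {literal(row[col])} .")
        dim_part = " ; ".join(f"scv:dimension ex:{d}" for d in dims)
        # One scv:Item per data row, linked to all of its dimension values.
        lines.append(f"ex:r{i} rdf:type scv:Item ; rdf:value {row[value_column]} ; {dim_part} .")
    return "\n".join(lines)


sample = "Country,Disease,Incidence\nAfghanistan,Tuberculosis,127\n"
print(csv_to_scovo(sample, ["Country", "Disease"], "Incidence"))
```

A real sheet would also need the case the slide mentions, where a dimension lives in the sheet heading; in this sketch that would mean adding a constant dimension value to every row.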

9 SCOVOfied GHO Data
    prefix ex:
    prefix scv:

    ex:Country rdfs:subClassOf scv:Dimension;
        rdf:type rdfs:Class;
        dc:title "Country".
    ex:Disease rdfs:subClassOf scv:Dimension;
        rdf:type rdfs:Class;
        dc:title "Disease".
    ex:CountryCode rdfs:subClassOf scv:Dimension;
        rdf:type rdfs:Class;
        dc:title "CountryCode".

    ex:Afghanistan rdf:type ex:Country;
        dc:title "Afghanistan".
    ex:Tuberculosis rdf:type ex:Disease;
        dc:title "Tuberculosis".
    ex:3010 rdf:type ex:CountryCode;
        dc:title "3010".

    ex:c1-r6 rdf:type scv:Item;
        rdf:value 127;
        scv:dimension ex:Afghanistan;
        scv:dimension ex:Tuberculosis;
        scv:dimension ex:3010.
Result: 3 million triples

10 Interlinking Datasets using SILK

11 Interlinking Results
Number of interlinks obtained between datasets
Interlinks for:
- Publications: already present
- Disease: used SILK $
- Country: used SILK $
$ Julius Volz, Christian Bizer, Martin Gaedke, Georgi Kobilarov: Discovering and Maintaining Links on the Web of Data. International Semantic Web Conference (ISWC2009), Westfields, USA, October
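The idea behind label-based interlinking of diseases or countries can be illustrated with a toy Python sketch. This is not SILK, which is configured through its own declarative link specifications; the sketch only shows the underlying principle of comparing labels and keeping pairs above a similarity threshold. The function names, the example URIs and the 0.9 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalise(label):
    """Lower-case a label and collapse whitespace before comparison."""
    return " ".join(label.lower().split())


def similarity(a, b):
    """String similarity in [0, 1] between two normalised labels."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def discover_links(source, target, threshold=0.9):
    """For each source resource, keep its best-matching target resource
    if the label similarity reaches the threshold.

    source, target: dicts mapping a resource URI to its label.
    Returns a list of (source_uri, target_uri, score) tuples.
    """
    links = []
    for s_uri, s_label in source.items():
        best_uri, best_score = None, 0.0
        for t_uri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((s_uri, best_uri, best_score))
    return links


# Hypothetical resources from two datasets.
who = {"ex:who/Tuberculosis": "Tuberculosis", "ex:who/Malaria": "Malaria"}
trials = {"ex:ct/tb": "tuberculosis ", "ex:ct/hiv": "HIV Infections"}
for link in discover_links(who, trials):
    print(link)
```

A lower threshold finds more links but also more wrong ones, which is one way interlinking quality becomes a limitation of the whole pipeline.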

12 Validation by Linked Data Querying
    PREFIX who:
    PREFIX ct:
    PREFIX pubmed:

    SELECT DISTINCT ?disease ?incidence ?country
    WHERE {
        ?x who:country "India".
        ?x who:incidence ?incidence.
        ?x who:disease ?disease.
        FILTER(?incidence > 70)
    }

    SELECT DISTINCT ?disease ?country ?noOfTrials
    WHERE {
        ?disease who:disease "Tuberculosis".
        ?y ct:disease ?disease.
        ?y ct:noOfTrials ?noOfTrials.
        ?y ct:country ?country.
    }

    SELECT ?country (COUNT(?reference) AS ?references)
    WHERE {
        ?disease who:disease "Tuberculosis".
        ?z ct:disease ?disease.
        ?z ct:country ?country.
        ?z pubmed:reference ?reference.
    }
    GROUP BY ?country
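Queries like the first one above can be parameterised and sent to a SPARQL endpoint over HTTP, where the SPARQL protocol passes the query text in a `query` URL parameter. The sketch below only builds the query string and the request URL; the `who:` namespace, the endpoint address and the function names are hypothetical, since the slide's prefix URIs were not preserved.

```python
from urllib.parse import urlencode

# Hypothetical namespace (assumption; the original prefix URI is not known).
WHO_NS = "http://example.org/who/"


def incidence_query(country, min_incidence):
    """Build a query for diseases whose incidence in a country exceeds a threshold."""
    # Escape backslashes and quotes so the country name is a valid string literal.
    safe = country.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"PREFIX who: <{WHO_NS}>\n"
        "SELECT DISTINCT ?disease ?incidence WHERE {\n"
        f'  ?x who:country "{safe}" .\n'
        "  ?x who:incidence ?incidence .\n"
        "  ?x who:disease ?disease .\n"
        f"  FILTER(?incidence > {float(min_incidence)})\n"
        "}"
    )


def request_url(endpoint, query):
    """Build a GET request URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


query = incidence_query("India", 70)
print(query)
# The endpoint below is a placeholder, not a real service.
print(request_url("http://example.org/sparql", query))
```

Fetching the URL (for example with `urllib.request.urlopen`) and parsing the result format is left out, because it depends on the endpoint that hosts the data.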

13 Conclusions
- Which disease has the highest percentage of health-care disparity with respect to the burden of disease and the clinical trials conducted in a particular country?
- As a research policy maker, to which research area would it be most beneficial to allocate funds?
- Who are the key people doing the most research on a particular disease?
- What has been the trend, over time, in health-care disparity for a particular region?

14 Limitations
- Information Quality
- Coverage
- Interlinking Quality
- Propagation of Errors

15 Future Work
- Improve Interlinking
- Interlinking with other relevant datasets
- Updating knowledge-base as new data is published
- Creating a user interface

16 Acknowledgements Research group Agile Knowledge Engineering & Semantic Web (AKSW): Research on Research Group: