Linked TCM and Drug Datasets

Presentation transcript:

Background

- Traditional Chinese Medicine (TCM), a type of alternative medicine, is receiving growing attention from patients and biomedical researchers in the Western world.
- Despite this growing attention, TCM has not been included as part of standard care in many Western countries, mainly because of a lack of scientific evidence for its efficacy and safety.
- In addition, much of the documentation about TCM is not available in English, creating a language barrier for patients, scientists, and physicians in the West.
- We reformatted the TCMGeneDIT database in RDF (as Linked Open Data), making it programmatically accessible through a flexible query language (SPARQL) and a flexible Web service (SPARQL endpoint).
- This work represents a collaboration between the BioRDF task force and the LODD (Linking Open Drug Data) task force of the Semantic Web for Health Care and Life Sciences Interest Group chartered by the World Wide Web Consortium (W3C).
- We demonstrate how Linked Data can be used to connect TCM and Western medicine.
- We describe a novel approach to creating links between RDF datasets at large scale.
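As a rough illustration of what programmatic access through a SPARQL endpoint looks like, the sketch below builds a query and encodes it as an HTTP GET request in the style of the SPARQL protocol. The endpoint URL, the tcm: namespace, and the tcm:association property are placeholders invented for this example; the transcript does not preserve the real ones.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: a placeholder, not the real RDF-TCM service address.
ENDPOINT = "http://example.org/rdf-tcm/sparql"


def build_query(herb_label: str, limit: int = 10) -> str:
    """Return a SPARQL SELECT asking for genes associated with a herb.

    The tcm: namespace and tcm:association property are assumptions.
    """
    # Escape backslashes and double quotes so the label is a valid literal.
    safe = herb_label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "PREFIX tcm: <http://example.org/rdf-tcm/vocab#>\n"
        "SELECT ?gene WHERE {\n"
        f'  ?herb rdfs:label "{safe}" ;\n'
        "        tcm:association ?gene .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query in the 'query' parameter of an HTTP GET URL."""
    return endpoint + "?" + urlencode({"query": query})


url = build_request_url(ENDPOINT, build_query("Ginkgo biloba"))
print(url)
```

Any HTTP client can then fetch the resulting URL; the response format depends on what the endpoint supports.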
Creation of Data Interlinks

Silk: discovers RDF links between data sources [1]
- Provides a declarative language for specifying link types and conditions
- Implemented similarity metrics include string, numeric, date, URI, and set comparison methods, as well as a taxonomic matcher that calculates the semantic distance between two concepts within a concept hierarchy
- Each metric evaluates to a similarity value between 0 and 1
- Metrics can be grouped by aggregation operators and weighted individually, with higher-weighted metrics having a greater influence on the aggregated result

Customized SPARQL queries for mapping gene names
- First, search a SPARQL endpoint for matching Entrez genes, using exact gene-name matches as filters
- Manually correct many-to-one gene mappings using the Entrez and TCM database web pages

Future work
- Incorporate additional data sources, e.g., herbal and/or TCM-related sources as well as genomic, clinical, and drug data sources
- Explore multilingual interlinking
- Develop new use cases and user-facing applications
- Provide automatic notification of interlink updates between datasets

Application Use Cases

For patients
- Search for clinical trials of a given herb (ClinicalTrials.gov)
- Find side-effect information about a given herb

For researchers
- Confirm target genes
  - Find target genes of a herb for a given disease, as reported by alternative medicine researchers
  - Find diseases associated with these target genes, as reported by Western medical researchers
- Drug discovery
  - Search for the chemical compounds of the herb ingredients
  - Search for target proteins of these compounds
  - Identify interesting proteins from this network of proteins

Figure legend: Alzheimer’s herbs with side effects; Alzheimer’s herbs; drugs with no side effects reported; drugs with reported side effects.
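The idea of combining individually weighted similarity metrics, as described for Silk under Creation of Data Interlinks, can be sketched in a few lines. This is not Silk's link specification language or its API: the two metric functions and the weighted-average operator below are simplified stand-ins chosen for illustration, and the example weights are arbitrary.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def numeric_similarity(a: float, b: float, max_distance: float) -> float:
    """Similarity in [0, 1] that falls linearly to 0 at max_distance."""
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    return max(0.0, 1.0 - abs(a - b) / max_distance)


def weighted_average(scores_and_weights):
    """Aggregate (score, weight) pairs; higher weights have more influence."""
    total_weight = sum(weight for _, weight in scores_and_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(score * weight for score, weight in scores_and_weights)
    return weighted_sum / total_weight


# Compare two hypothetical disease records by label and by a numeric field.
score = weighted_average([
    (string_similarity("Retinal detachment", "Retinal_Detachment"), 2.0),
    (numeric_similarity(10, 12, max_distance=4), 1.0),
])
print(round(score, 3))
```

A link would typically be accepted only when the aggregated score exceeds a chosen threshold; the threshold value is another tuning decision not covered here.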
- All 10 herbs may produce side effects
- 65% of ingredients have no reported side effects

aTags

- A simple convention for formulating statements on the Semantic Web.
- These statements are linked with the large cloud of linked data on the Web.
- aTags were created by manual curation of scientific literature, using a simple, browser-based curation system called 'aTag Generator'.

An example of an aTag in Turtle syntax:

    a sioc:Item ;
        sioc:content "Ginkgolide B from G. biloba is a platelet-activating factor (PAF) antagonist" ;
        sioc:topic , , ,
        rdfs:seeAlso .

Figure: The interlinked data cloud of RDF-TCM and LODD datasets.

Table 1 summarizes the number of triples of key entities in each dataset. Table 2 summarizes the number of links to RDF-TCM for different types of entities, and the percentage of each type of RDF-TCM entity that is linked to another dataset.

Representation of Data Interlinks

    rdf:type void:Linkset ;
        void:target ;
        void:linkPredicate owl:sameAs .

    oddlinker:linkage_date " "^^xsd:date ;
        oddlinker:linkage_method :silk ;
        rdf:type oddlinker:linkage_run .

    oddlinker:link_source dbpedia:Retinal_detachment ;
        oddlinker:link_target tcm:Retinal_Detachment ;
        oddlinker:linkage_score 1 ;
        oddlinker:link_type owl:sameAs ;
        oddlinker:linkage_run ;
        dcterms:isPartOf ;
        rdf:type oddlinker:interlink .
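To make the shape of a single interlink description more concrete, the following sketch serializes one link with the oddlinker properties that appear in the example above. The subject, run, and linkset identifiers (:link1, :run1, :linkset1) are hypothetical, since the transcript does not preserve the original URIs, and prefix declarations are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Interlink:
    subject: str   # hypothetical identifier of the link resource
    source: str    # prefixed name of the linked source entity
    target: str    # prefixed name of the linked target entity
    score: float   # linkage score assigned by the linking method
    run: str       # hypothetical identifier of the linkage run
    part_of: str   # hypothetical identifier of the enclosing linkset
    link_type: str = "owl:sameAs"

    def to_turtle(self) -> str:
        """Serialize the link as Turtle statements (prefixes not declared)."""
        lines = [
            f"{self.subject} rdf:type oddlinker:interlink ;",
            f"    oddlinker:link_source {self.source} ;",
            f"    oddlinker:link_target {self.target} ;",
            f"    oddlinker:linkage_score {self.score:g} ;",
            f"    oddlinker:link_type {self.link_type} ;",
            f"    oddlinker:linkage_run {self.run} ;",
            f"    dcterms:isPartOf {self.part_of} .",
        ]
        return "\n".join(lines)


link = Interlink(
    subject=":link1",
    source="dbpedia:Retinal_detachment",
    target="tcm:Retinal_Detachment",
    score=1,
    run=":run1",
    part_of=":linkset1",
)
ttl = link.to_turtle()
print(ttl)
```

Keeping the score and the linkage run on every link means a consumer can later filter links by confidence or by the run that produced them.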
- For the set of links created between any two datasets:
  - void:Linkset [2]
  - oddlinker:linkage_run [3]
- For each link:
  - oddlinker:interlink [3]

Ingredient      # of side effects
Progesterone    100
Testosterone    100
Adenosine       57
Mannitol        40
Folic_acid      22
Lactulose       11
Acetic_Acid     4

Table 1.

Entity          Data Source   Count
Gene            RDF-TCM       945
                Diseasome     3919
                Drugbank      4553
Medicine/Drug   RDF-TCM       848
                Drugbank      4772
                Dailymed      4308
                SIDER         924
Ingredient      RDF-TCM       1064
                Dailymed      1240
Disease         RDF-TCM       553
                Diseasome     4213
Effect          RDF-TCM       241
SideEffect      SIDER         1738
ClinicalTrial   LinkedCT      61,920

Table 2.

Entity          Data Source   Count   %
Disease         DBPedia
                SIDER
                Diseasome
Medicine        DBPedia
                Drugbank      10.12
Gene            EntrezGene
                DBPedia
                Drugbank
                Diseasome
Ingredient      Dailymed

References

[1] Julius Volz, Christian Bizer, Martin Gaedke, and Georgi Kobilarov. Silk – A Link Discovery Framework for the Web of Data. LDOW’09, Madrid, 2009.
[2] Keith Alexander, Richard Cyganiak, Michael Hausenblas, and Jun Zhao. voiD - Vocabulary of Interlinked Datasets.
[3] Oktie Hassanzadeh and Mariano Consens. Linked Movie Data Base. LDOW’09, Madrid, 2009.

Linked Data for Connecting Traditional Chinese Medicine and Western Medicine

Jun Zhao 1, Anja Jentzsch 2, Matthias Samwald 3 and Kei-Hoi Cheung 4

1 Department of Zoology, University of Oxford, Oxford, UK
2 Web-based Systems Group, Freie Universität Berlin, Berlin, Germany
3 Digital Enterprise Research Institute, National University of Ireland Galway, Galway, Ireland; Konrad Lorenz Institute for Evolution and Cognition Research, Altenberg, Austria
4 Center for Medical Informatics, Yale University School of Medicine, New Haven, Connecticut, USA