The Linked Data Cloud Source: Chris Bizer. Linking Open Drug Data Susie Stephens, Principal Research Scientist, Eli Lilly.

Presentation transcript:

Linking Open Drug Data Susie Stephens, Principal Research Scientist, Eli Lilly

The Linked Data Cloud Source: Chris Bizer

Linking Open Drug Data
HCLSIG task started October 1, 2008
Primary Objectives
- Survey publicly available data sets about drugs
- Publish and interlink these data sets on the Web
- Explore interesting questions in competitive intelligence that could be answered if the data sets are linked
Participants: Bosse Andersson, Chris Bizer, Kei Cheung, Don Doherty, Oktie Hassanzadeh, Anja Jentzsch, Scott Marshall, Eric Prud’hommeaux, Matthias Samwald, Susie Stephens, Jun Zhao

Assessment of Data Sources Mark Sharp et al. A Framework for Characterizing Drug Information Sources. AMIA 2008

Published Data Sets
- LinkedCT (http://linkedct.org)
  - Online registry of more than 60,000 clinical trials
  - Published in XML
  - 7,011,000 triples (290,000 interlinking)
- DrugBank (http://www4.wiwiss.fu-berlin.de/drugbank)
  - A repository of almost 5,000 FDA-approved drugs
  - Published as DrugBank DrugCards
  - 1,153,000 triples (23,000 interlinking)
- DailyMed (http://www4.wiwiss.fu-berlin.de/dailymed/)
  - High quality information about marketed drugs
  - Flat file representation
  - 124,000 triples (29,600 interlinking)
- Diseasome (http://www4.wiwiss.fu-berlin.de/diseasome)
  - Information about 4,300 disorders and disease genes linked by known disorder-gene associations
  - 88,000 triples (23,000 interlinking)
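
The triple counts on this slide can be compared more easily as proportions. The following is a minimal sketch that uses only the figures quoted above; the dictionary layout and the function name `interlink_share` are illustrative assumptions, not part of the LODD tooling.

```python
# Figures copied from the "Published Data Sets" slide:
# data set name -> (total triples, interlinking triples)
DATASETS = {
    "LinkedCT": (7_011_000, 290_000),
    "DrugBank": (1_153_000, 23_000),
    "DailyMed": (124_000, 29_600),
    "Diseasome": (88_000, 23_000),
}


def interlink_share(total: int, interlinking: int) -> float:
    """Return the fraction of a data set's triples that link to other data sets."""
    return interlinking / total


# Totals across all four data sets.
total_triples = sum(total for total, _ in DATASETS.values())
total_links = sum(links for _, links in DATASETS.values())

if __name__ == "__main__":
    for name, (total, links) in DATASETS.items():
        print(f"{name:10s} {interlink_share(total, links):6.1%} interlinking")
    print(f"{'All':10s} {interlink_share(total_triples, total_links):6.1%} interlinking")
```

Read this way, the smaller data sets (DailyMed, Diseasome) carry a much higher share of interlinking triples than LinkedCT, even though LinkedCT contributes the most links in absolute terms.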

Classes of Links
- Based on common identifiers
  - Links present in the source data sets
- Based on link discovery and record linkage techniques
  - String matching, e.g. “Alzheimer’s disease” in LinkedCT was matched with “Alzheimer_disease” in Diseasome
  - Semantic matching, e.g. “Varenicline” has the synonym “Varenicline Tartrate” and the brand names “Champix” and “Chantix”
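
The two record linkage techniques named on this slide can be sketched as follows. This is an illustration under stated assumptions, not the matching logic the LODD task actually used: the normalization rules, the `SYNONYMS` table, and the function names are all hypothetical, and only the example labels come from the slide.

```python
import re

# Hypothetical synonym table built from the slide's Varenicline example.
SYNONYMS = {
    "Varenicline": {"Varenicline Tartrate", "Champix", "Chantix"},
}


def normalize(label: str) -> str:
    """Lowercase a label, drop possessive 's, and turn punctuation into spaces."""
    text = label.lower().replace("\u2019", "'")  # curly apostrophe -> straight
    text = re.sub(r"'s\b", "", text)             # "alzheimer's" -> "alzheimer"
    text = re.sub(r"[^a-z0-9]+", " ", text)      # underscores, hyphens -> space
    return text.strip()


def string_match(a: str, b: str) -> bool:
    """String matching: two labels match if their normalized forms are equal."""
    return normalize(a) == normalize(b)


def semantic_match(a: str, b: str, synonyms=SYNONYMS) -> bool:
    """Semantic matching: labels match if they belong to the same synonym group."""
    na, nb = normalize(a), normalize(b)
    if na == nb:
        return True
    for name, alternatives in synonyms.items():
        group = {normalize(name)} | {normalize(alt) for alt in alternatives}
        if na in group and nb in group:
            return True
    return False


if __name__ == "__main__":
    print(string_match("Alzheimer\u2019s disease", "Alzheimer_disease"))
    print(semantic_match("Chantix", "Varenicline Tartrate"))
```

Exact matching on normalized labels is deliberately strict. Real link discovery frameworks typically add fuzzy similarity measures and thresholds, which this sketch leaves out.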

Business Use Case
- A neuroscience-focused business manager is interested in seeing an update on new clinical trials by competitors on Alzheimer’s Disease (AD)
- A phase III trial by Pfizer for a drug called Varenicline has just been listed in LinkedCT
- More information of interest is found in DBpedia, DailyMed, and DrugBank
  - DailyMed indicates the drug is already on the market for nicotine addiction and has minimal side effects
  - DrugBank allows the manager to see the targets for Varenicline
- Diseasome, however, indicates that the corresponding genes are only implicated in nicotine addiction, rather than AD
- This suggests a more complex relationship between the diseases than just the drug target
- Extending the browsing to the SWAN Knowledgebase shows that there are hypotheses relating AD to nicotine receptors through amyloid beta
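
The browsing path in this use case can be sketched as a walk over linked triples. Everything below is a toy model: the identifiers, predicate names, and the `target:placeholder` node are hypothetical stand-ins (the slide does not name the targets or genes), and a real deployment would query the published RDF data sets rather than an in-memory list.

```python
from collections import defaultdict, deque

# Hypothetical (subject, predicate, object) triples paraphrasing the slide.
TRIPLES = [
    ("linkedct:trial/example", "linkedct:condition", "Alzheimer's disease"),
    ("linkedct:trial/example", "linkedct:intervention", "drug:Varenicline"),
    ("drug:Varenicline", "dailymed:indication", "nicotine addiction"),
    ("drug:Varenicline", "drugbank:target", "target:placeholder"),
    ("target:placeholder", "diseasome:implicated_in", "nicotine addiction"),
]


def build_index(triples):
    """Group outgoing (predicate, object) pairs by subject."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def browse(start, index):
    """Breadth-first walk from `start`, returning every triple encountered."""
    seen = {start}
    queue = deque([start])
    facts = []
    while queue:
        node = queue.popleft()
        for predicate, obj in index.get(node, []):
            facts.append((node, predicate, obj))
            if obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return facts


facts = browse("linkedct:trial/example", build_index(TRIPLES))

# Conditions the trial studies vs. conditions the drug's targets are implicated in.
trial_conditions = {o for _, p, o in facts if p == "linkedct:condition"}
target_conditions = {o for _, p, o in facts if p == "diseasome:implicated_in"}

if __name__ == "__main__":
    print("Trial studies:", trial_conditions)
    print("Targets implicated in:", target_conditions)
    print("Unexplained by targets:", trial_conditions - target_conditions)
```

The final set difference mirrors the manager's observation: the trial condition (AD) is not among the conditions reached through the drug's targets, which is what prompts the further look at the SWAN Knowledgebase.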

Technical Challenges
- Life sciences data is difficult to connect due to inconsistent terminology and the prevalence of synonyms and homonyms
- Refinement of tools and techniques for enabling more automatic linking of entities across data sets
- Selection of ontologies to enable consistent mappings
- Development of a platform sufficiently robust to enable inferencing
- Provision of a user interface that supports browsing, querying, and filtering data
- Persuading data providers to publish in RDF, which would alleviate the need for us to update data and would provide some of the interlinking

Next Steps
- Ensure that existing data are accurately and comprehensively linked
- Incorporate additional data sources into the LODD cloud that are of interest to competitive intelligence (e.g. Traditional Chinese Medicine)
- Use novel link discovery tools and frameworks including Silk and LinQuer
- Explore using SIOC to aggregate information about what patients are saying about drugs
- Submit paper to the iTriplify Challenge

Task Alignment
- LODD is looking to use Pharma Ontology’s work to help inform the mappings
- Data converted to RDF is also loaded into BioRDF’s HCLS KB

Conclusions
- Added 4 drug-related data sets into the cloud for competitive intelligence
- Will add further data sources to the LODD cloud to enable more insights to be gleaned
- Will continue to explore and test tools that are being developed for LOD