W3C Semantic Web for Health Care and Life Sciences Interest Group


1 W3C Semantic Web for Health Care and Life Sciences Interest Group

2 Background of the HCLS IG
Originally chartered in 2005. Chairs: Eric Neumann and Tonya Hongsermeier
Re-chartered in 2008. Chairs: Scott Marshall and Susie Stephens
Team contact: Eric Prud’hommeaux
Broad industry participation: over 100 members, mailing list of over 600

3 Mission of HCLS IG The mission of HCLS is to develop, advocate for, and support the use of Semantic Web technologies for Biological science Translational medicine Health care These domains stand to gain tremendous benefit by adoption of Semantic Web technologies, as they depend on the interoperability of information from many domains and processes for efficient decision support

4 Translating across domains
Translational medicine – use cases that cross domains Link across domains and research: What are the links? gene – transcription factor – protein pathway – molecular interaction – chemical compound drug – drug side effect – chemical compound

5 Challenges Support of legacy data(bases) Federated Query Interface (e.g. support for auto-completion, identifier lookup) Terminology and Ontology alignment Large scale reasoning (over large KB) Modeling hypothetical knowledge

6 Vision: Concept-based interfaces
The scientist should be able to work in terms of commonly used concepts. The scientist should be able to work in terms of personal concepts and hypotheses. - Not be forced to map concepts to the terms that have been chosen for a given application by the application builder.

7 Interface Sketch: Finding a basis for relation
Hypothesis Epigenetic Mechanisms “There is a relation” Transcription Chromatin Transcription Factors Histone Modification Transcription Factor Binding Sites Conceptual Biology Drag a line between two concepts: results are like a query “Can the “influence checker” find a path through the instances?” Classes Instances Common Domain position
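The "influence checker" idea on this slide, finding a path through instances that connects two concepts, can be sketched as a breadth-first search over a small set of triples. This is only an illustration: the instance names, the predicates, and the `find_path` helper are hypothetical and do not come from the presentation.

```python
from collections import deque

# Hypothetical instance-level triples (subject, predicate, object).
TRIPLES = [
    ("histone_mod_1", "occurs_at", "region_1"),
    ("binding_site_1", "located_in", "region_1"),
    ("tf_1", "binds", "binding_site_1"),
    ("tf_2", "binds", "binding_site_2"),
]


def find_path(triples, start, goal):
    """Return the shortest chain of triples linking start to goal, or None."""
    # Treat every triple as an undirected edge so links can be followed both ways.
    neighbours = {}
    for s, p, o in triples:
        neighbours.setdefault(s, []).append((o, (s, p, o)))
        neighbours.setdefault(o, []).append((s, (s, p, o)))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for nxt, triple in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [triple]))
    return None


# Is there a basis for a relation between a histone modification and a transcription factor?
path = find_path(TRIPLES, "histone_mod_1", "tf_1")
```

Dragging a line between two concepts would then correspond to calling `find_path` with one instance of each concept. A returned list is the chain of evidence, and `None` means no connection was found in the data.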

8 Biological cartoon as interface
Semantic approach to enable a systems approach, but start with two elements to integrate: Transcription factor binding sites and histones… KSinBIT’06 Source: Marco Roos

9 Group Activities Document use cases to aid individuals in understanding the business and technical benefits of using Semantic Web technologies Document guidelines to accelerate the adoption of the technology Implement a selection of the use cases as proof-of-concept demonstrations Develop high-level vocabularies Disseminate information about the group’s work at government, industry, and academic events

10 Current Task Forces
BioRDF – integrated neuroscience knowledge base: Kei Cheung (Yale University)
Clinical Observations Interoperability – patient recruitment in trials: Vipul Kashyap (Cigna Healthcare)
Linking Open Drug Data – aggregation of Web-based drug data: Chris Bizer (Free University Berlin)
Pharma Ontology – high level patient-centric ontology: Christi Denney (Eli Lilly)
Scientific Discourse – building communities through networking: Tim Clark (Harvard University)
Terminology – Semantic Web representation of existing resources: John Madden (Duke University)

11 BioRDF Task Force
Task Lead: Kei Cheung
Participants: M. Scott Marshall, Eric Prud’hommeaux, Susie Stephens, Andrew Su, Steven Larson, Huajun Chen, TN Bhat, Matthias Samwald, Erick Antezana, Rob Frost, Ward Blonde, Holger Stenzhorn, Don Doherty

12 BioRDF: Answering Questions
Goals: Get answers to questions posed to a body of collective knowledge in an effective way Knowledge used: Publicly available databases, and text mining Strategy: Integrate knowledge using careful modeling, exploiting Semantic Web standards and technologies

13 BioRDF: Looking for Targets for Alzheimer’s
Signal transduction pathways are considered to be rich in “druggable” targets CA1 Pyramidal Neurons are known to be particularly damaged in Alzheimer’s disease Casting a wide net, can we find candidate genes known to be involved in signal transduction and active in Pyramidal Neurons? Source: Alan Ruttenberg
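The question on this slide amounts to a join of two conditions: genes annotated with a signal transduction process and genes active in pyramidal neurons. A minimal sketch of that join follows. The triples, gene names, and predicate names are placeholders invented for illustration and are not data from the HCLS knowledge base.

```python
# Toy triples; gene names, predicates, and annotations are placeholders.
TRIPLES = {
    ("gene_a", "participates_in", "signal_transduction"),
    ("gene_b", "participates_in", "signal_transduction"),
    ("gene_c", "participates_in", "dna_repair"),
    ("gene_a", "expressed_in", "ca1_pyramidal_neuron"),
    ("gene_c", "expressed_in", "ca1_pyramidal_neuron"),
}


def subjects(triples, predicate, obj):
    """Return all subjects that have the given predicate/object pair."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def candidate_genes(triples):
    """Genes that are both in signal transduction and active in CA1 pyramidal neurons."""
    in_pathway = subjects(triples, "participates_in", "signal_transduction")
    in_neuron = subjects(triples, "expressed_in", "ca1_pyramidal_neuron")
    # The set intersection plays the role of a shared ?gene variable in a query.
    return sorted(in_pathway & in_neuron)


candidates = candidate_genes(TRIPLES)
```

In a real knowledge base the two conditions would come from different sources (for example process annotations and expression data), which is why the integration step matters before any such join can run.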

14 BioRDF: Integrating Heterogeneous Data
PDSPki, Gene Ontology, Reactome, NeuronDB, BAMS, Allen Brain Atlas, Antibodies, BrainPharm, Entrez Gene, MeSH, Literature, PubChem, Mammalian Phenotype, SWAN, AlzGene, Homologene
Source: Susie Stephens

15 BioRDF: SPARQL Query
Source: Alan Ruttenberg

16 BioRDF: Results: Genes, Processes
DRD1, adenylate cyclase activation
ADRB2, 154 adenylate cyclase activation
ADRB2, 154 arrestin mediated desensitization of G-protein coupled receptor protein signaling pathway
DRD1IP, dopamine receptor signaling pathway
DRD1, dopamine receptor, adenylate cyclase activating pathway
DRD2, dopamine receptor, adenylate cyclase inhibiting pathway
GRM7, G-protein coupled receptor protein signaling pathway
GNG3, G-protein coupled receptor protein signaling pathway
GNG12, G-protein coupled receptor protein signaling pathway
DRD2, G-protein coupled receptor protein signaling pathway
ADRB2, 154 G-protein coupled receptor protein signaling pathway
CALM3, 808 G-protein coupled receptor protein signaling pathway
HTR2A, G-protein coupled receptor protein signaling pathway
DRD1, G-protein signaling, coupled to cyclic nucleotide second messenger
SSTR5, G-protein signaling, coupled to cyclic nucleotide second messenger
MTNR1A, G-protein signaling, coupled to cyclic nucleotide second messenger
CNR2, G-protein signaling, coupled to cyclic nucleotide second messenger
HTR6, G-protein signaling, coupled to cyclic nucleotide second messenger
GRIK2, glutamate signaling pathway
GRIN1, glutamate signaling pathway
GRIN2A, glutamate signaling pathway
GRIN2B, glutamate signaling pathway
ADAM10, 102 integrin-mediated signaling pathway
GRM7, negative regulation of adenylate cyclase activity
LRP1, negative regulation of Wnt receptor signaling pathway
ADAM10, 102 Notch receptor processing
ASCL1, 429 Notch signaling pathway
HTR2A, serotonin receptor signaling pathway
ADRB2, 154 transmembrane receptor protein tyrosine kinase activation (dimerization)
PTPRG, transmembrane receptor protein tyrosine kinase signaling pathway
EPHA4, transmembrane receptor protein tyrosine kinase signaling pathway
NRTN, transmembrane receptor protein tyrosine kinase signaling pathway
CTNND1, Wnt receptor signaling pathway
Many of the genes are related to AD through gamma secretase (presenilin) activity
Source: Alan Ruttenberg

17 Linking Open Drug Data
HCLSIG task started October 1st, 2008
Primary Objectives:
Survey publicly available data sets about drugs
Explore interesting questions from pharma, physicians and patients that could be answered with Linked Data
Publish and interlink these data sets on the Web
Participants: Bosse Andersson, Chris Bizer, Kei Cheung, Don Doherty, Oktie Hassanzadeh, Anja Jentzsch, Scott Marshall, Eric Prud’hommeaux, Matthias Samwald, Susie Stephens, Jun Zhao

18 The Classic Web
Single information space
Built on URIs: globally unique IDs, retrieval mechanism
Built on Hyperlinks: are the glue that holds everything together
Diagram: Web Browsers and Search Engines on top of HTML documents A, B, C connected by hyperlinks
Source: Chris Bizer

19 Linked Data
Use Semantic Web technologies to publish structured data on the Web and set links between data in one data source and data in another data source
Diagram: Linked Data Browsers, Linked Data Mashups and Search Engines on top of Things A, B, C, D, E connected by typed links
Source: Chris Bizer

20 Data Objects Identified with HTTP URIs
rdf:type pd:cygri foaf:Person foaf:name Richard Cyganiak foaf:based_near dbpedia:Berlin pd:cygri = dbpedia:Berlin = Forms an RDF link between two data sources Source: Chris Bizer
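The graph on this slide can be written out as three triples, and the one that crosses from one data source into another is the RDF link. The sketch below assumes a placeholder namespace for the `pd:` prefix, because the prefix expansions did not survive in this transcript. The RDF, FOAF, and DBpedia namespaces are the standard ones. The `rdf_links` helper is illustrative, not part of any library.

```python
from urllib.parse import urlsplit

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
DBPEDIA = "http://dbpedia.org/resource/"
PD = "http://example.org/people/"  # placeholder: the slide's real pd: namespace is not shown

TRIPLES = [
    (PD + "cygri", RDF + "type", FOAF + "Person"),
    (PD + "cygri", FOAF + "name", "Richard Cyganiak"),  # literal, not a URI
    (PD + "cygri", FOAF + "based_near", DBPEDIA + "Berlin"),
]


def is_http_uri(value):
    """True for values that look like HTTP(S) URIs rather than literals."""
    return value.startswith(("http://", "https://"))


def rdf_links(triples):
    """Triples whose subject and object are HTTP URIs on different hosts."""
    links = []
    for s, p, o in triples:
        if p == RDF + "type":
            # Simplification: rdf:type points at a vocabulary term, not at data.
            continue
        if is_http_uri(s) and is_http_uri(o) and urlsplit(s).netloc != urlsplit(o).netloc:
            links.append((s, p, o))
    return links


links = rdf_links(TRIPLES)
```

Comparing host names is a rough stand-in for "different data source". It is enough to show why the `foaf:based_near` triple is the one that connects the two sources.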

21 Dereferencing URIs over the Web
rdf:type pd:cygri foaf:Person foaf:name dp:Cities_in_Germany dp:population skos:subject Richard Cyganiak foaf:based_near dbpedia:Berlin Source: Chris Bizer

22 Dereferencing URIs over the Web
rdf:type pd:cygri foaf:Person foaf:name dp:Cities_in_Germany dp:population skos:subject Richard Cyganiak foaf:based_near dbpedia:Berlin skos:subject dbpedia:Hamburg skos:subject dbpedia:Meunchen Source: Chris Bizer
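The two dereferencing slides show a client starting from one URI, retrieving its description, and then retrieving the descriptions of the URIs it finds there. A minimal sketch of that "follow the links" loop is given below. It uses an in-memory dictionary as a stand-in for HTTP retrieval and keeps the slide's prefixed names instead of full URIs. The `WEB` contents and the `follow_links` helper are assumptions made for illustration.

```python
from collections import deque

# Stand-in for the Web: maps a URI to the triples a server might return for it.
# Names are abbreviated with the slide's prefixes; the contents are illustrative.
WEB = {
    "pd:cygri": [
        ("pd:cygri", "rdf:type", "foaf:Person"),
        ("pd:cygri", "foaf:based_near", "dbpedia:Berlin"),
    ],
    "dbpedia:Berlin": [
        ("dbpedia:Berlin", "skos:subject", "dp:Cities_in_Germany"),
    ],
    "dp:Cities_in_Germany": [
        ("dbpedia:Hamburg", "skos:subject", "dp:Cities_in_Germany"),
    ],
}


def dereference(uri):
    """Pretend HTTP GET: return the triples published at this URI, if any."""
    return WEB.get(uri, [])


def follow_links(start, max_hops):
    """Collect triples reachable from start by dereferencing up to max_hops links."""
    collected = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        uri, hops = queue.popleft()
        for triple in dereference(uri):
            if triple not in collected:
                collected.append(triple)
            s, _p, o = triple
            # Any URI mentioned in the description is a candidate for the next hop.
            for nxt in (s, o):
                if nxt not in seen and hops < max_hops:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
    return collected


graph = follow_links("pd:cygri", max_hops=2)
```

With `max_hops=0` only the starting description is retrieved. Each additional hop pulls in data from the next source, which is how the second slide arrives at other cities in the same category.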

23 LODD Data Sets Source: Anja Jentzsch

24 LODD in Marbles Source: Anja Jentzsch

25 The Linked Data Cloud Source: Chris Bizer

26 Pharma Ontology

27 Participants

28 Pharma Ontology Deliverables
Review existing ontology landscape Identify scope of a pharma ontology through understanding employee roles Identify roughly 30 entities and relationships for template ontology Create 2-3 sketches of use cases (that cover multiple roles) Select and build out use case (including references to data sets) Build relevant component of ontology for the use case Build an application that utilizes the ontology

29 Existing Resources

30 Roles within Translational Medicine

31 Scientific Discourse Task Force
Task Leads: Tim Clark, John Breslin
Participants: Uldis Bojars, Paolo Ciccarese, Sudeshna Das, Ronan Fox, Tudor Groza, Christoph Lange, Matthias Samwald, Elizabeth Wu, Holger Stenzhorn, Marco Ocana, Kei Cheung, Alexandre Passant

32 Scientific Discourse: Overview
Source: Tim Clark

33 Scientific Discourse: Goals
Provide a Semantic Web platform for scientific discourse in biomedicine Linked to key concepts, entities and knowledge Specified by ontologies Integrated with existing software tools Useful to Web communities of working scientists Source: Tim Clark

34 Scientific Discourse: Some Parameters
Discourse categories: research questions, scientific assertions or claims, hypotheses, comments and discussion, and evidence Biomedical categories: genes, proteins, antibodies, animal models, laboratory protocols, biological processes, reagents, disease classifications, user-generated tags, and bibliographic references Driving biological project: cross-application of discoveries, methods and reagents in stem cell, Alzheimer and Parkinson disease research Informatics use cases: interoperability of web-based research communities with (a) each other (b) key biomedical ontologies (c) algorithms for bibliographic annotation and text mining (d) key resources Source: Tim Clark

35 Scientific Discourse: SWAN+SIOC
SIOC
Represents activities and contributions of online communities
Integration with blogging, wiki and CMS software
Use of existing ontologies, e.g. FOAF, SKOS, DC
SWAN
Represents scientific discourse (hypotheses, claims, evidence, concepts, entities, citations)
Used to create the SWAN Alzheimer knowledge base
Active beta participation of 144 Alzheimer researchers
Ongoing integration into SCF Drupal toolkit
Source: Tim Clark

36 COI Task Force
Task Lead: Vipul Kashyap
Participants: Eric Prud’hommeaux, Helen Chen, Jyotishman Pathak, Rachel Richesson, Holger Stenzhorn

37 COI: Bridging Bench to Bedside
How can existing Electronic Health Records (EHR) formats be reused for patient recruitment? Quasi standard formats for clinical data: HL7/RIM/DCM – healthcare delivery systems CDISC/SDTM – clinical trial systems How can we map across these formats? Can we ask questions in one format when the data is represented in another format? Source: Holger Stenzhorn

38 COI: Use Case
Pharmaceutical companies pay a lot to test drugs
Pharmaceutical companies express protocol in CDISC
-- precipitous gap --
Hospitals exchange information in HL7/RIM
Hospitals have relational databases
Source: Eric Prud’hommeaux

39 Inclusion Criteria
Type 2 diabetes on diet and exercise therapy or monotherapy with metformin, insulin secretagogue, or alpha-glucosidase inhibitors, or a low-dose combination of these at 50% maximal dose. Dosing is stable for 8 weeks prior to randomization.
?patient takes metformin .
Source: Holger Stenzhorn

40 Exclusion Criteria
Use of warfarin (Coumadin), clopidogrel (Plavix) or other anticoagulants.
?patient doesNotTake anticoagulant .
Source: Holger Stenzhorn

41 Criteria in SPARQL
?medication1 sdtm:subject ?patient ;
    spl:activeIngredient ?ingredient1 .
?ingredient1 spl:classCode #metformin
OPTIONAL {
    ?medication2 sdtm:subject ?patient ;
        spl:activeIngredient ?ingredient2 .
    ?ingredient2 spl:classCode #anticoagulant
}
FILTER (!BOUND(?medication2))
Source: Holger Stenzhorn
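The OPTIONAL plus FILTER (!BOUND(...)) pattern in this query expresses negation: a patient is kept only if no anticoagulant medication can be matched for them. The same logic can be sketched in a few lines of Python. The patient records and helper names below are hypothetical and stand in for data that would really come from SDTM and SPL sources.

```python
# Hypothetical medication records: (patient, medication, ingredient class).
MEDICATIONS = [
    ("patient_1", "med_a", "metformin"),
    ("patient_2", "med_b", "metformin"),
    ("patient_2", "med_c", "anticoagulant"),
    ("patient_3", "med_d", "anticoagulant"),
]


def patients_taking(records, ingredient_class):
    """Patients with at least one medication of the given ingredient class."""
    return {patient for patient, _med, cls in records if cls == ingredient_class}


def eligible_patients(records):
    """Patients with a metformin medication and no anticoagulant medication."""
    included = patients_taking(records, "metformin")
    # Mirrors OPTIONAL { ... } FILTER (!BOUND(?medication2)): keep a patient
    # only when no anticoagulant medication could be matched for them.
    excluded = patients_taking(records, "anticoagulant")
    return sorted(included - excluded)


eligible = eligible_patients(MEDICATIONS)
```

Here `patient_2` meets the inclusion criterion but is removed by the exclusion criterion, which is the case the OPTIONAL/FILTER combination is there to handle.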

42 Terminology Task Force
Task Lead: John Madden Participants: Chimezie Ogbuji, M. Scott Marshall, Helen Chen, Holger Stenzhorn, Mary Kennedy, Xiashu Wang, Rob Frost, Jonathan Borden, Guoqian Jiang

43 Features: the “bridge” to meaning
Concepts Features Data Ontology Keyword Vectors Literature Ontology Image Features Image(s) The link to semantics is left to researchers Ontology Gene Expression Profile Microarray Ontology Detected Features Sensor Array

44 Terminology: Overview
Goal is to identify use cases and methods for extracting Semantic Web representations from existing, standard medical record terminologies, e.g. UMLS Methods should be reproducible and, to the extent possible, not lossy Identify and document issues along the way related to identification schemes, expressiveness of the relevant languages Initial effort will start with SNOMED-CT and UMLS Semantic Networks and focus on a particular sub-domain (e.g. pharmacological classification)

45 SKOS & the 80/20 principle: map “down”
Minimal assumptions about expressiveness of source terminology No assumed formal semantics (no model theory) Treat it as a knowledge “map” Extract 80% of the utility without risk of falsifying intent Why did we choose to go the SKOS route? Here are the reasons. We decided that the safest and easiest route, with the least risk, was to “map down” to SKOS rather than to “map up” to RDFS or OWL. Source: John Madden
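"Mapping down" to SKOS means turning each term into a concept with a label and plain hierarchical links, without claiming class semantics. A minimal sketch of that conversion follows. The SKOS and RDF namespaces are the standard ones, while the base namespace, the codes, and the labels are placeholders and are not real SNOMED CT or UMLS content.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/terminology/"  # placeholder namespace

# Hypothetical source terminology rows: (code, preferred label, parent code or None).
ROWS = [
    ("C1", "Pharmacological agent", None),
    ("C2", "Anticoagulant", "C1"),
]


def to_skos(rows, base=BASE):
    """Convert terminology rows into SKOS triples (subject, predicate, object)."""
    triples = []
    for code, label, parent in rows:
        concept = base + code
        triples.append((concept, RDF_TYPE, SKOS + "Concept"))
        triples.append((concept, SKOS + "prefLabel", label))
        if parent is not None:
            # skos:broader is a plain hierarchical link; unlike rdfs:subClassOf
            # it does not assert class subsumption.
            triples.append((concept, SKOS + "broader", base + parent))
    return triples


skos_triples = to_skos(ROWS)
```

Because nothing stronger than `skos:broader` is asserted, the output cannot contradict the intent of the source terminology, which is the low-risk property the slide argues for.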

46 The AIDA toolbox for knowledge extraction and knowledge management in a Virtual Laboratory for e-Science
E-science presentation, Manchester, September 25, 2006
Each set of components in the AIDA Toolbox is developed by a different expert in information retrieval, machine learning, and the Semantic Web. The motivating idea is that applications can be built by expert users such as bioinformaticians and food informaticians. Multidisciplinary collaboration is another important aspect of e-science.

47 SNOMED CT/SKOS under AIDA: retrieve
This is what it looks like, running semantically-guided free text search using SCT/SKOS. We can demo this.

48

49

50 Task Force Resources to federate
BioRDF – knowledge base, aTags (stored in KB) Clinical Observations Interoperability – drug ontology Linking Open Drug Data – LOD data Pharma Ontology – ontology Scientific Discourse – SWAN ontology, SWAN SKOS, myexperiment ontology Terminology – SNOMED-CT, MeSH, UMLS

51 We’ve come a long way
Triplestores have gone from millions to billions
Linked Open Data cloud
On demand Knowledge Bases: Amazon’s EC2
Terminologies: SNOMED-CT, MeSH, UMLS, ..
Neurocommons, Flyweb, Biogateway, Bio2RDF, Linked Life Data, ..

52 Accomplishments
Technical
HCLS KB hosted at 2 institutes
Linked Open Data contributions
Demonstrator of querying across heterogeneous EHR systems
Integration of SWAN and SIOC ontologies for Scientific Discourse
Outreach
Conference Presentations and Workshops: Bio-IT World, WWW, ISMB, AMIA, C-SHALS, etc.
Publications:
Proceedings of LOD Workshop at WWW 2009: Enabling Tailored Therapeutics with Linked Data
Proceedings of the ICBO: Pharma Ontology: Creating a Patient-Centric Ontology for Translational Medicine
AMIA Spring Symposium: Clinical Observations Interoperability: A Semantic Web Approach
BMC Bioinformatics: A Journey to Semantic Web Query Federation in Life Sciences
Briefings in Bioinformatics: Life sciences on the Semantic Web: The Neurocommons and Beyond

53 Someday, we should be able to find this as evidence for a fact in a Knowledge Base

54 Getting Involved
Benefits to getting involved include:
Early access to use cases and best practice
Influence standard recommendations
Cost effective exploration of new technology through collaboration
Network with others working on the Semantic Web
Get involved
Speak to any of us after the session! chairs and team contact
Participate in the next F2F (last one was here):

