Tim Clark Harvard Medical School & Massachusetts General Hospital RPI Tetherless World Constellation May 3, 2011 Copyright 2010 Massachusetts General Hospital. All rights reserved.
Biomedical web data integration challenges Requirements to cure complex disorders Catch-22 for semantic data in medicine Web 3.0 and semantic metadata Injecting semantics into the existing ecosystem Integrating ontologies, documents & data Annotation Ontology & Annotation Framework Hypothesis management (vs. KM)
Alzheimer Disease Huntington’s Disease Nicotine Addiction Schizophrenia Bipolar Disorder Alcohol addiction Autism Parkinson’s Disease ALS Neuropathic Pain Major Depressive Disorder
Yearly mortality (U.S.) = 642,00 people Yearly costs (U.S.) =$676 B / 4.7% GDP Prevalence = 5.3 M + 76 M M = 95.7 M people
create hypothesis design experiment run experiment collect data interpret data share interpretations synthesize knowledge
MCI progressors non progressors PET imaging of PIB (radiolabelled compound binds amyloid beta A4 protein) MRI imaging of brain structure showing loss of hippocampal volume Brain Nov;133(Pt 11): = 218 subjects +
Alzheimer Disease Parkinson’s Disease Schizophrenia Autism Bipolar Disorder Drug Addiction Huntington’s Disease ALS Depression
dopaminergic pathway α-synuclein, β-amyloid α-synuclein, Tau chr 16p11.2 CNV CRF, glutamatergic system, dopamine, amygdala … Alzheimer Disease Parkinson’s Disease Schizophrenia Autism Bipolar Disorder Drug Addiction Huntington’s Disease ALS Depression SIRT2
1. We want to organize all the known facts in neurobiology so we can mash them up. 2. There are no “facts” in neurobiology, except uninteresting ones. 3. All we have are assertions supported by evidence of varying quality.
Printing Press Web
We scientists do not attend professional meetings to present our findings ex cathedra, but in order to argue. John Polanyi, FRS, Nobel Laureate University of Manchester
Social Web (Web 2.0, read/write) Shared annotation with controlled terminology systems (Sem Web) +
Information sharing within communities or tasks via Social Web (Web 2.0), wikis and forums Information “permeability” across pharma R&D projects / domains / pipeline stages via shared metadata (semantic annotation) Web 3.0 improves cross-domain Signal to Noise, institutional memory & data “findability”
Genes Proteins Biological Processes Chemical Compounds Antibodies Cells Brain anatomy …
Annotation Ontology (AO) is a domain-independent Web ontology. Links document fragments to ontology terms. Metadata separate from annotated documents. SWAN AF manages document annotation. Interfaces to text-mining services & supports curation. Collaborating with NCBO, UCSD, Elsevier, USC, Manchester, EMBL, Colorado, EBI, etc…
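A minimal sketch of the stand-off idea described above: the annotation lives outside the document and points at a text fragment plus an ontology term. The class names, field names, and example.org URIs are hypothetical illustrations, not the actual AO vocabulary or the SWAN AF implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextSelector:
    """Identifies a fragment by its exact text plus surrounding context."""
    exact: str
    prefix: str = ""
    suffix: str = ""

    def locate(self, text: str) -> Optional[int]:
        """Return the start offset of `exact` where it appears between
        `prefix` and `suffix`, or None if that sequence is not found."""
        position = text.find(self.prefix + self.exact + self.suffix)
        return None if position < 0 else position + len(self.prefix)


@dataclass(frozen=True)
class Annotation:
    """Stand-off metadata: stored separately from the annotated document."""
    document: str           # URL of the annotated document (hypothetical)
    selector: TextSelector  # which fragment is annotated
    topic: str              # ontology term URI (hypothetical)
    creator: str            # who or what produced the annotation


document_text = "Cleavage of APP by BACE1 releases amyloid beta peptides."

annotation = Annotation(
    document="http://example.org/papers/1",
    selector=TextSelector(exact="BACE1", prefix="APP by ", suffix=" releases"),
    topic="http://example.org/ontology/protein/BACE1",
    creator="http://example.org/people/curator-1",
)

# The document itself is never modified; the fragment is resolved on demand.
start = annotation.selector.locate(document_text)
fragment = document_text[start:start + len(annotation.selector.exact)]
```

Anchoring on the exact text with a prefix and suffix, rather than on character offsets alone, is one way to keep an annotation resolvable when the same term occurs several times in a document.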
Text Shared metadata
2) Automatic annotation Dr. Paolo Ciccarese – Oct 8, 2010
Semantics on documents (SESL) Vocabulary standards & terminology development Document & data management Collaboratories & web communities Hypothesis management (SWAN) Nanopublications (OpenPHACTS)
Model the thinking behind your research Database it, web-ify it, RDF-ize it, share it Link the Models / Hypotheses to Claims / Interpretations Evidence (publications, experiments, data) Supporting and contradictory claims from others Evidence for these other claims Web 3.0: share, compare and discuss Manage knowledge while creating it Can be public, private, or semi-private
Cognitive Deficits (S) Relate to (P) BACE1 (O) provenance context With thanks to Barend Mons and Paul Groth… Mons / Groth model of a nanopublication
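One way to picture the model on this slide is as a single subject-predicate-object assertion held in its own named graph, with provenance and context statements made about that graph. The sketch below uses plain Python tuples rather than an RDF library; the `ex:` identifiers, the property names, and the `build_nanopublication` helper are all hypothetical placeholders, not terms from a published nanopublication schema.

```python
from typing import List, NamedTuple, Tuple


class Quad(NamedTuple):
    """A triple plus the name of the graph that contains it."""
    subject: str
    predicate: str
    obj: str
    graph: str


def build_nanopublication(
    np_id: str,
    assertion: Tuple[str, str, str],
    provenance: List[Tuple[str, str]],
    context: List[Tuple[str, str]],
) -> List[Quad]:
    """Wrap one assertion with provenance and context statements.

    Provenance and context are expressed as statements about the
    assertion graph itself, each kept in a separate named graph.
    """
    assertion_graph = f"{np_id}#assertion"
    quads = [Quad(*assertion, graph=assertion_graph)]
    for predicate, value in provenance:
        quads.append(Quad(assertion_graph, predicate, value, f"{np_id}#provenance"))
    for predicate, value in context:
        quads.append(Quad(assertion_graph, predicate, value, f"{np_id}#context"))
    return quads


# All identifiers below are illustrative assumptions.
nanopub = build_nanopublication(
    "ex:np1",
    assertion=("ex:CognitiveDeficits", "ex:relateTo", "ex:BACE1"),
    provenance=[("ex:assertedBy", "ex:researcher-1")],
    context=[("ex:observedIn", "ex:alzheimer-disease-model")],
)

graphs = sorted({quad.graph for quad in nanopub})
```

Keeping the assertion in its own graph means the provenance and context can refer to it by name, so the same claim can be cited, compared, or disputed as a unit.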
swande:Claim Intramembranous Aβ behaves as chaperones of other membrane proteins rdf:type dct:title G1 pav:authoredBy Vincent Marchesi foaf:name foaf:Person rdf:type pav: foaf: G2
swande:Claim Intramembranous Aβ behaves as chaperones of other membrane proteins rdf:type dct:title G1 pav:authoredBy G2 pav:curatedBy G4 Gwen Wong foaf:name foaf:Person rdf:type
swande:Claim Intramembranous Aβ behaves as chaperones of other membrane proteins rdf:type dct:title G1 pav:contributedBy swanrel:referencesAsSupportiveEvidence G5 G6
G8 rdf:type Event of type GO "chaperone binding" rdfs:label rdf:type rdfs:label “Beta amyloid” rdfs:label “Membrane protein” rdfs:label “Plasma membrane” With many thanks to Nigam Shah, Stanford University
Hyque triples G8 pav:contributedBy Nigam Shah foaf:name foaf:Person rdf:type G9
swande:Claim Intramembranous Aβ behaves as chaperones of other membrane proteins rdf:type dct:title G1 Hyque triples G8 swanrel:derivedFrom
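The preceding diagrams can be read as a set of named graphs (G1, G8, …) with attribution attached to each graph and a `swanrel:derivedFrom` link from the claim graph to the Hyque triples. The sketch below is one interpretation of those flattened diagrams, assuming the `pav:` properties are stated about the graphs themselves; the `ex:` node identifiers and the `contributors` helper are hypothetical.

```python
# Graph names and property names follow the slides; ex: identifiers are
# hypothetical stand-ins for nodes the diagrams do not name.
CLAIM = "ex:claim-1"

triples_by_graph = {
    "G1": [
        (CLAIM, "rdf:type", "swande:Claim"),
        (CLAIM, "dct:title",
         "Intramembranous Aβ behaves as chaperones of other membrane proteins"),
    ],
    "G8": [
        ("ex:event-1", "rdf:type", "ex:chaperone-binding-event"),
    ],
}

# Statements made about whole graphs rather than about nodes inside them.
graph_metadata = [
    ("G1", "pav:authoredBy", "Vincent Marchesi"),
    ("G1", "pav:curatedBy", "Gwen Wong"),
    ("G1", "swanrel:derivedFrom", "G8"),
    ("G8", "pav:contributedBy", "Nigam Shah"),
]

PEOPLE_PROPERTIES = {"pav:authoredBy", "pav:curatedBy", "pav:contributedBy"}


def contributors(graph, seen=None):
    """Collect everyone credited on `graph` and on the graphs it derives from."""
    seen = set() if seen is None else seen
    if graph in seen:  # guard against circular derivation links
        return set()
    seen.add(graph)
    people = set()
    for subject, predicate, value in graph_metadata:
        if subject != graph:
            continue
        if predicate in PEOPLE_PROPERTIES:
            people.add(value)
        elif predicate == "swanrel:derivedFrom":
            people |= contributors(value, seen)
    return people


claim_contributors = contributors("G1")
```

Following the derivation link lets a reader of the claim see credit for the underlying formal triples as well as for the claim text and its curation.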
Target / pathway hypotheses will be linked to: Pathway & target relation to disease, Target selection criteria, Validation assays and criteria, Experiment (assay) provenance, Experimental data and computations, Scientist remarks, findings and discussion. Start as a relatively simple model and extend
Hypotheses of therapeutic action for compounds and scaffolds will be linked to Hypothesis / results for individual assays, Experiment (assay) provenance, Experimental data, Group annotation, Internal databases etc. Start as a relatively simple model and extend
Information ecosystem
Research reproducibility Linking data to documents at time of publications Citation of reagents, instruments, code, protocols Bibliographies and citation networks Bibliographic records and citations are metadata Personal annotations Selective sharing and virtual communities Database annotation Biomedical ontology database curation projects
What is NASA ADS? Web database comprising over 8 million astronomy and physics papers Full-text for over 880K articles, including all major astronomy journals NASA ADS semantic annotation requirements Astronomical objects by catalog ID Specific telescope, type of telescope, wavelength Investigators Grant funding sources
Curing complex medical disorders goes hand in hand with next-gen biomedical communications Web 3.0 provides the technology framework Semantic annotation, hypothesis management, nanopubs: tools for next-gen biomed comms. Requires / enables international collaborations of biomedical researchers and informaticians. Open enterprise model with semantic metadata.
People Paolo Ciccarese (Harvard) Maryann Martone (UCSD) Anita DeWaard & Tony Scerri (Elsevier) Karen Verspoor & Larry Hunter (Colorado) Adam West & Ernst Dow (Eli Lilly) Carole Goble (Manchester) Nigam Shah (Stanford / NCBO) Paul Groth (VU Amsterdam) Funding: Elsevier, NIH, Eli Lilly, & EMD Serono
Whereas King Ptolemy, living forever, the Manifest God whose excellence is fine, son of King Ptolemy and Queen Arsinoe, the Father-loving Gods, is wont to do many favours for the temples of Egypt and for all those who are subject to his kingship, he being a god… English translation by R.S. Simpson