Download presentation
Presentation is loading. Please wait.
Published byAugusta Wheeler Modified over 6 years ago
1
BioRDF Overview and Update By Kei Cheung, Ph. D
BioRDF Overview and Update By Kei Cheung, Ph.D. Yale Center for Medical Informatics C-SHALS 2009, Boston, Massachusetts, February 25, 2009
2
BioRDF Objectives Participants Enhance the HCLS KB
Increase the value and use of HCLS KB by identifying scientific use case Work on human-friendly user interface Document and publish findings to help accelerate/promote adoption of the Semantic Web Participants Universities, pharmaceutical companies, start-up companies, government institutes, W3C, etc
3
BioRDF Activities/Tasks
Invited Talks UMLS, NCBO, NIF, Biogateway, WikiNeuron, Gene Wiki, 3D Web Visualization, BioSIOC/aTag, VoID HCLS KB Two instances of HCLS KB have been created DERI (Virtuoso) Neurocommons Free University in Berlin (Allegro Graph) SenseLab and TCMGeneDIT Neuroscience use case add receptors to the picture aTags Matthias Samwald, Kei Cheung Query Federation Kei Cheung, Rob Frost, Kingsley Idehen, Scott Marshall, Adrian Paschke , Eric Prud’hommeaux, Matthias Samwald, Jun Zhao
4
Brain: Neuron and Synapse
Courtesy of NIDA
5
aTags
6
aTags Very simple, generic way of expressing biomedical statements
A short snippet of text + a list of ontology terms used for describing the text Using established vocabulary (SIOC, OBO ontologies) Encoded in RDFa (easy to embed in existing HTML-based systems)
7
aTags Transmitter T seems to activate receptor R
Receptor R is expressed in brain region B Region B has strong axonal projections into brain region B2
8
aTags aTags will be created by
conversion of existing biomedical datasets manual curation of data (highlight text snippet in browser & click on del.icio.us – like bookmarklet) Design philosophy: simplicity and practicality Use existing resources Play along with existing systems (HTML content management, RDFa-enabled search engines)
9
Query Federation
10
A Journey to Query Federation: from SPARQL Endpoint to Linked Data
Application demo Receptor explorer Mismatch between Wikipedia and DBpedia Comparison of Triplestores Linked Data description and deployment voiD FeDeRate URI
11
Receptor Explorer VectorC Semantic Service Bus Receptor App
Genes involved in receptor Publications about gene Clinical trials referencing publications RDF Bio2RDF linkedct.org Clinicaltrials.gov PubMed Entrez Gene HCLS KB DERI DBpedia map Receptors SenseLab VectorC Semantic Service Bus ESB/SOA Clinicaltrials.gov PubMed Wikipedia web sites Copyright 2008 VectorC, LLC 11
15
A Semantic Mismatch between Wikipedia and DBpedia
16
Triplestore Comparison
Features Virtuoso Allegro Graph Class Hierarchy Inference Yes Linked Data Deployment Built-in support 3rd party software (e.g., Pubby) Query Federation Linked Data Spaces (SPARQL against resource URI’s) Built-in support (Sesame and Oracle only). For other triplestores, a 3rd party middleware approach is required.
17
Federated Query (FeDeRate)
Mediation Local query 1 Local query 2 Local query n DBPedia (RDF) IUPHAR (SQL)
18
Federation Scenario PREFIX db: < PREFIX re: < PREFIX dp: < SELECT ?abstract ?code ?ligand ?hum_seq_id ?chr ?refseq FROM NAMED db:IUPHAR.prop FROM NAMED db:DBPedia.rdf WHERE { # Get info from the (SQL) IUPHAR receptor tables. GRAPH db:IUPHAR.prop { ?r re:Code ?code . ?r re:Ligand ?ligand . ?r re:Human_nucleotide ?hum_seq_id } # Get info from (RDF) DBPedia. GRAPH db:DBPedia.rdf { ?p dp:chromosome ?chr . ?p dp:refseq ?refseq . ?p dp:symbol ?symbol . ?p db:abstract ?abstract } }
19
Example Join between IUAPHAR & DBPedia (GABAB receptor)
IUPHAR DBPedia
20
voiD: vocabulary of interlinked Datasets
Motivation Effective Dataset Selection Efficient Discovery of Datasets, by search engines or data publishers SPARQL query optimisation and query federation Two high-level concepts Dataset: a dataset is published and maintained by a single provider and accessible on the Web through de-referenceable URIs or a SPARQL endpoint Linkset: a subset of a void:Dataset; store triples to express the interlinking relationship between dataset voiD Vocabulary, voiD User's Guide,
21
Biological Dataset in voiD Format
:senselabontology a void:Dataset ; dcterms:title "SenseLab Neuron Ontology" ; dcterms:description "Neuroscience ontology derived from the SenseLab NeuronDB database."; dcterms:license <> ; # TODO foaf:homepage < ; void:exampleResource < ; void:exampleResource < ; void:exampleResource < ; dcterms:creator :senselab ; ## this organization can be further defined dcterms:source < ; dcterms:subject < ; dcterms:subject < ; dcterms:subject < ; dcterms:subject < ; dcterms:source <doi: /bib/bbm018> ; void:feature :owl ; ## this technical feature can be further defined void:sparqlEndpoint < ; void:vocabulary < .
22
voiD Deployment Deploy a voiD file (in either Turtle, RDF/XML or RDFa format) onto the Web server Make it accessible to search engines, such as Sindice ( Publish a Semantic Sitemap file (sitemap.xml) on the server “ allows Data publishers to state where documents containing RDF data are located, and to advertise alternative means to access it ” [1] Use the datasetURI property in the sitemap.xml to point to the voiD description of a dataset, e.g., [1]
23
URI Issues Proliferation of synonymous URI’s
Potential problems Performance Maintenance Possible solutions Involvement of nomenclature committee (e.g., IUPHAR) and domain authority (e.g., Neuroscience Information Framework or NIF) Persistent/permanent URI scheme (e.g., PURL) E.g., Dereferenceable URI’s A dereferenceable URI is a resource identification mechanism that uses the HTTP protocol to obtain a representation of the resource it identifies For Linked Data, the representation takes the form of an information resource that describes the resource that the URI identifies.
24
Future Directions Submit a paper describing the query federation work to a journal or conference Continue and extend current tasks: Query Federation and aTag Add new tasks e.g., semantic wiki, workflow, user interface, … Expand the HCLS KB (both instances) e.g., new datasets such as UMLS Collaborate with other task forces e.g., LODD (natural alternative use case, Faviki) and SWAN/SIOC
25
The End
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.