NEDA ALIPANAH, MARIA ADELA GRANDO DBMI 11/19/2012
Outline of the talk
Use case scenario: Informed Consent
Examples of Querying and Reasoning on Ontologies: Semantic Web Rule Language (SWRL), SPARQL
Criteria to choose a reasoning language
Conclusions
Use Case: Informed Consent
A communication process between a patient and an investigator that ultimately results in the patient’s agreement to participate in a research study.
The process includes documents that the patient must sign to acknowledge the discussion of the reason for the research study, risks, benefits, alternatives, options to withdraw, etc.
It constitutes a critical decision-making process: it is crucial that patients are engaged in the decision to participate in the study and understand the consequences.
Research Involving Specimens
Because of developments in genetics and genomics, biological specimens (biospecimens) and clinical data for research are currently in high demand, and they can be very limited in availability.
This concern has boosted the creation of clinical data warehouses and biobanks.
A subject’s willingness to share biospecimens and clinical data with biobanks and warehouses is expressed through informed consent.
Informed Consent Management Vision
Diagram components: Clinical Data Warehouse, BioSample Repository, Permission Repository, Resource Mediator, User U, Research Institution, Healthcare Institution, Patient I
Query: User U requests data D and sample S to perform operation O on subjects like I under constraints C. Example: "Can I access blood samples from patients with breast cancer for cancer research?"
Permission: I authorize U to perform operation O over my data D or sample S under certain constraints C. Example: "I consent to share my blood samples for future cancer research."
Results: User U receives data D and sample S in compliance with subject’s permission.
Permission Ontology A has or to perform an over or under constraints
Permission Ontology- Data Repository
Resource Mediator for Cancer Center Biorepository
Diagram components: Clinical Data Warehouse, MCC BioSample Repository, Permission Repository, Resource Mediator Interface, Reasoning Engine, User U, Research Institution, Healthcare Institution
Query: User U requests data D and sample S to perform operation O on subjects like I under constraints C. Example: "I am a Stanford researcher, can I access laboratory tests and frozen blood samples from patients with ovarian cancer?"
Results: User U receives data D and sample S in compliance with subject’s permission.
Resource Mediator Interface
Query: User U requests data D and sample S to perform operation O on subjects like I under constraints C.
Example: "I am a Stanford researcher, can I access laboratory tests and frozen blood samples from patients with ovarian cancer?"
Resource Mediator Reasoning Engine
Request: "I am a Stanford researcher, can I access laboratory tests and frozen blood samples from patients with ovarian cancer?"
Permission: I authorize U to perform operation O over my data D or sample S under certain constraints C.
1. Look for the patients with an ovarian cancer diagnosis who have signed IC documents authorizing the sharing of their blood samples and laboratory tests.
2. Check that the requested resources can be shared in compliance with the patient’s signed consent.
Resource Mediator Reasoning Engine
Request: "I am a Stanford researcher, can I access laboratory tests and frozen blood samples from patients with ovarian cancer?"
Check that the MCC biorepository and clinical data warehouse have the requested clinical record entries available, and the biosamples available in the requested preservation state.
Where in the Layered Architecture?
Diagram layers: URI/UNICODE; XML/XML Schemas; RDF/OWL Ontologies; Rules/Query (SPARQL, SWRL); Resource Mediator, Other Services
SWRL Resource Mediator
Diagram components: Clinical Data Warehouse, MCC BioSample Repository, Permission Repository, Resource Mediator Interface, SWRL Reasoning Engine, User U, Research Institution, Healthcare Institution
Query: User U requests data D and sample S to perform operation O on subjects like I under constraints C. Example: "I am a Stanford researcher, can I access laboratory tests and frozen blood samples from patients with ovarian cancer?"
Results: User U receives data D and sample S in compliance with subject’s permission.
SWRL: Inference in OWL
Semantic Web Rule Language (SWRL)
Intended for rule-based reasoning in the Semantic Web
Based on OWL: all rules are expressed in terms of OWL concepts (classes, properties, individuals, ...)
Provides intuitive, easy-to-read specifications
Supported by Protégé
Requires a rule engine to execute the rules, for instance Jess
SWRL: Inference in OWL
SWRL rules have the form antecedent -> consequent, where:
both antecedent and consequent are conjunctions of atoms, written a1 ∧ ... ∧ an;
variables are indicated using the standard convention of prefixing them with a question mark (e.g., ?x).
Example: Person(?p) ^ hasSibling(?p,?s) ^ Man(?s) -> hasBrother(?p,?s)
How would you define hasUncle(?x,?y)?
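To make the rule form concrete, here is a minimal Python sketch that applies the hasBrother rule by hand to a small fact set. It is an illustration only, not how Protégé or Jess evaluates SWRL. The individuals (alice, bob, carol, dave) are hypothetical, and the hasUncle rule shown is just one possible answer to the question above, assuming the ontology also has a hasParent property.

```python
# Facts are (predicate, arg, ...) tuples. All individuals are hypothetical.
facts = {
    ("Person", "alice"),
    ("Man", "bob"),
    ("hasSibling", "alice", "bob"),
    ("hasSibling", "alice", "carol"),
    ("hasParent", "dave", "alice"),  # hasParent is an assumed property
}


def infer_has_brother(facts):
    """Person(?p) ^ hasSibling(?p,?s) ^ Man(?s) -> hasBrother(?p,?s)"""
    return {
        ("hasBrother", f[1], f[2])
        for f in facts
        if f[0] == "hasSibling"
        and ("Person", f[1]) in facts
        and ("Man", f[2]) in facts
    }


def infer_has_uncle(facts):
    """Assumed rule: hasParent(?x,?p) ^ hasBrother(?p,?y) -> hasUncle(?x,?y)"""
    return {
        ("hasUncle", a[1], b[2])
        for a in facts if a[0] == "hasParent"
        for b in facts if b[0] == "hasBrother" and b[1] == a[2]
    }


# The second rule consumes the facts produced by the first one.
brothers = infer_has_brother(facts)
uncles = infer_has_uncle(facts | brothers)
print(sorted(brothers))
print(sorted(uncles))
```

Each rule adds consequent facts only for variable bindings that satisfy every atom of the antecedent, which is why carol (not asserted to be a Man) produces no hasBrother fact.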
SWRL Query Example
Which informed consents grant me access to de-identified blood samples? (asked by User U, Research Institution)
SWRL Query Example
hasPolicy(?InfConsent, ?ppolicy) ^ Permission(?ppolicy) ^
canPerformOperation(?ppolicy, ?operation) ^ Share(?operation) ^
BloodSample(?sample) ^ operatesOn(?operation, ?sample) ^
isDeintifiedData(?sample, 1) -> sqwrl:selectDistinct(?InfConsent)
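The query above is a conjunction of atoms joined on shared variables. The Python sketch below evaluates the same conjunction over a tiny in-memory fact base, to show which bindings survive each atom. The class and property names follow the query; every individual (consentA, perm1, share1, blood1, and so on) is hypothetical, and this is not how a SQWRL engine is implemented.

```python
# Hypothetical individuals; class and property names follow the query above.
classes = {
    "Permission": {"perm1", "perm2"},
    "Share": {"share1", "share2"},
    "BloodSample": {"blood1", "blood2"},
}
has_policy = {("consentA", "perm1"), ("consentB", "perm2")}
can_perform_operation = {("perm1", "share1"), ("perm2", "share2")}
operates_on = {("share1", "blood1"), ("share2", "blood2")}
is_deidentified = {"blood1": 1, "blood2": 0}  # isDeintifiedData(?sample, value)


def consents_for_deidentified_blood():
    """Return the distinct ?InfConsent bindings that satisfy every atom."""
    result = set()
    for consent, policy in has_policy:
        if policy not in classes["Permission"]:
            continue
        for p, operation in can_perform_operation:
            if p != policy or operation not in classes["Share"]:
                continue
            for op, sample in operates_on:
                if (op == operation
                        and sample in classes["BloodSample"]
                        and is_deidentified.get(sample) == 1):
                    result.add(consent)
    return result


selected = consents_for_deidentified_blood()
print(sorted(selected))
```

Here consentB is dropped because its only sample, blood2, is not flagged as de-identified.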
SPARQL: Inference on RDF/RDFS
SPARQL queries Linked Data (the RDF graph) directly
Uses triple patterns that are matched against the graph
SPARQL syntax:
SELECT command
CONSTRUCT command (for reasoning)
OPTIONAL command
SPARQL Query Ontology Triple Pattern
Simple SPARQL Query
Single triple matching (uses the SELECT structure of SQL):
SELECT ?a WHERE { Subject Predicate ?a }
Multiple triple matching:
SELECT ?a ?b WHERE { Subject1 Predicate1 ?a . Subject2 Predicate2 ?b }
SPARQL Query SPARQL Triple Pattern Different parts of triple patterns can be variables. SELECT DISTINCT ?a WHERE { ?a rdf:type. } LIMIT 32
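As a rough illustration of how triple patterns bind variables, the following Python sketch matches patterns against a small set of triples. Any position of a pattern may be a variable (a string starting with "?"). The graph contents and the :hasDiagnosis property are assumptions made for the example, and a real SPARQL engine does considerably more than this.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was already bound to.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def select(variables, patterns, graph):
    """Evaluate the patterns in order and project the requested variables."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for b in bindings:
            for triple in graph:
                extended = match(pattern, triple, b)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return sorted({tuple(b[v] for v in variables) for b in bindings})


# Hypothetical data; :hasDiagnosis is an assumed property name.
graph = {
    (":Patient6", "rdf:type", ":Patient"),
    (":Patient7", "rdf:type", ":Patient"),
    (":Patient6", ":hasDiagnosis", ":OvarianCancer"),
}

patients = select(["?a"], [("?a", "rdf:type", ":Patient")], graph)
diagnosed = select(
    ["?a", "?d"],
    [("?a", "rdf:type", ":Patient"), ("?a", ":hasDiagnosis", "?d")],
    graph,
)
print(patients)
print(diagnosed)
```

Reusing ?a in both patterns of the second query is what joins them: only resources that satisfy both patterns remain.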
SPARQL for Inferencing
What is inferencing (reasoning)?
An inference is the creation of a fact from existing facts. In RDF, this means adding a triple.
:Patient6 rdf:type :Patient
:Patient rdfs:subClassOf :Person
Examples of inferencing in SPARQL:
Subclass (subsumption) inference
Transitive inference
Subclass Inferencing
Subclass inference (subsumption): :Patient6 rdf:type :Person
I.e., all members of the subclass are also members of the superclass.
The SPARQL CONSTRUCT command produces the new triples: it returns a graph (set of triples) that is the result of applying the CONSTRUCT graph pattern to each match in the WHERE clause.
CONSTRUCT { ?rsc rdf:type :Person } WHERE { ?rsc rdf:type :Patient . :Patient rdfs:subClassOf :Person . }
Subclass Inferencing
CONSTRUCT { ?rsc rdf:type :Person } WHERE { ?rsc rdf:type :Patient . :Patient rdfs:subClassOf :Person . }
The WHERE clause checks the graph for triples that satisfy both conditions:
?rsc rdf:type :Patient
:Patient rdfs:subClassOf :Person
For each match, the CONSTRUCT query returns the corresponding triple, which can then be added to the graph:
?rsc rdf:type :Person
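The effect of the CONSTRUCT query can be mimicked in a few lines of Python. This is a sketch under the assumption that the graph is a plain set of (subject, predicate, object) tuples. It uses the generalized form of the pattern (any class ?c1 with a superclass ?c2) rather than the fixed :Patient and :Person classes of the slide.

```python
def construct_superclass_types(graph):
    """Mimic: CONSTRUCT { ?x rdf:type ?c2 }
              WHERE { ?x rdf:type ?c1 . ?c1 rdfs:subClassOf ?c2 }"""
    return {
        (x, "rdf:type", c2)
        for (x, p1, c1) in graph if p1 == "rdf:type"
        for (s, p2, c2) in graph if p2 == "rdfs:subClassOf" and s == c1
    }


asserted = {
    (":Patient6", "rdf:type", ":Patient"),
    (":Patient", "rdfs:subClassOf", ":Person"),
}

inferred = construct_superclass_types(asserted)
# CONSTRUCT only returns triples; adding them to the graph is a separate step.
combined = asserted | inferred
print(sorted(inferred))
```

Keeping `inferred` separate from `asserted` until the final union mirrors the distinction between what the query returns and what is stored.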
Transitive Inferencing Example
PREFIX o: http:// ontologies.com/Ontology owl
SELECT ?policy WHERE { ?policy rdf:type o:Permission . OPTIONAL { ?policy o:haspolicy ?n } FILTER(?n = "MCCInfConsent") }
SWRL counterpart: Permission(?ppolicy) ^ hasPolicy(?ppolicy, "MCCInfConsent")
Inference Result
The results are returned in RDF graph form and can be inserted directly into an existing graph.
Depending on the inference protocol being followed, the result is either added to a separate inferred graph or saved into the asserted graph.
Inference Result
SPARQL 1.1: inferred triples can be asserted directly into your data through the INSERT syntax (written INSERT INTO in the earlier SPARQL/Update drafts).
INSERT { ?x rdf:type ?c2 } WHERE { ?c1 rdfs:subClassOf ?c2 . ?x rdf:type ?c1 . }
SPARQL for Inferencing V
SPARQL 1.1: INSERT is part of the general SPARQL Update syntax, which also includes DELETE (the earlier SPARQL/Update drafts wrote INSERT INTO, DELETE FROM, and MODIFY GRAPH). This means that one can update existing triples.
An example of the combined DELETE/INSERT syntax:
DELETE { ?rsc ?someprop ?value } INSERT { ?rsc ?someprop ?newvalue } WHERE { ?rsc ?someprop ?value.... ?newvalue. }
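A combined delete and insert can be sketched in Python as follows. The assumptions are that the graph is a set of tuples and that the new value is computed by a caller-supplied function (`new_value_for` is a hypothetical helper standing in for the part of the WHERE clause that binds ?newvalue). The sketch evaluates the matches once against the original graph, removes the old triples, and then adds the new ones.

```python
def delete_insert(graph, prop, new_value_for):
    """Replace the value of `prop` on every triple that uses it.

    Both templates are instantiated from the same set of matches, computed
    on the graph as it was before the update.
    """
    matches = [(s, p, o) for (s, p, o) in graph if p == prop]
    to_delete = set(matches)
    to_insert = {(s, p, new_value_for(o)) for (s, p, o) in matches}
    return (graph - to_delete) | to_insert


# Hypothetical data: increment a patient's recorded age by one year.
graph = {
    (":Patient6", "rdf:type", ":Patient"),
    (":Patient6", ":age", 23),
}
updated = delete_insert(graph, ":age", lambda value: value + 1)
print(sorted(updated, key=str))
```

Triples that do not match the pattern, such as the rdf:type triple here, pass through unchanged.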
SPARQL vs. SQL
SPARQL is to RDF what SQL is to the relational model.
Relational model vs. RDF model (Linked Data)
Many SQL commands have an equivalent in SPARQL (e.g., OPTIONAL can be seen as a left outer join).
SPARQL OPTIONAL Command
SPARQL has the ability to query for data without failing the query when that data does not exist.
SELECT ?name ?age WHERE { ?person :hasFirstName ?name . OPTIONAL { ?person :age ?age } }
| name          | age |
=======================
| "Becky Smith" | 23  |
| "Sarah Jones" |     |
| "John Smith"  | 25  |
| "Matt Jones"  |     |
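The left-outer-join behaviour of OPTIONAL can be sketched in Python with two dictionaries. The names and ages are taken from the result table above, while the person identifiers (:p1 to :p4) are hypothetical. A missing age becomes None instead of removing the row.

```python
# Hypothetical person identifiers; names and ages follow the table above.
first_name = {
    ":p1": "Becky Smith",
    ":p2": "Sarah Jones",
    ":p3": "John Smith",
    ":p4": "Matt Jones",
}
age = {":p1": 23, ":p3": 25}

# With OPTIONAL: every person with a name is kept, age may be unbound (None).
with_optional = [(name, age.get(person)) for person, name in first_name.items()]

# Without OPTIONAL: both patterns are required, so people without an age drop out.
without_optional = [
    (name, age[person]) for person, name in first_name.items() if person in age
]

print(with_optional)
print(without_optional)
```

Comparing the two lists shows the point of the slide: the required pattern alone decides which rows exist, and the optional pattern only fills in extra columns.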
How to choose a reasoning language?
What is happening in the back end? RDF vs. OWL querying
Diagram layers: RDF/RDFS Ontologies, OWL Ontologies, Rules/Queries (SPARQL, SWRL), Application Level
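How to choose a reasoning language?
| Criteria                           | Protégé SWRL          | SPARQL        |
========================================================================
| Ontology type                      | OWL                   | RDF/RDFS      |
| Ease of use / intuitiveness        | Easy                  | Difficult     |
| User level                         | Non-programmer        | Programmer    |
| Performance                        | NP-complete, not good | Good          |
| Flexibility to improve performance | Less flexible         | More flexible |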
Conclusion
Inference is needed to get more information from ontologies.
SWRL is a higher-level reasoning language for OWL ontologies.
SPARQL is a lower-level language for querying and reasoning over RDF ontologies.
SWRL is more intuitive; SPARQL is more technical (no semantics).