
The Ontology for Biobanking Chris Stoeckert, Jie Zheng, and Mathias Brochhausen University of Pennsylvania and University of Arkansas for Medical Sciences.


1 The Ontology for Biobanking
Chris Stoeckert, Jie Zheng, and Mathias Brochhausen
University of Pennsylvania and University of Arkansas for Medical Sciences
CTS Ontology Workshop, Charleston, SC, Sept. 24-25, 2015

2 Finding specimens and associated donor information across biobanks ranges from difficult to impossible
Challenges:
– Not all biobanks are the same, and their terminology needs vary.
– Different terminologies have been created (by biobanks and others) to cover those needs.
How do we leverage and share these valuable resources?
Examples of motivating use cases:
– What specimen types (DNA, RNA, frozen tissue in OCT) are available for research, and who is the contact person regarding access?
– Identify cases and controls from a population of patients who have EDTA blood. Match them based on prescription and diagnosis data from the patients' EMR, and on basic demographic data collected at the time of recruitment.
– Are circulating tumor cells available from breast cancer participants with HER+, ER-, PR- status? What are the passage numbers for the cell lines? How many freeze/thaw cycles?

3 What do we gain with an Ontology for Biobanking?
Pre-existing resources (e.g. CaTissue) provide only a partial solution.
– The NCI Common Biorepository Model lacks robust definitions and addresses only metadata about biospecimen collections, not individual specimen or participant data.
The Ontology for Biomedical Investigations (OBI) and related OBO Foundry ontologies provide a basis for accommodating and integrating relevant terminologies.
– OBI covers specimens and is interoperable with other OBO Foundry ontologies covering areas such as disease (the Human Disease Ontology).
– Together they provide mechanisms for referring to non-OBO Foundry ontologies and terminologies.
– Using these, we have built an application ontology, the Ontology for Biobanking.

4 Ontology for Biomedical Investigations
OBI aims to capture all aspects of a biological or clinical investigation (investigation, assay, specimen, protocol, device, data, data analysis, etc.), providing a semantic framework for modeling an investigation.
Things to know:
– a member of the OBO Foundry (has gone through a review process)
– interoperable with other ontologies following OBO Foundry principles, such as the Gene Ontology (GO)
– uses the Basic Formal Ontology (BFO) as its top-level ontology
– uses the Information Artifact Ontology (IAO) for general information entities
Details on OBI can be found at:
– http://obi-ontology.org/
– Ryan R Brinkman, Mélanie Courtot, Dirk Derom, Jennifer M Fostel, Yongqun He, Phillip Lord, James Malone, Helen Parkinson, Bjoern Peters, Philippe Rocca-Serra, Alan Ruttenberg, Susanna-Assunta Sansone, Larisa N Soldatova, Christian J Stoeckert, Jr., Jessica A Turner, Jie Zheng, and the OBI consortium. Modeling biomedical experimental processes with OBI. J Biomed Semantics. 2010.
The release version of OBI is available on:
– NCBO Bioportal website: http://bioportal.bioontology.org/ontologies/OBI
– Ontobee website: http://www.ontobee.org/browser/index.php?o=OBI
The latest release version of OBI is available at:
– http://purl.obolibrary.org/obo/obi.owl

5 OBI high-level structure illustrates ontology integration
No mapping needed!
[Figure: "is a" hierarchy of high-level classes, each marked by its source (BFO, IAO, other OBO ontologies, or OBI). Classes shown: entity, continuant, occurrent, independent continuant, specifically dependent continuant, generically dependent continuant, quality, role, information content entity, data item, document, textual entity, plan specification, study design, planned process, investigation, assay, specimen collection, material component separation, material combination, material processing, material maintenance, process, biological process (GO), material entity, processed material, specimen, processed specimen, gross anatomical part (CARO), organism (NCBI taxonomy), molecular entity (ChEBI), organization, device.]
OBI developers have driven integration by pushing for common adoption of BFO2 classes, updated releases of IAO and RO, and agreement on overlapping terms.

6 A strength of OBI is modeling the processes that connect biological source material to the data generated about it
[Figure: OBI assay model, illustrated by the measurement of glucose concentration in blood.]

7 OBI as basis for the Ontology for Biobanking (OBIB)
Two independent efforts used OBI as a basis for capturing different aspects of biobanking.
– OMIABIS: biobank administration
– Penn biobank ontology: patient-to-specimen tracking
– These were merged without semantic conflicts into OBIB.
We are not creating yet another new terminology. Instead, we are leveraging the valuable work done by experts in the areas we need.
– Follow (OBO Foundry) best practices for term re-use
– Extend this approach as an integrative framework for other terminologies
– Use these as building blocks for including new terms

8 Ontology for Biobanking (OBIB) is an Application Ontology based on OBI

9 With OBIB, we can follow a person through enrollment, getting their history and vitals, and collecting a blood specimen.

10 Current status of OBIB
Summary statistics for version 2015-04-13:
– 511 classes
– 79 object properties
– 38 annotation properties
OBIB is open source and is available at:
– https://github.com/biobanking/biobanking
– NCBO Bioportal: http://bioportal.bioontology.org/ontologies/OBIB
– Ontobee: http://www.ontobee.org/browser/index.php?o=OBIB
OBIB development is being driven by biobank use cases.
– Now from other biobanks!

11 Penn Medicine Biobank competency question: Obtaining matched case/control cohorts
Query: Generate lists of potential cases and potential controls for given criteria.
Cases are patients with Type 2 diabetes who took a particular prescription statin at or around the time of recruitment/specimen collection and have an EDTA specimen available. In practice, "around the time of recruitment" was approximated as a prescription dated 5 to 250 days before the date of recruitment.
Controls are patients with Type 2 diabetes who have no history of taking statins in any form and have an EDTA specimen available. Controls are matched to the selected cases by gender, age at recruitment, and body mass index.
The query is non-trivial because it requires ad hoc integration across medical records, prescription orders, case report forms, and specimen inventories.
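The case and control criteria on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PMBB implementation: the `Patient` record, its field names, the greedy one-to-one matching, and the age and BMI tolerances are all hypothetical. Only the 5 to 250 day prescription window and the matching variables (gender, age at recruitment, body mass index) come from the slide.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class Patient:
    # Hypothetical flattened record. In practice these fields come from
    # separate systems (EMR, prescription orders, case report forms,
    # specimen inventory), which is what makes the query non-trivial.
    patient_id: str
    gender: str
    age_at_recruitment: int
    bmi: float
    has_type2_diabetes: bool
    recruitment_date: date
    has_edta_specimen: bool
    # Prescription dates for the particular statin of interest.
    statin_rx_dates: List[date] = field(default_factory=list)
    # True if the patient has any history of statins in any form.
    ever_took_any_statin: bool = False


def is_case(p: Patient, min_days: int = 5, max_days: int = 250) -> bool:
    """Type 2 diabetes, EDTA specimen, and a statin prescription dated
    5 to 250 days before recruitment (the window given on the slide)."""
    if not (p.has_type2_diabetes and p.has_edta_specimen):
        return False
    return any(
        min_days <= (p.recruitment_date - rx).days <= max_days
        for rx in p.statin_rx_dates
    )


def is_control(p: Patient) -> bool:
    """Type 2 diabetes, EDTA specimen, and no statin history at all."""
    return (p.has_type2_diabetes and p.has_edta_specimen
            and not p.ever_took_any_statin)


def match_controls(cases: List[Patient], controls: List[Patient],
                   max_age_diff: int = 5,
                   max_bmi_diff: float = 2.0) -> List[Tuple[str, str]]:
    """Greedy 1:1 matching on gender, age, and BMI.
    The tolerances are illustrative assumptions, not from the slides."""
    available = list(controls)
    pairs = []
    for case in cases:
        candidates = [
            c for c in available
            if c.gender == case.gender
            and abs(c.age_at_recruitment - case.age_at_recruitment) <= max_age_diff
            and abs(c.bmi - case.bmi) <= max_bmi_diff
        ]
        if not candidates:
            continue
        # Prefer the closest age, then the closest BMI.
        best = min(candidates, key=lambda c: (
            abs(c.age_at_recruitment - case.age_at_recruitment),
            abs(c.bmi - case.bmi)))
        pairs.append((case.patient_id, best.patient_id))
        available.remove(best)
    return pairs
```

A typical use would filter a patient list with `is_case` and `is_control` and then pass the two lists to `match_controls`. The hard part in practice is populating `Patient` from the separate sources, which is the integration problem the ontology model addresses.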

12 OBIB Model Integrates Medical Records, Case Report Forms, and Specimen Inventories Red text indicates data in the resources to be instantiated in the ontology model.

13 We can expand the electronic medical record to include ICD-9 codes for diagnoses and RxNorm codes for drugs. We can make use of other ontologies following OBO Foundry principles, such as the Human Disease Ontology and the Drug Ontology. These have internal mappings to UMLS and RxNorm, respectively, that can be used to search ICD codes and prescription orders.

14 Using OBIB to find cohorts of biobank specimens with the PMBB Carnival System
[Figure: data flow in which relational data from each source is converted by RDF conversion software, guided by an R2RML mapping, into RDF that is loaded into a graph database together with an application ontology drawing on OBIB (Ontology for Biobanking), OBI, DRON (Drug Ontology), and DOID (Disease Ontology).]
1. With the help of local domain experts, OBO ontology experts generate an ontology model using OBIB that includes the portions of OBO ontologies relevant to the data sources.
2. For each data source, local data experts reference the ontology model to create an R2RML file that maps the relational data and their domain knowledge to a graph format. They instantiate the OBIB model, reflecting the naming convention they used for data instances that might be shared with other data sources. An RDF conversion tool uses the mapping file and the relational data to generate RDF triples.
3. The RDF data and any relevant OBO ontologies are loaded into a graph database. Data from the separate data sources are now related in accordance with the experts' domain knowledge via the ontologies.
4. Queries can be performed over the graph database by referencing the OBIB model. No specific knowledge about the structure or format of the original data is necessary. Any domain knowledge, standards conversions (e.g. SNOMED, ICD), or scientific knowledge in the OBO ontologies is available to be queried and reasoned over, even if it is not in the original data sources.
Abbreviations: OBO = Open Biological and Biomedical Ontologies; R2RML = RDB to RDF Mapping Language.
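The four steps above can be mimicked with a toy example in plain Python. This is a sketch under heavy assumptions and not the Carnival system: the column-to-predicate dictionaries only stand in for R2RML mappings (they are not R2RML syntax), a Python set of triples stands in for the graph database, and human-readable labels stand in for OBO IRIs. All table, column, and predicate names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# Step 1 (toy): an "ontology model" as subclass assertions.
# Labels stand in for the OBO IRIs a real model would use.
ONTOLOGY: Set[Triple] = {
    ("EDTA blood specimen", "subClassOf", "blood specimen"),
    ("blood specimen", "subClassOf", "specimen"),
}


def rows_to_triples(rows: Iterable[Dict], id_column: str, id_prefix: str,
                    mapping: Dict[str, Tuple[str, str]]) -> Set[Triple]:
    """Step 2 (toy): turn relational rows into triples.
    `mapping` sends a column name to (predicate, object prefix); a shared
    prefix such as "patient/" is the naming convention that lets
    instances from different sources line up."""
    triples = set()
    for row in rows:
        subject = id_prefix + str(row[id_column])
        for column, (predicate, obj_prefix) in mapping.items():
            if row.get(column) is not None:
                triples.add((subject, predicate, obj_prefix + str(row[column])))
    return triples


# Two hypothetical sources with unrelated schemas.
emr_rows = [{"mrn": 1, "dx": "type 2 diabetes mellitus"},
            {"mrn": 2, "dx": "asthma"}]
inventory_rows = [
    {"specimen_id": "S1", "donor_mrn": 1, "kind": "EDTA blood specimen"},
    {"specimen_id": "S2", "donor_mrn": 2, "kind": "EDTA blood specimen"},
]

# Step 3 (toy): load the ontology and both sources into one graph.
graph: Set[Triple] = (
    ONTOLOGY
    | rows_to_triples(emr_rows, "mrn", "patient/",
                      {"dx": ("has diagnosis", "")})
    | rows_to_triples(inventory_rows, "specimen_id", "specimen/",
                      {"donor_mrn": ("derives from", "patient/"),
                       "kind": ("rdf:type", "")})
)


def superclasses(g: Set[Triple], cls: str) -> Set[str]:
    """Return cls together with all of its transitive superclasses."""
    seen, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in g:
            if s == current and p == "subClassOf" and o not in seen:
                seen.add(o)
                frontier.append(o)
    return seen


def patients_with(g: Set[Triple], diagnosis: str, specimen_class: str) -> Set[str]:
    """Step 4 (toy): query by ontology terms, not by source schema."""
    diagnosed = {s for s, p, o in g if p == "has diagnosis" and o == diagnosis}
    found = set()
    for s, p, o in g:
        if p == "rdf:type" and specimen_class in superclasses(g, o):
            donors = {d for s2, p2, d in g if s2 == s and p2 == "derives from"}
            found |= donors & diagnosed
    return found
```

Asking `patients_with(graph, "type 2 diabetes mellitus", "blood specimen")` finds patient 1 even though no source row mentions "blood specimen": the subclass assertion in the ontology supplies that knowledge, which is the point of step 4.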

15 Future directions for OBIB: Provide a framework for collaboration
Collaborating across biobanks
– Promoting OBIB as a mechanism for finding common semantics between biobanks (based on reality, not on what is stored in a database)
– Started a CTSA-based collaboration between Duke, Michigan, MUSC, Penn, and UAMS
– Want to extend to others (NCI–BBRB)
Collaborating with the Informed Consent Ontology
Currently working on integrating Duke terminology
– Identify common terms: Duke "sample" corresponds to OBI "specimen"; Duke "collect" corresponds to OBI "collecting specimen from organism"
– Build Duke terms from OBI/OBIB terms and logical axioms (defined classes!): Duke "sample family" = (submitted to OBI) collection of specimens that are the output of material processing of the same input

16 Future directions for OBIB: Extend coverage of related terminologies through OBI
How can we incorporate LOINC?
– We might start with the CHEM/Lab class/type in LOINC. For example:
– 5792-7: Glucose [Mass/volume] in Urine by Test strip
– 41653-7: Glucose [Mass/volume] in Capillary blood by Glucometer
– LOINC 41653-7 is closely related to OBI_000418: measuring glucose concentration in blood serum (example shown earlier).
Generalizing:
– The LOINC component, system, and method correspond to an OBI analyte assay with its analyte, evaluant, and specified input.
– In this example, analyte = glucose, evaluant = blood, specified input = glucometer.
– Note: the [Mass/volume] in LOINC is specified in OBI by the output measurement datum (measurement unit label = milligram per milliliter in this example).
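The generalization above lends itself to a small programmatic sketch. The `LoincTerm` and `AnalyteAssay` classes, their field names, and the generated label format are hypothetical illustrations, not OBI's actual design-pattern template. Only the correspondence (component to analyte, system to evaluant, method to specified input, property to the unit of the output measurement datum) is taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoincTerm:
    # A hand-parsed subset of LOINC parts; the field names are illustrative.
    code: str
    component: str   # e.g. "Glucose"
    system: str      # e.g. "Capillary blood"
    method: str      # e.g. "Glucometer"
    prop: str        # e.g. "Mass/volume"


@dataclass(frozen=True)
class AnalyteAssay:
    # Hypothetical record of the slots an OBI-style analyte assay fills.
    label: str
    analyte: str
    evaluant: str
    specified_input: str
    output_unit_kind: str   # carried by the output measurement datum
    source_loinc_code: str


def loinc_to_assay(term: LoincTerm) -> AnalyteAssay:
    """Map LOINC parts onto analyte assay slots as on the slide:
    component -> analyte, system -> evaluant, method -> specified input,
    property -> unit kind of the output measurement datum."""
    analyte = term.component.lower()
    evaluant = term.system.lower()
    device = term.method.lower()
    return AnalyteAssay(
        # The label format is an assumption, not an OBI naming rule.
        label=f"measuring {analyte} concentration in {evaluant} by {device}",
        analyte=analyte,
        evaluant=evaluant,
        specified_input=device,
        output_unit_kind=term.prop,
        source_loinc_code=term.code,
    )


glucometer_term = LoincTerm("41653-7", "Glucose", "Capillary blood",
                            "Glucometer", "Mass/volume")
test_strip_term = LoincTerm("5792-7", "Glucose", "Urine",
                            "Test strip", "Mass/volume")
```

Note that this sketch keeps the LOINC system verbatim ("capillary blood"), whereas the slide generalizes the evaluant to "blood". Choosing the right level of generality for the evaluant is a curation decision the code does not attempt to make.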

17 Acknowledgements
– Heather Williams (PMBB)
– David Birtwell (PMBB)
– OBI Consortium
– OBO Foundry
– Anna Maria Masci (Duke)
– Helena Ellis (Duke)

18 CHEM/Lab class/type in LOINC fits Assay in OBI
The established design pattern for assays can be used to programmatically add new analyte assays (and other types).

19 Hierarchy of OBI assay terms can provide CDISC desired level of granularity
[Figure: the OBI high-level "is a" hierarchy, with classes marked by source (BFO, IAO, other OBO ontologies, OBI), showing the current classes related to assay: analyte assay, clinical chemistry assay, and measuring glucose conc. in blood serum.]

20 Hierarchy of OBI assay terms can provide CDISC desired level of granularity
[Figure: the same hierarchy showing current and proposed classes, with LOINC marked as an additional source. Assay-related classes: analyte assay, clinical chemistry assay, measuring glucose conc. in blood serum, and clinical glucose assay, together with the LOINC terms 5792-7: Glucose [Mass/volume] in Urine by Test strip and 41653-7: Glucose [Mass/volume] in Capillary blood by Glucometer.]

