Published by Willis Washington. Modified over 8 years ago.
1
Ontology Web Services from the National Center for Biomedical Ontology
Mark Musen and Nigam Shah
{musen, nigam}@stanford.edu
2
www.bioontology.org/wiki/index.php/Tutorial_Examples
www.bioontology.org/wiki/index.php/NCBO_REST_services
3
NCBO: Key activities
– We create and maintain a library of biomedical ontologies.
– We build tools and Web services to enable the use of ontologies and their derivatives.
– We collaborate with scientific communities that develop and use ontologies.
4
www.bioontology.org
5
Go to BioPortal
6
Total Monthly Visits to BioPortal
8
PART-I
9
http://rest.bioontology.org
– Ontology Services: Download, Traverse, Search, Comment
– Widgets: Tree-view, Auto-complete, Graph-view
– Annotation: Term recognition
– Data Access: Fetch “data” annotated with a given term
– Mapping Services: Create, Download, Upload
– Views
http://bioportal.bioontology.org
10
ONTOLOGY SERVICES Accessing, browsing, searching and traversing ontologies in your application
11
www.bioontology.org/wiki/index.php/NCBO_REST_services
12
http://rest.bioontology.org/
Code
Specific UI
13
http://rest.bioontology.org/bioportal/ontologies
14
http://rest.bioontology.org/bioportal/search/melanoma/?ontologyids=1351
15
http://rest.bioontology.org/bioportal/virtual/ontology/1351/D008545
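The three URLs on slides 13 to 15 share one base path, so they can be assembled programmatically. The sketch below only builds the URL strings and does not contact the service, which dates from the time of the presentation and may no longer respond. The function names are hypothetical, and joining several ontology ids with commas is an assumption; the slides only show a single id.

```python
from urllib.parse import quote, urlencode

# Base path as it appears on the slides.
BASE = "http://rest.bioontology.org/bioportal"


def ontologies_url():
    """URL that lists the ontologies (slide 13)."""
    return f"{BASE}/ontologies"


def search_url(query, ontology_ids=()):
    """URL that searches for a term, optionally limited to some ontologies (slide 14).

    Assumption: multiple ids are comma-separated; the slide shows only one id.
    """
    url = f"{BASE}/search/{quote(query, safe='')}/"
    if ontology_ids:
        ids = ",".join(str(i) for i in ontology_ids)
        url += "?" + urlencode({"ontologyids": ids}, safe=",")
    return url


def term_url(virtual_ontology_id, concept_id):
    """URL that fetches one term from the latest version of an ontology (slide 15)."""
    return f"{BASE}/virtual/ontology/{virtual_ontology_id}/{quote(concept_id, safe='')}"


if __name__ == "__main__":
    print(ontologies_url())
    print(search_url("melanoma", [1351]))
    print(term_url(1351, "D008545"))
```

Called with the values from the slides, the three helpers reproduce the three URLs shown above.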
16
Wikipathways uses Ontology Services
18
Biositemaps Editor
19
VIEWS Custom subsets of large ontologies
20
Views and Value Sets
– Users can contribute their derivatives of BioPortal ontologies, which become first-class objects in BioPortal and can be used like any other ontology (e.g., as value sets).
– Recently added: a view-extractor service that enables users to extract a subtree of an ontology in OWL.
21
Views in BioPortal
22
MAPPINGS Using NCBO technology to integrate terminologies and ontologies
23
Mappings
[Figure: terms of Ontology A (Root, Term-1 to Term-5) mapped to terms of Ontology B (R, t1 to t7); mapped pairs shown include Term-2 with t1 and Term-5 with t5.]
Upload or download mapping subsets
24
Using Mappings for query federation
[Figure: direct mappings FROM (site #1) TO (site #2) among the terms Seizure, Single Seizure, Partial Seizure, Complex Seizure, Seizure NOS, Epilepsy, Temporal Epilepsy, Partial Epilepsy, and Convulsion disorder.]
25
WIDGETS Using NCBO technology on your web pages
26
Ontology Widgets
UI components with “BioPortal inside”:
– term-selection widget for a specific ontology
– form fields with auto-complete from a specific BioPortal ontology
– RSS feed for an ontology
– Visualization widget
– Tree widget
27
ANNOTATOR SERVICE Using Ontologies to Annotate Your Data
28
Annotation as a Web service Process textual metadata to automatically tag text with as many ontology terms as possible.
29
Annotator: workflow
“Melanoma is a malignant tumor of melanocytes which are found predominantly in skin but also in the bowel and the eye.”
– 39228/DOID:1909, Melanoma in Human Disease
Transitive closure
– 39228/DOID:191, Melanocytic neoplasm, direct parent of Melanoma in Human Disease
– 39228/DOID:0000818, cell proliferation disease, grandparent of Melanoma in Human Disease
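The transitive-closure step on this slide can be sketched as a walk up the parent links from a directly recognized term. The parent table below contains only the three terms named on the slide; the function name and the table layout are illustrative assumptions, not the Annotator's actual implementation.

```python
from collections import deque

# Parent links for the three Human Disease terms named on the slide only.
PARENTS = {
    "DOID:1909": ["DOID:191"],      # Melanoma -> Melanocytic neoplasm
    "DOID:191": ["DOID:0000818"],   # Melanocytic neoplasm -> cell proliferation disease
    "DOID:0000818": [],
}


def ancestors_with_distance(term, parents):
    """Return {ancestor: number of is_a steps from term} by breadth-first search."""
    distances = {}
    queue = deque([(term, 0)])
    while queue:
        node, depth = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in distances:  # keep the shortest path only
                distances[parent] = depth + 1
                queue.append((parent, depth + 1))
    return distances


if __name__ == "__main__":
    # A direct annotation with Melanoma expands to its parent and grandparent.
    for ancestor, level in ancestors_with_distance("DOID:1909", PARENTS).items():
        print(ancestor, "at level", level)
```

Level 1 corresponds to the "direct parent" on the slide and level 2 to the "grandparent".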
30
Annotator service: multiple ways to access
– Code
– Specific UI
– Excel
– UIMA platform
– Word Add-in to call the Annotator Service?
31
DATA SERVICE Using Ontologies to Access Public Data
32
Resource index: The Basic Idea
The index can be used for:
– Search
– Data mining
33
Resource index: Example
34
Resource Index: multiple ways to access
– Code
– Specific UI
– Resource Tab
Resources annotated = 22
Total records = 3.5 million
Direct annotations = … million
After transitive closure = 16.4 billion
35
PART-II
36
Use-cases based on ontology services
37
Sample user needs
– I need to restrict user input to a certain value set
– I need to extract the disease branch from SNOMEDCT
– I need to identify all terms mapped to UMLS CUI C0151779
– I need to code/annotate free-text with ontology terms (for data exchange, export to standard formats)
38
Use-cases for users of i2b2
39
Aim 1: Integrate NCBO services in i2b2
Preliminary results:
– Export any ontology stored in BioPortal into the format used by i2b2’s ontology cell
Future work:
– Make the export code available as a service
– Embed the extraction code into the i2b2 Ontology Cell to “pull” content
– Ensure we have the latest versions of ontologies used by i2b2 and CTSA users (ICD9, ICD10, SNOMEDCT, RXNORM, LOINC, CPT)
40
Aim 2: Mappings for query federation
Preliminary results:
– Worked out the workflow for using mappings for query translation
– Detailed discussions with the HOM and OpenMDR groups to define the use case and elicit requirements
Future work:
– Use BioPortal as the shared repository for inter-terminology mappings
– Tackle access, IP, performance, and institutional issues
Key features:
– Import outside mappings
– Update mappings when versions change
– Mechanism to curate mappings
– Support proprietary curation and content
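As a rough illustration of query translation through mappings, the sketch below rewrites a query expressed in one site's terms into another site's terms using a mapping table. The term pairings are hypothetical and chosen only for illustration; the term names come from the federation slide, but the actual arrows between them are not recoverable from this transcript. The function name is also an assumption.

```python
# Hypothetical direct mappings FROM site #1 terms TO site #2 terms.
# The pairings are illustrative only, not the ones drawn on the slide.
SITE1_TO_SITE2 = {
    "Seizure": ["Convulsion disorder"],
    "Partial Seizure": ["Partial Epilepsy"],
}


def translate_query(terms, mappings):
    """Translate query terms via direct mappings.

    Returns (sorted translated terms, list of source terms with no mapping),
    so the caller can see which parts of the query could not be federated.
    """
    translated = set()
    unmapped = []
    for term in terms:
        targets = mappings.get(term)
        if targets:
            translated.update(targets)
        else:
            unmapped.append(term)
    return sorted(translated), unmapped


if __name__ == "__main__":
    print(translate_query(["Seizure", "Complex Seizure"], SITE1_TO_SITE2))
```

Reporting unmapped terms separately matters for federation: a query that silently drops a term would return different results at each site.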
41
Using Mappings for query federation
[Figure: direct mappings FROM (site #1) TO (site #2) among the terms Seizure, Single Seizure, Partial Seizure, Complex Seizure, Seizure NOS, Epilepsy, Temporal Epilepsy, Partial Epilepsy, and Convulsion disorder.]
42
Use-cases based on automated annotation
43
Ontology based annotation 20 diseases
44
Disease card
45
Linking annotations to data (by Simon Twigger)
[Figure: genes Tm2d1, RGD1306410, Svs4, Hbb, Scgb2a1, Alb]
46
Hbb is_expressed_in rat kidney Tm2d1 is_expressed_in rat kidney
47
Annotation Analytics
48
Generic GO-based analysis routine
– Get annotations for each gene in the list
– Count the occurrences (x) of each annotation term in the gene list
– Count the occurrences (y) of that term in some reference set (whole genome?)
– Compute a p-value for how “surprising” it is to find x, given y
[Figure: Set with count x, Reference with count y]
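The last step of the routine can be sketched with a one-sided hypergeometric test, which is a common choice for this kind of enrichment analysis. The slide does not name the test, so treating it as hypergeometric is an assumption, as are the function and parameter names.

```python
from math import comb


def enrichment_p_value(x, list_size, y, reference_size):
    """Probability of seeing at least x annotated genes in the list by chance.

    x              -- genes in the list annotated with the term
    list_size      -- total genes in the list
    y              -- genes in the reference set annotated with the term
    reference_size -- total genes in the reference set

    Assumes the gene list is drawn from the reference set without replacement
    (hypergeometric model).
    """
    total = comb(reference_size, list_size)
    upper = min(list_size, y)
    favourable = sum(
        comb(y, k) * comb(reference_size - y, list_size - k)
        for k in range(x, upper + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Toy numbers: 10 reference genes, 3 carry the term; a list of 4 genes has 2 of them.
    print(enrichment_p_value(2, 4, 3, 10))
```

In practice the same calculation is repeated for every annotation term, so the resulting p-values usually need a multiple-testing correction before they are interpreted.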
49
Annotation Analytics Landscape
[Figure: ontologies and vocabularies (SNOMED-CT, Gene Ontology, NCIT, ICD-9, Human Disease, Cell Type, MeSH, Drugs, Chemicals) set against annotated collections (Gene Sets, Grant Sets, Paper Sets, Patient Sets, Drug Sets, …, Health Indicator Warehouse datasets).]
50
Mutation enrichment
51
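Annotation Analytics Landscape
[Figure: ontologies and vocabularies (SNOMED-CT, Gene Ontology, NCIT, ICD-9, Human Disease, Cell Type, MeSH, Drugs, Chemicals) set against annotated collections (Gene Sets, Grant Sets, Paper Sets, Patient Sets, Drug Sets, …, Health Indicator Warehouse datasets). Marked entries: Mut, ?, ?]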
52
Ontology neutral enrichment analysis
53
[Figure: Set with count x, Reference with count unknown (?)]
54
Using ontologies other than GO
ERCC6: nucleoplasm
PARP1: protein N-terminus binding
55
Enrichment Analysis with the DO
GO annotations (http://www.geneontology.org/GO.downloads.annotations.shtml):
Gene    GO term       Reference
ERCC6   GO:0005654    PMID:16107709
ERCC6   GO:0008094    PMID:16107709
PARP1   GO:0047485    PMID:16107709
ERCC6   GO:0005730    PMID:16107709
PARP1   GO:0003950    PMID:16107709
{ERCC6, PARP1} → PMID:16107709 (www.ncbi.nlm.nih.gov/pubmed/16107709) → NCBO Annotator (http://bioportal.bioontology.org) → {Cockayne syndrome, DNA damage}
{ERCC6, PARP1}: {Cockayne syndrome, DNA damage}
56
P35226, P04626, P38646, P50539, O95622, P04150, P07900, Q12805, P01375, P54098, P00533, P02545, P02649, P04637, P05067, P05549, P08047, P08138, P10636, P15692, P25963, P29353, P29590, P49768, P62993, Q00987, Q04206, Q13526, Q16643, Q8N726, P00441, P05019, P05231, P35354, P10909, Q06830, P15502, Q9UEF7, P01137, P04271, O15379, O95831, P09874, Q13315, Q7Z2E3, Q9UNE7, P01127, P01308, P02656, P07203, P09619, P17936, P18031, P19838, P27169, P42771, P45984, Q07869, Q14191, P08069, P68104, P01344, P06400, P09884, P10809, P25445, O43684, P17948, P48507, P28069, P16885, P18146, P35558, Q99683, P18074, P19447, P28715, Q03468, Q13216, Q13888, P16220, P35222, Q16665, P07949, P11362, P01023, P01286, Q9NYJ7, O00555, O15530, P01138, P17252, P31749, P63165, P55851, O76070, P01241, P13232, P16871, P22061, P28340, P31785, P48047, P63279, P48637, P01100, P17535, O14746, O15297, O60934, O96017, P00519, P01106, P04040, P05412, P06493, P07992, P09429, P10415, P11388, P12004, P12956, P13010, P16104, P21675, P23025, P26583, P27361, P27694, P27695, P35249, P35638, P38398, P39748, P40692, P43351, P45983, P49715, P49841, P51587, P54132, P54274, P55072, P60484, P63104, P78527, Q02880, Q05655, Q06609, Q07812, Q13535, Q13547, Q15554, Q16539, Q92769, Q92793, Q92889, Q96EB6, Q96ST3, Q9H3D4, P20700, Q07960, O75360, P10912, P50402, P04179, O75376, O75907, P01116, P17676, P23560, P60568, P62136, P98164, Q14186, Q14289, Q08050, Q00653, Q05195, P42858, Q9GZV9, P48357, P03372, P10275, P15336, P35568, Q02643, Q12778, Q9Y4H2, P06213, P08107, P11142, O60674, P42229, P51692, Q9UJ68, Q02297, P60953, P00749, P55916, Q96G97, P01112, P09211, P09936, P48506, Q15831, P11387, Q13253, O60566, P01133, P10599, P15923, P19235, P20226, P20248, P27986, P40763, P42338, P61244, P62979, Q05397, Q06124, Q09472, Q14526, Q15648, Q9UBK2, O60381, O94761, P29279, Q9UBX0, P42345, Q01094, P06746, Q8N6T7, O43524, P50542, O00327, O15120, O15217, O15243, O15516, O75844, O95985, P00390, P00395, P09629, P13639, P20382, P25874, P32745, 
P36969, P61278, P62987, P78406, P98177, Q00613, Q13219, Q99643, Q99807, Q9UBI1
Profiling a set of Aging genes
Ageing-related genes (261) – http://genomics.senescence.info/genes/
57
Profiling patient sets
Patient Reports: ICD9 789.00 (Abdominal pain, unspecified)
Patient records processed from the U. Pittsburgh NLP Repository with IRB approval.
58
Annotation Analytics Landscape
[Figure: ontologies and vocabularies (SNOMED-CT, Gene Ontology, NCIT, ICD-9, Human Disease, Cell Type, MeSH, Drugs, Chemicals) set against annotated collections (Gene Sets, Grant Sets, Paper Sets, Patient Sets, Drug Sets, …, Health Indicator Warehouse datasets). Marked entries: Aging, EMRs, Mut]
What questions can we ask?
59
ANNOTATION ANALYTICS - II Analysis of semantically tagged data from electronic health records
60
Annotation Workflow
– Creating clean lexicons: BioPortal knowledge graph (Diseases, Procedures, Drugs) yields Term-1 … Term-n, filtered by syntactic types and frequency
– Term recognition tool: NCBO Annotator
– Negation detection: NegEx patterns and NegEx rules
– Generation of tagged data: text clinical note, then terms recognized, then negation detection
– Cohort of interest: patients P1, P2, P3, …, Pn, each with ICD9 codes and tagged notes
– Terms form a temporal series of tags, e.g. T1, T2, no T4 … T5, T4, T3 … T4, T3, T1; T8, T9, T4 … T6, T8, T10
– Further analysis
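The tagging step of this workflow, where recognized terms are emitted as tags and negated mentions become "no T4"-style tags, can be sketched in a few lines. This is a deliberately simplified stand-in: it is neither the NCBO Annotator nor the NegEx algorithm. The trigger list, the fixed look-back window, and the single-word lexicon are all assumptions made for illustration.

```python
import re

# Hypothetical, minimal trigger list; NegEx itself uses a much richer set of
# patterns and rules.
NEGATION_TRIGGERS = ("no", "denies", "without")


def tag_note(text, lexicon, window=3):
    """Return the lexicon terms found in a note, prefixing negated ones with 'no '.

    A term counts as negated if a trigger word occurs within `window` tokens
    before it. Only single-word terms are handled, and sentence boundaries are
    ignored, so this is a sketch rather than a usable negation detector.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    tags = []
    for i, token in enumerate(tokens):
        if token in lexicon:
            preceding = tokens[max(0, i - window):i]
            negated = any(t in NEGATION_TRIGGERS for t in preceding)
            tags.append(("no " if negated else "") + token)
    return tags


if __name__ == "__main__":
    note = "Patient reports fever and cough, no rash."
    print(tag_note(note, {"fever", "cough", "rash"}))
```

Running the tagger over each of a patient's notes in date order gives the temporal series of tags described on the slide.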
61
Adverse drug events
ROR of 2.058, CI of [1.804, 2.349]; PRR of 1.828, CI of [1.645, 2.032]. The uncorrected χ² statistic has p-value < 10^-7.
ROR = 1.524, CI = [0.872, 2.666]; PRR = 1.508, CI = [0.8768, 2.594]; χ² p-value = 0.06816.
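For readers unfamiliar with the two measures, the sketch below computes a reporting odds ratio (ROR) and a proportional reporting ratio (PRR) with approximate 95% confidence intervals from a 2×2 table, using the standard log-scale normal approximation. The slide does not give the underlying counts, so the numbers in the example are hypothetical and do not reproduce the values above; the function names are also assumptions.

```python
from math import exp, log, sqrt


def _with_ci(estimate, se_log, z=1.96):
    """Return (estimate, lower, upper) using a normal approximation on the log scale."""
    return (
        estimate,
        exp(log(estimate) - z * se_log),
        exp(log(estimate) + z * se_log),
    )


def reporting_odds_ratio(a, b, c, d):
    """ROR for a 2x2 table; all four cells must be non-zero.

    a: drug and event      b: drug, no event
    c: no drug, event      d: no drug, no event
    """
    ror = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return _with_ci(ror, se_log)


def proportional_reporting_ratio(a, b, c, d):
    """PRR for the same 2x2 table layout as reporting_odds_ratio."""
    prr = (a / (a + b)) / (c / (c + d))
    se_log = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return _with_ci(prr, se_log)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(reporting_odds_ratio(20, 80, 10, 90))
    print(proportional_reporting_ratio(20, 80, 10, 90))
```

A confidence interval that excludes 1, as in the first result on the slide, suggests a disproportionate association; one that includes 1, as in the second, does not.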
62
Off-label use
63
Analyses on semantically tagged data
1. Discovering or predicting adverse drug events
2. Predicting a labeled outcome (readmissions)
3. Learning associations between terms of type intervention, disease, finding, side effects, drugs
4. Predicting rejection rates in billing/claims processing
5. Learning off-label usage patterns
[Figure: ontologies and vocabularies (SNOMED-CT, Gene Ontology, NCIT, ICD-9, Human Disease, Cell Type, MeSH, Drugs, Chemicals) set against annotated collections (Gene Sets, Grant Sets, Paper Sets, Patient Sets, Drug Sets, …, Health Indicator Warehouse datasets). Marked entries: Aging, EMRs, Mut]
64
THE END
65
Credits
Mark Musen, PI
The team @ www.bioontology.org/project-team
NIH Roadmap grant U54 HG004028