Erich Gombocz VP, Chief Scientific Officer IO Informatics, Inc. NCBO Seminar Series – Wednesday, August 5, 2009 – 10:00 AM PDT.

Slides:



Advertisements
Similar presentations
Molecular Systems Biology 3; Article number 140; doi: /msb
Advertisements

Data Visualization in Molecular Biology Alexander Lex July 29, 2013.
Integrative Organs Systems Scientists and Drug Discovery: The Link Between Big Pharma and Academia Glenn A. Reinhart, Ph.D. Senior Group Leader Integrative.
Prof. Carolina Ruiz Computer Science Department Bioinformatics and Computational Biology Program WPI WELCOME TO BCB4003/CS4803 BCB503/CS583 BIOLOGICAL.
1 Pharmaceutical Challenges for the Semantic Web.
Applications of Visualization and Data Clustering to 3D Gene Expression Data Oliver Rübel 1,2,3,7, Gunther H. Weber 3,7, Min-Yu Huang 1,7, E. Wes Bethel.
©2013 MFMER | slide-1 Building A Knowledge Base of Severe Adverse Drug Events Based On AERS Reporting Data Using Semantic Web Technologies Guoqian Jiang,
Biomarkers in Transplantation. A Genome Canada Initiative for Human Health Biomarkers in Transplantation A Knowledge Base for Allograft Rejection Benjamin.
Fungal Semantic Web Stephen Scott, Scott Henninger, Leen-Kiat Soh (CSE) Etsuko Moriyama, Ken Nickerson, Audrey Atkin (Biological Sciences) Steve Harris.
Automating Discovery from Biomedical Texts Marti Hearst & Barbara Rosario UC Berkeley Agyinc Visit August 16, 2000.
Agilent: The Company, The Myth, The Lengend. Agilent: Agilent Technologies Inc. (NYSE: A) is a world-wide, diverse technology company focused on expansion.
The Application of the Scientific Method: Preclinical Trials Copyright PEER.tamu.edu.
Mining the Medical Literature Chirag Bhatt October 14 th, 2004.
Data Mining – Intro.
DEMO CSE fall. What is GeneMANIA GeneMANIA finds other genes that are related to a set of input genes, using a very large set of functional.
Presented by Karen Xu. Introduction Cancer is commonly referred to as the “disease of the genes” Cancer may be favored by genetic predisposition, but.
Immune Cell Ontology for Networks (ICON) Immunology Ontologies and Their Applications in Processing Clinical Data June 11-13, Buffalo, NY.
Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.
LÊ QU Ố C HUY ID: QLU OUTLINE  What is data mining ?  Major issues in data mining 2.
JumpStart the Regulatory Review: Applying the Right Tools at the Right Time to the Right Audience Lilliam Rosario, Ph.D. Director Office of Computational.
1 The Discovery Informatics Framework Pat Rougeau President and CEO MDL Information Systems, Inc. Delivering the Integration Promise American Chemical.
Knowledgebase Creation & Systems Biology: A new prospect in discovery informatics S.Shriram, Siri Technologies (Cytogenomics), Bangalore S.Shriram, Siri.
The SADI plug-in to the IO Informatics’ Knowledge Explorer...a quick explanation of how we “boot-strap” semantics...
2007 GeneSpring MS GeneSpring for Metabolite BioMarker Analysis using Mass Spectrometry data Agilent Q-TOF VIP Visit Jan 16-17, 2007 Santa Clara, CA Thon.
Chapter 13. The Impact of Genomics on Antimicrobial Drug Discovery and Toxicology CBBL - Young-sik Sohn-
Introduction to Pharmacoinformatics
SP Transcription Factor network of cell migration Summary Hypothesis: We hypothesize that, by generating a network model for transcription factor regulation.
Networks and Interactions Boo Virk v1.0.
Dynamic Hypermedia Generations through a Mediator using CRM and Web Service Jen-Shin Hong National ChiNan University,Taiwan
Flexible Text Mining using Interactive Information Extraction David Milward
Helping scientists collaborate BioCAD. ©2003 All Rights Reserved.
Bioinformatics Brad Windle Ph# Web Site:
From Bench to Bedside: Applications to Drug Discovery and Development Eric Neumann W3C HCLSIG co-chair Teranode Corporation HCLSIG F2F Cambridge MA.
EADGENE and SABRE Post-Analyses Workshop 12-14th November 2008, Lelystad, Netherlands 1 François Moreews SIGENAE, INRA, Rennes Cytoscape.
Overview  Introduction  Biological network data  Text mining  Gene Ontology  Expression data basics  Expression, text mining, and GO  Modules and.
Why use a Database B8 B8 1.
NIH Common Fund Library of Integrated Network- based Cellular Signatures LINCS Applicant Information Webinar for RFA-RM September 6, :00 –
Data Mining – Intro. Course Overview Spatial Databases Temporal and Spatio-Temporal Databases Multimedia Databases Data Mining.
Oracle Database 11g Semantics Overview Xavier Lopez, Ph.D., Dir. Of Product Mgt., Spatial & Semantic Technologies Souripriya Das, Ph.D., Consultant Member.
An Ontology-based Framework for Radiation Oncology Patient Management DL McShan 1, ML Kessler 1 and BA Fraass 2 1 University of Michigan Medical Center,
C. Lawrence Zitnick Microsoft Research, Redmond Devi Parikh Virginia Tech Bringing Semantics Into Focus Using Visual.
Scientific Inquiry How to Use The Scientific Method.
Bioinformatics MEDC601 Lecture by Brad Windle Ph# Office: Massey Cancer Center, Goodwin Labs Room 319 Web site for lecture:
Using Domain Ontologies to Improve Information Retrieval in Scientific Publications Engineering Informatics Lab at Stanford.
BBN Technologies Copyright 2009 Slide 1 The S*QL Plugin for Cytoscape Visual Analytics on the Web of Linked Data Rusty (Robert J.) Bobrow Jeff Berliner,
12/7/2015Page 1 Service-enabling Biomedical Research Enterprise Chapter 5 B. Ramamurthy.
Click on a lesson name to select. The Study of Life Section 1: Introduction to Biology Section 2: The Nature of Science Section 3: Methods of Science.
CBioPortal Web resource for exploring, visualizing, and analyzing multidimentional cancer genomics data.
Clinical research data interoperbility Shared names meeting, Boston, Bosse Andersson (AstraZeneca R&D Lund) Kerstin Forsberg (AstraZeneca R&D.
Perspective on the current state-of-knowledge of mode of action as it relates to the dose response assessment of cancer and noncancer toxicity Jennifer.
Instance Discovery and Schema Matching With Applications to Biological Deep Web Data Integration Tantan Liu, Fan Wang, Gagan Agrawal {liut, wangfa,
High throughput biology data management and data intensive computing drivers George Michaels.
Improving Research Data Sharing and Reuse: Scientists and Repositories Michael Conlon, PhD Emeritus Faculty Member, University of Florida VIVO Project.
CCLE Cancer Cell Line Encyclopedia Alexey Erohskin.
Self-Service Data Integration with Power Query Stéphane Fréchette.
1 Database Systems, 8 th Edition Star Schema Data modeling technique –Maps multidimensional decision support data into relational database Creates.
Data Management: Data Processing Types of Data Processing at USGS There are several ways to classify Data Processing activities at USGS, and here are some.
Shows tendency for mergers. These big companies may be shrinking – much research is now outsourced to low cost countries like Latvia, India, China and.
BIOBASE Training TRANSFAC ® Containing data on eukaryotic transcription factors, their experimentally-proven binding sites, and regulated genes ExPlain™
Ingenuity Pathway Analysis Alex Pico. Description "IPA is a software application that enables researchers to analyze and understand the complex biological.
Semantic Graph Mining for Biomedical Network Analysis: A Case Study in Traditional Chinese Medicine Tong Yu HCLS
Networks and Interactions
Stanford University, Stanford, CA, USA
KnowEnG: A SCALABLE KNOWLEDGE ENGINE FOR LARGE SCALE GENOMIC DATA
Data challenges in the pharmaceutical industry
Using Spotfire for Proteomic Analysis
CSc4730/6730 Scientific Visualization
An ecosystem of contributions
Benjamin Wooden, Nicolas Goossens, Yujin Hoshida, Scott L. Friedman 
BIOBASE Training TRANSFAC® ExPlain™
Presentation transcript:

Erich Gombocz VP, Chief Scientific Officer IO Informatics, Inc. NCBO Seminar Series – Wednesday, August 5, 2009 – 10:00 AM PDT

What is an “Applied Semantic Knowledgebase” (ASK™) ? What is an “Applied Semantic Knowledgebase” (ASK™) ? How is it done? How is it done? Building & exploring knowledge Building & exploring knowledge Creating & applying SPARQL queries Creating & applying SPARQL queries Refining & qualifying models Refining & qualifying models Scenarios & practical implications Scenarios & practical implications Live DEMO Live DEMO Use case: Combinatorial biomarker for toxicity Use case: Combinatorial biomarker for toxicity Summary & discussion Summary & discussion Impact of ASK Impact of ASK How far are we? How far are we?

Graph queries represent active knowledge contained in ASK. Framework to integrate, unify and combine data for knowledge extraction. Looking at data by how it relates to other data. Allows information to adapt and evolve. ASK SENTIENTSPARQL SEMANTICS

Apply ASK for predictive biology Test hypotheses, qualify & validate model Capture combinatorial marker patterns in ‘SPARQL Arrays’ Create semantic networks to visualize and explore biomarkers Unify & analyze data

Merge data dynamically into an extensible & reconfigurable ontology Apply thesauri for classes, entities and relationships

Example Case: Toxicity compendium and Alcohol study Objective: Better understanding of systems biology of toxicity Gene expression [GEP] (Affymetrix MA) Metabolic profiling [BCP](Bruker Daltonics LC/MS) Quantitative tissue analysis [QTA] (BioImagene) Metadata from internal LIMS Set of known toxicants (hepatotoxicants) Multiple tissues, different experimental animal models (rats) Enrich experiments with public knowledge (NCBI, HMDB, KEGG, IntAct, BioGRID, PubMed …) for causal reasoning on the biology (pathways, biological functions) Goal: Combinatorial biomarkers for toxicity prediction

Identify system perturbance Identify system perturbance Commonly affected genes and metabolites Commonly affected genes and metabolites Tissue-specific vs. non-tissue-specific effects Tissue-specific vs. non-tissue-specific effects Compliment & qualify experimental results Compliment & qualify experimental results Establish a sub-network of interest Establish a sub-network of interest

Automatic SPARQL directly from graph Automatic SPARQL directly from graph Choose the sub graph you are interested to explore Choose the sub graph you are interested to explore Set confidence ranges for numeric values Set confidence ranges for numeric values Run query on training set for iterative refinements Run query on training set for iterative refinements Focus on science, not on manual query editing Focus on science, not on manual query editing Generate SPARQL queries “without SPARQL” Generate SPARQL queries “without SPARQL”

SPARQL generation occurs behind the scene SPARQL generation occurs behind the scene Scientists don’t need to know anything about SPARQL Scientists don’t need to know anything about SPARQL BUT if you need, you can edit, cut, copy, paste … BUT if you need, you can edit, cut, copy, paste … Allows those who are familiar with SPARQL to review and modify directly for testing Allows those who are familiar with SPARQL to review and modify directly for testing ASK contains collections of such queries ASK contains collections of such queries

Immediate value from ASK SPARQL queries Immediate value from ASK SPARQL queries Example: Example: find compounds likely to exhibit a specific type of toxicity find compounds likely to exhibit a specific type of toxicity Step 1: Load Toxicity ASK Step 1: Load Toxicity ASK Step 2: Run query Step 2: Run query Step 3: Review results Step 3: Review results Result: Result: 3 compounds returned 3 compounds returned Model Tested Model Tested

Apply ASK SPARQL array queries via web Apply ASK SPARQL array queries via web Example: Example: 3 different toxicity types 3 different toxicity types Search for matches Search for matches Result: Result: All 3 profiles found matches All 3 profiles found matches 3 compounds for Benzene Toxicity (1) are listed for further analysis 3 compounds for Benzene Toxicity (1) are listed for further analysis Easy and rapid screening Easy and rapid screening

Sets of models can be easily derived from graph and aggregated in ASK for decision support Sets of models can be easily derived from graph and aggregated in ASK for decision support ASK patterns are directly applied to screening ASK patterns are directly applied to screening Applications include Applications include Drug target profiles Drug target profiles Compound efficacy screening Compound efficacy screening Toxicity profiling and detection Toxicity profiling and detection Disease signatures Disease signatures Patient selection for clinical trials Patient selection for clinical trials Patient stratification Patient stratification ASK is effective in organizing and applying the knowledge contained in your data ASK is effective in organizing and applying the knowledge contained in your data ASK gives you a decisive competitive advantage in biomarker-based predictive biology ASK gives you a decisive competitive advantage in biomarker-based predictive biology

Semantic data integration of experimental and public data provides the framework for meaningful biological model generation Semantic data integration of experimental and public data provides the framework for meaningful biological model generation Sub-network visualization (intersections, exclusions) is directly transformed into SPARQL queries Sub-network visualization (intersections, exclusions) is directly transformed into SPARQL queries SPARQL queries describe network-derived models SPARQL queries describe network-derived models Arrays of SPARQL queries contained in ASK allow to screen for complex biological functions Arrays of SPARQL queries contained in ASK allow to screen for complex biological functions

Target Validation Compound Efficacy Toxicity Profiling Compound Safety Predictive Screening Disease Signatures Patient Stratification

Semantic data integration puts multi-modal experimental and public data in context Semantic data integration puts multi-modal experimental and public data in context Network exploration facilitates capturing marker classifiers Network exploration facilitates capturing marker classifiers Query pattern are directly derived from graph Query pattern are directly derived from graph Confidence in model can be iteratively refined Confidence in model can be iteratively refined ASK patterns are directly applied to screening of unknown datasets ASK patterns are directly applied to screening of unknown datasets

NIST Advanced Technology Program (ATP) Award # 70NANB2H3009 Frank Barros CLDA / Cogenics Tom Colatsky, Pat Hurban, Imran Shah, Hongkang Mei, Alan Higgins, Maureen McBride-Simon IOI Semantic Working Group Jonas Almeida, Alan Higgins, Pat Hurban, Bruce McManus, Ted Slater, Mark Wilkinson Semantic standards development CSHALS, W3C HCLSIG, Oracle 11G Test Group

Erich Gombocz IO Informatics, Inc Ninth Street, Suite 114 Berkeley, CA U.S.A. Phone (+1) Fax(+1)