Automated Explanation of Gene-Gene Relationships
Wacek Kuśnierczyk

slide 2 The Motivation
In various biological studies, researchers often come up with a list of (possibly related) genes.
If the relations between these genes are unknown or hypothetical, they have to be confirmed experimentally, through a database search, or both.
Manual browsing or searching is a very tedious task, and any interpretation of the results requires expert knowledge.

slide 3 The Goal
To automate the search in order to
– assist a biologist in forming explanations of actual and hypothetical relationships between sets of genes
– using various types and sources of data, various similarity assessment tools, and background (domain) knowledge

slide 4 The Field
The most important participating disciplines: Biology, Computer Science, Bioinformatics

slide 5 The Biologist’s Problem
Given a collection of genes, how can we explain the relationships between them, using the available data and knowledge?
– How does gene g1 regulate (activate, inhibit) gene g2?
– What is the functional similarity of gene g3 to gene g4?
– What is the metabolic (signalling) pathway common to genes g5 and g6 in the context of disease d1?

slide 6 The Bioinformatician’s Problem
Given a collection of (biological) objects, which of their properties can we compare and how, and where can we find their values?
– Where do we find the gene sequence (protein structure) data?
– How do we assess the similarity between two gene sequences (protein structures)?
– Where do we find suitable tools, how do we use them, and how do we interpret the results?

slide 7 The Computer Scientist’s Problem
Given a collection of distributed data and tools to link them, how do we build an explanatory path between objects from a query?
A search problem:
– separate, partially overlapping graphs
– coloured nodes
– coloured, weighted, dynamic edges

slide 8 Simplified Search Space
Graph with homogeneous vertices and edges
Task: find (shortest) paths
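
The simplified task on this slide can be illustrated with a breadth-first search, which returns a path with the fewest edges in an unweighted graph. This is only a minimal sketch: the function name `shortest_path`, the adjacency-dictionary representation, and the toy node names are assumptions made for illustration, not part of the presented system.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return a fewest-edges path from start to goal, or None if unreachable.

    graph is a dict mapping each node to a list of neighbouring nodes
    (all vertices and edges are treated as the same kind).
    """
    if start == goal:
        return [start]
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour in parents:
                continue
            parents[neighbour] = node
            if neighbour == goal:
                # Walk the parent links back to the start node.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical toy graph: genes (g*) linked through shared properties (p*).
toy_graph = {
    "g1": ["p1", "g2"],
    "g2": ["g1", "p2"],
    "p1": ["g1", "g3"],
    "p2": ["g2", "g3"],
    "g3": ["p1", "p2"],
}

print(shortest_path(toy_graph, "g1", "g3"))
```

Breadth-first search is sufficient here only because every edge counts the same; once edges differ in kind or weight, as on the next slide, a different notion of path quality is needed.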

slide 9 More Realistic Search Space
Graph with qualitatively different vertices and qualitatively different edges, weighted with qualitatively different weights
Task: find (plausible) paths
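
One possible reading of "plausible paths" is sketched below: each edge carries a type and a confidence, each type has a trust factor, and the plausibility of a path is the product of its edge scores. Maximising that product is equivalent to running Dijkstra's algorithm on costs of `-log(score)`. The edge types, the `TYPE_TRUST` values, and the scoring rule are all assumptions made for illustration; the slides do not specify how plausibility is computed.

```python
import heapq
import math

# Hypothetical trust factor per edge type (assumed values, each in (0, 1]).
TYPE_TRUST = {"regulates": 1.0, "homologous_to": 0.8, "co_cited_with": 0.5}


def most_plausible_path(edges, start, goal):
    """Return (plausibility, path) for the best directed path, or (0.0, None).

    edges is a list of (source, target, edge_type, confidence) tuples with
    confidence in (0, 1]. A path is a list of (source, edge_type, target).
    """
    adjacency = {}
    for source, target, edge_type, confidence in edges:
        score = confidence * TYPE_TRUST[edge_type]
        # Scores lie in (0, 1], so every step cost is non-negative.
        adjacency.setdefault(source, []).append(
            (target, edge_type, -math.log(score))
        )

    best = {start: 0.0}
    heap = [(0.0, start, [])]
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == goal:
            return math.exp(-cost), path
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for target, edge_type, step in adjacency.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(target, math.inf):
                best[target] = new_cost
                heapq.heappush(
                    heap, (new_cost, target, path + [(node, edge_type, target)])
                )
    return 0.0, None


# Hypothetical toy data: a weak direct link and a stronger two-step link.
toy_edges = [
    ("g1", "g2", "co_cited_with", 0.9),
    ("g1", "tf1", "regulates", 0.9),
    ("tf1", "g2", "regulates", 0.8),
]

plausibility, path = most_plausible_path(toy_edges, "g1", "g2")
print(plausibility, path)
```

In this toy case the two-step regulatory route outscores the direct co-citation link, which shows why the shortest path and the most plausible path need not coincide.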

slide 10 Even More Realistic Search Space
Each node is connected to a multitude of other nodes: combinatorial explosion makes an exhaustive search infeasible
Task: find heuristics (generic and specific) to guide the search
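
A heuristic-guided search of the kind this slide calls for might look like the sketch below: a greedy best-first search that expands only the few most promising neighbours of each node and stops after a fixed number of expansions. The function `guided_search`, its parameters, and the toy heuristic table are hypothetical; the slides do not say which heuristics the system uses.

```python
import heapq


def guided_search(neighbours, heuristic, start, goal,
                  branch_limit=3, max_expansions=1000):
    """Greedy best-first search with pruning.

    neighbours(node) returns an iterable of adjacent nodes.
    heuristic(node) returns an estimate of how far node is from the goal
    (lower is more promising). Only the branch_limit best unvisited
    neighbours of each node are kept, so the search can miss existing paths.
    """
    frontier = [(heuristic(start), start, [start])]
    visited = {start}
    expansions = 0
    while frontier and expansions < max_expansions:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        expansions += 1
        candidates = [n for n in neighbours(node) if n not in visited]
        for candidate in sorted(candidates, key=heuristic)[:branch_limit]:
            visited.add(candidate)
            heapq.heappush(
                frontier, (heuristic(candidate), candidate, path + [candidate])
            )
    return None


# Hypothetical toy graph: a hub gene with fifty irrelevant neighbours
# and a single link through a shared pathway.
toy_graph = {
    "g5": ["n%d" % i for i in range(50)] + ["pathway_x"],
    "pathway_x": ["g6"],
}
# Assumed heuristic values; unknown nodes default to a pessimistic estimate.
toy_estimates = {"pathway_x": 1, "g6": 0}


def toy_heuristic(node):
    return toy_estimates.get(node, 10)


print(guided_search(lambda n: toy_graph.get(n, []), toy_heuristic,
                    "g5", "g6", branch_limit=2))
```

The trade-off is deliberate: pruning keeps the frontier small on densely connected nodes, at the price of completeness, so the quality of the result depends entirely on the quality of the heuristic.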

slide 11 A Trivial Example
Input query

slide 12 A Trivial Example
Initial mapping

slide 13 A Trivial Example
Activation spreading

slide 14 A Trivial Example
Plausible inheritance (inference)

slide 15 A Trivial Example
Activation spreading

slide 16 A Trivial Example
Data retrieval and mapping

slide 17 A Trivial Example
Induction

slide 18 A Trivial Example
Activation spreading

slide 19 A Trivial Example
Plausible inheritance

slide 20 A Trivial Example
Data retrieval and mapping
Formulation of an explanation
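
The activation-spreading step that recurs in the example above can be sketched generically as follows. This is a textbook-style illustration, not the presenter's implementation: the function `spread_activation`, the decay and threshold parameters, the rule that a node keeps the strongest activation it receives, and the toy network are all assumptions.

```python
def spread_activation(graph, sources, decay=0.5, threshold=0.1, max_steps=10):
    """Propagate activation from the query nodes through a weighted graph.

    graph maps node -> {neighbour: link weight in [0, 1]}.
    sources maps each query node to its initial activation.
    At each hop the activation is multiplied by the link weight and by
    decay; values below threshold are dropped, and a node keeps the
    strongest activation it has received so far.
    """
    activation = dict(sources)
    frontier = dict(sources)
    for _ in range(max_steps):
        next_frontier = {}
        for node, level in frontier.items():
            for neighbour, weight in graph.get(node, {}).items():
                passed = level * weight * decay
                if passed < threshold:
                    continue  # too weak to propagate further
                if passed <= activation.get(neighbour, 0.0):
                    continue  # neighbour is already at least this active
                activation[neighbour] = passed
                next_frontier[neighbour] = passed
        if not next_frontier:
            break
        frontier = next_frontier
    return activation


# Hypothetical toy network: two query genes sharing an annotation term.
toy_network = {
    "g1": {"go_term": 1.0},
    "g2": {"go_term": 0.8},
    "go_term": {"pathway": 0.6},
}

print(spread_activation(toy_network, {"g1": 1.0, "g2": 1.0}))
```

Nodes that end up activated from several query genes are natural candidates for the later inference and retrieval steps, since they mark where the neighbourhoods of the query genes overlap.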

slide 21 Explanation Schema

slide 22 System Architecture

slide 23 Related Work
Basic research in gastric cancer
Genomic & proteomic data warehouse
Syntactic & semantic database integration
Natural language understanding
Knowledge representation & modelling
Knowledge-intensive reasoning and learning

slide 24 Concerns
Is it reasonable? (What do biologists say?)
Is it possible? (What do bioinformaticians say?)
Is it feasible? (What do computer scientists say?)
Isn’t it too ambitious (for a PhD study)?

slide 25 Disclaimer
An in silico solution is actually a hypothesis that requires physical (experimental) confirmation.

slide 26 Acknowledgments
Agnar Aamodt, IDI.IME (AI, ML, CBR)
Astrid Lægreid, IKM.DMF (biology, bioinformatics)
Arne Sandvik, IKM.DMF (medicine)
Frode Sørmo, IDI.IME (Creek)