CISC 841 Bioinformatics (Spring 2006) Inference of Biological Networks



Gene Networks
Definition: A gene network is a set of molecular components, such as genes and proteins, and the interactions between them that collectively carry out some cellular function. A genetic regulatory network is the network of controls that turn gene transcription on and off.
Motivation: Given the known structure of such a network, it is sometimes possible to describe the behavior of cellular processes and to reveal their function and the role of specific genes and proteins.
Experiments:
DNA microarrays: observe the expression of many genes simultaneously, monitoring gene expression at the level of mRNA abundance.
Protein chips: rapid identification of proteins and their abundance is becoming possible through methods such as 2D polyacrylamide gel electrophoresis.
2-hybrid systems: identify protein-protein interactions (Stan Fields’ lab, http://depts.washington.edu/sfields/).

Regulation [figure; labels: Genes (DNA), Message (RNA), Proteins, Function/Environment, Other Cells]

Genetic Network Models
Linear Model: the expression level of a node in the network depends on a linear combination of the expression levels of its neighbors.
Boolean Model: the most promising technique to date is based on the view of gene systems as a logical network of nodes that influence each other's expression levels. It assumes only two distinct levels of expression: 0 and 1. According to this model, the value of a node at the next step is a Boolean function of the values of its neighbors.
Bayesian Model: attempts to give a more accurate model of network behavior, based on Bayesian probabilities for expression levels.
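
To make the Boolean model concrete, here is a minimal sketch of a synchronous update, in which every node's next value is a Boolean function of the current state. The three-node network and its rules (`RULES`) are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def step(state, rules):
    """Synchronously update every node from the current state (values are 0 or 1)."""
    return tuple(int(rule(state)) for rule in rules)


def trajectory(state, rules, n_steps):
    """Return the list of states visited, starting from `state`."""
    states = [state]
    for _ in range(n_steps):
        state = step(state, rules)
        states.append(state)
    return states


# Hypothetical rules for nodes x0, x1, x2 (illustration only).
RULES = (
    lambda s: s[0],               # x0 keeps its current value
    lambda s: s[0] and not s[2],  # x1 is on when x0 is on and x2 is off
    lambda s: s[0] or s[1],       # x2 is on when x0 or x1 is on
)

path = trajectory((1, 0, 0), RULES, 3)
for t, s in enumerate(path):
    print(t, s)
```

Starting from (1, 0, 0), this particular network settles into the fixed state (1, 0, 1) after two updates.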

Regulatory networks; protein-protein interactions; metabolic networks

Boolean Networks: An example [figure; legend: 1: induced, 0: suppressed, -: forced low, +: forced high; panels: Interpreting data, Reverse Engineering]

Predictor
A population of cells containing a target genetic network T is monitored in the steady state over a series of M experimental perturbations. In each perturbation pm (0 ≤ m < M), any number of nodes may be forced to a low or high level.
[Figure; labels: wild-type state, -: forced low, +: forced high]

Step 1. For each gene xn, find all pairs of rows (i, j) in the expression matrix E in which the expression level of xn differs, excluding rows in which xn was forced to a high or low value. For x3, we find: (p0, p1), (p0, p3), (p1, p2), (p2, p3)
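
Step 1 can be sketched as follows. The expression matrix from the slide figure is not part of this transcript, so `E` and `FORCED` below are hypothetical stand-ins, chosen only so that they reproduce the pairs listed for x3; the function name `differing_pairs` is likewise an assumption.

```python
from itertools import combinations

# Hypothetical expression matrix: rows are perturbations p0..p3,
# columns are genes x0..x3. Not the matrix from the original slide.
E = [
    [1, 1, 1, 0],  # p0 (wild type)
    [0, 1, 0, 1],  # p1
    [1, 0, 0, 0],  # p2
    [1, 1, 0, 1],  # p3
]
# Hypothetical: indices of the genes forced low or high in each perturbation.
FORCED = [set(), {0}, {1}, {2}]


def differing_pairs(E, forced, n):
    """Pairs of rows (i, j) where gene n differs, skipping rows where n was forced."""
    rows = [i for i in range(len(E)) if n not in forced[i]]
    return [(i, j) for i, j in combinations(rows, 2) if E[i][n] != E[j][n]]


pairs_x3 = differing_pairs(E, FORCED, 3)
print(pairs_x3)
```

With this stand-in matrix the result for x3 is the index form of (p0, p1), (p0, p3), (p1, p2), (p2, p3).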

Step 2. For each pair (i, j), Sij contains all other genes whose expression levels also differ between experiments i and j. Find the minimum cover set Smin, which contains at least one node from each set Sij.
The pairs from Step 1 are (p0, p1), (p0, p3), (p1, p2), (p2, p3), which give:
(p0, p1) -> S01 = {x0, x2}
(p0, p3) -> S03 = {x2}
(p1, p2) -> S12 = {x0, x1}
(p2, p3) -> S23 = {x1}
So Smin = {x1, x2}.
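
The minimum cover set can be found by trying candidate sets in order of increasing size. This is a brute-force sketch that suits a four-gene example; it is exponential in the number of genes and is not necessarily the search strategy the original method uses. The function name `minimum_cover` is an assumption.

```python
from itertools import combinations


def minimum_cover(sets):
    """Smallest set of nodes that shares at least one node with every set in `sets`."""
    universe = sorted(set().union(*sets))
    for size in range(1, len(universe) + 1):
        for candidate in combinations(universe, size):
            chosen = set(candidate)
            if all(chosen & s for s in sets):
                return chosen
    return set()


# The sets S01, S03, S12, S23 from the example for x3.
S = [{"x0", "x2"}, {"x2"}, {"x0", "x1"}, {"x1"}]
s_min = minimum_cover(S)
print(sorted(s_min))
```

Because S03 and S23 are single-element sets, any cover must contain both x2 and x1, and those two nodes already cover S01 and S12.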

Step 3. Use the nodes in Smin as inputs and xn as the output, and build a truth table to find fn (in this example, n = 3). Here Smin = {x1, x2}:
x1   1 0 1 0
x2   1 1 0 0
x3   0 * 1 0
So f3 = 0 * 1 0, where * marks an entry that cannot be determined from the data.
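
A sketch of Step 3, assuming the observations are supplied as (inputs, output) pairs. The three observations below are the determined columns of the truth table above; the input combination that never occurs in the data is reported as `*`. The function name `truth_table` and the error handling are assumptions made for illustration.

```python
from itertools import product


def truth_table(observations, n_inputs):
    """Map every input combination to the observed output, or '*' if never observed."""
    seen = {}
    for inputs, output in observations:
        if seen.setdefault(inputs, output) != output:
            raise ValueError(f"conflicting outputs for inputs {inputs}")
    return {combo: seen.get(combo, "*") for combo in product((1, 0), repeat=n_inputs)}


# (x1, x2) -> x3, taken from the determined columns of the example.
observations = [((1, 1), 0), ((1, 0), 1), ((0, 0), 0)]
f3 = truth_table(observations, n_inputs=2)
for (x1, x2), x3 in f3.items():
    print(x1, x2, x3)
```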

Use phylogenetic profiles to infer “links” between pairs of proteins with similar profiles. Nature 405 (2000) 823-826

Nature 405 (2000) 823-826

Science 306 (2004) 2246-2249

A complete analysis of the logic relations possible between triplets of phylogenetic profiles.

Science 306 (2004) 2246-2249

Logic Analysis of Phylogenetic Profiles (LAPP)
Uncertainty coefficient: U(x|y) = [H(x) + H(y) - H(x, y)] / H(x), where H denotes entropy.
U is in the range [0, 1].
U = 0 if x is completely independent of y.
U = 1 if x is a deterministic function of y.
LAPP requires a triplet of profiles a, b and c such that U(c|a) < 0.3 and U(c|b) < 0.3, but U(c|f(a, b)) > 0.6, where f is one of the eight possible logic relationships.
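
The uncertainty coefficient and the triplet criterion can be sketched as below, using Shannon entropy in bits estimated from the profiles themselves. The function names, the convention U = 0 when H(x) = 0, and the toy profiles are assumptions made for illustration. XOR is used as the example logic function only because it gives exact values on this toy data.

```python
from collections import Counter
from math import log2


def entropy(*profiles):
    """Joint Shannon entropy (bits) of one or more equal-length profiles."""
    n = len(profiles[0])
    counts = Counter(zip(*profiles))
    return -sum((k / n) * log2(k / n) for k in counts.values())


def uncertainty(x, y):
    """U(x|y) = [H(x) + H(y) - H(x, y)] / H(x); taken as 0 when H(x) = 0 (assumption)."""
    hx = entropy(x)
    if hx == 0:
        return 0.0
    return (hx + entropy(y) - entropy(x, y)) / hx


def is_logic_triplet(a, b, c, f, low=0.3, high=0.6):
    """Check the criterion: c depends weakly on a and on b alone, strongly on f(a, b)."""
    fab = [f(ai, bi) for ai, bi in zip(a, b)]
    return (uncertainty(c, a) < low
            and uncertainty(c, b) < low
            and uncertainty(c, fab) > high)


# Toy profiles (hypothetical): c is present exactly when one of a, b is present.
a = [0, 0, 1, 1, 0, 0, 1, 1]
b = [0, 1, 0, 1, 0, 1, 0, 1]
c = [0, 1, 1, 0, 0, 1, 1, 0]

print(uncertainty(c, a), uncertainty(c, b))
print(is_logic_triplet(a, b, c, lambda p, q: p ^ q))
```

Here c tells us nothing about a or b individually, yet it is fully determined by the pair, which is the kind of relationship that pairwise profile comparison would miss.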

The 4873 distinct protein families in COGs generate 62 billion possible protein triplets.
Result: 750,000 previously unknown relationships.

YAL001C
E-value   Phylogenetic profile
0.122     1
1.064     0
3.589     0
0.008     1
0.692     1
8.49      0
14.79     0
0.584     1
1.567     0
0.324     1
0.002     1
3.456     0
2.135     0
0.142     1
0.001     1
0.112     1
1.274     0
0.234     1
4.562     0
3.934     0
0.489     1
2.421     0
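
The binary profile can be derived from the E-values by thresholding. The slide does not state the cutoff; a cutoff of 1.0 is an assumption that happens to be consistent with every value listed above for YAL001C. The function name `phylogenetic_profile` is also an assumption.

```python
# E-values for YAL001C as listed above, one per genome.
E_VALUES = [
    0.122, 1.064, 3.589, 0.008, 0.692, 8.49, 14.79, 0.584, 1.567, 0.324, 0.002,
    3.456, 2.135, 0.142, 0.001, 0.112, 1.274, 0.234, 4.562, 3.934, 0.489, 2.421,
]


def phylogenetic_profile(e_values, cutoff=1.0):
    """1 where the best hit's E-value is below the (assumed) cutoff, else 0."""
    return [1 if e < cutoff else 0 for e in e_values]


profile = phylogenetic_profile(E_VALUES)
print("".join(str(bit) for bit in profile))
```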