Module Networks BMI/CS 576 Mark Craven December 2007.



Module Networks Motivation
  sets of variables often have the same behavior
  consider this simple stock example
  we can group variables into modules, and have the members of a module share the same conditional probability distribution (CPD)
  Figure from Segal et al., UAI, 2003.

Module Networks
  a module network is defined by
    – a specified number of modules
    – an assignment of each variable to a module
    – a shared CPD for the variables in each module
  the learning task thus entails *
    – determining the assignment of variables to modules
    – inducing a CPD for each module
  * assuming we’re given the number of modules
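The three parts of the definition above can be sketched as a small data structure. This is only an illustration: the class name `ModuleNetwork`, its fields, the stock tickers, and the placeholder CPD strings are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleNetwork:
    """Hypothetical container mirroring the three-part definition."""
    num_modules: int                                  # specified number of modules
    assignment: dict = field(default_factory=dict)    # variable name -> module index
    cpds: dict = field(default_factory=dict)          # module index -> shared CPD

    def members(self, module):
        """Variables currently assigned to the given module."""
        return sorted(v for v, m in self.assignment.items() if m == module)

    def cpd_for(self, variable):
        """Every variable uses the CPD of the module it belongs to."""
        return self.cpds[self.assignment[variable]]


# Illustrative values only (assumed, not taken from the slides).
net = ModuleNetwork(num_modules=2)
net.assignment = {"MSFT": 0, "INTL": 0, "DELL": 1}
net.cpds = {0: "cpd_for_module_0", 1: "cpd_for_module_1"}
```

Because the CPD is stored per module rather than per variable, two variables in the same module necessarily share one object, which is the parameter sharing the definition describes.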

Module Networks: Identifying Regulatory Modules and their Condition-Specific Regulators from Gene Expression Data. E. Segal et al., Nature Genetics 34(2), 2003
  given:
    – gene expression data set
  their method identifies:
    – sets of genes that are co-expressed (assignment to modules)
    – a “program” that explains the expression profile for each set of genes (CPD for each module)

A Regulation Program
  suppose we have a set of (8) genes that all have in their upstream regions the same activator/repressor binding sites

The Respiration and Carbon Module

Regulation Programs as CPDs
  each of these regulation programs is actually a CPD represented using a tree
    – internal nodes are tests on continuous variables
    – leaves contain conditional distributions for the genes in the module, represented by Gaussians
  [Figure labels: an internal node test "is HAP4 > 0.1"; a leaf distribution P(expression | leaf)]
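A minimal sketch of such a tree CPD follows, assuming a binary tree whose internal nodes test "regulator > threshold" and whose leaves hold one Gaussian each. The class names `Split` and `Leaf`, the helper functions, and the numeric leaf parameters are hypothetical; only the HAP4 > 0.1 test comes from the slide.

```python
import math
from dataclasses import dataclass


@dataclass
class Leaf:
    mean: float   # Gaussian mean for module genes in this context
    std: float    # Gaussian standard deviation


@dataclass
class Split:
    regulator: str      # continuous variable tested at this node
    threshold: float
    if_true: object     # subtree used when regulator > threshold
    if_false: object    # subtree used otherwise


def find_leaf(node, regulators):
    """Walk from the root to the leaf selected by the regulator levels."""
    while isinstance(node, Split):
        above = regulators[node.regulator] > node.threshold
        node = node.if_true if above else node.if_false
    return node


def log_density(tree, regulators, expression):
    """log P(expression | leaf) under the Gaussian stored at the selected leaf."""
    leaf = find_leaf(tree, regulators)
    z = (expression - leaf.mean) / leaf.std
    return -0.5 * z * z - math.log(leaf.std * math.sqrt(2.0 * math.pi))


# Assumed leaf parameters, for illustration only.
tree = Split("HAP4", 0.1, if_true=Leaf(1.5, 0.5), if_false=Leaf(-0.5, 0.5))
```

Calling `log_density(tree, {"HAP4": 0.3}, 1.4)` would evaluate the expression value against the leaf reached when the HAP4 test succeeds.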

Tree CPDs in Gene Module Networks
  the parents of each module can be other modules
  in this study, Segal et al. limit parents to a set of candidate regulator genes (genes known to be transcription factors and signaling components)

Module Network Learning Procedure
  given: expression profile for each gene, set of candidate regulator genes
  initialize module assignments by clustering expression profiles
  repeat until convergence
    structure search step:
      for each module
        learn a CPD tree using splits on candidate regulators
    module assignment step:
      repeat until convergence
        for each gene
          find the module that best explains it
          move the gene to this module
          update Gaussians at leaves

Structure Search Step
  the method for the structure search step is very similar to the general decision-tree procedure we considered before
    – splits are on genes in the candidate regulator set
    – leaves represent distributions over continuous values
  the name for this step is somewhat misleading
    – it does involve learning structure: selecting parents for variables in the module
    – it also involves learning the parameters of the Gaussians at the leaves
    – the module assignment step heavily influences the structure
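One step of this decision-tree-style search can be sketched as choosing the single best split for a module. The sketch below scores a split by the maximum-likelihood Gaussian log-likelihood of the two resulting leaves, which is a simplification chosen for illustration and not necessarily the score used by Segal et al. The function names, the variance floor, and the data layout are assumptions.

```python
import math


def gaussian_loglik(values):
    """Log-likelihood of values under a Gaussian fit to those same values."""
    n = len(values)
    mean = sum(values) / n
    squared_error = sum((v - mean) ** 2 for v in values)
    var = max(squared_error / n, 1e-6)   # variance floor (assumption) avoids log(0)
    return -0.5 * n * math.log(2.0 * math.pi * var) - squared_error / (2.0 * var)


def best_split(module_expr, regulator_expr):
    """Pick the (regulator, threshold) test that best separates conditions.

    module_expr: one list per condition, holding the module genes' expression.
    regulator_expr: candidate regulator name -> per-condition expression levels.
    Returns (score, regulator, threshold), or None if no split is possible.
    """
    best = None
    for regulator, levels in regulator_expr.items():
        # Dropping the largest level keeps both sides of every split non-empty.
        for threshold in sorted(set(levels))[:-1]:
            above = [x for row, r in zip(module_expr, levels) if r > threshold for x in row]
            below = [x for row, r in zip(module_expr, levels) if r <= threshold for x in row]
            score = gaussian_loglik(above) + gaussian_loglik(below)
            if best is None or score > best[0]:
                best = (score, regulator, threshold)
    return best
```

Applying `best_split` recursively to the conditions on each side would grow a full tree; each leaf's mean and variance are then the Gaussian parameters mentioned above, which is why this step learns parameters as well as structure.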

Module Assignment Step
  Can we independently assign each variable to its best module?
    – No: we might get cycles in the graph.
    – the score for a module depends on all of the genes in the module
  therefore we use a sequential update method (moving one gene at a time)
    – can ensure that each change is a legal assignment that improves the score
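The sequential update can be sketched as follows, under two loudly stated simplifications. First, the score is a stand-in (negative within-module squared error per condition), not the score used in the paper. Second, legality is delegated to a hypothetical `is_legal(gene, module)` callback, which is where an acyclicity check on the module graph would go. All names here are illustrative.

```python
def total_score(profiles, assignment, num_modules):
    """Stand-in score: negative squared error around each module's per-condition mean."""
    score = 0.0
    for module in range(num_modules):
        members = [profiles[g] for g, m in assignment.items() if m == module]
        if not members:
            continue
        for column in zip(*members):          # one column per condition
            mean = sum(column) / len(column)
            score -= sum((x - mean) ** 2 for x in column)
    return score


def sequential_reassign(profiles, assignment, num_modules,
                        is_legal=lambda gene, module: True):
    """Move one gene at a time, accepting only legal moves that raise the score."""
    assignment = dict(assignment)             # do not mutate the caller's dict
    improved = True
    while improved:
        improved = False
        for gene in sorted(profiles):
            best_module = assignment[gene]
            best_score = total_score(profiles, assignment, num_modules)
            for module in range(num_modules):
                if module == assignment[gene] or not is_legal(gene, module):
                    continue
                trial = {**assignment, gene: module}
                # Rescoring after the move reflects that a module's score
                # depends on all of its member genes.
                trial_score = total_score(profiles, trial, num_modules)
                if trial_score > best_score + 1e-12:
                    best_module, best_score = module, trial_score
            if best_module != assignment[gene]:
                assignment[gene] = best_module
                improved = True
    return assignment
```

Because every accepted move strictly increases the score and there are finitely many assignments, the loop terminates.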

Empirical Evaluation
  many modules are enriched for
    – binding sites for associated regulators
    – common gene annotations

Global View of Modules
  modules for common processes often share common
    – regulators
    – binding site motifs

Comments on Module Networks
  module networks exploit the fact that many variables (genes) are determined by the same set of variables
  this application exploits the fact that we may have background knowledge about the variables that can be parents of others (the candidate regulators)
  the learning procedure is like EM, but hard decisions are made (each gene is completely assigned to a module)
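The contrast between EM-style soft assignment and the hard decisions made here can be illustrated with a small sketch. It assumes we already have, for one gene, a log-likelihood under each module's CPD; both function names are hypothetical.

```python
import math


def soft_assignment(logliks):
    """EM-style responsibilities: a normalized weight for every module."""
    peak = max(logliks)                              # subtract the max for numerical stability
    weights = [math.exp(v - peak) for v in logliks]
    total = sum(weights)
    return [w / total for w in weights]


def hard_assignment(logliks):
    """Hard decision: the gene is completely assigned to its best module."""
    best = max(range(len(logliks)), key=logliks.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(logliks))]
```

With soft assignment a gene contributes fractionally to several modules; with hard assignment it contributes to exactly one.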