CSE891-001 2012 Fall. Summary Goal: infer models of transcriptional regulation with annotated molecular interaction graphs The attributes in the model.

Slides:



Advertisements
Similar presentations
Tests of Hypotheses Based on a Single Sample
Advertisements

Computational discovery of gene modules and regulatory networks Ziv Bar-Joseph et al (2003) Presented By: Dan Baluta.
Copyright © 2014 by McGraw-Hill Higher Education. All rights reserved.
Putting genetic interactions in context through a global modular decomposition Jamal.
D ISCOVERING REGULATORY AND SIGNALLING CIRCUITS IN MOLECULAR INTERACTION NETWORK Ideker Bioinformatics 2002 Presented by: Omrit Zemach April Seminar.
Correlation and regression
. Inferring Subnetworks from Perturbed Expression Profiles D. Pe’er A. Regev G. Elidan N. Friedman.
A Probabilistic Dynamical Model for Quantitative Inference of the Regulatory Mechanism of Transcription Guido Sanguinetti, Magnus Rattray and Neil D. Lawrence.
Work Process Using Enrich Load biological data Check enrichment of crossed data sets Extract statistically significant results Multiple hypothesis correction.
© 2010 Pearson Prentice Hall. All rights reserved Least Squares Regression Models.
A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae Article by Peter Uetz, et.al. Presented by Kerstin Obando.
Genome-wide prediction and characterization of interactions between transcription factors in S. cerevisiae Speaker: Chunhui Cai.
Chapter Seventeen HYPOTHESIS TESTING
ANOVA: ANalysis Of VAriance. In the general linear model x = μ + σ 2 (Age) + σ 2 (Genotype) + σ 2 (Measurement) + σ 2 (Condition) + σ 2 (ε) Each of the.
10 Hypothesis Testing. 10 Hypothesis Testing Statistical hypothesis testing The expression level of a gene in a given condition is measured several.
Functional genomics and inferring regulatory pathways with gene expression data.
Integrated analysis of regulatory and metabolic networks reveals novel regulatory mechanisms in Saccharomyces cerevisiae Speaker: Zhu YANG 6 th step, 2006.
Gene Regulatory Networks - the Boolean Approach Andrey Zhdanov Based on the papers by Tatsuya Akutsu et al and others.
Chapter 11 Multiple Regression.
Introduction to molecular networks Sushmita Roy BMI/CS 576 Nov 6 th, 2014.
Bryan Heck Tong Ihn Lee et al Transcriptional Regulatory Networks in Saccharomyces cerevisiae.
Epistasis Analysis Using Microarrays Chris Workman.
Statistics for Managers Using Microsoft® Excel 5th Edition
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 8 Tests of Hypotheses Based on a Single Sample.
Systematic Analysis of Interactome: A New Trend in Bioinformatics KOCSEA Technical Symposium 2010 Young-Rae Cho, Ph.D. Assistant Professor Department of.
Inferring Cellular Networks Using Probabilistic Graphical Models Jianlin Cheng, PhD University of Missouri 2009.
Large-scale organization of metabolic networks Jeong et al. CS 466 Saurabh Sinha.
Aaker, Kumar, Day Ninth Edition Instructor’s Presentation Slides
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 9 Hypothesis Testing.
Probability Distributions and Test of Hypothesis Ka-Lok Ng Dept. of Bioinformatics Asia University.
Overview of Statistical Hypothesis Testing: The z-Test
© 2003 Prentice-Hall, Inc.Chap 9-1 Fundamentals of Hypothesis Testing: One-Sample Tests IE 340/440 PROCESS IMPROVEMENT THROUGH PLANNED EXPERIMENTATION.
Copyright © 2013, 2010 and 2007 Pearson Education, Inc. Chapter Inference on the Least-Squares Regression Model and Multiple Regression 14.
MATISSE - Modular Analysis for Topology of Interactions and Similarity SEts Igor Ulitsky and Ron Shamir Identification.
Genetic network inference: from co-expression clustering to reverse engineering Patrik D’haeseleer,Shoudan Liang and Roland Somogyi.
Gene Regulatory Network Inference. Progress in Disease Treatment  Personalized medicine is becoming more prevalent for several kinds of cancer treatment.
Chi-Square as a Statistical Test Chi-square test: an inferential statistics technique designed to test for significant relationships between two variables.
Reverse engineering gene regulatory networks Dirk Husmeier Adriano Werhli Marco Grzegorczyk.
Reconstruction of Transcriptional Regulatory Networks
Inferring strengths of protein-protein interactions from experimental data using linear programming Morihiro Hayashida, Nobuhisa Ueda, Tatsuya Akutsu Bioinformatics.
Testing of Hypothesis Fundamentals of Hypothesis.
Unraveling condition specific gene transcriptional regulatory networks in Saccharomyces cerevisiae Speaker: Chunhui Cai.
Bioinformatics Expression profiling and functional genomics Part II: Differential expression Ad 27/11/2006.
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 8 Hypothesis Testing.
TEMPLATE DESIGN © Molecular Re-Classification of Renal Disease Using Approximate Graph Matching, Clustering and Pattern.
Metabolic Network Inference from Multiple Types of Genomic Data Yoshihiro Yamanishi Centre de Bio-informatique, Ecole des Mines de Paris.
IMPROVED RECONSTRUCTION OF IN SILICO GENE REGULATORY NETWORKS BY INTEGRATING KNOCKOUT AND PERTURBATION DATA Yip, K. Y., Alexander, R. P., Yan, K. K., &
While gene expression data is widely available describing mRNA levels in different cancer cells lines, the molecular regulatory mechanisms responsible.
Introduction to biological molecular networks
© 2004 Prentice-Hall, Inc.Chap 9-1 Basic Business Statistics (9 th Edition) Chapter 9 Fundamentals of Hypothesis Testing: One-Sample Tests.
. Finding Motifs in Promoter Regions Libi Hertzberg Or Zuk.
Reverse engineering of regulatory networks Dirk Husmeier & Adriano Werhli.
Statistical Inference Statistical inference is concerned with the use of sample data to make inferences about unknown population parameters. For example,
Biological Networks. Can a biologist fix a radio? Lazebnik, Cancer Cell, 2002.
Nonlinear differential equation model for quantification of transcriptional regulation applied to microarray data of Saccharomyces cerevisiae Vu, T. T.,
Computational methods for inferring cellular networks II Stat 877 Apr 17 th, 2014 Sushmita Roy.
1 of 53Visit UMT online at Prentice Hall 2003 Chapter 9, STAT125Basic Business Statistics STATISTICS FOR MANAGERS University of Management.
Independent Samples ANOVA. Outline of Today’s Discussion 1.Independent Samples ANOVA: A Conceptual Introduction 2.The Equal Variance Assumption 3.Cumulative.
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 7 Inferences Concerning Means.
Comparative Network Analysis BMI/CS 776 Spring 2013 Colin Dewey
Identifying submodules of cellular regulatory networks Guido Sanguinetti Joint work with N.D. Lawrence and M. Rattray.
Methods of Presenting and Interpreting Information Class 9.
Virtual University of Pakistan
EQTLs.
Chapter 9 Hypothesis Testing.
Ingenuity Knowledge Base
1 Department of Engineering, 2 Department of Mathematics,
1 Department of Engineering, 2 Department of Mathematics,
1 Department of Engineering, 2 Department of Mathematics,
FUNCTIONAL ANNOTATION OF REGULATORY PATHWAYS
Presentation transcript:

CSE Fall

Summary Goal: infer models of transcriptional regulation with annotated molecular interaction graphs The attributes in the model correspond to verifiable properties of the underlying biological system, such as the existence of PPI and protein-DNA interactions, the directionality of signal transduction in PPIs, as well as signs of the immediate effects of these interactions The most likely settings of all the variables, specifying the most likely graph annotations, are found by a recursive application of the max-product algorithm.

Challenges The challenge arises in part from the apparent complexity of regulatory systems – involving diverse physical mechanisms of control – Involving complex networks of molecular interactions mediating such mechanisms To unravel such systems effectively, we need both sufficient experimental data probing relevant aspects of such systems, as well as computational approaches that aid in the large scale reconstruction of the underlying molecular biology.

Simplified Model TF-promoter bindings and cascades of protein modifications or interactions form the basis of transcriptional initiation and signal transduction pathways. External stimuli are propagated via signal transduction pathways to activate TFs, which in turn bind to promoters to enable or repress transcription of the associated genes. Assumption: gene regulatory processes are carried out along pathways of physical interaction networks.

Physical Network Model - Concepts By integrating multiple sources of data, we infer annotated molecular interaction networks The existence of interactions and their annotations are represented by variables. Each annotated interaction network is represented by a particular setting of all the variables. The available data biases the setting of the variables, therefore giving rise to the most likely hypothesis about the annotated interactions networks. The output of the approach is one annotated physical network, including the presence of specific interactions, causal directions and immediate effects of such interactions, and active pathways.

Skeleton Graph - Concept The basic directed network which contains possible molecular interactions is called a skeleton graph – The direction of a protein–DNA edge is determined a priori – In contrast, the direction of a PPI edge is initially undetermined as we do not necessarily know the regulatory processes where the interaction occurs Problems: – If a skeleton graph contains all possible physical interactions, it could in principle be a complete graph – By limiting ourselves to the skeleton graph, we may be affected by false negatives, interactions without any support in the available interaction data

Skeleton Graph - Annotation Three types of data are used to estimate the physical network models: protein-DNA, protein-protein, and single gene knock-out expression data.

Skeleton Graph - Assumptions to cope with the limited data We view cascades of molecular interactions as the primary causal mechanisms underlying gene regulation We assume that the effect of gene deletions propagates along the molecular cascades represented in the model We collapse PPIs together with protein modification; we consider only the case that each PPI possesses a unique direction in the pathways in which it participates. We assume that the regulatory circuitry is not radically changed following a knock-out, at least not in terms of how the effects of the deletion are propagated

Data Association The values of the variables specifying the physical network model can be inferred from the available data To achieve this, we must tie the setting of the variables to specific measurements – Some of the variables, such as presence or absence of protein- DNA interactions, can be directly associated with individual measurements. – Others, such as the signs of the edges denoting either repressive or activating interactions, need to be inferred indirectly based on responses to gene deletions.

For a binary variable x indicating the presence/absence of a protein-DNA edge, the measurements pertaining to this variable are incorporated into the model according to the Error Model(likelihood ratio): The value of x more consistent with the observed data yields a higher value of the potential function. Potential functions as likelihood ratios are combined by multiplication across independent observations. Data Association

The sample skeleton graph, in which all the edges are protein-DNA interactions and so their directions are specified a priori. Build the potential functions for biasing the presence of edges. We obtain the likelihood ratio for the potential function φ. We assume that the likelihood ratio is the same for all edges and has the value 0.9, indicating a slight bias against an edge. The potential function is therefore φ(xi) = (0.9) xi.

Data Association The sample skeleton graph, in which all the edges are protein-DNA interactions and so their directions are specified a priori. Suppose the only data available is from the g1 deletion experiment and only g4 is significantly down-regulated in response to the knock-out, denoted as (g1, g4, −). Two explanations for this effect given by the two alternative paths from g1 to g4. The aggregate sign along the path must be the negative of the knock-out effect -> potential function OR

Data Association

Potential Function φ

Assume that the p-value comes from testing the null hypothesis H0 (the binding does not occur) against the alternative hypothesis H1 (the binding occurs). Under certain regularity conditions, the asymptotic sampling distribution of the log likelihood ratio statistic is χ 2 with degrees of freedom 1, assuming the difference in the degrees of freedom in the two hypotheses is 1. The value of the log-likelihood ratio statistic is: where p is the reported p-value and F (.) is the cumulative χ 2 distribution (CDF) with one degree of freedom.

Potential Function φ

CDF

Potential Function φ Where k is the degree of freedom, Γ(k/2) denotes the Gamma function which has closed-form values for even k, γ(k,z) is the lower incomplete Gamma function and P(k,z) is the regularized Gamma function. Cumulative distribution function

Potential Function φ where n=3 Eventually,

Results A small scale analysis involving the yeast mating response

Results Solid lines correspond to protein-DNA and dash lines represent PPIs

Results By choosing p-value threshold 0.001, we extracted 5,485 protein-DNA pairs from Lee et al., 2002 There are 14,876 PPIs of yeast proteins reported in the DIP database By choosing the p-value threshold to be 0.02, we extracted 23,766 pairwise knock-out effects from Hughes et al., 2000 The results reflect primarily the lack of systematic knock-out data to constrain annotations of interaction networks Genome-wide analysis

Conclusions A new modeling framework for reconstructing annotated molecular networks from multiple sources of data The resulting annotated molecular interactions networks were shown to be highly predictive about knockout effects in a cross-validation setting involving the yeast mating pathway In a genome-wide analysis, a much smaller fraction of knock-out effects could be explained by short paths