DISCOVERING REGULATORY AND SIGNALLING CIRCUITS IN MOLECULAR INTERACTION NETWORKS Ideker Bioinformatics 2002 Presented by: Omrit Zemach April 3 2013 Seminar.


DISCOVERING REGULATORY AND SIGNALLING CIRCUITS IN MOLECULAR INTERACTION NETWORKS
Ideker, Bioinformatics 2002
Presented by: Omrit Zemach, April
Seminar in Algorithmic Challenges in Analyzing Big Data* in Biology and Medicine, TAU

OUTLINE
Introduction: biological terms
Motivation
Methods
  Basic z-score calculation
  Simulated annealing
Results
Discussion

PROTEIN-PROTEIN INTERACTION
All living organisms consist of living cells.
All these cells comprise the same building blocks: RNA, DNA and protein.
Protein sequences are encoded in DNA.
Proteins play major roles in all cellular processes.

DNA REPLICATION; TRANSCRIPTION INTO mRNA; TRANSLATION OF mRNA

PROTEIN-DNA INTERACTIONS
A protein binds a molecule of DNA.
These interactions regulate the biological function of DNA, usually the expression of a gene.
Example: transcription factors, which activate or repress gene expression.

GENE EXPRESSION
A gene is a sequence of DNA.
A gene encodes a protein.
The process by which information from a gene is used in the synthesis of a functional protein is called gene expression.
It is interesting to measure gene expression under multiple conditions (experiments).
Differential expression

DNA chips / microarrays: simultaneous measurement of the expression levels of all genes.

MOTIVATION
Databases of protein-protein and protein-DNA interactions
Widely available mRNA expression data
Generate concrete hypotheses for the underlying mechanisms governing the observed changes in gene expression

MOTIVATION
Exposing the yeast galactose utilization pathway to 20 perturbations.
Constructing a molecular interaction network by screening a database of protein-protein and protein-DNA interactions.
Selecting 362 interactions linking genes that were differentially expressed under one or more perturbations.
Analyzing changes in expression.

Conclusion: pairs of genes linked in this network were more likely to have correlated expression profiles than genes chosen at random. However, the general task of associating gene expression changes with higher-order groups of interactions was not discussed.

DISCOVERING REGULATORY AND SIGNALLING CIRCUITS IN MOLECULAR INTERACTION NETWORKS
Introduces a method for searching the network to find 'active subnetworks'.
Over multiple conditions, determines which conditions significantly affect gene expression in each subnetwork.

METHODS

Z-SCORE CALCULATION
Each gene $i$ is given a value $p_i$: the significance of differential expression of gene $i$.
$z_i = \Phi^{-1}(1 - p_i)$ (z-score for gene $i$)
Aggregate z-score for subnetwork $A$
Calibrating $z$ against the background distribution
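The formulas for the aggregate and calibrated scores did not survive in this transcript. The following is a minimal Python sketch of the z-score steps, assuming the definitions used in the Ideker et al. paper: the aggregate score of a subnetwork $A$ with $k$ genes is $z_A = \frac{1}{\sqrt{k}} \sum_{i \in A} z_i$, and the calibrated score is $s_A = (z_A - \mu_k) / \sigma_k$, where $\mu_k$ and $\sigma_k$ are the mean and standard deviation of the aggregate scores of randomly sampled gene sets of size $k$. The function names and the number of random samples are illustrative choices, not taken from the slides.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, pstdev

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def gene_z_score(p_value):
    """z_i = Phi^{-1}(1 - p_i); requires 0 < p_value < 1."""
    return STD_NORMAL.inv_cdf(1.0 - p_value)


def aggregate_z_score(z_scores):
    """Aggregate z-score of a gene set: sum of z_i divided by sqrt(k)."""
    return sum(z_scores) / sqrt(len(z_scores))


def calibrated_score(z_a, k, all_z_scores, n_samples=2000, seed=0):
    """Calibrate z_a against random gene sets of the same size k.

    The background is estimated by Monte Carlo sampling; n_samples and
    seed are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    background = [
        aggregate_z_score(rng.sample(all_z_scores, k))
        for _ in range(n_samples)
    ]
    mu_k = mean(background)
    sigma_k = pstdev(background)
    return (z_a - mu_k) / sigma_k
```

A gene with $p_i = 0.025$ gets a z-score of roughly 1.96, and four genes with z-score 1 each give an aggregate score of $4 / \sqrt{4} = 2$.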

SCORING OVER MULTIPLE CONDITIONS
Extend the scoring system over multiple conditions.
Create a matrix of z-scores: rows are the $m$ conditions, columns are the genes.
Produce $m$ different aggregate scores (one for each condition).
Sort them from highest to lowest.
Compute $r_A^{\max} = \max_j r_{A[j]}$.

Compute $r_{A[j]}$ for each $j = 1, \dots, m$ as follows:
$P_z = 1 - \Phi(z_{A[j]})$ (the probability that any single condition has a z-score above $z_{A[j]}$)
$p_{A[j]} = \sum_{h=j}^{m} \binom{m}{h} P_z^{h} (1 - P_z)^{m-h}$ (the probability that at least $j$ of the $m$ conditions had scores above $z_{A[j]}$)
$r_{A[j]} = \Phi^{-1}(1 - p_{A[j]})$
$r_A^{\max} = \max_j r_{A[j]}$
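The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the presenter's code: the probability that at least $j$ of the $m$ conditions exceed the threshold is taken to be a binomial tail (which assumes the conditions are independent), the per-condition aggregate score is assumed to be the sum of z-scores divided by $\sqrt{k}$, and the name `r_max` and the clamping constants are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def r_max(z_matrix, subnetwork):
    """Multi-condition score of a subnetwork.

    z_matrix: list of m rows (one per condition), each a list of
              per-gene z-scores.
    subnetwork: list of gene (column) indices.
    """
    m = len(z_matrix)
    k = len(subnetwork)
    # One aggregate score per condition, sorted from highest to lowest.
    aggregates = sorted(
        (sum(row[g] for g in subnetwork) / sqrt(k) for row in z_matrix),
        reverse=True,
    )
    best = float("-inf")
    for j, z_aj in enumerate(aggregates, start=1):
        # Probability that a single condition scores above z_aj.
        p_z = 1.0 - STD_NORMAL.cdf(z_aj)
        # Binomial tail: at least j of the m conditions score above z_aj.
        p_aj = sum(
            comb(m, h) * p_z ** h * (1.0 - p_z) ** (m - h)
            for h in range(j, m + 1)
        )
        # Clamp so that inv_cdf stays inside its open domain (0, 1).
        p_aj = min(max(p_aj, 1e-300), 1.0 - 1e-12)
        # Phi^{-1}(1 - p) equals -Phi^{-1}(p); this form is stable for tiny p.
        best = max(best, -STD_NORMAL.inv_cdf(p_aj))
    return best
```

With a single condition ($m = 1$) the binomial tail reduces to $P_z$, so the score is simply the aggregate z-score of that condition.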

[Figure: z-scores of gene 1 under Condition 1, Condition 2, Condition 3 and Condition 4]

Aggregate scores $z_{A1}, \dots, z_{Am}$
Aggregate scores $z_{A1}, \dots, z_{Am}$, sorted
Computing $r_{A[1]}, \dots, r_{A[m]}$
Taking $\max_j r_{A[j]}$
Calibrating against the background distribution

SIMULATED ANNEALING
A strategy for finding a maximum without getting stuck in a local maximum: we must sometimes select new points that do not improve the solution.
Annealing: gradual cooling of a liquid.
Incorporate a temperature parameter into the maximization procedure.
At high temperatures, explore the parameter space.
At lower temperatures, restrict exploration.

SIMULATED ANNEALING STRATEGY
Start with some sample.
Propose a change.
Decide whether to accept the change.

SIMULATED ANNEALING STRATEGY
Decide whether to accept the change: how?
Consider a decreasing series of temperatures.
For each temperature, iterate these steps:
  Propose an update and evaluate the function.
  Accept updates that improve the solution.
  Accept some updates that don't improve the solution.
The acceptance probability depends on the "temperature" parameter.
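The slide does not state the acceptance rule itself. A common choice, assumed here, is the Metropolis criterion: an update that changes the score by `delta` is always accepted when `delta > 0`, and otherwise accepted with probability $e^{\delta / T}$. The function name `accept` is illustrative.

```python
import math
import random


def accept(delta, temperature, rng=random):
    """Metropolis-style acceptance for a maximization problem.

    delta: score after the proposed change minus score before it.
    temperature: current temperature (must be positive).
    """
    if delta > 0:
        return True  # improvements are always accepted
    # A worsening move is accepted with probability exp(delta / T):
    # likely at high temperature, very unlikely at low temperature.
    return rng.random() < math.exp(delta / temperature)
```

For the same worsening move of `delta = -1`, the acceptance probability is about 0.37 at temperature 1 and effectively zero at temperature 0.01, which is what makes the search exploratory early and nearly greedy late.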

SEARCHING FOR HIGH-SCORING SUBNETWORKS VIA SIMULATED ANNEALING
Associate an active/inactive state with each node.
$G_W$ denotes the working subgraph of $G$ induced by the active nodes.

THE ALGORITHM
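The algorithm listing on this slide did not survive in the transcript. The sketch below is one plausible reading of the search described in the surrounding slides, written under explicit assumptions: each node starts active with probability 1/2, one randomly chosen node is toggled per iteration, the score is the uncalibrated aggregate z-score of the highest-scoring connected component of the working subgraph, the temperature decreases geometrically, and acceptance follows the Metropolis rule. All function names are hypothetical.

```python
import math
import random


def components(active, adjacency):
    """Connected components of the subgraph induced by the active nodes."""
    seen, result = set(), []
    for start in active:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            comp.add(node)
            for nb in adjacency[node]:
                if nb in active and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        result.append(comp)
    return result


def best_component(active, adjacency, z):
    """Highest-scoring component; (empty set, -inf) if nothing is active."""
    best, best_score = set(), float("-inf")
    for comp in components(active, adjacency):
        score = sum(z[n] for n in comp) / math.sqrt(len(comp))
        if score > best_score:
            best, best_score = comp, score
    return best, best_score


def anneal(adjacency, z, n_iter=20000, t_start=1.0, t_end=0.01, seed=0):
    """Search for a high-scoring connected subnetwork."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    active = {n for n in nodes if rng.random() < 0.5}
    _, current = best_component(active, adjacency, z)
    cooling = (t_end / t_start) ** (1.0 / max(n_iter - 1, 1))
    temperature = t_start
    for _ in range(n_iter):
        node = rng.choice(nodes)
        active ^= {node}  # toggle the node's active/inactive state
        _, proposed = best_component(active, adjacency, z)
        delta = proposed - current
        if delta > 0 or rng.random() < math.exp(delta / temperature):
            current = proposed  # keep the toggle
        else:
            active ^= {node}  # undo the toggle
        temperature *= cooling
    return best_component(active, adjacency, z)
```

On a five-node path `a-b-c-d-e` with z-scores 3, 3, -3, 0.5 and -1, the two neighbouring high-scoring nodes form the best subnetwork, because adding `c` would lower the aggregate score.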

HEURISTICS FOR IMPROVED ANNEALING
Search for $M$ subnetworks simultaneously.
Increase the efficiency of annealing in networks with many 'hubs'.

High score node

Solution: changing step 3
Define $d_{\min}$ at the beginning of the algorithm.
If deg(node) > $d_{\min}$, remove all neighbors that are not in the top-scoring component.
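A minimal sketch of the hub heuristic as worded on the slide, assuming that "remove" means setting those neighbours to inactive and that the rule is applied to the node that was just toggled. The function name and argument layout are hypothetical.

```python
def prune_hub_neighbours(node, active, adjacency, top_component, d_min):
    """Return the active set after applying the hub rule to `node`.

    If the node's degree exceeds d_min, every active neighbour that is
    not part of the current top-scoring component is deactivated.
    Nodes that are not neighbours of `node` are left untouched.
    """
    neighbours = adjacency[node]
    if len(neighbours) <= d_min:
        return set(active)  # not a hub: nothing changes
    return {
        n for n in active
        if n not in neighbours or n in top_component
    }
```

The intent, as far as the slide indicates, is that activating a hub should not pull all of its low-scoring neighbours into one large component at once.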

RESULTS

Small network with a single perturbation (z-scores)

GAL4

TRANSCRIPTION FACTOR

Simulated annealing was performed with parameters: $N = 100{,}000$, $T_{start} = 1$, $T_{end} = 0.01$, $M = 5$, $d_{\min} = 100$

Distribution of subnetwork scores in actual and randomized data

Large network with several perturbations

DISCUSSION

SUBNETWORKS ARE CONSISTENT WITH KNOWN REGULATORY CIRCUITS

SUBNETWORKS VERSUS GENE EXPRESSION CLUSTERS
Our approach groups genes subject to the constraints of the molecular interaction network.
Subnetworks are scored over only a subset of the conditions.
Our approach groups genes only by the significance of change, while clustering methods group genes by both the magnitude and the direction of change.
Our method leaves some genes unaffiliated with any subnetwork, unlike clustering, which assigns every gene to a distinct cluster.

FUTURE WORK
Investigating the subnetworks we found in the laboratory.
Accommodating new types of interaction networks (proteins and small molecules).
Annotating each interaction with its directionality and compartment.