Computational discovery of gene modules and regulatory networks Ziv Bar-Joseph et al (2003) Presented By: Dan Baluta.

Presentation transcript:


Agenda: Introduction, Goal of Paper, Methods & Results, Conclusions, Critique, Discussion

Introduction: Interest in figuring out regulatory networks. Genome-wide EXPRESSION data sets. DNA-binding data (LOCATION analysis). PROBLEMS!!!

Theory: Integrating expression and location data ought to assign genes to regulators more accurately than either type of data set on its own. The basic result of combining data is more information.

Primary Goal: Develop an algorithm that combines expression and location data to discover gene modules and regulatory networks.

Methods
GRAM Algorithm: Genetic RegulAtory Modules (as opposed to a 'GRM' algorithm???).
GRAM Validation: comparing results; chromatin-IP (ChIP) experiments; MIPS category enrichment analysis; DNA binding motif analysis.
Targeted data analysis using GRAM: transcriptional regulation of the rapamycin response.

GRAM Algorithm
Inputs: binding data and expression data.
Step 1: Search all possible (pairwise) combinations of transcriptional regulators. Pull out the sets of genes that share bound transcriptional regulators. STRINGENT BINDING CRITERIA USED (p < 0.001).
Step 2: Reduce the gene sets from Step 1 by filtering out of each set all genes that do not have highly (positively) correlated expression levels. The reduced sets act as 'seeds' for gene 'modules'.
Step 3: Revisit the binding data and add genes that share the same bound transcriptional regulators to the gene modules from Step 2, using RELAXED BINDING CRITERIA (p < 0.01).
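The three steps above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: the function names, the data layout (a dict of binding p-values per regulator and a dict of expression profiles per gene), the use of Pearson correlation against the set's mean profile, the 0.8 correlation cutoff, and the correlation check in Step 3 are all assumptions made for the example. Only the two binding thresholds (0.001 and 0.01) come from the slide.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def mean_profile(genes, expression):
    """Average expression profile of a group of genes (hypothetical helper)."""
    return [mean(col) for col in zip(*(expression[g] for g in genes))]


def gram_sketch(binding_p, expression, strict=0.001, relaxed=0.01, min_corr=0.8):
    """binding_p: {regulator: {gene: p_value}}; expression: {gene: [levels]}."""
    modules = {}
    for pair in combinations(sorted(binding_p), 2):

        def bound(gene, cutoff):
            # A gene counts as bound if every regulator in the pair passes the cutoff.
            return all(binding_p[tf].get(gene, 1.0) < cutoff for tf in pair)

        # Step 1: genes bound by both regulators under the stringent cutoff.
        candidates = [g for g in expression if bound(g, strict)]
        if len(candidates) < 2:
            continue
        # Step 2: keep genes positively correlated with the set's mean profile (assumed criterion).
        centre = mean_profile(candidates, expression)
        seed = [g for g in candidates if pearson(expression[g], centre) >= min_corr]
        if len(seed) < 2:
            continue
        # Step 3: add genes that pass the relaxed cutoff and (assumption) match the seed profile.
        centre = mean_profile(seed, expression)
        module = set(seed)
        for g in expression:
            if g not in module and bound(g, relaxed) and pearson(expression[g], centre) >= min_corr:
                module.add(g)
        modules[pair] = sorted(module)
    return modules


# Made-up toy data: g3 is bound but anti-correlated; g4 only passes the relaxed cutoff.
binding_p = {
    "TF_A": {"g1": 0.0001, "g2": 0.0005, "g3": 0.0002, "g4": 0.005},
    "TF_B": {"g1": 0.0003, "g2": 0.0002, "g3": 0.0004, "g4": 0.008},
}
expression = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [1.5, 2.5, 3.5, 4.5],
}
modules = gram_sketch(binding_p, expression)
print(modules)
```

In the toy data, g3 is dropped in Step 2 despite strong binding, and g4 is recovered in Step 3 despite weaker binding, which is the behaviour the slide describes.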

Methods: GRAM was applied to binding data for 106 transcription factors and 500 expression experiments from Saccharomyces cerevisiae. (Table: list of the 500 expression experiments.)

Results: GRAM Algorithm. Found 106 gene modules containing 655 distinct genes, regulated by 68 transcription factors (TFs).

Results: GRAM Algorithm (~ 35%)

Validation: Comparing Results. GRAM picked up many more (2.5x) regulator-gene interactions than binding data alone would have predicted. How do we know these are not all false positives?

Validation: ChIP Experiments

ChIP lets you determine whether a specific TF actually binds at a given gene. The authors used IP experiments for Stb1 and 36 randomly chosen genes to characterize sensitivity and specificity.

Validation: ChIP Experiments. GRAM pulled out 3 TF-gene relationships that were (A) validated by the IP results and (B) NOT pulled out using binding data alone. GRAM did NOT pull out any TF-gene relationships that the IP results failed to validate. The IP experiments thus showed a reduction in false negatives without an increase in false positives.

Validation: MIPS Categories. Genes in a module ought to belong to the same MIPS categories. Gene modules derived using GRAM were 3x more likely to be enriched for genes in the same MIPS category than groups of genes derived from binding/location data alone.
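The slide does not say which statistical test was used to call a module "enriched" for a MIPS category. One common choice for this kind of question is the hypergeometric tail probability, sketched below as an assumption rather than as the authors' method. The function name and all of the numbers in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(genome_size, category_size, module_size, overlap):
    """P(X >= overlap) when module_size genes are drawn without replacement
    from a genome in which category_size genes carry the annotation."""
    upper = min(category_size, module_size)
    total = comb(genome_size, module_size)
    tail = sum(
        comb(category_size, k) * comb(genome_size - category_size, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up example: 10 genes in total, 4 in the category, a module of 3 genes,
# all 3 of which are in the category.
p_small = hypergeom_enrichment_p(10, 4, 3, 3)
# Requiring an overlap of at least 0 is always satisfied.
p_trivial = hypergeom_enrichment_p(10, 4, 3, 0)
print(p_small, p_trivial)
```

A small p-value means that the overlap between the module and the category is unlikely under random gene selection; in practice a multiple-testing correction would also be needed when many modules and categories are tested.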

Validation: DNA Binding Motifs. Genes linked to a specific TF ought to have that TF's known binding motif upstream of them. The TRANSFAC database was used to check whether genes in GRAM modules were more likely to carry independent evidence of co-regulation than groups of genes derived from binding data alone.

Validation: DNA Binding Motifs. GRAM modules did indeed contain a higher percentage of genes with the appropriate motif in the upstream region of DNA. This further validates the GRAM algorithm.

Validation: Rapamycin Response. Rapamycin inhibits Tor kinase signaling and mimics nutrient starvation. The authors selected 14 TFs and performed genome-wide location analysis on them, then ran the GRAM algorithm using the location data plus expression data from the literature.

Validation: Rapamycin Response. Found 39 gene modules, 23 of which had significant MIPS category enrichment. Added 192 gene-TF interactions that location data alone missed. Generated 4 novel hypotheses.

Software Availability: The authors provide a link to a Java application.

Conclusions: GRAM provides a means of discovering putative regulatory networks that neither data set can detect on its own. Integrating the data sets provides more information than is available from either set independently.

Critiques: No solid measure of sensitivity and specificity. The authors argue that GRAM is more sensitive, but without a specificity measure, how do we know that these are not all false positives? They looked for positive correlations as indicative of activation and did not look at negatively correlated expression, which is potentially an important loss of information. The software does not appear to work out of the box with the sample data provided.
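For reference, the two measures named in this critique can be computed from a set of predicted TF-gene links and a set of experimentally validated links. The sketch below uses the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)); the function name and the toy link sets are made up for illustration and are not data from the paper.

```python
def sensitivity_specificity(predicted, validated, candidates):
    """Return (sensitivity, specificity) over a universe of candidate links.

    predicted, validated and candidates are sets of (tf, gene) pairs;
    predicted and validated are assumed to be subsets of candidates.
    """
    tp = len(predicted & validated)               # predicted and truly bound
    fp = len(predicted - validated)               # predicted but not bound
    fn = len(validated - predicted)               # bound but missed
    tn = len(candidates - predicted - validated)  # correctly left out
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Made-up toy universe of five candidate links for one hypothetical regulator.
candidates = {("TF_X", g) for g in ("g1", "g2", "g3", "g4", "g5")}
validated = {("TF_X", "g1"), ("TF_X", "g2"), ("TF_X", "g3")}
predicted = {("TF_X", "g1"), ("TF_X", "g2"), ("TF_X", "g4")}
sens, spec = sensitivity_specificity(predicted, validated, candidates)
print(sens, spec)
```

Specificity needs true negatives, that is, links tested and shown not to exist, which is why a small validation panel gives only a rough estimate of it.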

Discussion Topics
Can this method be applied to other, higher-level organisms? Should it be?
How can this model be improved to include more information? For example, can we look at negatively correlated expression data?
Should society consider other projects, on the scale of the HGP, to extract more data on organisms in a standardized and systematic way?
Pairwise data is used in many cases in biology to infer system-level interactions, which in reality are multivariate. Is using this pairwise data wise? Is there an alternative?
Could adding data sets from multiple species improve our results, i.e., using metagenes instead of genes?

The End