Gene regulatory network


Gene regulatory network Jin Chen CSE891-001 2012 Fall

Outline: Transcriptional regulation; Co-expression & co-regulation; Bio-techniques (ChIP-seq, bacterial one-hybrid system); Computational models for gene regulation network construction (binding + expression with TF knockout; binding + time-series gene expression)

David J.C. MacKay, Information Theory, Inference & Learning Algorithms, 2003

Transcriptional regulation: Regulation of transcription controls when transcription occurs and how much RNA is created. Transcription factors often need to be bound to a regulatory binding site to switch a gene on (activator) or to shut a gene off (repressor). Generally, as organisms grow more sophisticated, their cellular protein regulation becomes more complicated, and indeed some human genes can be controlled by many activators and repressors working together. Example: when a cell contains a surplus of the amino acid tryptophan, tryptophan binds to a specialized repressor protein (the tryptophan repressor). The binding changes the structural conformation of the repressor so that it binds to the operator region of the operon that synthesizes tryptophan, preventing expression of the operon's genes and thus suspending production. This is a form of negative feedback.

Transcription control: Transcription control is directed primarily by two elements: transcription factors (TFs), and the DNA sequences that facilitate the binding of these TFs (cis-regulatory elements).

Transcription control: In order for a gene to be expressed, several things must happen. First, there needs to be an initiating signal. This signal gives rise to the activation of a TF and recruits other members of the "transcription machine." TFs generally bind DNA simultaneously. TFs and their cofactors can be regulated through reversible structural alterations. Transcription is initiated at the promoter site, as an increased amount of active TF binds a target DNA sequence. Other proteins, known as "scaffolding proteins," bind other cofactors and hold them in place. Frequently, extracellular signals induce the expression of immediate early genes. These are themselves TFs or components thereof, and can further influence gene expression.

Gene regulation network: Node: TF or target gene. Edge: regulation relation (directed); activation “--->”, inhibition “---|”. Example: yeast. Mata et al. Genome Biol. 2007;8(10):R217
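The node and edge definitions above can be sketched as a small signed, directed graph. This is only an illustrative sketch: the gene names, the `EDGES` list, and the helper functions `build_grn` and `describe` are hypothetical and do not come from the slides or from Mata et al.

```python
from collections import defaultdict

# Hypothetical edge list: (regulator, target, sign).
# sign = +1 means activation, sign = -1 means inhibition.
EDGES = [
    ("TF_A", "gene_1", +1),
    ("TF_A", "TF_B", -1),
    ("TF_B", "gene_2", +1),
]


def build_grn(edges):
    """Return an adjacency map: regulator -> {target: sign}."""
    grn = defaultdict(dict)
    for regulator, target, sign in edges:
        grn[regulator][target] = sign
    return grn


def describe(grn):
    """Render every edge with the arrow notation used on the slide."""
    lines = []
    for regulator in sorted(grn):
        for target in sorted(grn[regulator]):
            arrow = "--->" if grn[regulator][target] > 0 else "---|"
            lines.append(f"{regulator} {arrow} {target}")
    return lines


grn = build_grn(EDGES)
print("\n".join(describe(grn)))
```

Storing the sign on the edge keeps activation and inhibition in one structure, so a TF that is also a target (here `TF_B`) needs no special handling.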

Co-expression & co-regulation: Genes belonging to the same cluster are often called co-expressed. Genes with similar expression patterns might share transcription factors and functional regulatory binding sites.
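One simple way to make "similar expression patterns" concrete is a pairwise Pearson correlation between expression profiles. The sketch below assumes that measure and an arbitrary 0.9 threshold; the slides do not name a similarity measure, and the profiles in `PROFILES` are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical expression profiles over four conditions.
PROFILES = {
    "gene_1": [1.0, 2.0, 3.0, 4.0],
    "gene_2": [2.1, 3.9, 6.2, 7.8],
    "gene_3": [4.0, 3.0, 2.0, 1.0],
}


def coexpressed_pairs(profiles, threshold=0.9):
    """Return gene pairs whose correlation is at least `threshold`."""
    names = sorted(profiles)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                pairs.append((a, b))
    return pairs


print(coexpressed_pairs(PROFILES))
```

Co-expressed pairs found this way are only candidates for co-regulation; shared TFs or binding sites still need separate evidence.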

Topic 1. GRN Reconstruction: Background
Algorithms for GRN reconstruction based on gene expression & motif data (2002-2008). 2 input data: time-series microarray; TF binding motifs. 3 models: time-shift matching; mutual information; Granger causality & DBN. 2 limitations: high false-positive rate; small scale (about 100 genes).
Algorithms focusing on integrating binding data with existing data (2007-now). 3 input data: time-series microarray; TF binding motifs; binding data. 2 models: time-shift matching and binding; protein expression approximation from binding data. 2 limitations: lack of binding data at the systems level; combinatorial TF studies.
To see the interactions among regulatory genes, researchers use large-scale perturbation assays in which the expression of each gene is interrupted. A putative GRN built from time-series gene expression data or binding data may have a high proportion of false positives. Because of the scalability and over-fitting problems of existing learning systems, the direct purification of a large GRN is very difficult.

ChIP-seq: ChIP-seq is the sequencing of the genomic DNA fragments that co-precipitate with a DNA-binding protein under study. It can identify all DNA segments in the genome physically associated with a specific DNA-binding protein, and it does not rely on prior knowledge of precise DNA binding sites. It allows precise mapping of DNA-binding proteins and complexes; those most frequently investigated are transcription factors, chromatin-modifying enzymes, modified histones interacting with genomic DNA, and components of the basal transcriptional machinery. Liu et al., BMC Biology 2010, 8:56

Figure: flow scheme of the central steps in the ChIP-seq procedure. Liu et al., BMC Biology 2010, 8:56

Example (Helicos). Slide labels: algorithmic analysis for mapping and peak-calling; DNA sample preparation methodology for ChIP-Seq. A ChIP DNA sample from a stem cell population and the corresponding input DNA sample were both processed without amplification. http://www.helicosbio.com/Applications/ChIPSeq/tabid/69/Default.aspx

Three key steps in ChIP-seq: antibody selection; sequencing; algorithmic analysis for mapping and peak-calling.
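The peak-calling step can be illustrated with a toy sketch that compares ChIP read counts against the input control in fixed-size genomic windows. Everything here is an assumption made for illustration: the window counts, the fold-enrichment threshold, and the function name `call_peaks`. Real peak callers use statistical models of the background rather than a fixed fold cut-off.

```python
def call_peaks(chip_counts, input_counts, min_fold=4.0, pseudocount=1.0):
    """Return (first_window, last_window) index ranges of enriched regions.

    A window is enriched when its ChIP/input ratio (with a pseudocount)
    reaches `min_fold`; adjacent enriched windows are merged into one peak.
    """
    enriched = [
        (chip + pseudocount) / (ctrl + pseudocount) >= min_fold
        for chip, ctrl in zip(chip_counts, input_counts)
    ]
    peaks, start = [], None
    for idx, flag in enumerate(enriched):
        if flag and start is None:
            start = idx                      # a new peak opens here
        elif not flag and start is not None:
            peaks.append((start, idx - 1))   # the open peak closes
            start = None
    if start is not None:                    # peak running to the last window
        peaks.append((start, len(enriched) - 1))
    return peaks


# Hypothetical read counts per window for the ChIP sample and input control.
chip = [2, 3, 40, 55, 4, 1, 30, 2]
ctrl = [2, 2, 3, 4, 3, 1, 2, 2]
print(call_peaks(chip, ctrl))
```

The input sample acts as the background estimate, which is why the ChIP and input samples on the preceding slide are processed in parallel.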

Binding network: The first action of a TF is to find and bind DNA segments, and ChIP-seq allows the binding sites of TFs to be identified across entire genomes. In the resulting protein-DNA binding network: direct downstream targets of any transcription factor can be determined; the DNA sequence motif recognized by the binding protein can be computed; precise regulatory sites in the genome for any transcription factor can be identified; clustering of transcription-regulatory proteins at specific DNA sites can be assessed.

Example: ChIP-seq profiling of 13 TFs in embryonic stem (ES) cell development revealed the organization of regulatory elements. This provided insights into the integration of TF-mediated signaling pathways in ES cell differentiation. Chen et al., Cell 2008, 133:1106-1117

What ChIP-seq cannot do: Many observed binding events are neutral and do not regulate transcription. Regulatory binding events often occur at enhancers that are not proximal to the target gene that they control. Identifying transcriptional targets therefore requires integrating ChIP-seq with evidence from expression data to help associate binding events with target gene regulation. Honkela et al. PNAS 2010 vol. 107 no. 17 pp 7793–7798

Bacterial one-hybrid system wikipedia

Gene expression data: TF knock-out gene expression; TF over-expression gene expression. Differential expression of genes between the wild type and the mutant or over-expression strain is indicative of a potential regulatory interaction, e.g. the yeast GRN. Reimand et al, Nucleic Acids Research, 2010, Vol. 38, No. 14 pp 4768–4777
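The idea of flagging genes whose expression differs between wild type and a TF mutant can be sketched with a log2 fold change. This is a minimal sketch under stated assumptions: the expression values, the pseudocount, and the cut-off of 1.0 are invented for illustration, and real analyses use replicates and a statistical test rather than a bare fold-change threshold.

```python
import math


def log2_fold_changes(wild_type, mutant, pseudocount=1.0):
    """log2(mutant / wild type) for every gene measured in both strains."""
    return {
        gene: math.log2((mutant[gene] + pseudocount) / (wild_type[gene] + pseudocount))
        for gene in wild_type
        if gene in mutant
    }


def candidate_targets(wild_type, mutant, min_abs_lfc=1.0):
    """Genes whose expression changes at least two-fold in either direction."""
    lfc = log2_fold_changes(wild_type, mutant)
    return sorted(gene for gene, value in lfc.items() if abs(value) >= min_abs_lfc)


# Hypothetical expression levels in the wild type and in a TF knockout.
WT = {"gene_1": 99.0, "gene_2": 49.0, "gene_3": 7.0}
KO = {"gene_1": 24.0, "gene_2": 51.0, "gene_3": 31.0}

print(log2_fold_changes(WT, KO))
print(candidate_targets(WT, KO))
```

A gene that drops in the knockout (here `gene_1`) is consistent with activation by the deleted TF, and a gene that rises (`gene_3`) with repression, but either change may also be indirect.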

Comprehensive analysis of TF knockout expression data in yeast. Data: 269 TF knockout microarrays, covering almost all yeast TFs; DNA–protein interactions derived from ChIP-chip experiments; predicted TF binding sites with position weight matrices. Hu et al. Nat. Genet., 39, 683–687, 2007; Harbison et al. Nature, 431, 99–104, 2004

Comprehensive analysis of TF knockout expression data in yeast: checking the expression levels of the TFs. Intuitively, one expects the TF under consideration to have lower expression in the mutant strain than in the wild-type strain. The analysis confirms this for 155 TFs. 78 TFs display a negative fold change at statistically non-significant levels. 36 TFs are lethal. Among these regulators are several that affect many genes, emphasising that even small adjustments to TF expression levels can have a dramatic effect on target genes. Reimand et al, Nucleic Acids Research, 2010, Vol. 38, No. 14 pp 4768–4777

Comprehensive analysis of TF knockout expression data in yeast: examining functional annotations of differentially expressed genes. As most TFs are considered to regulate distinct cellular processes, their target genes should be associated with a coherent set of molecular and biological functions. g:Profiler was used to identify GO, KEGG and Reactome pathway annotations. Across all TF knockouts, this analysis has a higher score than the original analysis. Reimand et al, Nucleic Acids Research, 2010, Vol. 38, No. 14 pp 4768–4777

Comprehensive analysis of TF knockout expression data in yeast: overlap between TF-binding and TF knockout data. Information from large-scale ChIP-chip and motif-finding studies was considered. Binding sites were collected for 142 TFs, comprising 5,188 ChIP-chip interactions and 17,091 motif predictions. The intersection was calculated between the list of differentially expressed genes from each TF knockout and the targets identified by ChIP-chip or binding-site predictions. Result: 2,230 regulation relations. Reimand et al, Nucleic Acids Research, 2010, Vol. 38, No. 14 pp 4768–4777
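The intersection step described above amounts to a per-TF set operation. The following sketch shows that operation on invented data; the gene and TF names and the function `supported_regulations` are hypothetical and are not taken from Reimand et al.

```python
def supported_regulations(de_genes, chip_targets, motif_targets):
    """Return (TF, gene) pairs where the gene is differentially expressed in
    the TF knockout AND is a ChIP-chip target or a predicted motif target."""
    relations = []
    for tf in sorted(de_genes):
        bound = chip_targets.get(tf, set()) | motif_targets.get(tf, set())
        for gene in sorted(de_genes[tf] & bound):
            relations.append((tf, gene))
    return relations


# Hypothetical inputs, keyed by TF.
DE_GENES = {"TF_A": {"g1", "g2", "g3"}, "TF_B": {"g4"}}
CHIP_TARGETS = {"TF_A": {"g1", "g9"}}
MOTIF_TARGETS = {"TF_A": {"g3"}, "TF_B": {"g5"}}

print(supported_regulations(DE_GENES, CHIP_TARGETS, MOTIF_TARGETS))
```

Requiring both kinds of evidence removes neutral binding events (bound but unchanged) and indirect effects (changed but unbound), at the cost of missing true targets absent from either data set.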

Comprehensive analysis of TF knockout expression data in yeast: including protein-protein interaction information as an additional perspective on the assessment of the GRN. TFs that function together may show significant overlap in their target genes: out of 115 pairs of physically interacting TFs in the dataset, 92 display such an overlap. TFs tend to regulate genes that interact with each other: out of 110,487 differentially expressed genes, there are 3,846 pair-wise interactions between co-regulated genes, covering 2,262 genes in total. Most TFs target at least one pair of interacting genes. Reimand et al, Nucleic Acids Research, 2010, Vol. 38, No. 14 pp 4768–4777
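Whether two TFs show a "significant overlap" in their target genes needs a statistical test. The slides do not say which one was used; the sketch below assumes a one-sided hypergeometric test, a common choice for set overlaps, and the numbers in the usage lines are invented.

```python
from math import comb


def hypergeom_overlap_pvalue(n_genes, n_a, n_b, n_overlap):
    """P(overlap >= n_overlap) when the n_b targets of TF B are drawn at random
    from n_genes genes, n_a of which are targets of TF A."""
    upper = min(n_a, n_b)
    total = comb(n_genes, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_genes - n_a, n_b - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 6000 genes, two TFs with 100 and 80 targets, 20 shared.
print(hypergeom_overlap_pvalue(6000, 100, 80, 20))

# Tiny case that can be checked by hand: all 3 targets shared out of 10 genes.
print(hypergeom_overlap_pvalue(10, 3, 3, 3))
```

With many TF pairs tested at once, the resulting p-values would also need a multiple-testing correction.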

Temporal gene expression data: A problem with the above approach is that creating mutant strains is challenging or impossible for many TFs of interest. Even when available, mutants may provide very limited information because of redundancy or because the signal is confounded by indirect regulatory feedback. Temporal dynamics: use time-series wild-type gene expression instead, e.g. the Drosophila GRN. Honkela et al. PNAS 2010 vol. 107 no. 17 pp 7793–7798
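One simple way to use time-series expression is to look for the time shift at which a target's profile best follows a TF's profile. The sketch below uses a lagged Pearson correlation for this; it is a deliberately simple stand-in and not the model of Honkela et al. The series and the function `best_lag` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def best_lag(tf_series, target_series, max_lag=3):
    """Return (lag, correlation) maximizing corr(tf[t], target[t + lag])."""
    best = None
    for lag in range(max_lag + 1):
        x = tf_series[: len(tf_series) - lag]   # TF values at time t
        y = target_series[lag:]                 # target values at time t + lag
        if len(x) < 3:
            continue                            # too few points to correlate
        r = pearson(x, y)
        if best is None or r > best[1]:
            best = (lag, r)
    return best


# Hypothetical series: the target repeats the TF pulse two time points later.
tf = [0, 1, 4, 2, 1, 0, 0, 0]
target = [0, 0, 0, 1, 4, 2, 1, 0]
lag, r = best_lag(tf, target)
print(lag, round(r, 3))
```

A delayed, positively correlated response is consistent with activation, but a shared upstream regulator would produce the same pattern, which is one source of the false positives noted earlier.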

Other models for gene regulation network construction: expression-based studies (Dynamic Bayesian Network, Granger causality); TF binding motif-based studies (Weeder, AlignACE).

Gene regulation network analysis: Transcriptional regulation is mediated by the combinatorial interplay between cis-regulatory DNA elements and trans-acting transcription factors, and is perhaps the most important mechanism for controlling gene expression. A transcriptional regulatory network that integrates such information can lead to a systems-level understanding of regulatory mechanisms. Kim et al. Wiley Interdisciplinary Reviews: Systems Biology and Medicine. Vol. 3, Iss 1, pp 21–35 2011

Discovery of motifs and regulatory modules: Binding motif prediction based only on the knowledge afforded by PWMs often suffers from high false-positive rates. To reduce the false-positive rate, a number of biological insights have been used: the evolutionary pressure placed on these important cis-regulatory elements; co-expression with genes that have well-documented functions and expression patterns; clustering of cis-regulatory features into cis-regulatory modules.

Discovery of motifs and regulatory modules: As the majority of transcription factors are sequence-specific, there have been extensive efforts to collect the binding sequences for known TFs and to identify potential binding sites in unannotated genome sequence. Figure: modeling of individual transcription factor binding sites and cis-regulatory modules (panels: position frequency matrix; cluster of individual TF binding sites; regulatory relations among 4 genes). Kim et al. Wiley Interdisciplinary Reviews: Systems Biology and Medicine. Vol. 3, Iss 1, pp 21–35 2011
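The step from collected binding sequences to scanning unannotated sequence can be sketched as follows: count bases per position to get a position frequency matrix, convert it to log-odds weights, and slide the matrix along a sequence. The binding sites in `SITES`, the pseudocount, and the uniform background are assumptions made for illustration.

```python
import math

# Hypothetical aligned binding sites for one TF (all the same length).
SITES = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]


def position_frequency_matrix(sites):
    """Count each base at each position of the aligned sites."""
    pfm = [{base: 0 for base in "ACGT"} for _ in range(len(sites[0]))]
    for site in sites:
        for pos, base in enumerate(site):
            pfm[pos][base] += 1
    return pfm


def position_weight_matrix(pfm, pseudocount=0.5, background=0.25):
    """Convert counts to log2 odds against a uniform background."""
    pwm = []
    for column in pfm:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def best_hit(sequence, pwm):
    """Return (start, score) of the highest-scoring window in `sequence`."""
    width = len(pwm)
    scores = [
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    score, start = max(scores)
    return start, score


pfm = position_frequency_matrix(SITES)
pwm = position_weight_matrix(pfm)
print(best_hit("CCGTTGACTCAGGA", pwm))
```

Short matrices like this match by chance many times in a genome, which is the false-positive problem that conservation, co-expression, and site clustering are meant to reduce.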

Co-regulation: A density-based subspace clustering algorithm for coherent clustering of gene expression data. The model allows the expression profiles of genes in a cluster to follow any shifting-and-scaling pattern in a subspace, and requires expression value changes across any two conditions of the cluster to be significant. Experimental results show that the algorithm is able to detect a significant number of biologically significant clusters missed by previous models.
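A shifting-and-scaling pattern means that, over the chosen conditions, one profile is a linear function of another: y = scale * x + shift. The sketch below only checks that relation for a pair of profiles with a least-squares fit; it is not the density-based clustering algorithm itself, and the profiles, tolerance, and function names are hypothetical. Allowing a negative scale is also an assumption of this sketch.

```python
def fit_shift_scale(x, y):
    """Least-squares fit of y ~ scale * x + shift over the given conditions.

    Returns (scale, shift, max_abs_residual); x must not be constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    scale = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sxx
    shift = my - scale * mx
    residual = max(abs(v - (scale * u + shift)) for u, v in zip(x, y))
    return scale, shift, residual


def is_coherent(x, y, tolerance=1e-6):
    """True when y follows a shifting-and-scaling pattern of x."""
    return fit_shift_scale(x, y)[2] <= tolerance


# Hypothetical profiles restricted to a subspace of four conditions.
gene_x = [1.0, 3.0, 2.0, 5.0]
gene_y = [8.0, 4.0, 6.0, 0.0]    # equals -2 * gene_x + 10
gene_z = [1.0, 9.0, 2.0, 3.0]    # no linear relation to gene_x

print(fit_shift_scale(gene_x, gene_y))
print(is_coherent(gene_x, gene_y), is_coherent(gene_x, gene_z))
```

On noisy data the tolerance would have to be much looser than the default used here.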