Identification of a Novel cis-Regulatory Element Involved in the Heat Shock Response in Caenorhabditis elegans Using Microarray Gene Expression and Computational.


Identification of a Novel cis-Regulatory Element Involved in the Heat Shock Response in Caenorhabditis elegans Using Microarray Gene Expression and Computational Methods
Debraj GuhaThakurta, Lisanne Palomar, Gary D. Stormo, Pat Tedesco, Thomas E. Johnson, David W. Walker, Gordon Lithgow, Stuart Kim, and Christopher D. Link
Presented by Abel G. Gezahegne, ECS 289A, February 24, 2003

Overview
- Monitor ~12,000 genes from C. elegans to determine which genes are up-regulated on heat shock (HS).
- Analyze the upstream regions of these genes using computational DNA pattern recognition methods to identify any cis-regulatory motifs.
- Determine the significance of these motifs using statistical methods.
- Perform comparative sequence analysis to determine whether the motifs are conserved across species.

Microarray Experiment
- Determine gene expression patterns before and after HS using DNA microarrays covering 11,917 known and predicted C. elegans genes.
- Animals were harvested as young adults and then split into two halves: an HS population and a control population.
- Five independent HS experiments at 35 °C: in two experiments, animals were harvested after 1 hr of HS; in three experiments, animals were heat shocked for 2 hrs, allowed to recover at 20 °C for 2 hrs, and then harvested.

Software Tools
- Consensus: a greedy algorithm that searches for a matrix with a low probability of occurring by chance.
- ANN-Spec: an algorithm based on an artificial neural network and the Gibbs sampling method that discovers un-gapped patterns in DNA sequences.
- GLASS (GLobal Alignment SyStem): a sequence alignment algorithm.
- Patser: given a weight matrix, identifies high-scoring subsequences and calculates their p-values.
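
As a rough illustration of what weight-matrix tools such as Patser do, the sketch below builds a log-odds matrix from aligned sites and scores every window of a sequence. It is not Patser's implementation: the function names, the uniform background, and the pseudocount are assumptions made for this example, and no p-values are computed.

```python
import math

BASES = "ACGT"

def build_weight_matrix(sites, background=None, pseudocount=1.0):
    """Build a natural-log odds matrix from equal-length aligned sites.

    The uniform background and the pseudocount are illustrative assumptions.
    """
    background = background or {b: 0.25 for b in BASES}
    width = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    matrix = []
    for i in range(width):
        column = [site[i] for site in sites]
        matrix.append({
            b: math.log(((column.count(b) + pseudocount) / total) / background[b])
            for b in BASES
        })
    return matrix

def score_windows(seq, matrix):
    """Return the matrix score of every window of the sequence (one strand only)."""
    width = len(matrix)
    return [
        sum(matrix[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    ]
```

Scores are kept in natural-log units so that a window with score s can be exponentiated directly as e^s in a binding model.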

Gene Identification
- Identified 28 genes that were induced in at least four of the five experiments and over-expressed by a factor of two or more.
- Because of noise in the DNA microarray data, only genes up-regulated by an average factor of four or more were considered further.
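
The two selection criteria can be sketched as a simple filter over per-experiment fold changes. The function name, the input layout (one HS/control ratio per experiment), and the use of an arithmetic mean for the "average factor" are assumptions for this example; the slides do not say how the average was computed.

```python
from statistics import mean

def select_induced(fold_changes, min_experiments=4, min_fold=2.0, min_mean_fold=4.0):
    """Return genes that pass both induction filters.

    fold_changes maps a gene name to a list of HS/control expression ratios,
    one per experiment. The arithmetic mean is an assumption.
    """
    selected = []
    for gene, ratios in fold_changes.items():
        n_induced = sum(1 for r in ratios if r >= min_fold)
        if n_induced >= min_experiments and mean(ratios) >= min_mean_fold:
            selected.append(gene)
    return sorted(selected)

# Made-up ratios for three hypothetical genes, five experiments each.
example = {
    "geneA": [5.0, 6.0, 4.5, 7.0, 1.5],  # induced in 4 of 5, mean 4.8
    "geneB": [2.1, 2.5, 2.2, 2.0, 2.3],  # induced in 5 of 5, but mean below 4
    "geneC": [8.0, 1.0, 1.2, 9.0, 1.1],  # induced in only 2 of 5
}
print(select_induced(example))
```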

Gene Identification (cont.)
- Used the 500 bp upstream of the transcription start site to select candidates for promoter elements.
- Two DNA motifs were identified by Consensus and ANN-Spec.
- HSE (TTCTAGAA): a well-known DNA binding site for the HS transcription factor (HSF).
- HSAS (GGGTGTC): a previously unknown motif that does not correspond to any known TF binding site.
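
For orientation, the sketch below locates exact copies of the two consensus sequences on both strands of an upstream region. Exact string matching is a deliberate simplification: the study used weight matrices, and the function names and toy sequence here are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(upstream, motif):
    """Return sorted (position, strand) pairs for exact motif matches.

    Positions are 0-based offsets on the forward strand. A palindromic
    motif is reported once per strand at the same position.
    """
    hits = set()
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = upstream.find(pattern)
        while start != -1:
            hits.add((start, strand))
            start = upstream.find(pattern, start + 1)
    return sorted(hits)

# Toy upstream fragment, not a real C. elegans promoter.
toy = "AAGGGTGTCAATTCTAGAACCGACACCCTT"
print(find_sites(toy, "TTCTAGAA"))  # HSE
print(find_sites(toy, "GGGTGTC"))   # HSAS
```

Because TTCTAGAA is its own reverse complement, each HSE occurrence appears on both strands at the same offset.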

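Mathematical Model
- Probability of a protein binding to a site with score $s$: $P(\text{bound} \mid s) \propto e^{s}$
- When multiple binding sites exist, the probability of binding to the sequence: $P^{m}_{\text{seq}} \propto \sum_{\text{sites}} e^{s}$
- Geometric mean of the pp-values over $N$ sequences: $\langle pp \rangle = \left[ \prod_{\text{seq}} \sum_{\text{sites}} e^{s} \right]^{1/N}$
- Difference of the log geometric means of the pp-values: $\mathrm{DLGM} = \log \langle pp \rangle_{\text{HS}} - \log \langle pp \rangle_{\text{Rand}}$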
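
A minimal sketch of these quantities, assuming each sequence is represented by a list of natural-log site scores. The function names are hypothetical, and the log-sum-exp step is only a numerical convenience, not something stated in the slides.

```python
import math

def log_pp(site_scores):
    """Log of the summed e^s over all sites of one sequence (log-sum-exp)."""
    top = max(site_scores)
    return top + math.log(sum(math.exp(s - top) for s in site_scores))

def log_geometric_mean(per_sequence_scores):
    """Log of the geometric mean of the pp-values over N sequences.

    The log of an N-th root of a product is the mean of the logs.
    """
    return sum(log_pp(s) for s in per_sequence_scores) / len(per_sequence_scores)

def dlgm(hs_scores, rand_scores):
    """Difference of the log geometric means: HS set minus random set."""
    return log_geometric_mean(hs_scores) - log_geometric_mean(rand_scores)
```

Working with the mean of logs avoids multiplying many pp-values together, which could overflow or underflow for large sequence sets.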

Statistical Significance
- Use the DLGM to determine the cutoff score, comparing the 13 up-regulated genes with 3000 random genes from the C. elegans genome: $\mathrm{DLGM} = \log \langle pp \rangle_{\text{HS}} - \log \langle pp \rangle_{\text{Rand}}$
- At a low cutoff, a substantial number of low-scoring sites are included in both sets, so the DLGM is low.
- At a high cutoff, even the high-scoring sites are excluded, so the DLGM drops.
- The cutoff score that maximizes the DLGM is chosen as the appropriate cutoff value.
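
The cutoff scan can be sketched as follows. Only sites scoring at or above the cutoff contribute to a sequence's pp-value. How the original analysis treated sequences with no site above the cutoff is not stated in the slides; the `floor` parameter below is a hypothetical stand-in, and all names and numbers are illustrative.

```python
import math

def log_geometric_mean(seqs, cutoff, floor=1.0):
    """Mean log pp-value, counting only sites with score >= cutoff.

    Sequences with no qualifying site get pp = floor (an assumption).
    """
    total = 0.0
    for scores in seqs:
        pp = sum(math.exp(s) for s in scores if s >= cutoff)
        total += math.log(max(pp, floor))
    return total / len(seqs)

def dlgm_at(hs, rand, cutoff, floor=1.0):
    """DLGM between the HS set and the random set at one cutoff."""
    return (log_geometric_mean(hs, cutoff, floor)
            - log_geometric_mean(rand, cutoff, floor))

def best_cutoff(hs, rand, cutoffs, floor=1.0):
    """Return the candidate cutoff that maximizes the DLGM."""
    return max(cutoffs, key=lambda c: dlgm_at(hs, rand, c, floor))

# Made-up site scores: HS promoters carry one strong site each.
hs = [[5.0, 1.0], [4.5, 0.5], [5.5, 1.5]]
rand = [[1.0, 0.5], [1.5, 1.0], [0.5, 2.0], [1.0, 1.0]]
print(best_cutoff(hs, rand, [0.0, 3.0, 6.0]))
```

In this toy data the middle cutoff wins: the lowest cutoff lets weak sites raise the random set's mean, and the highest cutoff discards the strong HS sites as well.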

Cross-Species Conservation
- To study conservation of regulatory sites across related species, two orthologous gene pairs from C. elegans and C. briggsae were examined.
- The pattern of HSE and HSAS sites on the promoters indicates conservation across closely related species.
- Figure: output from VISTA (VISualization Tools for Alignment).

Cross-Species Conservation (cont.)
- The gene structure and the distances between the genes are similar in both organisms.
- The two genes share 450 nt of upstream DNA sequence.
- Figure: output from the GLASS alignment algorithm.

Mutant Promoter Construct
- A single mutation of an HSE or of the HSAS still results in a significant expression level of GFP (green fluorescent protein).
- Mutating two or all three HSE sites, or one HSE together with the HSAS, results in a dramatic reduction in expression level.

Remarks and Conclusion
- Since the microarray covered only ~2/3 of the C. elegans genes, other HS-induced genes may exist.
- Experiments and statistical methods show that the novel cis-regulatory element plays a significant role in the heat shock response.
- The study also shows that computational methods can be a valuable tool in the discovery of novel regulatory elements.