Promoter Analysis using Bioinformatics, Putting the Predictions to the Test Amy Creekmore Ansci 490M November 19, 2002.

Slides:



Advertisements
Similar presentations
Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome ECS289A.
Advertisements

Periodic clusters. Non periodic clusters That was only the beginning…
Gene regulation /function card Anatomical network card Tassy et al., Figure S1: Navigation diagram of ANISEED Anatomical structure card Expression card.
PREDetector : Prokaryotic Regulatory Element Detector Samuel Hiard 1, Sébastien Rigali 2, Séverine Colson 2, Raphaël Marée 1 and Louis Wehenkel 1 1 Bioinformatics.
Computational discovery of gene modules and regulatory networks Ziv Bar-Joseph et al (2003) Presented By: Dan Baluta.
Predicting Enhancers in Co-Expressed Genes Harshit Maheshwari Prabhat Pandey.
Combined analysis of ChIP- chip data and sequence data Harbison et al. CS 466 Saurabh Sinha.
Finding regulatory modules from local alignment - Department of Computer Science & Helsinki Institute of Information Technology HIIT University of Helsinki.
Bioinformatics Motif Detection Revised 27/10/06. Overview Introduction Multiple Alignments Multiple alignment based on HMM Motif Finding –Motif representation.
March 03 Identification of Transcription Factor Binding Sites Presenting: Mira & Tali.
Regulatory Motifs. Contents Biology of regulatory motifs Experimental discovery Computational discovery PSSM MEME.
Genome-wide prediction and characterization of interactions between transcription factors in S. cerevisiae Speaker: Chunhui Cai.
Identification of a Novel cis-Regulatory Element Involved in the Heat Shock Response in Caenorhabditis elegans Using Microarray Gene Expression and Computational.
Cis/TF discovery for Arabidopsis Aristotelis Tsirigos NYU Computer Science.
Yeast Dataset Analysis Hongli Li Final Project Computer Science Department UMASS Lowell.
Microarrays and Cancer Segal et al. CS 466 Saurabh Sinha.
MOPAC: Motif-finding by Preprocessing and Agglomerative Clustering from Microarrays Thomas R. Ioerger 1 Ganesh Rajagopalan 1 Debby Siegele 2 1 Department.
O AK R IDGE N ATIONAL L ABORATORY U.S. D EPARTMENT OF E NERGY 1 Identifying Regulatory Transcriptional Elements on Functional Gene Groups Using Computer-
The Model To model the complex distribution of the data we used the Gaussian Mixture Model (GMM) with a countable infinite number of Gaussian components.
An analysis of “Alignments anchored on genomic landmarks can aid in the identification of regulatory elements” by Kannan Tharakaraman et al. Sarah Aerni.
Fuzzy K means.
The Hardwiring of development: organization and function of genomic regulatory systems Maria I. Arnone and Eric H. Davidson.
1 Predicting Gene Expression from Sequence Michael A. Beer and Saeed Tavazoie Cell 117, (16 April 2004)
Regulatory element detection using correlation with expression (REDUCE) Literature search WANG Chao Sept 14, 2004.
CisGreedy Motif Finder for Cistematic Sarah Aerni Mentors: Ali Mortazavi Barbara Wold.
Review of important points from the NCBI lectures. –Example slides Review the two types of microarray platforms. –Spotted arrays –Affymetrix Specific examples.
Bryan Heck Tong Ihn Lee et al Transcriptional Regulatory Networks in Saccharomyces cerevisiae.
Gene Expression Clustering. The Main Goal Gain insight into the gene’s function. Using: Sequence Transcription levels.
Motif finding: Lecture 1 CS 498 CXZ. From DNA to Protein: In words 1.DNA = nucleotide sequence Alphabet size = 4 (A,C,G,T) 2.DNA  mRNA (single stranded)
Cis-regulatory element study in transcriptome Jin Chen CSE Fall
Genome of the week - Deinococcus radiodurans Highly resistant to DNA damage –Most radiation resistant organism known Multiple genetic elements –2 chromosomes,
Epigenome 1. 2 Background: GWAS Genome-Wide Association Studies 3.
A systems biology approach to the identification and analysis of transcriptional regulatory networks in osteocytes Angela K. Dean, Stephen E. Harris, Jianhua.
Analyzing transcription modules in the pathogenic yeast Candida albicans Elik Chapnik Yoav Amiram Supervisor: Dr. Naama Barkai.
Kristen Horstmann, Tessa Morris, and Lucia Ramirez Loyola Marymount University March 24, 2015 BIOL398-04: Biomathematical Modeling Lee, T. I., Rinaldi,
발표자 석사 2 년 김태형 Vol. 11, Issue 3, , March 2001 Comparative DNA Sequence Analysis of Mouse and Human Protocadherin Gene Clusters 인간과 마우스의 PCDH 유전자.
* only 17% of SNPs implicated in freshwater adaptation map to coding sequences Many, many mapping studies find prevalent noncoding QTLs.
Finish up array applications Move on to proteomics Protein microarrays.
Identification of Regulatory Binding Sites Using Minimum Spanning Trees Pacific Symposium on Biocomputing, pp , 2003 Reporter: Chu-Ting Tseng Advisor:
ChIP-on-Chip and Differential Location Analysis Junguk Hur School of Informatics October 4, 2005.
Using Mixed Length Training Sequences in Transcription Factor Binding Site Detection Tools Nathan Snyder Carnegie Mellon University BioGrid REU 2009 University.
Vidyadhar Karmarkar Genomics and Bioinformatics 414 Life Sciences Building, Huck Institute of Life Sciences.
CS5263 Bioinformatics Lecture 20 Practical issues in motif finding Final project.
Figure 2: over-representation of neighbors in the fushi-tarazu region of Drosophila melanogaster. Annotated enhancers are marked grey. The CDS is marked.
Searching for structured motifs in the upstream regions of hsp70 genes in Tetrahymena termophila. Roberto Marangoni^, Antonietta La Terza*, Nadia Pisanti^,
Motifs BCH364C/391L Systems Biology / Bioinformatics – Spring 2015 Edward Marcotte, Univ of Texas at Austin Edward Marcotte/Univ. of Texas/BCH364C-391L/Spring.
Identification of cell cycle-related regulatory motifs using a kernel canonical correlation analysis Presented by Rhee, Je-Keun Graduate Program in Bioinformatics.
Computational Genomics and Proteomics Lecture 8 Motif Discovery C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E.
Localising regulatory elements using statistical analysis and shortest unique substrings of DNA Nora Pierstorff 1, Rodrigo Nunes de Fonseca 2, Thomas Wiehe.
From Genomes to Genes Rui Alves.
Data Mining the Yeast Genome Expression and Sequence Data Alvis Brazma European Bioinformatics Institute.
Discovery of transcription networks Lecture3 Nov 2012 Regulatory Genomics Weizmann Institute Prof. Yitzhak Pilpel.
Pattern Discovery and Recognition for Genetic Regulation Tim Bailey UQ Maths and IMB.
1 From Mendel to Genomics Historically –Identify or create mutations, follow inheritance –Determine linkage, create maps Now: Genomics –Not just a gene,
Module Networks BMI/CS 576 Mark Craven December 2007.
Finding genes in the genome
Pattern Discovery and Recognition for Understanding Genetic Regulation Timothy L. Bailey Institute for Molecular Bioscience University of Queensland.
BIOINFORMATICS Ayesha M. Khan Spring 2013 Lec-8.
Enhancers and 3D genomics Noam Bar RESEARCH METHODS IN COMPUTATIONAL BIOLOGY.
Prediction of Regulatory Elements for Non-Model Organisms Rachita Sharma, Patricia.
Recitation 7 2/4/09 PSSMs+Gene finding
A Zero-Knowledge Based Introduction to Biology
Mapping Global Histone Acetylation Patterns to Gene Expression
Diverse patterns, similar mechanism
From Mendel to Genomics
Volume 111, Issue 5, Pages (November 2002)
Nora Pierstorff Dept. of Genetics University of Cologne
Predicting Gene Expression from Sequence
Volume 20, Issue 9, Pages (May 2010)
Deep Learning in Bioinformatics
Presentation transcript:

Promoter Analysis using Bioinformatics, Putting the Predictions to the Test Amy Creekmore Ansci 490M November 19, 2002

Problems in predicting promoters/ transcription factor binding sites Transcription factors often recognize relatively short and degenerate sequences. These sequences are commonly found through out the genome of the species. Induction often depends on the spacing/ frequency of transcription binding sites within a sequences. Binding sites are not always in the upstream region. Markstein et al., 2002, figure1 Markstein et al., 2002, figure 2

Different Approaches to Promoter Analysis Saeed Tavazoie, et al. “Systematic determination of genetic network architecture” Nature Genetics 22: –Discovery of transcriptional regulation sub-networks, or genes that are under the control of similar promoters. – De novo discovery of cis-regulatory elements in yeast using expression clustering of microarray data and AlignACE.

Different Approaches to Promoter Analysis Michele Markstein, et al. “Genome-wide analysis of clustered Dorsal binding sites identifies putative target genes in the Drosophila embryo” Proceeding of the National Academy of Scinces USA 99 (2): –Identify genes (known and unknown) that are regulated by the characterized transcription factor Dorsal. –Used FLYENHANCER to screen for clusters of known Dorsal response elements.

Different Approaches to Promoter Analysis Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome. “ Proceedings of the National Academy of Sciences USA 99(2): –Evaluated the extent to which the clustering of transcription factor binding sites can be used as the computational basis to identify cis- regulatory modules. –Used the program PASTER to search the genome for consensus binding sites of five different developmental transcription factors and then used CIS-ANALYST to visualize and compute results.

First Approach Systematic determination of genetic network architecture Saeed Tavazoie, Jason D. Hughes, Michael J. Cambell, Raymond J. Cho, and George M. Church Nature Genetics 22:

Method Used microarray data by Cho et al that consisted of expression data for 6000 genes at 15 times points during two S. cerevisiae mitotic cell cycles. Analyzed 3000 “most variable ORFs” and normalized data by subtracting the mean expression level value across all time points for each gene. Clustered genes by expression pattern using euclidean distance metric values in the k-means algorithm.

Partitioned the 3000 ORFs into 30 clusters and the genes to functional categories. Determined the statistical significance for enrichment of a particular functional category.

Used AlignACE to align 600bp upstream regions in order to determined common nucleotide motifs.

Results Found 18 motifs in 12 different clusters –Seven characterized transcription factor binding sites that are known to regulate many of the genes in their respective cluster. Clusters with known regulons have cis-regulatory elements emerged as the highest scoring motif in every case. –examples include MCB box and SCB cell-cycle box. Motifs that have not been previously described demonstrate strong correlation with clusters that are enrichement for genes with specific functions. –Cluster 3 motifs M3a and M3b and their association with RNA and translation related genes within and outside of cluster 3. “Half of the 30 clusters were significantly enriched for functional categories or had significant motifs.”

Second Approach Genome-wide analysis of clustered Dorsal binding sites identifies putative target genes in the Drosophila embryo Michele Markstein, Peter Markstein, Vicky Markstein, and Michael S. Levine. Proceeding of the National Academy of Scinces USA 99 (2):

Dorsal Transcription Factor Drosophila transcription factor involved in dorsal- ventral patterning in development. Transcription can be inhibited or induced by Dorsal depending on the promoter. Also, transcription induction is concentration dependent. Zen Sog

Used a degenerate Dorsal consensus sequences to scan entire Drosophila genome using FLY ENHANCER.

Ady and Phn expression patterns

Results Computational searches successfully identified genes that are activated at high (Phm), intermediate (Ady), and low (Sog) levels of Dorsal. At least 33% are known, or indicated, to be regulated by dorsal (5/15).

Third Approach Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome. Benjamin P. Berman, Yutaka Nibu, Barret D. Pfeiffer, Pavel Tomancak, Susan E. Celniker, Michael Levine, Gerald M. Rubin, and Michael B. Eisen Proceedings of the National Academy of Sciences USA 99(2):

Methods Assigned consensus sequences to each of the five transcription factors using MEME and previously described binding sites. Used the program PASTER to search the genome for sequences that matched and visualized with the program CIS-ANALYST (developed by the authors). Using CIS-ANALYST analyzed the distribution of the sites and define windows that contained clusters of transcription factor binding sites.

Test of the Methods

Results Examined novel clusters with 15binding sites (or more) per 700bp. Identified 28 clusters that met this criteria - these sites contain binding sites for at least two of the factors. –23 fall in upstream regions –3 fall in intron regions

Results Examined the 49 genes that could be regulated by these sites using in situ hybridization and DNA microarray analysis. Ten of the 28 sites were upstream of in the first intron of anterior-posterior pattern expressed genes. –~35% correct predictions

The Giant (Gnt) gene

Conclusions (from papers) Clustering can be used to successfully determine cis-regulatory elements and can be applied to other systems. Clustering is more efficient when done using prior knowledge of transcription factor binding site(s). Computational identifications of cis-regulatory DNA regions improves when using two or more different classes of recognition sequences (motifs). “The grammar of the cis-regulatory code is clearly more complex than simply the density of transcription factor binding sites.” Berman et al. 2002

Conclusions (overall) Promoter prediction is a powerful tool that can be used for low cost screens for transcription regulatory sites. Success is going to depend on a number of factors: –the specific transcription factor (specificity of binding) –previous characterization –parameters used (window size) –annotation of the genome being used