Clustered Repeats and Regulatory Sites Abdulrahman Alazemi, Shahroze Abbas, Liam Lewis, Donald Ta, Ann Vo.

Overview
- Clustered repeats
  - Wide variety of functions
  - History largely unknown
- Regulatory sites
  - A segment capable of altering expression of specific genes
  - Various classifications of regulatory sequences
  - Found in non-coding regions
  - Function at the transcriptional level

Identification
- Consensus sequences
  - Utilize a PSSM (position-specific scoring matrix)
  - How?
  - Determine consensus
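As a rough illustration of how a PSSM and a consensus can be derived from aligned sites, here is a minimal Python sketch. The example sites, the pseudocount of 1, and the uniform background of 0.25 are assumptions made for illustration; they are not taken from the slides.

```python
import math
from collections import Counter

def build_pssm(sites, alphabet="ACGT", pseudocount=1.0, background=0.25):
    """Return (pssm, consensus) for a list of equal-length aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(alphabet)
    pssm, consensus = [], []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        # Log-odds score of each base against a uniform background (assumption).
        column = {
            base: math.log2(((counts[base] + pseudocount) / total) / background)
            for base in alphabet
        }
        pssm.append(column)
        # The consensus base is the most frequent base in the column.
        consensus.append(max(alphabet, key=lambda base: counts[base]))
    return pssm, "".join(consensus)

def score(pssm, candidate):
    """Sum the per-position log-odds scores of a candidate site."""
    return sum(column[base] for column, base in zip(pssm, candidate))

# Hypothetical aligned sites, for illustration only.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
pssm, consensus = build_pssm(sites)
print(consensus)
print(round(score(pssm, "TATAAT"), 2), round(score(pssm, "GGGGGG"), 2))
```

A candidate that matches the consensus scores higher than an unrelated sequence, which is the property a PSSM scan relies on.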

Clustered repeats and potential regulatory sequences Abdulrahman Alazemi

Background; Transcription factors? Activators Vs Repressors.

Thoughts; - My question; Can I apply a known method/tool with known results to other phages and get the same or similar results?

The known case; Examine the proven Repressor and Cro binding sites (operators) of phage Lambda. Bioinformatics method in the notes. First; “Name of Lambda in BioBike”. Go to BioBike/PhAnToMe.

The known case; Second; Apply Motifs-In to the upstream sequences of phage Lambda. Options; Labeled, DNA, Multiple-Hits-ok.

The known case; Results of Motifs-In. More than one interesting case; Motifs 1, 2, 3.

The known case; BioBike function; Description-analysis submenu, Genes-proteins menu. Now we have an idea about where to look.

The known case; Used the function sequence-of from the Genome menu to go to the specific region in the genome.

The known case; Finding the operators. Directly Vs inversion of.
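Searching "directly" versus on the inverted sequence can be sketched as follows. This is a minimal Python stand-in, not the BioBike function itself, and the region and site strings are hypothetical placeholders rather than real Lambda operator sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def find_on_both_strands(region, site):
    """Return sorted (position, strand) hits; positions are 0-based on the given strand."""
    region, site = region.upper(), site.upper()
    hits = []
    for strand, query in (("+", site), ("-", reverse_complement(site))):
        start = region.find(query)
        while start != -1:
            hits.append((start, strand))
            start = region.find(query, start + 1)
    return sorted(hits)

# Hypothetical sequences, for illustration only.
region = "AAGATTCCTTGAATCAA"
site = "GATTC"
print(find_on_both_strands(region, site))  # → [(2, '+'), (10, '-')]
```

A perfectly palindromic site is its own reverse complement, so it is reported once per strand at the same position.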

Phage Lambda map;

Thoughts; - My concern/focus; Would I find some sort of generality between operators of different phages?

My experiment;
- Twenty-one random phages from different phage families.
- Eight of them don't have repressors (eliminated).
- Three of the 13 phages left didn't display a map because of a linear amplicon.
- Ten phages out of 21 went through all the steps of the method/tool successfully and gave back outcomes that I can work with.
- Three out of 10 have results similar to phage Lambda.

Outcome analysis; - Similar to phage lambda: -Bacillus-phage-1

Phage Bacillus-phage-1 map;

Outcome analysis; -Similar to phage Lambda: -Listeria-phage-A006

Phage Listeria-phage-A006 map;

Outcome analysis; -Similar to phage lambda: -Lactobacillus-johnsonii-prophage-Lj928

Phage Lactobacillus-johnsonii-prophage-Lj928 map;

First conclusion;
- Out of 21 (or 10) phages, only 3 are similar to phage Lambda.
- Less than 50%.
- No generality.
- Appropriate conclusion; Phage Lambda, Bacillus-phage-1, Listeria-phage-A006, and Lactobacillus-johnsonii-prophage-Lj928 share a similarity/generality between the operators that their repressors bind to.

Inspiration; - Dead end. - The articles! - Extend my research. - Look for something interesting.

First interesting case;
- In Burkholderia-phage-Bcep1.
- Six similar sequences in one intergenic region.
- Another 6 similar sequences in another intergenic region.
- Palindromic sequences.
- 6 or 3 sequences?
- Bacillus-phage-1 is similar to Burkholderia-phage-Bcep1 somehow.

First interesting case;

Phage Burkholderia-phage-Bcep1 map;

Second interesting case;
- In phage Clostridium-phage-39-O.
- An eight-nucleotide sequence (TTACTACA) is repeated 10 times in one intergenic sequence.
- The same sequence is repeated 8 times in another intergenic sequence elsewhere on the phage.
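Counting how often a short repeat occurs in an intergenic region, and how far apart the copies are, can be sketched in a few lines of Python. The repeat TTACTACA comes from the slide; the region below is a made-up toy sequence, not the actual Clostridium-phage-39-O intergenic sequence.

```python
def repeat_positions(region, repeat):
    """Return the 0-based start positions of every occurrence of repeat in region."""
    region, repeat = region.upper(), repeat.upper()
    positions = []
    start = region.find(repeat)
    while start != -1:
        positions.append(start)
        start = region.find(repeat, start + 1)  # step by 1 so overlapping copies count
    return positions

repeat = "TTACTACA"
# Hypothetical toy region: three copies separated by "GG" spacers.
region = "GG".join([repeat] * 3)

positions = repeat_positions(region, repeat)
spacing = [b - a for a, b in zip(positions, positions[1:])]
print(len(positions), positions, spacing)  # → 3 [0, 10, 20] [10, 10]
```

Regular spacing between copies would point to a periodic cluster, while irregular spacing would not.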

Second interesting case;

Phage Clostridium-phage-39-O map;

Conclusion; - Goals; - Pick one interesting case. - Research it. - Make sense of it.

A. Comparison of Pseudomonas putida and Azotobacter REP sequences Donald Ta

REPs
- Repetitive Extragenic Palindromic sequences
  - Found in abundance mainly in Enterobacteriaceae
- Can be anywhere from about 20 to 40 nt long
- Clustered into structures called BIMEs (bacterial interspersed mosaic elements) as two inverted tandem repeats separated by a short linker of variable length

What do REPs do?
- Regulate gene expression
- Structure DNA
- Serve as specific target sites for bacterial insertion sequences
- Possibly more functions that are still undiscovered

Previous Study
- I. Aranda-Olmedo (2002) used BLAST (Basic Local Alignment Search Tool) to find regions of local similarity between sequences downloaded from the National Center for Biotechnology Information (NCBI)
- Used a database with all contigs of Pseudomonas putida already available at The Institute for Genomic Research
- Developed their own program to screen all of the strains against the 35 nt sequence 5’-CCGGCCTCTTCGCGGGTAAGCCCGCTCCTACAGGG-3’

Results of that Study

Implications from that study
- They suggest that the 35 bp element they found is species-specific in P. putida
- First time that REP sequences have been described and characterized in a group of non-Enterobacteriaceae

What am I doing?
- Comparing the REP sequence element of Pseudomonas putida KT2440 with Azotobacter vinelandii
  - Why?
  - Same order: Pseudomonadales
- Used the REP element that is most common among Pseudomonas species: “GCGGGnnnnCCCGC”

Methods
- Used built-in functions of BioBike to scan a sequence for possibly loose matches of a pattern
- The pattern “****GCGGG****CCCGC****” was iterated over the sequence of the organism of interest, and every match was displayed in the output
- “*” means an unspecified nucleotide
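The pattern scan described above can be approximated outside BioBike with a regular expression. The sketch below is a stand-in built on Python's re module, not the BioBike function used in the study; the helper name scan_pattern is hypothetical. It is run here on the 35 nt P. putida sequence quoted in the previous study.

```python
import re

def scan_pattern(sequence, pattern, wildcard="*"):
    """Return (start, match) for every hit; each wildcard matches any one nucleotide."""
    regex = "".join(
        "[ACGT]" if ch == wildcard else re.escape(ch) for ch in pattern.upper()
    )
    # A lookahead is used so that overlapping hits are also reported.
    return [
        (m.start(), m.group(1))
        for m in re.finditer(f"(?=({regex}))", sequence.upper())
    ]

rep_35nt = "CCGGCCTCTTCGCGGGTAAGCCCGCTCCTACAGGG"
hits = scan_pattern(rep_35nt, "****GCGGG****CCCGC****")
print(hits)  # → [(7, 'CTTCGCGGGTAAGCCCGCTCCT')]
```

Applying the same function to a whole genome string would list every candidate REP core together with its four flanking nucleotides on each side.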

Findings
- 52 sequence hits in Azotobacter vinelandii that appear to have the same conserved region found in Pseudomonas putida
- The species share similar REP elements with the same conserved central palindromic region “GCGGG****CCCGC”

Output

Significance
- REP sequences are found in abundance mainly in Enterobacteriaceae
- A study by Bao Ton-Hoang (2012) suggested that transposases could have been responsible for the proliferation of REP sequences in the genomes of bacteria in Enterobacteriaceae
- Possibly suggests a similar origin of REP sequences/elements for Pseudomonas and Enterobacteriaceae?

Problems?
- Found 2 hits in E. coli K-12 that had the REP element
  - Maybe suggests similar origin?
- Could be just a fluke that these two organisms share the same REP element in abundance
  - Past study found 804 REP sequences with that REP element in Pseudomonas putida
  - I found 52 in Azotobacter vinelandii

Possible plans for the future/near future?
- Compare with other bacteria in the order Pseudomonadales to see if I get similar results
- Possibly try to find a link to how REP sequences started proliferating in bacteria outside of Enterobacteriaceae

Positional Preference of Rho-Independent Transcriptional Terminators in E. coli Ann Vo

Transcriptional Terminators
- Rho-independent
- Specific activities poorly understood
- Occurs in ssDNA and RNA
- Unique characteristics:
  - T-tract: nt
  - GC-rich stem: 4-18 nt

Transcriptional Terminators
- Available algorithms:
  - RNAMotif
  - TransTermHP
  - ARNold
- About 317 natural terminators found in E. coli
- Lai et al. (2013) found a positional preference for other regulatory sequences
- Do transcriptional terminators have a positional preference relative to the end of the gene?

ARNold
- Erpin
  - Scores input sequences
  - Compares against 1,200 known terminators from Bacillus subtilis and Escherichia coli
- RNAMotif
  - Uses descriptors to find possible terminators
  - Scores free energy of hairpin formation

Matching Sequences
- BioBIKE/PhAnToMe
  - Extracted the 50 nucleotides following every gene
- Python
  - Compared sequences to terminators
  - Calculated distance to terminator
- Counts
  - ARNold: 3248 possible terminators
  - BioBIKE: 5341 downstream sequences
  - Python: 126 terminators
- Example
  - Downstream sequence: CAGGACGGTTTACCGGGGAGCCATAAACGGCTCCCTTTTCATTGTTATCA
  - Terminator: ACGGTTTACCGGGGAGCCATAAACGGCTCCCTTTTCATTGTTA
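The Python matching step might look something like the sketch below. It is an illustrative reconstruction, not the script used for the slides: the function name, the dictionary layout, and the gene label "gene-1" are assumptions. The two sequences are the example pair shown on the slide.

```python
def locate_terminators(downstream_by_gene, terminators):
    """Return (gene, terminator, distance) for each terminator found downstream of a gene.

    distance is the number of nucleotides between the end of the gene and the
    first base of the terminator (0 means the terminator starts immediately).
    """
    hits = []
    for gene, downstream in downstream_by_gene.items():
        for terminator in terminators:
            distance = downstream.find(terminator)
            if distance != -1:
                hits.append((gene, terminator, distance))
    return hits

# "gene-1" is a hypothetical label; the sequences are the slide's example pair.
downstream_by_gene = {
    "gene-1": "CAGGACGGTTTACCGGGGAGCCATAAACGGCTCCCTTTTCATTGTTATCA",
}
terminators = ["ACGGTTTACCGGGGAGCCATAAACGGCTCCCTTTTCATTGTTA"]

hits = locate_terminators(downstream_by_gene, terminators)
for gene, terminator, distance in hits:
    print(gene, distance)  # → gene-1 4
```

Collecting the distance values over all matched genes gives the distribution from which a positional preference could be judged.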

Conclusion
- Terminators appear to exhibit some degree of positional preference
- Reasons remain unclear
- Further studies:
  - Length of terminator
  - Function of operons

References
- Chen, Ying-Ja, et al. “Characterization of 582 Natural and Synthetic Terminators and Quantification of Their Design Constraints.” Nature Methods 10.7 (2013): 659–64. Web. 20 Mar
- Ermolaeva, M. D., et al. “Prediction of Transcription Terminators in Bacterial Genomes.” Journal of Molecular Biology (2000): 27–33. Web. 4 Apr
- Kingsford, Carleton L., Kunmi Ayanbule, and Steven L. Salzberg. “Rapid, Accurate, Computational Discovery of Rho-Independent Transcription Terminators Illuminates Their Relationship to DNA Uptake.” Genome Biology 8.2 (2007): R22. Web. 17 Apr
- Lai, Fu-Jou, et al. “Identifying Functional Transcription Factor Binding Sites in Yeast by Considering Their Positional Preference in the Promoters.” PLoS ONE 8.12 (2013): e Web. 10 Apr
- Lau, Lester F., et al. “A Potential Stem-Loop Structure and the Sequence CAAUCAA in the Transcript Are Insufficient to Signal ρ-Dependent Transcription Termination at λtR1.” 12.2 (1984): 1287–1299. Print.
- Macke, T. J., et al. “RNAMotif, an RNA Secondary Structure Definition and Search Algorithm.” Nucleic Acids Research (2001): 4724–35.
- Mooney, Rachel Anne, and Robert Landick. “Building a Better Stop Sign: Understanding the Signals That Terminate Transcription.” Nature Methods 10.7 (2013): 618–619. Web. 21 Mar
- Naville, Magali, et al. “ARNold: A Web Tool for the Prediction of Rho-Independent Transcription Terminators.” RNA Biology 8.1 (2011): 11–13. Web. 8 Apr

Resemblances and differences between promoter sequences in E. coli and S. enterica Liam Lewis

Inspiration
- Novel sequence-based method for identifying transcription factor binding sites in prokaryotic genomes
- Results found promoters with high probability

Background of Promoter Sequences
- Regulatory elements
- -35 and -10 consensus sequences
- Sigma factor + RNA polymerase

Program used to identify promoters
- PePPER
- Uses PSSMs and hidden Markov models
- Algorithm is universal for prokaryotes

BioBIKE implementation
- BioBIKE to compare both outputs from PePPER

What’s next?
- Comparison of results
- BioBIKE algorithm to accurately predict promoters

Comparison of Repressor-Operator Sequences in Lambda and other Temperate Phages Shahroze Abbas

Repressor Sequences in Lambda
- Two possible life cycles, dependent upon either the Lambda repressor or the Cro repressor
- cI repressor maintains the lysogenic state
- Cro repressor initiates a switch to the lytic state
- Significance of intergenic sequences and neighboring genes for determining ‘hypothetical proteins’ in other, similar organisms

Comparison of Repressor Sequence
- Lambda repressor sequence tested for occurrence in other phages:
  - Enterobacteria phage Sfl
  - Enterobacteria phage HK244
  - Enterobacteria phage HK542
  - Enterobacteria phage HK544
  - Enterobacteria phage HK 106
  - Enterobacteria phage CL707

Still to come…
- Analysis of operator sequences
- Comparison of Cro repressor in other phages
- Trend or pattern to determine function of neighboring proteins in other phages
- Trend or pattern in sequences between phages