The Influence of Alternative Splicing in Protein Structure The fact that gene number is not significantly different between mammals and some invertebrates.

Slides:



Advertisements
Similar presentations
A very short introduction (in plants)
Advertisements

EAnnot: A genome annotation tool using experimental evidence Aniko Sabo & Li Ding Genome Sequencing Center Washington University, St. Louis.
© Wiley Publishing All Rights Reserved. Using Nucleotide Sequence Databases.
Ab initio gene prediction Genome 559, Winter 2011.
Peter Tsai, Bioinformatics Institute.  University of California, Santa Cruz (UCSC)  A rapid and reliable display of any requested portion of genomes.
1 Computational Molecular Biology MPI for Molecular Genetics DNA sequence analysis Gene prediction Gene prediction methods Gene indices Mapping cDNA on.
1 Computational Molecular Biology MPI for Molecular Genetics DNA sequence analysis Gene prediction methods Gene indices Mapping cDNA on genomic DNA Genome-genome.
PROMoter SCanning/ANalysis tool. Goal Creating a tool to analyse a set of putative promoter sequences and recognize known and unknown promoters, with.
1 Gene Finding Charles Yan. 2 Gene Finding Genomes of many organisms have been sequenced. We need to translate the raw sequences into knowledge. Where.
CSE182-L12 Gene Finding.
Summary Protein design seeks to find amino acid sequences which stably fold into specific 3-D structures. Modeling the inherent flexibility of the protein.
EBI is an Outstation of the European Molecular Biology Laboratory. UniProt Jennifer McDowall, Ph.D. Senior InterPro Curator Protein Sequence Database:
Bioinformatics Alternative splicing Multiple isoforms Exonic Splicing Enhancers (ESE) and Silencers (ESS) SpliceNest Lecture 13.
Bioinformatics Unit 1: Data Bases and Alignments Lecture 3: “Homology” Searches and Sequence Alignments (cont.) The Mechanics of Alignments.
Lecture 12 Splicing and gene prediction in eukaryotes
Genome Annotation BCB 660 October 20, From Carson Holt.
Regulation of eukaryotic gene sequence expression
Alternative Splicing. mRNA Splicing During RNA processing internal segments are removed from the transcript and the remaining segments spliced together.
International Livestock Research Institute, Nairobi, Kenya. Introduction to Bioinformatics: NOV David Lynn (M.Sc., Ph.D.) Trinity College Dublin.
Progress report Yiming Zhang 02/10/2012. All AS events in ASIP Intron retention Exon skipping Alternative Acceptor site NAGNAG AltA Alternative Donor.
What is comparative genomics? Analyzing & comparing genetic material from different species to study evolution, gene function, and inherited disease Understand.
Chapter 2: From genes to Genomes. 2.1 Introduction.
COT 6930 HPC and Bioinformatics Introduction to Molecular Biology Xingquan Zhu Dept. of Computer Science and Engineering.
Assignment 2: Papers read for this assignment Paper 1: PALMA: mRNA to Genome Alignments using Large Margin Algorithms Paper 2: Optimal spliced alignments.
1 The Interrupted Gene. Ex Biochem c3-interrupted gene Introduction Figure 3.1.
Part I: Identifying sequences with … Speaker : S. Gaj Date
DNA sequencing. Dideoxy analogs of normal nucleotide triphosphates (ddNTP) cause premature termination of a growing chain of nucleotides. ACAGTCGATTG ACAddG.
MPL Identification of alternative spliced mRNA variants related to cancers by genome-wide ESTs alignment KIM DAE SOO Oncogene Apr.
Fea- ture Num- ber Feature NameFeature description 1 Average number of exons Average number of exons in the transcripts of a gene where indel is located.
Exploring Alternative Splicing Features using Support Vector Machines Feature for Alternative Splicing Alternative splicing is a mechanism for generating.
Sackler Medical School
BLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio CS 466 Saurabh Sinha.
Introduction to Protein Structure Prediction BMI/CS 576 Colin Dewey Fall 2008.
Genome annotation and search for homologs. Genome of the week Discuss the diversity and features of selected microbial genomes. Link to the paper describing.
Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06.
A Non-EST-Based Method for Exon-Skipping Prediction Rotem Sorek, Ronen Shemesh, Yuval Cohen, Ortal Basechess, Gil Ast and Ron Shamir Genome Research August.
Genes and Genomes. Genome On Line Database (GOLD) 243 Published complete genomes 536 Prokaryotic ongoing genomes 434 Eukaryotic ongoing genomes December.
Chapter 2 From Genes to Genomes. 2.1 Introduction We can think about mapping genes and genomes at several levels of resolution: A genetic (or linkage)
Annotation of Drosophila virilis Chris Shaffer GEP workshop, 2006.
Comparative Genomics Methods for Alternative Splicing of Eukaryotic Genes Liliana Florea Department of Computer Science Department of Biochemistry GWU.
Research about Alternative Splicing recently 楊佳熒.
EBI is an Outstation of the European Molecular Biology Laboratory. UniProtKB Sandra Orchard.
On the biological significance of alternative splicing: a bioinformatics approach Sandro J. de Souza TDR, 07/05/2004 RNA 10: , 2004.
While replication, one strand will form a continuous copy while the other form a series of short “Okazaki” fragments Genetic traits can be transferred.
JIGSAW: a better way to combine predictions J.E. Allen, W.H. Majoros, M. Pertea, and S.L. Salzberg. JIGSAW, GeneZilla, and GlimmerHMM: puzzling out the.
Bioinformatics Workshops 1 & 2 1. use of public database/search sites - range of data and access methods - interpretation of search results - understanding.
Chapter 3 The Interrupted Gene.
Lesson Four Structure of a Gene. Gene Structure What is a gene? Gene: a unit of DNA on a chromosome that codes for a protein(s) –Exons –Introns –Promoter.
UCSC Genome Browser Zeevik Melamed & Dror Hollander Gil Ast Lab Sackler Medical School.
Annotation of eukaryotic genomes
Welcome to the combined BLAST and Genome Browser Tutorial.
Alternative Splicing. mRNA Splicing During RNA processing internal segments are removed from the transcript and the remaining segments spliced together.
Using public resources to understand associations Dr Luke Jostins Wellcome Trust Advanced Courses; Genomic Epidemiology in Africa, 21 st – 26 th June 2015.
Name the 4 gene mutations that can occur State the effect of gene mutations on amino acid sequences.
Features of the genetic code: Triplet codons (total 64 codons) Nonoverlapping Three stop or nonsense codons UAA (ocher), UAG (amber) and UGA (opal)
Alternative Splicing. mRNA Splicing During RNA processing internal segments are removed from the transcript and the remaining segments spliced together.
bacteria and eukaryotes
Using DNA Subway in the Classroom
Ab initio gene prediction
From: TopHat: discovering splice junctions with RNA-Seq
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Ensembl Genome Repository.
Chapter 4 The Interrupted Gene.
Protein structure prediction.
Volume 3, Issue 4, Pages (April 2013)
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Introduction to Alternative Splicing and my research report
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Splice isoforms of the JNK1, JNK2, and JNK3 proteins.
Presentation transcript:

The Influence of Alternative Splicing in Protein Structure The fact that gene number is not significantly different between mammals and some invertebrates suggests that other mechanisms are being used to generate diversity, such as alternative splicing (AS) and post-translational modifications. AS could be understood as a single gene originating different mRNA sequences which can occur by the use of alternative splice sites. The major types of AS are: intron retention (IR), alternative splice sites usage (AU), exon skipping (ES) and mutually exclusive exons. It is know that some AS variants are tissue-specific and/or associated with several diseases in humans, as cancer. However, AS can create thousand of mRNA sequences and their functional viability has been questioned. Some studies indicate that variants with a frame shift and/or premature stop codons will be degraded. Some suggested that a high number of ESTs/mRNAs supporting a variant correlates with its functionality while others use the comparison between human and other organisms (mouse, rat) to exclude not functional sequences. Many computational tools have been used to find and compare alternative splicing variants. Generally, cDNA, mRNA, ESTs and protein sequences that are public available are aligned against each other or against the genome to identify splicing isoforms. Most of this information is usually deposited in relational databases with open access. This can be used to join all sequence information related to variants as size, frame shift, insertions, deletions, repetitive elements and domains. Some previous studies correlated the effect of alternative splicing in protein structures. Of them, some are focused on protein families while others do not cover all possible protein modifications caused by alternative splice sites. So, there still exists a lack of information about the protein structure modifications as a consequence of alternative splicing. Alan Durham for Perl lessons and support, Pedro Galante for initial set of alternative splice cases, Joao Muniz for usefull Modeller tips PhD Bioinformatics program and CAPES for financial support Durham, E. H. A. B. 1, Garratt, R. C. 2, de Souza, S. J. 3 1 – Bioinformatics PhD student - University of São Paulo - Brazil 2 – Physics Insitute (São Carlos) - University of São Paulo - Brazil 3 - Ludwig Institute for Cancer Research – São Paulo Branch - Brazil D I S C U S S I O N This work brings an authomatical method to find proteins structures related to mRNA sequences modified by alternative splicing. The pipeline used here gives the precise position of the splice site in protein structure and assign it as an alternative or constitutive boundary. The assignment of the boundaries is a specially difficult task once, most of times, we did not find the alternative boundaries from genomic data in proteins structures. This study still intend to highligth the structural modifications caused by alternative splicing and each type of AS events. Besides it will be a detailed description of structures related to alternative splicing, including their dynamic behavior (Molecular Modeling and Dynamics in course) and one experimental structure (X-ray) in future studies. R E S U L T S In this study we intend to identify and distinguish human protein structures modified by alternative splicing. In order to do it, mRNAs and EST sequences from UCSC were mapped to the human genome using BLAT and SIM4. All mapped sequences were deposited in a local database and the splicing boundaries from all sequences from a gene were compared to identify splicing variants. Those variants were assigned as IR, AU or ES events. We constructed a pipeline where TBLASTN was performed between those variants ( mRNA and EST sequences) and a set of non-redundant PDB human sequences..Some BLAST parameters were carefully adjusted to allow gap opening and extension and identity was recalculated considering the gap size. Terminal regions without alignment were resubmitted to TBLASTN and the correct splice boundaries were assigned. Sequences with identity greater than 70% were included in our analysis, except for those containing stop codons. Initially, the non-redundant PDB structures were related to Unigene clusters allowing a directly association between the genes with alternative splicing sequences and their structural effects on proteins. Events in proteins were separated in insertion and deletion, depending of the splicing sequence alignment. Proteins with deletions presented donor and acceptor splice boundaries mapped into structures (716 Unigene clusters) while insertion had cases were related to structures (585 Unigene clusters). Other structural features were analyzed, as motility (measured through experimental B-factor values from PDB files) which is one feature expected to vary in determined regions of proteins was measured to deletion and insertion boundaries as to deleted regions. Spatial distance restraints between CA atoms which can be used by alternative splicing sequences to restraint the energy needed to fold a new protein, was measured in deleted regions and compare between the prototype and variant structures. Association between interaction regions (intra-protein and inter-protein) and diseases are also in course. I N T R O D U C T I O N A B O U T T H I S W O R K Aminoacids composition of alternative and constitutive boundaries Spatial distances of alternative boundaries deleted in protein structures T H A N K S Exposition and interaction of deleted alternative boundaries Contact Structural Units (interaction between chains) ProfBval (exposition and flexibility) Conditional Probability Pairs of aminoacids (boundaries) (A) Conditional probability for a pairs of aminoacids in human PDB data set (B) Comditional probability for alternative boundaries Pairs of aminoacids (boundaries) Conditional Probability B A Frequency of human protein aminoacids (black), constitutive (white) and alternative (grey) boundaries A C D E F G H I K L M N P Q R S T V W Y Aminoacids Frequency Distance (Angstron) Size (aminoacids) Distance (Angstron) Deleted size (aminoacids) (A) Distance of human protein regions with different sizes (B) Distance of deleted regions with alternative splice boundaries A B Alternative boundaries Random Draw Random Draw Alternative boundaries Exposed and rigid Total Exposed and rigid Interacting