Published by Lawrence Norton. Modified over 8 years ago.
1
On the biological significance of alternative splicing: a bioinformatics approach
Sandro J. de Souza
TDR, 07/05/2004
RNA 10:757-765, 2004
2
Genomics Bioinformatics Large-scale Biology
3
The Real Revolution
Early 20th century: Mendel and the inheritance laws
Mid 20th century: DNA as the genetic element (Avery)
Mid 20th century: Watson and Crick and the structure of DNA
70’s and 80’s: Molecular biology/biotechnology
90’s and 21st century: Genomics and Bioinformatics
Paradigm in Biology: Evolution by means of natural selection (Darwin and Wallace, mid 19th century)
4
Bioinformatics
Development of tools
Gateway to explore new datasets
Processing of data derived from large-scale projects
A new way to do hypothesis-driven science
6
Splicing (1977) Roberts and Sharp (Nobel 1993)
7
[Figure: gene structure and mRNA. Labels: Exons, Introns, mRNA, Coding, Non-coding]
10
Splicing
Splicing depends on recognition of exon-intron boundaries.
Splice sites are generic and consist solely of:
5’ boundary
3’ boundary
Acceptor site
Polypyrimidine tract
11
.....if they occur at the boundaries of the regions to be spliced out, can change the splicing pattern, resulting in the deletion or addition of whole sequences of amino acids. Walter Gilbert. Why genes in pieces. Nature 271:501, 1978.
12
At least half of all human genes undergo alternative splicing.
Biological significance or spurious events?
13
Alternative splicing
1. Chromosomal ratio activates transcription of Sxl in females only.
2. SXL controls splicing of tra mRNA.
3. Females: exon 2 (which has a stop codon) is removed via SXL. Males: exon 2 is not removed.
4. Males: no active TRA. Females: TRA is made.
5. TRA directs splicing of dsx mRNA in a specific manner; in males default splicing occurs.
14
Alternative Splicing: Auditory Hair Cells
Sound frequency → cytosolic Ca2+ concentration → K+ channel opens. Therefore Ca2+ concentration ‘decodes’ frequency.
The Ca2+ concentration at which the K+ channel opens depends on alternative splicing of the K+ channel (576 possible alternative splicing combinations).
[Figure: K+ channel in the plasma membrane (PM), with the cytosol labeled; dotted lines show regions of the protein dependent on splicing. Sequence labels: AVSGRK and AVSGRKAMFARYVPEIAALILNRKKYGGTFNSTRGRK]
Picture of human cochlear hair cells from http://www.sickkids.on.ca/otolaryngology/Hearloss.asp
15
Types of alternative splicing:
Exon skipping
Intron retention
Alternative 5’ splice site
Alternative 3’ splice site
[Figure: mRNA diagrams, 5´ to 3´, illustrating each type]
16
Large-scale analysis of intron retention in the human transcriptome Pedro F.A. Galante, Noboru Jo Sakabe, Natanja Slager, Sandro J. de Souza
17
Examples of intron retention events with biological significance
Msl2 in Drosophila
P element in Drosophila
Retroviruses
18
Immature B cells express membrane-bound Ig. Activation leads to production of the secreted form.
In immature B cells an intron containing an early translational stop signal is removed, yielding a long transcript. The additional sequence encodes a transmembrane region.
This intron is not removed in activated B cells, giving rise to a truncated (secreted) product.
[Figure: Ig gene in immature and activated B cells. Labels: Stop codons, Hydrophilic stretch, Hydrophilic tail, Transmembrane domain, Activation]
19
Intron retention and cancer
CD44: several tumors
Gastrin receptor: pancreas
Ret tyrosine kinase: pheochromocytomas
Fas receptor: T-cell lymphoma
20
Transcriptome Database EST data Known mRNAs SAGE data Genome Data
21
Genome-based cDNA clustering
[Figure: mRNA cluster aligned to genomic DNA. Labels: DNA, mRNA cluster, Exon 1, Exon 2, Exon 3]
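The clustering idea on this slide can be sketched as grouping transcripts whose genomic alignments overlap. This is only an illustrative sketch, not the pipeline used in the talk: the tuple layout, the transcript names, the coordinates, and the choice to ignore strand and exon structure are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical alignment record: (transcript_id, chromosome, start, end),
# with half-open genomic coordinates. Strand is ignored in this sketch.
Alignment = Tuple[str, str, int, int]


def cluster_by_overlap(alignments: List[Alignment]) -> List[List[str]]:
    """Group transcripts whose genomic spans overlap on the same chromosome."""
    clusters: List[List[str]] = []
    current: List[str] = []
    current_chrom: Optional[str] = None
    current_end = -1
    # Sort by chromosome, then start, so overlapping spans become neighbours.
    for tid, chrom, start, end in sorted(alignments, key=lambda a: (a[1], a[2], a[3])):
        if chrom == current_chrom and start < current_end:
            # Overlaps the running cluster: extend it.
            current.append(tid)
            current_end = max(current_end, end)
        else:
            # No overlap: close the running cluster and start a new one.
            if current:
                clusters.append(current)
            current = [tid]
            current_chrom = chrom
            current_end = end
    if current:
        clusters.append(current)
    return clusters


# Made-up example alignments (not data from the talk).
example = [
    ("mRNA_1", "chr17", 100, 900),
    ("EST_1", "chr17", 850, 1200),
    ("EST_2", "chr17", 5000, 5400),
    ("EST_3", "chr1", 120, 600),
]
clusters = cluster_by_overlap(example)
print(clusters)
```

A real pipeline would normally also require agreement on strand and on splice junctions before merging two transcripts into one cluster.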
22
Transcript Mapping P53
23
Types of Data
24
Dataset
Retention      Prototype
               Full length   EST     Total
Full length    640           691     1120
EST            2594          n.d.    2594
Total          2793          691     3127
25
Experimental validation
26
14% of all human genes show evidence of intron retention
Kan, States & Gish (2002)
36% of RefSeq database!
After sample statistics: 5%
27
Distribution of events along transcripts
               Elite group                 MGC
Events in      Observed     Expected      Observed     Expected
CDS            287 (53%)    502 (93%)     87 (52%)     155 (93%)
5’ UTR         84 (15%)     27 (5%)       15 (9%)      8 (5%)
3’ UTR         170 (32%)    12 (2%)       65 (39%)     4 (2%)
p << 0.005
This bias can be a product of:
Underreporting of sequences
Nonsense-mediated decay (NMD)
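One way to check the observed and expected counts on this slide is a chi-square goodness-of-fit test. The slide does not say which test produced the reported p-value, so the choice of test here is an assumption. With three categories there are 2 degrees of freedom, and for that case the chi-square tail probability is exactly exp(-x/2), so no statistics library is needed.

```python
import math
from typing import Sequence


def chi_square_gof(observed: Sequence[float], expected: Sequence[float]) -> float:
    """Pearson chi-square statistic: sum over categories of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df2(chi2: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 2 degrees of freedom.

    For df = 2 the survival function has the closed form exp(-x / 2).
    """
    return math.exp(-chi2 / 2.0)


# Counts copied from the slide, in the order CDS, 5' UTR, 3' UTR.
elite_observed = [287, 84, 170]
elite_expected = [502, 27, 12]
mgc_observed = [87, 15, 65]
mgc_expected = [155, 8, 4]

elite_chi2 = chi_square_gof(elite_observed, elite_expected)
mgc_chi2 = chi_square_gof(mgc_observed, mgc_expected)
elite_p = p_value_df2(elite_chi2)
mgc_p = p_value_df2(mgc_chi2)

print(f"elite group: chi2 = {elite_chi2:.1f}, p = {elite_p:.3g}")
print(f"MGC:         chi2 = {mgc_chi2:.1f}, p = {mgc_p:.3g}")
```

Under this assumed test both datasets give p-values far below 0.005, driven mostly by the excess of events in the 3’ UTR.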
28
2563 out of 3195 (80%) sequences with a retained intron had an exon/exon boundary downstream of the retention event.
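The criterion on this slide can be sketched as two interval checks. A retained intron is an intron of the prototype that lies entirely inside one exon of another transcript, and a downstream exon/exon boundary is any splice junction in that transcript beyond the retained intron. The coordinate convention, the function names, the toy exon coordinates, and the plus-strand assumption are all hypothetical; this is not the code used in the study.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open genomic interval [start, end)


def introns(exons: List[Interval]) -> List[Interval]:
    """Gaps between consecutive exons of one transcript."""
    ordered = sorted(exons)
    return [(a_end, b_start) for (_, a_end), (b_start, _) in zip(ordered, ordered[1:])]


def retained_introns(prototype: List[Interval], candidate: List[Interval]) -> List[Interval]:
    """Prototype introns that fall entirely inside a single candidate exon."""
    return [
        (start, end)
        for start, end in introns(prototype)
        if any(ex_start <= start and end <= ex_end for ex_start, ex_end in candidate)
    ]


def has_downstream_junction(candidate: List[Interval], retained: Interval) -> bool:
    """True if the candidate is spliced somewhere after the retained intron.

    Assumes a plus-strand transcript, so 'downstream' means larger coordinates.
    """
    return any(start >= retained[1] for start, _ in introns(candidate))


# Toy example: the prototype has three exons; the candidate keeps the first intron.
prototype_exons = [(0, 100), (200, 300), (400, 500)]
candidate_exons = [(0, 300), (400, 500)]

events = retained_introns(prototype_exons, candidate_exons)
supported = [has_downstream_junction(candidate_exons, ev) for ev in events]
print(events, supported)
```

A spliced junction downstream of the event suggests the sequence derives from a processed transcript and not from unspliced pre-mRNA or genomic contamination.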
29
Retained introns are shorter
P << 0.001
30
Domains encoded by retained introns
31
Number of domains entirely encoded by:
Retained introns only: 02
Exon-intron-exon: 31
Number of domains partially encoded by:
Retained introns only: 25
Exon-intron-exon: 10
32
Retained introns have a higher GC content
P << 0.001
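GC content is the fraction of G and C bases among the unambiguous bases of a sequence. A minimal sketch follows. The function name and the decision to skip ambiguous characters such as N are assumptions made here, and the input sequences are toy strings, not data from the talk.

```python
def gc_content(seq: str) -> float:
    """Fraction of G/C among the A, C, G, T bases of seq (0.0 if there are none)."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # skip N and other symbols
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


print(gc_content("GGCCAATT"))
print(gc_content("atgcNN"))
```

Comparing the distribution of this value between retained and non-retained introns would then need a two-sample test; the slide does not state which one was used.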
33
Did retained introns encode protein domains?
Only retained introns in the CDS were used.
Only retained introns defined by full-length mRNAs were used.
Protein sequences were searched against PFAM database.
34
Codon Usage
35
Conservation of intron retention in mouse cDNA sequences
40%-57% of all retained introns present a mouse hit
Identity of orthologous retained introns is 84%
Non-retained introns is 60%; Exons 87%
Mouse cDNA also corresponds to a retention variant: 26% - 10 out of 46
36
Frequency of stop codons
Stop codons: TAG, TGA, TAA
Expected: 1064
Found: 651 stop codons
p-value << 0.005
88 cases where the retention generates a putative truncated protein
[Figure: mRNA with exon, retained intron, CDS and stop positions marked over the example sequence]
TACTTGTGCGTAGTCCCCGCGATCTAACGCCACGATGGATGACACTGTGA
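Counting stop codons in a retained intron amounts to stepping through the sequence three bases at a time in a given reading frame. The sketch below applies that to the example sequence shown on the slide. The slide does not state which frame is the true reading frame, so all three forward frames are scanned; the function name and the 0-based positions are choices made for this illustration.

```python
from typing import Dict, List, Tuple

STOP_CODONS = {"TAG", "TGA", "TAA"}


def in_frame_stops(seq: str, frame: int = 0) -> List[Tuple[int, str]]:
    """Return (0-based position, codon) for every stop codon in the given frame."""
    seq = seq.upper()
    return [
        (i, seq[i:i + 3])
        for i in range(frame, len(seq) - 2, 3)  # only complete codons
        if seq[i:i + 3] in STOP_CODONS
    ]


# Example sequence copied from the slide.
slide_seq = "TACTTGTGCGTAGTCCCCGCGATCTAACGCCACGATGGATGACACTGTGA"

stops_by_frame: Dict[int, List[Tuple[int, str]]] = {
    frame: in_frame_stops(slide_seq, frame) for frame in range(3)
}
for frame, stops in stops_by_frame.items():
    print(frame, stops)
```

Comparing such counts against an expectation, as on the slide, additionally needs a background model for the expected stop codon frequency, which the slide does not specify.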
37
GC content for sequences upstream and downstream of the premature stop codon (88 cases)
[Figure: exon and retained intron, 5’ to 3’, with the stop codon marked. Labels in order: GC 58%, stop, GC 49%]
Are under selective pressure for coding potential
38
Why is the argument of ‘selection’ important?
As noted originally by Gilbert (1978), mutations that affect splicing can allow the production of new proteins without the loss of the original one.
If, however, the new variant has some biological significance, selection will act to maintain the function of this variant.
Therefore, there should not be any “negative selection” on this variant.
39
Intron Retention in Tumors
Tissue      T/N    IR
Breast      T      1.52*
            N      0.62
Prostate    T      1.45*
            N      0.44
Brain       T      2.52*
            N      3.16
Colon       T      0.85
            N      0.60
40
Towards a reliable set of intron retention events
w/ downstream spliced intron                  2563/3195    80%
w/ hit w/ mouse cDNAs*                        74/152       49%
encoding protein domains*                     47/151       31%
experimentally validated (both forms)         2/2
* full-length vs full-length set and retained intron entirely in the CDS
41
Second International Conference on Bioinformatics and Computational Biology
www.icobicobi.com.br
25-28/10/2004, Angra dos Reis
42
Group of Computational Biology
Sandro J. de Souza: tennis player
Helena Samaia: Research Assistant
Ana C. Pereira: Admin. Assistant
Maarten Leerkes: Ph.D student
Noboru Sakabe: Ph.D student
Maria Vibranovski: Ph.D student
Elza Helena: Ph.D student
Natanja Slager: Ph.D student
Pedro Galante: Ph.D student
Elisson C. Osorio: programmer
Jorge E. de Souza: Ph.D student
Rodrigo Soares: programmer
Andre Zaiats: system admin.