Published by Anastasia Roberts. Modified over 9 years ago.
1
Alternative splicing: A playground of evolution
Mikhail Gelfand
Institute for Information Transmission Problems, RAS
May 2004
2
Alternative splicing of human (and mouse) genes
3
Alternative splicing of orthologous human and mouse genes
Sequence divergence in alternative and constitutive regions
Evolution of splicing sites
Alternative splicing and protein structure
4
Data
known alternative splicing
–HASDB (human, ESTs+mRNAs)
–ASMamDB (mouse, mRNAs+genes)
additional variants
–UniGene (human and mouse EST clusters)
complete genes and genomic DNA
–GenBank (full-length mouse genes)
–human genome
5
Methods
Direct comparison of EST-derived alternatives is difficult because of uneven coverage.
Instead, align alternative isoforms from one species to the genomic DNA of the other species.
An isoform is considered conserved if it is alignable: a complete exon or part of an exon, no significant loss of similarity, no in-frame stops, conserved splicing sites.
This is an upper estimate of conservation: an isoform may be non-functional for other reasons (e.g. disruption of regulatory sites).
Skipped exons cannot be analyzed.
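The conservation criteria above can be sketched in code. This is only an illustration, not the procedure used in the talk: the function names, the 0-based half-open coordinates, the `min_identity` cutoff standing in for "no significant loss of similarity", and the restriction to canonical AG/GT sites around an internal exon are all assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def has_inframe_stop(exon_seq, phase=0):
    """True if a stop codon occurs in the frame that starts at offset `phase`."""
    seq = exon_seq.upper()
    return any(seq[i:i + 3] in STOP_CODONS for i in range(phase, len(seq) - 2, 3))

def has_canonical_sites(genomic, start, end):
    """True if the exon genomic[start:end] is preceded by AG and followed by GT.

    Only meaningful for internal exons on the plus strand (assumption).
    """
    g = genomic.upper()
    return start >= 2 and g[start - 2:start] == "AG" and g[end:end + 2] == "GT"

def looks_conserved(genomic, start, end, identity, phase=0, min_identity=0.7):
    """Combine the three checks; `min_identity` is a hypothetical cutoff."""
    exon = genomic[start:end]
    return (identity >= min_identity
            and not has_inframe_stop(exon, phase)
            and has_canonical_sites(genomic, start, end))

# Toy example: a 6-nt exon flanked by AG ... GT in the other species' genomic DNA.
genomic = "CCAG" + "ATGGCC" + "GTAA"
print(looks_conserved(genomic, 4, 10, identity=0.9))
```

An exon that aligns well but carries an in-frame stop, or has lost a flanking AG or GT, would be reported as non-conserved by this sketch.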
6
Tools
TBLASTN (initial identification of orthologs: mRNAs against genomic DNA)
BLASTN (human mRNAs against genome)
Pro-EST (spliced alignment, ESTs and mRNA against genomic DNA)
Pro-Frame (spliced alignment, proteins against genomic DNA)
–confirmation of orthology: same exon-intron structure; >70% identity over the entire protein length
–analysis of conservation of alternative splicing: conservation of exons or parts of exons; conservation of sites
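The orthology confirmation step (same exon-intron structure, >70% identity over the entire protein length) might look like the following sketch. It does not use Pro-Frame or any real alignment tool: the inputs are a pre-computed gapped protein alignment and intron positions in alignment coordinates, and both the input format and the function names are assumptions made for illustration.

```python
def percent_identity(aln_a, aln_b):
    """Fraction of residues of protein A that are identical in the alignment.

    aln_a and aln_b are equal-length gapped strings ('-' marks a gap).
    The denominator is the full length of protein A (an assumption about
    how 'over the entire protein length' is measured).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(a == b and a != "-" for a, b in zip(aln_a, aln_b))
    length = sum(a != "-" for a in aln_a)
    return matches / length if length else 0.0

def confirmed_ortholog(aln_a, aln_b, introns_a, introns_b, min_identity=0.70):
    """True if intron positions coincide and identity exceeds the threshold."""
    same_structure = tuple(introns_a) == tuple(introns_b)
    return same_structure and percent_identity(aln_a, aln_b) > min_identity

# Toy example: one mismatch in ten residues, introns at the same positions.
print(confirmed_ortholog("MKTAYIAKQR", "MKTAYLAKQR", [3, 7], [3, 7]))
```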
7
166 gene pairs
Known alternative splicing: human 126 genes, mouse 124 genes
[Venn diagram: 42 human only, 84 both, 40 mouse only]
8
Elementary alternatives
Cassette exon
Alternative donor site
Alternative acceptor site
Retained intron
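The four elementary alternatives can be told apart by comparing one exon of an isoform with the exons of another isoform of the same gene. The sketch below is a simplified illustration under stated assumptions: exons are 0-based half-open `(start, end)` intervals on the plus strand, the reference isoform is sorted, and anything that does not fit one of the four patterns is reported as "complex". The function name and the labels are hypothetical.

```python
def classify(exon, other):
    """Classify `exon` relative to the sorted exon list `other` (plus strand)."""
    s, e = exon
    overlapping = [(a, b) for a, b in other if a < e and s < b]
    if not overlapping:
        # No overlap: a cassette exon if it lies between exons of the other isoform.
        if other and other[0][1] <= s and e <= other[-1][0]:
            return "cassette exon"
        return "complex"
    if len(overlapping) >= 2:
        # Spans from the start of one exon to the end of a later one.
        if s == overlapping[0][0] and e == overlapping[-1][1]:
            return "retained intron"
        return "complex"
    a, b = overlapping[0]
    if (s, e) == (a, b):
        return "constitutive"
    if s == a:
        return "alternative donor site"     # same acceptor, different 3' end
    if e == b:
        return "alternative acceptor site"  # same donor, different 5' end
    return "complex"

reference = [(0, 100), (200, 300), (400, 500)]
for exon in [(120, 150), (200, 320), (180, 300), (200, 500)]:
    print(exon, classify(exon, reference))
```

On the minus strand the donor and acceptor labels swap, which this sketch does not handle.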
9
Human genes
                      mRNA                EST
                      cons.  non-cons.    cons.  non-cons.
Cassette exons          56      25          74      26
Alt. donors             18       7          16      10
Alt. acceptors          13       5          19      15
Retained introns         4       3           5       0
Total                   96      30         114      51
Total genes             45      28          41      44
Conserved elementary alternatives: 69% (EST) - 76% (mRNA)
Genes with all isoforms conserved: 57 (45%)
10
Mouse genes
                      mRNA                EST
                      cons.  non-cons.    cons.  non-cons.
Cassette exons          70       5          39       9
Alt. donors             24       6          17       6
Alt. acceptors          15       6          16       9
Retained introns         8       7          10       4
Total                  117      24          82      28
Total genes             68      22          30      26
Conserved elementary alternatives: 75% (EST) - 83% (mRNA)
Genes with all isoforms conserved: 79 (64%)
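The "conserved elementary alternatives" percentages follow directly from the Total rows of the human and mouse tables. The short sketch below redoes that arithmetic; the variable and function names are made up for the example, and the counts are the totals quoted on the slides.

```python
# Total rows of the two tables: (conserved, non-conserved) elementary alternatives.
totals = {
    ("human", "mRNA"): (96, 30),
    ("human", "EST"): (114, 51),
    ("mouse", "mRNA"): (117, 24),
    ("mouse", "EST"): (82, 28),
}

def conserved_percent(cons, non_cons):
    """Share of conserved alternatives, in percent."""
    return 100.0 * cons / (cons + non_cons)

percentages = {key: round(conserved_percent(*counts)) for key, counts in totals.items()}

for (species, source), pct in percentages.items():
    print(f"{species} {source}: {pct}% conserved, {100 - pct}% non-conserved")
```

The complements of these values are the non-conserved shares quoted for human and mouse elementary alternatives.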
11
Real or aberrant non-conserved AS?
24-31% of human vs. 17-25% of mouse elementary alternatives are not conserved
55% of human vs. 36% of mouse genes have at least one non-conserved variant
denser coverage of human genes by ESTs:
–picks up rare (tissue- and stage-specific) => younger variants
–picks up aberrant (non-functional) variants
17-24% of mRNA-derived elementary alternatives are non-conserved (compared to 25-32% of EST-derived ones)
12
smoothelin
[Diagram rows: human, common, mouse]
human-specific donor site
mouse-specific cassette exon
13
autoimmune regulator
[Diagram rows: human, common, mouse]
retained intron; downstream exons read in two frames
14
Na/K-ATPase gamma subunit (Fxyd2)
[Diagram labels: human, mouse, common, (deleted) intron]
alternative acceptor site within (inserted) intron
15
MutS homolog (DNA mismatch repair)
[Diagram rows: human, common]
dual donor/acceptor site
16
Modrek and Lee, 2003:
conserved skipped exons:
–98% constitutive
–98% major form
–28% minor form
inclusion level:
–highly correlated
–good predictor of conservation
Minor non-conserved form exons are not aberrant:
–minor form exons are supported by multiple ESTs
–28% of minor form exons are upregulated in one specific tissue
–70% of tissue-specific exons are not conserved
Thanaraj et al., 2003: 61% (47-86%) of alternative splice junctions are conserved
17
Alternative splicing of orthologous human and mouse genes
Sequence divergence in alternative and constitutive regions
Evolution of splicing sites
Alternative splicing and protein structure
18
Our preliminary observations: less synonymous, more non-synonymous divergence in alternative exons (human/mouse) => positive selection towards variability
“Contrary to our prediction, synonymous divergence between humans and non-human mammals was significantly higher in constitutive exons … Intriguingly, non-synonymous divergence was marginally significantly higher in alternative exons” (Iida and Akashi, 2000)
19
279 proteins from SwissProt+TREMBL with “varsplic” features
               constitutive   alternative   % alt. to all
length            199270         66054          25%
all SNPs            1126           368          25%
synonymous       576 (51%)     167 (45%)        22%
benign           401 (36%)     141 (38%)        26%
damaging         149 (13%)      60 (16%)        29%
Again, there is some evidence of positive selection towards diversity.
This is not due to aberrant ESTs (only protein data are considered).
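The comparison in the SNP table can be restated as per-class fractions and a single odds ratio of damaging to synonymous SNPs in alternative versus constitutive regions. This is a minimal sketch for reading the table, not the analysis behind the slide; the dictionary layout and function names are assumptions, and no significance test is attempted.

```python
# SNP counts from the table, by region and predicted effect.
snps = {
    "constitutive": {"synonymous": 576, "benign": 401, "damaging": 149},
    "alternative": {"synonymous": 167, "benign": 141, "damaging": 60},
}

def class_fractions(counts):
    """Fraction of each SNP class within one region."""
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}

def odds_ratio(region_a, region_b, cls="damaging", ref="synonymous"):
    """(cls/ref in region_a) divided by (cls/ref in region_b)."""
    return (region_a[cls] / region_a[ref]) / (region_b[cls] / region_b[ref])

alt_fracs = class_fractions(snps["alternative"])
const_fracs = class_fractions(snps["constitutive"])
dam_vs_syn = odds_ratio(snps["alternative"], snps["constitutive"])

for cls in ("synonymous", "benign", "damaging"):
    print(f"{cls}: constitutive {const_fracs[cls]:.0%}, alternative {alt_fracs[cls]:.0%}")
print(f"damaging/synonymous odds ratio (alt vs const): {dam_vs_syn:.2f}")
```

An odds ratio above 1 points in the direction the slide describes (relatively more damaging SNPs in alternative regions); whether the excess is statistically significant would need a proper test.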
20
Alternative splicing of orthologous human and mouse genes
Sequence divergence in alternative and constitutive regions
Evolution of splicing sites
Alternative splicing and protein structure
21
Alternative splicing in a multigene family: the MAGEA family of cancer/testis specific antigens
A locus at the X chromosome containing eleven recently duplicated genes: two subfamilies of four genes each and three single genes
One protein-coding exon, multiple different 5’-UTR exons
Originates from retroposed spliced mRNA
Mutations create new splicing sites or disrupt existing sites
22
Phylogenetic trees (protein-coding and upstream regions)
23
Expression data
pooled by organ/tissue; maximum recorded expression level retained
no data for MAGEA10; MAGEA3 and MAGEA6 likely indistinguishable
green: normal; brown: cancer
24
Simple genes with alternatives in exon 1 (MAGEA1, MAGEA5, MAGEA3/6)
[Figure: exon diagrams with variants of exon 1 (1, 1a, 1b) for MAGEA1, MAGEA5 (normal placenta), MAGEA3, and MAGEA6 (testis, brain/medulla, cancer)]
25
Two more genes of subfamily B: multiple isoforms of MAGEA2 and a deletion in MAGEA12
[Figure: exon diagrams of the MAGEA2 isoforms and of MAGEA12]
26
Isoforms of subfamily A
[Figure: exon diagrams for MAGEA8, MAGEA9 (testis, no cancers), MAGEA10, and MAGEA11]
27
Multiple duplications of the initial exon in MAGEA4
[Figure: exon diagrams of the MAGEA4 isoforms (testis and cancers; brain/medulla; also common 3’ ESTs in placenta)]
28
Chimaeric mRNAs (splicing of readthrough transcripts)
[Diagram 1: initial exon of MAGEA10; exons of MAGEA5; exon in intergenic space]
[Diagram 2: initial exon of MAGEA12; exons of BC013171; exon in intergenic space]
29
Other examples:
galactose-1-phosphate uridylyltransferase + interleukin-11 receptor alpha chain (Magrangeas et al., 1998)
P2Y11 [receptor] + SSF1 [nuclear protein] (Communi et al., 2001)
PrP [prion protein] + Dpl [prion-like protein Doppel] (Moore et al., 1999)
cytochrome P450 3A: CYP3A7 + two exons of a downstream pseudogene read in a different frame (Finta & Zaphiropoulos, 2000)
HHLA1 + OC90 [otoconin-90] (Kowalski et al., 1999)
TRAX [translin-associated factor X] + DISC1 [candidate schizophrenia gene] (Millar et al., 2000)
Kua + UEV1 [polyubiquitination coeffector] (Thomson et al., 2000)
FR + GAP [Rho GTPase activating protein] (Romani et al., 2003) - ?
methionyl tRNA synthetase + advillin (Romani et al., 2003) - ?
30
Birth of donor sites (new GT in alternative initial exon 5)
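A point mutation that creates a GT dinucleotide is the minimal event behind the birth of a donor site. The sketch below lists positions where a motif is present in one aligned sequence but absent at the same position in another. It is an illustration only: the sequences are toy strings rather than MAGEA data, the alignment is assumed to be ungapped, and a new GT is at most a candidate site, since a functional donor also needs a reasonable match to the rest of the consensus.

```python
def new_motif_positions(ancestral, derived, motif="GT"):
    """Positions where `motif` occurs in `derived` but not in `ancestral`.

    Both sequences must be aligned without gaps and have equal length.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned and of equal length")
    a, d = ancestral.upper(), derived.upper()
    k = len(motif)
    return [i for i in range(len(d) - k + 1)
            if d[i:i + k] == motif and a[i:i + k] != motif]

# Toy example: a C->T substitution turns GC into GT at position 2.
print(new_motif_positions("AAGCAAG", "AAGTAAG"))
```

The same function with `motif="AG"` would flag candidate new acceptor dinucleotides.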
31
Birth of an acceptor site (new AG and polyY tract in MAGEA8-specific cassette exon 3)
32
Birth of an alternative donor site (enhanced match to the consensus (AG) in cassette exon 2)
33
Birth of an alternative acceptor site (enhanced polyY tract in cassette exon 4)
34
Inactivation of a donor site and birth of a new site (non-consensus G and new GT in major-isoform cassette exon 4)
35
Series of mutations sequentially activating downstream acceptor sites (mutated AG in exon 4)
36
Alternative splicing of orthologous human and mouse genes
Sequence divergence in alternative and constitutive regions
Evolution of splicing sites
Alternative splicing and protein structure
37
Data
Alternatively spliced genes (proteins) from SwissProt
–human
–mouse
Protein structures from PDB
Domains from InterPro
–SMART
–Pfam
–Prosite
–etc.
38
Alternative splicing avoids disrupting domains (and non-domain units)
Control: fix the domain structure; randomly place alternative regions
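The control described here (fixed domain structure, randomly placed alternative regions) is a randomization test. The sketch below shows one way such a control could be set up; it is not the procedure from the talk. The definition of "disrupting" (the region overlaps a domain without covering it entirely), the uniform placement, and all names are assumptions.

```python
import random

def disrupts(region, domains):
    """True if the region overlaps some domain without covering it entirely."""
    s, e = region
    for a, b in domains:
        overlaps = s < b and a < e
        covers = s <= a and b <= e
        if overlaps and not covers:
            return True
    return False

def expected_disruption(protein_len, domains, region_len, trials=10000, seed=0):
    """Fraction of uniformly placed regions of `region_len` that disrupt a domain."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        start = rng.randint(0, protein_len - region_len)  # inclusive bounds
        hits += disrupts((start, start + region_len), domains)
    return hits / trials

# Toy protein: 300 aa with two domains; a 30-aa alternative region.
domains = [(20, 120), (180, 260)]
print("observed region disrupts a domain:", disrupts((125, 155), domains))
print("expected under random placement:", expected_disruption(300, domains, 30))
```

Comparing the observed fraction of disrupting regions across many proteins with the randomized expectation gives the kind of expected-versus-observed contrast the slide refers to.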
39
… and this is not simply a consequence of the (disputed) exon-domain correlation
40
Positive selection towards domain shuffling (not simply avoidance of disrupting domains)
41
Short (<50 aa) alternative splicing events within domains target protein functional sites
[Figure: expected vs. observed counts of Prosite patterns (unaffected, affected) and FT positions (unaffected, affected)]
42
An attempt at integration
AS is often young (as opposed to degenerating)
young AS isoforms are often minor and tissue-specific
… but still functional
–although unique isoforms may be the result of aberrant splicing
AS regions show evidence for positive selection
–excess damaging SNPs
–excess non-synonymous codon substitutions
MAGEA - not aberrant, because explainable by effects of mutations
43
What to do
Each isoform (alternative region) can be characterized:
–by conservation (between genomes)
–if conserved, by selection (positive vs negative): human-mouse, also add rat
–pattern of SNPs (synonymous, benign, damaging)
–tissue-specificity; in particular, whether it is cancer-specific
–degree of inclusion (major/minor)
–functionality (for isoforms): whether it generates a frameshift; how bad it is (the distance between the stop-codon and the last exon-exon junction)
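The two functionality measures in the list (frameshift, and the distance between the stop codon and the last exon-exon junction) are simple to compute once an isoform is given in mRNA coordinates. The sketch below assumes a list of exon lengths and the mRNA position of the end of the stop codon; the names are hypothetical, and the 50-nt default is only a commonly quoted rule of thumb for nonsense-mediated decay, not a value taken from the slides.

```python
from itertools import accumulate

def causes_frameshift(alt_region_len):
    """An inserted or skipped region shifts the frame unless its length is a multiple of 3."""
    return alt_region_len % 3 != 0

def stop_to_last_junction(exon_lengths, stop_end):
    """Distance (nt) from the end of the stop codon to the last exon-exon junction.

    Positive if the stop lies upstream of the last junction, negative if it
    lies in the last exon; None for single-exon transcripts.
    """
    if len(exon_lengths) < 2:
        return None
    junctions = list(accumulate(exon_lengths))[:-1]  # mRNA coordinates of junctions
    return junctions[-1] - stop_end

def likely_nmd_target(exon_lengths, stop_end, threshold=50):
    """Flag isoforms whose stop codon is more than `threshold` nt upstream of the last junction."""
    distance = stop_to_last_junction(exon_lengths, stop_end)
    return distance is not None and distance > threshold

exons = [100, 200, 150]               # junctions at mRNA positions 100 and 300
print(causes_frameshift(200))          # 200 is not a multiple of 3
print(stop_to_last_junction(exons, 200))
print(likely_nmd_target(exons, 200))
```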
44
What to expect (hypotheses)
Cancer-specific isoforms will be less functional and more often non-conserved
The set of non-conserved isoforms will contain a larger fraction of non-functional isoforms, and this may influence evolutionary conclusions
Still, after removal of non-functional isoforms, one should see positive selection in alternative regions (more non-synonymous substitutions compared to constant regions etc.), especially in tissue-specific ones
45
Plans
careful and detailed analysis of human-mouse-(rat)-((dog)) AS isoforms (human and mouse ESTs)
conservation of AS regulatory sites
mosquito-drosophila
more families of paralogs; add mouse data
AS of transcription factors and receptors
46
Acknowledgements
Discussions
–Vsevolod Makeev (GosNIIGenetika)
–Eugene Koonin (NCBI)
–Igor Rogozin (NCBI)
–Dmitry Petrov (Stanford)
Support
–Ludwig Institute of Cancer Research
–Howard Hughes Medical Institute
47
Authors
Andrei Mironov (GosNIIGenetika) – spliced alignment
Shamil Sunyaev (EMBL, now Harvard University Medical School) – protein structure
Vasily Ramensky (Institute of Molecular Biology) – SNPs
Irena Artamonova (Institute of Bioorganic Chemistry) – human/mouse comparison, MAGEA family
Dmitry Malko (GosNIIGenetika) – mosquito/drosophila comparison
Eugenia Kriventseva (EBI, now BASF) – protein structure
Ramil Nurtdinov (Moscow State University) – human/mouse comparison
Ekaterina Ermakova (Moscow State University) – evolution of alternative/constitutive regions
48
References
Nurtdinov RN, Artamonova II, Mironov AA, Gelfand MS (2003) Low conservation of alternative splicing patterns in the human and mouse genomes. Human Molecular Genetics 12: 1313-1320.
Kriventseva EV, Koch I, Apweiler R, Vingron M, Bork P, Gelfand MS, Sunyaev S (2003) Increase of functional diversity by alternative splicing. Trends in Genetics 19: 124-128.
Brudno M, Gelfand MS, Spengler S, Zorn M, Dubchak I, Conboy JG (2001) Computational analysis of candidate intron regulatory elements for tissue-specific alternative pre-mRNA splicing. Nucleic Acids Research 29: 2338-2348.
Dralyuk I, Brudno M, Gelfand MS, Zorn M, Dubchak I (2000) ASDB: database of alternatively spliced genes. Nucleic Acids Research 28: 296-297.
Mironov AA, Fickett JW, Gelfand MS (1999) Frequent alternative splicing of human genes. Genome Research 9: 1288-1293.