
1 Alternative splicing: A playground of evolution Mikhail Gelfand Research and Training Center for Bioinformatics Institute for Information Transmission Problems RAS, Moscow, Russia October 2008

2 % of alternatively spliced human and mouse genes by year of publication
Plot legend: Human (genome / random sample); Human (individual chromosomes); Mouse (genome / random sample); All genes; Only multiexon genes; Genes with high EST coverage

3 Roles of alternative splicing
Functional:
–creating protein diversity: ~30,000 genes, >100,000 proteins
–maintaining protein identity, e.g. membrane (receptor) and secreted isoforms; dominant negative isoforms; combinatorial (transcription factors, signaling domains)
–regulatory, e.g. via channelling to NMD
Evolutionary

4 Plan
Evolution of alternative exon-intron structure
–mammals: human compared to mouse and dog; mouse and rat compared to human and dog; paralogs
–dipteran insects: Drosophila melanogaster, D. pseudoobscura, Anopheles gambiae; many drosophilas
Evolutionary rates in constitutive and alternative regions
–human and mouse
–D. melanogaster and D. pseudoobscura
–many drosophilas
–human-chimpanzee vs. human SNPs
Alternative splicing and protein domains
Regulation of AS via conserved RNA structures

5 Elementary alternatives Cassette exon Alternative donor site Alternative acceptor site Retained intron

6 EDAS: a database of alternative splicing
Sources:
–human and mouse genomes
–GenBank
–RefSeq
Consider cassette exons and alternative splicing sites
Functionality: potentially translated vs. NMD-inducing elementary alternatives (in-frame stops, length not divisible by 3)
                          human      mouse
genes                     28957      31811
mRNA / cDNA               114624     215212
proteins                  91844      126797
ESTs                      4294590    3817531
all alternatives          51713      44030
elementary alternatives   31746      21329
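
The functionality criterion on this slide (in-frame stops, length not divisible by 3) can be illustrated with a minimal sketch. The function name `is_nmd_inducing` and the `phase` parameter are hypothetical and are not taken from EDAS; the sketch assumes the alternative segment is given as a DNA string on the coding strand and ignores codons that span the segment boundaries.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_nmd_inducing(alt_seq: str, phase: int = 0) -> bool:
    """Flag an elementary alternative as potentially NMD-inducing.

    alt_seq: sequence of the alternative segment (coding strand).
    phase:   offset (0, 1 or 2) of the first complete codon inside the
             segment; an assumption of this sketch.
    """
    seq = alt_seq.upper()
    # A length not divisible by 3 shifts the downstream reading frame.
    if len(seq) % 3 != 0:
        return True
    # Otherwise look for a stop codon in the reading frame.
    for i in range(phase, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return True
    return False
```

For example, `is_nmd_inducing("ATGTAAGGG")` is flagged because of the in-frame TAA, while `is_nmd_inducing("ATGAAA")` is not, since its only TGA is out of frame.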

8 Alternative exon-intron structure in the human, mouse and dog genomes
Human-mouse-dog triples of orthologous genes
We follow the fate of human alternative sites and exons in the mouse and dog genomes
Each human AS isoform is splice-aligned to the mouse and dog genomes
Definition of conservation:
–conservation of the corresponding region (a homologous exon is actually present in the considered genome);
–conservation of splicing sites (GT and AG)
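
The second part of the conservation definition (GT and AG splicing sites) can be sketched as a check on the dinucleotides flanking an aligned exon. This is only an illustration: the function name and the coordinate convention (0-based, half-open, forward strand) are assumptions, and the spliced alignment that produces the exon coordinates is not shown.

```python
def splice_sites_conserved(genome: str, exon_start: int, exon_end: int) -> bool:
    """Check canonical splice sites around an internal exon.

    The exon occupies genome[exon_start:exon_end] on the forward strand.
    The acceptor site is the AG just upstream of the exon, the donor site
    is the GT just downstream of it.
    """
    if exon_start < 2 or exon_end + 2 > len(genome):
        return False  # not enough flanking sequence to check
    acceptor = genome[exon_start - 2:exon_start].upper()
    donor = genome[exon_end:exon_end + 2].upper()
    return acceptor == "AG" and donor == "GT"
```

A region would then count as conserved only if the homologous exon is found in the target genome and this check passes for its boundaries.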

9 Caveats
–we consider only the possibility of AS in mouse and dog: we do not require that the corresponding isoforms actually exist in known transcriptomes
–we do not account for cases where an alternative human exon (or site) is constitutive in mouse or dog
–of course, functionality assignments (translated / NMD-inducing) are not very reliable

10 Gains/losses: loss in mouse Common ancestor

11 Gains/losses: gain in human (or noise) Common ancestor

12 Gains/losses: loss in dog (or possible gain in human+mouse) Common ancestor

13 Triple comparison
Plot categories: conserved alternatives; lost in mouse; lost in dog; human-specific alternatives (noise?)
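
The gain/loss reasoning of slides 10-13 amounts to a small decision table over the presence of a human alternative in mouse and dog. A minimal sketch, with a hypothetical function name and with category labels taken from the slide titles:

```python
def classify_human_alternative(in_mouse: bool, in_dog: bool) -> str:
    """Classify a human alternative exon or site by its conservation.

    in_mouse / in_dog: whether the alternative is conserved in that genome
    (homologous region present and splicing sites intact).
    """
    if in_mouse and in_dog:
        return "conserved"
    if in_dog:
        # Present in human and dog only: most simply a loss in mouse.
        return "lost in mouse"
    if in_mouse:
        # Present in human and mouse only.
        return "lost in dog (or possible gain in human+mouse)"
    # Present in human only.
    return "human-specific (gain in human or noise)"
```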

14 Translated and NMD-inducing cassette exons Mainly included exons are highly conserved irrespective of function Mainly skipped translated exons are more conserved than NMD-inducing ones Numerous lineage-specific losses –more in mouse than in dog –more of NMD-inducing than of translated exons ~40% of almost always skipped (<1% inclusion) exons are conserved in at least one lineage

15 Mouse+rat vs human and dog: a possibility to distinguish between exon gain and noise

16 The rate of exon gain: decreases with the exon inclusion rate; increases with the sequence evolutionary rate Caveat: spurious exons still may seem to be conserved in the rodent lineage due to short time Solution: estimate “FDR” by analysis of conservation of pseudoexons

17 Alternative donor and acceptor sites: same trends Higher conservation of ~uniformly used sites Internal sites are more conserved than external ones (as expected)

18 Source of innovation: Model of random site fixation Plots: Fraction of exon- extending alternative sites as dependent on exon length –Main site defined as the one in protein or in more ESTs –Same trends for the acceptor (top) and donor (bottom) sites The distribution of alt. region lengths is consistent with fixation of random sites –Extend short exons –Shorten long exons

19 Genetic diseases Mutations in splice sites yield exon skips or activation of cryptic sites Exon skip or activation of a cryptic site depends on: –Density of exonic splicing enhancers (lower in skipped exons) –Presence of a strong cryptic nearby Av. dist. to a stronger site Skipped exons Cryptic site exons Non-mutated exons Donor sites22075289 Acceptor sites 1856681

20 One more source of innovation: site creation MAGE-A family of human CT-antigens –Retroposition of a spliced mRNA, then duplication –Numerous new (alternative) exons in individual copies arising from point mutations Creation of donor sites

21 Improvement of an acceptor site

22 Alternative exon-intron structure in fruit flies and the malarial mosquito Same procedure (AS data from FlyBase) –cassette exons, splicing sites –also mutually exclusive exons, retained introns Follow the fate of D. melanogaster exons in the D. pseudoobscura and Anopheles genomes Technically more difficult: –incomplete genomes –the quality of alignment with the Anopheles genome is lower –frequent intron insertion/loss (~4.7 introns per gene in Drosophila vs. ~3.5 introns per gene in Anopheles)

23 Conservation of coding segments
                                       constitutive segments   alternative segments
D. melanogaster – D. pseudoobscura     97%                     75-80%
D. melanogaster – Anopheles gambiae    77%                     ~45%

24 Conservation of D. melanogaster elementary alternatives in D. pseudoobscura genes
Plot colors: blue – exact; green – divided exons; yellow – joined exons; orange – mixed; red – non-conserved
Retained introns are the least conserved (are all of them really functional?)
Mutually exclusive exons are as conserved as constitutive exons

25 Conservation of D. melanogaster elementary alternatives in Anopheles gambiae genes
Plot colors: blue – exact; green – divided exons; yellow – joined exons; orange – mixed; red – non-conserved
~30% joined, ~10% divided exons (fewer introns in Aga)
Mutually exclusive exons are conserved exactly
Cassette exons are the least conserved

26 Evolution of (alternative) exon-intron structure in nine Drosophila spp.
Dmel = D. melanogaster, Dsec = D. sechellia, Dyak = D. yakuba, Dere = D. erecta, Dana = D. ananassae, Dpse = D. pseudoobscura, Dmoj = D. mojavensis, Dvir = D. virilis, Dgri = D. grimshawi
Tree: D. Pollard, http://rana.lbl.gov/~dan/trees.html

27 Gain and loss of alternative segments and constitutive exons
Tree: Dmel, Dsec, Dyak, Dere, Dana, Dpse, Dmoj, Dvir, Dgri, with counts on each branch
Notation: patterns with single events / patterns with multiple events (Dollo parsimony)
Sample size: 397 / 452 and 18596 / 18874
Caveat: we cannot observe exon gain outside and exon loss within the D. melanogaster lineage

28 Evolutionary rate in constitutive and alternative regions
Human and mouse orthologous genes
D. melanogaster and D. pseudoobscura
Estimation of the dN/dS ratio: higher fraction of non-synonymous substitutions (changing amino acid) => weaker stabilizing (or stronger positive) selection

29 Human/mouse genes: non-symmetrical histogram of dN/dS(const) - dN/dS(alt)
Black: shadow of the left half.
In a larger fraction of genes dN/dS(alt) > dN/dS(const), especially for larger values

30 Concatenated regions: alternative regions evolve faster than constitutive ones
Plots: dN, dS and dN/dS, scale 0 to 1

31 Weaker stabilizing selection (or positive selection) in alternative regions (insignificant in Drosophila)
Plots: dN, dS and dN/dS, scale 0 to 1

32 Different behavior of terminal alternatives
Plots: dN, dS and dN/dS, scale 0 to 1.5
Mammals: density of substitutions increases in the N-to-C direction
Drosophila: synonymous substitutions are prevalent in terminal alternative regions; non-synonymous substitutions, in internal alternative regions

33 Many drosophilas: dN in mut. exclusive exons same as in constitutive exons dS lower in almost all alternatives: regulation?

34 Many drosophilas: relaxed (positive?) selection in alternative regions

35 The McDonald-Kreitman test: evidence for positive selection in (minor isoform) alternative regions
Human and chimpanzee genome substitutions vs. human SNPs
Exons conserved in mouse and/or dog
Genes with at least 60 ESTs (median number)
Fisher’s exact test for significance
          Pn/Ps (SNPs)   Kn/Ks (genomes)   diff.    Signif.
Const.    0.72           0.62              -0.10    0
Major     0.78           0.65              -0.13    0.5%
Minor     1.41           1.89              +0.48    0.1%
Minor isoform alternative regions:
–More non-synonymous SNPs: Pn(alt_minor) = 0.12% >> Pn(const) = 0.06%
–More non-synonymous substitutions: Kn(alt_minor) = 0.91% >> Kn(const) = 0.37%
–Positive selection (as opposed to weaker stabilizing selection): α = 1 - (Pn/Ps) / (Kn/Ks) ~ 25% of positions
Similar results for all highly covered genes or all conserved exons
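
The α estimate on this slide can be reproduced from the Pn/Ps and Kn/Ks ratios in the table. A minimal sketch, using the standard form α = 1 - (Pn/Ps) / (Kn/Ks); the function and variable names are illustrative only.

```python
def alpha(pn_ps: float, kn_ks: float) -> float:
    """Estimated fraction of non-synonymous substitutions fixed by positive selection."""
    return 1.0 - pn_ps / kn_ks


# (Pn/Ps, Kn/Ks) ratios as given in the table on the slide.
ratios = {
    "Const.": (0.72, 0.62),
    "Major": (0.78, 0.65),
    "Minor": (1.41, 1.89),
}

for region, (pn_ps, kn_ks) in ratios.items():
    print(f"{region}: alpha = {alpha(pn_ps, kn_ks):.2f}")
```

For the minor isoform regions this gives roughly 0.25, matching the ~25% quoted on the slide; for constitutive and major isoform regions the value is negative, so those regions show no excess of fixed non-synonymous substitutions.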

36 What does alternative splicing do to proteins? SwissProt proteins PFAM domains SwissProt feature tables

37 Alternative splicing avoids disrupting domains (and non-domain units) Control: fix the domain structure; randomly place alternative regions
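
The control described on this slide (fix the domain structure, randomly place alternative regions) can be sketched as a simple randomization test. Everything specific here is an assumption of the sketch rather than the authors' procedure: the function names, the uniform placement, and the definition of "disrupting" as overlapping a domain without containing it entirely.

```python
import random


def disrupts(region, domain):
    """True if the region overlaps the domain but does not contain it entirely."""
    rs, re_ = region
    ds, de = domain
    overlaps = rs < de and ds < re_
    contains = rs <= ds and de <= re_
    return overlaps and not contains


def random_placement_test(protein_len, domains, alt_regions, n_iter=1000, seed=0):
    """Return (observed count of disrupting regions, one-sided p-value for avoidance)."""
    rng = random.Random(seed)

    def count(regions):
        return sum(any(disrupts(r, d) for d in domains) for r in regions)

    observed = count(alt_regions)
    as_low = 0
    for _ in range(n_iter):
        # Keep each region's length, draw a new start position uniformly.
        placed = []
        for start, end in alt_regions:
            length = end - start
            new_start = rng.randint(0, protein_len - length)
            placed.append((new_start, new_start + length))
        if count(placed) <= observed:
            as_low += 1
    return observed, (as_low + 1) / (n_iter + 1)
```

A small p-value would indicate that the observed alternative regions disrupt domains less often than randomly placed regions of the same lengths.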

38 … and this is not simply a consequence of the (disputed) exon-domain correlation

39 Positive selection towards domain shuffling (not simply avoidance of disrupting domains)

40 Short (<50 aa) alternative splicing events within domains target protein functional sites c) Prosite patterns unaffected Prosite patterns affected FT positions unaffected FT positions affected ExpectedObserved

41 An attempt at integration
AS is often species-specific
Young AS isoforms are often minor and tissue-specific
… but still functional
–although species-specific isoforms may result from aberrant splicing
AS regions show evidence for decreased negative selection
–excess non-synonymous codon substitutions
AS regions show evidence for positive selection
–excess fixation of non-synonymous substitutions (compared to SNPs)
AS tends to shuffle domains and target functional sites in proteins
Thus AS may serve as a testing ground for new functions without sacrificing old ones

42 What next?
AS in one species, constitutive splicing in another (data from microarrays)
Changes in inclusion rates
Evolution of regulation of AS
Control for:
–functionality: translated / NMD-inducing (frameshifts, stop codons)
–exon inclusion (or site choice) level: major / minor isoform
–tissue specificity pattern (?)
–type of alternative – 1: N-terminal / internal / C-terminal
–type of alternative – 2: cassette and mutually exclusive exon, alternative site

43 Acknowledgements Discussions –Eugene Koonin (NCBI) –Igor Rogozin (NCBI) –Vsevolod Makeev (GosNIIGenetika) –Dmitry Petrov (Stanford) –Dmitry Frishman (GSF, TUM) Data –King Jordan (NCBI) Support –Howard Hughes Medical Institute –INTAS –Russian Academy of Sciences (program “Molecular and Cellular Biology”) –Russian Foundation of Basic Research

44 Authors
Andrei Mironov (Moscow State University)
Ramil Nurtdinov (Moscow State University) – human/mouse+rat/dog
Dmitry Malko (GosNIIGenetika, Moscow) – drosophila/mosquito
Ekaterina Ermakova (Moscow State University, IITP) – Kn/Ks
Vasily Ramensky (Institute of Molecular Biology, Moscow) – SNPs, McDonald-Kreitman test
Evgenia Kriventseva (now at U. of Geneva) and Shamil Sunyaev (now at Harvard U. Medical School) – protein structure
Irena Artamonova (Inst. of General Genetics, Moscow) – human/mouse, plots, MAGE-A
Alexei Neverov (GosNIIGenetika, Moscow) – functionality of isoforms

45 Bonus track: conserved secondary structures regulating (alternative) splicing in the Drosophila spp.
~50,000 introns: 17% alternative, 2% with alt. polyA signals
>95% of D. melanogaster introns mapped to at least 7 of 12 other Drosophila genomes
Search for conserved complementary words at intron termini (within 150 nt of intron boundaries), then align
Restrictive search => 200 candidates
6 tested in experiment (3 const., 3 alt.). All 3 alt. ones confirmed
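
The first step of the search on this slide, finding complementary words near the two ends of an intron, can be sketched as follows. The word length `k`, the function names and the exact-match requirement are assumptions of this illustration; the conservation filter across Drosophila genomes and the subsequent alignment are not shown.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def complementary_words(intron: str, k: int = 8, window: int = 150):
    """Find k-mers near the 5' end whose reverse complement occurs near the 3' end.

    Returns a list of (pos_5prime, pos_3prime, word) with 0-based positions
    in the intron. Only exact complementarity is considered.
    """
    intron = intron.upper()
    head = intron[:window]
    tail = intron[-window:]
    tail_offset = len(intron) - len(tail)

    # Index every k-mer of the 3'-terminal window by its intron position.
    tail_index = {}
    for j in range(len(tail) - k + 1):
        tail_index.setdefault(tail[j:j + k], []).append(tail_offset + j)

    hits = []
    for i in range(len(head) - k + 1):
        word = head[i:i + k]
        for j in tail_index.get(reverse_complement(word), []):
            if j >= i + k:  # the two words must not overlap
                hits.append((i, j, word))
    return hits
```

Pairs found this way are only candidates for base-paired stems; in the approach described on the slide they would still have to be conserved in the other genomes.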

46 CG33298 (phospholipid translocating ATPase): alternative donor sites

47 Atrophin (histone deacetylase): alternative acceptor sites

48 Nmnat (nicotinamide mononucleotide adenylyltransferase): alternative splicing and polyadenylation

49 Less restrictive search => many more candidates

50 Properties of regulated introns Often alternative Longer than usual Overrepresented in genes linked to development

51 Authors Andrei Mironov (idea) Dmitry Pervouchine (bioinformatics) Veronica Raker, Center for Genome Regulation, Barcelona (experiment) Juan Valcarcel, Center for Genome Regulation, Barcelona (advice) Mikhail Gelfand (general pessimism)

