Trans-splicing in Trypanosoma brucei— results from genome-wide experiments Shai Carmi Bar-Ilan University Department of physics and the faculty of life.


1 Trans-splicing in Trypanosoma brucei— results from genome-wide experiments Shai Carmi Bar-Ilan University Department of physics and the faculty of life sciences February 2010

2 mRNA processing in T. brucei Almost all genes have no promoters. Gene expression is regulated by controlling splicing (?), mRNA stability, and translation. [Figure: a polycistronic transcript (Gene1, Gene2, Gene3, Gene4) is processed by trans-splicing (SL addition) and polyadenylation (AAAA) into mature transcripts, which are then translated. Credit: Itai Dov Tkacz.]

3 Splicing overview SL: Spliced Leader RNA. See also: Liang et al., Euk. Cell (2003).

4 cis-splicing machinery and consensus [Figure labels: 3’ splice site; snRNPs; yeast conserved branch site: TACTAAC; 10-12 nts; mammalian.]

5 Splicing regulation [Figure labels: splicing enhancer; splicing silencer; hnRNP. SR proteins create ’bridges’ to stabilize the spliceosome.] In trypanosomes: U2F65 and U2F35 exist and do not interact. U2F65 interacts with SF1. Interacting SR proteins were identified. hnRNP proteins exist.

6 Open questions 3’ splice site recognition and selection. Spatial organization of splicing factors: protein-protein and protein-RNA interactions. Splicing efficiency and gene expression regulation. Not in this talk: detailed molecular mechanism of trans-splicing and spliceosome assembly, structure of the 5’ splice site, SL RNA biogenesis, and coupling to polyadenylation.

7 Past studies of splicing regulation Clayton et al., Mol. Biochem. Parasit. (2005): Calculated the statistical properties of the splice sites based on a few hundred ESTs. Clayton et al., Mol. Cell. Biol. (1994); Ullu et al., Mol. Cell. Biol. (1998); Cross et al., Mol. Cell. Biol. (2005): Used reporter gene systems with the splice sites of model genes (tubulin, actin, procyclin) to study the effect of splice site composition on splicing efficiency. Limited applicability. [Figure: reporter construct with promoter, intron, AG (3’ splice site), 5’UTR, and reporter gene; the intron is taken from an endogenous gene and mutated.]

8 Major known facts Polyadenylation is coupled to downstream trans-splicing. A hierarchy of trans-splicing and polyA signals exists. Specific sequences in the 5’UTR (exon) are required for splicing. The optimal PPT should be 25 nts long, U dominated but interspersed with Cs, and have no two consecutive purines. The optimal PPT-AG spacer should be 20-25 nts long, have U at position -3, and never have AC at [-3,-4]. [Figure: reporter gene, 3’UTR, polyA site, intergenic region, 3’ splice site, 5’UTR, reporter gene.]

9 Research strategy– outline Sequence all messenger RNAs to map transcript boundaries. Silence splicing factors and measure the effect on each transcript. Examine the splice site regions of regulated genes to infer possible roles for splicing factors and mechanisms of splicing regulation.

10 Methods– deep sequencing Illumina guide.

11 Deep sequencing of T. brucei mRNA Experiment performed at Ullu and Tschudi’s lab, Yale University. Library preparation (flowchart, alternatives separated by "or"): Total RNA; poly(A)+ RNA selection or Terminator exonuclease treatment; first-strand cDNA synthesis with random hexamer or oligo(dT) primers, or with random hexamer primers; second-strand cDNA synthesis with RNaseH-derived RNA primers, or with SL primer; cDNA fragmentation and size selection; addition of adapters and amplification; Illumina sequencing. 15 million useful reads!

12 Ullu’s lab results 532 transcripts with a misannotated start codon. 805 annotated genes not producing a transcript. 442 genes with an alternative transcript in their UTRs. 1,114 new transcripts, conserved coding and non-coding. Trans-splicing and polyadenylation of snoRNA clusters. The experimental method can be slightly modified to discover pol-II transcription initiation sites. These sites were found at strand-switch regions, in proximity to tRNA genes, and within transcription units. Digital gene expression.

13 Examples of reannotated features Examples shown: a correctly annotated gene cluster; a novel transcript; a misannotated start codon; an ORF which is part of a larger transcript; a short transcript at the 3’ end of a gene. Blue: number of reads from the SL-enriched library. Red: number of reads from the polyA-enriched library. Blue lines at the bottom denote SL reads. Red lines at the bottom denote polyA reads. Examples were experimentally verified for all cases. [Figure panels: Chr VIII, Chr X, Chr VII, Chr XI, Chr VII.]

14 Statistics of UTR lengths UTR length distribution is approximately log-normal. Median 5’UTR length: 91 nts. Median 3’UTR length: 388 nts.

15 Splice-site composition PPT: maximum at about -25; distance from AG varies: unique to trypanosomes. No signal observed in the exon. No G allowed at the -3 position. Non-AG splice sites are due to sequencing errors and strain differences.

16 Splice-site composition Pyrimidine content Sites closer to the PPT are stronger. PPT disturbed along tens of nucleotides. Purines favored in the exon. exon AG

17 Splice-site composition AC is not preferred at positions [-3,-4] of the 3’ splice site: splice sites with AC are less abundant.

18 Splicing heterogeneity Not alternative splicing in the regular sense: leads to the same protein. 6967 genes: one major site. 978 genes: two major sites. 21 genes: three major sites. [Figure panels: average distance (nts) of all weak splice sites from the strongest splice site; uncertainty of splice-site usage; log-scale axis.]
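
The slide does not say how "uncertainty of splice-site usage" was computed. One plausible way to quantify it, offered here only as an illustrative assumption, is the Shannon entropy of the relative usage of each site, estimated from SL read counts per site. The function name `usage_uncertainty` and its input format are hypothetical.

```python
import math


def usage_uncertainty(read_counts):
    """Shannon entropy (in bits) of relative splice-site usage for one gene.

    read_counts: SL read counts, one per trans-splice site of the gene.
    ASSUMPTION: entropy is used as the uncertainty measure; the talk does
    not specify the measure actually used.
    """
    total = sum(read_counts)
    if total == 0:
        raise ValueError("gene has no SL reads")
    entropy = 0.0
    for count in read_counts:
        if count > 0:
            p = count / total  # relative usage of this site
            entropy -= p * math.log2(p)
    return entropy
```

Under this definition a gene with a single used site has zero uncertainty, and a gene whose reads are split evenly between two sites has an uncertainty of one bit.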

19 Splicing heterogeneity illustrated Each row corresponds to one gene. Each site is denoted with a bar. Sites are centered around the strongest site. Bar color is according to relative usage. Downstream sites are more popular. Some sites are found in frame. [Figure: relative usage of trans-splice sites versus nt position relative to the START codon (ATG), from -300 to 300.]

20 Predicting splicing heterogeneity What determines if a gene will be differentially spliced? Look at 100 nts upstream and downstream of the strongest site. Rank all potential splice sites: TAG: 3; AAG, CAG: 2; GAG: 1. Heterogeneity rank of a gene = (sum of ranks of all other AG dinucleotides) / (rank of strongest site). The average heterogeneity rank is about 10 for high-uncertainty genes, but only about 7 for low-uncertainty genes (P = 10^-20). Signatures do not look meaningful, but analysis shows that longer 5’UTRs, shorter PPTs, and longer PPT-AG distance also contribute significantly to heterogeneity.
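
The heterogeneity rank defined on this slide can be sketched in a few lines. This is a minimal illustration, not the original analysis code: it assumes the ranking is read as TAG = 3, AAG and CAG = 2, GAG = 1 (keyed on the base at position -3), and the function name, the 0-based indexing, and the `flank` parameter are hypothetical.

```python
# ASSUMED reading of the slide's ranking, keyed on the base just before AG:
# TAG = 3; AAG and CAG = 2; GAG = 1.
SITE_RANK = {"T": 3, "A": 2, "C": 2, "G": 1}


def heterogeneity_rank(seq, strongest_ag, flank=100):
    """Sum of ranks of all other AGs within `flank` nts, divided by the
    rank of the strongest site.

    seq: DNA sequence around the splice site.
    strongest_ag: 0-based index of the 'A' of the strongest site's AG.
    """
    seq = seq.upper()
    if strongest_ag < 1 or seq[strongest_ag:strongest_ag + 2] != "AG":
        raise ValueError("strongest_ag must point at an AG with a preceding base")
    strongest_rank = SITE_RANK.get(seq[strongest_ag - 1])
    if strongest_rank is None:
        raise ValueError("unrecognized base before the strongest AG")

    start = max(1, strongest_ag - flank)
    end = min(len(seq) - 1, strongest_ag + flank + 1)
    other_ranks = 0
    for i in range(start, end):
        if i != strongest_ag and seq[i:i + 2] == "AG":
            # Ambiguous bases (e.g. N) before an AG contribute rank 0.
            other_ranks += SITE_RANK.get(seq[i - 1], 0)
    return other_ranks / strongest_rank
```

For example, in `TAGCCGAGTTCAG` with the strongest site at the initial TAG, the two competing sites (GAG and CAG) contribute 1 + 2, giving a heterogeneity rank of 3 / 3 = 1.0.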

21 What is heterogeneity good for? Unclear at the moment. Such heterogeneity is not found in other organisms. In cis-splicing, exon boundaries must be conserved to maintain intact coding sequence. In trans-splicing, such evolutionary pressure does not exist. However, trans-splicing heterogeneity was not observed in C. elegans. Can reflect another level of complexity in gene expression regulation, as the degree of heterogeneity significantly varies throughout the genome.

22 Explaining abundance A-rich exons are more abundant. Other correlations: Genes with longer PPT and shorter 5’UTR are more abundant. Splice-site ambiguity is anti-correlated with abundance.

23 A possible model for splicing factors organization? U2F65 does not bind U2F35, so the AG can be far from the PPT. Variable distance between the AG and the PPT allows regulation of the splicing efficiency by differential binding. [Figure: intergenic region with BP, PPT, AG, and 5’UTR; labels: 0-80; 10-30; optimal: 25; AC-rich; AG competitor splice site.]

24 Silencing methods– RNAi Stem-loop construct. T7-opposing construct. Inducible by tetracycline. Gene is silenced after 3 days. Wang et al., JBC (2000).

25 Silencing methods– microarrays Microarrays are chips on which thousands of DNA oligos are printed in an array. Each oligo represents a fragment of one gene. Expression profiles of entire genomes are obtained in a single experiment. Wikipedia

26 Genome-wide observations Hundreds of genes are upregulated: an unprecedented phenomenon. U2F65 and SF1 are physically interacting and thus have similar patterns. Vazquez et al., Mol. Biochem. Parasitol. 164, 137 (2009). Red: up; green: down.

27 Genome-wide correlations Potential protein-protein interactions should be biochemically verified. Interactions may be indirect.

28 Processes affected by splicing defects Upregulated: mostly ribosomal and translation-involved proteins, peptidases, and chaperones. 10 candidates verified experimentally by RT-PCR. Downregulated: mostly metabolic enzymes and transporters.

29 Downregulated genes The sequence at the splice site of the genes most impacted by silencing may indicate the role of the splicing factor. Look at PPT length and distance to 3’ splice site. Most results are negative (discuss reason later). Genes with shorter PPT require SF1. Genes with longer PPT-AG distance require PTB1. [Figure panels: P-value = 0.001; P-value = 0.004.]
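
The two features examined on this slide, PPT length and PPT-AG distance, can be extracted from the sequence upstream of each splice site. The sketch below is a simplification under stated assumptions: it takes the PPT to be the longest uninterrupted run of pyrimidines (C or T/U), whereas the talk does not give the definition actually used and a real PPT may tolerate isolated purines. The function name `ppt_features` is hypothetical.

```python
def ppt_features(upstream):
    """Return (ppt_length, ppt_to_ag_distance) for one 3' splice site.

    upstream: sequence immediately 5' of the AG dinucleotide (AG excluded).
    ASSUMPTION: the PPT is the longest uninterrupted pyrimidine run; on
    ties, the run farthest from the AG is kept. If there is no pyrimidine,
    the result is (0, len(upstream)).
    """
    upstream = upstream.upper().replace("U", "T")
    best_len, best_end = 0, 0
    run = 0
    for i, base in enumerate(upstream):
        if base in "CT":
            run += 1
            if run > best_len:
                best_len, best_end = run, i + 1  # end index is exclusive
        else:
            run = 0
    # Distance = number of bases between the end of the PPT and the AG.
    return best_len, len(upstream) - best_end
```

With features like these per gene, the comparison on the slide amounts to testing whether the genes most downregulated by silencing a factor have a different PPT length or PPT-AG distance distribution than the remaining genes.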

30 Sequence motifs Using the DRIM tool of Yael Mandel-Gutfreund’s lab. Hard to assess the significance of the motifs. Surprisingly, no pyrimidine-rich motifs were identified. Other tools are not suited for RNA motifs or are intended for the human genome, and thus perform poorly. Should check which elements are conserved. hnRNPF/H binding sites.

31 Mechanisms of regulation RNA level regulation can be mediated via two mechanisms: 1. mRNA stability. The 3’UTR carries a specific sequence that causes stabilization or destabilization under given experimental conditions (silencing). Demonstrated experimentally for a few upregulated genes. Binding can be directly to the silenced splicing factor (U2F65, SF1, …). Splicing factors have been shown to bind mature mRNA in human cells (Carmo-Fonseca et al., 2006). Alternatively, binding can be to some other factor which is affected by the silencing (secondary effect). Binding can induce both up- and down-regulation of different genes, depending on the context (e.g., competing with stabilizing/destabilizing proteins). Regulation might not be due to binding but due to secondary structure. 2. Splicing defects. The absence of a splicing factor might cause downregulation of genes for which it is required for splicing. Such genes may have certain properties such as a weak splice site, long PPT-AG distance, short PPT, competition with other AGs, etc.

32 Discussion (problems) Computational approaches are limited by low reproducibility of the microarrays, noisy fold changes, and the very small number of genes affected by more than one factor. Genes with splicing defects are masked by many more genes which are regulated by mRNA stability. It is unclear at the moment if there is a significant number of genes regulated by splicing. mRNA stability can be mediated by more than one factor (primary and secondary effects). Thus, a clean set of genes which undergo the same regulation is hard to obtain.

33 Discussion (future plans) Computational: Deep-sequencing of Leishmania at Ullu’s lab may provide information about conserved regulatory elements. Secondary structure of the 3’UTR will be explored. Experimental: Reporter gene system with the intergenic region of a model gene. CLIP-seq (in vivo cross-linking and immunoprecipitation followed by deep-sequencing) should yield RNA binding sites. Examine splicing defects (accumulation of SL-RNA or Y-structure) of individual genes or genome-wide (co-silencing of the exosome).

34 Thank you for your attention!

