Whole genome transcriptome variation in Arabidopsis thaliana. Xu Zhang, Borevitz Lab
Arabidopsis thaliana has adapted to highly variable environments
Transcription and splicing. Diagram: chromosomal DNA (Exon 1, Intron 1, Exon 2, Intron 2, Exon 3) is transcribed into nuclear RNA; RNA splicing yields messenger RNA, shown as Exon 1-Exon 2-Exon 3 and as Exon 1-Exon 3.
Whole genome tiling array: genetic hybridization polymorphisms could affect the estimation of gene expression; high density and resolution (1.6M unique probes at 35 bp spacing); no bias toward known transcripts.
The experiment: parental strains and reciprocal F1 hybrids (Col♀ x Col♂, Van♀ x Van♂, Col♀ x Van♂, Van♀ x Col♂). Samples: mRNA from total RNA; genomic DNA.
Double-stranded random labeling. Diagram labels: poly(A) mRNA (AAAAA), random reverse transcription, double-stranded cDNA, random priming.
Outline: sequence polymorphisms; gene expression variation; splicing variation; a functional network of differentially spliced genes; HMM for de novo transcription profiling.
Single Feature Polymorphisms (SFPs) and indels. Figure labels: SFP; deletion or duplication in Van.
Sequence polymorphisms: SFPs and indels (>200 bp) were removed before gene expression analysis. SFP table columns: FDR, Col > Van, Van > Col, Total (surviving entry: 11.82%). Indel table columns: model selection, deletion, duplication, Total; rows: BIC, AIC.
Deletions vs duplications
Distribution of indels along chromosomes
Additive, dominant and maternal effects of gene expression
The linear model: gene probe intensity ~ additive + dominant + maternal + ε. Figure: intensity by genotype (Col, Van, F1c, F1v), annotated with the additive, dominant and maternal effects.
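The model above can be illustrated with a minimal sketch. The effect coding is an assumption, since the slide does not state it: Col = mu - a, Van = mu + a, F1c = mu + d - m, F1v = mu + d + m, with F1c taken to be the hybrid with a Col mother. The function name and the input values are hypothetical. Because this coding has four parameters for four genotype groups, the least-squares fit reproduces the group means, so the effects can be read directly from them.

```python
from statistics import mean

def genotype_effects(col, van, f1c, f1v):
    """Estimate additive, dominant and maternal effects for one gene.

    Each argument is a list of replicate intensities for one genotype.
    Assumed coding (not taken from the slide):
      Col = mu - a, Van = mu + a, F1c = mu + d - m, F1v = mu + d + m
    """
    m_col, m_van, m_f1c, m_f1v = (mean(x) for x in (col, van, f1c, f1v))
    mid_parent = (m_col + m_van) / 2              # mu
    additive = (m_van - m_col) / 2                # a: half the parental difference
    dominant = (m_f1c + m_f1v) / 2 - mid_parent   # d: F1 deviation from mid-parent
    maternal = (m_f1v - m_f1c) / 2                # m: half the reciprocal F1 difference
    return {"additive": additive, "dominant": dominant, "maternal": maternal}

# Made-up replicate intensities, for illustration only.
effects = genotype_effects(
    col=[8.0, 8.2], van=[10.0, 10.2], f1c=[9.5, 9.7], f1v=[9.9, 10.1]
)
print(effects)
```

Testing whether each effect differs from zero (the Delta, Sig+, Sig- and FDR columns on the following slide) needs a significance procedure that this sketch does not cover.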
Gene expression variation between genotypes. Table columns: Delta, Sig+, Sig-, Total, False, FDR; rows: additive, dominant, maternal.
The pattern of gene expression inheritance. Figure: mean gene intensity across Col, Van, F1v and F1c; panels: Van dominant, Col dominant, over dominant, F1v dominant, F1c dominant, maternal, paternal.
The pattern of gene expression inheritance
Enrichment in GO functional categories: GO enrichment for additive, dominant and maternal effect genes. Defense response genes are highly expressed in F1 hybrid lines, while many growth-related pathways are down-regulated.
Default expression status of exon and intron. Exons: correction for gene expression, either corrected by gene mean, corrected by gene median, or as a splicing index (mean_exon / mean_gene). Introns: direct comparison. Model: exon/intron probe intensity ~ additive + dominant + maternal + ε.
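The three exon corrections named above can be sketched as follows. The splicing index follows the slide's formula (mean exon / mean gene). Applying the mean and median corrections by subtraction is an assumption (it suits log-scale intensities), and the function name and inputs are hypothetical.

```python
from statistics import mean, median

def exon_corrections(exon_probes, gene_probes):
    """Correct exon probe intensities for overall gene expression.

    exon_probes: intensities of the probes inside one exon.
    gene_probes: intensities of all exonic probes of the gene.
    Subtraction is an assumed form of the mean/median correction.
    """
    gene_mean = mean(gene_probes)
    gene_median = median(gene_probes)
    return {
        "mean_corrected": [p - gene_mean for p in exon_probes],
        "median_corrected": [p - gene_median for p in exon_probes],
        # Splicing index as written on the slide: mean_exon / mean_gene.
        "splicing_index": mean(exon_probes) / gene_mean,
    }

# Made-up intensities, for illustration only.
print(exon_corrections(exon_probes=[2.0, 4.0], gene_probes=[2.0, 4.0, 6.0, 8.0]))
```

The corrected exon values would then be fitted with the same additive + dominant + maternal model as the gene-level intensities.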
Differential exon splicing. Model: exon probe intensity ~ additive + dominant + maternal + ε. Table columns: Delta, Sig+, Sig-, Total, False, FDR; rows: corrected by gene mean, corrected by gene median, splicing index.
Differential intron splicing. Model: intron probe intensity ~ additive + dominant + maternal + ε. Table columns: Delta, Sig+, Sig-, Total, False, FDR.
Differential exon splicing is predominantly additive in F1 hybrids
Some dominant effect in differential intron splicing in F1 hybrids
Comparison for enrichment in known alternatively spliced exons. Table columns: Threshold 1 (called, not called) and Threshold 2 (called, not called). For each method (corrected by gene mean, corrected by gene median polish, splicing index) the rows are: known, not known, fold enrichment, p-value. Surviving p-value entries: 5.97E E-03 (gene mean); 3.60E E-03 (gene median polish); 6.84E E-02 (splicing index).
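The fold enrichment and p-value rows above come from a 2x2 table of called / not called against known / not known exons. The slide does not say which test was used; the sketch below assumes a one-sided hypergeometric (Fisher's exact) test, and the function name and counts are hypothetical.

```python
from math import comb

def enrichment(known_called, known_not_called, other_called, other_not_called):
    """Fold enrichment and one-sided p-value for a 2x2 table.

    Assumes a hypergeometric upper tail, P(X >= known_called); the test
    actually used for the slide is not stated.
    """
    known = known_called + known_not_called
    called = known_called + other_called
    total = known + other_called + other_not_called
    # Share of known exons among the called ones, relative to all exons.
    fold = (known_called / called) / (known / total)
    upper = min(known, called)
    p_value = sum(
        comb(known, k) * comb(total - known, called - k)
        for k in range(known_called, upper + 1)
    ) / comb(total, called)
    return fold, p_value

# Made-up counts, for illustration only.
fold, p_value = enrichment(3, 1, 1, 5)
print(fold, p_value)
```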
Experimentally determined FDR for differential splicing. Genes shown: AT1G21350, AT1G34180, AT1G76170, AT1G29120, AT1G51350, AT1G80960, AT1G07350. Table columns: # of significant calls, estimated FDR, # of tested, # of confirmed, experimental FDR; rows: exon (corrected by mean), exon (corrected by median), exon (splicing index), intron.
Enrichment of differentially spliced genes in chloroplast thylakoid. Figure axis: enrichment of differentially spliced genes.
Chloroplast thylakoid
Differentially spliced genes located in the chloroplast thylakoid: photosynthesis-related genes. AT5G38660, APE1 (Acclimation of Photosynthesis to Environment): the mutant has altered acclimation responses.
Splicing regulators tend to be differentially spliced:
AT1G07350: transformer serine/arginine-rich ribonucleoprotein, putative
AT1G55310: SC35-like splicing factor, 33 kD (SCL33)
AT2G29210: splicing factor PWI domain-containing protein
AT5G04430: KH domain-containing protein NOVA, putative
Generalized tiling array HMM: 3-state HMM; discrete distribution for emission probability; transition probability accounts for probe spacing; Baum-Welch parameter estimation (by Jake Byrnes).
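As a rough illustration of a 3-state HMM with discrete emissions and spacing-aware transitions, the sketch below computes per-probe posterior state probabilities with the forward-backward algorithm. Everything specific here is an assumption rather than the implementation behind the slide: the exponential decay of the stay probability with distance, the `scale` parameter, the reading of the three states as Col > Van, no difference and Van > Col, and all numeric values. Parameters are fixed by hand; Baum-Welch estimation is not shown.

```python
from math import exp

def transition(dist, stationary, scale):
    """Assumed spacing-aware transition matrix: the chance of simply
    keeping the current state decays with the distance between probes."""
    stay = exp(-dist / scale)
    n = len(stationary)
    return [[stay * (i == j) + (1 - stay) * stationary[j] for j in range(n)]
            for i in range(n)]

def posteriors(symbols, positions, emission, stationary, scale=200.0):
    """Forward-backward posteriors; emission[state][symbol] is discrete."""
    n, T = len(stationary), len(symbols)
    norm = lambda v: [x / sum(v) for x in v]
    # One transition matrix per gap between neighbouring probes.
    trans = [None] + [transition(positions[t] - positions[t - 1], stationary, scale)
                      for t in range(1, T)]
    # Forward pass, normalised at every probe to avoid underflow.
    fwd = [norm([stationary[s] * emission[s][symbols[0]] for s in range(n)])]
    for t in range(1, T):
        A = trans[t]
        fwd.append(norm([emission[j][symbols[t]] *
                         sum(fwd[-1][i] * A[i][j] for i in range(n))
                         for j in range(n)]))
    # Backward pass, also normalised.
    bwd = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        A = trans[t + 1]
        bwd[t] = norm([sum(A[i][j] * emission[j][symbols[t + 1]] * bwd[t + 1][j]
                           for j in range(n)) for i in range(n)])
    # The posterior is proportional to forward times backward.
    return [norm([fwd[t][s] * bwd[t][s] for s in range(n)]) for t in range(T)]

# Made-up parameters: states 0 = Col > Van, 1 = no difference, 2 = Van > Col.
emission = [[0.8, 0.15, 0.05], [0.1, 0.8, 0.1], [0.05, 0.15, 0.8]]
stationary = [0.1, 0.8, 0.1]
symbols = [1, 1, 0, 0, 0, 0, 0, 1, 1]
positions = [35 * i for i in range(len(symbols))]
post = posteriors(symbols, positions, emission, stationary)
```

Tying the transition matrix to the gap between probes means that widely separated probes inform each other less than adjacent ones, which is one way to account for uneven probe spacing.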
An example of HMM detected segments
A good model also needs a better array: array density is not enough to distinguish exon/intron boundaries; probe quality also matters.
Definitions used for HMM calls:
Differential segments: ≥3 continuous probes with posterior probability >0.99.
Differentially expressed genes: annotated genes for which ≥33% of their probes reside within the observed differential segments.
Differentially spliced genes: annotated genes for which <33% of their probes reside within a differential segment, or annotated genes containing ≥2 differential segments with different states.
Novel gene boundaries: differential segments with ≥5 probes extending beyond an annotated gene boundary.
Novel transcripts: differential segments with ≥5 probes lying outside any annotated gene boundary.
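The segment and gene rules above can be sketched in code. The thresholds (3 probes, 0.99, 33%) come from the slide; the function names, the input layout and the state labels are hypothetical. Two points are assumptions because the slide leaves them open: a gene with no overlapping segment is reported as unchanged, and a gene with segments in different states is called differentially spliced even when coverage is ≥33%. Novel boundaries and novel transcripts are not covered.

```python
def call_segments(calls, min_probes=3, threshold=0.99):
    """Find runs of >= min_probes consecutive probes in one state.

    calls: list of (state, posterior) per probe, where state is a label
    such as "Col>Van" or "Van>Col", or None for the non-differential state.
    Returns a list of (first_probe, last_probe, state) index triples.
    """
    segments, start = [], None
    # A sentinel entry closes a run that reaches the last probe.
    for i, (state, prob) in enumerate(calls + [(None, 0.0)]):
        ok = state is not None and prob > threshold
        if start is not None and (not ok or state != calls[start][0]):
            if i - start >= min_probes:
                segments.append((start, i - 1, calls[start][0]))
            start = None
        if ok and start is None:
            start = i
    return segments

def classify_gene(first_probe, last_probe, segments, min_fraction=0.33):
    """Classify one annotated gene spanning probes first_probe..last_probe."""
    n_probes = last_probe - first_probe + 1
    covered, states = 0, set()
    for start, end, state in segments:
        overlap = min(end, last_probe) - max(start, first_probe) + 1
        if overlap > 0:
            covered += overlap
            states.add(state)
    if not states:
        return "unchanged"                  # assumption: no overlap, no call
    if len(states) >= 2:
        return "differentially spliced"     # segments with different states
    if covered / n_probes >= min_fraction:
        return "differentially expressed"
    return "differentially spliced"

# Made-up posterior calls for ten probes, for illustration only.
calls = [("Col>Van", 0.999)] * 4 + [(None, 0.0)] * 6
segments = call_segments(calls)
print(segments, classify_gene(0, 9, segments))
```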
Length distribution of segments called by HMM
Comparison of annotation-based analysis and HMM. Table columns: Col > Van, Van > Col, Total. Annotation rows: differential expression, differential exonic splicing, differential intronic splicing. HMM rows: differential expression, differential splicing, un-annotated transcript, un-annotated 5', un-annotated 3'. Surviving entry: 28836.
Comparison of annotation-based analysis and HMM. Figure labels, for both Annotation and HMM: expression (Col > Van), expression (Van > Col), splicing (Col > Van), splicing (Van > Col).
Acknowledgements: Justin Borevitz, Yan Li, Christos Noutsos, Geoff Morris, Andy Cal, Jake Byrnes, Josh Rest