Presentation is loading. Please wait.

Presentation is loading. Please wait.

 Experimental Setup  Whole brain RNA-Seq Data from Sanger Institute Mouse Genomes Project [Keane et al. 2011]  Synthetic hybrids with different levels.

Similar presentations


Presentation on theme: " Experimental Setup  Whole brain RNA-Seq Data from Sanger Institute Mouse Genomes Project [Keane et al. 2011]  Synthetic hybrids with different levels."— Presentation transcript:

1  Experimental Setup  Whole brain RNA-Seq Data from Sanger Institute Mouse Genomes Project [Keane et al. 2011]  Synthetic hybrids with different levels of heterozygosity generated by pooling reads from C57/BL6 and four other strains Read statistics Strain variation Inference accuracy Allele Specific Gene/Isoform Expression Make cDNA & shatter into fragments ABCDE Map reads Allele Specific Gene Expression (GE)Allele Specific Isoform Expression (IE) ABCDE Sequence fragment ends H0H1 H0H1 Inference of Allele Specific Expression Levels from RNA-Seq Data Sahar Al Seesi and Ion M ă ndoiu Computer Science and Engineering Dept., University of Connecticut Current Approaches  [Gregg et al., 2010]: parent-of-origin effect in hybrids of inbred mouse strains  [McManus et al., 2010]: cis- and trans-regulatory effects in hybrids of inbred drosophila species  [Heap et al., 2010]: allelic expression imbalance in human primary cells by allele coverage analysis for heterozygous SNP sites within transcripts  [Turro et al., 2011]: allele specific isoform expression through SNP calling and diploid transcriptome construction  [Missirian et al., 2012]: parentally biased gene expression in Arabidopsis hybrids RNA-PhASE Analysis Pipeline Methods  Hybrid Mapping Approach  Independently map reads onto reference genome and transcriptome using bowtie (for Illumina or SOLiD reads) or tmap (for ION Torrent and 454 reads)  Discard reads with multiple alignments in either genome or transcriptome, or unique but discordant alignments in both  Discordance determined at base level to accomodate local alignments of long reads with indel errors (ION Torrent and 454)  SNV Calling and Genotyping (SNVQ) [Duitama et al. 2012]  Bayesian model for SNV discovery and genotype calling from RNA- Seq reads  Phasing SNVs  RefHap [Duitama et al. 2010]  Based on finding a maximum-weight cut in each connected component of the read graph with edges between reads with overlapping alignments ; edge weights given by #mismateches  Coverage Based Phasing  Haplotypes in disconnected blocks of SNVs connected based on allele coverage at the their closest SNV sites  Inference of Allele Specific Isoform Expression  Diploid extension of IsoEM [Nicolae et al. 2011]  Expectation-Maximization algorithm based on a probabilistic model that incorporates fragment length distribution, quality scores, read pairing and, if available, strand information  Detection of Allelic Expression Imbalance  Fisher Exact test for isoforms/genes with allelic expression change fold over a certain threshold Preliminary Results J. Duitama, et al., ReFHap: A Reliable and fast algorithm for Single Individual Haplotyping, Proc. ACM-BCB, pp. 160-169, 2010 J. Duitama and P.K. Srivastava and I.I. Mandoiu, Towards accurate detection and genotyping of expressed variants from Whole Transcriptome Sequencing data, BMC Genomics 13(Suppl 2):S6, 2012 C. Gregg et al., Sex-specific parent-of-origin allelic expression in the mouse brain, Science 239:682-685, 2010 G.A. Heap, et al, Genome-wide analysis of allelic expression imbalance in human primary cells by high-throughput transcriptome resequencing, Human Molecular Genetics, 19(1):122134, 2010 T.M. Keane, et al., Mouse genomic variation and its efect on phenotypes and gene regulation, Nature 477(7364):289-294, 2011 C.J. McManus, et, al., Regulatory divergence in Drosophila revealed by mRNA-seq, Genome Research 20:816-825, 2010 V. Missirian, I. Henry, L. Comai, and Vladimir Filkov, POPE: Pipeline of Parentally-Biased Expression, Proc. ISBRA, LNCS 7292:177-188, 2012 M. Nicolae, S. Mangul, I.I. Mandoiu, A. Zelikovsky, Estimation of alternative splicing isoform frequencies from RNA-Seq data, Algorithms for Molecular Biology 6:9,2011 E. Turro, et al., Haplotype and isoform specific expression estimation using multimapping RNA-Seq reads, Genome Biology 12(2):R13, 2011 ACKNOWLEDGEMENTS: This work is supported in part by awards IIS-0546457 from NSF, Agriculture and Food Research Initiative Competitive Grant no. 2011-67016-30331 from the USDA NIFA, and a Collaborative Research Compact award from Life Technologies Corporation. C57BL x Strain Hybrid C57BL IEStrain IEC57BL GEStrain GE C57BL x BALBc0.7050.6750.7060.675 C57BL x AJ0.8550.9020.8560.903 C57BL x CAST0.8720.8240.9240.882 C57BL x SPRET0.9520.7260.9510.725 Pearson correlation between strain- specific FPKM values inferred from separate strain RNA-Seq reads vs. those inferred from pooled reads  RNA-PhASE pipeline addresses limitations of existing ASE methods  Does not require prior availability of diploid genome/transcriptome  Mapping reads against the diploid transcriptome reconstructed on- the-fly resolves bias towards reference alleles  EM model improves inference accuracy by using all reads, including those that map to more than one isoform  In collaboration with the Michael and Rachel O’Neill labs, RNA-PhASE is being used to identify parentally imprinted genes associated with autism Conclusions and Ongoing Work References & Acknowledements


Download ppt " Experimental Setup  Whole brain RNA-Seq Data from Sanger Institute Mouse Genomes Project [Keane et al. 2011]  Synthetic hybrids with different levels."

Similar presentations


Ads by Google