Current Data and Future Analysis
Thomas Wieland, Thomas Schwarzmayr and Tim M Strom
Helmholtz Zentrum München, Institute of Human Genetics
Geneva, 16/04/12
Overview
1. Current Data
– mRNA
– miRNA
– Imputation
2. Future Analysis
– Variant Calling on mRNA Data
mRNA
– Our 48 samples have on average 55M reads
– 88% mapped (BWA; reads trimmed to 50 bp)
– ~4% of aligned reads intersect lincRNA (from UCSC)
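As a rough back-of-the-envelope check, the per-sample averages above can be turned into absolute read counts. This is only a sketch built from the quoted averages; the variable names are illustrative and individual samples will differ.

```python
# Averages quoted on the slide (per sample).
total_reads = 55_000_000      # ~55M reads on average
mapped_fraction = 0.88        # 88% mapped with BWA
lincrna_fraction = 0.04       # ~4% of aligned reads intersect lincRNA

# Absolute counts implied by those averages.
mapped_reads = total_reads * mapped_fraction
lincrna_reads = mapped_reads * lincrna_fraction

print(f"mapped reads per sample:  ~{mapped_reads / 1e6:.1f}M")
print(f"lincRNA reads per sample: ~{lincrna_reads / 1e6:.1f}M")
```

So roughly 48M aligned reads per sample, of which somewhat under 2M fall on lincRNA annotations.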
miRNA
– 31% (± 13%) of reads overlap known miRNAs (UCSC)
Imputation
– info value: “a measure of the relative statistical information about the SNP allele frequency from the imputed data” (Marchini, J., & Howie, B., Nature Reviews Genetics, 2010)
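To make the info value concrete, here is a minimal sketch of how such a score can be computed from imputed genotype probabilities. It assumes the IMPUTE-style definition (one minus the ratio of residual dosage uncertainty to the binomial variance expected at the estimated allele frequency); the slides do not state which software or exact formula was used, and the function name `impute_info` is hypothetical.

```python
def impute_info(genotype_probs):
    """IMPUTE-style info score for one SNP (assumed definition).

    genotype_probs: list of (p0, p1, p2) tuples, one per individual, giving
    the imputed probabilities of carrying 0, 1 or 2 copies of the coded allele.
    """
    n = len(genotype_probs)
    if n == 0:
        raise ValueError("need at least one individual")
    # Expected allele dosage e_i and expected squared dosage f_i.
    e = [p1 + 2.0 * p2 for _, p1, p2 in genotype_probs]
    f = [p1 + 4.0 * p2 for _, p1, p2 in genotype_probs]
    theta = sum(e) / (2.0 * n)  # estimated allele frequency
    if theta <= 0.0 or theta >= 1.0:
        # Monomorphic site: the ratio is undefined; returning 1 is an
        # assumed convention, not something stated in the slides.
        return 1.0
    # Residual uncertainty in the dosages relative to the binomial variance.
    uncertainty = sum(fi - ei * ei for fi, ei in zip(f, e))
    return 1.0 - uncertainty / (2.0 * n * theta * (1.0 - theta))


# Genotypes known with certainty versus probabilities that only
# reflect the population frequency.
certain = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 0)]
uninformative = [(0.25, 0.5, 0.25)] * 4
print(impute_info(certain))        # 1.0
print(impute_info(uninformative))  # 0.0
```

Values near 1 mean the imputed data carry almost as much information about the allele frequency as observed genotypes would; values near 0 mean the imputation adds little beyond the population frequency.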
Variant Calling on mRNA Data
– Illumina Pipeline
– Alignment: BWA, reference genome
– Variant calling: SAMtools
– Variant filter: SAMtools / custom
– Variant annotation: UCSC gene tables, dbSNP
– Database
– Candidate genes, linkage information, inheritance model
– Run statistics: base quality, alignment statistics, enrichment statistics
– RefSeq genes, lincRNA, miRNAs, ...
Variant Calling on mRNA Data
– Pre-aligned mRNA files
– Variant calling: SAMtools
– Variant filter: SAMtools / custom
– Variant annotation: UCSC gene tables, dbSNP
– Database
– Candidate genes, linkage information, inheritance model
– RefSeq genes, lincRNA, miRNAs, ...
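The "custom" part of the variant filter step could look something like the sketch below. It is purely illustrative: the thresholds (`min_qual`, `min_depth`) and the function names are assumptions, not the settings used in the pipeline above. It relies only on the standard VCF layout, where QUAL is the sixth column and read depth is reported as `DP=` in the INFO column.

```python
def passes_filter(vcf_line, min_qual=30.0, min_depth=8):
    """Return True if a VCF data line meets the (hypothetical) thresholds."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return False  # malformed line
    qual_field = fields[5]
    if qual_field == ".":
        return False  # missing quality
    qual = float(qual_field)
    # Parse key=value pairs from INFO; flag-only entries are skipped.
    info = dict(
        item.split("=", 1) for item in fields[7].split(";") if "=" in item
    )
    depth = int(info.get("DP", 0))
    return qual >= min_qual and depth >= min_depth


def filter_vcf(lines, **thresholds):
    """Yield header lines unchanged and data lines that pass the filter."""
    for line in lines:
        if line.startswith("#") or passes_filter(line, **thresholds):
            yield line


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t1000\t.\tA\tG\t50\t.\tDP=20;MQ=60\n",  # kept
    "chr1\t2000\t.\tC\tT\t10\t.\tDP=30;MQ=60\n",  # low quality
    "chr1\t3000\t.\tG\tA\t80\t.\tDP=3;MQ=60\n",   # low depth
]
kept = list(filter_vcf(example))
```

For mRNA data, a depth threshold also acts as a crude expression filter, since weakly expressed genes will rarely reach the required coverage.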
Samples
– Case-Controls
– Homozygous-Heterozygous
– Type of variation
– SNV call quality
– dbSNP / HapMap
– Average heterozygosity
Exome database
– Genes
– Cases
– Controls
– Quality
– Annotation
– dbSNP
– HGMD
– 1000 genomes
– PolyPhen Prediction, ...
Variant Calling on mRNA Data
Possible analysis:
– mRNA calling vs. WG genotypes
– mRNA calling vs. Imputation
– RNA editing?
– Recent discussions about amount: “widespread” (M. Li et al., Science 333, 53, 2011) vs. “very few” (Schrider et al., PLoS ONE, 2011)
– Can we find known sites from literature? (e.g. Li, J. B. et al. (2009). Genome-wide identification of human RNA editing sites by parallel DNA capturing and sequencing. Science, 324(5931))
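The comparisons listed above could be sketched roughly as follows. This is a toy illustration, not the planned analysis: it assumes calls are available as simple dictionaries mapping (chromosome, position) to an unphased genotype string, the function names are hypothetical, and the data are made up. The editing screen uses the fact that A-to-I editing reads as A>G in sequencing data (T>C when the transcript comes from the minus strand).

```python
def concordance(rna_calls, dna_genotypes):
    """Fraction of shared sites where the RNA call equals the DNA genotype.

    Both arguments map (chrom, pos) -> genotype string such as "AA" or "AG".
    Returns (n_shared, fraction_concordant); fraction is None if no overlap.
    """
    shared = rna_calls.keys() & dna_genotypes.keys()
    if not shared:
        return 0, None
    # Compare genotypes irrespective of allele order ("CT" == "TC").
    same = sum(
        1 for s in shared
        if sorted(rna_calls[s]) == sorted(dna_genotypes[s])
    )
    return len(shared), same / len(shared)


def editing_candidates(rna_calls, dna_genotypes):
    """Sites homozygous A (or T) in DNA that show a G (or C) allele in RNA."""
    patterns = {("AA", "G"), ("TT", "C")}
    hits = []
    for site in sorted(rna_calls.keys() & dna_genotypes.keys()):
        dna, rna = dna_genotypes[site], rna_calls[site]
        if any(dna == hom and alt in rna for hom, alt in patterns):
            hits.append(site)
    return hits


# Made-up genotypes for illustration only.
dna = {("chr1", 100): "AA", ("chr1", 200): "CT",
       ("chr1", 300): "GG", ("chr2", 50): "TT"}
rna = {("chr1", 100): "AG", ("chr1", 200): "TC",
       ("chr1", 300): "GG", ("chr3", 10): "AA"}

n_shared, fraction = concordance(rna, dna)
candidates = editing_candidates(rna, dna)
print(n_shared, fraction, candidates)
```

In practice, discordant sites would need careful filtering before being interpreted as editing, since mapping errors and genotyping errors produce the same pattern. The same concordance function could be applied with imputed genotypes in place of WG genotypes.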