Genome-wide association study between DSE polymorphism and Poly-A usage in the human population
Hiren Karathia, Sridhar Hannenhalli
Transcription & Polyadenylation (Poly-A)
Objectives
- Genome-wide estimation of alternate Poly-A (PA) usage on 3’UTR
- Genome-wide prediction and investigation of polymorphisms in DSE (Downstream Sequence Element) motifs
- Population-wide correlation study between PA usage and DSE polymorphisms
Annotation status of Poly-A sites on 3’UTR of Human Genome (hg19, 2009)
37%: multiple Poly-A points (target of the analysis)
RNA-Seq processing for Human Samples
Pipeline (tool in parentheses):
- Sample FASTQ files
- SAM file (BWA)
- BAM file (Samtools)
- Merged BAM file
- Sorted BAM file (Samtools)
- De-duplicated file (Picard tool)
- Indexing the BAM (Samtools)
- Calculate coverage (Bedtools)
- Calculate relative usage of PAs (Python script)
Further steps: Differential expression of UTR (Cuffdiff tools, Python script); De-novo assembly
Samples:
Symbol | Group of samples | Male | Female | DNA | RNA
BR | British in England and Scotland | 1 | 1 | |
FI | Finnish in Finland | 1 | 1 | |
UT | Utah residents with Northern and Western European ancestry | 1 | 1 | |
YO | Yoruba in Ibadan, Nigeria | 1 | 1 | |
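The processing steps on this slide can be sketched as a list of command lines. This is a minimal sketch, not the authors' actual script: the slide names only the tools, so the subcommands (`bwa mem`, `samtools view/merge/sort/index`, Picard `MarkDuplicates`, `bedtools coverage`), the file names, and the helper `build_pipeline` are all assumptions made for illustration. The reference is assumed to be already indexed for BWA, and the step order follows the slide.

```python
def build_pipeline(sample, fastqs, reference, utr_bed, picard_jar="picard.jar"):
    """Return (argv, stdout_path) pairs; stdout_path is None when the tool
    writes its own output file. Nothing is executed here."""
    steps = []
    lane_bams = []
    for i, fastq in enumerate(fastqs, start=1):
        sam = f"{sample}.part{i}.sam"
        bam = f"{sample}.part{i}.bam"
        # Assumed aligner mode: bwa mem writes SAM to standard output.
        steps.append((["bwa", "mem", reference, fastq], sam))
        # Convert SAM to BAM.
        steps.append((["samtools", "view", "-b", "-o", bam, sam], None))
        lane_bams.append(bam)

    merged = f"{sample}.merged.bam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup = f"{sample}.dedup.bam"
    metrics = f"{sample}.dedup_metrics.txt"
    coverage = f"{sample}.utr_coverage.txt"

    steps.append((["samtools", "merge", merged] + lane_bams, None))
    steps.append((["samtools", "sort", "-o", sorted_bam, merged], None))
    steps.append((["java", "-jar", picard_jar, "MarkDuplicates",
                   f"I={sorted_bam}", f"O={dedup}", f"M={metrics}",
                   "REMOVE_DUPLICATES=true"], None))
    steps.append((["samtools", "index", dedup], None))
    # Per-base depth over the 3'UTR intervals (hypothetical BED file).
    steps.append((["bedtools", "coverage", "-d", "-a", utr_bed, "-b", dedup],
                  coverage))
    return steps


if __name__ == "__main__":
    for argv, out in build_pipeline("BR1", ["BR1_a.fastq", "BR1_b.fastq"],
                                    "hg19.fa", "utr3.bed"):
        print(" ".join(argv) + (f" > {out}" if out else ""))
```

Keeping the commands as data makes it easy to inspect or log them before handing each one to `subprocess.run`.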
Genome-wide estimation of alternate Poly-A (PA) usage on 3’UTR
Figure labels: Stop Codon, PA1 Junction, PA2 Junction, Cleaved 3’UTR; PA1 Coverage, PA2 Coverage, Complete UTR coverage
PA1 Usage = [Coverage (Stop codon - PA1 junction) / Distance] / [Coverage (complete 3’UTR) / Distance]
PA2 Usage = [Coverage (Stop codon - PA2 junction) / Distance] / [Coverage (complete 3’UTR) / Distance]
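The usage ratio on this slide can be illustrated with a small Python sketch. It assumes that "Coverage / Distance" means summed per-base read depth divided by interval length, that positions are 0-based offsets along the 3’UTR in transcript orientation, and that intervals are half-open; the function names and the toy numbers are hypothetical.

```python
def mean_depth(coverage, start, end):
    """Summed per-base depth over [start, end) divided by the interval length."""
    length = end - start
    if length <= 0:
        raise ValueError("interval must have positive length")
    return sum(coverage[start:end]) / length


def pa_usage(coverage, stop_codon, pa_junction, utr_end):
    """Length-normalised coverage from the stop codon to one PA junction,
    divided by length-normalised coverage of the complete 3'UTR."""
    full = mean_depth(coverage, stop_codon, utr_end)
    if full == 0:
        return float("nan")  # no reads on this 3'UTR
    return mean_depth(coverage, stop_codon, pa_junction) / full


# Toy 3'UTR of 100 bases: depth 10 up to PA1 (offset 40), depth 4 afterwards.
coverage = [10] * 40 + [4] * 60
pa1 = pa_usage(coverage, stop_codon=0, pa_junction=40, utr_end=100)
pa2 = pa_usage(coverage, stop_codon=0, pa_junction=100, utr_end=100)
print(pa1, pa2)
```

In this toy case the distal site PA2 coincides with the end of the complete 3’UTR, so its usage is 1 by construction, while a value above 1 for PA1 indicates heavier coverage on the proximal segment.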
Prediction of DSE
Figure labels: Coding Strand of DNA, Template Strand of DNA; Sample A RNA-Seq, Sample A DNA-Seq; De-novo assembled 3’UTR fragment; Prediction of DSE motif
Frequency of Poly-A usage in the samples
Correlation of different PA usage in a Human Sample
PA1 - PA2: r = ; p = 0.0
PA2 - PA3: r = ; p = 1.06e-33
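The r values on this slide are correlation coefficients between the usage of two PA sites across genes. As a minimal sketch, assuming they are Pearson coefficients (the slide does not say which statistic was used), r can be computed as below; the input vectors are toy numbers, not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one variable is constant
    return sxy / math.sqrt(sxx * syy)


# Toy per-gene usage values for two PA sites (hypothetical).
pa1_usage = [1.0, 2.0, 3.0, 4.0]
pa2_usage = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(pa1_usage, pa2_usage)
print(r)
```

This sketch returns only r; if SciPy is available, `scipy.stats.pearsonr` returns both the coefficient and its p-value.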
Correlation of PA usage and corresponding DSE polymorphism
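One common way to relate a polymorphism to a quantitative trait is to code each sample's genotype as an alternate-allele count (0, 1 or 2) and regress the trait on that count. The sketch below applies this to PA usage; it is an assumption about how such an analysis could be set up, not the method used in this study, and the genotype strings, allele names and usage values are invented for illustration.

```python
def allele_dosage(genotype, alt):
    """Number of copies of the alternate allele in a genotype such as 'A/G'."""
    return sum(1 for allele in genotype.split("/") if allele == alt)


def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("all samples share the same genotype")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical DSE variant genotypes and PA usage for four samples.
genotypes = ["A/A", "A/G", "G/G", "A/G"]
usage = [0.2, 0.5, 0.8, 0.5]

dosage = [allele_dosage(g, alt="G") for g in genotypes]
slope = least_squares_slope(dosage, usage)
print(dosage, slope)
```

A positive slope means usage of the PA site rises with each extra copy of the alternate allele; with only a handful of samples this is descriptive rather than a significance test.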
Functional enrichment of genes associated with differential PA usage and polymorphic forms of DSEs in the population
Thank you !!
Differential Expression of complete 3’UTR
Inter/Intra group correlation of a PA usage
PA1 usage; panels: BR1 - BR2, FN1 - FN2, BR1 - FN1
r = 0.8; p = 0.0
r = 0.98; p = 0.0
Statistics of predicted DSE motifs
Sample | PA type | Mean(Motif Length) | Max(Motif Length) | Min(Motif Length) | Mean(Distance) | Max(Distance) | Min(Distance)
BR-1 | Single | | | | | |
BR-1 | Multiple | | | | | |
BR-2 | Single | | | | | |
BR-2 | Multiple | | | | | |
FN-1 | Single | | | | | |
FN-1 | Multiple | | | | | |
Pending:
- Find polymorphism in the DSEs
- Find correlation between the PA usage and DSE polymorphism
Alternate Poly-A selection mechanism
Complete 3’UTR coverage vs. alternate 3’UTR coverage
Panels: Differential expression of complete 3’UTR usage; Differential expression of PA usage
Polyadenylation usage on 3’UTR
Figure labels: Stop Codon, Intron, PA1 Junction, PA2 Junction, Cleaved 3’UTR; PA1 Coverage, PA2 Coverage, Complete UTR coverage
Relative PA1 Usage = PA1 Coverage / Longest UTR Coverage
Relative PA2 Usage = PA2 Coverage / Longest UTR Coverage
DSE statistic
Sample | PA type | Mean(Motif Length) | Max(Motif Length) | Min(Motif Length) | Mean(Distance) | Max(Distance) | Min(Distance)
BR-1 | Single | | | | | |
BR-1 | Multiple | | | | | |
BR-2 | Single | | | | | |
BR-2 | Multiple | | | | | |
FN-1 | Single | | | | | |
FN-1 | Multiple | | | | | |
Strand orientation (figure labels): + strand, - strand; Gene Strand, Template Strand; + Read, - Read; RNA Strand, DNA Strand
Locations of annotated multiple PA locations on 3’UTR
Panels: PAs on same exon; PAs on multiple exons (labels: Stop Codon, PA1 Junction, PA2 Junction, Cleaved 3’UTR)
Plot: Poly-A Location vs. Length of 3’UTR; r = ; p = 8.44e