Third generation long-read sequencing of HIV-1 transcripts discloses cell-type-specific and temporal regulation of RNA splicing. Frederic Bushman. International AIDS Meeting, Washington DC, 2012.
Why Study HIV Splicing? Splicing factors are prominent in genome-wide siRNA screens (Bushman et al., PLoS Path). HIV RNAs are spliced to yield at least 40 mRNAs. Sensitivity suggests an unexploited opportunity for intervention? Relevant ORFs remain to be discovered?
Approach: amplification with 18 primer pairs. Figure labels: canonical splicing; rare splicing; new splicing.
RainDance Technologies: Single Molecule Droplet PCR (Tewhey et al., Nature Biotechnology, 2009). a. Primer library: overlapping primer pairs amplify cDNA, maintaining ratios. b. cDNA prep from infected cells. Workflow labels: primer library, cDNA template, mix, PCR, break emulsion, sequence.
Pacific Biosciences: Single molecule sequencing. Fixed polymerase; phosphate-labeled nucleotides. High-throughput single molecule real-time sequencing provides long reads, maintaining linkage between exons. Error mitigated by: 1. Alignment to 10 kb HIV genome; 2. SMRTbell approach…
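A minimal sketch of how one long read, once aligned to the HIV genome, can report linked splice junctions: each large gap between consecutive aligned segments is read as an intron. The function name, the `min_intron` threshold, and the coordinates are hypothetical illustrations, not the pipeline or the splice-site positions used in this work.

```python
from typing import List, Tuple

# (start, end) of one aligned segment on the reference, 1-based inclusive
Block = Tuple[int, int]


def splice_junctions(blocks: List[Block], min_intron: int = 50) -> List[Tuple[int, int]]:
    """Return (donor, acceptor) pairs implied by gaps between aligned segments.

    The donor is the last aligned reference base before a gap and the acceptor
    is the first aligned base after it. Gaps shorter than min_intron (an
    assumed threshold) are treated as alignment noise rather than introns.
    """
    ordered = sorted(blocks)
    junctions = []
    for (_, end), (start, _) in zip(ordered, ordered[1:]):
        if start - end - 1 >= min_intron:
            junctions.append((end, start))
    return junctions


# Hypothetical read aligned to the reference in three segments
read_blocks = [(1, 290), (5400, 5470), (7900, 9100)]
print(splice_junctions(read_blocks))
```

Because all junctions come from the same read, the pairing of donors and acceptors along one molecule is kept, which short reads covering a single junction cannot provide.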
Pacific Biosciences: Sequence Output. 930,294 HIV sequences of up to 2629 bp.
Cell Type | Mappable Reads | Median Raw Read-length | Longest HIV Sequence
HOS (18, 24, 48 hpi) | 88,… | … bp | 2105 bp
Primary CD4 T (7 donors, triplicate, 48 hpi) | 841,… | … bp | 2629 bp
2 Novel Splice Donors (Scott Sherrill-Mix)
11 Novel Splice Acceptors (Scott Sherrill-Mix)
Novel Splice Sites. Figure legend: genetic map; exons; SD = splice donor; SA = splice acceptor; * = site does not adhere to consensus.
Complete message population of HIV in CD4+ T cells: 77 complete message structures; evidence for 36 additional transcripts from partial reads; total: 113 mRNAs; 19 novel transcripts, including a new completely spliced class (~1 kb). (Scott Sherrill-Mix)
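The total of 113 is the sum of the 77 complete structures and the 36 inferred from partial reads. One way such a tally could be produced, shown here only as a sketch under assumptions, is to treat each read's ordered chain of splice junctions as a message structure and count the distinct chains. The function name and the junction coordinates are hypothetical.

```python
from collections import Counter


def count_structures(read_junctions):
    """Count reads per distinct junction chain (one chain = one message structure)."""
    return Counter(tuple(chain) for chain in read_junctions)


# Hypothetical junction chains, one list of (donor, acceptor) pairs per read
reads = [
    [(290, 5400), (5470, 7900)],
    [(290, 5400), (5470, 7900)],
    [(290, 5400)],
]
counts = count_structures(reads)
print(len(counts), "distinct structures")

# Tally reported on the slide: complete structures plus partial-read evidence
print(77 + 36)
```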
Novel Acceptor A8c: novel splice acceptor A8c creates new ORFs in HIV.
Dynamic Transcript Populations: mutually exclusive acceptors.
Dynamic Transcript Populations: temporal, cell-type and intra-human variability.
Conclusions: Long-read single molecule sequencing works well to delineate HIV message populations. At least 113 different HIV-1 transcripts. 1 kb class of RNAs prominent in HIV 89.6. Differential splicing by cell type, time after infection, and among cells from human subjects.
Credits (Bushman laboratory, Former Bushman Lab, Collaborators): Troy Brady, Gary Wang, Charles Berry, Kyle Bittinger, Brett Beitzel, Sumit Chanda, Rohini Sinha, Mary Lewinski, John Young, Scott Sherrill-Mix, Astrid Schroder, Renate Koenig, Frances Male, Angela Ciuffi, Joe Ecker, Christian Hoffmann, Heather Marshall Rose, Craig Hyde, Nirav Malani, Jeremy Leipzig, Mark Yeager, Brendan Kelly, Matt Culyba, Kushol Gupta, Young Hwang, Rick Mitchell, Greg Van Duyne, Stephanie Grunberg, Tracy Diamond, Masahiro Yamashita, Serena Dollive, Emily Charlson, Mike Emerman, Alexandra Bryson, Shannah Roth, Francis Collins, Sam Minot, Karen Ocwieja, Philippe Leboulch, Spencer Barton, Keshet Ronen, Alain Fischer, Aubrey Bailey, Greg Peterfreund, Marina Cavazzana-Calvo, Rithun Mukherjee, Salima Hacein-Bey-Abina, Jennifer Hwang, Rik Gijsbers, Kristine Yoder, Zeger Debyser, Rebecca Custers-Allen
Solexa/Illumina HiSeq, 100-base paired-end reads: 2 uninfected samples, 3 infected samples; HIV 89.6 in human T-cells; ~1 billion sequence reads, both human and HIV. RNA in infected cells is 14% viral. Ratios among HIV message forms. HIV infection associated with intron retention in cellular genes.
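As a sketch of how a figure such as "14% viral" could be derived, assuming it is the share of mapped reads that align to HIV rather than to the human genome (the slide does not state the exact definition), the calculation is a simple ratio. The function name and the read counts are hypothetical.

```python
def viral_fraction(hiv_reads: int, human_reads: int) -> float:
    """Fraction of mapped reads that align to HIV rather than the human genome."""
    total = hiv_reads + human_reads
    if total == 0:
        raise ValueError("no mapped reads")
    return hiv_reads / total


# Hypothetical counts chosen only to illustrate the arithmetic
print(f"{viral_fraction(140, 860):.0%}")
```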