1
RNA-SEQ IN PPMI Whole-Blood samples
Kendall Craig, Keller, Cookson, Van Keuren-Jensen, May 3rd, 2018
2
Overview
» PI: Cookson, Craig, Keller, Van Keuren-Jensen
» Background: Develop a comprehensive RNA resource from whole blood samples that can be easily accessed and utilized by the PD research community
» Methods: Whole Transcriptome RNA-Seq from PaxGene whole blood
» Results (brief with possible figures): Phase I will be released at the end of May
» Next steps: Complete and release Phase II and small RNA-Seq for all samples. Complete data portal for easy/accessible gene look up
» Relevance for PPMI: Resource for examining transcriptomic changes, their relationship to genetic and clinical variables, and their potential as PD biomarkers
3
RNA-SEQ: BEYOND TRANSCRIPT QUANTIFICATION
Pre-spliced, Spliced, Strandedness
Junctions/Isoforms
Alternative start-sites
Integrated to individual w/ DNA SNPs
Allele specific expression
Non-sense mediated decay, eQTLs
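As an illustration of the allele specific expression item above, the sketch below tests whether reads at one heterozygous SNP depart from the 50/50 split expected when both alleles are expressed equally. It is a minimal exact binomial test, not the study's method; the function name and read counts are hypothetical, and a real analysis would also need to handle reference-mapping bias and overdispersion.

```python
from math import comb

def ase_binomial_p(ref_reads, alt_reads):
    """Two-sided exact binomial p-value against a 50/50 allelic ratio.

    Hypothetical helper for illustration only. With p = 0.5 every outcome k
    has probability comb(n, k) / 2**n, so integer arithmetic is exact.
    """
    n = ref_reads + alt_reads
    observed = comb(n, ref_reads)
    # Sum the probability of every outcome at least as unlikely as the observed one.
    tail = sum(c for c in (comb(n, k) for k in range(n + 1)) if c <= observed)
    return tail / 2**n

# Hypothetical read counts at a single heterozygous site.
balanced = ase_binomial_p(10, 10)   # equal support for both alleles
skewed = ase_binomial_p(30, 5)      # strong reference-allele excess
print(f"balanced p = {balanced:.3f}, skewed p = {skewed:.2e}")
```

A small p-value only flags a candidate site; testing many SNPs would also call for multiple-comparison correction.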
4
STUDY FLOW
1 HudsonAlpha Institute for Biotechnology
2 TGen/USC
3 PPMI-INFO LONI
Data Generation: Data Transferred to Basespace; Limited LIMS Info
Quality Control, Analysis, Portal: Initial Quality Control; Analysis Tables; Pilot Portal For Querying
Data Sharing/Unblinding: Data Portal; Production Data Management; Coordination
Files: BCL / FASTQ Folders; BAMs/FASTQs; Tables
Pilot Study
Long RNA: Comparison of 4 Different Assays on 48 Samples; Comparison across 2 sequencing platforms and 3 sites; Analysis of Depth vs. Species Detection
Small RNA: Comparison of 2 Different Assays
Main Study
Long RNA: NovaSeq Sequencing; 200M Reads/100M Read Pairs; 4,623 Samples; Kappa + NEB; 27 Plates Completed, Topping off underway
Small RNA: Finishing QC for production
5
Samples
Subjects: Baseline, 6 Months, YR1, YR2, YR3
PD: 426, 285, 365, 361, 360
196, 179, 184, 171, 166
Prodromal: 60, 61, 56
SWEDD: 64, 53, 57
LRRK2 Affected: 118, 135, 108
LRRK2 unaffected: 79, 138, 98, 43
SNCA and GBA: 200
Genetic Registry: 417, 86
Samples: Longitudinal, 4,500+ Samples on 1,000+ Individuals
6
DATA AND ANALYSIS FLOW
FASTQs, BAMs, Tables: Initial Release
Alignment: STAR; Genome Build/Transcripts Mirror TopMed; RNA-SeQC
Differential Analysis: Multiple Comparisons; DESeq2
Descriptive Analysis: Gene/Transcript quantification; Salmon
Integrative Analysis: +VCFs from WGS; Variant Calling, QTLs
Inputs: FASTQ Files; BAM Files; VCFs (WGS DNA)
Outputs: Counts (Gene, Transcript); TPMs (Gene, Transcript); P-val, Fold Δ (Gene, Transcript); Stats, Reads..
FASTQs: Each ~10-12 Gbyte; Phase 1: Tbytes
BAMs: Phase 1: Tbytes
Tables: Each 1-5 Mb; Study: <10 Gbytes
Tables: Sufficient for many uses.
FASTQs/BAMs: Require shipping of hard-drives
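For readers unfamiliar with the TPM tables mentioned in the flow above, the sketch below shows the standard transcripts-per-million calculation from counts and effective transcript lengths. It is a plain-Python illustration, not Salmon's code or the study pipeline; the function name and the example counts and lengths are hypothetical.

```python
def counts_to_tpm(counts, effective_lengths):
    """Convert per-transcript read counts to TPM for a single sample.

    Hypothetical helper for illustration; tools such as Salmon estimate
    counts and effective lengths themselves before reporting TPM.
    """
    # Reads per base of transcript, so long transcripts are not favoured.
    rates = [c / l for c, l in zip(counts, effective_lengths)]
    total = sum(rates)
    # Rescale so the values for the sample sum to one million.
    return [r / total * 1e6 for r in rates]

# Hypothetical counts and effective lengths (bp) for three transcripts.
tpm = counts_to_tpm([100, 200, 300], [1000, 2000, 1000])
print([round(x) for x in tpm])
```

Because every sample sums to the same total, TPM values are convenient for descriptive look-ups, while count tables remain the usual input for differential analysis tools such as DESeq2.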
7
PHASE 1 PROGRESS: Initial 2,047 Samples
Ave. Read Pairs: Mil or 218 Mil Reads
Med. Read Pairs: 107 Mil.
Over 90M: 1,944/2,437 Samples
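The depth figures above mix read pairs and individual reads. The short sketch below makes the conversion explicit, assuming paired-end sequencing with two reads per pair, and computes the share of samples over 90M from the counts given on the slide. The variable and function names are illustrative only.

```python
READS_PER_PAIR = 2  # assumption: paired-end sequencing, two reads per fragment

def pairs_to_reads(read_pairs_millions):
    """Convert read pairs (in millions) to individual reads (in millions)."""
    return read_pairs_millions * READS_PER_PAIR

target_reads = pairs_to_reads(100)   # study target of 100M read pairs
median_reads = pairs_to_reads(107)   # reported median read pairs

# Share of samples over 90M, using the counts as given on the slide.
frac_over_90m = 1944 / 2437

print(f"target: {target_reads}M reads, median: {median_reads}M reads")
print(f"samples over 90M: {frac_over_90m:.1%}")
```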
8
Beyond Coding
9
Quality Control Overview
Principal Component Analysis By Plate
Principal Component Analysis By Percent Correct Strand: ~30% of Variance
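The kind of check described above can be sketched as follows: run PCA on a samples-by-genes expression matrix, then ask how much variance the first component explains and how strongly it tracks a QC metric such as percent correct strand. The data here are synthetic and every name is hypothetical; this is not the study's QC code, and the effect size is chosen only to make the pattern visible.

```python
import numpy as np

def pca(expr, n_components=2):
    """PCA via SVD on a samples x genes matrix (e.g. log-scaled expression)."""
    centered = expr - expr.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # fraction of variance per component
    scores = u[:, :n_components] * s[:n_components]
    return scores, explained[:n_components]

# Synthetic data: 60 samples, 200 genes, with a technical effect tied to a QC metric.
rng = np.random.default_rng(0)
n_samples, n_genes = 60, 200
pct_correct_strand = rng.uniform(0.90, 0.99, size=n_samples)
loading = rng.normal(size=n_genes)               # how each gene responds to the metric
expr = rng.normal(size=(n_samples, n_genes)) + 100.0 * np.outer(
    pct_correct_strand - pct_correct_strand.mean(), loading
)

scores, explained = pca(expr)
r = np.corrcoef(scores[:, 0], pct_correct_strand)[0, 1]
print(f"PC1 explains {explained[0]:.0%} of variance; |r| with QC metric = {abs(r):.2f}")
```

The same scores can be colored by plate instead of correlated with a continuous metric to look for batch effects.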
10
Data Exploration Portal
Interactive Querying of Study Data
Landing Page
Study Overview
Pilot Study
Quality Control
Study Detail
Gene Query
Allows exporting of CSV/XLS
Allows visualization and comparison to other databases such as GTEx
Crosslinked to Relevant Biology and Disease Relevance
Portable/Secure
Single public HTML, w/ JSON API
Integrates via Node.js to OAuth SSO
MongoDB, BigQuery version
HTML5/JS, D3.js
Cross-platform Validated
11
Acknowledgements
Ivo Violich
Michelle Webb
Eric Alsop, Ph.D.
Elizabeth Hutchins, Ph.D.
Karen Crawford, Ph.D.
Art Toga, Ph.D.
Shawn Levy, Ph.D. + Team