Download presentation
Presentation is loading. Please wait.
1
WV-INBRE West Virginia IDeA Network of Biomedical Research Excellence Managing the NextGen data pipeline Jim Denvir, Ph.D.
2
NextGen data challenges NextGen Sequencing produces very large data sets – Order of Terabytes (10 12 bytes) per run Data analysis requires considerable computing power and specialist management Main challenge is in distilling useful information from raw data WV-INBRE West Virginia IDeA Network of Biomedical Excellence
3
Core Facility support Bioinformatics and Genomics core facilities provide support for investigators needing to have NextGen Sequencing data analyzed – Perform analysis from early part of pipeline – Perform downstream analysis, or provide support and software for individual investigators Depending on needs and expertise of investigator WV-INBRE West Virginia IDeA Network of Biomedical Excellence
4
NextGen Analysis Pipeline Image Analysis Base Calling Demultiplexing* Alignment SNP calling or RNA Read Counting Statistical Analysis Functional Analysis CASAVA (Illumina) or open source (Tuxedo Suite, R/Bioconductor) * May require custom scripts Partek or R/Bioconductor Real Time Analysis performed by RTA software on sequencer IPA Automated Core Facility Investigator WV-INBRE West Virginia IDeA Network of Biomedical Excellence
5
Commercial Tools Examples: RTA, CASAVA, Partek, IPA Pros: – Short learning curve Potentially can be used by individual investigators – Usually come with technical support and training Cons: – Expensive – Closed, proprietary source code WV-INBRE West Virginia IDeA Network of Biomedical Excellence
6
Open Source Examples: R/Bioconductor, Tuxedo suite Pros: – Free – Open source Enables rapid, community-led improvement Potentially more academically reviewable Cons: – Steeper learning curve Typically prohibitive for individual investigators – Sparse technical support WV-INBRE West Virginia IDeA Network of Biomedical Excellence
7
Tools developed on site Pros: – Can fill in missing functionality from available tools – Customized exactly to our needs – Potential for a revenue source Cons: – Development is very time consuming WV-INBRE West Virginia IDeA Network of Biomedical Excellence
8
Roadmap Experience from microarray data analysis suggests: – Start with commercial tools Rapid start-up enables us to focus on learning scientific basis for the analyses – Transition to open-source tools for some parts of pipeline Probably mid 2012-mid 2014 Provides for financial saving further down the road Sometimes better received by journal reviewers Initial steps of analysis pipeline and functional analysis will still be managed by commercial software – Develop custom solutions only when needed WV-INBRE West Virginia IDeA Network of Biomedical Excellence
9
Storing Data Archiving data from NextGen experiments requires a large amount of disc space Once analysis is complete, some raw image data will be deleted – Storage of data is more expensive than re-running an experiment! – Will consider exceptions for experiments which cannot be repeated WV-INBRE West Virginia IDeA Network of Biomedical Excellence
10
NextGen analysis server Genomics Core has a Linux server for managing analysis and storing data – Housed in Drinko library and managed by central campus IT staff Has 42 Terabytes of usable disc space – Uses redundant system to allow for potential of drive failures without losing data – Additionally, IT will back up data off site WV-INBRE West Virginia IDeA Network of Biomedical Excellence
11
Things to remember Core facilities are there to help! At experimental design stage, be sure you understand what analysis the core facility will perform – Would you prefer to have IPA done by the core, or would you prefer control over that stage If so, do you need training and/or support? WV-INBRE West Virginia IDeA Network of Biomedical Excellence
12
Questions ? WV-INBRE West Virginia IDeA Network of Biomedical Excellence Presentation available at http://users.marshall.edu/~denvir/presentations.html
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.