1
GenomeVIP: A Genomics Analysis Pipeline for Cloud Computing with Germline and Somatic Calling on Amazon’s Cloud
R. Jay Mashl
October 20, 2014
2
Turnkey Variant Analysis
Project: Provides a collection of analysis tools and computational frameworks for streamlined discovery and interpretation of genetic variants.
Tools: Pindel, BreakDancer, VarScan, Genome STRiP
Features: multi-tool variant discovery, cloud computing, scalability, extensibility
Genome Variant Investigation Portal: runs on the cloud (AWS) or locally
Poster #1678M (Monday)
tvap.genome.wustl.edu
3
Genome Variant Investigation Portal
Web server and interface for germline and somatic variant-discovery tools
Concurrent pipelines (SNV, indel, SV) with parallelization
Launchable on local machines or on the cloud through Amazon Web Services (AWS)
Download results from AWS via web browser
Tools:
VarScan: heuristic/statistical calling of single nucleotide variants (SNVs)
Pindel: indel detection for paired reads based on local realignment
BreakDancer: structural variant (SV) detection for paired reads
GenomeSTRiP (Harvard U.): structural variant detection and genotyping
4
Biological Discoveries (selected)
Comprehensive molecular portraits of human breast tumours
Identified four main types by combining data from five platforms
Nature 490, (2012)

Clonal evolution in relapsed acute myeloid leukaemia
“Cancer” consists of multiple variants; the founding clone may give rise to the relapse clone; subclones may survive therapy and mutate further
Nature 481, (2012)

Genomic Landscape of Non-Small Cell Lung Cancer in Smokers and Never-Smokers
Among patients with lung cancer, smokers were found to have 10x more mutations than non-smokers
Cell 150, (2012)

Discovery & genotyping for structural variants in populations
~14,000 deletion polymorphisms with allelic states (1000G pilot)
Nature Genetics 43, (2011)
5
Application to APOL1: Demo
Representative samples from the PUR population from 1000 Genomes
Analyze within the range chr22 : Mbp for known variants:

Sample    Region          Variant       Isoforms
HG01242   22:36,661,906   A / G         G1 (non-silent)
HG01101   22:36,662,041   AATAATT / A   G2 (Δ6)
HG01049   22:36,133,448   Δ767bp
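The known-variant table can be held in a small lookup structure for checking results later. The following is a minimal sketch, not part of GenomeVIP: the names `KnownVariant`, `parse_locus`, and `in_region` are hypothetical, and the window bounds in the usage note are illustrative only, since the slide's exact range is not reproduced here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class KnownVariant:
    """One row of the known-variant table (hypothetical helper type)."""
    sample: str
    chrom: str
    pos: int
    change: str


def parse_locus(locus: str) -> Tuple[str, int]:
    """Turn a locus such as '22:36,661,906' into ('22', 36661906)."""
    chrom, pos = locus.split(":")
    return chrom.strip(), int(pos.replace(",", ""))


def in_region(variant: KnownVariant, chrom: str, start: int, end: int) -> bool:
    """True if the variant lies on chrom within [start, end] (inclusive)."""
    return variant.chrom == chrom and start <= variant.pos <= end


# Rows taken from the slide's table.
KNOWN = [
    KnownVariant("HG01242", *parse_locus("22:36,661,906"), "A/G (G1)"),
    KnownVariant("HG01101", *parse_locus("22:36,662,041"), "AATAATT/A (G2)"),
    KnownVariant("HG01049", *parse_locus("22:36,133,448"), "767 bp deletion"),
]
```

For example, `[v.sample for v in KNOWN if in_region(v, "22", 36_000_000, 37_000_000)]` lists the samples whose variant falls inside an assumed 36 to 37 Mbp window.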
6
Login
Select AWS. Click Next.
7
Sample & Reference Selection
Specify path & retrieve
Entering path: copy the given URI, then click Retrieve.
Select samples: click on all the PUR low_coverage items to transfer them to the Selected bams textbox.
Select reference: hs37d5.chr22.fa.
Click Next.
8
SNV Detection: VarScan
Check VarScan.
Select Germline.
Select SNVs only.
Select All (pooled) samples.
Select User-defined region and enter “22: ”.
Keep p-value: 0.99.
Set Output vcf: True.
Click Next.
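For readers who want to relate these portal selections to a command line, the sketch below assembles a roughly equivalent pair of commands: `samtools mpileup` restricted to a region, piped into VarScan's `mpileup2snp` with `--p-value` and `--output-vcf`. This is an assumption about what such a run might look like, not GenomeVIP's actual invocation; the function name, the jar path, and the BAM file names are hypothetical, and the region is left as a parameter because the slide's coordinates are not reproduced here. The commands are only built as argument lists, not executed.

```python
from typing import List, Sequence, Tuple


def build_varscan_germline_snv(
    bams: Sequence[str],
    reference: str,
    region: str,
    p_value: float = 0.99,
    varscan_jar: str = "VarScan.jar",  # hypothetical jar location
) -> Tuple[List[str], List[str]]:
    """Return (mpileup_cmd, varscan_cmd); the first is meant to be piped into the second."""
    # Pile up all selected BAMs together over the user-defined region.
    mpileup_cmd = ["samtools", "mpileup", "-f", reference, "-r", region, *bams]
    # Germline, SNVs only: mpileup2snp reading the pileup from standard input.
    varscan_cmd = [
        "java", "-jar", varscan_jar, "mpileup2snp",
        "--p-value", str(p_value),
        "--output-vcf", "1",
    ]
    return mpileup_cmd, varscan_cmd
```

The two lists could be connected with `subprocess.Popen`, passing the first process's standard output as the second's standard input; that wiring is omitted to keep the sketch free of side effects.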
9
Indel Detection: Pindel
Check Run Pindel.
Select All (pooled) samples.
Select User-defined region and enter “22: ”.
Click Next.
10
SV Detection: BreakDancer
Check BreakDancer.
In Step 1, select All (pooled) samples.
In Step 3, select Intra (ITX) only, select user-defined region, and enter “22: ”.
Click Next.
11
SV Detection & Genotyping: GenomeSTRiP
Check Run GenomeSTRiP.
Verify reference is hs37d5.chr22.fa.
Select mask human_g1k_v37.mask.36.fasta.
GC normalization: True, with cn2_mask_g1k_v37.fasta.
Chromosome: User-defined with “22: ”.
Variant size: 100 bp – 100 kbp.
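The variant-size setting acts as a window on event length. As a minimal illustration (the helper name and the inclusive bounds are assumptions, not GenomeSTRiP's documented behaviour), a check against a VCF-style signed `SVLEN` could look like this:

```python
def within_size_window(svlen: int, min_bp: int = 100, max_bp: int = 100_000) -> bool:
    """True if the event length falls inside [min_bp, max_bp].

    SVLEN is negative for deletions in VCF, so the absolute value is compared.
    """
    return min_bp <= abs(svlen) <= max_bp
```

With the defaults, `within_size_window(-762)` is `True`, while a 6 bp deletion (`within_size_window(-6)`) falls below the window.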
12
Amazon AWS Submission
Jobs have been tested to finish within a few minutes.
Select machine type.
Choose where to send results.
Validate & submit.
13
Results
22 36133341 DEL_1 T <DEL> …SVLEN=-762;SVTYPE=DEL
AATAATT A . PASS END= ;HOMLEN=4;HOMSEQ=ATAA;SVLEN=-6;SVTYPE=DEL;
A G . PASS ADP=7;WT=1;HET=0;HOM=0;NC=2
AATAATT A . PASS ADP=4;WT=0;HET=1;HOM=0;NC=2;
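The INFO strings in these records are semicolon-separated `KEY=VALUE` pairs, which makes them easy to inspect programmatically. The sketch below is a minimal hand-rolled parser (the name `parse_info` is hypothetical, and it is not a substitute for a full VCF library); integer-looking values are converted, empty values become `None`, and keys without `=` are treated as flags.

```python
from typing import Dict, Union

InfoValue = Union[int, str, bool, None]


def parse_info(info: str) -> Dict[str, InfoValue]:
    """Parse a VCF INFO string such as 'ADP=7;WT=1;HET=0' into a dict."""
    fields: Dict[str, InfoValue] = {}
    for item in info.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate trailing or doubled semicolons
        key, sep, value = item.partition("=")
        value = value.strip()
        if not sep:
            fields[key] = True   # flag-style key with no value
        elif value == "":
            fields[key] = None   # value missing in the source text
        else:
            try:
                fields[key] = int(value)
            except ValueError:
                fields[key] = value
    return fields
```

For instance, `parse_info("ADP=4;WT=0;HET=1;HOM=0;NC=2;")["HET"]` gives `1`, matching the heterozygous call reported for the `AATAATT` to `A` deletion.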
14
Poster #1678 / M (this afternoon)
Jay Mashl (genome.wustl.edu)
Kai Ye (genome.wustl.edu)
Li Ding (genome.wustl.edu)
...and with thanks to the Ding Lab members
National Human Genome Research Institute
15
Alternate slides
16
Amazon AWS S3 Data Retrieval
Links to actual files to be generated, along with merged VCF
Participants will identify variants in the output (left)
Prepared results available in case of technology problems
Click links to download