Calling Somatic Mutations using VarScan


1 Calling Somatic Mutations using VarScan
Read the documentation Read the Paper 1. Koboldt DC, Zhang Q, Larson DE, Shen D, McLellan MD, et al. (2012) VarScan 2: Somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. Available:

2 VarScan
USAGE: java -jar VarScan.jar [COMMAND] [OPTIONS]

COMMANDS:
    pileup2snp       Identify SNPs from a pileup file
    pileup2indel     Identify indels a pileup file
    pileup2cns       Call consensus and variants from a pileup file
    mpileup2snp      Identify SNPs from an mpileup file
    mpileup2indel    Identify indels an mpileup file
    mpileup2cns      Call consensus and variants from an mpileup file
    somatic          Call germline/somatic variants from tumor-normal pileups
    copynumber       Determine relative tumor copy number from tumor-normal pileups
    readcounts       Obtain read counts for a list of variants from a pileup file
    filter           Filter SNPs by coverage, frequency, p-value, etc.
    somaticFilter    Filter somatic variants for clusters/indels
    fpfilter         Apply the false-positive filter
    processSomatic   Isolate Germline/LOH/Somatic calls from output
    copyCaller       GC-adjust and process copy number changes from VarScan copynumber output
    compare          Compare two lists of positions/variants
    limit            Restrict pileup/snps/indels to ROI positions

3 Generate Pileup File
    -B        disable BAQ computation
    -q INT    skip alignments with mapQ smaller than INT [0]
    -Q INT    skip bases with baseQ/BAQ smaller than INT [13]
    -d INT    max per-BAM depth to avoid excessive memory usage [250]

samtools mpileup -B -Q 0 -q 5 -d -f harismendy_data/resources/hg19_lite.fa sampleB.bam > sampleB.mpileup
samtools mpileup -B -Q 0 -q 5 -d -f harismendy_data/resources/hg19_lite.fa sampleT.bam > sampleT.mpileup

4 VarScan Somatic
$ java -jar /opt/varscan/VarScan.v2.4.1.jar somatic
USAGE: VarScan somatic [normal_pileup] [tumor_pileup] [Opt: output] OPTIONS
    normal_pileup - The SAMtools pileup file for Normal
    tumor_pileup - The SAMtools pileup file for Tumor
    output - Output base name for SNP and indel output

OPTIONS:
    --output-snp - Output file for SNP calls [output.snp]
    --output-indel - Output file for indel calls [output.indel]
    --min-coverage - Minimum coverage in normal and tumor to call variant [8]
    --min-coverage-normal - Minimum coverage in normal to call somatic [8]
    --min-coverage-tumor - Minimum coverage in tumor to call somatic [6]
    --min-var-freq - Minimum variant frequency to call a heterozygote [0.10]
    --min-freq-for-hom Minimum frequency to call homozygote [0.75]
    --normal-purity - Estimated purity (non-tumor content) of normal sample [1.00]
    --tumor-purity - Estimated purity (tumor content) of tumor sample [1.00]
    --p-value - P-value threshold to call a heterozygote [0.99]
    --somatic-p-value - P-value threshold to call a somatic site [0.05]
    --strand-filter - If set to 1, removes variants with >90% strand bias [0]
    --validation - If set to 1, outputs all compared positions even if non-variant
    --output-vcf - If set to 1, output VCF instead of VarScan native format

java -jar /projects/ps-yeolab/biom262_2016/tools/varscan/VarScan.v2.3.9.jar somatic normalsample.mpileup tumorsample.mpileup sample --output-vcf
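To make the two frequency thresholds in the usage text concrete, here is a minimal sketch that computes the variant allele frequency as variant-supporting reads divided by total reads, then compares it with --min-var-freq and --min-freq-for-hom. The site names and read counts are hypothetical, and classifying on frequency alone is a simplifying assumption: VarScan also applies the coverage and p-value tests listed above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sites: name, reference-supporting reads, variant-supporting reads.
sites=$(printf 'siteA\t28\t12\nsiteB\t2\t38\nsiteC\t39\t1\n')

# Defaults copied from the usage text above.
min_var_freq=0.10
min_freq_for_hom=0.75

# Simplification: label each site by frequency only (no coverage or p-value test).
calls=$(printf '%s\n' "$sites" | awk -F'\t' -v het="$min_var_freq" -v hom="$min_freq_for_hom" '
{
    freq = $3 / ($2 + $3)            # variant allele frequency
    if (freq >= hom)      label = "hom"
    else if (freq >= het) label = "het"
    else                  label = "below-threshold"
    printf "%s\t%.3f\t%s\n", $1, freq, label
}')

printf '%s\n' "$calls"
```

With these toy counts, siteA (12 of 40 reads) falls between the two thresholds, siteB (38 of 40) is above the homozygote threshold, and siteC (1 of 40) is below both.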

5 VarScan SomaticFilter
This command filters somatic mutation calls to remove clusters of false positives and SNV calls near indels. Note: this is a basic filter. More advanced filtering strategies consider mapping quality, read mismatches, soft-trimming, and other factors when deciding whether or not to filter a variant. See the VarScan 2 publication (Koboldt et al, Genome Research, Feb 2012) for details.

USAGE: java -jar VarScan.jar somaticFilter [mutations file] OPTIONS
    mutations file - A file of SNVs from VarScan somatic

OPTIONS:
    --min-coverage   Minimum read depth [10]
    --min-reads2     Minimum supporting reads for a variant [2]
    --min-strands2   Minimum # of strands on which variant observed (1 or 2) [1]
    --min-avg-qual   Minimum average base quality for variant-supporting reads [20]
    --min-var-freq   Minimum variant allele frequency threshold [0.20]
    --p-value        Default p-value threshold for calling variants [1e-01]
    --indel-file     File of indels for filtering nearby SNPs
    --output-file    Optional output file for filtered variants

java -jar /projects/ps-yeolab/biom262_2016/tools/varscan/VarScan.v2.3.9.jar somaticFilter AA2253chr1.snp.vcf --indel-file AA2253Tchr1.indel.vcf --output-file AA2253chr1.somfiltered.vcf

6 FP Filter
USAGE: java -jar VarScan.jar fpfilter [variant file] [readcount file] OPTIONS
    variant file - A file of SNPs or indels in VarScan-native or VCF format
    readcount file - The output file from bam-readcount for those positions
    ***For detailed filtering instructions, please visit

OPTIONS:
    --output-file       Optional output file for filter-pass variants
    --filtered-file     Optional output file for filter-fail variants
    --dream3-settings   If set to 1, optimizes filter parameters based on TCGA-ICGC DREAM-3 SNV Challenge results
    --keep-failures     If set to 1, includes failures in the output file

FILTERING PARAMETERS:
    --min-var-count         Minimum number of variant-supporting reads [4]
    --min-var-count-lc      Minimum number of variant-supporting reads when depth below somaticPdepth [2]
    --min-var-freq          Minimum variant allele frequency [0.05]
    --max-somatic-p         Maximum somatic p-value [0.05]
    --max-somatic-p-depth   Depth required to test max somatic p-value [10]
    --min-ref-readpos       Minimum average read position of ref-supporting reads [0.1]
    --min-var-readpos       Minimum average read position of var-supporting reads [0.1]
    --min-ref-dist3         Minimum average distance to effective 3' end (ref) [0.1]
    --min-var-dist3         Minimum average distance to effective 3' end (var) [0.1]
    --min-strandedness      Minimum fraction of variant reads from each strand [0.01]
    --min-strand-reads      Minimum allele depth required to perform the strand tests [5]
    --min-ref-basequal      Minimum average base quality for ref allele [15]
    --min-var-basequal      Minimum average base quality for var allele [15]
    --min-ref-avgrl         Minimum average trimmed read length for ref allele [90]
    --min-var-avgrl         Minimum average trimmed read length for var allele [90]
    --max-rl-diff           Maximum average relative read length difference (ref - var) [0.25]
    --max-ref-mmqs          Maximum mismatch quality sum of reference-supporting reads [100]
    --max-var-mmqs          Maximum mismatch quality sum of variant-supporting reads [100]
    --max-mmqs-diff         Maximum average mismatch quality sum (var - ref) [50]
    --min-ref-mapqual       Minimum average mapping quality for ref allele [15]
    --min-var-mapqual       Minimum average mapping quality for var allele [15]
    --max-mapqual-diff      Maximum average mapping quality (ref - var) [50]

7 VarScan Filtering
Combine both VCF files
    vcfcombine AA2253chr1.snp.vcf AA2253chr1.indel.vcf > AA2253chr1.vcf
Get coordinates from VCF file
    grep -v '^#' AA2253chr1.vcf | awk '{print $1"\t"$2"\t"$2}' > coordinates.bed
Generate read-counts
    bam-readcount -q 1 -b 20 -l coordinates.bed -f harismendy_data/resources/hg19_lite.fa harismendy_data/class/AA2253T_groupRealigned.chr1.bam > AA2253T.readcount
Apply fpfilter
    java -jar /projects/ps-yeolab/biom262_2016/tools/varscan/VarScan.v2.3.9.jar fpfilter AA2253chr1.vcf AA2253T.readcount --output-file AA2253.filtered.vcf
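The coordinate-extraction step can be tried in isolation before running it on real calls. The sketch below applies the same grep/awk pipeline to a toy VCF; the file name toy.vcf and its two records are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)

# Toy VCF: two header lines and two records at hypothetical positions.
{
    printf '##fileformat=VCFv4.1\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf 'chr1\t12345\t.\tA\tG\t.\tPASS\tDP=40;SS=2\n'
    printf 'chr1\t67890\t.\tC\tT\t.\tPASS\tDP=35;SS=1\n'
} > "$workdir/toy.vcf"

# Same pipeline as in the slide: drop header lines, then print chrom, pos, pos.
grep -v '^#' "$workdir/toy.vcf" | awk '{print $1"\t"$2"\t"$2}' > "$workdir/coordinates.bed"

cat "$workdir/coordinates.bed"
```

Each output line repeats the position as both start and end, so every variant becomes a single-base region in the list passed to bam-readcount with -l.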

8 Handling VCF - VCFlib
Read the doc
Keep somatic only
    vcffilter -f "SS = 2" AA2253chr1.filtered.vcf > AA2253chr1.filtered.som.vcf
Export into a table using vcf2tsv
    vcf2tsv -g AA2253chr1.filtered.vcf > AA2253chr1.filtered.tsv
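As a cross-check on the somatic-only filter, the same selection can be sketched with plain awk on a toy file. This is not a replacement for vcffilter. It assumes the somatic status is stored in the INFO column as SS=<code>, with 2 meaning somatic, as in the vcffilter expression above; the file names and records are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)

# Toy VCF with one somatic (SS=2) record and two non-somatic records.
{
    printf '##fileformat=VCFv4.1\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf 'chr1\t100\t.\tA\tG\t.\tPASS\tDP=40;SS=2\n'
    printf 'chr1\t200\t.\tC\tT\t.\tPASS\tDP=35;SS=1\n'
    printf 'chr1\t300\t.\tG\tA\t.\tPASS\tDP=52;SS=3\n'
} > "$workdir/toy.vcf"

# Keep every header line, plus records whose INFO column (field 8) contains
# the exact key=value pair SS=2 (semicolons are padded so SS=20 would not match).
awk -F'\t' '/^#/ || index(";" $8 ";", ";SS=2;")' "$workdir/toy.vcf" > "$workdir/toy.som.vcf"

# Count the surviving records (non-header lines).
somatic_count=$(grep -vc '^#' "$workdir/toy.som.vcf")
echo "somatic records: $somatic_count"
```

Comparing this count with the number of records vcffilter keeps is a quick sanity check that the filter expression did what was intended.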

9 Annotating Variants with ANNOVAR
Read the doc
Run Table_annovar
    table_annovar.pl --vcfinput --nastring . --protocol refGene,snp144,cosmic70,exac03 --operation g,f,f,f --buildver hg19 --outfile AA2253chr1 AA2253chr1.filtered.vcf /projects/ps-yeolab/biom262_2016/tools/annovar/humandb/
Calculate stats
    R
    Unix (cut/sort/uniq -c)
    Excel (pivot table)
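The Unix route for calculating stats can be sketched as a cut/sort/uniq -c pipeline over one column of a tab-separated table. The table below is a toy stand-in: its three-column layout and values are illustrative assumptions, so the column index passed to cut would need adjusting for a real annotation table.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)

# Toy annotation table; column layout and values are illustrative only.
{
    printf 'Chr\tStart\tFunc.refGene\n'
    printf 'chr1\t100\texonic\n'
    printf 'chr1\t200\tintronic\n'
    printf 'chr1\t300\texonic\n'
    printf 'chr1\t400\tUTR3\n'
    printf 'chr1\t500\texonic\n'
    printf 'chr1\t600\tintronic\n'
} > "$workdir/toy.tsv"

# Skip the header, take column 3, count each distinct value, most frequent first.
tail -n +2 "$workdir/toy.tsv" | cut -f3 | sort | uniq -c | sort -rn > "$workdir/counts.txt"

cat "$workdir/counts.txt"
```

The first sort groups identical values so that uniq -c can count them; the final sort -rn only orders the summary by count.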

