Standards and Guidelines for Validating Next-Generation Sequencing Bioinformatics Pipelines Somak Roy, Christopher Coldren, Arivarasan Karunamurthy, Nefize S. Kip, Eric W. Klee, Stephen E. Lincoln, Annette Leon, Mrudula Pullambhatla, Robyn L. Temple-Smolkin, Karl V. Voelkerding, Chen Wang, Alexis B. Carter The Journal of Molecular Diagnostics Volume 20, Issue 1, Pages 4-27 (January 2018) DOI: 10.1016/j.jmoldx.2017.11.003 Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology Terms and Conditions
Figure 1 Next-generation sequencing (NGS) bioinformatics pipeline. The figure illustrates a bioinformatics pipeline and its components that are typically used for processing NGS data. The illustration may not represent nuances or additional algorithms that are specific to sequencing platforms. The components of the pipeline that overlap with the gray shaded region are out of scope of this guideline. BAM, binary alignment map; HGVS, Human Genome Variation Society; indel, insertion/deletion; QC, quality control; SNV, single-nucleotide variant; VCF, variant call format. The Journal of Molecular Diagnostics 2018 20, 4-27DOI: (10.1016/j.jmoldx.2017.11.003) Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology Terms and Conditions
Figure 2 Systematic literature review. The figure summarizes the strategy that was used to search the PubMed database for identifying articles for systematic literature review. The asterisk indicates a wildcard character meaning that the characters starting at the asterisk can be anything. [All Fields], all PubMed fields were searched for the term indicated; NGS, next-generation sequencing. The Journal of Molecular Diagnostics 2018 20, 4-27DOI: (10.1016/j.jmoldx.2017.11.003) Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology Terms and Conditions
Figure 3 Example of a horizontally complex variant. An example of a horizontally complex variant in exon 19 of the EGFR gene (http://www.genome.jp/dbget-bin/www_bget?refseq; accession number NM_005228.3: c.2236_2253delinsTTG: p.E746_T751delinsL). One deleted sequence is flanked by a deleted sequence on the left and a single-nucleotide change on the right. These individual variants (two deletions and one single-nucleotide substitution) represent a single haplotype, which is a well-characterized activating EGFR variant with therapeutic significance. The Journal of Molecular Diagnostics 2018 20, 4-27DOI: (10.1016/j.jmoldx.2017.11.003) Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology Terms and Conditions
Figure 4 Example of a vertically complex variant. An example of a vertically complex variant in exon 2 of the KRAS gene (https://www.ncbi.nlm.nih.gov/nuccore; accession numberNM_004985.3: c.35G>A:p.G12D; NM_004985.3: c.35G>T:p.G12V). The representative image of the sequence pileup demonstrates three alleles (one reference and two alternative alleles) at the same genomic position. The Journal of Molecular Diagnostics 2018 20, 4-27DOI: (10.1016/j.jmoldx.2017.11.003) Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology Terms and Conditions