Exome sequencing analysis of the mutational spectrum in carcinogen and genetic models of Kras-driven lung cancer Peter Westcott, Kyle Halliwill, Minh To,

Presentation transcript:

Exome sequencing analysis of the mutational spectrum in carcinogen and genetic models of Kras-driven lung cancer
Peter Westcott, Kyle Halliwill, Minh To, David Quigley, Reyno Del Rosario, Erik Fredlund, David Adams (1), and Allan Balmain
UCSF Helen Diller Family Comprehensive Cancer Center, rd Street, San Francisco.
(1) Wellcome Trust Sanger Centre, Cambridge, England.

Why sequence tumors from mice?
• Timing of initiation → collection
• Initiating gene(s), carcinogen(s)
• Can distinguish mutations involved in initiation from progression
Control!

Specific goals of this study
Part of the MMHCC TCGA Pilot Project
• What is the effect of the causative carcinogen on mutation spectrum?
Characterize the utility of sequencing mouse tumors:
• Clean genetic induction (GEM) vs. carcinogen induction?
• What mutations arise after Kras initiation?

Exome sequencing
Model            Lung tumors   Mice
Urethane         44            17
MNU              26            7
Kras LA2 (GEM)   13            4
Genotype labels: Kras +/- (FVB/Ola), Kras +/-, Kras +/+, Kras LA2 (FVB/Ola)
Control tail DNA: 2 Kras +/+ tails
Spontaneous lung tumors

Exome sequencing  Have a confident list of somatic variants  Have aligned reads to mouse genome, called against multiple controls and performed extensive QC (Kyle Hallilwill)  Illumina paired-end sequencing (Wellcome Trust Sanger Centre)

Exome sequencing

Carcinogen models of Kras-driven lung cancer  ~90% of lung tumors harbor Kras mutations. Urethane (ethyl carbamate)  Adenosine and cytidine DNA adducts lead to mispairing:  Kras Q61L (CAA  CTA), Q61R (CAA  CGA). A T Replication Mispairing

Carcinogen models of Kras-driven lung cancer
MNU (methyl-nitroso urea)
• ~90% of lung tumors harbor Kras mutations
• Guanosine DNA adducts lead to G → A transitions
• Kras G12D (GGT → GAT)
Genome-wide spectrum of these carcinogen mutations not known
[Figure labels: G, A, Replication, Mispairing]
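
The codon changes named on the two carcinogen slides can be checked against the standard genetic code. The following is a minimal sketch, not part of the original deck; the function name `protein_change` is hypothetical, and the codon table is the standard one built from the conventional TCAG ordering.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order
# for the first, second and third codon positions.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def protein_change(ref_codon, alt_codon, codon_number):
    """Return the amino-acid change (e.g. 'G12D') for a codon substitution."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    return f"{ref_aa}{codon_number}{alt_aa}"


if __name__ == "__main__":
    # Codon changes as written on the slides.
    for ref, alt, number in [("CAA", "CTA", 61), ("CAA", "CGA", 61), ("GGT", "GAT", 12)]:
        print(ref, "->", alt, "=", protein_change(ref, alt, number))
```

The urethane changes are both substitutions at the middle A of codon 61, and the MNU change is a G → A transition at the middle G of codon 12, which matches the adduct chemistry described on each slide.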

Mutation spectrum
[Figure: mutation spectra for Urethane, MNU and LA2 tumors. Light shade = Kras +/-]

Mutation spectrum
• Slight bias for mutations at G/C nucleotide
• Strong bias for mutations at G nucleotide with flanking G or A
• Strong bias for mutations at A/T nucleotide

Mutation spectrum
• Purine bias at 5’ flanking base
[Figure: average counts per tumor, panels for 5’ A and 5’ G]

Mutation spectrum
• Are non-carcinogen mutations separable? For the most part
[Figure: average counts per tumor for Urethane, MNU and LA2; categories: NCG->T, Other G->A, A->T, A->G, A->C, G->C, G->T]
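
A spectrum like the one summarized on these slides is typically built by collapsing each substitution onto one strand and counting classes together with the flanking base. The sketch below is an illustration under stated assumptions, not the authors' code: it reports every change relative to the purine (A or G), as the slide categories do, and the function names and example calls are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def classify(context, alt):
    """Classify one substitution.

    context: reference trinucleotide centred on the mutated base (5'->3').
    alt:     the mutant base on the same strand.
    Returns (5' flanking base, 'ref>alt') with the reference base reported
    as a purine; pyrimidine references are flipped to the other strand.
    """
    context = context.upper()
    alt = alt.upper()
    if context[1] in "CT":
        context = reverse_complement(context)
        alt = reverse_complement(alt)
    return context[0], f"{context[1]}>{alt}"


def spectrum(calls):
    """Count (5' flank, class) pairs over an iterable of (context, alt) calls."""
    return Counter(classify(context, alt) for context, alt in calls)


# Hypothetical calls, for illustration only.
example_calls = [("GGT", "A"), ("ACC", "T"), ("CAA", "T"), ("TTG", "A")]

if __name__ == "__main__":
    for (flank, change), count in sorted(spectrum(example_calls).items()):
        print(f"5' {flank}  {change}  {count}")
```

Flipping pyrimidine references to the opposite strand means a C → T call and its G → A equivalent land in the same class, and the 5’ flank is always read on the strand that carries the purine.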

ARE CARCINOGEN MUTATIONS RELEVANT?

Other driver mutations?
• Analysis complicated:
  High mutation rates: MNU 21.2/Mb, Urethane 6.4/Mb, LA2 1.9/Mb
  Correlation between gene length and mutations
• Start with variants within Vogelstein’s 2013 list of drivers:
  Selected only consequential mutations at highly conserved sites in expressed genes
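
To see why high mutation rates and gene length complicate driver calling, consider the expected number of passenger mutations per gene under a uniform background rate. This is a rough sketch only: the uniform-rate assumption, the function name, and the two gene lengths are illustrative and do not come from the deck; only the per-Mb rates are taken from the slide.

```python
# Per-megabase mutation rates quoted on the slide.
RATES_PER_MB = {"MNU": 21.2, "Urethane": 6.4, "LA2": 1.9}

# Hypothetical exon lengths (bp), chosen only to contrast a short and a long gene.
HYPOTHETICAL_LENGTHS_BP = (2_000, 8_000)


def expected_mutations(rate_per_mb, exon_length_bp):
    """Expected mutations per tumor in one gene, assuming a uniform rate."""
    return rate_per_mb * exon_length_bp / 1_000_000


if __name__ == "__main__":
    for model, rate in RATES_PER_MB.items():
        for length in HYPOTHETICAL_LENGTHS_BP:
            per_tumor = expected_mutations(rate, length)
            print(f"{model:9s} {length:6d} bp  {per_tumor:.4f} expected per tumor")
```

Under this simplification the expectation scales linearly with both rate and length, so long genes in MNU tumors accumulate hits by chance far more often than short genes in LA2 tumors.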

Other driver mutations?
GENE     EXON_LENGTH  NONSYN_MUT
Mll
Sf3b
Crebbp   7507         4
Asxl
Pdgfra   6553         3
Met      6652         3
Cic      6099         3
Atm
Arid1b
Alk      5918         3
Gnas     3717         2
Notch
Arid1a   8175         2
Fgfr
Hnf1a    3186         2
Flt
Brca
Akt
Rb
• None of these mutations occur in LA2 tumors
• Slight enrichment for longer genes
• Modest increase in NS mutation ratio
• One S367 to F: required for autophosphorylation and activity
• Subclonal Myc T58P?

Conclusions
Mutation Spectrum
• Clear recapitulation of expected carcinogen mutations
• GEM shows few mutations
• Mutations highly specific and distinguishable
Driver Mutations
• Kras
• Interesting candidates in carcinogen-induced tumors

Future work
• InDel analysis.
• Optimize list of potential driver mutations (relevant sites?).
• Validate top 1000 interesting variants by Sequenom (Wellcome Trust Sanger Centre).
• Array CGH (copy number analysis). Inverse correlation of point mutational burden and copy number changes?

Acknowledgments
Kyle Halliwill, Minh To, David Quigley, Reyno Del Rosario, Erik Fredlund
Allan Balmain
David Adams (Wellcome Trust Sanger Centre)
$: NSF
$: NIH Training Grant T32 GM
$: MMHCC

Supplemental (Kyle’s Pipeline)
• Capture using Agilent mouse whole exome kit
• Sequenced on Illumina HiSeq: paired end, 75 bp each, average read span of 180 bp
• Converted back to FASTQ, then followed QC pipeline (next slide)

Supplemental (Kyle’s Pipeline): QC and Variant Calling Strategy
1. Align to mm10 with BWA
2. Mark duplicates and fix mate information with Picard
3. Base recalibration and realignment with GATK
4. Alignment and coverage information with Picard
5. Variant calling with MuTect
6. Filter for depth and previously observed variants with vcftools
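
The final filtering step (depth plus previously observed variants) can be pictured with a small in-memory example. The deck does this with vcftools; the sketch below only illustrates the logic in plain Python, and the `Variant` record, the `min_depth` default of 10, and all coordinates are hypothetical values not taken from the slides.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    depth: int  # read depth at the site


def filter_variants(variants, known, min_depth=10):
    """Keep calls with enough depth that are not in the set of known variants.

    known is a set of (chrom, pos, ref, alt) tuples, e.g. previously
    observed germline or strain variants. min_depth is an assumed threshold.
    """
    kept = []
    for v in variants:
        key = (v.chrom, v.pos, v.ref, v.alt)
        if v.depth >= min_depth and key not in known:
            kept.append(v)
    return kept


# Hypothetical calls and known-variant set, for illustration only.
calls = [
    Variant("chr1", 1000, "G", "A", 42),  # passes both filters
    Variant("chr1", 2000, "A", "T", 4),   # fails the depth filter
    Variant("chr2", 3000, "C", "T", 35),  # previously observed, removed
]
previously_observed = {("chr2", 3000, "C", "T")}

if __name__ == "__main__":
    for v in filter_variants(calls, previously_observed):
        print(v)
```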

Supplemental (Kyle’s Pipeline): Variant Calling Details
1. Variant calling via MuTect: Sample.bam against Control 1.bam gives Variant List1.vcf; Sample.bam against Control 2.bam gives Variant List2.vcf
2. Intersect Variant List1.vcf and Variant List2.vcf to get Candidate Variant List.vcf
3. Filter, Annotate: Candidate Variant List.vcf gives Candidate Variants
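
The intersect step keeps only variants called against both controls. A minimal sketch of that idea, assuming tab-separated VCF data lines (CHROM, POS, ID, REF, ALT in the first five columns) and matching on chromosome, position, reference and alternate allele; the function names and the example records are hypothetical, and this is not the pipeline's actual implementation.

```python
def variant_key(line):
    """Build a (chrom, pos, ref, alt) key from one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[1]), fields[3], fields[4]


def read_keys(lines):
    """Collect keys from VCF lines, skipping headers and blank lines."""
    return {
        variant_key(line)
        for line in lines
        if line.strip() and not line.startswith("#")
    }


def intersect_calls(lines_control1, lines_control2):
    """Return the sorted variants present in both call sets."""
    return sorted(read_keys(lines_control1) & read_keys(lines_control2))


# Hypothetical call sets standing in for the two per-control variant lists.
list1 = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t1000\t.\tG\tA",
    "chr1\t2000\t.\tA\tT",
]
list2 = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t1000\t.\tG\tA",
    "chr3\t500\t.\tC\tG",
]

if __name__ == "__main__":
    for key in intersect_calls(list1, list2):
        print(key)
```

Requiring agreement across two independent controls trades some sensitivity for fewer false positives caused by poor coverage or artifacts in a single control sample.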