Computational methods for genomics-guided immunotherapy

Presentation transcript:

Computational methods for genomics-guided immunotherapy Sahar Al Seesi and Ion Măndoiu Computer Science & Engineering Department University of Connecticut

Class I endogenous antigen presentation

Somatic rearrangement of T-cell receptor genes
Potential TCR repertoire diversity: 10^15

T-cell selection in thymus
Estimated TCR repertoire diversity after selection: ~2x10^7

T-cell activation and proliferation


The immune system and cancer

Cutting the brakes: PD1 and CTLA-4 blockade

Stepping on the gas: vaccination with neoepitopes

Combined approach (Ton N. Schumacher and Robert D. Schreiber, Science 2015;348:69-74)

Pipeline stages: Sequencing > QC and Mapping > Calling SNVs > Epitope Prediction > Clonality Analysis > Vaccine Design > TCR Sequencing

Sequencing (Illumina)
Tumor DNA and Normal DNA: library prep with Nextera Rapid Capture Exome, or Whole Genome
Tumor RNA: Whole Transcriptome library prep
Sequencing on Illumina HiSeq

Sequencing (Ion Torrent)
Tumor DNA and Normal DNA: library prep with Exome AmpliSeq, or Whole Genome
Tumor RNA: Whole Transcriptome library prep
Sequencing on Ion PGM or Ion Proton

ION-Torrent Proton Runs: read statistics

Sample group      Sample     # of reads   # of bases     Mean read length  Std of read length
Melanoma Patient  PB         25,031,340   3,272,787,408  130.74            66.88
Melanoma Patient  1T         19,252,932   2,589,624,915  134.5             68.67
Melanoma Patient  2T         28,400,728   4,147,914,801  146.04            66.91
Melanoma Patient  3T         26,039,006   3,800,446,471  145.95            67.02
Synthetic Tumor   Normal     20,726,352   3,353,732,704  161.81            63.35
Synthetic Tumor   TumorAF10               3,360,840,809  162.15            63.14
Synthetic Tumor   TumorAF20               3,367,827,877  162.49            62.92

QC and Mapping
Tumor exome reads, normal exome reads, and tumor RNA-Seq reads are each mapped to the human reference

fastq QC Tools
Tools to analyze and preprocess fastq files
FASTX (http://hannonlab.cshl.edu/fastx_toolkit/)
  Charts quality statistics
  Filters sequences based on quality
  Trims sequences based on quality
  Collapses identical sequences into a single sequence

fastq QC Tools
Tools to analyze and preprocess fastq files
PRINSEQ (http://prinseq.sourceforge.net/)
  Generates read length and quality statistics
  Filters reads based on length, quality, GC content and other criteria
  Trims reads based on length/position or quality scores
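The trimming and filtering steps that tools such as FASTX and PRINSEQ perform can be illustrated with a minimal sketch. This is not the code of either tool: the function names, the Phred+33 encoding, and the thresholds `min_q` and `min_len` are all assumptions chosen for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

Read = Tuple[str, str, str]  # (name, sequence, quality string)


def parse_fastq(lines: List[str]) -> Iterator[Read]:
    """Yield (name, sequence, quality) from plain 4-line FASTQ records."""
    for i in range(0, len(lines) - 3, 4):
        name = lines[i].strip()[1:]      # drop the leading '@'
        seq = lines[i + 1].strip()
        qual = lines[i + 3].strip()
        yield name, seq, qual


def phred_scores(qual: str, offset: int = 33) -> List[int]:
    """Decode a quality string, assuming Phred+33 encoding."""
    return [ord(c) - offset for c in qual]


def trim_3prime(seq: str, qual: str, min_q: int = 20) -> Tuple[str, str]:
    """Remove trailing bases whose quality is below min_q."""
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


def filter_reads(reads: Iterable[Read], min_q: int = 20,
                 min_len: int = 5) -> Iterator[Read]:
    """Trim each read, then keep it only if enough bases remain."""
    for name, seq, qual in reads:
        seq, qual = trim_3prime(seq, qual, min_q)
        if len(seq) >= min_len:
            yield name, seq, qual


# Hypothetical two-read input: 'I' encodes Q40 and '#' encodes Q2.
example = ["@r1", "ACGTACGT", "+", "IIIIII##",
           "@r2", "ACGT", "+", "####"]
kept = list(filter_reads(parse_fastq(example)))
```

In this toy input the first read loses its two low-quality trailing bases and the second read is discarded entirely.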

Mapping decisions
What is the best mapper for your data?
  End-to-end unspliced alignments vs. spliced or local alignments
  Unique vs. non-unique alignments
Inputs: tumor exome reads, normal exome reads, and tumor RNA-Seq reads, each mapped to the human reference

ION-Torrent Proton read mapping comparison
(% bases = % of aligned bases; Mean len = mean length of aligned reads)

                               Bowtie2              TMAP                 Segemehl
Sample group      Sample       % bases  Mean len    % bases  Mean len    % bases  Mean len
Melanoma Patient  PB           89%      138.8       100%     130.7       99%      135.4
Melanoma Patient  1T           90%      143.3                134.5                140.4
Melanoma Patient  2T                    153.3                146                  150.3
Melanoma Patient  3T           91%      153.4                                     150.4
Synthetic Tumor   Normal                172.89               161.81      98%      167.66
Synthetic Tumor   Tumor_AF10            173.03               162.15               167.87
Synthetic Tumor   Tumor_AF20            173.17               162.49               168.08

Calling SNVs
[Figure: tumor exome reads and normal exome reads aligned to the human reference, with positions marked by asterisks]

Somatic Variant Callers
  Mutect (Broad Inst.)
  VarScan2 (Wash. U.)
  SomaticSniper (Wash. U.)
  Strelka (Illumina)
  SNVQ w/ subtraction (UConn)
[Figure: tumor exome reads and normal exome reads aligned to the human reference, with positions marked by asterisks]
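The "subtraction" idea named in the list above can be sketched as a set difference: call variants in tumor and normal separately, then keep tumor calls that are absent from the normal. The sketch below is a simplification and not the implementation of SNVQ or of any listed caller; the `Variant` type, the depth lookup, and the `min_normal_depth` threshold are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Variant(NamedTuple):
    chrom: str
    pos: int   # 1-based position
    ref: str
    alt: str


def subtract_calls(tumor_calls: Iterable[Variant],
                   normal_calls: Iterable[Variant],
                   normal_depth: Dict[Tuple[str, int], int],
                   min_normal_depth: int = 8) -> Set[Variant]:
    """Keep tumor variants that are not called in the normal sample.

    A call is kept only when the normal sample has enough coverage at
    that position, so that absence in the normal is informative.
    """
    normal = set(normal_calls)
    somatic = set()
    for v in tumor_calls:
        depth = normal_depth.get((v.chrom, v.pos), 0)
        if v not in normal and depth >= min_normal_depth:
            somatic.add(v)
    return somatic


# Hypothetical calls: one germline variant and one poorly covered site.
tumor = {Variant("chr1", 100, "A", "T"),
         Variant("chr1", 200, "G", "C"),
         Variant("chr2", 50, "C", "T")}
normal = {Variant("chr1", 200, "G", "C")}
depth = {("chr1", 100): 30, ("chr1", 200): 25, ("chr2", 50): 3}
somatic = subtract_calls(tumor, normal, depth)
```

The coverage check matters because a variant can be missing from the normal call set simply because the normal sample was too shallow at that site.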

Coverage distribution of exome vs. SNV calls

Comparing Somatic Variant Callers
Synthetic Tumors
  ION Torrent Proton exome sequencing of two 1000 Genomes individuals (mutations known)
  Downloaded from the public Torrent server
  Both exomes were sequenced on the same Proton chip
  A subset of the NA19240 sample was used as the normal sample
  Mixtures of the NA19240 and NA12878 samples were used as the tumor samples
  Reads were mixed in different proportions to simulate allelic fractions of 0.1, 0.2, 0.3, 0.4 and 0.5
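The read-mixing step used to build the synthetic tumors can be sketched as random sampling from two read sets. This is only an illustration: the function name, the `donor_fraction` parameter, and the fixed seed are assumptions, and how a mixing proportion translates into an allelic fraction depends on the zygosity of each variant.

```python
import random
from typing import List


def mix_reads(normal_reads: List[str], donor_reads: List[str],
              donor_fraction: float, total: int, seed: int = 0) -> List[str]:
    """Build a synthetic tumor read set of size `total`.

    `donor_fraction` of the reads are drawn (without replacement) from
    the second individual, the rest from the individual used as normal.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    n_donor = round(total * donor_fraction)
    n_normal = total - n_donor
    if n_donor > len(donor_reads) or n_normal > len(normal_reads):
        raise ValueError("not enough reads to sample from")
    mixed = rng.sample(donor_reads, n_donor) + rng.sample(normal_reads, n_normal)
    rng.shuffle(mixed)
    return mixed


# Hypothetical read identifiers standing in for the two samples.
na19240 = [f"N{i}" for i in range(100)]
na12878 = [f"D{i}" for i in range(100)]
synthetic = mix_reads(na19240, na12878, donor_fraction=0.2, total=50)
n_from_donor = sum(r.startswith("D") for r in synthetic)
```

Sampling without replacement keeps every read in the mixture unique, which avoids creating artificial duplicates that a caller might filter out.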

Comparing Somatic Variant Callers


The ICGC-TCGA DREAM Somatic Mutation Calling Challenge
Initial Goal: Find the Best WGS Analysis Methods
Challenge 1 Data:
  10 real tumor/normal pairs: 5 from pancreatic tumors and 5 from prostate tumors
  Sequenced to ~50x/30x
  Up to 10K candidates will be validated by re-sequencing to ~300x coverage using AmpliSeq primers on an IonTorrent

Criteria for selecting candidate epitopes
Gene harboring the SNV must be expressed (FPKM estimation)
  IsoEM (Nicolae et al., Algorithms for Molecular Biology, 2011) http://dna.engr.uconn.edu/?page_id=105
  RSEM (Li et al., BMC Bioinformatics, 2011) http://deweylab.biostat.wisc.edu/rsem/
Candidates in genes that are not expressed are discarded (marked X in the figure)
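The expression criterion can be made concrete with the standard FPKM definition (fragments per kilobase of transcript per million mapped fragments). The sketch below only applies that formula to raw counts; IsoEM and RSEM estimate expression with more elaborate models, and the `min_fpkm` cutoff and the example numbers here are hypothetical.

```python
def fpkm(fragment_count: float, length_bp: int, total_mapped: int) -> float:
    """FPKM = fragments * 10^9 / (length in bp * total mapped fragments)."""
    if length_bp <= 0 or total_mapped <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (length_bp * total_mapped)


def is_expressed(fragment_count: float, length_bp: int,
                 total_mapped: int, min_fpkm: float = 1.0) -> bool:
    """Apply a hypothetical FPKM cutoff to decide whether a gene is expressed."""
    return fpkm(fragment_count, length_bp, total_mapped) >= min_fpkm


# Hypothetical gene: 500 fragments, 2,000 bp long, 10 million mapped fragments.
value = fpkm(500, 2_000, 10_000_000)
```

With these numbers the gene has an FPKM of 25, so an SNV in it would pass a cutoff of 1.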

Criteria for selecting candidate epitopes
  Gene harboring the SNV must be expressed
  Peptide will be generated inside the cell upon the protein being cleaved by the proteasome
  Peptide will bind to an MHC molecule that will chaperone it to the cell surface
Tools
  NetChop: predicts cleavage sites of the human proteasome http://www.cbs.dtu.dk/services/NetChop/
  SYFPEITHI: predicts MHC I and MHC II binding http://www.syfpeithi.de/
  NETMHC: predicts MHC I binding http://www.cbs.dtu.dk/services/NetMHC/
  NetCTL: combined cleavage and MHC binding predictions http://www.cbs.dtu.dk/services/NetCTL/
>example
KYMDQLHRYTKLSYlVVFPLELRLFNTSG
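Before binding prediction, each mutation has to be turned into the set of short peptides that contain it. The sketch below enumerates those windows for the example sequence; it assumes that the lowercase residue marks the mutated position and that 9-mers are of interest, neither of which is stated on the slide, and the function name is hypothetical.

```python
from typing import List, Tuple


def peptides_spanning(protein: str, mut_index: int,
                      k: int = 9) -> List[Tuple[int, str]]:
    """Return (start, peptide) for every k-mer that covers mut_index (0-based)."""
    first = max(0, mut_index - k + 1)
    last = min(mut_index, len(protein) - k)
    return [(s, protein[s:s + k].upper()) for s in range(first, last + 1)]


# Example sequence from the slide; the lowercase letter is taken to be
# the mutated residue (an assumption made for this illustration).
example = "KYMDQLHRYTKLSYlVVFPLELRLFNTSG"
mut_index = next(i for i, c in enumerate(example) if c.islower())
candidates = peptides_spanning(example, mut_index)
```

Each resulting peptide could then be submitted to a binding predictor such as those listed above; a mutation far from either end of the protein yields k candidate k-mers.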

Candidate neo-epitope statistics for two mouse cell lines (Duan et al., JEM 2014)

Epi-Seq pipeline for neo-epitope prediction on local Galaxy server

Rational vaccine design requires info on the clonal structure of the tumor
  Not all cells harbor all candidate epitopes
Approaches to clonality analysis
  Computational inference from sequencing depth
    SNV allelic fractions only
  Targeted amplicon sequencing of selected mutations at single cell level
    More noisy data, potentially biased by capture protocols

Cell capture & pre-amp

PCR on Access Array


Captured cells in pilot run

     1      2      3      4      5      6      7      8      9      10     11     12
A    1_C03  1_C02  1_C01  1_C49  1_C50  1_C51  1_C06  3_C05  1_C04  1_C52  1_C53  1_C54
B    2_C09  1_C08  1_C07  1_C55  1_C56  1_C57  1_C12  1_C11  1_C10  1_C58  1_C59  1_C60
C    1_C15  2_C14  2_C13  1_C61  1_C62  1_C63  1_C18  2_C17  1_C16  1_C64  1_C65  1_C66
D    1_C21  2_C20  1_C19  1_C67  1_C68  1_C69  2_C24  2_C23  4_C22  1_C70  1_C71  1_C72
E    bulk   2_C26  1_C27  1_C75  1_C74  1_C73  0_C28  0_C29  0_C30  1_C78  1_C77  2_C76
F    1_C31  0_C32  1_C33  1_C81  1_C80  1_C79  0_C34  1_C35  0_C36  1_C84  0_C83  1_C82
G    0_C37  1_C38  0_C39  1_C87  1_C86  1_C85  1_C40  1_C41  1_C42  1_C90  1_C89  1_C88
H    1_C43  1_C44  1_C45  1_C93  1_C92  1_C91  1_C46  1_C47  1_C48  1_C96  1_C95  1_C94

Analysis Pipeline
  Barcode list + pooled fastq file -> Fastx Barcode Splitter -> 96 fastq files: one per well
  List of SNV locations + mm9 BALBc genome -> Generate Reference -> fasta with +/- 300 bases around each SNV
  96 fastq files + reference fasta -> tmap -> 96 sam files: one per well
  96 sam files -> compute coverage -> 96x48 table with total and fwd/rev variant coverage for each well/SNV
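The reference-generation step of this pipeline, which extracts +/- 300 bases around each SNV, can be sketched as follows. This is an illustration and not the authors' script: the function name, the in-memory genome dictionary, the record naming scheme, and the 1-based coordinate convention are assumptions.

```python
from typing import Dict, Iterable, Tuple


def snv_windows(genome: Dict[str, str],
                snvs: Iterable[Tuple[str, int]],
                flank: int = 300) -> Dict[str, str]:
    """Cut a window of +/- `flank` bases around each SNV.

    `genome` maps chromosome name to sequence; SNV positions are 1-based.
    Windows are clipped at chromosome ends.
    """
    records = {}
    for chrom, pos in snvs:
        seq = genome[chrom]
        start = max(0, pos - 1 - flank)    # 0-based, inclusive
        end = min(len(seq), pos + flank)   # 0-based, exclusive
        records[f"{chrom}:{pos}"] = seq[start:end]
    return records


def to_fasta(records: Dict[str, str]) -> str:
    """Format the windows as FASTA text, one record per SNV."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


# Hypothetical 1,000 bp chromosome with two SNV positions.
toy_genome = {"chr1": "ACGT" * 250}
windows = snv_windows(toy_genome, [("chr1", 500), ("chr1", 10)])
fasta_text = to_fasta(windows)
```

Aligning reads against such a small targeted reference, rather than the whole genome, keeps the per-well mapping step fast when only a few dozen amplicons are expected.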

Target Aligned Reads

Per SNV Coverage

SNV Support Matrix
[Figure: cells by SNVs, with a scale from low to high support]

Which epitopes go into the vaccine?
  Diversify across HLA alleles
  Select mutations with balanced allele-specific expression
  Maximize (expected) clone coverage
  Include epitopes with high MHC binding affinity
  Include epitopes with high Differential Agretopic Index (DAI): the difference in MHC affinity between the mutant epitope and its wild-type counterpart
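Two of the criteria above, DAI and clone coverage, lend themselves to a small worked sketch. The selection procedure shown is a generic greedy maximum-coverage heuristic swapped in for illustration, not the authors' vaccine design method. It assumes binding scores where higher means stronger binding, and the epitope names, cell sets, and scores are made up.

```python
from typing import Dict, List, Set


def dai(mutant_score: float, wildtype_score: float) -> float:
    """Differential Agretopic Index: mutant score minus wild-type score."""
    return mutant_score - wildtype_score


def greedy_select(epitope_cells: Dict[str, Set[str]],
                  dai_scores: Dict[str, float], k: int) -> List[str]:
    """Pick up to k epitopes, each time taking the one that covers the
    most not-yet-covered cells; ties are broken by higher DAI."""
    chosen: List[str] = []
    covered: Set[str] = set()
    candidates = set(epitope_cells)
    while candidates and len(chosen) < k:
        best = max(sorted(candidates),
                   key=lambda e: (len(epitope_cells[e] - covered),
                                  dai_scores.get(e, 0.0)))
        chosen.append(best)
        covered |= epitope_cells[best]
        candidates.remove(best)
    return chosen


# Hypothetical data: which cells carry each epitope's mutation.
cells = {"E1": {"c1", "c2", "c3"}, "E2": {"c3", "c4"},
         "E3": {"c1", "c2"}, "E4": {"c5"}}
scores = {"E1": dai(3.0, 1.0), "E2": dai(6.0, 1.0),
          "E3": dai(9.5, 0.5), "E4": dai(2.0, 1.0)}
selection = greedy_select(cells, scores, k=3)
```

In this toy case E3 has the highest DAI but is never selected, because every cell it covers is already covered by E1; that is the trade-off between affinity-based and coverage-based criteria.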

Duan et al., JEM 2014

Expitope (Haase et al., Bioinformatics 2014)
  Checks for cross-reactivity based on ENCODE RNA-Seq from normal tissues
  http://webclu.bio.wzw.tum.de/expitope/
OptiTope (Toussaint and Kohlbacher, Nucleic Acids Research 2009)
  Optimizes allele coverage
  http://etk.informatik.uni-tuebingen.de/optitope

Toussaint and Kohlbacher, Nucleic Acids Research 2009

T Cell Receptor sequencing
Compare the TCR repertoire before and after immunization to determine response against used epitope(s)
Primary analysis of TCR sequencing data
  IMSEQ (Kuchenbecker et al., Bioinformatics 2015) http://www.imtools.org/
  HTJoinSolver (Russ et al., BMC Bioinformatics 2015) https://dcb.cit.nih.gov/HTJoinSolver/

tcR (Nazarov et al., BMC Bioinformatics, 2015)
R package for downstream analysis, including diversity measures and identification of shared T cell receptor sequences
http://imminfo.github.io/tcr/
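Comparing repertoires before and after immunization, and computing a diversity measure, can be sketched in a few lines. This is not the tcR package (which is written in R) and not the output format of IMSEQ or HTJoinSolver: clonotypes are represented here simply as CDR3 strings, and the `min_fold` threshold and pseudocount are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def frequencies(clonotypes: List[str]) -> Dict[str, float]:
    """Relative frequency of each clonotype in a sample."""
    counts = Counter(clonotypes)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def shannon_entropy(freqs: Dict[str, float]) -> float:
    """Shannon diversity (natural log) of a clonotype frequency table."""
    return -sum(p * math.log(p) for p in freqs.values() if p > 0)


def expanded_clonotypes(before: List[str], after: List[str],
                        min_fold: float = 1.5,
                        pseudo: float = 1e-6) -> Dict[str, float]:
    """Clonotypes whose frequency rose by at least min_fold after immunization.

    A small pseudocount avoids division by zero for new clonotypes.
    """
    fb, fa = frequencies(before), frequencies(after)
    folds = {c: (p + pseudo) / (fb.get(c, 0.0) + pseudo) for c, p in fa.items()}
    return {c: f for c, f in folds.items() if f >= min_fold}


# Hypothetical CDR3 sequences observed before and after immunization.
pre = ["CASSA"] * 5 + ["CASSB"] * 5
post = ["CASSA"] * 8 + ["CASSB"] * 2
expanded = expanded_clonotypes(pre, post)
h_pre = shannon_entropy(frequencies(pre))
h_post = shannon_entropy(frequencies(post))
```

A drop in diversity together with a few strongly expanded clonotypes is the kind of pattern one would look for as evidence of a response, though real data would need normalization for sequencing depth.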

THANK YOU!
QC
  FASTX: http://hannonlab.cshl.edu/fastx_toolkit/
  PRINSEQ: http://prinseq.sourceforge.net/
Epitope prediction
  NetChop: http://www.cbs.dtu.dk/services/NetChop/
  SYFPEITHI: http://www.syfpeithi.de/
  NETMHC: http://www.cbs.dtu.dk/services/NetMHC/
  NetCTL: http://www.cbs.dtu.dk/services/NetCTL/
Tumor-specific epitope prediction pipeline
  Epi-Seq: http://dna.engr.uconn.edu/?page_id=470
  Also available on our Galaxy server: http://mhc1.engr.uconn.edu:8080/
Vaccine design
  Expitope: http://webclu.bio.wzw.tum.de/expitope/
  OptiTope: http://etk.informatik.uni-tuebingen.de/optitope
TCR sequencing analysis
  tcR: http://imminfo.github.io/tcr/