Presentation transcript:

Data

We obtained breast cancer tissues from the Breast Cancer Biospecimen Repository of Fred Hutchinson Cancer Research Center. We performed two rounds of next-generation sequencing: (1) primary whole exome sequencing of the normal subsection, seven primary subsections, and one metastatic subsection, which identified 281 mutation candidates; and (2) deep sequencing of the normal subsection, eight primary subsections, and two metastatic subsections, targeted at those candidates, which validated 17 mutations. Deep targeted sequencing (median coverage of 1100 reads per locus) provided reliable counts of the normal and alternate alleles as input to our algorithm. The approximate anatomic location of the samples and the frequency of the 17 alternate alleles for each of 12 tumor samples are shown in the plot on the right.

Performance on simulated data

To validate our implementation of the EM optimization procedure and to understand our model's behavior, we produced simulated deep sequencing data and measured the extent to which the model recovers the true clonal structure of the data. We observed two primary trends: the overall error rate, measured as either genotype error or clone frequency error, decreases systematically as the number of samples increases, and increases as the number of clones increases. Overall, both error rates are low, especially for the case of 3 clones.

Abstract

The ability of cancer to evolve within an individual patient is the most significant reason that cancer treatments fail. Because today's oncologists lack the means to predict a cancer's next move, the disease can escape current treatment strategies. Capturing the subclonal heterogeneity of tumors can improve cancer treatment significantly by providing insight into the structure of the tumor and the history of a patient's cancer.
If the mutations specific to each subclone are known, clinicians can design the most efficient treatment to block the tumor's escape mechanisms by attacking all subclones simultaneously. We obtained next-generation sequencing data from several tumor samples taken from a single patient at a single time point, and developed a novel method that analyzes these data to accurately estimate the frequency and mutational content of subclones. We model the counts of the alternate allele with a binomial distribution and infer the parameters that maximize the likelihood of the observed data. The outputs of our algorithm are the genotypes of all clones and their frequencies in each sample. Our results can be used to infer a phylogenetic tree that describes the evolution of the cancer in time, and they also provide a map of clonal heterogeneity that describes how the tumor evolved anatomically.

Clonal structure can be inferred from multiple sections of a breast cancer by a novel binomial model

Habil Zare 1, Junfeng Wang 2, Alex Hu 1, Daniela Witten 3, Anthony Blau 2, and William S. Noble 1

1 Department of Genome Sciences, University of Washington, Seattle, WA, USA
2 Division of Hematology, Department of Medicine, University of Washington, Seattle, WA, USA
3 Department of Biostatistics, University of Washington, Seattle, WA, USA

Methodology

A collection of subsections is subjected to next-generation sequencing to measure counts of two alleles at each locus: the normal allele, which was observed in a matched normal sample, and a tumor allele. The resulting count matrices are provided as input to an inference procedure that estimates the clonal genotypes and frequencies. We developed a novel method that uses the EM algorithm to maximize the likelihood of the observed data, assuming that the alternate allele counts follow a binomial distribution.
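The binomial likelihood that such an inference procedure maximizes can be illustrated with a minimal sketch. The poster does not give the exact parameterization, so everything below is an assumption made for illustration: the function names are hypothetical, genotypes are binary (mutation present or absent in a clone), every mutation is taken to be heterozygous in a diploid region, and clone 0 stands for normal cells. The sketch only simulates counts and evaluates the objective; it does not implement the EM optimization itself.

```python
import math
import random


def expected_alt_fraction(genotypes, freqs, locus, sample):
    """Expected alternate-allele fraction at one locus in one sample.

    ASSUMPTION (not stated on the poster): mutations are heterozygous and
    diploid, so a clone carrying the mutation contributes half of its
    alleles as alternate. genotypes[locus][clone] is 0 or 1, and
    freqs[sample][clone] is the fraction of cells belonging to that clone.
    """
    return 0.5 * sum(g * f for g, f in zip(genotypes[locus], freqs[sample]))


def binom_logpmf(k, n, p):
    """Log of the binomial probability of k alternate reads out of n."""
    p = min(max(p, 1e-9), 1.0 - 1e-9)  # clip so that log() stays finite
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1.0 - p)


def log_likelihood(alt, depths, genotypes, freqs):
    """Sum of binomial log-probabilities over all loci and samples."""
    total = 0.0
    for i, row in enumerate(alt):
        for j, k in enumerate(row):
            p = expected_alt_fraction(genotypes, freqs, i, j)
            total += binom_logpmf(k, depths[i][j], p)
    return total


def simulate_alt_counts(genotypes, freqs, depth, rng):
    """Draw alternate-allele counts by summing `depth` Bernoulli trials."""
    counts = []
    for i in range(len(genotypes)):
        row = []
        for j in range(len(freqs)):
            p = expected_alt_fraction(genotypes, freqs, i, j)
            row.append(sum(rng.random() < p for _ in range(depth)))
        counts.append(row)
    return counts


# Hypothetical toy example: 3 loci, 3 clones (clone 0 = normal), 2 samples.
rng = random.Random(0)
genotypes = [[0, 1, 1], [0, 0, 1], [0, 1, 0]]
true_freqs = [[0.1, 0.6, 0.3], [0.5, 0.1, 0.4]]
depth = 1100  # matches the median coverage reported for deep sequencing

alt = simulate_alt_counts(genotypes, true_freqs, depth, rng)
depths = [[depth] * len(true_freqs) for _ in genotypes]

# Compare the generating frequencies with a deliberately wrong guess.
wrong_freqs = [row[::-1] for row in true_freqs]
ll_true = log_likelihood(alt, depths, genotypes, true_freqs)
ll_wrong = log_likelihood(alt, depths, genotypes, wrong_freqs)
print("log-likelihood at generating frequencies:", ll_true)
print("log-likelihood at reversed frequencies:  ", ll_wrong)
```

An optimizer such as EM would search over the genotype and frequency matrices for the values that make this log-likelihood as large as possible. At high read depth, the frequencies used to generate the data should score well above a mismatched guess, which is the property a simulation study relies on.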
Results

Clone frequencies vary smoothly across the tumor

Each panel plots, for a different section, the pattern of inferred clone frequencies across subsections. Subsections in white were not subjected to sequencing. The primary clone frequencies vary monotonically as we traverse the sample from left to right.

Evolution of a breast cancer

The phylogenetic tree above shows the clonal phylogeny inferred from our deep sequencing data, assuming there are 4 clones. Nodes correspond to observed or inferred clonal populations, and edges are annotated with the mutations that occur between the parent and child clones. Two mutations are grouped into a colored box if they both occur on the same branch in all four phylogenies inferred under the assumption of 3, 4, 5, or 6 clones (data not shown). The tree provides valuable insight into the development of each clone by ordering in time the mutations that lead to that clone.

Conclusion

We developed a method to infer the clonal structure of a single cancer from multiple samples of the same tumor. We provide three types of evidence that the inferred structure is accurate: (1) analysis of simulated data, (2) analysis of the inferred clone frequencies relative to tumor anatomy, and (3) consistency of the clonal genotypes with a phylogenetic tree. The inferred clonal architecture of a tumor may help in understanding its etiology and in designing appropriate treatments.
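One standard way to test whether a set of binary clonal genotypes is consistent with a phylogenetic tree is the pairwise compatibility check sketched below. The poster does not say how consistency was assessed, so this is an illustrative assumption and not the authors' procedure. It relies on the infinite-sites assumption (each mutation arises once and is never lost), under which the sets of clones carrying any two mutations must be either nested or disjoint. The function names and example matrices are hypothetical.

```python
def clones_carrying(genotypes, locus):
    """Set of clone indices whose genotype has the mutation at `locus`."""
    return {k for k, g in enumerate(genotypes[locus]) if g}


def conflicting_pairs(genotypes):
    """Return pairs of loci that cannot both sit on one perfect phylogeny.

    ASSUMPTION: infinite sites, with an unmutated root. Two mutations
    conflict when their carrier sets overlap without one containing the other.
    genotypes[locus][clone] is 0 or 1.
    """
    carriers = [clones_carrying(genotypes, i) for i in range(len(genotypes))]
    conflicts = []
    for a in range(len(carriers)):
        for b in range(a + 1, len(carriers)):
            s, t = carriers[a], carriers[b]
            overlap = bool(s & t)
            nested = s <= t or t <= s
            if overlap and not nested:
                conflicts.append((a, b))
    return conflicts


# Hypothetical genotypes for 4 clones (clone 0 = normal): mutations accumulate
# along a single lineage, so the carrier sets are nested.
tree_like = [
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]

# Hypothetical conflicting case: the carrier sets {1, 2} and {2, 3} overlap
# in clone 2, but neither contains the other.
not_tree_like = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]

print(conflicting_pairs(tree_like))
print(conflicting_pairs(not_tree_like))
```

An empty result means the genotypes admit a tree in which every mutation labels exactly one edge. Any reported pair points to mutations whose placement would need a repeated or lost mutation, or a revised genotype.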