Download presentation
Presentation is loading. Please wait.
Published byShawn Young Modified over 9 years ago
1
A comparative study of survival models for breast cancer prognostication based on microarray data: a single gene beat them all? B. Haibe-Kains, C. Desmedt, C. Sotiriou and G. Bontempi BIOINFORMATICS Vol. 24 no. 19 2008, pages 2200-2208
2
outline microarray Introduction –Motivation –Results –Purpose Risk prediction methods
3
Microarray http://www.bio.davidson.edu/courses/geno mics/chip/chip.html
4
Motivation Survival prediction of breast cancer (BC) patients, independently of treatment, also known as prognostication, is a complex task. –Clinically similar breast tumors and molecularly heterogeneous –Several clinical and pathological indicators such as histological grade, tumor size and lymph node –Although BC prognostication has been the object of intense research, a still open challenge is how to detect patient who needs adjuvant systemic therapy.
5
Motivation In recently years, the analysis of gene expression profiles by means of sophisticated data mining tools emerged as a promising technology to bring additional insight into BC biology and to improve the quality of prognostication. –array-based technology –The sequencing of the human genome
6
Motivation Several research teams conducted comprehensive genome-wide assessments of gene expression profiling and identified prognostic gene expression signatures. With respect to clinical guidelines, these signatures were shown to correctly identify a larger group of low-risk patient not requiring treatment.
7
Motivation In fact, clinicians encounter problems when confronted with patient with intermediate-grade tumors (Grade 2). These tumors, which represent 30-60% of cases, are a major source of inter- observer discrepancy and may display intermediate phenotype and survival, making treatment decisions for the patients a great challenge, with subsequent under- or over-treatment.
8
Results Complex prediction methods are highly exposed to performance degradation despite the use of cross-validation techniques for the following reasons: –The large number of variables –The reduced amount of samples –The high degree of noise Our analysis shows that the most methods are not significantly better than the simplest one, a univariate model relying on a single proliferation gene.
9
Results This result suggests the proliferation might be the most relevant biological process for BC prognostication. The loss of interpretability deriving from the use of overcomplex methods may be not sufficiently counterbalabed by an improvement of the quality of prediction.
10
purpose Set up a common and independent assessment framework to compare the performance fo existing gene signatures and several state-of- the-art risk prediction method Compare the prediction accuracy of these methods in several BC microarray prognostication tasks to elucidate the key characteristics of a successful risk prediction method and to bring additional insights into BC biology.
11
Risk prediction methods In this work, we compare the performance of 13 risk prediction methods on more than 1000 patients. All the risk prediction models considered in this study return a risk score denoted by R f
12
The first risk prediction method is also the simplest one and defines the risk score as the expression of a single proliferation gene (AURKA)
13
–Genotype: It can be the expression of –Single proliferation gene (AURKA) –Biologically driven selection of genes of interest (BD) –The whole genome (GW) –Dimension reduction strategy: A simple univariate ranking (RANK) of the k most relevant features A selection of the first k principal components (PCA)
14
–Structure of the model: Multivariate (MULTIV) model a linear combination of univariate modes(COMBUNIV) –Learning algorithm: The linear combination of gene expressions weighted by the significance computed from the Wilconxon rank sum test (WILCOSON) The multivariate linear regression model (LM) The linear combination of gene expressions weighted by the significance computed from the univariate Cox’s proportional hazards model (COX)
15
Performance assessment In order to assess the performance of the risk prediction methods, we used five accuracy measures: –Time-dependent ROC Curve
16
Discussion and conclusions The loss of interpretability deriving from the use of overcomplex methods in survival analysis of BC microarray data might be not sufficiently counterbalanced by an improvement in the quality of prediction.
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.