Validation of four gene-expression risk scores in a large colon cancer cohort and contribution to an improved prognostic method

Antonio F. Di Narzo 1, Sabine Tejpar 3, Simona Rossi 1, Pu Yan 5, Vlad Popovici 1, Pratyaksha Wirapati 1, Eva Budinska 1, Tao Xie 6, Heather Estrella 6, Adam Pavlicek 6, Mao Mao 6, Eric Martin 6, Weinrich Scott 6, Graeme Hodgson 6, Eric Van Cutsem 3, Fred Bosman 5, Arnaud Roth 4,7, Mauro Delorenzi 1,2

1) Swiss Institute of Bioinformatics, Lausanne, Switzerland
2) Département de formation et recherche, Centre Hospitalier Universitaire Vaudois and University of Lausanne, Lausanne, Switzerland
3) Digestive Oncology Unit and Center for Human Genetics, University Hospital Gasthuisberg, Leuven, Belgium
4) Oncosurgery, Geneva University Hospital, Geneva, Switzerland
5) Department of Pathology, Lausanne University, Lausanne, Switzerland
6) Pfizer Inc., Worldwide Research and Development, Oncology Research Unit, Science Center Drive, La Jolla, CA 92121
7) Swiss Group for Clinical Cancer Research (SAKK)
Background
– Prognosis prediction for resected primary colon cancer is currently based on the tumor, nodes, metastasis (TNM) staging system.
– Different laboratories have studied gene expression profiles and proposed distinct risk scoring systems.
– Each scoring system has been internally validated. But how do they compare? Are they equivalent?
– Four well-documented scoring systems were selected and tested on the PETACC-3 series.
Aim
Assess the performance of the four scoring systems for:
– Overall Survival (OS)
– Relapse-Free Survival (RFS)
– Survival After Relapse (SAR)
Check the agreement among them.
Is there room for improvement in biomarker development?
Patients: the PETACC-3 trial
N = 688 samples with gene expression microarray data (Van Cutsem et al., 2009)
The selected scoring systems

variable | Scoring System
abbreviation | GHS | VDS | MDA | ALM
provider | Genomic Health | Veridex | MD Anderson Cancer Center | ALMAC diagnostics
type of assay | Q-RT-PCR; microarray
type of tissue | fresh frozen; formalin-fixed, paraffin-embedded
reference | O’Connell et al. 2010 | Jiang et al. 2008 | Oh et al. 2011 | Kennedy et al. 2011
There is little overlap in the gene lists
Results: Overall Survival

All HRs are relative to one interquartile range (IQR) variation of the risk score.
The multivariate Cox model includes: Age, Gender, T-stage, N-stage, Grade, Location, Treatment Arm, Lymphovascular Invasion, Microsatellite Instability.

marker | univariate HR (95% CI) | univariate p-value | multivariate HR (95% CI) | multivariate p-value
GHS | 1.36 ( ) | | ( ) |
VDS | 1.24 ( ) | | ( ) |
MDA | 1.31 ( ) | | ( ) |
ALM | 1.38 ( ) | < | ( ) |
Combined Score | 1.87 ( ) | < | ( ) | <0.001

The Combined Score is obtained as a geometric average of the rankings of the four scoring systems.
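The Combined Score described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `average_ranks` and `combined_score` are hypothetical, the example values are invented rather than taken from PETACC-3, ties are assumed to share their mean rank, and every score is assumed to be oriented so that a higher value means higher risk.

```python
import math


def average_ranks(values):
    """Rank values from 1 (lowest) to n (highest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # 1-based mean position of the tie group
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def combined_score(score_table):
    """Per-patient geometric mean of the ranks assigned by each scoring system."""
    rank_table = {name: average_ranks(vals) for name, vals in score_table.items()}
    n_systems = len(rank_table)
    n_patients = len(next(iter(rank_table.values())))
    return [
        math.exp(sum(math.log(r[i]) for r in rank_table.values()) / n_systems)
        for i in range(n_patients)
    ]


# Hypothetical scores for five patients (illustrative values only).
scores = {
    "GHS": [0.2, 1.5, 0.9, 2.1, 0.4],
    "VDS": [1.1, 0.3, 0.8, 0.5, 1.9],
    "MDA": [0.1, 1.2, 0.7, 1.8, 0.6],
    "ALM": [0.3, 0.9, 1.0, 1.6, 0.2],
}
combined = combined_score(scores)
```

Working on ranks rather than raw scores puts the four systems on a common scale before averaging, which is presumably why rankings are used here.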
Results: Relapse-Free Survival

All HRs are relative to one interquartile range (IQR) variation of the risk score.
The multivariate Cox model includes: Age, Gender, T-stage, N-stage, Grade, Location, Treatment Arm, Lymphovascular Invasion, Microsatellite Instability.

marker | univariate HR (95% CI) | univariate p-value | multivariate HR (95% CI) | multivariate p-value
GHS | 1.33 ( ) | < | ( ) |
VDS | 1.29 ( ) | | ( ) |
MDA | 1.10 ( ) | | ( ) |
ALM | 1.31 ( ) | < | ( ) |
Combined Score | 1.68 ( ) | < | ( ) | <0.001

The Combined Score is obtained as a geometric average of the rankings of the four scoring systems.
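Reporting hazard ratios per interquartile range makes scores with different units comparable. In a Cox model with a linear score term, the HR for a change of size d in the score is exp(beta * d), so an HR per IQR follows from the per-unit coefficient. The sketch below illustrates that conversion. The poster does not say how the scaling was implemented, so the helper name `hazard_ratio_per_iqr`, the coefficient, and the score values are all hypothetical.

```python
import math
import statistics


def hazard_ratio_per_iqr(beta_per_unit, scores):
    """Convert a Cox log-hazard coefficient per score unit into an HR per IQR.

    Returns the rescaled hazard ratio together with the IQR that was used.
    """
    q1, _, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
    iqr = q3 - q1
    return math.exp(beta_per_unit * iqr), iqr


# Hypothetical risk scores and per-unit coefficient (illustrative values only).
scores = [0.2, 0.4, 0.9, 1.5, 2.1, 2.6, 3.0, 3.3]
hr, iqr = hazard_ratio_per_iqr(0.15, scores)
```

Dividing each score by its IQR before fitting the model would give the same hazard ratio directly.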
Results: Survival After Relapse

All HRs are relative to one interquartile range (IQR) variation of the risk score.
The multivariate Cox model includes: Age, Gender, T-stage, N-stage, Grade, Location, Treatment Arm, Lymphovascular Invasion, Microsatellite Instability.

marker | univariate HR (95% CI) | univariate p-value | multivariate HR (95% CI) | multivariate p-value
GHS | 1.16 ( ) | | ( ) | 0.199
VDS | 0.90 ( ) | | ( ) | 0.170
MDA | 1.81 ( ) | < | ( ) | <0.001
ALM | 1.19 ( ) | | ( ) | 0.395
Combined Score | 1.49 ( ) | | ( ) |

The Combined Score is obtained as a geometric average of the rankings of the four scoring systems.
There is weak agreement in the predictions of the four risk scoring systems

Upper triangle: Spearman correlation. Lower triangle: agreement.

 | GHS | VDS | MDA | ALM
GHS | | Spearman correlation: | Spearman correlation: | Spearman correlation:
VDS | agreement: 37.5% | | Spearman correlation: | Spearman correlation:
MDA | agreement: 70.3% | agreement: 33.1% | | Spearman correlation:
ALM | agreement: 57.8% | agreement: 49.1% | agreement: 54.1% |

The percentage of patients with the same predicted outcome according to two distinct scoring systems is rather small.
The VDS scoring system is anti-correlated with GHS and MDA, and almost uncorrelated with ALM.
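The two pairwise measures reported above, Spearman correlation and agreement, can be computed as in the sketch below. All function names are hypothetical. The poster does not state how each score was turned into a predicted outcome, so the median split used in `agreement` is an assumption made purely for illustration.

```python
import math
import statistics


def average_ranks(values):
    """Rank values from 1 (lowest) to n (highest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def agreement(x, y):
    """Fraction of patients put in the same risk group by both systems.

    Assumption: each score is split into high/low risk at its own median.
    """
    median_x, median_y = statistics.median(x), statistics.median(y)
    same = sum((u > median_x) == (v > median_y) for u, v in zip(x, y))
    return same / len(x)


# Hypothetical scores for six patients under two systems (illustrative values only).
system_a = [0.2, 1.5, 0.9, 2.1, 0.4, 1.1]
system_b = [1.9, 0.3, 1.0, 0.5, 1.4, 0.8]
rho = spearman(system_a, system_b)
share = agreement(system_a, system_b)
```

A negative `rho` combined with a `share` well below 50% is the pattern described for VDS against GHS and MDA.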
Conclusions
– These four scoring systems are based on different gene populations with little overlap.
– These four scoring systems have, in our hands, a confirmed prognostic value for OS but concur poorly on a per-patient basis.
– Prognostic value varies considerably with the endpoint (OS, RFS, SAR).
– A combined score based on these four scoring systems appears to improve prognosis prediction compared with each system separately.