ONLINE BIOMARKER VALIDATION OF SURVIVAL-ASSOCIATED BIOMARKERS IN BREAST AND OVARIAN CANCER USING MICROARRAY DATA OF 4,323 PATIENTS

Balázs Győrffy 1, András Lánczky 2, Zoltán Szállási 3,4

1 Research Laboratory of Pediatrics and Nephrology, Hungarian Academy of Sciences, Budapest, Hungary; 2 Pázmány Péter University, Budapest, Hungary; 3 Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark; 4 Children's Hospital Informatics Program, Harvard Medical School, Boston, USA.

AIMS
The pre-clinical validation of prognostic gene candidates in large independent patient cohorts is a prerequisite for the development of robust biomarkers. In the present study we expanded our online Kaplan-Meier plotter tool to assess the effect of genes on ovarian cancer prognosis.

CONCLUSIONS
We extended our global biomarker validation platform to assess the prognostic power of 22,277 genes in 2,977 breast and 1,346 ovarian cancer patients. Online access at:

METHODS
Gene expression data and survival information of breast and ovarian cancer patients were downloaded from GEO and TCGA. To analyze the prognostic value of a selected gene in the various cohorts, the patients are divided into two groups according to a quantile of the gene's expression. Patients can be filtered by stage, grade, and histology subtype. A follow-up threshold can be set to exclude long-term effects. A Kaplan-Meier survival plot is generated and significance is computed in the R statistical environment using Bioconductor packages. Several probe sets can be combined, with the mean of their expression serving as a multigene predictor of survival.

RESULTS
Altogether, 1,346 ovarian cancer patients and 2,977 breast cancer patients were entered into the database. Patient groups can be compared using relapse-free survival or overall survival. We used this integrative data analysis tool to validate the prognostic power of 37 biomarkers identified in the literature.
Of these, CA125 (p=3.7e-5, HR=1.4), CDKN1B (p=5.4e-5, HR=1.4), KLK6 (p=0.002, HR=0.79), IFNG (p=0.004, HR=0.81), P16 (p=0.02, HR=0.66) and BIRC5 (p= , HR=0.75) were associated with survival.

Figure 1. The online query pages. Panels: TOP2A in breast cancer; CA125 in ovarian cancer; distribution of CA125.

Figure 2. Overview of the system. Raw data (GEO, TCGA; n=5,032) pass through (1) quality control, (2) normalization and (3) combination of platforms; the remaining samples (n=4,323) are stored with clinical data in a PostgreSQL database. After filtering for gene expression, results are computed in real time in R and returned as graphical feedback (KM-plot, hazard ratio and p-value).

Table 1. The association between prognostic markers and survival. The markers were analyzed in subsets of patients with clinical characteristics equivalent to those of the cohorts in which the association has previously been described.

Symbol    Surv.  Analyzed in                     Affymetrix ID, HR, p (per probe set)
CA125     PFS    All patients                    220196_at n.s.; _s_at *; _s_at 1.4 3.7e-05*
KRT19     PFS    Debulk = subopt                 _at n.s.
KLK6      PFS    All patients                    216699_s_at *; _at n.s.
KLK10     PFS    Stage =                         _s_at n.s.
IL6       OS     All patients                    205207_at n.s.
FAS       PFS    All patients                    204780_s_at; _s_at n.s.; _s_at; _x_at n.s.; _x_at n.s.
VEGFR     OS     All patients                    203934_at
CCND1     OS     Stage =                         _s_at n.s.; _at n.s.
CCND3     OS     All patients                    _at n.s.
CCNE      OS     Debulk = subopt                 _at n.s.; _at n.s.
P15       PFS    All patients                    204599_s_at n.s.; _x_at *; _s_at; _at n.s.
P16       PFS    Debulk = subopt                 _at *; _x_at n.s.
CDKN1A    PFS    Histology = serous              202284_s_at n.s.
CDKN1B    PFS    All patients                    209112_at 1.4 5.4e-05*
RB1       OS     Stage =                         _at n.s.
E2F1      PFS    All patients                    2028_s_at
E2F4      PFS    All patients                    38707_r_at n.s.
TP53      PFS    Stage =                         _s_at n.s.; _at
BAX       PFS    Therapy = contains Taxol        _s_at n.s.; _s_at n.s.
BCL2L1    PFS    All patients                    212312_at; _s_at n.s.
BIRC2     OS     Stage =                         _at n.s.
BIRC5     PFS    All patients                    210334_x_at *; _at; _s_at
EGFR      PFS    Stage =                         _s_at n.s.; _s_at n.s.; _at n.s.
ERBB2     PFS    Histology = serous              216836_s_at n.s.
MET       OS     Stage =                         _at n.s.; _at n.s.; _x_at n.s.; _x_at n.s.
MMP2      PFS    Histology = endom               _at
MMP9      OS     Stage =                         _s_at n.s.
MMP14     OS     Stage =                         _at n.s.; _s_at n.s.; _s_at n.s.
HE4       PFS    All patients                    203892_at n.s.
SERPINB5  PFS    Debulk = subopt                 _at n.s.
BRCA1     OS     All patients                    204531_s_at n.s.
ERCC1     PFS    Stage = 3, Therapy = Tax+Plat   _at n.s.; _s_at n.s.

GRANT SUPPORT: OTKA PD 83154; TAMOP B-09/1/KMR; The PREDICT consortium (EU grant no )
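
The analysis steps described in METHODS (averaging probe sets into a multigene score, splitting patients into two groups at an expression quantile, estimating Kaplan-Meier curves and testing the difference between groups) can be illustrated with a minimal sketch. This is not the platform's code: the tool itself runs in R with Bioconductor packages, whereas the example below is plain Python. All function names, the toy data, the median default cutoff and the use of a two-group log-rank test are assumptions made for illustration, and the hazard ratio reported by the tool is not computed here.

```python
import math
from statistics import mean


def multigene_scores(expr_by_probe):
    """Per-patient mean expression across probe sets.

    expr_by_probe maps a probe set ID to a list of values, one per patient,
    with the same patient order in every list (hypothetical input layout).
    """
    return [mean(values) for values in zip(*expr_by_probe.values())]


def split_by_quantile(scores, q=0.5):
    """Flag each patient as high (True) or low (False) expression.

    The cutoff is the value at position int(q * n) in the sorted scores,
    a simple rule assumed here; q=0.5 approximates a median split.
    """
    ordered = sorted(scores)
    cutoff = ordered[min(int(q * len(ordered)), len(ordered) - 1)]
    return [s >= cutoff for s in scores]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event time, survival probability).

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    curve, surv = [], 1.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1.0 - failed / at_risk
        curve.append((t, surv))
    return curve


def logrank_p(times, events, high):
    """Two-group log-rank test p-value (chi-square, 1 degree of freedom)."""
    observed = expected = variance = 0.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, h in zip(times, high) if u >= t and h)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d1 = sum(1 for u, e, h in zip(times, events, high)
                 if u == t and e and h)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0  # groups cannot be distinguished with these data
    chi2 = (observed - expected) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Toy data, invented for illustration only (not from the poster).
    expression = {
        "probe_A": [5.1, 7.9, 6.0, 8.4, 4.8, 9.0],
        "probe_B": [4.9, 8.1, 6.2, 8.8, 5.0, 9.4],
    }
    followup_months = [60, 14, 48, 10, 72, 8]
    relapse = [0, 1, 1, 1, 0, 1]

    scores = multigene_scores(expression)
    is_high = split_by_quantile(scores, q=0.5)
    high_idx = [i for i, h in enumerate(is_high) if h]
    print("KM curve, high expression:",
          kaplan_meier([followup_months[i] for i in high_idx],
                       [relapse[i] for i in high_idx]))
    print("log-rank p:", logrank_p(followup_months, relapse, is_high))
```

Passing a dictionary with a single probe set to `multigene_scores` reduces the sketch to the single-gene case, and changing `q` moves the cutoff away from the median, which mirrors the quantile-based grouping described in the text.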