Triple negative (TN) vs. positive (TP) breast cancer (BC) analysis with public genomic data yields AGR3 as a biomarker

Anita Umesh, Jenny Park, James Shima, Joseph Delaney, Robert Wisotzkey, Erin Kelly, Elizabeth Beatrice Chiu, Mamatha Shekar, Ilya Kupershmidt
NextBio, 475 El Camino Real, Suite 100, Santa Clara, CA

Objectives
To identify differences in genomic signatures between triple negative breast cancer (TNBC) and triple positive breast cancer (TPBC) patients in the TCGA breast cancer dataset.
To identify novel candidate biomarkers outside the category of hormone receptors.

Abstract
Background: TNBCs have an aggressive clinical phenotype with a worse prognosis than TPBCs. Here we sought to identify other genomic changes in patients with TNBC vs. TPBC by mining TCGA breast cancer genomic data using the NextBio Clinical platform.
Methods: Breast cancer patients with decreased (n=61) and increased (n=33) ER/PR/Her2 mRNA levels were selected with the molecular filter in NextBio (NB) Clinical. Top-ranked differentially expressed genes were identified and then analyzed via NB Research, which comprises a library of individually curated, publicly available research- and clinical-grade genomic data.
Results: Whole transcriptome analysis of TNBC and TPBC patients identified anterior gradient-3 (AGR3), a gene encoding a protein disulfide isomerase, as the most downregulated gene (TNBC vs. TPBC, fold change = , p=1.3E-27). AGR3 mRNA was reduced in 95% of TNBC patients and increased in 88% of TPBC patients. AGR3 was hypomethylated in 100% of TPBC vs. 52% of TNBC patients, suggesting regulation by methylation. When the entire breast cancer cohort was stratified by AGR3 expression, patients younger than 50 yr made up a larger share of the underexpressing group than of the overexpressing group. Further, 68% of AGR3-underexpressing patients had severe TP53 mutations vs. 13% of AGR3-overexpressing patients.
Breast cancer was the disease most correlated with AGR3 in the NextBio Disease Atlas, with supporting data from 124 studies. AGR3 expression was decreased in ER- vs. ER+, PR- vs. PR+, Her2- vs. Her2+, and G3 vs. G1 tumors, and the AGR3 promoter was hypermethylated in ER- vs. ER+ tumors, consistent with the TCGA results. The Body Atlas showed the strongest AGR3 expression in mucosa, fallopian tube, breast, and epithelial cells, in line with its proposed role in epithelial barrier function. Knockdown Atlas results showed TP53 mutation associated with reduced AGR3 mRNA, and the Pharmaco Atlas showed positive regulation of AGR3 by estrogen analogues.
Conclusions: Using the NextBio Clinical and Research platforms, we identified AGR3 as a marker of TNBC vs. TPBC that correlates with TP53 mutation status. AGR3 may provide an alternative route for TNBC treatment. Its potential role in regulating epithelial barrier function may provide insights into the mechanism of disease progression.

AGR3 RNA is downregulated in TNBC patients and upregulated in TPBC patients
AGR3 promoter is differentially methylated between TNBC and TPBC breast cancer patients

Materials & Methods
NextBio Clinical
This specialized platform allows exploration of relationships between genomic data, clinical observations, and clinical outcomes of individual patients. Molecular changes in each sample are calculated and organized so that clinically relevant changes can be easily displayed for a given patient.
Selection of patient cohort: The NextBio Clinical platform was used to mine TCGA breast cancer patients and to select TNBC and TPBC patients based on RNA expression levels of ESR1, PGR, and HER2. The Group Comparison feature in NextBio Clinical was used to select patients with the following criteria: -2.0-fold downregulation of ESR1, PGR, and HER2 for the TNBC cohort, and 1.6-fold upregulation of ESR1, PGR, and HER2 for the TPBC cohort.
This resulted in the selection of 61 patients in the former group and 33 patients in the latter group with data for RNA expression.
Biomarker analysis: The Biomarker Analysis feature in NextBio Clinical was used to determine candidate genes that were differentially regulated in TNBC versus TPBC patients. After a specific candidate (AGR3) was identified as differentially regulated by both significance (p-value) and magnitude of RNA expression fold change, Biomarker Analysis was used to stratify the TCGA breast cancer patient population by downregulation of AGR3 RNA expression (>= -2.0 fold) versus upregulation (>= 2.0 fold). This identified close to 200 patients in each arm of the cohort.

NextBio Research
Over 10,000 genomic studies containing nearly 78,000 signatures from 13 species and multiple data types have been curated and included in the NextBio library. Individual studies were obtained from public data sources, analyzed, and imported into NextBio. All data points are organized in a framework of accredited biomedical ontologies such as SNOMED CT and MeSH. Scores and ranks are recomputed each time new datasets are added.
Disease atlas: This application identifies rank-ordered diseases from studies containing statistically significant measurements for the queried gene. Disease-centric studies are manually tagged according to their experimental design and biomedical attributes.
Knockdown atlas: This application identifies all studies in which gene perturbation experiments were performed and yielded statistically significant measurements for the queried gene. These studies are grouped by perturbed gene and then scored to yield a ranked list of genetic perturbations.
Pharmaco atlas: Compound treatment-related experiments are manually tagged according to their experimental design and treatment conditions.
The results are presented as a rank-ordered list of compounds that regulate the queried gene in a statistically significant manner.

Stratification of TCGA breast cancer patients by AGR3 RNA downregulated (Group A) vs. upregulated (Group B) shows increased TP53 mutation frequency in AGR3 downregulated patients

Summary of results
1. In a cohort of TCGA breast cancer patients stratified by hormone receptor RNA expression levels, TNBC patients had statistically significant, high-magnitude downregulation of AGR3 RNA levels compared to TPBC patients.
2. TNBC patients had AGR3 promoter hypermethylation compared to TPBC patients, providing a potential mechanism for the regulation at the RNA level.
3. A higher proportion of younger patients than of older patients had downregulated AGR3 RNA.
4. Patients selected for AGR3 downregulation had a higher TP53 mutation frequency and ER-/PR- protein status.
5. Among all diseases, the AGR3 gene was most highly correlated with breast cancer, with notable downregulation in (1) ER and/or PR negative versus positive tumors, and (2) grade 3 versus grade 1 tumors.
6. AGR3 levels are significantly upregulated in breast cancer cells treated with 4-hydroxytamoxifen, estradiol, or tibolone, suggesting that AGR3 expression is regulated by the estrogen signaling pathway.

Conclusion
Using a genomic patient stratification approach, we identify AGR3 as a biomarker that is downregulated in TNBC versus TPBC.
Given that AGR3 protein expression has recently been associated with longer median survival in serous ovarian carcinomas (King et al 2011 Am J Surg Pathol.), we propose AGR3 RNA expression as a candidate biomarker indicative of a more favorable disease outcome in breast cancer.

AGR3 RNA expression (log2 [patient signal/disease median])
TNBC patients (n = 61): mean fold change = ; mean fold change in log2 scale =
TPBC patients (n = 33): mean fold change = 1.64; mean fold change in log2 scale = 0.71
Fold change difference = ; fold change difference in log2 scale = ; p = 1.3E-27

AGR3 promoter % differential methylation
TNBC patients (n = 21): mean =
TPBC patients (n = 17): mean =
Methylation change difference = 24.4; p = 1.1E-6

Stratification of TCGA breast cancer patients by downregulated (Group A) versus upregulated (Group B) ESR1/PGR/HER2 RNA expression in NextBio Clinical returns differentially regulated genes, of which PGR and ESR1 are the most significantly regulated by RNA expression.

Group A = AGR3 RNA expression downregulated
Group B = AGR3 RNA expression upregulated
Mutation frequency difference = 54.91%, p = 1.8E-21

NextBio Clinical: Discovery and exploration of novel biomarkers in individual TCGA breast cancer patients
NextBio Research: Validation and further characterization of AGR3 biomarker using a library of curated public genomic data

Disease Atlas returns breast cancer as the most significantly correlated disease with AGR3 expression
Disease Atlas for AGR3
Querying AGR3 across all studies in NextBio Research that compare AGR3 expression in a given disease to normal tissue expression returns breast cancer as the disease most significantly correlated with AGR3 expression. AGR3 is downregulated in breast cancer compared to normal tissue. This finding is supported by 124 studies of different data types, including methylation (ME), RNA expression (RE), copy number variation (CN), and miRNA expression (MI).
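The AGR3 RNA expression figure reports fold changes on both a linear and a log2 scale (for TPBC patients, 1.64 and 0.71). A minimal Python sketch of that conversion follows. It assumes the signed fold-change convention implied by the selection criteria, where -2.0 denotes a 2-fold downregulation; the function names are illustrative and are not part of the NextBio platform.

```python
import math


def signed_fc_to_log2(fold_change: float) -> float:
    """Convert a signed fold change to a log2 ratio.

    Assumed convention: values >= 1 mean upregulation and values <= -1
    mean downregulation (e.g. -2.0 is half the reference level).
    """
    if fold_change >= 1.0:
        return math.log2(fold_change)
    if fold_change <= -1.0:
        return -math.log2(-fold_change)
    raise ValueError("signed fold change must be >= 1 or <= -1")


def log2_to_signed_fc(log2_ratio: float) -> float:
    """Inverse conversion: log2 ratio back to a signed fold change."""
    ratio = 2.0 ** abs(log2_ratio)
    return ratio if log2_ratio >= 0 else -ratio


# TPBC mean fold change reported in the figure
print(round(signed_fc_to_log2(1.64), 2))  # 0.71
```

On the log2 scale, up- and downregulation of equal magnitude are symmetric around zero, which is why group differences are often easier to read there than on the signed linear scale.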
Breast cancer studies that correlate most significantly with AGR3 expression show a correlation with signatures for ER/PR status (A) and tumor grade (B)
A. Breast cancer studies that correlate significantly with AGR3 were predominantly those in which ER and/or PR negative status was compared to receptor positive status. This figure shows a subset of such studies, all of which show negative fold change in AGR3 expression in receptor negative vs. positive samples.
B. AGR3 expression correlated significantly with many signatures that compared grade 3 tumors to grade 1 tumors. This figure illustrates 14 such signatures from 14 independent studies in which AGR3 was underexpressed in grade 3 tumors compared to grade 1 tumors.
[Figure, panel A: negative fold change of AGR3 expression by individual study number, ER and/or PR negative vs. positive. Panel B: negative fold change of AGR3 expression by individual study number, grade 3 vs. grade 1 tumors.]

Knockdown Atlas returns 16 independent studies showing correlation of TP53 status with AGR3 downregulation
Knockdown Atlas for AGR3
Querying AGR3 across all studies in NextBio Research in which gene perturbations are introduced returns a list of genes that affect AGR3 expression. Most notable among these is the perturbation of the TP53 gene. Sixteen independent RNA expression (RE), copy number variation (CN), and miRNA (MI) studies show that TP53 mutations lead to downregulated AGR3.

Pharmaco Atlas returns 4-hydroxytamoxifen, estradiol, and tibolone as correlating with upregulation of AGR3
Querying AGR3 across all studies in NextBio Research in which compound treatments are performed returns a list of compounds that affect AGR3 expression. Most notable among these is the upregulation of AGR3 by 4-hydroxytamoxifen, estradiol, and tibolone, all of which bind steroid hormone receptors. Three, 17, and 1 independent studies, respectively, support this result.
Biomarker Analysis for RNA Expression
[Figure: group sizes shown as n=8, n=118, n=203, n=83, n=61]
Stratification of TCGA breast cancer patients by AGR3 RNA downregulated vs. upregulated shows a higher proportion of younger patients with AGR3 downregulated.

Biomarker Analysis for Somatic Mutations
Stratification of TCGA breast cancer patients by AGR3 RNA downregulated vs. upregulated identifies patients with ER-, PR- and ER+, PR+ protein expression status, respectively, as indicated by TCGA-provided information. Her2 protein expression status does not correlate with AGR3 RNA expression.
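The cohort selection described under NextBio Clinical (all three of ESR1, PGR, and HER2 at or beyond a fold-change cutoff) can be sketched in Python as follows. This is a minimal illustration, not the NextBio Group Comparison feature itself. The function name, the dictionary layout, the patient IDs, and the example values are hypothetical, and treating the -2.0 and 1.6 cutoffs as inclusive is an assumption.

```python
from typing import Dict, List, Tuple

# Gene labels as used on the poster
RECEPTOR_GENES = ("ESR1", "PGR", "HER2")


def select_cohorts(
    profiles: Dict[str, Dict[str, float]],
    down_cutoff: float = -2.0,
    up_cutoff: float = 1.6,
) -> Tuple[List[str], List[str]]:
    """Split patients into TNBC-like and TPBC-like groups.

    `profiles` maps a patient ID to signed fold changes per gene.
    A patient joins the TNBC group only if all three receptor genes
    are at or below `down_cutoff`, and the TPBC group only if all
    three are at or above `up_cutoff` (inclusive cutoffs are assumed).
    """
    tnbc, tpbc = [], []
    for patient_id, fold_changes in profiles.items():
        values = [fold_changes[gene] for gene in RECEPTOR_GENES]
        if all(v <= down_cutoff for v in values):
            tnbc.append(patient_id)
        elif all(v >= up_cutoff for v in values):
            tpbc.append(patient_id)
    return tnbc, tpbc


# Hypothetical example data, not taken from TCGA
example_profiles = {
    "P1": {"ESR1": -5.2, "PGR": -3.1, "HER2": -2.0},  # all down
    "P2": {"ESR1": 2.4, "PGR": 1.9, "HER2": 1.6},     # all up
    "P3": {"ESR1": -4.0, "PGR": 1.2, "HER2": -2.5},   # mixed, excluded
}
tnbc_ids, tpbc_ids = select_cohorts(example_profiles)
print(tnbc_ids, tpbc_ids)
```

Patients with mixed receptor profiles fall into neither group, which is consistent with the small cohort sizes reported (61 and 33 patients).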