Impact of Formal Methods in Biology and Medicine Final Review Debashis Sahoo Department of Computer Science CSE291 – H00 – Lecture 18
Outline Introduction History of Bioinformatics Introduction to computing Data collection Experiment design Data analysis
Biological Systems Points to cover: a more formal treatment; a basic introduction to Boolean logic; description with examples; how these are applied; markers vs. cell types; normal vs. cancer.
Tissue
Boolean Analysis George Boole (1815-1864) Two values: High/Low, 1/0 Boolean operations: AND, OR, NOT Implication: x → y Simple Boolean logic can simplify incredibly complex continuous data and help infer biological meaning.
Boolean Implication Pair of genes: ACPP, GABRB1 (45,000 Affymetrix microarrays). Four quadrants. Sparse quadrants. Boolean relationships: If ACPP high, then GABRB1 low. If GABRB1 high, then ACPP low. Statistical tests identify the sparse quadrants.
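To make the four quadrants concrete, here is a minimal sketch that counts how many arrays fall in each one for a pair of genes. The function name `quadrant_counts`, the `margin` parameter, and the convention that `a01` means "A low, B high" are assumptions made for illustration; the thresholds are taken as given.

```python
def quadrant_counts(a_vals, b_vals, thr_a, thr_b, margin=0.5):
    """Count arrays per quadrant for genes A and B.

    Keys are a<i><j> where i is the state of A and j the state of B
    (0 = low, 1 = high). Arrays where either gene lies within `margin`
    of its threshold are treated as intermediate and skipped
    (the margin value is an assumption of this sketch).
    """
    counts = {"a00": 0, "a01": 0, "a10": 0, "a11": 0}
    for a, b in zip(a_vals, b_vals):
        if abs(a - thr_a) <= margin or abs(b - thr_b) <= margin:
            continue  # intermediate value: not assigned to any quadrant
        key = "a{}{}".format(int(a > thr_a), int(b > thr_b))
        counts[key] += 1
    return counts
```

With this convention, "if A high, then B low" corresponds to a sparse `a11` quadrant (few arrays with both genes high).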
Threshold Calculation A threshold is computed for each gene: sort the expression values, then fit a step function with StepMiner [Sahoo et al. 07]. Figure: CDH expression across sorted arrays, with High, Intermediate, and Low regions around the threshold.
StepMiner Statistics
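The threshold step can be sketched as a one-step least-squares fit over the sorted values. This is a simplified illustration, not the published StepMiner code: placing the threshold at the midpoint of the two fitted means, the 0.5 intermediate margin, and the names `step_threshold` and `classify` are all assumptions of this sketch.

```python
def step_threshold(values):
    """Fit a single step to the sorted values; return a threshold.

    Tries every split point and keeps the one with the smallest sum of
    squared errors around the two segment means. The threshold is taken
    as the midpoint of those means (an assumption of this sketch).
    """
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two values to fit a step")
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    best = None
    left_sum = 0.0
    left_sq = 0.0
    for k in range(1, n):  # k = number of values on the low side
        left_sum += xs[k - 1]
        left_sq += xs[k - 1] ** 2
        right_sum = total - left_sum
        right_sq = total_sq - left_sq
        sse = (left_sq - left_sum ** 2 / k) + (right_sq - right_sum ** 2 / (n - k))
        if best is None or sse < best[0]:
            best = (sse, left_sum / k, right_sum / (n - k))
    _, low_mean, high_mean = best
    return (low_mean + high_mean) / 2


def classify(value, threshold, margin=0.5):
    """Label a value as 'high', 'low', or 'intermediate' around the threshold."""
    if value > threshold + margin:
        return "high"
    if value < threshold - margin:
        return "low"
    return "intermediate"
```

For example, `step_threshold([1, 1, 1, 9, 9, 9])` places the step between the 1s and the 9s.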
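BooleanNet Statistics
Here $a_{ij}$ is the number of arrays with gene A in state $i$ and gene B in state $j$ (0 = low, 1 = high); the test below is for a sparse low-low quadrant.
$n_{A\mathrm{low}} = a_{00} + a_{01}, \quad n_{B\mathrm{low}} = a_{00} + a_{10}$
$\mathrm{total} = a_{00} + a_{01} + a_{10} + a_{11}, \quad \mathrm{observed} = a_{00}$
$\mathrm{expected} = \frac{n_{A\mathrm{low}}}{\mathrm{total}} \cdot \frac{n_{B\mathrm{low}}}{\mathrm{total}} \cdot \mathrm{total}$
$\mathrm{statistic} = \frac{\mathrm{expected} - \mathrm{observed}}{\sqrt{\mathrm{expected}}}$
$\mathrm{error\ rate} = \frac{1}{2}\left(\frac{a_{00}}{a_{00} + a_{01}} + \frac{a_{00}}{a_{00} + a_{10}}\right)$
Boolean Implication = (statistic > 3, error rate < 0.1) [Sahoo et al. Genome Biology 08]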
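The sparse-quadrant test on this slide can be written out directly. The sketch below follows the slide's definitions for the low-low quadrant; the function names and the handling of empty rows or columns are assumptions added for illustration.

```python
import math


def sparse_quadrant_test(a00, a01, a10, a11):
    """Return (statistic, error_rate) for the low-low quadrant a00."""
    total = a00 + a01 + a10 + a11
    n_a_low = a00 + a01
    n_b_low = a00 + a10
    if total == 0 or n_a_low == 0 or n_b_low == 0:
        # Assumption of this sketch: the test is undefined without low values.
        raise ValueError("quadrant test needs low values for both genes")
    expected = (n_a_low / total) * (n_b_low / total) * total
    observed = a00
    statistic = (expected - observed) / math.sqrt(expected)
    error_rate = 0.5 * (a00 / n_a_low + a00 / n_b_low)
    return statistic, error_rate


def is_sparse(a00, a01, a10, a11, min_stat=3.0, max_error=0.1):
    """Apply the slide's cutoffs: statistic > 3 and error rate < 0.1."""
    statistic, error_rate = sparse_quadrant_test(a00, a01, a10, a11)
    return statistic > min_stat and error_rate < max_error
```

To test a different quadrant, pass the counts in permuted order so that the quadrant of interest takes the `a00` position.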
Modeling Colon Tissue
Normal Colon Tissue http://www.siumed.edu/~dking2/erg/GI125b.htm
Simple Patterns in Cancer Dataset Gene CA1, Gene KRT20 Dalerba*, Kalisky* and Sahoo* et al. Nat Biotechnol. 2011 Nov 13;29(12):1120-7.
Search for the Stem Cell Genes Figure: expression of KRT20, ALCAM, and candidate genes X and Y along colon cell differentiation. Criteria: 1. KRT20 high => X high 2. Y high => ALCAM high 3. KRT20 high => Y low 4. X low => ALCAM high
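One way to read the four criteria is as a filter over a precomputed set of Boolean implications: X candidates satisfy criteria 1 and 4, and Y candidates satisfy criteria 2 and 3. The sketch below assumes that reading. The tuple representation of an implication and the function name `find_candidates` are hypothetical.

```python
def find_candidates(implications, genes):
    """Return (x_genes, y_genes) matching the slide's four criteria.

    `implications` is a set of tuples (gene_a, state_a, gene_b, state_b),
    each meaning "gene_a state_a => gene_b state_b". This representation
    is an assumption of the sketch.
    """
    x_genes = [
        g for g in genes
        if ("KRT20", "high", g, "high") in implications   # criterion 1
        and (g, "low", "ALCAM", "high") in implications   # criterion 4
    ]
    y_genes = [
        g for g in genes
        if (g, "high", "ALCAM", "high") in implications   # criterion 2
        and ("KRT20", "high", g, "low") in implications   # criterion 3
    ]
    return x_genes, y_genes
```

Gene names passed in `genes` are looked up as given, so the caller decides which probes or symbols to screen.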
Search for High Risk Colon Cancer Patients List of genes fulfilling the pattern X low => ALCAM high: GPX2, CDX2, EPS8L3, GPR35, LAD1, DTX4, CDX1, USH1C, VIL1, PPP1R14D, MUC3B, PLEKHG6, IHH, ACOT11, NHEJ1. Dalerba P et al. N Engl J Med 2016;374:211-222.
CDX2 mRNA Expression and Disease-free Survival Discovery Dataset (JSTO, n=466) Figure 2. Relationship between CDX2 Expression and Disease-free Survival in the NCBI-GEO Discovery Data Set. Analysis of CDX2 messenger RNA (mRNA) expression in the NCBI-GEO discovery data set revealed the presence of a minority subgroup of CDX2-negative colon cancers that were characterized by high ALCAM mRNA expression levels (Panel A) and that were associated with a lower rate of 5-year disease-free survival than CDX2-positive colon cancers (Panel B). In Panel A, each circle in the scatter plot represents one patient sample. The association between CDX2-negative cancers and a lower rate of disease-free survival remained significant in a multivariate analysis that excluded tumor stage, tumor grade, age, and sex as confounding variables (Panel C). Dalerba P et al. N Engl J Med 2016;374:211-222.
CDX2 Protein Expression and Disease-free Survival Validation Dataset (NCI CDP, n=466) Dalerba P et al. N Engl J Med 2016;374:211-222.
CDX2 Expression and Benefit from Adjuvant Chemotherapy. Stage II Colon Cancer (Pooled dataset, n=669) Figure 5. Relationship between CDX2 Expression and Benefit from Adjuvant Chemotherapy. The relationship between CDX2 expression and benefit from adjuvant chemotherapy was evaluated in a pooled database of 669 patients with stage II disease (Panel A) and 1228 patients with stage III disease (Panel B) from four independent data sets (NCBI-GEO, NCI-CDP, NSABP C-07, and Stanford TMAD). Among all patients with stage II disease in the entire database, treatment with adjuvant chemotherapy was not associated with a higher rate of 5-year disease-free survival. However, treatment with adjuvant chemotherapy was strongly associated with a higher rate of 5-year disease-free survival in the CDX2-negative subgroup, but it was not associated with a higher rate of 5-year disease-free survival in the CDX2-positive subgroup. Among patients with stage III disease, treatment with adjuvant chemotherapy was associated with a higher rate of 5-year disease-free survival in the entire database and in both the CDX2-negative and CDX2-positive subgroups. A test for an interaction between the biomarker and the treatment indicated that in both stage II and stage III disease, the benefit associated with adjuvant chemotherapy was greater among CDX2-negative patients than among CDX2-positive patients. Dalerba P et al. N Engl J Med 2016;374:211-222.