Raphael Sandaltzopoulos, PhD, MBA. Professor at MBG (Molecular Biology), Lab. of Gene Expression, Molecular Diagnosis and Modern Therapeutics, MBG, DUTH. BIOMARKERS DISCOVERY: An introduction
What is a Biomarker? A biomarker is a measurable characteristic which may be used as an indicator of a certain biological state or condition. Measurable: ease of sampling, reliability, reproducibility and accuracy of the detection method. Indicator: robust correlation with the biological condition.
A good Biomarker shows a robust correlation with a certain biological condition. Pearson correlation: How tightly does the value of one parameter correlate with the value of another parameter? Difference of average: Is the value of a parameter significantly different between two groups which are defined by the value of a second parameter?
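The two questions above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis used in the studies that follow: the function names are made up here, the numbers are invented, and only the test statistics are computed (a p-value additionally needs the t distribution, which the standard library does not provide).

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means."""
    na, nb = len(group_a), len(group_b)
    ma, mb = sum(group_a) / na, sum(group_b) / nb
    va = sum((v - ma) ** 2 for v in group_a) / (na - 1)  # sample variance
    vb = sum((v - mb) ** 2 for v in group_b) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)

# Hypothetical values, invented for illustration only.
marker = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
infiltration = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3]
r = pearson_r(marker, infiltration)

positive_group = [4.1, 5.0, 4.6, 5.3]
negative_group = [2.9, 3.2, 3.6, 3.1]
t = welch_t(positive_group, negative_group)

print(f"Pearson r = {r:.3f}, Welch t = {t:.2f}")
```

Pearson correlation treats both parameters as continuous, whereas the difference of average first splits the samples into two groups and then compares one parameter between them.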
Expression of T-lymphocyte markers (mRNA) is a reliable indicator of CD8+ T-lymphocyte infiltration in ovarian tumors. [Figure panels: CD3 staining; CD8 staining.] Example of difference of average (one- or two-tailed t-test).
TTF1 is overexpressed in TIL+ Epithelial Ovarian Cancer. ttf1 mRNA was quantified by qPCR in 45 ovarian tumors (classified as TIL+ or TIL-). mRNA levels were higher in TIL+ compared to TIL- samples (p=0.039).
TTF1 expression correlates with T-cell infiltration. qPCR analysis of the relative expression of ttf1 and cd8a mRNA in 96 ovarian cancer samples verified a significant correlation (Pearson correlation = 0.329, p<0.01). Example of Pearson correlation analysis.
TMEM132D is over-expressed in TIL+ Epithelial Ovarian Cancer. TMEM132D: single-pass type I transmembrane protein. May serve as a cell-surface marker for oligodendrocyte differentiation. Implicated in anxiety disorders. Example of Pearson correlation analysis. 95% Confidence Interval: if we were to repeat the sampling 100 times and calculate a population of 100 r values, the population mean would probably lie between […] and […]; 95% of the time, the confidence intervals contain the true mean.
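One common way to obtain a 95% confidence interval for a Pearson r is Fisher's z-transformation. The slide does not say how its interval was computed, so the sketch below is an assumption about method. It reuses r = 0.329 and n = 96 from the TTF1/cd8a example, because the TMEM132D values are not given in the text; `pearson_ci` is an illustrative name.

```python
import math

def pearson_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for Pearson r via Fisher's z-transform.

    z_crit = 1.96 corresponds to a two-sided 95% interval.
    """
    z = math.atanh(r)              # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)    # approximate standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Values taken from the TTF1/cd8a correlation slide.
low, high = pearson_ci(0.329, 96)
print(f"95% CI for r: {low:.3f} to {high:.3f}")
```

An interval that excludes zero is consistent with the correlation being reported as significant.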
Common pitfalls of correlation analysis. Correlation and causality: a. Correlation does not necessarily imply causality.
Common pitfalls of correlation analysis. Correlation and causality: b. Lack of correlation, even between normally distributed variables, does not necessarily imply independence. Example: X and Y are uncorrelated; both have the same normal distribution; yet X and Y are not independent.
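A standard construction shows how this can happen: draw X from a normal distribution and set Y = W·X, where W is a random sign (+1 or -1) independent of X. The simulation below is a sketch of that textbook example, not necessarily the data behind the slide; the sample size and seed are arbitrary.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)  # fixed seed so the run is repeatable
x = [random.gauss(0.0, 1.0) for _ in range(20000)]
# Y has the same normal distribution as X, but its sign is flipped at random.
y = [v * random.choice((-1, 1)) for v in x]

r_xy = pearson_r(x, y)
# |Y| equals |X| exactly, so the two variables are strongly dependent.
r_abs = pearson_r([abs(v) for v in x], [abs(v) for v in y])

print(f"r(X, Y) = {r_xy:.3f}, r(|X|, |Y|) = {r_abs:.3f}")
```

The correlation between X and Y is close to zero, yet knowing X determines the magnitude of Y completely.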
Common pitfalls of correlation analysis. Correlation and linearity: c. Correlation does not necessarily imply linearity. Four sets of data with the same correlation of 0.816.
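Four data sets sharing a correlation of 0.816 match the well-known Anscombe's quartet. Assuming that is what the slide shows, the sketch below recomputes r for each set. Only the first set is roughly linear with scatter; the second is curved, the third is a straight line with one outlier, and the fourth has a single point driving the whole correlation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Anscombe's quartet (sets I to III share the same x values).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

correlations = [
    pearson_r(x123, y1),
    pearson_r(x123, y2),
    pearson_r(x123, y3),
    pearson_r(x4, y4),
]
for name, r in zip(("I", "II", "III", "IV"), correlations):
    print(f"set {name}: r = {r:.3f}")
```

Plotting the data before interpreting a correlation coefficient is the practical safeguard against this pitfall.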
The nature of a Biomarker
Adapted from Kaiser J, Science 330:576.
Technology platforms commonly used in Biomarker Discovery
In silico Biomarkers. Biomarkers discovered by data analysis and mining workflows and by algorithms capable of working with disparate data resources: gene expression, copy number, pathway networks, miRNA and metabolomics data, and next generation sequencing (NGS) data including RNA-seq, ChIP-seq, whole genome, exome and others. Discovery relies on advanced bioinformatics modeling and the development of analytics.
The biomarker evolution pathway in translational medicine
An example of Risk Score Assessment of microRNA Biomarkers in NSCLC. Adapted from Zander et al, 2011, Clin. Cancer Res 11:3360. Risk score = 0.969 × miR[…] […] 0.973 × miR30d − 0.650 × miR1 − 0.815 × miR499
Sensitivity: The fraction of people with the disease that the test correctly identifies as positive. Specificity: The fraction of people without the disease that the test correctly identifies as negative. A “good” Biomarker combines high sensitivity with high specificity.
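The two definitions translate directly into counts of true/false positives and negatives. A minimal sketch with invented test results (the function name and data are illustrative):

```python
def sensitivity_specificity(has_disease, test_positive):
    """Return (sensitivity, specificity) from paired true status and test result."""
    pairs = list(zip(has_disease, test_positive))
    tp = sum(1 for d, t in pairs if d and t)          # diseased, test positive
    fn = sum(1 for d, t in pairs if d and not t)      # diseased, test negative
    tn = sum(1 for d, t in pairs if not d and not t)  # healthy, test negative
    fp = sum(1 for d, t in pairs if not d and t)      # healthy, test positive
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical cohort: 5 diseased and 5 healthy individuals.
has_disease = [True] * 5 + [False] * 5
test_positive = [True, True, True, True, False,
                 False, False, False, True, False]

sens, spec = sensitivity_specificity(has_disease, test_positive)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Sensitivity depends only on the diseased individuals and specificity only on the healthy ones, so neither changes with the prevalence of the disease in the cohort.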
Receiver Operating Characteristic (ROC) curve. A receiver operating characteristic (ROC) curve is a plot of sensitivity versus 1 − specificity. The name derives from its original use in radar technology. The dotted line represents a useless test that has no discriminatory power. The size of the area between the dotted line and the solid line in the ROC curve reflects the ability of a test to discriminate between diseased and non-diseased individuals across the range of potential cut-offs.
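The points of a ROC curve can be generated by sliding the cutoff across every observed test value and recording sensitivity and 1 − specificity each time. The sketch below assumes that higher test values indicate disease; the scores and labels are invented.

```python
def roc_points(scores, diseased):
    """Return (1 - specificity, sensitivity) pairs, from strictest to loosest cutoff."""
    n_pos = sum(1 for d in diseased if d)
    n_neg = len(diseased) - n_pos
    points = [(0.0, 0.0)]  # cutoff above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, diseased) if d and s >= cutoff)
        fp = sum(1 for s, d in zip(scores, diseased) if not d and s >= cutoff)
        points.append((fp / n_neg, tp / n_pos))
    return points

# Hypothetical test values and true disease status.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
diseased = [True, True, False, True, True, False, False, False]

curve = roc_points(scores, diseased)
for fpr, tpr in curve:
    print(f"1 - specificity = {fpr:.2f}, sensitivity = {tpr:.2f}")
```

A useless test would produce points scattered along the diagonal, where sensitivity equals 1 − specificity.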
Receiver Operating Characteristic (ROC) curve: choosing the cutoff on the test value. Cutoff value chosen to minimize inclusion of false positives (for example, when treatment is expensive or invasive). Cutoff value chosen at the statistical mean: the number of true positives missed equals the number of people diagnosed as false positives. Cutoff value chosen to minimize missing true positives, thus including more false positives (for example, when the consequences of missing a case are serious, e.g. testing potential blood donors for HIV). Setting the criterion value is a compromise between sensitivity and specificity.
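The compromise can be made concrete by evaluating the same data at three cutoffs. This sketch assumes that higher test values indicate disease; the values and cutoffs are invented for illustration.

```python
def sens_spec_at(cutoff, diseased_values, healthy_values):
    """Sensitivity and specificity when values >= cutoff are called positive."""
    sens = sum(1 for v in diseased_values if v >= cutoff) / len(diseased_values)
    spec = sum(1 for v in healthy_values if v < cutoff) / len(healthy_values)
    return sens, spec

# Hypothetical test values with overlapping distributions.
diseased_values = [4.2, 5.1, 5.8, 6.3, 7.0, 7.7]
healthy_values = [2.0, 2.9, 3.5, 4.4, 5.0, 5.6]

lenient = sens_spec_at(3.0, diseased_values, healthy_values)   # misses no case
balanced = sens_spec_at(5.0, diseased_values, healthy_values)  # intermediate
strict = sens_spec_at(6.0, diseased_values, healthy_values)    # no false positives

for name, (sens, spec) in (("lenient", lenient), ("balanced", balanced), ("strict", strict)):
    print(f"{name}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Moving the cutoff upward trades sensitivity for specificity; which direction is preferable depends on the cost of a missed case versus the cost of a false alarm.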
Area Under Curve (A.U.C.) as a measure of a Biomarker’s usefulness. A large difference in the distribution of measured values between the diseased and the non-diseased populations makes it easy to choose a cutoff value. A small difference in the distribution of measured values between the diseased and the non-diseased populations makes it very difficult to choose a cutoff value.
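The area under the ROC curve equals the probability that a randomly chosen diseased individual has a higher test value than a randomly chosen non-diseased one (ties counting one half). The sketch below uses that pairwise formulation on two invented data sets, one well separated and one overlapping; it assumes higher values indicate disease.

```python
def auc(diseased_values, healthy_values):
    """Area under the ROC curve via pairwise comparison of the two groups."""
    wins = 0.0
    for d in diseased_values:
        for h in healthy_values:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5  # ties count as half
    return wins / (len(diseased_values) * len(healthy_values))

# Hypothetical values: distributions far apart versus heavily overlapping.
auc_separated = auc([7, 8, 9, 10], [1, 2, 3, 4])
auc_overlapping = auc([4, 5, 6, 7], [3.5, 4.5, 5.5, 6.5])

print(f"well separated: AUC = {auc_separated:.3f}")
print(f"overlapping:    AUC = {auc_overlapping:.3f}")
```

An AUC of 1 means some cutoff separates the groups perfectly, while an AUC near 0.5 corresponds to the useless test on the diagonal.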
Serum concentration of endothelial markers. 81 patients, 46 benign, 27 healthy women. * p<0.05; ** p<0.01; *** p<0.001.
Validation in an independent group of patients. 117 patients, 59 benign. * p<0.05; ** p<0.01; *** p<0.001.
ROC analysis: better Diagnostics by combination of biomarkers
The Kaplan–Meier estimator estimates the survival function from lifetime data. It is used to measure the fraction of patients living for a certain amount of time after treatment. A Kaplan–Meier plot of the survival function is a series of horizontal steps of declining magnitude which, when a large enough sample is taken, approaches the true survival function for the population. Advantage of the Kaplan–Meier curve: it takes into account some types of censored data, particularly right-censoring, which occurs if a patient withdraws from a study, i.e. is lost from the sample before the final outcome is observed. On the plot, small vertical tick-marks indicate losses, where a patient's survival time has been right-censored.
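The estimator multiplies, at each time an event occurs, the fraction of patients at risk who survive that time. Right-censored patients leave the at-risk group without causing a step. A minimal sketch with invented follow-up data (the function name and values are illustrative):

```python
def kaplan_meier(times, observed):
    """Return [(time, survival)] at each time with at least one observed event.

    observed[i] is True if the event (e.g. death) was seen at times[i],
    and False if the patient was right-censored at that time.
    """
    data = sorted(zip(times, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                events += 1
            leaving += 1
            i += 1
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve

# Hypothetical follow-up times (months); False marks a right-censored patient.
times = [2, 3, 3, 5, 8, 9]
observed = [True, True, False, True, False, True]

km_curve = kaplan_meier(times, observed)
for t, s in km_curve:
    print(f"t = {t}: S(t) = {s:.3f}")
```

Patients censored at the same time as an event are counted as still at risk at that time, which is the usual convention.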
Overview of Multi-omic Approaches Applied in Biomarker Discovery
Sectors of Biomarkers applications in medicine
Biomarkers in personalized treatment
Qualities of a good Biomarker in diagnosis and therapy: Discrimination of healthy/diseased specimens (sensitivity and specificity). Easily detectable and quantifiable. Simple (not a combination of many factors). Extracellular (secreted) or cell surface. Presence/expression in a restricted tissue or cell type as a result of a certain condition. Minimal influence by irrelevant factors. Consistent behavior within a population.