3rd Summer School in Computational Biology, September 10, 2014. Frank Emmert-Streib & Salissou Moutari, Computational Biology and Machine Learning Laboratory.

Presentation transcript:

3rd Summer School in Computational Biology, September 10, 2014
Frank Emmert-Streib & Salissou Moutari
Computational Biology and Machine Learning Laboratory, Center for Cancer Research and Cell Biology, Queen’s University Belfast, UK

Exercise – Survival Analysis Homework ~ 1.5 hours

1. Kaplan-Meier Survival Curves

Result: Survival Curve S(t)

Goal: estimate S(t) from data
A survival curve shows S(t) as a function of t.
– S(t): survival function (survivor function)
– t: time
– T: random variable giving the time of the event
S(t) gives the probability that the random variable T is larger than a specified time t, i.e., S(t) = Pr(T > t).
Problem: censoring
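
To make S(t) = Pr(T > t) concrete: if every event time were observed (no censoring), S(t) could be estimated as the fraction of subjects whose event time exceeds t. The following is a minimal Python sketch under that assumption. The function name `empirical_survival` and the event times are hypothetical illustrations, not data from the slides.

```python
def empirical_survival(event_times, t):
    """Estimate S(t) = Pr(T > t) as the fraction of event times larger than t.

    Only valid when every event time is fully observed (no censoring).
    """
    if not event_times:
        raise ValueError("need at least one observation")
    return sum(1 for T in event_times if T > t) / len(event_times)


# Hypothetical, fully observed event times in months (not from the slides).
times = [5, 8, 12, 23, 30]

for t in (0, 8, 25):
    print(f"S({t}) = {empirical_survival(times, t):.2f}")
```

Once some patients are censored, their true event times are unknown and this simple fraction is no longer a valid estimate. That is the problem the Kaplan-Meier estimator addresses.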

Small example: Leukemia
Acute Myelogenous Leukemia (AML), only 5 patients. Table columns: survival time, censoring, chemotherapy (we use this info later).

Small example: Leukemia
Table annotations: event, censoring, number at risk, number of events. ???

Kaplan-Meier estimator for S(t) (Kaplan & Meier 1958)
Estimator: $\hat{S}(t) = \prod_{t_i \le t} \frac{n_i - d_i}{n_i}$
– n_i: number of subjects at risk at time t_i
– d_i: number of events at time t_i
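
The estimator can be sketched in a few lines of plain Python. This is an illustrative sketch, not the R code from the slides (which is not preserved in this transcript). The function name `kaplan_meier` and the five patients below are hypothetical. A subject censored at time t_i is counted as still at risk at t_i, which is the usual convention.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t_i, S(t_i))] for each distinct event time t_i.

    times  -- observed time per subject (event time or censoring time)
    events -- 1 if the event was observed at that time, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    # d_i: number of events at each distinct event time
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    curve = []
    s = 1.0
    for t_i in sorted(deaths):
        # n_i: subjects still under observation just before t_i
        n_i = sum(1 for t in times if t >= t_i)
        d_i = deaths[t_i]
        s *= (n_i - d_i) / n_i
        curve.append((t_i, s))
    return curve


# Hypothetical data for 5 patients (0 marks a censored observation).
times = [9, 13, 13, 18, 23]
events = [1, 1, 0, 1, 1]

for t_i, s in kaplan_meier(times, events):
    print(f"t = {t_i:2d}  S(t) = {s:.3f}")
```

Between event times the estimate stays constant, which gives the step shape of a survival curve. Censored subjects lower n_i at later times without producing a step of their own.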

Check: S(t) up to time t

Censored observations: the recorded time is the last time the patient was seen; the patient was still alive at that time.

Full data set: Leukemia patients

R code

2. Comparing Survival Curves

Reasons for comparing survival curves (SC)
Treatment vs no treatment:
– Compare the SC for patients who have been treated with a certain medication with the SC for patients who have not been treated.
– Result: Does the treatment have an effect on the survival of the patients?

Reasons for comparing survival curves
Chemotherapy vs no chemotherapy:
– Compare the SC for patients who had chemotherapy with the SC for patients who have not had chemotherapy.
– Result: Does the chemotherapy have an effect on the survival of the patients?
Survival analysis has great practical relevance.

Data: Leukemia patients with chemo; 12 patients without (Group 1, Group 2)
Goal: compare the two SCs statistically

R code

Log-rank test (Mantel-Haenszel; Mantel and Haenszel 1959)
Hypothesis:
– Null hypothesis H_0: No difference in survival between group 1 and group 2.
– Alternative hypothesis H_1: Difference in survival between group 1 and group 2.

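Idea of the test
For each time t, estimate the expected number of events e_{it} for group i (i = 1, 2):
$e_{it} = \frac{n_{it}}{n_{1t} + n_{2t}} (m_{1t} + m_{2t})$
– n_{it}: number at risk at time t in group i
– m_{it}: number of events at time t in group i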

The e_{it} are obtained assuming H_0 is true. Hence, m_{it} – e_{it} is a measure of the deviation of the data from H_0.
Summing over all times t gives E_1 and E_2 and the total deviations O_1 – E_1 and O_2 – E_2.

Wrapping up
– Test statistic: s, built from the summed deviations O_i – E_i.
– Sampling distribution: under H_0, s follows (approximately) a chi-square distribution with one degree of freedom.
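
The steps of the test can be sketched in plain Python. This is a hedged illustration, not the slides' R code. It assumes one common form of the statistic, s = (O_1 – E_1)^2 / Var(O_1 – E_1), with the hypergeometric variance; the formula shown on the original slide is not preserved in the transcript and may differ. The function name `logrank_test` and all data are hypothetical.

```python
import math


def logrank_test(times1, events1, times2, events2):
    """Two-group log-rank test; returns (s, p_value).

    events: 1 = event observed, 0 = censored.
    """
    data = [(t, e, 1) for t, e in zip(times1, events1)]
    data += [(t, e, 2) for t, e in zip(times2, events2)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    o_minus_e = 0.0  # running sum of m_1t - e_1t, i.e. O_1 - E_1
    var = 0.0        # running variance of O_1 - E_1 under H_0
    for t in event_times:
        n1 = sum(1 for u, _, g in data if g == 1 and u >= t)  # at risk, group 1
        n2 = sum(1 for u, _, g in data if g == 2 and u >= t)  # at risk, group 2
        m1 = sum(1 for u, e, g in data if g == 1 and u == t and e == 1)
        m2 = sum(1 for u, e, g in data if g == 2 and u == t and e == 1)
        n, m = n1 + n2, m1 + m2
        e1 = n1 * m / n  # expected events in group 1 if H_0 is true
        o_minus_e += m1 - e1
        if n > 1:
            var += n1 * n2 * m * (n - m) / (n * n * (n - 1))
    if var == 0:
        raise ValueError("variance is zero; the test is undefined for these data")
    s = o_minus_e ** 2 / var
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(s / 2))
    return s, p_value


# Hypothetical survival times (0 marks a censored observation).
g1_times, g1_events = [9, 13, 13, 18, 23, 28], [1, 1, 0, 1, 1, 0]
g2_times, g2_events = [5, 5, 8, 8, 12, 16], [1, 1, 1, 1, 1, 0]

s, p = logrank_test(g1_times, g1_events, g2_times, g2_events)
print(f"s = {s:.3f}, p = {p:.4f}")
```

A small p-value is evidence against H_0, i.e., against equal survival in the two groups. Because the chi-square distribution is only an approximation, results for very small groups should be read with caution.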

R code
Back to our leukemia data set:

Survival Analysis & Biomarkers

NIH Definition of Biomarker A characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to therapeutic intervention.

FDA Definition of Biomarker Any measurable diagnostic indicator that is used to assess the risk or presence of disease

What is a biomarker? These definitions are very broad and do not help in finding practical implementations for a particular disease.

Our “definition”
Remark: We do not want to address all possible problems that can involve biomarkers but focus on a particular application.
Application: Identify a set of genes that can be used for a prognostic analysis … and that are good!

Definition of ‘prognosis’
Prognosis is a medical term denoting the prediction of how a patient will progress over time. For instance, a patient with a diagnosed disease can have:
– Long-term survival
– Short-term survival

Our “definition” (continued)
We call this set of genes biomarkers. The biomarkers are used to predict the prognostic outcome of a patient, i.e., to classify survival.

Underlying idea to identify biomarkers
The identification of biomarkers is a composite approach (or a procedure) that is based on a couple of other methods. In the previous example:
1. Survival analysis
2. Differential expression of genes
3. Classification

Underlying idea to identify biomarkers (continued)
With clustering added as a first step, the procedure becomes:
1. Clustering
2. Survival analysis
3. Differential expression of genes
4. Classification

Our “definition” (continued)
– Structured patient groups vs unstructured patient groups
– Statistics: this is a feature selection problem
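
As a rough illustration of the feature selection view, the sketch below ranks genes by how differently they are expressed in two patient groups (for example short vs long survival) and keeps the top k as candidate biomarkers. The choice of Welch's t-statistic as the score is an assumption made for this example; the slides do not say which method is used. The names `welch_t` and `select_genes`, the gene names, and the expression values are all hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples (each needs at least 2 values)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0  # no within-group variation: treated as uninformative here
    return (mean(a) - mean(b)) / se


def select_genes(expression, groups, k):
    """Return the k genes with the largest |t| between group 0 and group 1.

    expression -- dict: gene name -> list of expression values, one per patient
    groups     -- list of 0/1 group labels, one per patient
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, g in zip(values, groups) if g == 0]
        b = [v for v, g in zip(values, groups) if g == 1]
        scores[gene] = abs(welch_t(a, b))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical data: 6 patients, 3 per survival group, 3 genes.
groups = [0, 0, 0, 1, 1, 1]
expression = {
    "geneA": [1.0, 1.2, 0.9, 3.1, 2.9, 3.0],
    "geneB": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "geneC": [0.5, 1.5, 1.0, 1.2, 0.4, 1.1],
}

print(select_genes(expression, groups, k=2))
```

In a real analysis thousands of genes are scored at once, so a multiple-testing correction and validation on independent patients would be needed before calling any gene a biomarker.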

Underlying idea to identify biomarkers (continued)
The definition of the procedure is part of the experimental design of the whole experiment. Yes, the experimental design includes the analysis of the data!

Summary & Outlook to Genome and Network Medicine Almost there!

Schedule 17 lectures

Interdisciplinary summer school

Vision of the VC
Universities require interdisciplinary engagement in the educational and research effort.
Professor Patrick Johnston, President and Vice-Chancellor (VC) of Queen’s University

A look 5 years ahead

1. Single cell experiments
Experimental measurements of
– DNA
– Gene expression (mRNA)
– Protein binding
within single cells (NGS).
What do the other high-throughput data provide information for? Populations of cells.

1. Single cell experiments (continued)
Application: study the heterogeneity of cancer tumors.

1. Single cell experiments
PacBio (Pacific Biosciences) SMRT: single-molecule real-time sequencing

2. Personalized Medicine
The idea behind personalized medicine is to customize healthcare using molecular analysis, with medical decisions, practices, etc. tailored to the needs of the individual patient.
From “one drug for all” to customized treatment.

2. Personalized Medicine 2012

What does this all mean?

It means first of all more data!

Survey Please participate in the survey about the summer school in order to help us to improve. We will send it early next week.

Thank you to everyone for participating! We hope you enjoyed the summer school.