Statistics for the Terrified. Talk 4: Analysis of Clinical Trial Data. 30th September 2010. Janet Dunn, Louise Hiller.


Data types What type of data do you have? Categorical: 2 levels; more than 2 levels (ordered or non-ordered). Continuous: normally distributed; non-normally distributed. Time to event.


2-level categorical (binary) data: Frequency Table (rows: Variable 2, columns: Variable 1)

N (%)          1        2        Row total
1              a (%)    b (%)    a+b
2              c (%)    d (%)    c+d
Column total   a+c      b+d      n

2-level categorical (binary) data - Test of association Null hypothesis: The 2 factors are independent Chi-squared test, with continuity correction χ²=11.4 p= ⇒ Treatment and gender are NOT independent

N (%)          Treatment 1   Treatment 2   Row total
Male           55 (58%)      32 (33%)      87
Female         40 (42%)      66 (67%)      106
Column total   95            98            193
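As a rough check on the statistic quoted on this slide, the continuity-corrected chi-squared test can be sketched in a few lines of Python. The function names are illustrative and not part of the talk. The 1-degree-of-freedom p-value uses the standard identity with the complementary error function.

```python
import math


def chi2_continuity(a, b, c, d):
    """Chi-squared statistic with continuity correction for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shrink |ad - bc| by n/2 (the continuity correction), never below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi2):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Counts from the slide: rows are Male/Female, columns are Treatment 1/2.
chi2 = chi2_continuity(55, 32, 40, 66)
print(f"chi-squared = {chi2:.1f}, p = {p_value_1df(chi2):.4f}")
```

With the slide's counts the statistic comes out at about 11.4, matching the value shown.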

2-level categorical (binary) data - Test of association Null hypothesis: The 2 factors are independent Fisher’s exact test (commonly used with small numbers) p=0.51 ⇒ Treatment and gender are independent

N (%)          Treatment 1   Treatment 2   Row total
Male           4 (10%)       6 (17%)       10
Female         35 (90%)      30 (83%)      65
Column total   39            36            75
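Fisher's exact test can be sketched directly from the hypergeometric distribution. The version below is a minimal illustration, not the software used for the talk. It assumes the common two-sided definition: sum the probabilities of all tables, with the same margins, that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Probability that the top-left cell equals k, with all margins fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small tolerance so tables exactly as likely as the observed one are included.
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= p_observed * (1 + 1e-9))


# Counts from the slide: rows are Male/Female, columns are Treatment 1/2.
print(f"p = {fisher_exact_two_sided(4, 6, 35, 30):.2f}")
```

For the slide's table this gives a p-value of about 0.51, in line with the result quoted.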

2-level categorical (binary) data – Measure of agreement Kappa: a measure of agreement between reviewers, above that expected by chance κ=0.71 (95%CI ) ⇒ There is good agreement between reviewers [Table: Reviewer 1 by Reviewer 2, each Response / No response, with row and column totals; cell counts not preserved] Altman guidelines: <0.20 poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good, 0.81–1.00 very good

2-level categorical (binary) data – Measure of agreement Kappa: a measure of agreement between reviewers, above that expected by chance κ=-0.04 (95%CI ) ⇒ There is poor agreement between reviewers [Table: Reviewer 1 by Reviewer 2, each Response / No response, with row and column totals; cell counts not preserved] Altman guidelines: <0.20 poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good, 0.81–1.00 very good
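The reviewers' counts did not survive in this transcript, so the sketch below uses made-up counts purely to show how kappa is calculated: observed agreement minus chance agreement, divided by the maximum possible improvement over chance. The function names and the example table are assumptions. The bands in `altman_label` follow Altman's published guidelines.

```python
def cohen_kappa(table):
    """Unweighted kappa for a square agreement table (rows: reviewer A, columns: reviewer B)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance if the two reviewers rated independently.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)


def altman_label(kappa):
    """Descriptive label for kappa using Altman's bands."""
    for upper, label in [(0.20, "poor"), (0.40, "fair"), (0.60, "moderate"), (0.80, "good")]:
        if kappa <= upper:
            return label
    return "very good"


# Hypothetical counts (NOT from the talk): both reviewers agree on 85 of 100 patients.
example = [[40, 5], [10, 45]]
kappa = cohen_kappa(example)
print(f"kappa = {kappa:.2f} ({altman_label(kappa)})")
```

The same function works for tables with more than two categories, although ordered categories would normally call for a weighted kappa instead.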

2-level categorical (binary) data – Exploring patterns in the data Odds ratio (OR): the ratio of the odds of an event occurring in the 1st group to the odds of it occurring in the 2nd group OR=1 - event is equally likely to occur in both groups OR>1 - event is more likely to occur in 1st group OR<1 - event is less likely to occur in 1st group OR=4.1 (95%CI ) ⇒ The odds of a male having a response are 4 times those of a female having a response [Table: Gender (Male / Female) by Response (Yes / No), with row and column totals; cell counts not preserved]

2-level categorical (binary) data – Exploring patterns in the data Relative Risk (RR): the ratio of the risk of an event occurring in the 1st group to the risk of it occurring in the 2nd group RR=1 - event is equally likely to occur in both groups RR>1 - event is more likely to occur in 1st group RR<1 - event is less likely to occur in 1st group RR=1.7 (95%CI ) ⇒ New treatment patients are 1.7 times as likely to suffer an SAE as control patients [Table: Treatment (New trt / Control) by SAE suffered (Yes / No), with row and column totals; cell counts not preserved]
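The difference between the odds ratio and the relative risk is easiest to see side by side. The sketch below uses invented counts, since the tables on these slides were not preserved, and the function names are illustrative. The confidence interval for the OR uses the usual log-scale (Woolf) approximation, which is an assumption about method rather than something stated in the talk.

```python
import math


def odds_ratio(a, b, c, d):
    """OR for [[a, b], [c, d]]: rows are groups 1 and 2, columns are event yes/no."""
    return (a * d) / (b * c)


def relative_risk(a, b, c, d):
    """RR for the same layout: risk in group 1 divided by risk in group 2."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Approximate 95% CI for the OR, calculated on the log scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts (NOT from the talk): 20/100 events in group 1, 10/100 in group 2.
a, b, c, d = 20, 80, 10, 90
low, high = odds_ratio_ci(a, b, c, d)
print(f"RR = {relative_risk(a, b, c, d):.2f}")
print(f"OR = {odds_ratio(a, b, c, d):.2f} (95% CI {low:.2f} to {high:.2f})")
```

In this made-up example the OR (2.25) is further from 1 than the RR (2.0), which is the usual pattern when the event is not rare.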

Odds Ratio/Relative Risk plots [figure not reproduced; axis marked at 0.5 and 2]

Exploring patterns in multivariate data - Logistic Regression A statistical modelling method that describes the relationship between a categorical response variable and 1 or more categorical and/or continuous variables e.g. Association between bearing grudges & medical conditions [Table: OR, 95% CI and p for Heart attack, High blood pressure, Heart disease, Epilepsy and Stroke; values not preserved]

Ordered categorical data – Test for trend Null hypothesis: No linear trend between groups Chi-squared test for trend χ²=10.8 p=0.001 ⇒ There is a linear trend between groups

N (%)          Treatment 1   Treatment 2   Row total
Mild           17 (20%)      32 (38.5%)    49
Moderate       29 (35%)      32 (38.5%)    61
Severe         38 (45%)      19 (23%)      57
Column total   84            83            167

Ordered categorical data – Test for trend (>2 rows & columns) Null hypothesis: No linear trend between rows and columns Chi-squared test for trend χ²=7.1 p=0.008 ⇒ There is a linear trend between rows & columns

N (%)          1mg         2mg          3mg         Row total
Mild           30 (36%)    19 (23%)     18 (22%)    67
Moderate       31 (37%)    32 (38.5%)   27 (33%)    90
Severe         22 (27%)    32 (38.5%)   37 (45%)    91
Column total   83          83           82          248
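One common form of the chi-squared test for trend is the linear-by-linear association statistic, (N − 1) × r², where r is the correlation between row and column scores. The slides do not say which variant was used, so treating it as this one is an assumption, as are the equally spaced scores 1, 2, 3 and the function name. It does reproduce both quoted results to the precision shown.

```python
import math


def trend_chi2(table):
    """Linear-by-linear trend statistic (1 df) with scores 1, 2, 3, ... for rows and columns."""
    cells = [(i + 1, j + 1, count)
             for i, row in enumerate(table)
             for j, count in enumerate(row)]
    n = sum(count for _, _, count in cells)
    sum_x = sum(x * count for x, _, count in cells)
    sum_y = sum(y * count for _, y, count in cells)
    sxx = sum(x * x * count for x, _, count in cells) - sum_x ** 2 / n
    syy = sum(y * y * count for _, y, count in cells) - sum_y ** 2 / n
    sxy = sum(x * y * count for x, y, count in cells) - sum_x * sum_y / n
    chi2 = (n - 1) * sxy ** 2 / (sxx * syy)
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-squared with 1 df
    return chi2, p


# Counts from the two trend slides (rows: Mild, Moderate, Severe).
two_treatments = [[17, 32], [29, 32], [38, 19]]
three_doses = [[30, 19, 18], [31, 32, 27], [22, 32, 37]]
for name, table in [("2 treatments", two_treatments), ("3 doses", three_doses)]:
    chi2, p = trend_chi2(table)
    print(f"{name}: chi-squared = {chi2:.1f}, p = {p:.3f}")
```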

Ordered categorical data – Measure of agreement Weighted kappa: a measure of agreement between reviewers, above that expected by chance κ=0.38 (95%CI ) ⇒ There is fair agreement between reviewers [Table: Reviewer 1 by Reviewer 2, each CR / PR / SD, with row and column totals; cell counts not preserved] Altman guidelines: <0.20 poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good, 0.81–1.00 very good

Non-ordered categorical data - Test of association Null hypothesis: The 2 factors are independent Chi-squared test χ²=0.51 p=0.78 ⇒ Treatment and disease site are independent

N (%)          Treatment 1   Treatment 2   Row total
Head & Neck    26 (23%)      29 (26%)      55
Limbs          32 (28%)      33 (30%)      65
Body           55 (49%)      49 (44%)      104
Column total   113           111           224
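For a table with more than two non-ordered levels, the ordinary (Pearson) chi-squared statistic compares each observed count with the count expected under independence. The sketch below is illustrative, with an assumed function name. To stay within the standard library it only computes the p-value for the 2-degrees-of-freedom case, where the upper tail has the closed form exp(−χ²/2).

```python
import math


def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Counts from the slide (rows: Head & Neck, Limbs, Body; columns: Treatment 1, 2).
chi2, df = pearson_chi2([[26, 29], [32, 33], [55, 49]])
p = math.exp(-chi2 / 2)  # valid only because df == 2 here
print(f"chi-squared = {chi2:.2f} on {df} df, p = {p:.2f}")
```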

Non-ordered categorical data – Measure of agreement Kappa: a measure of agreement between reviewers, above that expected by chance κ=0.31 (95%CI ) ⇒ There is fair agreement between reviewers [Table: Reviewer 1 by Reviewer 2, each A / B / C, with row and column totals; cell counts not preserved] Altman guidelines: <0.20 poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good, 0.81–1.00 very good

Categorical data – RECAP

Levels             Test of association                                        Measure of agreement   Exploring patterns in the data
2                  χ² test with continuity correction; Fisher’s exact test    Kappa                  Odds Ratio & Relative Risk; Logistic regression
>2 (ordered)       χ² test for trend                                          Weighted kappa         Not covered
>2 (non-ordered)   χ² test                                                    Kappa                  Not covered


Normally distributed data The data form a bell-shaped curve and give a non-significant Shapiro-Wilk test result

Mean & Standard Deviation graph [figure not reproduced: change over time in QOL (%) by treatment]

Parametric tests Differences between means of 2 groups –T-tests Differences between means of >2 groups –ANOVA –Linear regression Correlation –Pearson’s correlation coefficient, r
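As an illustration of the first item, the two-sample t statistic compares the difference in means with its standard error. The sketch below assumes the equal-variance (pooled) form of the test and uses invented data. It stops at the statistic and degrees of freedom, since the p-value needs the t distribution, which is not in Python's standard library.

```python
import math
import statistics


def pooled_t(x, y):
    """Two-sample t statistic (equal variances assumed) and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(x) - statistics.mean(y)) / standard_error, df


# Hypothetical QOL change scores for two treatment groups (NOT from the talk).
group_a = [5.1, 4.9, 6.2, 5.8, 6.0]
group_b = [4.2, 4.8, 5.0, 4.4, 4.6]
t, df = pooled_t(group_a, group_b)
print(f"t = {t:.2f} on {df} degrees of freedom")
```

The statistic would then be compared with a t distribution on the stated degrees of freedom, using tables or a statistics package.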

Non-normally distributed data

Box and Whisker graphs Outliers (observations that lie outside of the 95% CIs) are sometimes plotted individually

Box and Whisker graphs Parallel box plots show the differences between groups

Non-parametric tests Differences between medians of 2 groups –Wilcoxon rank sum test Differences between medians of >2 groups –Kruskal-Wallis 1-way analysis of variance test Correlation –Spearman’s rank order correlation coefficient, ρ
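These tests work on ranks rather than the raw values. A minimal sketch of the Wilcoxon rank sum calculation is given below, with invented data and illustrative function names. It returns the rank sum for the first group and the equivalent Mann-Whitney U, and leaves the p-value to tables or software.

```python
def midranks(values):
    """Rank values from 1 upwards, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum(x, y):
    """Rank sum W of group x in the pooled sample, and the Mann-Whitney U for x."""
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:len(x)])
    u = w - len(x) * (len(x) + 1) / 2
    return w, u


# Hypothetical skewed measurements for two groups (NOT from the talk).
w, u = rank_sum([3, 5, 8, 40], [1, 2, 4, 6])
print(f"W = {w}, U = {u}")
```

Because only the ordering matters, the extreme value of 40 has no more influence than any other largest observation would.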

Transforming data Can transform non-normally distributed data (e.g. logarithm, square root, reciprocal) to create normally distributed data Then analyse transformed data using parametric methods
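A small sketch of the logarithm case, using invented values: the analysis is done on the log scale and the result is transformed back at the end. Back-transforming a mean of logs gives the geometric mean rather than the arithmetic mean, which is worth remembering when reporting results. The function name is an assumption.

```python
import math
import statistics


def back_transformed_log_mean(values):
    """Mean calculated on the log10 scale, then transformed back to the original scale."""
    logs = [math.log10(v) for v in values]
    return 10 ** statistics.mean(logs)


# Hypothetical right-skewed measurements (NOT from the talk).
data = [1, 10, 100]
print("arithmetic mean:", statistics.mean(data))
print("back-transformed mean of logs:", back_transformed_log_mean(data))
```

The log transformation only applies to strictly positive values. Data containing zeros or negative numbers would need a different transformation or an offset.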


Time-to-event data Why is this different to other continuous data? –Censoring [Figure not reproduced: timeline for each patient (TNO) with a key for randomisation date, date of event and censor date; times shown: 20*, 8, 8*, 14, 1*, 16*]

What time? What event? Start date? –Diagnosis –Surgery –Randomisation –Start/End of treatment Event? –Onset / worsening of pain –Hospital discharge –Death (OS) –Relapse (RFI/DFI/Plateau) –Relapse or death (RFS/DFS) You need to know what you’re looking at to know how to interpret it / what to compare it to

Time-to-event data analysis (‘Survival Analysis’) Can be used to measure time to any event –Arthritic joint remaining pain-free post steroid injections –Elderly patient with a fractured hip remaining in hospital Calculate ‘survival’ time for each patient (some may be censored times) –Recruitment takes place over time so varying lengths of follow-up are expected Rank these times and calculate proportions alive at certain points, with due allowance for incomplete follow-up These proportions and times are plotted and overall distributions of curves compared
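The "rank the times and calculate proportions" step is what the Kaplan-Meier method does. The sketch below is a minimal version with an illustrative function name. The six times are the ones shown on the censoring slide, and reading the asterisk as "censored" is an assumption about that figure.

```python
def kaplan_meier(times, events):
    """Return [(time, estimated survival proportion)] at each distinct event time.

    events[i] is True if patient i had the event, False if the time is censored.
    A patient censored at an event time is counted as still at risk at that time.
    """
    data = list(zip(times, events))
    survival = 1.0
    curve = []
    for t in sorted({time for time, event in data if event}):
        at_risk = sum(1 for time, _ in data if time >= t)
        deaths = sum(1 for time, event in data if time == t and event)
        survival *= (at_risk - deaths) / at_risk
        curve.append((t, survival))
    return curve


# Times from the censoring slide: 20*, 8, 8*, 14, 1*, 16* (* assumed to mean censored).
times = [20, 8, 8, 14, 1, 16]
events = [False, True, False, True, False, False]
for t, s in kaplan_meier(times, events):
    print(f"time {t}: proportion event-free = {s:.2f}")
```

The patient censored at time 1 drops out of the "at risk" count before the first event, which is how the method makes allowance for incomplete follow-up.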


Kaplan-Meier Curves Median survival = 1.3 years Minimum & median follow-up (FU) indicate the maturity of the data

Kaplan-Meier Curves [figure not reproduced: curves for ECMF and CMF with numbers at risk; one of the plotted survival values is 84%]

Undesirable comparisons of survival rates

Statistical tests for time-to-event data Log-rank tests compare the overall distributions of the curves (χ² and p-value presented) –Null hypothesis: all curves are samples from populations with the same risk of the event –Compares the number of deaths observed on each treatment arm with the number expected under the null hypothesis that the 2 survival distributions are identical Cox proportional hazards model (Hazard Ratio, 95% CIs and p-value presented) –Identifies which variables from a group of several are independently related to survival –In what order of importance –Gives you a measure of their relation to survival
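The observed-versus-expected idea behind the log-rank test can be sketched for two arms as follows. This is a minimal illustration with made-up follow-up times and an assumed function name, using the usual hypergeometric variance. It is not the software used for the talk, and it does not cover the Cox model.

```python
import math


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test: returns (chi-squared on 1 df, p-value)."""
    pooled = ([(t, e, "a") for t, e in zip(times_a, events_a)]
              + [(t, e, "b") for t, e in zip(times_b, events_b)])
    observed_a = expected_a = variance = 0.0
    for t in sorted({time for time, event, _ in pooled if event}):
        n_a = sum(1 for time, _, g in pooled if time >= t and g == "a")
        n_b = sum(1 for time, _, g in pooled if time >= t and g == "b")
        d_a = sum(1 for time, e, g in pooled if time == t and e and g == "a")
        d_b = sum(1 for time, e, g in pooled if time == t and e and g == "b")
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        # Events expected in arm A if both arms share the same risk at time t.
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance if variance > 0 else 0.0
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical follow-up in months (NOT from the talk); True = event, False = censored.
arm_a = ([6, 7, 10, 15, 19, 25], [True, False, True, True, False, True])
arm_b = ([12, 18, 22, 30, 34, 36], [True, False, True, False, False, True])
chi2, p = logrank(*arm_a, *arm_b)
print(f"log-rank chi-squared = {chi2:.2f}, p = {p:.3f}")
```

If the two arms have identical data, observed and expected events are equal at every time point and the statistic is zero.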

Forest plots [Bars=95% confidence interval. Size of boxes can represent sample size]

Longitudinal data analysis A variable can be measured on the same patient over time (e.g. Baseline, 3 month, 6 month …) Can be any type of data (categorical, continuous)

Longitudinal data analysis – Summary Measures [Figure not reproduced: change from baseline in Global QOL (improvement / deterioration) for CMF and ECMF (TRT A, TRT B); change at 1 year (p=0.01), change at 2 years (p=0.06)]

Longitudinal data analysis – Modelling Pulmonary function (TLCO score) over time Graphs show each patient as a separate line Solid line = Trt A pts Dashed line = Trt B pts Random effects modelling predicts the average patient score on each treatment arm

Cluster Randomised Trial data Patients within 1 cluster are often more likely to respond in a similar manner, and thus cannot be assumed to act independently ICC = Intracluster Correlation Coefficient. A statistical measure of this dependence –Takes values between 0 and 1 –Higher values = greater between-cluster variation, e.g. management within sites is consistent but, across different sites, there is wide variation Analysis must incorporate the effects of clustering i.e. the values of the ICC and design effect
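The slide names the design effect without giving a formula. A commonly used form for equal-sized clusters is 1 + (m − 1) × ICC, where m is the number of patients per cluster. The sketch below applies that formula to invented numbers, so the formula, the function names and the example values are all assumptions rather than part of the talk.

```python
def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(total_patients, cluster_size, icc):
    """Number of independent patients that the clustered sample is 'worth'."""
    return total_patients / design_effect(cluster_size, icc)


# Hypothetical trial (NOT from the talk): 20 sites of 20 patients, ICC = 0.05.
deff = design_effect(20, 0.05)
print(f"design effect = {deff:.2f}")
print(f"effective sample size = {effective_sample_size(400, 20, 0.05):.0f} of 400")
```

With an ICC of 0 the design effect is 1 and clustering can be ignored. Even a small ICC inflates the required sample size noticeably when clusters are large.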

Useful References
Gore & Altman – Statistics in Practice
Bland – An Introduction to Medical Statistics
Altman – Practical Statistics for Medical Research
Peto et al – Design and Analysis of Randomized Clinical Trials Requiring Prolonged Observation of Each Patient –1/ Introduction and Design. British Journal of Cancer (6) –2/ Analysis and Examples. British Journal of Cancer (1) 1-39