ASPIRE Class 5 Biostatistics and Data Collection Tools


ASPIRE Class 5 Biostatistics and Data Collection Tools Daniel M. Witt, PharmD, BCPS, FCCP

Learning Objectives ASPIRE Class 5: Biostatistics Differentiate between descriptive and inferential statistics Choose an appropriate statistical test based on the type of data being analyzed Describe the concepts of normal distribution, population, and sample Formulate the analytic plan for a research study Evaluate various data collection tools and databases to collect study data

Elements of a Research Protocol Background Population Design Objectives Procedures Analytical Plan

Class 5 Assignment Please come prepared with the above items October 19th, 2:30-5:00 p.m. @ Kaiser Permanente Central Support Services

Biostatistics Daniel M. Witt, PharmD, FCCP, BCPS, CACP Kaiser Permanente Colorado

Why Biostatistics? Which medical practices actually help? Determining what therapies are helpful based on simple experience doesn’t work Biologic variability Placebo effect

An Example Drug appears to be effective at increasing CO [Figure: cardiac output versus drug dose]

An Example Clearly no relationship between drug dose and CO [Figure: cardiac output versus drug dose]

Biostatistics A useful tool Turns clinical and laboratory experience into quantitative statements Determines whether and by how much a treatment or procedure affected a group of patients Turns boring data into an interesting story

Learning Point Experiments rarely include entire population Selecting unrepresentative samples (bad luck) is unlikely but possible Biostatistical procedures permit estimation of the chance of such bad luck Tell a story (who, what, why, where, how)

General Research Goals Obtain descriptive information about a population based on a sample of that population Test hypotheses about the population Minimize bias

Random Variables Definition: Outcomes of an experiment or observation whose values cannot be anticipated with certainty Two types: Discrete Continuous “This is important because….” choosing (and evaluating) statistical methods depends, in part, on the type of data (variables) used

Discrete (counting) Variables 2 types: Nominal: classified into groups in no particular order, and with no indication of relative severity (e.g., sex, mortality, disease state, bleeding, stroke, MI) Ordinal: ranked in a specific order, but with no consistent level of magnitude difference between ranks (e.g., NYHA class, trauma score) Caution: Mean and standard deviation are NOT reported with this type of data

Continuous (measuring) Variables Data are ranked in a specific order with a consistent change in magnitude between units (e.g., heart rate, LDL cholesterol, blood glucose, INR, blood pressure, time, distance)

Summarizing Data Normal distribution (most common model for population distributions): bell-shaped frequency distribution Landmarks: mean (x̄) and standard deviation (SD) [Figure: normal curve, N=200, Mean=40, SD=5.0]

Mean (average) Only used for continuous, normally distributed data Sensitive to outliers Most commonly used measure of central tendency [Figure: frequency distribution, SD=2.5]

Non-Normal Distributions [Figure: non-normal distribution, N=100, Mean=37.6, SD=4.5] Although mean and SD can be calculated for any population, they do not summarize a non-normal distribution as well as they summarize a normal one A better approach is to use percentiles

Median (50th percentile) Half of observations fall below and half lie above Can be used for ordinal or continuous data Insensitive to outliers
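
The contrast between the mean (sensitive to outliers) and the median (insensitive) can be illustrated with a minimal sketch. The values below are hypothetical INR-like readings made up for illustration; the last one is a deliberate outlier.

```python
import statistics

# Hypothetical readings (not from the slides); 9.0 is a deliberate outlier.
values = [2.1, 2.3, 2.4, 2.5, 2.6, 2.8, 9.0]

mean_all = statistics.mean(values)            # pulled upward by the outlier
median_all = statistics.median(values)        # middle value once ranked

# Same summary statistics with the outlier removed.
mean_trimmed = statistics.mean(values[:-1])
median_trimmed = statistics.median(values[:-1])

print(f"with outlier:    mean={mean_all:.2f}, median={median_all:.2f}")
print(f"without outlier: mean={mean_trimmed:.2f}, median={median_trimmed:.2f}")
```

In this sketch the single outlier moves the mean by almost a full unit while the median barely changes.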

Percentiles 25th percentile 75th percentile The point in a distribution where a value is larger than 25% or 75% of the other values in the sample Does not assume that the population has a normal distribution
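
As a rough illustration of summarizing skewed data with percentiles, the sketch below uses Python's standard `statistics.quantiles` on hypothetical length-of-stay values (invented for this example). Different software packages interpolate percentiles slightly differently, so the exact cut points may vary.

```python
import statistics

# Hypothetical, right-skewed length-of-stay data in days (illustration only).
days = [1, 2, 2, 3, 3, 4, 5, 6, 8, 14, 30]

# n=4 returns the three quartile cut points: 25th, 50th, and 75th percentiles.
q1, q2, q3 = statistics.quantiles(days, n=4)
mean_days = statistics.mean(days)

print(f"25th percentile={q1}, median={q2}, 75th percentile={q3}")
print(f"mean={mean_days:.1f} (pulled toward the long right tail)")
```

Here the mean lands well above the median, which is why the median and the 25th to 75th percentile range are often reported for data like these.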

Standard Deviation (SD) Appropriately applied only to data that are normally or near normally distributed Applicable only to continuous data Within +/- 1 SD are found about 68% of the sample’s values Within +/- 2 SD are found about 95% of the sample’s values [Figure: normal curve marked at - 2SD, - 1SD, mean, + 1SD, + 2SD]
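
The 68% and 95% figures can be checked numerically. This sketch assumes an exactly normal distribution and borrows Mean=40 and SD=5 from the earlier Summarizing Data slide.

```python
from statistics import NormalDist

# Normal model with the mean and SD shown on the Summarizing Data slide.
dist = NormalDist(mu=40, sigma=5)

# Proportion of values within 1 SD and within 2 SD of the mean.
within_1sd = dist.cdf(45) - dist.cdf(35)
within_2sd = dist.cdf(50) - dist.cdf(30)

print(f"within +/- 1 SD: {within_1sd:.1%}")
print(f"within +/- 2 SD: {within_2sd:.1%}")
```

The exact values are slightly above 68% and 95%, so the slide's figures are best read as convenient approximations.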

Hypothesis Testing The null hypothesis (Ho) posits no difference between groups being compared (Group A = Group B) A statistical convention (but a good one) used to assist in determining if any observed differences between groups are due to chance alone (bad luck) In other words, is any observed difference likely due to sampling variation?

Hypothesis Testing Example: A new anti-obesity medication is compared to an existing one to determine if one agent is better at achieving goal BMI at the recommended starting dose. Results: Ho: success rate for new drug = success rate for old drug

Hypothesis Testing Tests for statistical significance determine if the data are consistent with Ho If Ho is “rejected” = statistically significant difference between groups (unlikely due to chance or ‘bad luck’) If Ho is “accepted” = no statistically significant difference between groups (results may be due to ‘bad luck’)

Hypothesis Testing The distribution (range of values) of a test statistic when Ho is true is known Depending on the statistic’s value, Ho is accepted or rejected Choosing the appropriate statistical test depends on: Type of data (nominal, ordinal, continuous) Study design (parallel, cross-over, etc.) Presence of confounding variables

Hypothesis Testing For our example, data are nominal, parallel design with no confounders Appropriate test is χ² The frequency distribution of χ² when Ho is true is shown above [Figure: χ² distribution with critical values 3.84 (p = 0.05) and 6.64 (p = 0.01)]

Hypothesis Testing Large values are possible when Ho is true, but they occur infrequently (χ² exceeds 3.84 only 5% of the time and exceeds 6.64 only 1% of the time) These extreme values are used to demarcate the point(s) at which Ho is accepted or rejected

Hypothesis Testing For our example: using the data in the formula for calculating χ² yields a value of 1.64 Because 1.64 < 3.84, accept Ho and say that the new drug is not statistically significantly better than the old drug in getting patients to their goal BMI with the recommended starting dose [Figure: χ² axis with 1.64 and 3.84 marked]
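
The chi-square decision rule can be sketched in a few lines. The transcript does not include the study's results table, so the 2x2 counts below are hypothetical and will not reproduce the slide's value of 1.64. The sketch uses the shortcut formula for a 2x2 table without a continuity correction, and the p-value formula holds only for 1 degree of freedom.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_df1(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))

# Hypothetical counts (assumption): rows are new drug and old drug,
# columns are reached goal BMI and did not reach goal BMI.
chi2 = chi_square_2x2(40, 60, 31, 69)
critical_value = 3.84  # cutoff for a significance level of 0.05 with 1 df

print(f"chi-square = {chi2:.2f}, p = {p_value_df1(chi2):.3f}")
print("reject Ho" if chi2 > critical_value else "do not reject Ho")
```

Applying `p_value_df1` to the slide's own statistic of 1.64 gives a p-value of roughly 0.20, consistent with the slide's conclusion.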

Decision Errors

Decision Errors The probability of making a Type I error (rejecting Ho when it is actually true) is defined as the significance level α Setting α at 0.05 means that, when Ho is true, it will be wrongly rejected about 1 time out of 20 The “p-value” is the probability, calculated assuming Ho is true, of obtaining a result at least as extreme as the one observed When the α level is set a priori, Ho is rejected when p < α
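
A small simulation can make the meaning of the significance level concrete. In this sketch both groups are drawn from the same population, so Ho is true by construction and every rejection is a Type I error. The sample size, number of trials, seed, and the use of a z-test with known SD are all assumptions chosen for simplicity.

```python
import random
from math import sqrt

def false_positive_rate(trials=2000, n=30, sd=1.0, seed=0):
    """Fraction of simulated studies that reject Ho even though Ho is true."""
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided critical value for a significance level of 0.05
    rejections = 0
    for _ in range(trials):
        # Both groups come from the same normal population (Ho is true).
        group_a = [rng.gauss(0.0, sd) for _ in range(n)]
        group_b = [rng.gauss(0.0, sd) for _ in range(n)]
        z = (sum(group_a) / n - sum(group_b) / n) / (sd * sqrt(2 / n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate = false_positive_rate()
print(f"Type I error rate in simulation: {rate:.3f}")
```

The simulated rate should land close to 0.05, that is, about 1 false rejection in every 20 studies where no real difference exists.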

Decision Errors The probability of making a Type II error (accepting Ho when it should be rejected) is termed β By convention, β should be < 0.20

Decision Errors Power (1-β) The ability to detect actual differences between groups Power is increased by: Increasing α Increasing n Large differences between populations Power is decreased by: Poor study design Incorrect statistical tests
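
The three factors that increase power can be explored with a rough calculation. This sketch assumes a two-sided, two-sample z-test with equal group sizes and a known common SD; the function name `approx_power` and all input numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided two-sample z-test."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # True difference expressed in standard-error units.
    shift = abs(delta) / (sd * sqrt(2 / n_per_group))
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

baseline = approx_power(delta=5, sd=10, n_per_group=50)
larger_n = approx_power(delta=5, sd=10, n_per_group=100)
larger_alpha = approx_power(delta=5, sd=10, n_per_group=50, alpha=0.10)
larger_difference = approx_power(delta=8, sd=10, n_per_group=50)

print(f"baseline power:      {baseline:.2f}")
print(f"doubling n:          {larger_n:.2f}")
print(f"alpha raised to 0.10: {larger_alpha:.2f}")
print(f"larger difference:   {larger_difference:.2f}")
```

Each change raises power above the baseline, though raising the significance level does so at the cost of more Type I errors.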

Statistical Significance Areas for Vigilance Size of p-value is not related to the importance of the result Statistically significant does not necessarily mean clinically significant Lack of statistical significance does not mean results are unimportant

Choosing a Statistical Test Parametric versus non-parametric Parametric tests assume an underlying normal distribution Non-parametric tests: Non-normally distributed data Nominal or ordinal data

Choosing a Statistical Test Continuous Data Student’s t-test 1 sample: compares mean of study population to the mean of a population whose mean is known 2 sample (independent samples): compares the means of 2 normal distributions Paired: compares the means of paired or matched samples
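
For the 2 sample (independent samples) case, the test statistic can be computed by hand. This is a minimal sketch of the pooled-variance form of Student's t-test using hypothetical heart-rate values; it returns the statistic and degrees of freedom only, and the result would then be compared with a t table or passed to statistical software for a p-value.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample_a, sample_b):
    """Two-sample Student's t statistic (equal variances assumed) and its df."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df

# Hypothetical heart rates (beats/min) in two independent groups.
group_a = [70, 72, 74, 76, 78]
group_b = [72, 74, 76, 78, 80]

t_stat, df = pooled_t(group_a, group_b)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

With 8 degrees of freedom the two-sided critical value at the 0.05 level is about 2.31, so a statistic of this size would not lead to rejecting Ho.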

Choosing a Statistical Test Continuous Data Analysis of variance (ANOVA) Compares the means of 3 or more groups in a study Multiple comparison procedures are used to determine which groups actually differ from each other (e.g., Bonferroni, Tukey, Scheffe, others) Analysis of covariance (ANCOVA) Controls for the effects of confounding variables

Choosing a Statistical Test Ordinal Data Wilcoxon rank sum Mann-Whitney U Wilcoxon signed rank Kruskal-Wallis Friedman These tests may also be used for non-normally distributed continuous data
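
To show how a rank-based test works with ordinal data, the sketch below computes the Mann-Whitney U statistic by direct pairwise comparison. The scores are hypothetical ordinal ratings; the sketch stops at the statistic, which would then be checked against a U table or a normal approximation.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: pairs where a > b count 1, ties count 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Hypothetical ordinal scores (e.g., a 1 to 4 severity class) in two groups.
group_a = [1, 2, 2, 3]
group_b = [2, 3, 4, 4]

u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)
print(f"U for group A = {u_a}, U for group B = {u_b}")
```

Only the ordering of the scores matters here, not the size of the gaps between them, which is why the approach suits ordinal data.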

Choosing a Statistical Test Nominal Data χ² (chi-square) Compares percentages between 2 or more groups Fisher’s exact test Infrequent outcomes McNemar’s Paired samples Mantel-Haenszel Controls for influence of confounders
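
When outcomes are infrequent, Fisher's exact test computes the p-value directly from the hypergeometric distribution rather than relying on the chi-square approximation. This sketch uses one common two-sided definition (summing the probabilities of all tables no more likely than the observed one) and a hypothetical 2x2 table.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Probability of k events in row 1 with all margins held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small tolerance so that equally likely tables are not lost to rounding.
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

# Hypothetical counts: 3 of 4 events in one group versus 1 of 4 in the other.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided p = {p_value:.4f}")
```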

95% Confidence Intervals When the ABSOLUTE difference between groups is considered: A 95% confidence interval that excludes zero is considered statistically significant The 95% confidence interval also provides information regarding the MAGNITUDE of the difference between groups
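
A 95% confidence interval for the absolute difference between two success rates can be sketched as follows. The counts are hypothetical, and the simple normal-approximation (Wald) interval used here is only one of several available methods.

```python
from math import sqrt
from statistics import NormalDist

def diff_in_proportions_ci(success_a, n_a, success_b, n_b, level=0.95):
    """Wald confidence interval for the difference p_a - p_b."""
    p_a, p_b = success_a / n_a, success_b / n_b
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    standard_error = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_a - p_b
    return diff - z * standard_error, diff + z * standard_error

# Hypothetical results: 40 of 100 successes versus 31 of 100 successes.
lower, upper = diff_in_proportions_ci(40, 100, 31, 100)
print(f"absolute difference = 0.09, 95% CI {lower:.3f} to {upper:.3f}")
print("statistically significant" if lower > 0 or upper < 0 else "not statistically significant")
```

Because this interval includes zero, the difference is not statistically significant, and the width of the interval shows how large or small the true difference could plausibly be.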

Regression Regression useful in constructing predictive models Multiple regression involves modeling many possible predictor variables to ascertain which predict a particular target variable Regression modeling often used to control or adjust for the effects of confounding variables
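
As a minimal illustration of a predictive model, the sketch below fits a straight line with one predictor by ordinary least squares. The dose and cardiac output values are hypothetical, and multiple regression with several predictors would normally be done with statistical software rather than by hand.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

# Hypothetical data: drug dose (arbitrary units) and cardiac output (L/min).
dose = [1, 2, 3, 4, 5]
cardiac_output = [4.1, 4.4, 5.0, 5.3, 5.9]

slope, intercept = fit_line(dose, cardiac_output)
predicted_at_6 = intercept + slope * 6
print(f"cardiac output = {intercept:.2f} + {slope:.2f} x dose")
print(f"predicted cardiac output at dose 6: {predicted_at_6:.2f}")
```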

Example of predictive modeling Expected performance derived from regression model [Figure legend: expected performance (99% CI); observed performance; observed differs from expected by >5%] Circ Cardiovasc Qual Outcomes 2011;4:22-29

Survival Analysis Studies the time between entry into a study and some event (e.g., death) Takes into account that some subjects leave the study due to reasons other than the ‘event’ (e.g. lost to follow up, study period ends) May be utilized to arrive at different types of models Kaplan-Meier Cox Regression Model Proportional hazards regression analysis

Kaplan Meier Uses survival times (or censored survival times) to estimate the proportion of people who would survive a given length of time under the same circumstances Allows for the production of a survival curve Uses log-rank test to test for statistically significant differences between groups
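
The Kaplan-Meier calculation can be sketched directly: at each time where an event occurs, the running survival estimate is multiplied by the fraction of at-risk subjects who did not have the event. The follow-up times and event flags below are hypothetical.

```python
def kaplan_meier(times, events):
    """Return (time, estimated survival) pairs at each time with an event.

    events[i] is 1 if the event occurred at times[i], 0 if the subject was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        event_count = 0
        leaving = 0
        # Group all subjects who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            event_count += data[i][1]
            leaving += 1
            i += 1
        if event_count:
            survival *= 1 - event_count / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without lowering survival.
        at_risk -= leaving
    return curve

# Hypothetical follow-up in months; 0 marks censoring (e.g., lost to follow-up).
times = [2, 3, 3, 5, 8, 8, 9, 12]
events = [1, 1, 0, 1, 0, 1, 1, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"month {t}: estimated proportion surviving = {s:.3f}")
```

Plotting these pairs as a step function gives the survival curve; comparing two such curves with the log-rank test is a separate step not shown here.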

Survival Analysis-Kaplan Meier Survival Curve [Figure: cumulative proportion surviving (0.0 to 1.0) versus time, treatment and control groups]

Cox Regression Modeling Reported graphically like Kaplan-Meier Investigates several variables at a time Allows calculation of relative risk estimate while adjusting for differences between groups