Going from data to analysis Dr. Nancy Mayo

Getting it right
– Research is about getting the right answer, not just an answer
– An answer is easy
– The right answer is hard to find

© Nancy E. Mayo
Types of Questions
– About hypotheses: Is treatment A better than treatment B? Answer: Yes or No
– About parameters: What is the extent to which treatment A improves outcome in comparison to treatment B? Answer: A number / value (parameter)

Research is about relationships
– Links one variable or factor to another
– One is thought or supposed (hypothesized) to be the “cause” of the second variable

What’s in a name?
Discipline | Cause | Effect
Epidemiology | Exposure | Outcome
Medical/clinical | Risk factor | Disease
Psychology | Independent | Dependent
Statistical | Stimulus | Response
Mathematical | X | Y

Why do I need statistics?
– Reduce data
– Define relationships
– Make inferences from your sample to the population

[Figure: scatter plots of Y (outcome, dependent variable) against X (exposure, independent variable); panels: Linear, None]
Only linear relationships can be examined by correlation
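
To illustrate why correlation only captures linear relationships, here is a minimal sketch. The data are invented for illustration, and `pearson_r` is a helper written for this example rather than anything from the slides. The curved outcome depends completely on X, yet its Pearson correlation is essentially zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical exposure values, symmetric around zero
x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]   # straight-line relationship
y_curved = [v ** 2 for v in x]      # U-shaped relationship

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)
print("linear:", round(r_linear, 3), "curved:", round(r_curved, 3))
```

A scatter plot would show the U-shape immediately, which is one reason to plot the data before relying on a correlation coefficient.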

© Nancy E. Mayo 2004
Inference from Sample to Population
[Figure: Population (Target, Available) and Sample; need stats to go from the sample to the population]

What kind of statistics do I need?

Depends on your DATA: Measured or Counted

Only 2 kinds of data
Measured = Continuous
– can take on any value, the precision of which depends upon the calibration of your measurement device
– Distribution is expected to be normal
Counted = Categorical (values are fixed)
– Binary (dichotomous) or polychotomous
– Ordinal: ranked (need for assistance); interval (categories are equally spaced: falls); ratio (there is a natural 0)
– Nominal: named values, no order (diagnosis)

Your Job
When reading an article (later doing your own research):
– IDENTIFY THESE VARIABLES
– IDENTIFY WHAT SCALE THEY ARE MEASURED ON
– MATCH DATA TO ANALYSIS

Quantitative Research: The answer to the question is found in the tables

What tables should I find in an article?
Table 1 – basic characteristics of the sample
Table 2 – outcomes / exposures
Table 3 – answer to the main question: relationship between exposure and outcome
Table 4 – interesting subgroups

What tables should I find in an article?
Table 1 – characteristics of the sample on features relating to target and available population
Table 2 – distribution of the sample on exposure and outcome variables
Table 3 – relationship between the exposure and outcome
Table 4 – interesting subgroups

What kind of statistics should I find in these Tables?

What kind of statistics are there?
– Depends on your DATA
– Depends on your QUESTION

Data
Uses | Continuous | Categorical
Reduce data (descriptive) | Means (SD); medians (percentiles, range) | Proportions
Define relationships | Scatter plot; linear (Pearson correlation) | Histogram; correlation (Spearman ranked); relative risk
Make inferences, simple univariate (bivariate) | t-test (independent); paired t-test | Chi-square test; McNemar’s test
Make inferences, multivariate | ANOVA; multiple linear regression | Logistic regression
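
One way to practise matching data to analysis is to treat the grid on this slide as a lookup table. The sketch below does that; the key names (`"describe"`, `"relate"`, and so on) and the function `suggest_statistics` are invented for this example, and the entries simply restate the slide.

```python
# Lookup keyed by (use, data type); entries restate the slide's grid.
STATISTICS_BY_USE = {
    ("describe", "continuous"): ["mean (SD)", "median (percentiles, range)"],
    ("describe", "categorical"): ["proportions"],
    ("relate", "continuous"): ["scatter plot", "Pearson correlation"],
    ("relate", "categorical"): ["histogram", "Spearman rank correlation", "relative risk"],
    ("infer_bivariate", "continuous"): ["independent t-test", "paired t-test"],
    ("infer_bivariate", "categorical"): ["chi-square test", "McNemar's test"],
    ("infer_multivariate", "continuous"): ["ANOVA", "multiple linear regression"],
    ("infer_multivariate", "categorical"): ["logistic regression"],
}

def suggest_statistics(use, data_type):
    """Return the statistics the grid lists for this use and data type."""
    return STATISTICS_BY_USE[(use, data_type)]

print(suggest_statistics("infer_bivariate", "categorical"))
```

The lookup is only a starting point: the choice also depends on the question being asked and on the study design (for example, paired versus independent groups).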

Standard Normal Distribution Showing the proportion of the population that lies within 1, 2 and 3 SD (Wikipedia)

Questions
 | HYPOTHESIS | PARAMETER
Question | Question is answered by YES or NO | Question demands a numeric response
Test or parameter | Value of the test has no meaning (t-test, F test) | Difference between two means, a rate or a risk
Significance | P-value (probability that a result at least as extreme as the one observed would occur by chance alone) | 95% confidence intervals (with studies of this nature, 95% of the intervals calculated this way will contain the true mean)

Uses | Continuous | Categorical
Reduce data (descriptive) | Means (SD); medians (percentiles, range) | Proportions
Let’s look at Table 1

Data
Uses | Continuous | Categorical
Define relationships | Scatter plot; linear (Pearson correlation) | Histogram; correlation (Spearman ranked); relative risk
Go to internet: scatter plot
Go to internet: histogram

Probability
– Degree of likelihood that something will happen.
– Statistical probabilities are expressed as decimals between 0 and 1 (for example 0.5, 0.25, 0.75).
– A probability of 0 means that something can never happen; a probability of 1 means that something will always happen.
– The probability of an event is calculated as follows: n favourable outcomes / n of all possible outcomes
– The probability of getting heads in one toss is: p(heads) = 1/(1 + 1) = 1/2.
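
The counting rule on this slide can be written out directly. This is a small sketch for equally likely outcomes only; the function name `probability` and the die example are additions for illustration.

```python
from fractions import Fraction

def probability(n_favourable, n_possible):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(n_favourable, n_possible)

p_heads = probability(1, 2)   # one favourable outcome (heads) out of two
p_six = probability(1, 6)     # hypothetical extra example: a six on a fair die

print(p_heads, float(p_heads))
print(p_six, round(float(p_six), 3))
```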

Statistical probability
– Probability that what you observed could have occurred by chance
– Wish that to be a very small number
– By convention: p < 0.05 is considered very unlikely to have occurred by chance
– Means that in studies like this, an observation this extreme or more extreme would occur by chance alone only in 5 of 100 studies

Remember: one study is only a sample
– Likely to have occurred by chance: unlikely to be because of anything that was done in the study
– Unlikely to have occurred by chance: the assumption is that it occurred because of something done in the study

When you start a study, there are risks
– Probability that you are one of the yellow studies
– You conclude that there was an effect when there was not: Type I or alpha error
– By convention, we set this risk at 5 chances out of 100 or p = 0.05
– Any finding that has a p value associated with it of < 0.05 is considered statistically significant (unlikely to have occurred by chance alone)
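
The 5-in-100 risk can be checked with a small simulation. The sketch below runs many hypothetical studies in which the two groups truly do not differ, then counts how often p < 0.05 anyway. The sample size, number of studies, seed and the use of a z-test with a known SD are all assumptions made to keep the example short; they are not from the slides.

```python
import random
from statistics import NormalDist, mean

def two_sided_p(sample_a, sample_b, sd):
    """Two-sided p-value for a difference in means, assuming a known common SD (z-test)."""
    n_a, n_b = len(sample_a), len(sample_b)
    se = sd * (1 / n_a + 1 / n_b) ** 0.5
    z = (mean(sample_a) - mean(sample_b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(n_studies=2000, n_per_group=30, alpha=0.05, seed=1):
    """Share of no-effect studies that still come out 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Both groups are drawn from the same population: there is no real effect.
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sided_p(a, b, sd=1.0) < alpha:
            hits += 1
    return hits / n_studies

rate = false_positive_rate()
print(f"Share of no-effect studies with p < 0.05: {rate:.3f}")
```

The printed share should sit close to 0.05. Each of those studies would report an effect that is not there, which is the Type I error described on this slide.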

Correlation: > 0.8 strong; 0.5 to 0.8 moderate; < 0.5 weak

Correlation
What proportion of outcome is explained by the exposure? ANSWER: r²
– r = 0.5 (moderate): r² = 0.25 (not much)
– r = 0.9 (strong): r² = 0.81 (still a lot)
– r = 0.3 (weak): r² = 0.09 (almost nothing)
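
The arithmetic on this slide is just squaring the correlation. A short sketch, with the function name `variance_explained` chosen for this example:

```python
def variance_explained(r):
    """Proportion of outcome variability explained by the exposure (r squared)."""
    return r ** 2

for r in (0.3, 0.5, 0.9):
    print(f"r = {r:.1f} -> r^2 = {variance_explained(r):.2f}")
```

Because squaring shrinks any value between 0 and 1, even a "moderate" correlation of 0.5 leaves three quarters of the variability unexplained.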

Measuring Effects
– Post-only: Groups similar at baseline, so the effect of the intervention (I) will be observed at t = post. Assumes pre value unimportant; event data (e.g. falls)
– Change pre to post: Assumes pre value unimportant; reduces variability as a change value can occur in different ways; analyses based on explaining variability
– Change pre to follow-up: Often addresses maintenance of effects
– Growth: Longitudinal change; good for interventions over long term or with multiple measurements (4 or more ideal); pre-value is considered
© Nancy E. Mayo (Nov 2005)

RCTs are Longitudinal Designs
– Analyses of post only or change are cross-sectional
– Time may be important
– Effect of intervention may depend on time
© Nancy E. Mayo (Nov 2005)

Estimating Effects
– Time: pre / post. Time effect = impact of time averaged over group
– Group: Intervention / Control. At baseline, groups are equal. Group effect = effect of group averaged over time; as baseline is equal, group effect can only be due to post-score
– Group * Time: does the effect of group depend on time?
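
A worked example may help separate the three effects. The sketch below uses invented cell means for a two-group, pre/post design (equal at baseline, as the slide assumes) and computes each effect as a simple contrast of means. It illustrates the definitions only; it is not a substitute for a fitted model.

```python
from statistics import mean

# Hypothetical mean outcome scores for each group at each time point
cell_means = {
    ("intervention", "pre"): 20.0,
    ("intervention", "post"): 28.0,
    ("control", "pre"): 20.0,
    ("control", "post"): 22.0,
}

def group_mean(group):
    """Mean for one group, averaged over time."""
    return mean(v for (g, _), v in cell_means.items() if g == group)

def time_mean(time):
    """Mean at one time point, averaged over group."""
    return mean(v for (_, t), v in cell_means.items() if t == time)

# Group effect: difference between groups, averaged over time
group_effect = group_mean("intervention") - group_mean("control")

# Time effect: difference between time points, averaged over group
time_effect = time_mean("post") - time_mean("pre")

# Group * Time: does the pre-to-post change differ between groups?
change_intervention = cell_means[("intervention", "post")] - cell_means[("intervention", "pre")]
change_control = cell_means[("control", "post")] - cell_means[("control", "pre")]
interaction = change_intervention - change_control

print(group_effect, time_effect, interaction)
```

With equal baselines, the whole group effect comes from the post scores, and the interaction is the difference between the two groups' changes.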

© Nancy E. Mayo (Nov 2005)
Main Effect of Group
[Figure: Effect plotted against Time; bracket marks the group effect (averaged over time)]

© Nancy E. Mayo (Nov 2005)
Main Effect of Time
[Figure: Effect plotted against Time; time effect (averaged over group)]

© Nancy E. Mayo (Nov 2005)
Group*Time Effect
[Figure: Effect plotted against Time] The effect of group depended on the time: same at baseline but increasingly different over time

95% CI
– Mean ± 1.96 × SE
– SE = SD / sqrt(N), where N is the number of subjects
– 1.96 is the number of standard errors either side of the mean that encloses 95% of the area under a standard normal (mean of 0 and SD of 1) curve; the remaining 5% lies outside this range
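
The formula can be checked with a few lines of code. The change scores below are invented for illustration, and `ci95` is a helper written for this sketch. It uses the normal multiplier from the slide; with a sample this small, a t-distribution multiplier would usually be preferred.

```python
from statistics import NormalDist, mean, stdev

# Multiplier that leaves 2.5% in each tail of a standard normal curve
Z_95 = NormalDist().inv_cdf(0.975)

def ci95(values):
    """Return (lower, upper) bounds of Mean +/- Z_95 * SE."""
    n = len(values)
    m = mean(values)
    se = stdev(values) / n ** 0.5   # SE = SD / sqrt(N)
    return m - Z_95 * se, m + Z_95 * se

# Hypothetical change scores for 10 subjects
changes = [4, 7, 5, 8, 6, 9, 3, 6, 7, 5]
lower, upper = ci95(changes)
print(f"multiplier = {Z_95:.2f}, mean = {mean(changes)}, 95% CI = ({lower:.2f}, {upper:.2f})")
```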

Interpretation of 95% CI
– With 100 studies like this one, the 95% confidence bounds will contain the true mean change in PPT 95 times out of 100
– Likely that a gain will be between 4 and 8 units of change

Linking Data to Statistics
[Figure: Exposure 1, Exposure 2 and Exposure 3 linked to Outcome]