
1 Going from data to analysis Dr. Nancy Mayo

2 Getting it right. Research is about getting the right answer, not just an answer. An answer is easy. The right answer is hard to find.

3 © Nancy E. Mayo. Types of Questions. About hypotheses: Is treatment A better than treatment B? Answer: Yes or No. About parameters: What is the extent to which treatment A improves outcome in comparison to treatment B? Answer: A number / value (parameter).

4 Research is about relationships. Links one variable or factor to another. One is thought or supposed (hypothesized) to be the “cause” of the second variable.

5 What’s in a name?
Discipline | Cause | Effect
Epidemiology | Exposure | Outcome
Medical/clinical | Risk factor | Disease
Psychology | Independent | Dependent
Statistical | Stimulus | Response
Mathematical | X | y

6 Why do I need statistics? Reduce data. Define relationships. Make inferences from your sample to the population.

7
61103120112311111211111112121111222
62102231222221222221211122233333333
63203229112221122111111111111121111
64103241111111133111111111121122233
65203220111331332312211112111121212
66214141122321321221221211221122232
67103241111111111111111111122911123
68103220111211111111111111111111111
69203220121321324421113412342244213
70102241122211232111121111222222333
71202431111111133311111111111111111
72103141111311122211111111133332232
73113120111321111111111111111113312
74203441133421422212233313441244443
75104341111211112211121211311113223
76202441111111111211111111131114224
77202141112421311213411211131111113
78103220111111122111112111221111222
79112240221221211211111112221111121
80113241111411244121111111211111234
81112120211111111111111111133323334
82101120111111111111191111111111111
83102320211221122111111212132333942

8 [Figure: scatter plots of Y (outcome, dependent variable) against X (exposure, independent variable); panels: Linear, None]

9 [Figure: scatter plots of Y (outcome, dependent variable) against X (exposure, independent variable); panels: Linear, None]

10 [Figure: scatter plots of Y (outcome, dependent variable) against X (exposure, independent variable); panels: Linear, None] Only linear relationships can be examined by correlation.
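The point that correlation only captures linear relationships can be checked with a small sketch. The data and the helper `pearson_r` below are made up for illustration and are not from the slides: one outcome depends on X in a straight line, the other depends on X perfectly but in a U shape.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical exposure values, symmetric around zero.
x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]  # straight-line relationship
y_curved = [v ** 2 for v in x]     # U-shaped relationship, fully determined by x

r_linear = pearson_r(x, y_linear)  # close to 1
r_curved = pearson_r(x, y_curved)  # close to 0 despite perfect dependence

print(f"linear: r = {r_linear:.2f}, curved: r = {r_curved:.2f}")
```

The curved outcome is completely determined by X, yet its correlation is essentially zero, which is why a scatter plot should be looked at before a correlation is reported.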

11 Inference from Sample to Population [Diagram labels: Population (Target, Available); Sample; Need stats] © Nancy E. Mayo 2004

12 What kind of statistics do I need?

13 Depends on your DATA: Measured / Counted

14 Only 2 kinds of data
Measured = Continuous
–can take on any value, the precision of which depends upon the calibration of your measurement device
–Distribution is expected to be normal
Counted = Categorical (values are fixed)
–Binary (dichotomous)
Polychotomous
–Ordinal
  ranked (need for assistance)
  interval (categories are equally spaced: falls)
  ratio (there is a natural 0)
–Nominal: named values, no order (diagnosis)

15 Your Job When reading an article (later doing your own research) IDENTIFY THESE VARIABLES IDENTIFY WHAT SCALE THEY ARE MEASURED ON MATCH DATA TO ANALYSIS

16 Quantitative Research The answer to the question is found in the tables

17 What tables should I find in an article?
Table 1 – basic characteristics of the sample
Table 2 – outcomes / exposures
Table 3 – answer to the main question
–Relationship between exposure and outcome
Table 4 – interesting subgroup

18 What tables should I find in an article?
Table 1 – characteristics of the sample on features relating to target and available population
Table 2 – distribution of the sample on exposure and outcome variables
Table 3 – relationship between the exposure and outcome
Table 4 – interesting sub-groups

19 What kind of statistics should I find in these Tables?

20 What kind of statistics are there? Depends on your DATA Depends on your QUESTION

21 Data
Uses | Continuous | Categorical
Reduce data (descriptive) | Means (SD); medians (percentiles, range) | Proportions
Define relationships | Scatter plot; Linear (Pearson correlation) | Histogram; Correlation (Spearman ranked); Relative risk
Make inferences: simple univariate (bivariate) | t-test (independent); paired t-test | Chi-square test; McNemar’s test
Make inferences: multivariate | ANOVA; multiple linear regression | Logistic regression

22 Standard Normal Distribution Showing the proportion of the population that lies within 1, 2 and 3 SD (Wikipedia)

23 Questions
 | HYPOTHESIS | PARAMETER
Question | Question is answered by YES or NO | Question demands a numeric response
Test or parameter | Value of the test has no meaning (t-test, F test) | Difference between two means, rate or a risk
Significance | P-value (probability of a result at least as extreme as the one you observed occurring by chance alone) | 95% confidence intervals (with studies of this nature, 95% of such intervals will contain the true mean)

24
Uses | Continuous | Categorical
Reduce data (descriptive) | Means (SD); medians (percentiles, range) | Proportions
Let’s look at Table 1
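As a rough illustration of what "reducing data" means for a Table 1, the sketch below summarizes one continuous and one categorical variable. The values and the variable names (`ages`, `sex`) are invented for the example and do not come from the presentation.

```python
import statistics
from collections import Counter

# Hypothetical sample of 8 subjects (not from the slides).
ages = [62, 70, 58, 81, 66, 74, 69, 77]          # measured: continuous
sex = ["F", "M", "F", "F", "M", "F", "M", "F"]   # counted: categorical

# Continuous data: mean (SD), median (range).
mean_age = statistics.mean(ages)
sd_age = statistics.stdev(ages)       # sample SD (n - 1 in the denominator)
median_age = statistics.median(ages)
age_range = (min(ages), max(ages))

# Categorical data: proportions.
counts = Counter(sex)
proportions = {level: n / len(sex) for level, n in counts.items()}

print(f"Age: mean {mean_age:.1f} (SD {sd_age:.1f}), "
      f"median {median_age} (range {age_range[0]} to {age_range[1]})")
for level, p in proportions.items():
    print(f"Sex {level}: {counts[level]} ({p:.0%})")
```

Eight ages collapse into a mean with its SD and a median with its range, and eight category labels collapse into two proportions, which is the kind of summary a Table 1 usually reports.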

25 Data
Uses | Continuous | Categorical
Define relationships | Scatter plot; Linear (Pearson correlation) | Histogram; Correlation (Spearman ranked); Relative risk
Go to internet: scatter plot
Go to internet: histogram

26 Probability. Degree of likelihood that something will happen. Statistical probabilities are expressed as decimals between 0 and 1 (for example 0.5, 0.25, 0.75). A probability of 0 means that something can never happen; a probability of 1 means that something will always happen. The probability of an event is calculated as follows:
–n favourable outcomes / n of all possible outcomes
The probability of getting heads in one toss is: p(heads) = 1/(1 + 1) = 1/2.
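The counting rule for probability can be written out as a short sketch. The function name `probability` is chosen here for illustration, and the rule as coded assumes that all possible outcomes are equally likely.

```python
from fractions import Fraction

def probability(n_favourable, n_possible):
    """Probability as favourable outcomes over all equally likely outcomes."""
    if n_possible <= 0 or not 0 <= n_favourable <= n_possible:
        raise ValueError("need 0 <= n_favourable <= n_possible and n_possible > 0")
    return Fraction(n_favourable, n_possible)

p_heads = probability(1, 2)  # one favourable face out of two
p_six = probability(1, 6)    # rolling a six with one fair die

print(p_heads, float(p_heads))
print(p_six, round(float(p_six), 3))
```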

27 Statistical probability. Probability that what you observed could have occurred by chance. We wish that to be a very small number. By convention: p < 0.05 is considered very unlikely to have occurred by chance. It means that, in studies like this, an observation this extreme or more extreme would occur by chance alone in fewer than 5 of 100 studies.

28 Remember: one study is only a sample. Likely to have occurred by chance: unlikely to be because of anything that was done in the study. Unlikely to have occurred by chance: the assumption is that it occurred because of something done in the study.

29 When you start a study, there are risks. Probability that you are one of the yellow studies: you conclude that there was an effect when there was not. This is the Type I or alpha error. By convention, we set this risk at 5 chances out of 100, or p = 0.05. Any finding that has a p value associated with it of <0.05 is considered statistically significant (unlikely to have occurred by chance alone).
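The "5 chances out of 100" can be seen in a small simulation. This is only a sketch under stated assumptions: both groups are drawn from the same normal population (so there is truly no effect), the SD is treated as known so that a simple z-test can be used, and the function names, sample sizes and seed are arbitrary choices made for the example.

```python
import random
from statistics import NormalDist, mean

def two_sided_p(sample_a, sample_b, sd):
    """Two-sided p-value from a z-test for a difference in means (known SD)."""
    se = sd * (1 / len(sample_a) + 1 / len(sample_b)) ** 0.5
    z = (mean(sample_a) - mean(sample_b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(n_studies=2000, n_per_group=30, alpha=0.05, seed=1):
    """Share of no-effect studies that still reach p < alpha."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_studies):
        # Both groups come from the same population: any difference is chance.
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sided_p(group_a, group_b, sd=1.0) < alpha:
            significant += 1
    return significant / n_studies

rate = false_positive_rate()
print(f"Studies with p < 0.05 when there is no effect: {rate:.1%}")
```

The printed share should land near 5%, varying a little with the seed. Those studies correspond to the Type I errors described on the slide.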

30 Correlation: >0.8 strong; 0.5 to 0.8 moderate; <0.5 weak

31 Correlation. What proportion of outcome is explained by the exposure? ANSWER: r². r = 0.5 (moderate): r² = 0.25 (not much). r = 0.9 (strong): r² = 0.81 (still a lot). r = 0.3 (weak): r² = 0.09 (almost nothing).
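The arithmetic behind these figures is just squaring r. The sketch below applies the strength cut-offs given in the slides (>0.8 strong, 0.5 to 0.8 moderate, <0.5 weak); the function names are illustrative.

```python
def strength_label(r):
    """Label a correlation using the cut-offs from the slides."""
    size = abs(r)
    if size > 0.8:
        return "strong"
    if size >= 0.5:
        return "moderate"
    return "weak"

def variance_explained(r):
    """Proportion of variability in the outcome explained by the exposure."""
    return r ** 2

for r in (0.5, 0.9, 0.3):
    print(f"r = {r}: {strength_label(r)}, r^2 = {variance_explained(r):.2f}")
```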

32 Measuring Effects
Post-only: Groups similar at baseline so effect of intervention (I) will be observed at t=post. Assumes pre value unimportant; event data (e.g. falls).
Change pre to post: Assumes pre value unimportant; reduces variability as a change value can occur in different ways; analyses based on explaining variability.
Change pre to follow up: Often addresses maintenance of effects.
Growth: Longitudinal change; good for interventions over long term or with multiple measurements (4 or more ideal); pre-value is considered.
© Nancy E. Mayo (Nov 2005)

33 RCTs are Longitudinal Designs. Analyses of post only or change are cross-sectional. Time may be important. Effect of intervention may depend on time. © Nancy E. Mayo (Nov 2005)

34 Estimating Effects
Time: pre / post. Time effect = impact of time averaged over group.
Group: Intervention / Control. At baseline, groups are equal. Group effect = effect of group averaged over time; as baseline is equal, group effect can only be due to post-score.
Group * Time: does the effect of group depend on time?
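The three effects can be read off a 2 x 2 table of cell means. The sketch below uses made-up means for a pre/post trial with equal baselines. The numbers and function names are assumptions for illustration, and a real analysis would also need the variability around these means.

```python
# Hypothetical cell means (group, time) -> mean outcome; baselines are equal.
cell_means = {
    ("control", "pre"): 20.0,
    ("control", "post"): 22.0,
    ("intervention", "pre"): 20.0,
    ("intervention", "post"): 28.0,
}

def time_effect(m):
    """Post minus pre, averaged over the two groups."""
    post = (m[("control", "post")] + m[("intervention", "post")]) / 2
    pre = (m[("control", "pre")] + m[("intervention", "pre")]) / 2
    return post - pre

def group_effect(m):
    """Intervention minus control, averaged over the two time points."""
    intervention = (m[("intervention", "pre")] + m[("intervention", "post")]) / 2
    control = (m[("control", "pre")] + m[("control", "post")]) / 2
    return intervention - control

def group_by_time(m):
    """Difference between the groups in their pre-to-post change."""
    change_intervention = m[("intervention", "post")] - m[("intervention", "pre")]
    change_control = m[("control", "post")] - m[("control", "pre")]
    return change_intervention - change_control

print("Time effect:", time_effect(cell_means))        # 5.0
print("Group effect:", group_effect(cell_means))      # 3.0
print("Group * Time effect:", group_by_time(cell_means))  # 6.0
```

Because the baselines are equal, the whole group effect comes from the post scores. Averaging over time halves it (3 instead of 6), which is why the Group * Time term is the more direct answer to whether the intervention worked.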

35 Main Effect of Group [Figure: axes Time and Effect; bracket marks the group effect (averaged over time)] © Nancy E. Mayo (Nov 2005)

36 Main Effect of Time [Figure: axes Time and Effect; marks the time effect (averaged over group)] © Nancy E. Mayo (Nov 2005)

37 Group*Time Effect [Figure: axes Time and Effect] The effect of group depended on the time: same at baseline but increasingly different over time. © Nancy E. Mayo (Nov 2005)

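38 95% CI: Mean ± 1.96 × SE, where SE = SD / sqrt(N) and N is the number of subjects. 1.96 is the number of SEs on either side of the mean that encloses the central 95% of the area under a standard normal (mean of 0 and SD of 1) distribution; 2.5% of the area lies beyond it in each tail.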
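The formula can be worked through on a small example. The change scores below are hypothetical, and `ci95` is a name chosen for this sketch. With a sample this small a t-based multiplier would normally replace 1.96; the normal value is used here only to mirror the slide.

```python
import math
import statistics
from statistics import NormalDist

# Where 1.96 comes from: the point that leaves 2.5% in the upper tail.
z = NormalDist().inv_cdf(0.975)

def ci95(values):
    """95% confidence interval for the mean: mean +/- 1.96 * SD / sqrt(N)."""
    n = len(values)
    m = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return m - 1.96 * se, m + 1.96 * se

# Hypothetical change scores for 10 subjects (not from the slides).
changes = [4, 7, 6, 5, 8, 6, 7, 5, 6, 6]
low, high = ci95(changes)

print(f"z for 95%: {z:.2f}")
print(f"mean change {statistics.mean(changes):.1f}, 95% CI {low:.2f} to {high:.2f}")
```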

39 Interpretation of 95% CI. With 100 studies like this one, about 95 of the 95% confidence intervals will contain the true mean change in PPT. Likely that a gain will be between 4 and 8 units of change.

40 Linking Data to Statistics [Diagram labels: Exposure 1, Exposure 2, Exposure 3, Outcome]

