Published by Kenneth Holmes. Modified over 9 years ago.
1
1 Foundations of Research Statistics: The Z score and the normal distribution. Cranach, Tree of Knowledge [of Good and Evil] (1472). Click “slide show” to start this presentation as a show. Remember: focus & think about each point; do not just passively click. © Dr. David J. McKirnan, 2014, The University of Illinois Chicago, McKirnanUIC@gmail.com. Do not use or reproduce without permission.
2
2 Foundations of Research The statistics module series
1. Introduction to statistics & number scales
2. The Z score and the normal distribution (you are here)
3. The logic of research; Plato's Allegory of the Cave
4. Testing hypotheses: The critical ratio
5. Calculating a t score
6. Testing t: The Central Limit Theorem
7. Correlations: Measures of association
3
3 Foundations of Research Evaluating data. Here we will see how to use Z scores to evaluate data, and will introduce the concept of the critical ratio. Outline: Using Z scores to evaluate data; Testing hypotheses: the critical ratio.
4
4 Foundations of Research Using Z to evaluate data Z is at the core of how we use statistics to evaluate data. Z indicates how far a score is from the M relative to the other scores in the sample. Z combines… A score The M of all scores in the sample The variance in scores above and below M.
5
5 Foundations of Research Using Z to evaluate data Z is at the core of how we use statistics to evaluate data. Z indicates how far a score is from the M relative to the other scores in the sample. So… If X (an observed score) = 5.2 And M (The Mean score) = 4 X - M = 1.2 If S (Standard deviation of all scores in the sample) = 1.15
6
6 Foundations of Research Using Z to evaluate data Z is at the core of how we use statistics to evaluate data. Z indicates how far a score is from the M relative to the other scores in the sample. So… If X = 5.2 And M = 4 X - M = 1.2 If S = 1.15
$Z = \frac{X - M}{S} = \frac{5.2 - 4}{1.15} = \frac{1.2}{1.15} \approx 1.04$
Z for our score is about +1.
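The calculation on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `z_score` is ours, not part of the presentation, and the inputs are the slide's X, M and S.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


# Values from the slide: X = 5.2, M = 4, S = 1.15
z = z_score(5.2, 4, 1.15)
print(round(z, 2))  # 1.04, i.e. roughly one standard deviation above the mean
```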
7
7 Foundations of Research Using Z to evaluate data Z is at the core of how we use statistics to evaluate data. Z indicates how far a score is from the M relative to the other scores in the sample. So… If X = 5.2 And M = 4 If S = 1.15 X - M = 1.2 Z for our score is about +1. This tells us that our score is higher than ~ 84% of the other scores in the distribution.
8
8 Foundations of Research Using Z to evaluate data Z is at the core of how we use statistics to evaluate data. Z indicates how far a score is from the M relative to the other scores in the sample. This tells us that our score is higher than ~ 84% of the other scores in the distribution. Unlike simple measurement with a ratio scale, where a value (e.g., < 32°) has an absolute meaning, inferential statistics evaluates a score relative to a distribution of scores.
9
9 Foundations of Research Z scores: areas under the normal curve. [Figure: normal curve over Z scores (standard deviation units) from -3 to +3; the bands on each side of the M are labelled 34.13%, 13.59% and 2.25% of cases.] 50% of the scores in a distribution are above the M [Z = 0]: 34.13% of the distribution + 13.59% + 2.25%... etc. 50% of scores are below the M.
10
10 Foundations of Research Z scores: areas under the normal curve, 2. [Figure: same normal curve, marked at Z = +1.] 84% of scores are below Z = 1 (one standard deviation above the Mean): 34.13% + 34.13% + 13.59% + 2.25%...
11
11 Foundations of Research Z scores: areas under the normal curve, 2. [Figure: same normal curve, marked at Z = -1.] 84% of scores are above Z = -1 (one standard deviation below the Mean).
12
12 Foundations of Research Z scores: areas under the normal curve, 2. [Figure: same normal curve, marked at Z = +2.] 98% of scores are less than Z = 2 (two standard deviations above the mean): 13.59% + 34.13% + 34.13% + 13.59% + 2.25%…
13
13 Foundations of Research Z scores: areas under the normal curve, 2. [Figure: same normal curve, marked at Z = -2.] 98% of scores are above Z = -2.
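The areas quoted on these slides can be checked with the cumulative distribution function of the standard normal curve. The sketch below uses only Python's standard library; the helper name `normal_cdf` is our own choice, not something from the presentation.

```python
import math


def normal_cdf(z):
    """Proportion of a standard normal distribution that falls below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Share of scores below and above each whole-number Z score
for z in (-2, -1, 0, 1, 2):
    below = normal_cdf(z)
    print(f"Z = {z:+d}: {below:.1%} below, {1 - below:.1%} above")
```

Running it shows roughly 84% of scores below Z = +1 and roughly 98% below Z = +2, matching the rounded figures on the slides.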
14
14 Foundations of Research Evaluating Individual Scores How good is a score of ‘6' in the group described in… Table 1? Table 2? Evaluate in terms of: A. The distance of the score from the M. B. The variance in the rest of the sample C. Your criterion for a “significantly good” score
15
15 Foundations of Research Using Z to compare scores
1. Calculate how far the score (X) is from the mean (M): X – M.
2. “Adjust” X – M by how much variance there is in the sample via the standard deviation (S).
3. Calculate Z for each sample.
Table 1; high variance: Mean (M) = 4, Score (X) = 6, Standard Deviation (S) = 2.4.
$Z = \frac{X - M}{S} = \frac{6 - 4}{2.4} = \frac{2}{2.4} \approx 0.83$
Table 2; low variance: Mean (M) = 4, Score (X) = 6, Standard Deviation (S) = 1.15.
$Z = \frac{X - M}{S} = \frac{6 - 4}{1.15} = \frac{2}{1.15} \approx 1.74$
16
16 Foundations of Research Using the normal distribution, 2
A. The distance of the score from the M: The participant is 2 units above the mean in both tables.
B. The variance in the rest of the sample: Since Table 1 has more variance, a given score is not as good relative to the rest of the scores.
Table 1, high variance: X - M = 6 - 4 = 2; Standard Deviation (S) = 2.4; Z = (X – M) / S = 2 / 2.4 ≈ 0.83. About 80% of participants are below this Z score.
Table 2, low(er) variance: X - M = 6 - 4 = 2; Standard Deviation (S) = 1.15; Z = (X – M) / S = 2 / 1.15 ≈ 1.74. About 96% of participants are below this Z score.
17
17 Foundations of Research Comparing Scores: deviation x Variance High variance (S = 2.4) Less variance (S = 1.15) ‘6’ is not that high compared to rest of the distribution Here ‘6’ is the highest score in the distribution
18
18 Foundations of Research Normal distribution; high variance. [Figure: normal curve over Z scores (standard deviation units), -3 to +3, with the score marked at Z ≈ 0.83.] Table 1, high variance: X - M = 6 - 4 = 2; S = 2.4; Z = (X – M) / S = 2 / 2.4 ≈ 0.83. About 80% of participants are below this Z score.
19
19 Foundations of Research Normal distribution; low variance. [Figure: normal curve over Z scores (standard deviation units), -3 to +3, with the score marked at Z = 1.74.] Table 2, low(er) variance: X - M = 6 - 4 = 2; S = 1.15; Z = (X – M) / S = 2 / 1.15 ≈ 1.74. About 96% of participants are below this Z score.
20
20 Foundations of Research Evaluating scores using Z. [Figure: two normal curves over Z scores (standard deviation units), -3 to +3.] High variance: X = 6, M = 4, S = 2.4, Z ≈ 0.83; about 80% of cases fall below. Low variance: X = 6, M = 4, S = 1.15, Z ≈ 1.74; about 96% of cases fall below. C. Criterion for a “significantly good” score: If a “good” score is better than 90% of the sample, then with high variance ‘6’ is not so good, while with less variance ‘6’ is higher than 90% of the rest of the sample.
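The three-step comparison (distance from the mean, adjustment for variance, criterion) can be sketched as follows. This is an illustration under assumptions: the helper names and the 90% cutoff variable are ours, and the percentages come from a standard normal curve rather than from the original tables, which are not reproduced in this transcript.

```python
import math


def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd


def percent_below(z):
    """Percentage of a standard normal distribution falling below z."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))


criterion = 90  # a "good" score must beat this % of the sample (from the slide)

# Same score and mean in both samples; only the standard deviation differs
for label, sd in (("high variance", 2.4), ("lower variance", 1.15)):
    z = z_score(6, 4, sd)
    pct = percent_below(z)
    verdict = "meets" if pct > criterion else "does not meet"
    print(f"{label}: S = {sd}, Z = {z:.2f}, about {pct:.0f}% below; {verdict} the criterion")
```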
21
21 Foundations of Research Summary: evaluating individual scores. How “good” is a score of ‘6’ in two groups? A. The distance of the score from the M. In both groups ‘6’ is two units > the M (X = 6, M = 4). B. The variance in the rest of the sample. One group has low variance and one has higher. C. Criterion for a “significantly good” score. What % of the sample must the score be higher than… With low variance ‘6’ is higher relative to other scores than in a sample with higher variance.
22
22 Foundations of Research Z / “standard” scores Z scores (or standard deviation units) standardize scores by putting them on a common scale. In our example the target score and M scores are the same, but come from samples with different variances. We compare the target scores by translating them into Zs, which take into account variance. Any scores can be translated into Z scores for comparison… Using Z to standardize scores
23
23 Foundations of Research Using Z to standardize scores, cont. Which is “faster”: a 2:03:00 marathon, or a 4 minute mile? We cannot directly compare these scores because they are on different scales. One is measured in hours & minutes, one in 10ths of a second. We can use Z scores to change each scale to a common metric, i.e., the % of the larger distribution each score is above or below. Z scores can be compared, since they are standardized by being relative to the larger population of scores.
24
24 Foundations of Research Comparing Zs. [Figure: distribution of mile times translated into Z scores; the axis runs from 4:30 to 3:45 across Z = -3 to +3.] Location of the 4 minute mile on this distribution: Z = 1. [Figure: distribution of world class marathon times (raw scores) as Z scores; the axis runs from 2:50 to 2:10 across Z = -4 to +4.] Location of the 2:03 marathon on this distribution: Z > 4. A 2:03 marathon is “faster” than a 4 minute mile.
25
25 Foundations of Research Quiz 1 About what percentage of scores are below the line? A.45% B.66% C.84% D.16% E.50%
26
26 Foundations of Research Quiz 1 About what percentage of scores are below the line? A.45% B.66% C.84% D.16% E.50% The proportion of scores below the line is known as the “area under the curve”. The area under the curve below Z = 1 is 50% (below the M [Z = 0]) + ~34% (from the mean to one standard deviation above it; Z = 0 to Z = 1).
27
27 Foundations of Research Quiz 1 About what is the likelihood of this score occurring by chance? A.45% B.66% C.84% D.16% E.50%
28
28 Foundations of Research Quiz 1 About what is the likelihood of this score occurring by chance? A.45% B.66% C.84% D.16% E.50% The “area under the curve” above Z = 1 is ~14% (Z = 1 to Z = 2) + ~2% (Z = 2 to Z = 3). The logic is that about 16% of scores will be higher than this score by chance alone.
29
29 Foundations of Research Quiz 1 You got a score of 20 on your last exam. The M = 14, the maximum score = 25. Did you do well? A.Of course; you are only 5 points from a perfect score. B.No, your Tiger Mom will only accept 25/25. C.Without the variance you cannot estimate how you did relative to your peers. D.Midway between the average and the max. is at least a ‘C’, so I did OK.
30
30 Foundations of Research Quiz 1 You got a score of 20 on your last exam. The M = 14, the maximum score = 25. Did you do well? A.Of course; you are only 5 points from a perfect score. B.No, your Tiger Mom will only accept 25/25. C.Without the variance you cannot estimate how you did relative to your peers. D.Midway between the average and the max. is at least a ‘C’, so I did OK. If the exam is graded in absolute terms (if, say, the instructor sets “A” at 80% or better) you are in.
31
31 Foundations of Research Quiz 1 You got a score of 20 on your last exam. The M = 14, the maximum score = 25. Did you do well? A.Of course; you are only 5 points from a perfect score. B.No, your Tiger Mom will only accept 25/25. C.Without the variance you cannot estimate how you did relative to your peers. D.Midway between the average and the max. is at least a ‘C’, so I did OK. Tough luck.
32
32 Foundations of Research Quiz 1 You got a score of 20 on your last exam. The M = 14, the maximum score = 25. Did you do well? A.Of course; you are only 5 points from a perfect score. B.No, your Tiger Mom will only accept 25/25. C.Without the variance you cannot estimate how you did relative to your peers. D.Midway between the average and the max. is at least a ‘C’, so I did OK. If your instructor is grading the way a statistician would, evaluating scores relative to the distribution (grading on the curve), you do not know. You would need your score, the M score, and the Standard Deviation, i.e.: Z = (20 - 14) / S
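To see why the answer depends on the variance, the sketch below computes Z for the quiz's score of 20 and M of 14 under several standard deviations. The S values are hypothetical, chosen only for illustration; the quiz does not give one.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd


score, mean = 20, 14  # from the quiz

# Hypothetical standard deviations: the quiz deliberately leaves S unknown
for sd in (2, 4, 8):
    print(f"S = {sd}: Z = {z_score(score, mean, sd):.2f}")
```

The same 6-point lead over the mean is an exceptional result when scores are tightly bunched (small S) and an unremarkable one when they are widely spread (large S).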
33
33 Foundations of Research Quiz 1 You got a score of 20 on your last exam. The M = 14, the maximum score = 25. Did you do well? A.Of course; you are only 5 points from a perfect score. B.No, your Tiger Mom will only accept 25/25. C.Without the variance you cannot estimate how you did relative to your peers. D.Midway between the average and the max. is at least a ‘C’, so I did OK. Evaluating statistical outcomes always involves our setting a criterion for a “significantly good” score. By convention we consider a research result as “significant” if it would have occurred less than 5% of the time by chance. However, some have more lax criteria…
34
34 Foundations of Research The critical ratio. Outline: Using Z scores to evaluate data; Testing hypotheses: the critical ratio. [Image: illustration of the nebular hypothesis, one of the nebular vs. catastrophic hypotheses about the origin of the solar system. (David Darling, Encyclopedia of Science.)]
35
35 Foundations of Research Using statistics to test hypotheses: Core concept: No scientific finding is “absolutely” true. Any effect is probabilistic: We use empirical data to infer how the world works. We evaluate inferences by how likely the effect would be to occur by chance. We use the normal distribution to help us determine how likely an experimental outcome would be by chance alone.
36
36 Foundations of Research Probabilities & Statistical Hypothesis Testing Scientific observations are “innocent until proven guilty”. If we compare two groups or test how far a score is from the mean, the odds of their being different by chance alone are always greater than 0. We cannot just take any result and call it meaningful, since any result may be due to chance, not the Independent Variable. So, we assume any result is by chance unless it is strong enough to be unlikely to occur randomly. Null Hypothesis: All scores differ from the M by chance alone.
37
37 Foundations of Research Probabilities & Statistical Hypothesis Testing Using the Normal Distribution: More extreme scores have a lower probability of occurring by chance alone Z = the % of cases above or below the observed score A high Z score may be “extreme” enough for us to reject the null hypothesis Null Hypothesis: All scores differ from the M by chance alone. Alternate (experimental) hypothesis: This score differs from M by more than we would expect by chance…
38
38 Foundations of Research Statistical Significance. “Statistical significance”: We assume a score with less than 5% probability of occurring by chance (i.e., more extreme than 95% of the other scores; p < .05) is not by chance alone. Z > +1.98 or < -1.98 occurs < 5% of the time (p < .05). If Z > +1.98 or < -1.98 we consider the score to be “significantly” different from the mean. To test if an effect is “statistically significant”: Compute a Z score for the effect. Compare it to the critical value for p < .05: ±1.98.
39
39 Foundations of Research Statistical significance & areas under the normal curve. [Figure: normal curve over Z scores (standard deviation units), -3 to +3, with cutoffs marked at Z = -1.98 and Z = +1.98; 95% of cases lie between the cutoffs.] In a hypothetical distribution: 2.4% of cases are higher than Z = +1.98 and 2.4% of cases are lower than Z = -1.98. Thus, Z > +1.98 or < -1.98 will occur < 5% of the time by chance alone. With Z > +1.98 or < -1.98 we reject the null hypothesis & assume the results are not by chance alone.
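The decision rule can be sketched in code. The helper names below are ours, and the default cutoff of 1.98 is the value used in these slides; on a standard normal curve it leaves about 2.4% in each tail, while the exact two-tailed cutoff for p = .05 is about 1.96.

```python
import math


def normal_cdf(z):
    """Proportion of a standard normal distribution that falls below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def two_tailed_p(z):
    """Chance of a Z at least this far from 0, in either direction."""
    return 2 * (1 - normal_cdf(abs(z)))


def is_significant(z, critical=1.98):
    """Apply the slides' rule: reject the null hypothesis if |Z| exceeds the cutoff."""
    return abs(z) > critical


for z in (0.83, 1.74, -2.5):
    print(f"Z = {z:+.2f}: p = {two_tailed_p(z):.3f}, significant: {is_significant(z)}")
```

Note that the Z of 1.74 from the low-variance example beats 90% of scores yet still falls short of this stricter two-tailed criterion.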
40
40 Foundations of Research Evaluating Research Questions
Data: One participant’s score. Statistical question: Does this score differ from the M for the group by more than chance?
Data: The mean for a group. Statistical question: Does this M differ from the M for the general population by more than chance?
Data: Means for 2 or more groups. Statistical question: Is the difference between these Means more than we would expect by chance, i.e., more than the M difference between any 2 randomly selected groups?
Data: Scores on two measured variables. Statistical question: Is the correlation (‘r’) between these variables more than we would expect by chance, i.e., more than between any two randomly selected variables?
41
41 Foundations of Research Critical ratio. To estimate the influence of chance we weight our results by the overall amount of variance in the data. In “noisy” data (a lot of error variance) we need a very strong result to conclude that it was unlikely to have occurred by chance alone. In very “clean” data (low variance) even a weak result may be statistically significant. This is the Critical Ratio:
$\text{Critical ratio} = \frac{\text{The strength of the results (experimental effect)}}{\text{Amount of error variance (“noise” in the data)}}$
42
42 Foundations of Research Critical ratio. Z is a basic critical ratio:
$Z = \frac{\text{Distance of the score from the mean (strength of the experimental result)}}{\text{Standard deviation (error variance or “noise” in the data)}}$
In our example the two samples had equally strong scores (X - M), but differed in the amount of variance in the distribution of scores. Weighting the effect (X - M) in each sample by its variance [S] yielded different Z scores: 0.83 v. 1.74. This led us to different judgments of how likely each result would be to have occurred by chance.
43
43 Foundations of Research Applying the critical ratio to an experiment.
$\text{Critical ratio} = \frac{\text{Treatment difference}}{\text{Random variance (chance)}}$
In an experiment the Treatment Difference is the variance between the experimental and control groups. The Random Variance is the chance differences among participants within each group. We evaluate that result by comparing it to a distribution of possible effects. We estimate the distribution of possible effects based on the degrees of freedom (“df”). We will get to these last 2 points in the next modules.
44
44 Foundations of Research Examples of Critical Ratios
$Z\ \text{score} = \frac{\text{Individual score} - M\ \text{for group}}{\text{Standard deviation } (S) \text{ for group}}$
$t\text{-test} = \frac{\text{Difference between group } M\text{s}}{\text{Standard error of the mean}}$
$F\ \text{ratio} = \frac{\text{Between-group differences (differences among 3 or more group } M\text{s)}}{\text{Within-group differences (random variance among participants within groups)}}$
$r\ \text{(correlation)} = \frac{\text{Association between variables (joint } Z \text{ scores, } Z_{\text{variable 1}} \times Z_{\text{variable 2}}\text{, summed across participants)}}{\text{Random variance between participants within variables}}$
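The correlation entry can be made concrete: r is the sum of each participant's paired Z scores, divided by N - 1 when the standard deviations are sample standard deviations. The sketch below assumes that convention; the function name and the small data set (hours studied and exam scores) are hypothetical.

```python
import statistics


def correlation_from_z(xs, ys):
    """Pearson r computed as the summed product of paired Z scores over N - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)  # sample SDs (N - 1)
    products = [
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(xs, ys)
    ]
    return sum(products) / (len(xs) - 1)


# Hypothetical data: hours studied and exam score for five participants
hours = [1, 2, 3, 4, 5]
scores = [12, 15, 14, 19, 20]
print(round(correlation_from_z(hours, scores), 2))
```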
45
45 Foundations of Research Quiz 2 Where would z or t have to fall for you to consider your results “statistically significant”? (Choose a color). A. B. C. D. F.
46
46 Foundations of Research Quiz 2 Where would z or t have to fall for you to consider your results “statistically significant”? (Choose a color). A. B. C. D. F. Both of these are correct. A Z or t score greater than +1.98 or less than -1.98 is considered significant. This means that the result would occur < 5% of the time by chance alone (p < .05).
47
47 Foundations of Research Quiz 2 Where would z or t have to fall for you to consider your results “statistically significant”? (Choose a color). A. B. C. D. F. This value would also be statistically significant. It exceeds the critical value for the p < .05 criterion we usually use, so it meets a more conservative standard.
48
48 Foundations of Research Numbers are important for representing “reality” in science (and other fields). Different measures of central tendency are useful & accurate for different data; Mean is the most common. Median useful for skewed data Mode useful for simple categorical data Variance (around the mean) is key to characterizing a set of numbers. We understand a set of scores in terms of the: Central tendency – the average or Mean score The amount of variance in the scores, typically the Standard Deviation. Summary
49
49 Foundations of Research Summary. Statistical decisions follow the critical ratio. Z is the prototype critical ratio:
$Z = \frac{X - M}{S} = \frac{\text{Distance of the score } (X) \text{ from the mean } (M)}{\text{Variance among all the scores in the sample [standard deviation } (S)]}$
t is also a basic critical ratio, used for comparing groups:
$t = \frac{M_1 - M_2}{SE} = \frac{\text{Difference between group Means}}{\text{Variance within the two groups [standard error of the } M\ (SE)]}$
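As a preview of how the t ratio has the same shape as Z, here is a minimal sketch. It assumes one common way of computing the denominator, a pooled-variance standard error for two independent groups; the slides do not specify the formula, and the next module may derive it differently. The function name and the data are hypothetical.

```python
import math
import statistics


def t_ratio(group1, group2):
    """Difference between group means divided by a pooled standard error."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    # Assumption: pool the two within-group variances, weighted by df
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error


# Hypothetical scores for an experimental and a control group
experimental = [5, 6, 7]
control = [2, 3, 4]
print(round(t_ratio(experimental, control), 2))  # 3.67
```

As with Z, the same difference between means yields a smaller t when the scores within each group are more spread out.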
50
50 Foundations of Research The critical ratio The next module will show you how to derive a t value.