Published by Jocelyn Payne. Modified over 6 years ago.
1
Comparing Groups April 6-7, 2017 CS 160 – Section 10
2
Outline
- Summary measures
- Normal distribution
- Comparison of means
- Chi-square and its uses
- Within-subjects and between-subjects study types
3
Summary Measures
Where is the center of these data?
- Mean
- Median
- Mode
How spread out are these data?
- Variance
- Standard deviation
4
Measures of central tendency
Refresher
- Mean = average of all values
- Median = middlemost value when the values are ordered
- Mode = most frequently occurring value
5
Which one to use?
- Mean: easy to calculate, but distorted by outliers and skewed data
- Median: uses less information, but is not affected by outliers
- Mode: more useful for categorical or ordinal data
6
Variance
Average 'spread' or 'deviation' of values around the mean of the observations, measured as the average squared difference from the mean
7
Calculating Variance
- Measure the difference of each observation from the mean
- Take the square of each difference
- Add them up
- Divide by (total # of obs. - 1)
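The four steps above can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; the function name `sample_variance` and the input values are hypothetical.

```python
def sample_variance(values):
    """Sample variance, following the four steps listed on the slide."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    # Step 1: difference of each observation from the mean
    differences = [v - mean for v in values]
    # Step 2: square each difference
    squares = [d * d for d in differences]
    # Step 3: add them up
    total = sum(squares)
    # Step 4: divide by (number of observations - 1)
    return total / (n - 1)


# Hypothetical data, chosen only for illustration
print(sample_variance([2, 4, 4, 6]))
```

Dividing by n - 1 rather than n gives the sample variance, which is what the slide describes.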
8
Standard Deviation (SD)
Square root of the variance. SD is used because it has the same units as the original observations.
9
Example of SD calculation
Take the diastolic BP of any 4 people. Mean = 86.25
10
Example of SD calculation
Subject   BP     Difference from mean   Square of difference
1         80     -6.25                  39.06
2         95      8.75                  76.56
3         96      9.75                  95.06
4         74    -12.25                  150.06
Total     345                           360.75
Mean = 345 / 4 = 86.25
Variance (s²) = 360.75 / (4 - 1) = 120.25
SD = √120.25 ≈ 10.97
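As a cross-check of the worked example, the same four BP values can be run through Python's standard `statistics` module, whose `variance` and `stdev` functions divide by n - 1. This is a sketch added for illustration; the variable names are assumptions. Note that the square root of 120.25 is about 10.97.

```python
import statistics

bp = [80, 95, 96, 74]  # diastolic BP values from the slide's example

mean = statistics.mean(bp)
variance = statistics.variance(bp)  # sample variance (divides by n - 1)
sd = statistics.stdev(bp)           # square root of the sample variance

print(f"mean = {mean}, variance = {variance}, SD = {sd:.2f}")
```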
11
Using SD
- Determine the degree of dispersion of values around the mean in a given dataset
- Compare various datasets of a variable
- Smaller SD: more homogeneous data
- Larger SD: more variability in the data
- A larger SD may be masking different populations
12
Frequency Polygons
Same as a histogram, but uses a line connecting the tops of all the bars instead of the bars themselves
Note: Hypothetical data, do not quote
13
Normal Distribution
- Symmetrical
- Bell shaped (unimodal)
- Mean = μ, SD = σ
[Figure: bell curve of frequency against x, centered at μ]
14
Normal Distribution Example
Typing speed of young adults: μ = 100 wpm, SD = 10 wpm
[Figure: bell curve of frequency against x, centered at μ = 100 wpm]
15
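Using SD with Normal Distribution
- About 68% of values lie between μ-σ and μ+σ
- About 95% of values lie between μ-2σ and μ+2σ
- About 99% (more precisely 99.7%) of values lie between μ-3σ and μ+3σ
[Figure: bell curve of frequency against x, with μ-3σ through μ+3σ marked]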
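The percentages above can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8+). This is an illustrative sketch; the helper name `coverage_within` is an assumption, not something from the slides.

```python
from statistics import NormalDist


def coverage_within(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {coverage_within(k):.4f}")
```

Because the fractions depend only on the number of SDs, the same values hold for any normal distribution, including the typing-speed example with μ = 100 wpm and SD = 10 wpm.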
16
Comparing means
μ₀ = 60 wpm (hypothesized population mean), x̄ = 100 wpm (sample mean)
17
Comparing means
When the population SD is known: look up the test statistic in the Z-table to obtain the p-value
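A minimal sketch of this case is a one-sample z-test: z = (x̄ - μ₀) / (σ / √n), with the p-value taken from the standard normal distribution instead of a printed Z-table. The slides do not give a sample size or population SD for the comparison, so the numbers below are hypothetical, as is the function name.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Return (z, two-sided p-value) for a one-sample z-test."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Two-sided p-value: area in both tails beyond |z|
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical numbers: sample mean 63 wpm from n = 25 typists,
# hypothesized mean 60 wpm, known population SD 10 wpm.
z, p = one_sample_z_test(sample_mean=63, mu0=60, sigma=10, n=25)
print(f"z = {z:.2f}, p = {p:.3f}")
```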
18
Comparing means
When the population SD is unknown: look up the test statistic in the t-table to obtain the p-value
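When the population SD is unknown, the sample SD stands in for it and the statistic is compared against a t distribution with n - 1 degrees of freedom. The sketch below only computes the t statistic and degrees of freedom, leaving the p-value to a t-table as on the slide. The data and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev divides by n - 1
    t = (mean(sample) - mu0) / standard_error
    return t, n - 1


# Hypothetical typing speeds (wpm), for illustration only
speeds = [62, 58, 65, 70, 61, 66]
t, df = one_sample_t(speeds, mu0=60)
print(f"t = {t:.3f} with {df} degrees of freedom")
```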
19
Interpreting p-value
- Mathematically, p-value = area in the tails of the probability distribution curve
- It is the probability of seeing a difference in means at least this large by chance alone, if there were truly no difference
- If p-value < significance level, consider the difference significant
- Typically, the 0.05 or 0.01 level is used
20
Comparing means
μ₀ = 60 wpm, x̄ = 100 wpm, α = 0.05
[Figure: distribution centered at μ₀ with the central 95% marked; p is the tail area]
21
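Comparing means
μ₀ = 60 wpm, x̄ = 100 wpm, p = 0.07, α = 0.05
[Figure: distribution centered at μ₀ with the central 95% marked]
Since p = 0.07 is greater than α = 0.05, the difference is not significant at the 0.05 level.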
22
Comparing two means
μ1 vs. μ2
Both populations should have a normal distribution and equal variance
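Under the equal-variance assumption stated above, one common way to compare two sample means is the pooled two-sample t statistic. The slides do not show the formula, so this is a sketch of the standard textbook version; the data and names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_two_sample_t(a, b):
    """Return (t statistic, degrees of freedom), assuming equal variances."""
    n1, n2 = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(a) - mean(b)) / standard_error
    return t, n1 + n2 - 2


# Hypothetical task times for two interfaces, for illustration only
group_a = [10, 12, 14]
group_b = [13, 15, 17]
t, df = pooled_two_sample_t(group_a, group_b)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting t is then looked up in a t-table with the given degrees of freedom to obtain a p-value.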
23
Chi-square test (For Count data)
24
Voice Assistance vs. Accidents
                      Uses voice assistance
Has had accidents     Yes     No      Total
Yes                   76      277     353
No                    153     640     793
Total                 229     917     1146
25
Question
Is voice assistance associated with having had an accident?
OR
Are accidents more common in drivers using voice assistance?
26
Make Hypotheses…
Null hypothesis: The proportion of accidents among drivers using voice assistance is the same as among those who do not use it
Alternative hypothesis: The proportion of accidents differs between the two groups
27
How do we do the chi-square test?
- Make a 2 x 2 table of observed frequencies
- Calculate expected frequencies for each cell, assuming no difference between the two groups
- Calculate the difference between observed and expected values using the chi-square formula
- Convert chi-square to a p-value
- Compare the p-value with the significance level chosen for the test
28
Example: Observed frequencies
                      Uses voice assistance
Has had accidents     Yes     No      Total
Yes                   76      277     353
No                    153     640     793
Total                 229     917     1146
29
Calculating Expected Frequencies
Significance level: α = 0.05
Overall proportion of accidents = 353 / 1146 = 30.8%
Proportion of no accidents = 793 / 1146 = 69.2%
30
Calculating Expected Frequencies
Expected number of VA users (229 drivers) with accidents = 30.8% × 229 = 70.5
Expected number of VA users without accidents = 69.2% × 229 = 158.5
31
Calculating Expected Frequencies
Expected number of VA non-users (917 drivers) with accidents = 30.8% × 917 = 282.5
Expected number of VA non-users without accidents = 69.2% × 917 = 634.5
32
Expected frequencies
                      VA used
Accidents             Yes     No      Total
Yes                   70.5    282.5   353
No                    158.5   634.5   793
Total                 229     917     1146
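The expected counts can be reproduced from the observed table's row and column totals: each expected cell is (row total × column total) / grand total, which is the same as applying the overall proportions to each group. A minimal Python sketch; the function and variable names are assumptions.

```python
# Observed counts from the slides: rows = had accidents (yes, no),
# columns = uses voice assistance (yes, no)
observed = [[76, 277],
            [153, 640]]


def expected_counts(table):
    """Expected cell counts assuming no association between rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print([round(x, 1) for x in row])
```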
33
Just for comparison
Observed
                      VA users
Has had accidents     Yes     No      Total
Yes                   76      277     353
No                    153     640     793
Total                 229     917     1146
Expected
                      VA users
Accidents             Yes     No      Total
Yes                   70.5    282.5   353
No                    158.5   634.5   793
Total                 229     917     1146
34
Calculating differences
For our example, χ² = 0.630
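The formula itself did not survive in this transcript. The sketch below assumes the usual Pearson form, the sum over cells of (O - E)² / E, with an optional Yates continuity correction that subtracts 0.5 from each |O - E| before squaring. For the observed table, the corrected form gives about 0.630, matching the value quoted on the slide, while the uncorrected form gives about 0.764. The function name and the `yates` flag are hypothetical.

```python
# Observed counts from the slides: rows = had accidents (yes, no),
# columns = uses voice assistance (yes, no)
observed = [[76, 277],
            [153, 640]]


def chi_square_2x2(table, yates=False):
    """Chi-square statistic for a contingency table, optionally Yates-corrected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            diff = abs(obs - exp)
            if yates:
                # Continuity correction: shrink each |O - E| by 0.5
                diff = max(diff - 0.5, 0.0)
            chi2 += diff * diff / exp
    return chi2


chi2_corrected = chi_square_2x2(observed, yates=True)
chi2_uncorrected = chi_square_2x2(observed, yates=False)
print(f"with correction: {chi2_corrected:.3f}, without: {chi2_uncorrected:.3f}")
```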
35
Convert chi-square to p-value
- Use a chi-square distribution table to look up the p-value corresponding to χ² = 0.630
- OR use Excel to calculate the exact p-value for χ² = 0.630
- Online calculator: efault2.aspx
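As an alternative to a table or Excel, the conversion can be done with the Python standard library. A 2 x 2 table has 1 degree of freedom, and for that case the upper-tail probability of a chi-square value x equals erfc(√(x / 2)). This is a sketch for the 1-degree-of-freedom case only; the function name is an assumption.

```python
from math import erfc, sqrt


def chi_square_p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))


alpha = 0.05
p = chi_square_p_value_df1(0.630)
print(f"p = {p:.2f}")
print("significant" if p < alpha else "not significant")
```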
36
Compare p-value to significance level
- Our calculated p-value = 0.43
- Significance level (α) = 0.05
- So p-value > 0.05
- There is no significant difference in the proportion of accidents between VA users and non-users
37
Comparing multiple interfaces
Within vs Between Subject studies
38
Within Subjects design
- Each participant is tested on all conditions/interface alternatives
- For example: 4 participants first try Interface A, then the same 4 participants try Interface B
- Performance is measured in both phases
39
Between Subjects design
- Participants are randomly assigned to groups
- Each participant is tested on only one condition/interface
- For example: 2 participants try Interface A, and the other 2 participants try Interface B
- Performance is measured for both groups
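Random assignment can be sketched as shuffling the participant list and dealing participants out to the conditions in turn, which keeps group sizes as equal as possible. The participant IDs, the seed, and the function name below are hypothetical.

```python
import random


def assign_between_subjects(participants, conditions, seed=None):
    """Randomly assign each participant to exactly one condition."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    # Deal shuffled participants round-robin so groups stay balanced
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups


groups = assign_between_subjects(["P1", "P2", "P3", "P4"], ["A", "B"], seed=160)
print(groups)
```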
40
Learning effects
Drawback of the within-subjects study design
- Participants learn or gain experience because of the order in which the interfaces are presented
- E.g. trying out Interface A might give them experience that improves their performance on the following Interface B
41
Counter-balancing (2 levels)
A solution for learning effects is counter-balancing: showing the interfaces in a different order to different participants
- Group 1: A, then B
- Group 2: B, then A
Counter-balancing for two levels (two interfaces)
42
Counter-balancing (3 levels)
- Group 1: A, B, C
- Group 2:
- Group 3:
Further reading:
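The orders for Groups 2 and 3 are not recoverable from this transcript. One common scheme, shown here only as an assumption about what the slide may have illustrated, is a cyclic Latin square: each group gets the previous group's order rotated by one position, so every interface appears once in every position. The function name is hypothetical.

```python
def rotated_orders(conditions):
    """Cyclic Latin square: row k is the condition list rotated left by k."""
    n = len(conditions)
    return [[conditions[(start + offset) % n] for offset in range(n)]
            for start in range(n)]


for group, order in enumerate(rotated_orders(["A", "B", "C"]), start=1):
    print(f"Group {group}: {' '.join(order)}")
```

With two conditions the same function yields A B and B A. Full counter-balancing would instead use all possible orders (six for three interfaces).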
43
Comparison of both types
Considerations          Between     Within
Sample size needed      Large       Small
Carryover effects       No          Yes
Impact on attitudes
Comparative judgment                Better
Study duration          Shorter     Longer
Adapted from: