Testing Group Difference


1 Testing Group Difference

2 Are These Groups the Same?
Testing group differences:
- Do Canadians have different attitudes toward Dell than Americans?
- Do Fujitsu and Toshiba have different brand images among Pepperdine students?
- Does a high-income class eat more beef than a lower-income class?
- Do sales reps taking different training programs in different regions show different performance?

3 Warming Up: Testing One Mean
Average weight = 240? (two-tailed test)
- Setting up the null hypothesis (H0: μ = 240; Ha: μ ≠ 240)
- Determining the confidence level (equivalently, the significance level)
- Calculating the sample mean
- Calculating the standard error of the mean
  - when the population variance (σ²) is known
  - when the population variance (σ²) is unknown
- Calculating the z-statistic (or t-statistic; d.f. = n - 1)
- If |statistic| > critical value, then reject the null
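
A minimal sketch of these steps in Python with SciPy; the weights below are hypothetical illustration data, not the slide's sample.

    import math
    from scipy import stats

    weights = [240, 230, 220, 245, 250, 260, 240, 238, 232, 200]  # hypothetical sample
    mu0, alpha = 240, 0.05
    n = len(weights)
    xbar = sum(weights) / n                                # sample mean
    s2 = sum((x - xbar) ** 2 for x in weights) / (n - 1)   # sample variance (sigma^2 unknown)
    se = math.sqrt(s2 / n)                                 # standard error of the mean
    t_stat = (xbar - mu0) / se                             # t-statistic, d.f. = n - 1
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)          # two-tailed critical value
    reject = abs(t_stat) > t_crit                          # reject H0 if |t| > critical value

With a known population variance the same steps apply, except the standard error uses σ and the z table replaces the t table.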

4 When Do We Use Normal or t Distribution?
Is the sample size large (> 30)?
- Yes → use the normal distribution (calculate the z-statistic)
- No → is the population variance known?
  - Yes → use the normal distribution (calculate the z-statistic)
  - No → use the t distribution (calculate the t-statistic; d.f. = n - 1)

5 Example: Testing One Mean
Ho: μ = 240

Racquet No.   Weight
1             240
2             230
3             220
4             (not legible)
5             250
6             260
7             (not legible)
8             (not legible)
9             (not legible)
10            200
Sum           2350

Confidence level: 95%
Sample mean = 235
Population variance is unknown
Estimated population variance = 316.7
Standard error of the sample mean = 5.6
t-statistic = -0.89
Critical value = 2.262
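
A quick arithmetic check of these figures from the summary statistics alone (Python sketch; SciPy is only used to look up the critical value):

    import math
    from scipy import stats

    n, xbar, s2, mu0, alpha = 10, 235, 316.7, 240, 0.05   # summary figures from the slide
    se = math.sqrt(s2 / n)                      # ~ 5.6
    t_stat = (xbar - mu0) / se                  # ~ -0.89
    t_crit = stats.t.ppf(1 - alpha / 2, n - 1)  # ~ 2.262
    # |t| = 0.89 < 2.262, so we fail to reject H0: mu = 240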

6 Example (Cont’d)
Ho: μ = 240, Ha: μ ≠ 240, α = 0.05
[Figure: two-tailed t distribution with rejection regions of 0.025 in each tail, beyond -t and +t]

7 Decision Rule: When to Reject Ho?
One-sided (one-tailed) test (Ha: μ > μo, μ < μo, π > πH, or π < πH):
- Test statistic: reject Ho if Zobs > Zα or Zobs < -Zα (or tobs > tα or tobs < -tα)
- P-value (one-tailed): reject Ho if P-value < α
- SPSS p-value (two-sided "Sig."): reject Ho if Sig./2 < α

Two-sided (two-tailed) test (Ha: μ ≠ μo or π ≠ πH):
- Test statistic: reject Ho if Zobs > Zα/2 or Zobs < -Zα/2 (or tobs > tα/2 or tobs < -tα/2)
- P-value (one-tailed): reject Ho if P-value < α/2
- SPSS p-value (two-sided "Sig."): reject Ho if Sig. < α
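
A small Python sketch of the SPSS row (illustration only; the numbers reuse the one-mean example, and the one-sided direction Ha: μ < μo is assumed for the example):

    from scipy import stats

    t_obs, df, alpha = -0.89, 9, 0.05
    sig_two_sided = 2 * stats.t.sf(abs(t_obs), df)      # what SPSS labels "Sig. (2-tailed)"
    reject_two_sided = sig_two_sided < alpha            # two-sided decision rule
    reject_one_sided = sig_two_sided / 2 < alpha and t_obs < 0   # one-sided rule for Ha: mu < mu0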

8 Comparing Two Independent Means
If the two populations are independent:

Group   Pop. Mean   Pop. Std Dev.   Sample Size   Sample Mean   Sample SD
A       μA          σA              nA            x̄A           sA
B       μB          σB              nB            x̄B           sB

- Setting up the null hypothesis (H0: μA = μB; Ha: μA ≠ μB)
- Determining the confidence level (equivalently, the significance level)
- Calculating the sample means

9 Comparing Two Independent Means (Cont’d)
- Calculating the standard errors of the means
  - when the population variance (σ²) is known
  - when the population variance (σ²) is unknown
- Calculating the standard error of the "difference in means"
- Calculating the z-statistic (or t-statistic; d.f. = nA + nB - 2)
- If |statistic| > critical value, then reject the null
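
The pooled-variance calculation can be written as a small Python helper (a sketch under the equal-variance assumption; the function name and arguments are my own):

    import math
    from scipy import stats

    def pooled_t_from_summary(xbar_a, s2_a, n_a, xbar_b, s2_b, n_b, alpha=0.05):
        """Two-tailed test of H0: mu_A = mu_B with unknown but equal variances."""
        s2_pooled = ((n_a - 1) * s2_a + (n_b - 1) * s2_b) / (n_a + n_b - 2)
        se_diff = math.sqrt(s2_pooled * (1 / n_a + 1 / n_b))   # SE of the difference in means
        t_stat = (xbar_a - xbar_b) / se_diff
        df = n_a + n_b - 2
        t_crit = stats.t.ppf(1 - alpha / 2, df)
        return t_stat, t_crit, abs(t_stat) > t_crit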

10 Example: Comparing Two Independent Means
Ho: μA = μB

Racquet No.   Weight (Machine A)   Weight (Machine B)
1             240                  (not legible)
2             230                  250
3             220                  260
4-9           (not legible)        (not legible)
10            200                  (not legible)
Sum           2350                 2500

Confidence level: 95%
Sample means = 235 vs. 250
Population variances are unknown
Estimated population variances = 316.7 vs. 66.7
Standard error of the difference = 6.2
t-statistic = -2.42
Critical value = 2.101
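
These figures can be checked with SciPy's summary-statistics version of the pooled t-test (a sketch using only the means, variances, and sample sizes shown above):

    import math
    from scipy import stats

    t_stat, p_value = stats.ttest_ind_from_stats(
        mean1=235, std1=math.sqrt(316.7), nobs1=10,
        mean2=250, std2=math.sqrt(66.7), nobs2=10,
        equal_var=True,
    )
    # t_stat ~ -2.42 and p_value ~ 0.026 < 0.05 (|t| > 2.101), so reject H0: mu_A = mu_B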

11 Example (Cont’d)
Ho: μA = μB, Ha: μA ≠ μB, α = 0.05
[Figure: two-tailed t distribution with rejection regions of 0.025 in each tail, beyond -t and +t]

12 Comparing Two Related Means
If the two samples are related (e.g., pretest and post-test scores):
- Setting up the null hypothesis (H0: D = 0; Ha: D ≠ 0)
- Determining the confidence level (equivalently, the significance level)
- Calculating the difference for each pair (d)
- Calculating the sample mean of the differences
- Calculating the standard error of the mean difference
- Calculating the t-statistic (d.f. = n - 1)
- If |statistic| > critical value, then reject the null
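
A minimal paired-test sketch in Python (the before/after weights below are hypothetical, not the slide's data):

    from scipy import stats

    before = [240, 230, 220, 240, 250, 260, 240, 230, 240, 200]   # hypothetical time-1 weights
    after = [240, 250, 260, 240, 240, 260, 250, 230, 270, 250]    # hypothetical time-2 weights
    d = [b - a for b, a in zip(before, after)]           # difference of each pair
    n = len(d)
    d_bar = sum(d) / n                                   # sample mean of the differences
    s2_d = sum((x - d_bar) ** 2 for x in d) / (n - 1)    # variance of the differences
    se_d = (s2_d / n) ** 0.5                             # standard error of the mean difference
    t_stat = d_bar / se_d                                # t-statistic, d.f. = n - 1
    t_check, p_value = stats.ttest_rel(before, after)    # SciPy shortcut gives the same t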

13 Example: Comparing Two Related Means
Ho: D = 0

Data: the same 10 racquets weighed at time 1 (sum = 2350) and at time 2 (sum = 2500), with the pairwise differences d (sum = -150); only some individual values are legible in the transcript (e.g., d = -20, -40, 10, -30, -50).

Confidence level: 95%
Sample mean of d = -15
Estimated variance of d = 405.6
Standard error of the mean difference = 6.4
t-statistic = -2.36
Critical value = 2.262
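
A quick check of these figures from the summary statistics alone (Python sketch):

    import math
    from scipy import stats

    n, d_bar, s2_d, alpha = 10, -15, 405.6, 0.05    # summary figures from the slide
    se_d = math.sqrt(s2_d / n)                      # ~ 6.4
    t_stat = d_bar / se_d                           # ~ -2.36
    t_crit = stats.t.ppf(1 - alpha / 2, n - 1)      # ~ 2.262
    # |t| = 2.36 > 2.262, so reject H0: D = 0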

14 Example (Cont’d)
Ho: D = 0, Ha: D ≠ 0, α = 0.05
[Figure: two-tailed t distribution with rejection regions of 0.025 in each tail, beyond -t and +t]

15 Paired T Test: Results

16 Decision Rule: When to Reject Ho?
One-sided (one-tailed) test (Ha: μ > μo, μ < μo, π > πH, or π < πH):
- Test statistic: reject Ho if Zobs > Zα or Zobs < -Zα (or tobs > tα or tobs < -tα)
- P-value (one-tailed): reject Ho if P-value < α
- SPSS p-value (two-sided "Sig."): reject Ho if Sig./2 < α

Two-sided (two-tailed) test (Ha: μ ≠ μo or π ≠ πH):
- Test statistic: reject Ho if Zobs > Zα/2 or Zobs < -Zα/2 (or tobs > tα/2 or tobs < -tα/2)
- P-value (one-tailed): reject Ho if P-value < α/2
- SPSS p-value (two-sided "Sig."): reject Ho if Sig. < α

17 Comparing Three or More Means: Analysis of Variance (ANOVA)
Idea: If a significant portion of the total variation can be explained by between-group variation, then we can conclude that the groups are different.

Total variation = Between-group variation + Within-group variation

- Sum of total variation: SST = Σ over all observations of (xij - grand mean)²
- Sum of between-group variation: SSB = Σ over groups of nj × (group mean j - grand mean)²
- Sum of within-group variation: SSW = Σ over all observations of (xij - group mean j)²
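
A small numeric illustration of the decomposition (Python sketch; the three groups are hypothetical):

    import numpy as np

    groups = [np.array([240.0, 230, 220, 250]),
              np.array([250.0, 260, 240]),
              np.array([230.0, 220, 240, 250, 240])]
    all_obs = np.concatenate(groups)
    grand_mean = all_obs.mean()
    ss_total = ((all_obs - grand_mean) ** 2).sum()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    assert np.isclose(ss_total, ss_between + ss_within)   # total = between + within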

18 Example: Comparing Multiple Groups
Data (per the transcript; most individual cell values are not legible, the few visible ones being 240, 230, 250, 220, 260, and 200):
Group A: n = 10, sum = 2350, mean = 235
Group B: n = 10, sum = 2500, mean = 250
Group C: n = 8, sum = 1920, mean = 240

Total mean = 6770 / 28 = 241.8
Total variation = 5,410.7
Between-group variation = 1,160.7
Within-group variation = 4,250

19 Calculating Mean Squared Variation
Mean squared variation = sum of squared variation / degrees of freedom
Degrees of freedom:
- Total variation: # total observations (n) - 1
- Between-group variation: # groups (k) - 1
- Within-group variation: # total observations (n) - # groups (k)

20 Example: Comparing Multiple Groups
Same data as above: group A (n = 10, mean = 235), group B (n = 10, mean = 250), group C (n = 8, mean = 240); total mean = 6770 / 28 = 241.8.
Total variation = 5,410.7
Between-group variation = 1,160.7
Within-group variation = 4,250
Degrees of freedom: total = 27, between-group (B) = 2, within-group (W) = 25
MST = 5,410.7 / 27 = 200.4
MSB = 1,160.7 / 2 = 580.4
MSW = 4,250 / 25 = 170.0

21 Then What? Calculating F ratio
F ratio = MSB / MSW (the mean squared between-group variation divided by the mean squared within-group variation).
This F ratio follows the F-distribution with (k - 1) numerator and (n - k) denominator degrees of freedom.
Find the critical value in the F-distribution table.
If the calculated F ratio is greater than the critical value, we reject the null hypothesis that the group means are all equal (i.e., μA = μB = μC).

22 Example: Comparing Multiple Groups
Same data as above: group A (n = 10, mean = 235), group B (n = 10, mean = 250), group C (n = 8, mean = 240); total mean = 6770 / 28 = 241.8.
Total variation = 5,410.7; between-group variation = 1,160.7; within-group variation = 4,250
Degrees of freedom: total = 27, between-group (B) = 2, within-group (W) = 25
MST = 200.4, MSB = 580.4, MSW = 170.0
F ratio = 580.4 / 170.0 = 3.41
Critical value F(2, 25) at α = 0.05 = 3.39
Since 3.41 > 3.39, we reject the null hypothesis.
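
The F ratio and critical value can be checked with SciPy (a short sketch using the slide's sums of squares):

    from scipy import stats

    ss_between, ss_within = 1160.7, 4250.0     # from the slide
    df_between, df_within = 3 - 1, 28 - 3      # k - 1 and n - k
    msb = ss_between / df_between              # ~ 580.4
    msw = ss_within / df_within                # ~ 170.0
    f_ratio = msb / msw                        # ~ 3.41
    f_crit = stats.f.ppf(0.95, df_between, df_within)   # ~ 3.39 for alpha = 0.05
    # 3.41 > 3.39, so reject H0: mu_A = mu_B = mu_C (just barely)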

23 ANOVA Procedure
Step 1: Calculate each group mean and the total mean
Step 2: Calculate the total / between-group / within-group variation
Step 3: Calculate the degrees of freedom for each variation
Step 4: Calculate the mean squared variations
Step 5: Calculate the F ratio
Step 6: Obtain the critical value from the F-distribution table for the chosen significance level and the degrees of freedom
Step 7: Reject the null (i.e., conclude that the groups are different) if the F ratio is greater than the critical value; otherwise, fail to reject the null (i.e., we cannot conclude that the groups differ)
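
These seven steps fit in a short Python sketch (the helper name and the example groups are hypothetical; SciPy's f_oneway is used only as a cross-check):

    import numpy as np
    from scipy import stats

    def one_way_anova(groups, alpha=0.05):
        groups = [np.asarray(g, dtype=float) for g in groups]
        all_obs = np.concatenate(groups)
        n, k = len(all_obs), len(groups)
        grand_mean = all_obs.mean()                                              # step 1
        ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # step 2
        ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
        msb = ss_between / (k - 1)                                               # steps 3-4
        msw = ss_within / (n - k)
        f_ratio = msb / msw                                                      # step 5
        f_crit = stats.f.ppf(1 - alpha, k - 1, n - k)                            # step 6
        return f_ratio, f_crit, f_ratio > f_crit                                 # step 7

    a, b, c = [240, 230, 250, 260], [220, 225, 235], [250, 255, 245, 240]        # hypothetical groups
    f_manual, f_crit, reject = one_way_anova([a, b, c])
    f_scipy, p_value = stats.f_oneway(a, b, c)   # f_scipy matches f_manual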

24 ANOVA Tables
[Screenshots: ANOVA output tables from SPSS and Excel]

25 ANOVA Tables

26 Check Points for ANOVA
- What is the null hypothesis?
- In what variable are you measuring the difference? What is the treatment?
- Can you calculate the total / between-group / within-group variation?
- Can you calculate the degrees of freedom for each variation?
- Can you calculate the F ratio and find the critical value in the F-distribution table?

27 Part 2
Cross Tabulations and Chi-square Analysis
Correlation Analysis

28 Are Two Variables Associated?
- Categorical variables (i.e., nominal and ordinal scales) → cross tabulation and chi-square analysis
- Continuous variables (i.e., interval and ratio scales) → correlation analysis

29 Cross Tabulation
Investigating contingent relationships… Are two variables independent?

30 Income vs. Number of Cars
Number of cars:

Income        0 or 1   2+   Total
< $37,500     48       6    54
> $37,500     27       19   46
Total         75       25   100

What would be the null hypothesis?

31 Statistical Independence
In general, the probability of two events occurring jointly is P(A and B) = P(A) × P(B | A). If the two events are independent, then P(A and B) = P(A) × P(B), since independence means P(B | A) = P(B).

32 Example
What is the probability of drawing two aces in a row (from a standard 52-card deck)?
(1) Without replacement
(2) With replacement
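
The arithmetic for both cases (Python; assumes a standard 52-card deck with 4 aces):

    p_without = (4 / 52) * (3 / 51)   # dependent draws: P(ace1) * P(ace2 | ace1) ~ 0.0045
    p_with = (4 / 52) * (4 / 52)      # independent draws (card replaced): ~ 0.0059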

33 Expected Numbers Under H0
Expected counts under H0 (row total × column total / grand total):

Income (A)        0 or 1 (B1)             2+ (B2)                 Total
< $37,500 (A1)    54 × 75 / 100 = 40.5    54 × 25 / 100 = 13.5    54
> $37,500 (A2)    46 × 75 / 100 = 34.5    46 × 25 / 100 = 11.5    46
Total             75                      25                      100

Get the expected probabilities now: under independence, P(Ai and Bj) = P(Ai) × P(Bj), so the expected count is n × P(Ai) × P(Bj).

34 Observed vs. Expected Numbers
Observed (O) vs. expected (E) counts:

Income (A)        0 or 1 (B1)         2+ (B2)             Total
< $37,500 (A1)    O = 48, E = 40.5    O = 6, E = 13.5     54
> $37,500 (A2)    O = 27, E = 34.5    O = 19, E = 11.5    46
Total             75                  25                  100

35 χ2 Test for Statistical Independence
(Step 1) Calculate the test statistic: χ² = Σ (observed - expected)² / expected, summed over all cells
(Step 2) Find the critical value for the degrees of freedom (for a cross tabulation, d.f. = (# rows - 1) × (# columns - 1)) and the significance level (α) in a χ² table
(Step 3) If the test statistic is greater than the critical value, reject the null hypothesis (H0: the two variables are independent) in favor of the alternative (Ha: the two variables are related to each other). Otherwise, fail to reject the null.

36 In Our Example

Income        0 or 1   2+   Total
< $37,500     48       6    54
> $37,500     27       19   46
Total         75       25   100

χ² = (48 - 40.5)²/40.5 + (6 - 13.5)²/13.5 + (27 - 34.5)²/34.5 + (19 - 11.5)²/11.5 ≈ 12.1, with (2 - 1)(2 - 1) = 1 degree of freedom; the critical value at α = 0.05 is 3.84, so we reject independence.
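
The same calculation as a Python sketch (SciPy's chi2_contingency with correction=False reproduces the hand computation):

    import numpy as np
    from scipy import stats

    observed = np.array([[48, 6],     # income < $37,500: 0-1 cars, 2+ cars
                         [27, 19]])   # income > $37,500
    row_tot = observed.sum(axis=1, keepdims=True)
    col_tot = observed.sum(axis=0, keepdims=True)
    expected = row_tot * col_tot / observed.sum()              # e.g. 54 * 75 / 100 = 40.5
    chi2_stat = ((observed - expected) ** 2 / expected).sum()  # ~ 12.1
    chi2_crit = stats.chi2.ppf(0.95, df=1)                     # ~ 3.84
    chi2_check, p, dof, exp = stats.chi2_contingency(observed, correction=False)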

37 χ2 Test for Goodness-of-Fit
H0: The sample represents the population

Brand      Observed (oi)   Expected (ei)   (oi - ei)² / ei
US         32              38              0.9474
Japanese   27              31              0.5161
European   21              18              0.5000
Korean     9               9               0.0000
Other      11              4               12.2500
Total      100             100             χ² ≈ 14.21
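
A short SciPy check of this table (a sketch; the critical value assumes α = 0.05 with d.f. = 5 - 1 = 4):

    from scipy import stats

    observed = [32, 27, 21, 9, 11]
    expected = [38, 31, 18, 9, 4]
    chi2_stat, p_value = stats.chisquare(f_obs=observed, f_exp=expected)   # ~ 14.21
    chi2_crit = stats.chi2.ppf(0.95, df=len(observed) - 1)                 # ~ 9.49
    # 14.21 > 9.49, so reject H0 that the sample matches the expected brand shares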

38 Are Two Variables Related?
- Correlation analysis: a measure of the linear association between two interval- or ratio-scaled variables → correlation coefficient
- Simple linear regression: using one interval- or ratio-scaled variable to predict another interval- or ratio-scaled variable → simple linear regression model
- Multiple regression analysis: introducing multiple predictor variables to predict a focal variable

39 Correlation Analysis
[Figure: scatterplots of Y against X illustrating r = 1.0, r = 0.8, r = 0, and r = ?]

40 Correlation Does Not Mean Causation
High correlation examples:
- A rooster's crow and the rising of the sun: the rooster does not cause the sun to rise.
- Ice cream consumption and virus outbreaks: they covary because both are influenced by a third variable.

41 Calculating Correlation
Sales (y)   Ad (x)
100         50
160         60
120         55
90          40
150         80
130         35
110         45
65          30
140         70

Steps: compute the means, the standard deviations, the covariance, and then the correlation r = cov(x, y) / (sx × sy).
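
A Python sketch of these steps (the (y, x) pairing follows the reconstructed table above):

    import numpy as np

    sales = np.array([100, 160, 120, 90, 150, 130, 110, 65, 140], dtype=float)  # y
    ad = np.array([50, 60, 55, 40, 80, 35, 45, 30, 70], dtype=float)            # x
    mean_y, mean_x = sales.mean(), ad.mean()
    sd_y, sd_x = sales.std(ddof=1), ad.std(ddof=1)     # sample standard deviations
    cov_xy = np.cov(ad, sales, ddof=1)[0, 1]           # sample covariance
    r = cov_xy / (sd_x * sd_y)                         # same value as np.corrcoef(ad, sales)[0, 1]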

42 Next Class: Are these variables related?
[Figure: scatterplot of data points]
Simple & Multiple Regression

