Published by James Parrish. Modified over 9 years ago.
1
5-5 Inference on the Ratio of Variances of Two Normal Populations 5-5.1 The F Distribution We wish to test hypotheses about the ratio of the variances of two normal populations. The development of a test procedure for these hypotheses requires a new probability distribution, the F distribution.
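For reference, the hypotheses and the distribution named on this slide are usually written as follows. This is a sketch of the standard textbook form, not a recovery of the slide's own equations; the two-sided alternative and the symbols W, Y, u, and v are assumptions made here.

```latex
% Two-sided hypotheses on the variances of two normal populations
H_0:\ \sigma_1^2 = \sigma_2^2 \qquad H_1:\ \sigma_1^2 \neq \sigma_2^2

% F distribution: W and Y are independent chi-square random variables
% with u and v degrees of freedom, respectively
F = \frac{W/u}{Y/v} \sim F_{u,\,v}

% Lower-tail percentage points follow from upper-tail ones
% (f_{\alpha,u,v} denotes the point with upper-tail area \alpha)
f_{1-\alpha,\,u,\,v} = \frac{1}{f_{\alpha,\,v,\,u}}
```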
4
5-5 Inference on the Ratio of Variances of Two Normal Populations The Test Procedure
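A sketch of the usual test procedure, under the assumption of independent random samples of sizes n1 and n2 from two normal populations. The notation uses upper-tail percentage points, which is an assumption here; the SAS function FINV, by contrast, takes a lower-tail probability.

```latex
% Test statistic: ratio of the sample variances
F_0 = \frac{S_1^2}{S_2^2}, \qquad
F_0 \sim F_{n_1-1,\;n_2-1} \ \text{ when } H_0:\ \sigma_1^2=\sigma_2^2 \text{ is true}

% Two-sided rejection region at significance level \alpha
\text{reject } H_0 \ \text{ if } \
f_0 > f_{\alpha/2,\;n_1-1,\;n_2-1}
\ \text{ or } \
f_0 < f_{1-\alpha/2,\;n_1-1,\;n_2-1}
```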
8
5-5 Inference on the Ratio of Variances of Two Normal Populations
Example 5-10

OPTIONS NOOVP NODATE NONUMBER LS=80;
DATA ex510;
  n1=16; n2=16; alpha=0.05;
  df1=n1-1; df2=n2-1;
  s1=1.96; s2=2.13;
  f0=(s1**2)/(s2**2);                             /* test statistic; here f0=0.85, less than 1.0 */
  IF f0 <= 1 THEN pvalue=2*probf(f0, df1, df2);   /* two-sided P-value from the lower tail */
  ELSE pvalue=2*(1-probf(f0, df1, df2));          /* two-sided P-value from the upper tail */
  f1=finv(alpha/2, df1, df2);                     /* lower critical value */
  f2=1/finv(alpha/2, df2, df1);                   /* upper critical value */
  fl=finv(alpha/2, df2, df1);
  fu=finv(1-alpha/2, df2, df1);
  CL=f0*fl; CU=f0*fu;                             /* confidence limits on the variance ratio */
PROC PRINT;
  VAR f0 df1 df2 pvalue f1 f2 fl fu CL CU;
RUN; QUIT;

The SAS System

OBS    f0       df1  df2  pvalue   f1       f2       fl       fu       CL       CU
1      0.84675  15   15   0.75151  0.34939  2.86209  0.34939  2.86209  0.29585  2.42346
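The arithmetic behind the Example 5-10 listing can be checked by hand. The block below is a worked sketch that uses only the values printed in the SAS output (s1, s2, f1, f2, and the P-value).

```latex
% Test statistic from the two sample standard deviations
f_0 = \frac{s_1^2}{s_2^2} = \frac{1.96^2}{2.13^2} = \frac{3.8416}{4.5369} \approx 0.8468

% Critical values printed as f1 and f2 (15 and 15 degrees of freedom, alpha = 0.05)
0.34939 < f_0 = 0.8468 < 2.86209

% f_0 lies between the critical values and the P-value is 0.75,
% so H_0 is not rejected at the 0.05 level
```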
9
5-5 Inference on the Ratio of Variances of Two Normal Populations 5-5.2 Confidence Interval on the Ratio of Two Variances
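A sketch of the interval in its usual form, followed by the numbers from the Example 5-10 listing (variables fl, fu, CL, CU). The percentage points are written in the upper-tail convention, which is an assumption about the notation; note the reversed degrees of freedom.

```latex
% 100(1-\alpha)% confidence interval on the ratio of variances
\frac{s_1^2}{s_2^2}\, f_{1-\alpha/2,\;n_2-1,\;n_1-1}
\ \le\ \frac{\sigma_1^2}{\sigma_2^2}\ \le\
\frac{s_1^2}{s_2^2}\, f_{\alpha/2,\;n_2-1,\;n_1-1}

% With the Example 5-10 values (f_0 = 0.84675, fl = 0.34939, fu = 2.86209)
0.84675 \times 0.34939 \approx 0.2958
\ \le\ \frac{\sigma_1^2}{\sigma_2^2}\ \le\
0.84675 \times 2.86209 \approx 2.4235

% The interval contains 1, which agrees with not rejecting H_0
```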
12
5-6 Inference on Two Population Proportions 5-6.1 Hypothesis Testing on the Equality of Two Binomial Proportions
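The test statistic can be sketched as below. The formula matches the computation of PHAT and Z0 in the SAS program for this section; the symbols x1 and x2 (counts of observations in the class of interest) are notation assumed here.

```latex
% Pooled estimate of the common proportion under H_0: p_1 = p_2
\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}

% Large-sample test statistic
Z_0 = \frac{\hat{p}_1 - \hat{p}_2}
           {\sqrt{\hat{p}\,(1-\hat{p})\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}

% Two-sided test at level \alpha
\text{reject } H_0 \ \text{ if } \ |z_0| > z_{\alpha/2}
```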
17
5-6 Inference on Two Population Proportions

OPTIONS NOOVP NODATE NONUMBER LS=80;
DATA EX512;
  N1=300; N2=300; ALPHA=0.05;
  DFT1=253; DFT2=196;
  P1HAT=DFT1/N1; P2HAT=DFT2/N2;
  DIFFP=P1HAT-P2HAT;
  PHAT=(DFT1+DFT2)/(N1+N2);                        /* pooled proportion */
  Z0=DIFFP/SQRT(PHAT*(1-PHAT)*(1/N1+1/N2));        /* test statistic */
  PVALUE=2*(1-PROBNORM(ABS(Z0)));                  /* two-sided P-value; ABS keeps it valid when Z0 < 0 */
  ZVALUE=ABS(PROBIT(ALPHA/2));                     /* PROBIT is the inverse of the PROBNORM function */
  LIMIT=ZVALUE*SQRT((P1HAT*(1-P1HAT)/N1) + (P2HAT*(1-P2HAT)/N2));
  UL=DIFFP+LIMIT; LL=DIFFP-LIMIT;                  /* confidence limits on p1 - p2 */
PROC PRINT;
  VAR P1HAT P2HAT DIFFP PHAT Z0 ZVALUE PVALUE LL UL;
RUN; QUIT;

OBS    P1HAT    P2HAT    DIFFP  PHAT     Z0       ZVALUE   PVALUE     LL       UL
1      0.84333  0.65333  0.19   0.74833  5.36215  1.95996  8.2238E-8  0.12224  0.25776
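The listing can be reproduced by hand. The following worked sketch uses only the inputs in the program (253 and 196 out of 300 each) and the printed normal quantile 1.95996.

```latex
% Sample proportions and pooled proportion
\hat{p}_1 = \tfrac{253}{300} = 0.84333, \quad
\hat{p}_2 = \tfrac{196}{300} = 0.65333, \quad
\hat{p} = \tfrac{449}{600} = 0.74833

% Test statistic (pooled standard error)
z_0 = \frac{0.19}{\sqrt{0.74833 \times 0.25167 \times \tfrac{2}{300}}}
    = \frac{0.19}{0.035434} \approx 5.362

% 95% confidence interval on p_1 - p_2 (unpooled standard error)
0.19 \pm 1.95996\,\sqrt{\frac{0.84333 \times 0.15667}{300} + \frac{0.65333 \times 0.34667}{300}}
 = 0.19 \pm 1.95996 \times 0.034574
 = 0.19 \pm 0.06776

% Limits: (0.12224,\ 0.25776), matching LL and UL in the listing
```

The test uses the pooled proportion because it is computed under the null hypothesis of equal proportions, while the interval uses the separate sample proportions.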
18
5-6 Inference on Two Population Proportions 5-6.2 Type II Error and Choice of Sample Size
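A commonly quoted sample-size formula for the two-sided test with equal sample sizes is sketched below. It is offered as the standard textbook approximation, not as a recovery of the slide's content; q1 = 1 - p1 and q2 = 1 - p2 are notation assumed here, and p1 and p2 are the true proportions at which the Type II error probability should equal beta.

```latex
% Approximate common sample size n = n_1 = n_2 for a two-sided test
% with significance level \alpha and Type II error probability \beta
n = \frac{\left[\, z_{\alpha/2}\sqrt{\dfrac{(p_1+p_2)(q_1+q_2)}{2}}
          \;+\; z_{\beta}\sqrt{p_1 q_1 + p_2 q_2}\,\right]^2}
         {(p_1 - p_2)^2}
```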
21
5-8 What If We Have More Than Two Samples? 5-8.1 Completely Randomized Experiment and Analysis of Variance Treatments (Replicates)
22
5-8 What If We Have More Than Two Samples? 5-8.1 Completely Randomized Experiment and Analysis of Variance The levels of the factor are sometimes called treatments. Each treatment has six observations, or replicates. The runs are performed in random order.
24
5-8 What If We Have More Than Two Samples? 5-8.1 Completely Randomized Experiment and Analysis of Variance Linear Statistical Model
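The one-way model is usually written as below. This is a sketch of the standard fixed-effects form, with a treatments and n replicates per treatment as assumed notation; the slide's own equation did not survive extraction.

```latex
% Linear statistical model for the completely randomized single-factor experiment
Y_{ij} = \mu + \tau_i + \epsilon_{ij}, \qquad i = 1,\dots,a, \quad j = 1,\dots,n

% \mu: overall mean, \tau_i: effect of treatment i,
% \epsilon_{ij}: random error, assumed NID(0, \sigma^2)

% Hypotheses on the treatment effects
H_0:\ \tau_1 = \tau_2 = \cdots = \tau_a = 0 \qquad
H_1:\ \tau_i \neq 0 \ \text{for at least one } i

% Partition of the total variability
SS_T = SS_{\text{Treatments}} + SS_E
```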
32
5-8 What If We Have More Than Two Samples? Which means differ?
38
5-8 What If We Have More Than Two Samples? Residual Analysis and Model Checking
42
5-8 What If We Have More Than Two Samples?

OPTIONS NOOVP NODATE NONUMBER LS=80;
proc format;
  value hc 1=' 5%' 2='10%' 3='15%' 4='20%';
DATA ex514;
  INPUT hc strength @@;        /* @@ reads several (hc, strength) pairs per line */
  format hc hc.;
CARDS;
1 7  2 12 3 14 4 19
1 8  2 17 3 18 4 25
1 15 2 13 3 19 4 22
1 11 2 18 3 17 4 23
1 9  2 19 3 16 4 18
1 10 2 15 3 18 4 20
;
proc anova data=ex514;
  class hc;
  model strength= hc;
  means hc/lsd snk tukey;      /* multiple comparisons of the treatment means */
  TITLE 'proc anova balanced 1-way anova';
proc sort; by hc;
proc boxplot;
  plot strength*hc;
proc glm data=ex514;
  class hc;
  model strength= hc;
  means hc/lsd snk tukey;
  output out=new p=phat r=resid;   /* save predicted values and residuals */
  TITLE 'proc glm 1-way anova';
proc plot data=new;
  plot resid*(phat hc);
  Title 'Residual plot';
RUN; QUIT;
43
5-8 What If We Have More Than Two Samples?

proc anova balanced 1-way anova
The ANOVA Procedure

Class Level Information
Class    Levels    Values
hc            4    5% 10% 15% 20%

Number of Observations Read    24
Number of Observations Used    24

Dependent Variable: strength

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3       382.7916667    127.5972222      19.61    <.0001
Error              20       130.1666667      6.5083333
Corrected Total    23       512.9583333

R-Square    Coeff Var    Root MSE    strength Mean
0.746243     15.98628    2.551144         15.95833

Source    DF       Anova SS    Mean Square    F Value    Pr > F
hc         3    382.7916667    127.5972222      19.61    <.0001
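The entries of the ANOVA table are linked by simple arithmetic. The worked sketch below uses only the sums of squares and degrees of freedom printed in the listing.

```latex
% Mean squares are sums of squares divided by their degrees of freedom
MS_{\text{Treatments}} = \frac{382.7917}{3} = 127.5972, \qquad
MS_E = \frac{130.1667}{20} = 6.5083

% F statistic
F_0 = \frac{MS_{\text{Treatments}}}{MS_E} = \frac{127.5972}{6.5083} \approx 19.61

% R-Square is the share of total variability explained by the model
R^2 = \frac{382.7917}{512.9583} \approx 0.7462
```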
44
5-8 What If We Have More Than Two Samples?

proc anova balanced 1-way anova
The ANOVA Procedure

t Tests (LSD) for strength

NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                            0.05
Error Degrees of Freedom           20
Error Mean Square            6.508333
Critical Value of t           2.08596
Least Significant Difference   3.0724

Means with the same letter are not significantly different.

t Grouping      Mean    N    hc
A             21.167    6    20%
B             17.000    6    15%
B             15.667    6    10%
C             10.000    6    5%
45
5-8 What If We Have More Than Two Samples?

proc anova balanced 1-way anova
The ANOVA Procedure

Student-Newman-Keuls Test for strength

NOTE: This test controls the Type I experimentwise error rate under the complete null hypothesis but not under partial null hypotheses.

Alpha                        0.05
Error Degrees of Freedom       20
Error Mean Square        6.508333

Number of Means            2           3           4
Critical Range     3.0724227    3.726419   4.1225627

Means with the same letter are not significantly different.

SNK Grouping      Mean    N    hc
A               21.167    6    20%
B               17.000    6    15%
B               15.667    6    10%
C               10.000    6    5%
46
5-8 What If We Have More Than Two Samples?

proc anova balanced 1-way anova
The ANOVA Procedure

Tukey's Studentized Range (HSD) Test for strength

NOTE: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.

Alpha                                     0.05
Error Degrees of Freedom                    20
Error Mean Square                     6.508333
Critical Value of Studentized Range    3.95829
Minimum Significant Difference          4.1226

Means with the same letter are not significantly different.

Tukey Grouping      Mean    N    hc
A                 21.167    6    20%
B                 17.000    6    15%
B                 15.667    6    10%
C                 10.000    6    5%
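The LSD and Tukey thresholds in the listings follow from the error mean square and the common sample size n = 6. The block below is a worked sketch using only the printed critical values.

```latex
% Fisher LSD: t critical value times the standard error of a difference
\text{LSD} = t_{0.025,\,20}\sqrt{\frac{2\,MS_E}{n}}
           = 2.08596\sqrt{\frac{2 \times 6.508333}{6}}
           = 2.08596 \times 1.47290 \approx 3.0724

% Tukey HSD: studentized range critical value times the standard error of a mean
\text{HSD} = q_{0.05}(4,\,20)\sqrt{\frac{MS_E}{n}}
           = 3.95829\sqrt{\frac{6.508333}{6}}
           = 3.95829 \times 1.04150 \approx 4.1226

% Differences between adjacent ordered means
21.167 - 17.000 = 4.167, \qquad
17.000 - 15.667 = 1.333, \qquad
15.667 - 10.000 = 5.667
```

Only the middle difference falls below both thresholds, which is why the 15% and 10% levels share the letter B under every method while the 20% and 5% levels stand alone.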
48
5-8 What If We Have More Than Two Samples?

proc glm 1-way anova
The GLM Procedure

Dependent Variable: strength

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3       382.7916667    127.5972222      19.61    <.0001
Error              20       130.1666667      6.5083333
Corrected Total    23       512.9583333

R-Square    Coeff Var    Root MSE    strength Mean
0.746243     15.98628    2.551144         15.95833

Source    DF      Type I SS    Mean Square    F Value    Pr > F
hc         3    382.7916667    127.5972222      19.61    <.0001

Source    DF    Type III SS    Mean Square    F Value    Pr > F
hc         3    382.7916667    127.5972222      19.61    <.0001
49
5-8 What If We Have More Than Two Samples?
Residual plot: plot of resid*phat (residuals versus predicted values, phat axis from 10 to 22) and plot of resid*hc (residuals versus hc level: 5%, 10%, 15%, 20%). Legend: A = 1 observation, B = 2 observations, etc. The residuals range from -3.667 to 5.000.
51
5-8 What If We Have More Than Two Samples?
Example

An article in Fortune compared rent in five American cities: New York, Chicago, Detroit, Tampa, and Orlando. The following data are small random samples of rents (in dollars) in the five cities. The New York data are Manhattan only. Conduct the Kruskal-Wallis test to determine whether evidence exists that there are significant differences in the rents in these cities. If differences exist, where are they?

OPTIONS NOOVP NODATE NONUMBER LS=80;
proc format;
  value city 1=' New York' 2='Chicago' 3='Detroit' 4='Tampa' 5='Orlando';
DATA rent;
  INPUT city rent @@;          /* @@ reads several (city, rent) pairs per line */
  format city city.;
CARDS;
1 900 1 1200 1 850 1 1320 1 1400 1 1150 1 975
2 625 2 640 2 775 2 1000 2 690 2 550 2 840 2 750
3 415 3 400 3 420 3 560 3 780 3 620 3 800 3 390
4 410 4 310 4 320 4 280 4 500 4 385 4 440
5 340 5 425 5 275 5 210 5 575 5 360
;
ods graphics on;
proc npar1way data=rent wilcoxon plots=wilcoxonboxplot;
  class city;
  var rent;
  TITLE 'Kruskal-Wallis Test';
run;
ods graphics off;
quit;

Kruskal-Wallis Test
The NPAR1WAY Procedure

Wilcoxon Scores (Rank Sums) for Variable rent
Classified by Variable city

                   Sum of    Expected      Std Dev         Mean
city         N     Scores    Under H0     Under H0        Score
----------------------------------------------------------------
New York     7      228.0      129.50    25.018327    32.571429
Chicago      8      192.0      148.00    26.280538    24.000000
Detroit      8      135.0      148.00    26.280538    16.875000
Tampa        7       62.0      129.50    25.018327     8.857143
Orlando      6       49.0      111.00    23.558438     8.166667

Kruskal-Wallis Test
Chi-Square         26.4930
DF                       4
Pr > Chi-Square     <.0001
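The Chi-Square value in the listing can be reproduced from the rank sums. The sketch below uses the usual Kruskal-Wallis statistic without a tie correction, which applies here because the 36 rents are all distinct; N, R_i, and n_i are the total sample size, rank sums, and group sizes.

```latex
% Kruskal-Wallis statistic
H = \frac{12}{N(N+1)} \sum_{i=1}^{a} \frac{R_i^2}{n_i} - 3(N+1)

% With N = 36 and the rank sums from the listing
H = \frac{12}{36 \times 37}
    \left( \frac{228^2}{7} + \frac{192^2}{8} + \frac{135^2}{8}
         + \frac{62^2}{7} + \frac{49^2}{6} \right) - 3 \times 37
  = \frac{12}{1332} \times 15261.72 - 111
  \approx 26.49

% Compared with a chi-square distribution with a - 1 = 4 degrees of freedom
```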
53
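5-8 What If We Have More Than Two Samples?

Pairwise differences in rank sums

                      New York   Chicago   Detroit   Tampa   Orlando
Rank sum                   228       192       135      62        49
Orlando      49           179*      143*       86*      13         -
Tampa        62           166*      130*       73*       -
Detroit     135            93*       57*        -
Chicago     192            36*        -
New York    228              -

* Significantly different.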
54
5-8 What If We Have More Than Two Samples? 5-8.2 Randomized Complete Block Experiment The randomized block design is an extension of the paired t-test to situations where the factor of interest has more than two levels.
55
5-8 What If We Have More Than Two Samples? 5-8.2 Randomized Complete Block Experiment For example, consider the situation where two different methods were used to predict the shear strength of steel plate girders. Say we use four girders as the experimental units.
57
5-8 What If We Have More Than Two Samples? 5-8.2 Randomized Complete Block Experiment The appropriate linear statistical model assumes that treatments and blocks are initially fixed effects and that blocks do not interact with treatments.
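The model is usually written as below. This is a sketch of the standard form, with a treatments and b blocks as assumed notation; the slide's own equation did not survive extraction.

```latex
% Linear statistical model for the randomized complete block experiment
Y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}, \qquad
i = 1,\dots,a, \quad j = 1,\dots,b

% \mu: overall mean, \tau_i: effect of treatment i, \beta_j: effect of block j,
% \epsilon_{ij}: random error, assumed NID(0, \sigma^2)
% There is no (\tau\beta)_{ij} term because blocks and treatments are assumed not to interact
```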
58
5-8 What If We Have More Than Two Samples? 5-8.2 Randomized Complete Block Experiment The hypotheses of interest are:
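In the usual fixed-effects formulation, which is assumed here, the hypotheses concern the treatment effects only:

```latex
H_0:\ \tau_1 = \tau_2 = \cdots = \tau_a = 0 \qquad
H_1:\ \tau_i \neq 0 \ \text{for at least one } i
```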
60
5-8 What If We Have More Than Two Samples? 5-8.2 Randomized Complete Block Experiment The mean squares are:
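A sketch of the standard definitions, assuming a treatments and b blocks with one observation per treatment-block combination:

```latex
% Each mean square is a sum of squares divided by its degrees of freedom
MS_{\text{Treatments}} = \frac{SS_{\text{Treatments}}}{a-1}, \qquad
MS_{\text{Blocks}} = \frac{SS_{\text{Blocks}}}{b-1}, \qquad
MS_E = \frac{SS_E}{(a-1)(b-1)}

% Test statistic for equality of treatment means
F_0 = \frac{MS_{\text{Treatments}}}{MS_E}
\ \sim\ F_{a-1,\;(a-1)(b-1)} \ \text{ under } H_0
```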
61
5-8 What If We Have More Than Two Samples? 5-8.2 Randomized Complete Block Experiment The expected values of these mean squares are:
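For fixed treatment and block effects, the expected mean squares take the standard form sketched below (a treatments, b blocks; notation assumed here).

```latex
E(MS_{\text{Treatments}}) = \sigma^2 + \frac{b \sum_{i=1}^{a} \tau_i^2}{a-1}

E(MS_{\text{Blocks}}) = \sigma^2 + \frac{a \sum_{j=1}^{b} \beta_j^2}{b-1}

E(MS_E) = \sigma^2
```

When the null hypothesis of no treatment effects is true, the treatment mean square and the error mean square both estimate the error variance, which is why their ratio is used as the test statistic.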
64
5-8 What If We Have More Than Two Samples? 5-8.2 Randomized Complete Block Experiment The model for the randomized block design is the same as the two-way factorial model in which the interaction terms are assumed to be zero. The main question of interest is whether the means are the same for the different treatment levels. The blocks are used to explain some of the variability and often to simplify the mechanics of collecting the data. For instance, if the blocks are cities, it is much easier to collect data in one city at a time. SSTR is the same as SSA in the two-way factorial, and SSBL is the same as SSB in the two-way factorial. Notice that SSE in the corresponding two-way factorial model has zero degrees of freedom when there is only one observation per cell. The two-way factorial with only one observation per cell is analyzed in the same way as the randomized block design, and it also assumes no interaction between the two factors.
69
5-8 What If We Have More Than Two Samples? Which means differ?
71
5-8 What If We Have More Than Two Samples? Residual Analysis and Model Checking
75
5-8 What If We Have More Than Two Samples?
Example 5-15

OPTIONS NOOVP NODATE NONUMBER LS=80;
DATA ex515;
  DO type=1 to 4;              /* treatment: chemical type */
    DO fabsam=1 TO 5;          /* block: fabric sample */
      INPUT strength @@;
      OUTPUT;
    END;
  END;
CARDS;
1.3 1.6 0.5 1.2 1.1
2.2 2.4 0.4 2 1.8
1.8 1.7 0.6 1.5 1.3
3.9 4.4 2 4.1 3.4
;
PROC GLM DATA=ex515;
  CLASS type fabsam;
  MODEL strength= type fabsam;
  MEANS TYPE/SNK;
  output out=new p=phat r=resid;   /* save predicted values and residuals */
  TITLE 'Randomized Block Design FabSample: Block, ChemType: Treatment';
proc sort; by type;
proc boxplot;
  plot strength*type;
proc plot data=new;
  plot resid*phat;     /* Residual plot */
  plot resid*type;     /* Residuals by chemical type */
  plot resid*fabsam;   /* Residuals by block */
proc anova data=ex515;
  class type;
  model strength=type;
  means type/snk;
  TITLE 'one-way anova';
run; quit;
76
5-8 What If We Have More Than Two Samples?

Randomized Block Design FabSample: Block, ChemType: Treatment
The GLM Procedure

Class Level Information
Class     Levels    Values
type           4    1 2 3 4
fabsam         5    1 2 3 4 5

Number of Observations Read    20
Number of Observations Used    20
77
5-8 What If We Have More Than Two Samples?

Randomized Block Design FabSample: Block, ChemType: Treatment
The GLM Procedure

Dependent Variable: strength

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               7       24.73700000     3.53385714      44.59    <.0001
Error              12        0.95100000     0.07925000
Corrected Total    19       25.68800000

R-Square    Coeff Var    Root MSE    strength Mean
0.962979     14.36295    0.281514         1.960000

Source    DF      Type I SS    Mean Square    F Value    Pr > F
type       3    18.04400000     6.01466667      75.89    <.0001
fabsam     4     6.69300000     1.67325000      21.11    <.0001

Source    DF    Type III SS    Mean Square    F Value    Pr > F
type       3    18.04400000     6.01466667      75.89    <.0001
fabsam     4     6.69300000     1.67325000      21.11    <.0001
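The table entries can be checked by hand. The worked sketch below uses only the printed sums of squares, with a = 4 chemical types and b = 5 fabric samples.

```latex
% Degrees of freedom
a - 1 = 3, \qquad b - 1 = 4, \qquad (a-1)(b-1) = 12

% Sums of squares add up to the corrected total
18.044 + 6.693 + 0.951 = 25.688

% Mean squares and F statistics
MS_E = \frac{0.951}{12} = 0.07925, \qquad
F_{\text{type}} = \frac{6.01467}{0.07925} \approx 75.89, \qquad
F_{\text{fabsam}} = \frac{1.67325}{0.07925} \approx 21.11
```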
78
5-8 What If We Have More Than Two Samples?

Randomized Block Design FabSample: Block, ChemType: Treatment
The ANOVA Procedure

Dependent Variable: strength

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               7       24.73700000     3.53385714      44.59    <.0001
Error              12        0.95100000     0.07925000
Corrected Total    19       25.68800000

R-Square    Coeff Var    Root MSE    strength Mean
0.962979     14.36295    0.281514         1.960000

Source    DF       Anova SS    Mean Square    F Value    Pr > F
type       3    18.04400000     6.01466667      75.89    <.0001
fabsam     4     6.69300000     1.67325000      21.11    <.0001
79
5-8 What If We Have More Than Two Samples?

Randomized Block Design FabSample: Block, ChemType: Treatment
The ANOVA Procedure

Student-Newman-Keuls Test for strength

NOTE: This test controls the Type I experimentwise error rate under the complete null hypothesis but not under partial null hypotheses.

Alpha                       0.05
Error Degrees of Freedom      12
Error Mean Square        0.07925

Number of Means            2           3           4
Critical Range     0.3879266   0.4749996   0.5285978

Means with the same letter are not significantly different.

SNK Grouping      Mean    N    type
A               3.5600    5    4
B               1.7600    5    2
C B             1.3800    5    3
C               1.1400    5    1
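The SNK critical ranges are studentized range points multiplied by the standard error of a treatment mean. The sketch below backs the implied range points out of the printed critical ranges; the rounded q values are derived from the listing, not taken from a table.

```latex
% Standard error of a treatment mean (n = 5 observations per type)
\sqrt{\frac{MS_E}{n}} = \sqrt{\frac{0.07925}{5}} \approx 0.12590

% Critical range for p ordered means: q_{0.05}(p, 12) \times 0.12590
\frac{0.3879}{0.12590} \approx 3.08, \qquad
\frac{0.4750}{0.12590} \approx 3.77, \qquad
\frac{0.5286}{0.12590} \approx 4.20

% Comparisons among the three smallest means
1.76 - 1.38 = 0.38 < 0.3879, \qquad
1.38 - 1.14 = 0.24 < 0.3879, \qquad
1.76 - 1.14 = 0.62 > 0.4750
```

Types 2 and 3 therefore share the letter B and types 3 and 1 share the letter C, while types 2 and 1 differ because their difference spans three ordered means and exceeds the corresponding range.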
82
5-8 What If We Have More Than Two Samples?
Randomized Block Design FabSample: Block, ChemType: Treatment
Residual plots: plot of resid*phat (residuals versus predicted values, phat axis from 0 to 5), plot of resid*fabsam (residuals versus block, fabsam 1 to 5), and plot of resid*type (residuals versus chemical type, 1 to 4). Legend: A = 1 observation, B = 2 observations, etc. The residual axis runs from -0.5 to 0.5.
83
5-8 What If We Have More Than Two Samples?

one-way anova
The ANOVA Procedure

Dependent Variable: strength

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3       18.04400000     6.01466667      12.59    0.0002
Error              16        7.64400000     0.47775000
Corrected Total    19       25.68800000

R-Square    Coeff Var    Root MSE    strength Mean
0.702429     35.26503    0.691195         1.960000

Source    DF       Anova SS    Mean Square    F Value    Pr > F
type       3    18.04400000     6.01466667      12.59    0.0002
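Comparing this one-way table with the randomized block table shows what blocking buys. The worked sketch below uses only the printed sums of squares from the two analyses.

```latex
% Ignoring blocks moves the block sum of squares into the error term
SS_E^{\text{one-way}} = SS_E^{\text{block}} + SS_{\text{Blocks}}
                      = 0.951 + 6.693 = 7.644

% The error mean square grows and the F statistic for type shrinks
MS_E:\ \frac{0.951}{12} = 0.07925 \ \longrightarrow\ \frac{7.644}{16} = 0.47775

F_{\text{type}}:\ \frac{6.01467}{0.07925} \approx 75.89
\ \longrightarrow\ \frac{6.01467}{0.47775} \approx 12.59
```

The treatment sum of squares is unchanged at 18.044; only the yardstick against which it is judged changes.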
84
5-8 What If We Have More Than Two Samples?

one-way anova
The ANOVA Procedure

Student-Newman-Keuls Test for strength

NOTE: This test controls the Type I experimentwise error rate under the complete null hypothesis but not under partial null hypotheses.

Alpha                       0.05
Error Degrees of Freedom      16
Error Mean Square        0.47775

Number of Means            2           3           4
Critical Range     0.9267163   1.1279913   1.2506944

Means with the same letter are not significantly different.

SNK Grouping      Mean    N    type
A               3.5600    5    4
B               1.7600    5    2
B               1.3800    5    3
B               1.1400    5    1