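One-sample Z test: $\bar{X} = 177$, $\mu = 170$, $S = 16$, $N = 25$
$Z = \dfrac{\bar{X} - \mu}{S / \sqrt{N}} = \dfrac{177 - 170}{16 / \sqrt{25}} = \dfrac{7}{3.2} = 2.19$
Critical value = 1.64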
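The Z calculation on this slide can be checked with a few lines of Python. This is a minimal sketch that assumes the slide's values read as a sample mean of 177, a hypothesized mean of 170, S = 16, and N = 25; the function name `one_sample_z` is illustrative and not from the slides.

```python
import math

def one_sample_z(sample_mean, hypothesized_mean, sd, n):
    """Z = (sample mean - hypothesized mean) / (sd / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sd / math.sqrt(n))

# Values as read from the slide (the role of each number is an assumption).
z = one_sample_z(177, 170, 16, 25)
print(f"Z = {z:.4f}, exceeds 1.64: {z > 1.64}")
```

The comparison with 1.64 corresponds to a one-tailed test at the 5% level.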
One-sample t test: $\bar{X} =$ …, $\mu = 120$, $S = 21.2$, $N = 100$
$t = \dfrac{\bar{X} - \mu}{S / \sqrt{N}} =$ …
Critical value = …
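9.10 Paired t test (pre/post)
Table columns: Before, After, Difference ($D$), $D^2$, with a Total row (individual scores not recovered).
$t_{\text{paired}} = t_p = \dfrac{\bar{d} - 0}{\text{standard error of } \bar{d}} = \dfrac{\bar{d} - 0}{\sqrt{S_d^2 / N}}$
$\bar{d} = \sum D / N = 3/15 = 0.2$, with $N = 15$
$\sum d^2 = \sum D^2 - (\sum D)^2 / N = 175$
$S_d^2 = \sum d^2 / (N - 1) = 175 / (15 - 1) = 12.5$
$t_p = 0.2 / \sqrt{12.5 / 15} = 0.2 / 0.913 = 0.22$
$df = N - 1 = 14$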
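The paired t statistic for problem 9.10 can be recomputed from the summary values that survive on the slide (sum of differences 3, sum of squared deviations 175, N = 15). The sketch below assumes those readings; `paired_t` is a hypothetical helper name.

```python
import math

def paired_t(sum_diff, sum_sq_dev, n):
    """Paired t from the sum of differences and the sum of squared deviations."""
    d_bar = sum_diff / n                 # mean difference
    var_d = sum_sq_dev / (n - 1)         # variance of the differences
    se_d = math.sqrt(var_d / n)          # standard error of the mean difference
    return d_bar / se_d, n - 1           # t statistic and degrees of freedom

t, df = paired_t(3, 175, 15)
print(f"t = {t:.3f}, df = {df}")  # t = 0.219, df = 14
```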
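9.11 Paired t test (pre/post)
Table columns: Before, After, Difference ($D$), $D^2$, with a Total row (individual scores not recovered).
$t_{\text{paired}} = t_p = \dfrac{\bar{d} - 0}{\text{standard error of } \bar{d}} = \dfrac{\bar{d} - 0}{\sqrt{S_d^2 / N}}$
$\bar{d} = \sum D / N = 15/30 = 0.5$, with $N = 30$
$\sum d^2 = \sum D^2 - (\sum D)^2 / N =$ …
$S_d^2 = \sum d^2 / (N - 1) = \sum d^2 / (30 - 1) =$ …
$t_p = 0.5 / \sqrt{S_d^2 / 30} =$ …
$df = N - 1 = 29$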
Pooled estimate of the SED ($SED_p$)
Estimate of the $SED_p$ of $\bar{x}_E - \bar{x}_C$ $= S_p \sqrt{\dfrac{1}{N_s} + \dfrac{1}{N_{ns}}}$
$S_p^2$ = pooled estimate of the variance
$S_p^2 = \dfrac{\sum (x_s - \bar{x}_s)^2 + \sum (x_{ns} - \bar{x}_{ns})^2}{N_s + N_{ns} - 2}$
Pooled estimate of the SED ($SED_p$)
Estimate of the $SED_p$ of $\bar{x}_E - \bar{x}_C$ $= S_p \sqrt{\dfrac{1}{14} + \dfrac{1}{18}}$
$S_p^2 = \dfrac{\ldots}{14 + 18 - 2} = \dfrac{\ldots}{30} =$ …
t-Test (Two Tailed)
$t = \dfrac{\bar{x}_s - \bar{x}_{ns} - 0}{\sqrt{S_p^2 \left[ (1/N_s) + (1/N_{ns}) \right]}}$
$df = N_s + N_{ns} - 2$
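The pooled two-sample t test can be sketched as follows. The group sizes 14 and 18 come from the slides; the means and sums of squared deviations in the example call are hypothetical placeholders, because the original data did not survive. The degrees of freedom are taken as $N_s + N_{ns} - 2$, which agrees with the 30 shown for groups of 14 and 18.

```python
import math

def pooled_t(mean_s, mean_ns, ss_s, ss_ns, n_s, n_ns):
    """Two-sample t using the pooled variance estimate Sp^2.

    ss_s and ss_ns are each group's sum of squared deviations from its mean.
    """
    df = n_s + n_ns - 2
    sp2 = (ss_s + ss_ns) / df                        # pooled variance
    sed_p = math.sqrt(sp2 * (1 / n_s + 1 / n_ns))    # pooled SED
    return (mean_s - mean_ns - 0) / sed_p, df

# Hypothetical means and sums of squares; only 14 and 18 are from the slides.
t, df = pooled_t(mean_s=10.0, mean_ns=8.0, ss_s=60.0, ss_ns=90.0, n_s=14, n_ns=18)
print(f"t = {t:.2f}, df = {df}")
```

The resulting t is compared with the two-tailed critical value for the same degrees of freedom.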
t-Test (Two Tailed)
$t = \dfrac{\bar{x}_s - \bar{x}_{ns} - 0}{\sqrt{S_p^2 \left[ (1/14) + (1/18) \right]}}$, with $df = 14 + 18 - 2 = 30$
$(1/14) + (1/18) = 0.127$
$t =$ …
Critical value = …
ANOVA
Analysis of Variance
Allows the statistician to analyze multiple data sets at once. Taking N groups two at a time requires N(N - 1)/2 pairwise comparisons. If individual z tests were performed on each pair among a large number of groups, the number of calculations would be prohibitive.
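How quickly the number of pairwise comparisons grows can be seen with a short sketch of the N(N - 1)/2 count; the function name is illustrative only.

```python
def pairwise_comparisons(num_groups):
    """Number of ways to take the groups two at a time: N(N - 1) / 2."""
    return num_groups * (num_groups - 1) // 2

for k in (3, 4, 10, 20):
    print(k, "groups ->", pairwise_comparisons(k), "comparisons")
```

Three groups need 3 comparisons, while twenty groups already need 190.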
Assumptions underlying the use of ANOVA
1. The individuals in the various subgroups should be selected on the basis of random sampling from normally distributed populations.
2. The variance of the subgroups should be homogeneous ($H_0$: $\sigma_1 = \sigma_2 = \dots = \sigma_n$).
3. The samples comprising the groups should be independent.
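Single classification ANOVA
Data table (individual scores not recovered): columns $X$ and $X^2$ for Group A, Group B, and Group C, with totals $\sum X$ and $\sum X^2$ and a mean $\bar{X}$ for each group.
$\bar{X}_t = 11.90$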
Values needed for ANOVA
The Total Sum of the Squares: $\sum x_t^2 = \sum X^2 - (\sum X)^2 / N$
The “Between” Sum of Squares: $\sum x_b^2 = \sum n (\bar{X} - \bar{X}_T)^2$
The “Within” Sum of Squares: $\sum x^2 = \sum X^2 - (\sum X)^2 / n$ for each group, or $\sum x_w^2 = \sum x_t^2 - \sum x_b^2$
The Degrees of Freedom: (number of groups - 1) plus, summed over the groups, (n within each group - 1)
Values needed for ANOVA
The Total Sum of the Squares: $\sum x_t^2 = \sum X^2 - (\sum X)^2 / N = \sum X^2 - (250)^2 / 21 =$ …
The “Between” Sum of Squares: $\sum x_b^2 = \sum \left[ (\sum X)^2 / n \right] - (\sum X_t)^2 / N = \left[ (82)^2/7 + (108)^2/7 + (60)^2/7 \right] - (250)^2/21 = 165.0$
The “Within” Sum of Squares: $\sum x^2 = \sum X^2 - (\sum X)^2 / n$ for each group, or $\sum x_w^2 = \sum x_t^2 - \sum x_b^2 =$ …
The Degrees of Freedom: $(3 - 1) + \left[ (7 - 1) + (7 - 1) + (7 - 1) \right] = 2 + 18 = 20$
ANOVA Table
Source of variation     df    Sum of Squares    Mean Square
“Between” Groups        2     165.0             82.5
“Within” Groups         18    …                 16.3
Total                   20    …
The F-Test
F = (mean square for “between” groups) / (mean square for “within” groups) = 5.06
“Between” df = 2; “Within” df = 18
Value of F needed for significance at the 5% level = 3.55 (Page 325)
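The F ratio can be recomputed from the numbers that survive in these slides. The sketch assumes a between-groups sum of squares of 165.0 on 2 degrees of freedom and takes 16.3, the within-groups variance used in the follow-up tests, as the within-groups mean square; `f_ratio` is a hypothetical helper.

```python
def f_ratio(ss_between, df_between, ms_within):
    """F = (between sum of squares / between df) / within mean square."""
    ms_between = ss_between / df_between
    return ms_between / ms_within

f = f_ratio(ss_between=165.0, df_between=2, ms_within=16.3)
critical_f = 3.55  # 5% level for 2 and 18 df, as quoted on the slide
print(f"F = {f:.2f}, significant at 5%: {f > critical_f}")
```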
Tests after the F test
$F = \dfrac{(\bar{X}_1 - \bar{X}_2)^2}{s_w^2 (N_1 + N_2) / (N_1 N_2)}$
A vs. B: $F = (11.71 - 15.43)^2 / [16.3(14)/49] = (3.72)^2 / 4.66 = 2.97$
A vs. C: $F = (11.71 - 8.57)^2 / [16.3(14)/49] = (3.14)^2 / 4.66 = 2.12$
B vs. C: $F = (15.43 - 8.57)^2 / [16.3(14)/49] = (6.86)^2 / 4.66 = 10.1$
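The three follow-up comparisons can be reproduced with a small helper. This is a sketch using the slide's group means (11.71, 15.43, 8.57), within-groups variance 16.3, and 7 observations per group; the name `pairwise_f` is an assumption for illustration.

```python
from itertools import combinations

def pairwise_f(mean1, mean2, var_within, n1, n2):
    """F for one pair: (difference of means)^2 / [s_w^2 (N1 + N2) / (N1 N2)]."""
    return (mean1 - mean2) ** 2 / (var_within * (n1 + n2) / (n1 * n2))

means = {"A": 11.71, "B": 15.43, "C": 8.57}
for g1, g2 in combinations(means, 2):
    f = pairwise_f(means[g1], means[g2], var_within=16.3, n1=7, n2=7)
    print(f"{g1} vs. {g2}: F = {f:.2f}")
```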
Page 181
Data table (individual scores not recovered): columns $X$ and $X^2$ for groups A, B, C, and D.
$\bar{X}_a = 1$, $\bar{X}_b = 7$, $\bar{X}_c = 5$, $\bar{X}_d = 4$, $\bar{X}_t = 4.25$
$\sum X = 85$, $\sum X^2 = 523$
Values needed for ANOVA
The Total Sum of the Squares: $\sum x_t^2 = \sum X^2 - (\sum X)^2 / N = 523 - (85)^2/20 = 523 - 361.25 = 161.75$
The “Between” Sum of Squares: $\sum x_b^2 = \sum \left[ (\sum X)^2 / n \right] - (\sum X_t)^2 / N = \left[ (5)^2/5 + (35)^2/5 + (25)^2/5 + (20)^2/5 \right] - (85)^2/20 = 93.75$
The “Within” Sum of Squares: $\sum x^2 = \sum X^2 - (\sum X)^2 / n$ for each group, or $\sum x_w^2 = \sum x_t^2 - \sum x_b^2 = 161.75 - 93.75 = 68$
The Degrees of Freedom: $(4 - 1) + \left[ (5 - 1) + (5 - 1) + (5 - 1) + (5 - 1) \right] = 3 + 16 = 19$
ANOVA Table
Source of variation     df    Sum of Squares    Mean Square
“Between” Groups        3     93.75             31.25
“Within” Groups         16    68                4.25
Total                   19    161.75
F = 31.25/4.25 = 7.35
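The whole ANOVA table for the Page 181 problem can be rebuilt from the summary values given on the slides (group totals 5, 35, 25, 20; five observations per group; sum of squared scores 523). The following is a minimal sketch; the function and key names are illustrative, not from the source.

```python
def anova_from_sums(group_sums, group_sizes, sum_of_squares):
    """One-way ANOVA table from group totals, group sizes, and the sum of X^2."""
    n_total = sum(group_sizes)
    k = len(group_sums)
    correction = sum(group_sums) ** 2 / n_total          # (sum X)^2 / N
    ss_total = sum_of_squares - correction
    ss_between = sum(s * s / n for s, n in zip(group_sums, group_sizes)) - correction
    ss_within = ss_total - ss_between
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_total": ss_total, "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

table = anova_from_sums([5, 35, 25, 20], [5, 5, 5, 5], 523)
for name, value in table.items():
    print(f"{name}: {value:.2f}")
```

With these inputs the sketch gives sums of squares 161.75, 93.75, and 68, mean squares 31.25 and 4.25, and F of about 7.35.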
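Tukey’s HSD test
$\alpha = 0.05$, $k = 4$, $N - k = 16$; Appendix C: $q = 4.05$
$HSD = q \sqrt{MS_w / n} = 4.05 \sqrt{4.25 / 5} = 4.05(0.922) = 3.73$
Pair    Mean Difference
A-B     6
A-C     4
A-D     3
B-C     2
B-D     3
C-D     1
Only the A-B and A-C differences exceed HSD = 3.73.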
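The HSD value and the pairs that exceed it can be checked with a short sketch. It assumes the usual form HSD = q * sqrt(MS within / n per group), with q = 4.05 from the slide, a within-groups mean square of 4.25, five observations per group, and group means 1, 7, 5, and 4; the helper name is hypothetical.

```python
import math
from itertools import combinations

def tukey_hsd(q, ms_within, n_per_group):
    """Honestly significant difference: q * sqrt(MS_within / n)."""
    return q * math.sqrt(ms_within / n_per_group)

means = {"A": 1, "B": 7, "C": 5, "D": 4}
hsd = tukey_hsd(q=4.05, ms_within=4.25, n_per_group=5)

# A pair differs significantly when its mean difference exceeds the HSD.
significant = [
    f"{g1}-{g2}"
    for g1, g2 in combinations(means, 2)
    if abs(means[g1] - means[g2]) > hsd
]
print(f"HSD = {hsd:.2f}")   # HSD = 3.73
print(significant)          # ['A-B', 'A-C']
```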
CHAPTER 11 Inferences Regarding Proportions
OUTLINE
11.1 INFERENCES WITH QUALITATIVE DATA
Discusses the problem of inference in qualitative data
11.2 MEAN AND STANDARD DEVIATION OF THE BINOMIAL DISTRIBUTION
Explains how to compute a mean and a standard deviation for the binomial distribution
11.3 APPROXIMATION OF THE NORMAL TO THE BINOMIAL DISTRIBUTION
Shows that, using the normal approximation, it is possible to compute a Z score for a number of successes
11.4 TEST OF SIGNIFICANCE OF A BINOMIAL PROPORTION
Gives instructions on how to test hypotheses regarding proportions if the distribution of the proportion of successes is known
11.5 TEST OF SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO PROPORTIONS
Illustrates that, because the difference between two proportions is approximately normally distributed, a hypothesis test for the difference may be easily set up
11.6 CONFIDENCE INTERVALS
Discusses and illustrates confidence intervals for a binomial proportion and for the difference between two proportions
LEARNING OBJECTIVES
1. Compute the mean and the standard deviation of a binomial distribution
2. Compute Z scores for specific points on a binomial distribution
3. Perform significance tests of a binomial proportion and of the difference between two binomial proportions
4. Calculate confidence intervals for a binomial proportion and for the difference between two proportions
INFERENCES WITH QUALITATIVE DATA
A. Qualitative data: data for which individual quantitative measurements are not available but that relate to the presence or absence of some characteristic
B. $p$ is the estimate of the true proportion, $\pi$, of individuals who possess a certain characteristic
C. To best understand the difference between the distribution of binomial events ($x$) and the distribution of the binomial proportion ($p$):
–1. Compare these distributions with those in the approximately analogous quantitative situation
–2. The $x$’s of a binomial distribution with a mean and a standard error
MEAN AND STANDARD DEVIATION OF THE BINOMIAL DISTRIBUTION
A. Probability of $x$ successful outcomes in $n$ independent trials is given by: $P(x) = \dfrac{n!}{x!(n - x)!} \pi^x (1 - \pi)^{n - x}$
–1. where $\pi$ is the probability of a success in one individual trial; $P(x)$ will be used to designate the probability of $x$ successful outcomes
B. In a binomial distribution the mean for the number of successes, $x$, is $n\pi$ and the standard deviation is $\sqrt{n\pi(1 - \pi)}$
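The binomial probability, mean, and standard deviation can be illustrated with the standard formulas. In this sketch the parameter name `pi` stands for the probability of success in a single trial; the function names and example values are illustrative and not from the slides.

```python
import math

def binomial_pmf(x, n, pi):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * pi**x * (1 - pi) ** (n - x)

def binomial_mean_sd(n, pi):
    """Mean n*pi and standard deviation sqrt(n*pi*(1 - pi)) of the success count."""
    return n * pi, math.sqrt(n * pi * (1 - pi))

print(binomial_pmf(2, 4, 0.5))       # 0.375
print(binomial_mean_sd(100, 0.5))    # (50.0, 5.0)
```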
APPROXIMATION OF THE NORMAL TO THE BINOMIAL DISTRIBUTION
A. The normal distribution is a reasonable approximation to the binomial distribution when $n$ is large
B. We can find a point on the Z distribution that corresponds to a point $x$ on the binomial distribution by using $Z = \dfrac{x - n\pi}{\sqrt{n\pi(1 - \pi)}}$
APPROXIMATION OF THE NORMAL TO THE BINOMIAL DISTRIBUTION
C. Because we are using a normal (continuous) distribution to approximate a discrete one, we may apply the continuity correction to achieve an adjustment. The correction is made by subtracting ½ from the absolute value of the numerator, that is, $Z = \dfrac{|x - n\pi| - \tfrac{1}{2}}{\sqrt{n\pi(1 - \pi)}}$
D. When $n$ is very large and $\pi$ is very small, another important distribution, the Poisson distribution, is a good approximation to the binomial
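The Z score for a number of successes, with and without the continuity correction, can be sketched as below. The function name, the `continuity` flag, and the example of 60 successes in 100 trials with a success probability of 0.5 are assumptions for illustration.

```python
import math

def binomial_z(x, n, pi, continuity=True):
    """Z score for x successes using the normal approximation to the binomial."""
    mean = n * pi
    sd = math.sqrt(n * pi * (1 - pi))
    if continuity:
        # Subtract 1/2 from the absolute value of the numerator, so the
        # corrected score is returned as a magnitude.
        return (abs(x - mean) - 0.5) / sd
    return (x - mean) / sd

print(binomial_z(60, 100, 0.5))                     # corrected
print(binomial_z(60, 100, 0.5, continuity=False))   # uncorrected
```

The correction pulls the score slightly toward zero, which makes the test a little more conservative.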
TEST OF SIGNIFICANCE OF A BINOMIAL PROPORTION
A. The mean of the distribution of a binomial proportion $p$ is given by the population parameter $\pi$, and the standard error of $p$ is given by $SE(p) = \sqrt{\pi(1 - \pi)/n}$
B. Because $p$ is approximately normally distributed when $n$ is reasonably large, we can find the Z score corresponding to a particular $p$, $Z = \dfrac{p - \pi}{\sqrt{\pi(1 - \pi)/n}}$, and perform a test of significance
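A test of a single proportion can be sketched in a few lines. The function name, the hypothesized proportion of 0.5, the observed proportion of 0.6, and the sample size of 100 are illustrative assumptions, and 1.96 is the usual two-tailed 5% critical value.

```python
import math

def proportion_z(p, pi0, n):
    """Z score for an observed proportion p against a hypothesized proportion pi0."""
    standard_error = math.sqrt(pi0 * (1 - pi0) / n)
    return (p - pi0) / standard_error

z = proportion_z(p=0.6, pi0=0.5, n=100)
print(f"Z = {z:.2f}, significant at 5% (two-tailed): {abs(z) > 1.96}")
```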
TEST OF SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO PROPORTIONS
A. In order to compare proportions from two different samples we must:
–1. assume that the proportions are equal, that is, $\pi_1 = \pi_2$, in estimating the standard error
–2. learn if $p_1$, the proportion with the given characteristic in one sample, differs significantly from $p_2$, the proportion with the same characteristic in the second sample
B. Three things that must be known to determine if the proportions are significantly different:
–1. the distribution of the differences, $p_1 - p_2$
–2. the mean, $\pi_1 - \pi_2$
–3. the standard error of this distribution, $SE(p_1 - p_2)$
C. Statisticians have shown that $p_1 - p_2$ follows a nearly normal distribution
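TEST OF SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO PROPORTIONS
D. The standard error is estimated by $SE(p_1 - p_2) = \sqrt{p' q' \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}$, where $p' = \dfrac{x_1 + x_2}{n_1 + n_2}$ and $q' = 1 - p'$, and $p_1 = x_1/n_1$, $p_2 = x_2/n_2$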
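TEST OF SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO PROPORTIONS
Knowing the mean and the standard error of the distribution of differences, we can calculate a Z score: $Z = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{SE(p_1 - p_2)}$
If $\pi_1 = \pi_2$, the formula for $Z$ is $Z = \dfrac{p_1 - p_2}{SE(p_1 - p_2)}$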
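The two-proportion Z test can be sketched as follows, assuming the standard pooled estimate of the standard error under the hypothesis of equal proportions. The function name and the example counts (60 of 100 versus 40 of 100) are hypothetical.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z for the difference between two sample proportions, pooled under H0."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)        # common proportion assumed under H0
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

z = two_proportion_z(x1=60, n1=100, x2=40, n2=100)
print(f"Z = {z:.3f}")  # Z = 2.828
```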
CONFIDENCE INTERVALS
A. Although hypothesis testing is useful, we often go a step further to learn:
–1. the true proportion
–2. the true difference in proportion between the baseline data and the revised data
B. To answer these questions we compute confidence intervals for $\pi$ and for $\pi_1 - \pi_2$ by employing a method similar to the one used for computing confidence intervals for $\mu$ and $\mu_1 - \mu_2$
CONFIDENCE INTERVALS
C. Confidence interval for $\pi$
Chapter 8 version: $\bar{x} \pm Z \dfrac{\sigma}{\sqrt{n}}$
Similar version: $p \pm Z \sqrt{\dfrac{\pi(1 - \pi)}{n}}$
This expression presents a dilemma: it requires that we know $\pi$, which is unknown. The solution is to have a sufficiently large sample size, permitting the use of $p$ as an estimate of $\pi$. The expression then becomes $p \pm Z \sqrt{\dfrac{p(1 - p)}{n}}$
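A confidence interval for a single proportion, with the sample proportion substituted for the unknown population proportion, can be sketched like this. The function name, the default Z of 1.96 (a 95% interval), and the example values are assumptions for illustration.

```python
import math

def proportion_ci(p, n, z=1.96):
    """Large-sample confidence interval: p +/- z * sqrt(p(1 - p) / n)."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

low, high = proportion_ci(p=0.5, n=100)
print(f"95% CI: ({low:.3f}, {high:.3f})")  # 95% CI: (0.402, 0.598)
```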
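CONFIDENCE INTERVALS
D. Confidence interval for $\pi_1 - \pi_2$
The confidence interval for the difference of two means is: $(\bar{x}_1 - \bar{x}_2) \pm Z \cdot SE(\bar{x}_1 - \bar{x}_2)$
The confidence interval for the difference of two proportions is similar: $(p_1 - p_2) \pm Z \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$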
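The interval for a difference between two proportions can be sketched with the usual unpooled standard error, which is an assumption here because the slide's own formula did not survive. The function name and example values are hypothetical.

```python
import math

def proportion_diff_ci(p1, n1, p2, n2, z=1.96):
    """Large-sample confidence interval for the difference p1 - p2."""
    # Unpooled standard error: each sample contributes its own variance term.
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * standard_error, diff + z * standard_error

low, high = proportion_diff_ci(p1=0.6, n1=100, p2=0.4, n2=100)
print(f"95% CI for the difference: ({low:.3f}, {high:.3f})")
```

An interval that excludes zero points to the same conclusion as a significant two-tailed test at the matching level.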
CONCLUSION The normal approximation to the binomial distribution is a useful statistical tool. It helps answer questions regarding qualitative data involving proportions where individuals are classified into two categories. With an understanding of the distribution of the binomial proportion p and of the distribution of the difference between two proportions we can perform tests of significance and calculate confidence intervals.