Comparing k Populations


1 Comparing k Populations
Means – One way Analysis of Variance (ANOVA)

2 The F test – for comparing k means
Situation: We have k normal populations. Let $\mu_i$ and $\sigma_i$ denote the mean and standard deviation of population i, for i = 1, 2, 3, …, k. Note: we assume that the standard deviation is the same for each population: $\sigma_1 = \sigma_2 = \dots = \sigma_k = \sigma$

3 We want to test $H_0: \mu_1 = \mu_2 = \dots = \mu_k$ against $H_A: \mu_i \ne \mu_j$ for at least one pair i, j

4 The ANOVA Table
A convenient method for displaying the calculations for the F-test

5 ANOVA Table
Source     d.f.     Sum of Squares    Mean Square    F-ratio
Between    k - 1    SSBetween         MSBetween      MSB/MSW
Within     N - k    SSWithin          MSWithin
Total      N - 1    SSTotal

6 To Compute F (and the ANOVA table entries):
1) 2) 3) 4) 5)

7 Then 1) 2) 3) 4)
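
The numbered formulae on the two slides above were not captured in the transcript. As a rough sketch only, the ANOVA table entries can be computed from the textbook definitions of the between and within sums of squares; this may not match the slide's own computing formulae step for step. The function name `one_way_anova` and the three small samples are hypothetical, chosen for illustration.

```python
from typing import Dict, Sequence


def one_way_anova(samples: Sequence[Sequence[float]]) -> Dict[str, float]:
    """Return the entries of a one-way ANOVA table for k independent samples."""
    k = len(samples)                               # number of populations
    n_total = sum(len(s) for s in samples)         # N = total number of observations
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]

    # Between: spread of the sample means about the grand mean.
    ss_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within: spread of the observations about their own sample mean.
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return {
        "df_between": k - 1,
        "df_within": n_total - k,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical data: three samples of three observations each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

For these made-up samples the F-ratio comes out at about 13 on (2, 6) degrees of freedom; it would then be compared with a tabulated F critical value.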

8 The $\chi^2$ test for independence

9 Situation
We have two categorical variables R and C. The number of categories of R is r. The number of categories of C is c. We observe n subjects from the population and count $x_{ij}$ = the number of subjects for which R = i and C = j. (R = rows, C = columns)

10 Example Both systolic blood pressure (C) and serum cholesterol (R) were measured for a sample of n = 1237 subjects. The categories for blood pressure are: < The categories for cholesterol are: <

11 Table: two-way frequency

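12 The $\chi^2$ test for independence
Define $E_{ij} = \frac{R_i C_j}{n}$ = expected frequency in the (i, j)th cell in the case of independence, where $R_i$ is the total of row i and $C_j$ is the total of column j.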

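13 Justification for $E_{ij} = (R_i C_j)/n$ in the case of independence
Let $\pi_{ij} = P[R = i, C = j] = P[R = i]\,P[C = j] = \rho_i \gamma_j$ in the case of independence, where $\rho_i = P[R = i]$ and $\gamma_j = P[C = j]$. Then $n\pi_{ij} = n\rho_i\gamma_j$ is the expected frequency in the (i, j)th cell in the case of independence. Estimating $\rho_i$ by the row proportion $R_i/n$ and $\gamma_j$ by the column proportion $C_j/n$ gives $E_{ij} = n \cdot \frac{R_i}{n} \cdot \frac{C_j}{n} = \frac{R_i C_j}{n}$.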

14 $H_0$: R and C are independent
Then to test $H_0$: R and C are independent against $H_A$: R and C are not independent, use the test statistic
$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(x_{ij} - E_{ij})^2}{E_{ij}}$
$E_{ij}$ = expected frequency in the (i, j)th cell in the case of independence
$x_{ij}$ = observed frequency in the (i, j)th cell

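15 Sampling distribution of test statistic when $H_0$ is true
- $\chi^2$ distribution with degrees of freedom $\nu = (r - 1)(c - 1)$
Critical and Acceptance Region
Reject $H_0$ if: $\chi^2 > \chi^2_{\alpha}$
Accept $H_0$ if: $\chi^2 \le \chi^2_{\alpha}$
Here $\chi^2_{\alpha}$ is the upper-$\alpha$ critical value of the $\chi^2$ distribution with $\nu$ degrees of freedom.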
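
A minimal sketch of the test just described, using only the Python standard library. The function name `chi_square_independence` and the 2 × 2 table of counts are hypothetical (the blood pressure and cholesterol frequencies are not preserved in this transcript), and the critical value 3.841 is the standard tabulated upper 5% point of the χ² distribution with 1 degree of freedom.

```python
from typing import List, Sequence, Tuple


def chi_square_independence(
    table: Sequence[Sequence[float]],
) -> Tuple[float, int, List[List[float]]]:
    """Return (statistic, degrees of freedom, expected frequencies) for an r x c table."""
    row_totals = [sum(row) for row in table]           # R_i
    col_totals = [sum(col) for col in zip(*table)]     # C_j
    n = sum(row_totals)

    # Expected frequency under independence: E_ij = R_i * C_j / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over all cells
    statistic = sum(
        (x - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for x, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical 2 x 2 table of observed counts.
observed = [[10, 20], [30, 40]]
statistic, df, expected = chi_square_independence(observed)

CRITICAL_05_DF1 = 3.841          # tabulated value, alpha = 0.05, 1 degree of freedom
reject_h0 = statistic > CRITICAL_05_DF1
print(statistic, df, reject_h0)
```

With these made-up counts the statistic is about 0.79, well below 3.841, so independence would not be rejected at the 5% level.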

16

17 Standardized residuals
Test statistic (value not preserved), degrees of freedom $\nu = (r - 1)(c - 1) = 9$
Reject $H_0$ using $\alpha = 0.05$
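
The slide does not show how the standardized residuals are defined. One common choice, assumed here, is the Pearson form $(x_{ij} - E_{ij})/\sqrt{E_{ij}}$, whose squares add up to the $\chi^2$ statistic. The sketch below uses that assumption; the function name and the counts are hypothetical.

```python
import math
from typing import List, Sequence


def standardized_residuals(table: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pearson residuals (x_ij - E_ij) / sqrt(E_ij); assumed form, not taken from the slide."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for row, r in zip(table, row_totals):
        res_row = []
        for x, c in zip(row, col_totals):
            e = r * c / n                    # expected frequency under independence
            res_row.append((x - e) / math.sqrt(e))
        residuals.append(res_row)
    return residuals


# Hypothetical 2 x 2 table of observed counts.
residuals = standardized_residuals([[10, 20], [30, 40]])
for res_row in residuals:
    print([round(v, 3) for v in res_row])
```

Cells with large residuals in absolute value are the ones contributing most to the test statistic, which is the usual reason for inspecting them after rejecting independence.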

18 Linear Regression
Hypothesis testing and Estimation

19 The Least Squares Line
Fitting the best straight line to “linear” data

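20 The equation for the least squares line
Let
$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$
$S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2$
$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$

21 Computing Formulae:
$S_{xx} = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$
$S_{yy} = \sum y_i^2 - \frac{(\sum y_i)^2}{n}$
$S_{xy} = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}$

22 Then the slope of the least squares line can be shown to be:
$b = \frac{S_{xy}}{S_{xx}}$

23 and the intercept of the least squares line can be shown to be:
$a = \bar{y} - b\,\bar{x}$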
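
As a small illustration of the slope and intercept, the sketch below uses the standard least squares results: slope = (sum of cross-products about the means) / (sum of squared x deviations), and intercept = mean of y minus slope times mean of x. The function name and the five data points are hypothetical.

```python
from typing import Sequence, Tuple


def least_squares_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept a, slope b) of the least squares line y = a + b x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)                        # squared x deviations
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # cross-products
    b = s_xy / s_xx              # slope
    a = y_bar - b * x_bar        # intercept
    return a, b


# Hypothetical data, not the fire data from the example later in the deck.
a, b = least_squares_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(a, b)
```

For these made-up points the fitted line is approximately y = 2.2 + 0.6x.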

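24 The residual sum of Squares
$\text{RSS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - a - b x_i)^2$, where a and b are the least squares intercept and slope
Computing formula: $\text{RSS} = S_{yy} - \frac{S_{xy}^2}{S_{xx}}$, where $S_{xx}$ and $S_{yy}$ are the sums of squared deviations of x and y about their means and $S_{xy}$ is the sum of cross-products

25 Estimating $\sigma$, the standard deviation in the regression model:
$s = \sqrt{\frac{\text{RSS}}{n - 2}}$
Computing formula: $s = \sqrt{\frac{1}{n - 2}\left[ S_{yy} - \frac{S_{xy}^2}{S_{xx}} \right]}$
This estimate of $\sigma$ is said to be based on n – 2 degrees of freedom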
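
The sketch below computes the residual sum of squares two ways, directly from the residuals and through the shortcut (sum of squared y deviations) minus (cross-products squared) / (sum of squared x deviations), and then the estimate of the standard deviation on n – 2 degrees of freedom. These are the standard textbook forms, assumed to match the slide's missing formulae. The function name and data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def residual_summary(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (RSS from residuals, RSS from the shortcut formula, estimate s)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b = s_xy / s_xx                     # least squares slope
    a = y_bar - b * x_bar               # least squares intercept

    rss_direct = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    rss_shortcut = s_yy - s_xy ** 2 / s_xx

    s = math.sqrt(rss_direct / (n - 2))   # based on n - 2 degrees of freedom
    return rss_direct, rss_shortcut, s


# Hypothetical data.
rss_direct, rss_shortcut, s = residual_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(rss_direct, rss_shortcut, s)
```

Both routes give a residual sum of squares of about 2.4 for these points, so s is about 0.894 on 3 degrees of freedom.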

26 Sampling distributions of the estimators

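27 The sampling distribution of the slope of the least squares line:
It can be shown that b has a normal distribution with mean $\beta$ and standard deviation $\frac{\sigma}{\sqrt{S_{xx}}}$, where $S_{xx} = \sum (x_i - \bar{x})^2$

28 The sampling distribution of the intercept of the least squares line:
It can be shown that a has a normal distribution with mean $\alpha$ and standard deviation $\sigma \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}}$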

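29 Estimating $\sigma$, the standard deviation in the regression model:
Computing formula: $s = \sqrt{\frac{\text{RSS}}{n - 2}}$, where $\text{RSS} = \sum (y_i - a - b x_i)^2$ is the residual sum of squares
This estimate of $\sigma$ is said to be based on n – 2 degrees of freedom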

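30 (1 – $\alpha$)100% Confidence Limits for slope $\beta$:
$b \pm t_{\alpha/2} \frac{s}{\sqrt{S_{xx}}}$, where $S_{xx} = \sum (x_i - \bar{x})^2$
$t_{\alpha/2}$ = critical value for the t-distribution with n – 2 degrees of freedom

31 (1 – $\alpha$)100% Confidence Limits for intercept $\alpha$:
$a \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}}$
$t_{\alpha/2}$ = critical value for the t-distribution with n – 2 degrees of freedom (the $\alpha$ in $t_{\alpha/2}$ is the significance level, not the intercept)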
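
A minimal sketch of these confidence limits, assuming the standard forms: estimate ± t × standard error, with standard error s/√(sum of squared x deviations) for the slope and s·√(1/n + x̄²/(sum of squared x deviations)) for the intercept. The Python standard library has no t quantile function, so the critical value is passed in from a table; 3.182 is the tabulated 2.5% point for 3 degrees of freedom. The function name and data are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def regression_confidence_limits(
    x: Sequence[float], y: Sequence[float], t_crit: float
) -> Dict[str, Tuple[float, float]]:
    """Confidence limits for slope and intercept; t_crit must be for n - 2 d.f."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b = s_xy / s_xx
    a = y_bar - b * x_bar
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))                        # estimate of sigma

    se_b = s / math.sqrt(s_xx)                          # standard error of the slope
    se_a = s * math.sqrt(1 / n + x_bar ** 2 / s_xx)     # standard error of the intercept
    return {
        "slope": (b - t_crit * se_b, b + t_crit * se_b),
        "intercept": (a - t_crit * se_a, a + t_crit * se_a),
    }


# Hypothetical data with n = 5, so the t critical value has 3 degrees of freedom.
limits = regression_confidence_limits([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], t_crit=3.182)
print(limits)
```

With so few made-up points the 95% interval for the slope is wide, roughly -0.30 to 1.50, and includes zero.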

32 Example In this example we are studying building fires in a city and are interested in the relationship between: X = the distance between the closest fire hall and the building that puts out the alarm and Y = cost of the damage ($1000). The data were collected on n = 15 fires.

33 The Data

34 Scatter Plot

35 Computations

36 Computations Continued

37 Computations Continued

38 Computations Continued

39 Least Squares Line
y = 4.92x + 10.28

40 95% Confidence Limits for slope $\beta$:
4.07 to 5.77
$t_{.025} = 2.160$ = critical value for the t-distribution with 13 degrees of freedom

41 95% Confidence Limits for intercept $\alpha$:
7.21 to 13.35
$t_{.025} = 2.160$ = critical value for the t-distribution with 13 degrees of freedom
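
The raw fire data and the intermediate computations are not preserved in this transcript, but the reported limits can still be checked for internal consistency. Symmetric limits of the form estimate ± t × (standard error) have the estimate at their midpoint, and their half-width divided by t gives the implied standard error. The sketch below assumes the tabulated critical value 2.160 for 13 degrees of freedom; the helper name is hypothetical.

```python
from typing import Tuple

T_CRIT_13_DF = 2.160  # assumed: tabulated t_.025 for 13 degrees of freedom


def midpoint_and_std_error(lower: float, upper: float, t_crit: float) -> Tuple[float, float]:
    """Recover the point estimate and implied standard error from symmetric limits."""
    midpoint = (lower + upper) / 2
    std_error = (upper - lower) / 2 / t_crit
    return midpoint, std_error


slope_est, slope_se = midpoint_and_std_error(4.07, 5.77, T_CRIT_13_DF)
intercept_est, intercept_se = midpoint_and_std_error(7.21, 13.35, T_CRIT_13_DF)

print(round(slope_est, 2), round(slope_se, 3))
print(round(intercept_est, 2), round(intercept_se, 3))
```

The midpoints, 4.92 and 10.28, agree with the slope and intercept of the fitted line y = 4.92x + 10.28, and the implied standard errors are roughly 0.39 and 1.42.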

