Applied Business Statistics, 7th ed. by Ken Black


1 Applied Business Statistics, 7th ed. by Ken Black
Chapter 11 Analysis of Variance

2 Learning Objectives
Understand the differences between various experimental designs and when to use them. Compute and interpret the results of a one-way ANOVA. Compute and interpret the results of a randomized block design. Compute and interpret the results of a two-way ANOVA. Understand and interpret interactions between variables. Know when and how to use multiple comparison techniques.

3 Introduction to Design of Experiments
Experimental Design A plan and a structure to test hypotheses in which the researcher controls or manipulates one or more variables.

4 Introduction to Design of Experiments
Independent Variable
Treatment variable - one that the experimenter controls or modifies in the experiment.
Classification variable - a characteristic of the experimental subjects that was present prior to the experiment and is not a result of the experimenter’s manipulations or control.
Levels or Classifications - the subcategories of the independent variable used by the researcher in the experimental design.
Independent variables are also referred to as factors.

5 Independent Variable
Manipulation of the independent variable depends on the concept being studied. The researcher studies the phenomenon under conditions of varying aspects of the variable.

6 Introduction to Design of Experiments
Dependent Variable - the response to the different levels of the independent variables. Analysis of Variance (ANOVA) - a group of statistical techniques used to analyze experimental designs. ANOVA begins with the notion that the individual items being studied are not all the same.

7 Three Types of Experimental Designs
Completely Randomized Design – subjects are assigned randomly to treatments; single independent variable. Randomized Block Design – includes a blocking variable; single independent variable. Factorial Experiments – two or more independent variables are explored at the same time; every level of each factor is studied under every level of all other factors.

8 Completely Randomized Design
The completely randomized design contains only one independent variable with two or more treatment levels. If only two treatment levels of the independent variable are present, the design is the same as the one used to test the difference in means of two independent populations in Chapter 10, where the t test was used to analyze the data.

9 Completely Randomized Design
With more than two treatment levels, testing the means two at a time builds up the overall error rate. ANOVA is a technique that analyzes all the sample means at one time and precludes this buildup. A completely randomized design is analyzed by a one-way analysis of variance (one-way ANOVA).

10 One-Way ANOVA: Procedural Overview

11 Analysis of Variance
The null hypothesis states that the population means for all treatment levels are equal. If even one of the population means differs from the others, the null hypothesis is rejected. Testing the hypothesis is done by partitioning the total variance of the data into two variances: the variance resulting from the treatment (columns), and the error variance, that is, the portion of the total variance unexplained by the treatment.

12 One-Way ANOVA: Sums of Squares Definitions
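As a sketch, the usual definitional forms of the three sums of squares for a one-way design are given here. The notation is assumed rather than taken from the slide: C treatment levels, n_j observations in level j, x_ij the i-th observation in level j, x̄_j the mean of level j, and x̄ the grand mean.

```latex
\begin{aligned}
SST &= \sum_{j=1}^{C}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{x}\right)^2 \\
SSC &= \sum_{j=1}^{C} n_j\left(\bar{x}_j-\bar{x}\right)^2 \\
SSE &= \sum_{j=1}^{C}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{x}_j\right)^2 \\
SST &= SSC + SSE
\end{aligned}
```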

13 Analysis of Variance
The total sum of squares of variation is partitioned into the sum of squares of treatment (columns) and the sum of squares of error. ANOVA compares the relative sizes of the treatment variation and the error variation. The error variation is unaccounted-for variation and can be viewed at this point as variation due to individual differences within the groups. If a significant difference in treatments is present, the treatment variation should be large relative to the error variation.

14 One-Way ANOVA: Computational Formulas

15 One-Way ANOVA: Computational Formulas
ANOVA is used to determine statistically whether the variance between the treatment level means is greater than the variances within levels (error variance). Several assumptions underlie ANOVA: the populations are normally distributed; the observations represent random samples from the populations; the variances of the populations are equal.

16 One-Way ANOVA: Computational Formulas
ANOVA is computed with the three sums of squares: Total – Total Sum of Squares (SST); a measure of all variations in the dependent variable Treatment – Sum of Squares Columns (SSC); measures the variations between treatments or columns since independent variable levels are present in columns Error – Sum of Squares of Error (SSE); yields the variations within treatments (or columns)

17 One-Way ANOVA: Preliminary Calculations
Operator   1        2        3        4
           6.33     6.26     6.44     6.29
           6.26     6.36     6.38     6.23
           6.31     6.23     6.58     6.19
           6.29     6.27     6.54     6.21
           6.40     6.19     6.56
                    6.50     6.34
                    6.19     6.58
                    6.22
Tj         31.59    50.22    45.42    24.92     T = 152.15
nj         5        8        7        4         N = 24
Mean       6.3180   6.2775   6.4886   6.2300
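The preliminary calculations above feed directly into the sums of squares. The following is a minimal sketch in plain Python, not the textbook's own procedure. It assumes the valve readings are assigned to operators so that each column reproduces the totals Tj on the slide; repeated readings that the transcript shows only once are restored on that assumption. The function name `one_way_anova` is hypothetical.

```python
# Valve openings (cm) by operator. Assumption: repeated readings are restored
# so that each column sums to the slide's totals (31.59, 50.22, 45.42, 24.92).
valve_openings = {
    1: [6.33, 6.26, 6.31, 6.29, 6.40],
    2: [6.26, 6.36, 6.23, 6.27, 6.19, 6.50, 6.19, 6.22],
    3: [6.44, 6.38, 6.58, 6.54, 6.56, 6.34, 6.58],
    4: [6.29, 6.23, 6.19, 6.21],
}


def one_way_anova(groups):
    """Return (SSC, SSE, SST, F) for a dict of treatment level -> observations."""
    all_values = [x for obs in groups.values() for x in obs]
    n_total = len(all_values)
    n_levels = len(groups)
    grand_mean = sum(all_values) / n_total

    # Between-treatment variation: squared distance of each level mean
    # from the grand mean, weighted by the level's sample size.
    ssc = sum(
        len(obs) * (sum(obs) / len(obs) - grand_mean) ** 2
        for obs in groups.values()
    )
    # Within-treatment (error) variation around each level's own mean.
    sse = sum(
        (x - sum(obs) / len(obs)) ** 2
        for obs in groups.values()
        for x in obs
    )
    # Total variation around the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)

    msc = ssc / (n_levels - 1)       # df = C - 1
    mse = sse / (n_total - n_levels)  # df = N - C
    return ssc, sse, sst, msc / mse


ssc, sse, sst, f_value = one_way_anova(valve_openings)
print(f"SSC={ssc:.5f}  SSE={sse:.5f}  SST={sst:.5f}  F={f_value:.2f}")
```

With these data the F value agrees with the 10.18 quoted in the procedural summary, and SSC + SSE equals SST up to floating-point error.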

18 One-Way ANOVA: Sum of Squares Calculations

19 One-Way ANOVA: Sum of Squares Calculations

20 One-Way ANOVA: Computational Formulas
Other items: MSC - mean square of columns; MSE - mean square of error; MST - mean square total. The F value is the ratio of the treatment variance to the error variance, computed by dividing MSC by MSE.
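Written out, and assuming the usual degrees of freedom for a one-way design with C treatment levels and N total observations, the mean squares and the F ratio take the following form:

```latex
\begin{aligned}
df_C &= C - 1, \qquad df_E = N - C, \qquad df_T = N - 1 \\
MSC &= \frac{SSC}{df_C}, \qquad MSE = \frac{SSE}{df_E} \\
F &= \frac{MSC}{MSE}
\end{aligned}
```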

21 One-Way ANOVA: Mean Square and F Calculations

22 Analysis of Variance for Valve Openings
Source of Variance   df    SS        MS         F
Between              3     0.23658   0.078860   10.18
Error                20    0.15492   0.007746
Total                23    0.39150

23 F Table
The F distribution table is in Table A7. Associated with every F value in the table are two df values: the degrees of freedom in the numerator and the degrees of freedom in the denominator. Statistical software packages for computing ANOVA usually give a probability for the F value, which allows hypothesis-testing decisions for any value of alpha.

24 A Portion of the F Table for α = 0.05
Columns: numerator degrees of freedom (df1). Rows: denominator degrees of freedom (df2).

df2 \ df1   1        2        3        4        5        6        7        8        9
1           161.45   199.50   215.71   224.58   230.16   233.99   236.77   238.88   240.54
18          4.41     3.55     3.16     2.93     2.77     2.66     2.58     2.51     2.46
19          4.38     3.52     3.13     2.90     2.74     2.63     2.54     2.48     2.42
20          4.35     3.49     3.10     2.87     2.71     2.60     2.51     2.45     2.39
21          4.32     3.47     3.07     2.84     2.68     2.57     2.49     2.42     2.37

25 One-Way ANOVA: Procedural Summary
Since F = 10.18 > Fc = 3.10, reject H0.
[Figure: F distribution showing the nonrejection region, the critical value, and the rejection region of area α]

26 Excel Output for the Valve Opening Example
Anova: Single Factor

SUMMARY
Groups       Count   Sum     Average   Variance
Operator 1   5       31.59   6.318
Operator 2   8       50.22   6.2775
Operator 3   7       45.42
Operator 4   4       24.92   6.23

ANOVA
Source of Variation   SS   df   MS   F   P-value   F crit
Between Groups             3
Within Groups              20
Total                      23

27 MINITAB Output for the Valve Opening Example
Copyright 2011 John Wiley & Sons, Inc.

28 F and t Values
Analysis of variance can be used to test hypotheses about the difference in two means. Analysis of data from two samples by both a t test and an ANOVA shows that the observed F value equals the observed t value squared: F = t^2. The t test of independent samples is actually a special case of one-way ANOVA when there are only two treatment levels.
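The F = t^2 relationship can be checked numerically. The sketch below uses two small made-up samples (hypothetical values, not from the text) and compares the pooled-variance t statistic with the one-way ANOVA F statistic for the same two groups. The helper names are illustrative only.

```python
from math import sqrt

# Hypothetical illustrative samples; any two samples would do.
sample_a = [4.0, 5.0, 6.0, 5.5]
sample_b = [6.5, 7.0, 8.0]


def mean(values):
    return sum(values) / len(values)


def sum_sq_dev(values):
    """Sum of squared deviations of the values from their own mean."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)


def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance."""
    pooled_var = (sum_sq_dev(a) + sum_sq_dev(b)) / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))


def one_way_f(a, b):
    """One-way ANOVA F statistic for exactly two treatment levels."""
    grand = mean(a + b)
    ssc = len(a) * (mean(a) - grand) ** 2 + len(b) * (mean(b) - grand) ** 2
    sse = sum_sq_dev(a) + sum_sq_dev(b)
    msc = ssc / (2 - 1)                    # df = C - 1 with C = 2
    mse = sse / (len(a) + len(b) - 2)      # df = N - C
    return msc / mse


t_value = pooled_t(sample_a, sample_b)
f_value = one_way_f(sample_a, sample_b)
print(f"t^2 = {t_value ** 2:.6f}, F = {f_value:.6f}")
```

The two printed numbers agree up to floating-point error because MSE in the two-level ANOVA is the pooled variance of the t test.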

29 Multiple Comparison Tests
ANOVA techniques are useful in testing hypotheses about differences of means in multiple groups. Advantage: the probability of committing a Type I error is controlled. Multiple comparison techniques are used to identify which pairs of means are significantly different, given that the ANOVA test reveals overall significance.

30 Multiple Comparison Tests
Multiple comparisons are used when an overall significant difference between groups has been determined using the F value of the analysis of variance. Tukey’s honestly significant difference (HSD) test requires equal sample sizes. It takes into consideration the number of treatment levels, the value of the mean square error, and the sample size.

31 Multiple Comparison Tests
Tukey’s Honestly Significant Difference (HSD) test, also known as Tukey’s T method, examines the absolute value of all differences between pairs of means from treatment levels to determine whether there is a significant difference. The Tukey-Kramer procedure is used when sample sizes are unequal.

32 Tukey’s Honestly Significant Difference (HSD) Test

33 Demonstration Example Problem
A company has three manufacturing plants, and company officials want to determine whether there is a difference in the average age of workers at the three locations. The following data are the ages of five randomly selected workers at each plant. Perform a one-way ANOVA to determine whether there is a significant difference in the mean ages of the workers at the three plants. Use α = 0.01 and note that the sample sizes are equal.

34 Data from Demonstration Example
PLANT (Employee Age), with rows for Group Means and nj
C = 3
dfE = N - C = 12
MSE = 1.63

35 Tukey’s HSD Test
Since sample sizes are equal, Tukey’s HSD test can be used to compute multiple comparison tests between groups. To compute the HSD, the values of MSE, n, and q must be determined.
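As a rough sketch of the computation, the HSD is usually written as HSD = q(α, C, N - C) · sqrt(MSE / n). The snippet below plugs in MSE = 1.63 and n = 5 from the demonstration example, together with q = 5.04, which is the q-table entry for α = 0.01, three populations, and 12 degrees of freedom. The function name `tukey_hsd` is hypothetical.

```python
from math import sqrt


def tukey_hsd(q, mse, n):
    """Tukey's HSD for equal sample sizes: q * sqrt(MSE / n)."""
    return q * sqrt(mse / n)


# q for alpha = 0.01, C = 3 populations, N - C = 12 error degrees of freedom.
hsd = tukey_hsd(q=5.04, mse=1.63, n=5)
print(round(hsd, 2))  # 2.88
```

Any pair of plant means whose absolute difference exceeds this value would be declared significantly different at α = 0.01.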

36 q Values for α = 0.01
Columns: Number of Populations. Rows: Degrees of Freedom.

Degrees of Freedom   2      3      4      5      ...
1                    90     135    164    186
2                    14     19     22.3   24.7
3                    8.26   10.6   12.2   13.3
4                    6.51   8.12   9.17   9.96
.
11                   4.39   5.14   5.62   5.97
12                   4.32   5.04   5.50   5.84

37 Tukey’s HSD Test for the Employee Age Data

38 Tukey-Kramer Procedure: The Case of Unequal Sample Sizes

39 Freighter Example: Means and Sample Sizes for the Four Operators
A metal-manufacturing firm wants to test the tensile strength of a given metal under varying conditions of temperature. Suppose that in the design phase, the metal is processed under five different temperature conditions and that random samples of size five are taken under each temperature condition. The data follow.

40 Freighter Example: Means and Sample Sizes for the Four Operators
Operator   Sample Size   Mean
1          5             6.3180
2          8             6.2775
3          7             6.4886
4          4             6.2300

41 Tukey-Kramer Results for the Four Operators
Pair      Critical Difference   |Actual Difference|
1 and 2   .1405                 .0405
1 and 3   .1443                 .1706*
1 and 4   .1653                 .0880
2 and 3   .1275                 .2111*
2 and 4   .1509                 .0475
3 and 4   .1545                 .2586*

*Denotes significance at α = .05.
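The critical differences in this table can be approximated with the Tukey-Kramer formula, q · sqrt((MSE / 2)(1/n_r + 1/n_s)). The sketch below uses the operator means and sample sizes from the preceding slide. Two inputs are assumptions not shown in this transcript: MSE = 0.007746, which is what the valve data yield with 20 error degrees of freedom, and q = 3.96, the studentized range value for α = .05, four populations, and 20 degrees of freedom from a standard q table.

```python
from itertools import combinations
from math import sqrt

# operator -> (sample size, mean), from the "Means and Sample Sizes" slide
operators = {1: (5, 6.3180), 2: (8, 6.2775), 3: (7, 6.4886), 4: (4, 6.2300)}

MSE = 0.007746   # assumption: error mean square for the valve data
Q_CRIT = 3.96    # assumption: q for alpha = .05, 4 populations, 20 df


def tukey_kramer(n_r, n_s, mse, q):
    """Critical difference for two groups with unequal sample sizes."""
    return q * sqrt(mse / 2 * (1 / n_r + 1 / n_s))


critical = {}
significant = []
for r, s in combinations(operators, 2):
    (n_r, mean_r), (n_s, mean_s) = operators[r], operators[s]
    critical[(r, s)] = tukey_kramer(n_r, n_s, MSE, Q_CRIT)
    actual = abs(mean_r - mean_s)
    if actual > critical[(r, s)]:
        significant.append((r, s))
    print(f"{r} and {s}: critical={critical[(r, s)]:.4f} actual={actual:.4f}")

print("significant pairs:", significant)
```

Under these assumptions the pairs flagged as significant are the same three that carry an asterisk in the table: operators 1 and 3, 2 and 3, and 3 and 4.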

42 Randomized Block Design
Randomized block design - focuses on one independent variable (treatment variable) of interest. It includes a second variable (blocking variable) used to control for confounding or concomitant variables. Confounding or concomitant variables are variables that are not controlled by the researcher in the experiment but can have an effect on the outcome of the treatment being studied.

43 Randomized Block Design
Repeated measures design - a randomized block design in which each block level is an individual item or person, and that person or item is measured across all treatments.

44 Randomized Block Design
The sum of squares in a completely randomized design is SST = SSC + SSE. In a randomized block design, the sum of squares is SST = SSC + SSR + SSE. SSR (blocking effects) comes out of the SSE: some of the error variation in the completely randomized design is due to blocking effects, which the randomized block design separates out, as shown in the next slide.

45 Randomized Block Design Treatment Effects: Procedural Overview
The observed F value for treatments computed using the randomized block design formula is tested by comparing it to a table F value. If the observed F value is greater than the table value, the null hypothesis is rejected for that alpha value. If the F value for blocks is greater than the critical F value, the null hypothesis that all block population means are equal is rejected.

46 Randomized Block Design Treatment Effects: Procedural Overview

47 Randomized Block Design: Computational Formulas
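A sketch of the usual computational formulas for a randomized block design follows. The notation is assumed here: C treatment levels (columns), n blocks (rows), N = nC observations, x̄_j the treatment means, x̄_i the block means, and x̄ the grand mean.

```latex
\begin{aligned}
SSC &= n\sum_{j=1}^{C}\left(\bar{x}_j-\bar{x}\right)^2, &
SSR &= C\sum_{i=1}^{n}\left(\bar{x}_i-\bar{x}\right)^2 \\
SSE &= \sum_{i=1}^{n}\sum_{j=1}^{C}\left(x_{ij}-\bar{x}_j-\bar{x}_i+\bar{x}\right)^2, &
SST &= \sum_{i=1}^{n}\sum_{j=1}^{C}\left(x_{ij}-\bar{x}\right)^2 \\
MSC &= \frac{SSC}{C-1}, \qquad MSR = \frac{SSR}{n-1}, &
MSE &= \frac{SSE}{(C-1)(n-1)} \\
F_{\text{treatments}} &= \frac{MSC}{MSE}, &
F_{\text{blocks}} &= \frac{MSR}{MSE}
\end{aligned}
```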

48 Randomized Block Design: Tread-Wear Example
As an example of the application of the randomized block design, consider a tire company that developed a new tire. The company conducted tread-wear tests on the tire to determine whether there is a significant difference in tread wear if the average speed with which the automobile is driven varies. The company set up an experiment in which the independent variable was speed of automobile. There were three treatment levels.

49 Randomized Block Design: Tread-Wear Example
Treatment: Speed (columns). Block: Supplier (rows).

Supplier          Slow   Medium   Fast   Block Means
1                 3.7    4.5      3.1    3.77
2                 3.4    3.9      2.8    3.37
3                 3.5    4.1      3.0    3.53
4                 3.2    3.5      2.6    3.10
5                 3.9    4.8      3.4    4.03
Treatment Means   3.54   4.16     2.98   3.56

n = 5, N = 15, C = 3
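The tread-wear table can be run through a randomized block ANOVA with a short script. This is a minimal sketch, not the textbook's worked solution. It assumes the supplier-by-speed values are assigned so that every row and column reproduces the block and treatment means on the slide; values the transcript shows only once are restored on that assumption. The function name is hypothetical.

```python
# supplier (block) -> tread wear at [slow, medium, fast] speed.
# Assumption: repeated values are restored to match the slide's means.
tread_wear = {
    1: [3.7, 4.5, 3.1],
    2: [3.4, 3.9, 2.8],
    3: [3.5, 4.1, 3.0],
    4: [3.2, 3.5, 2.6],
    5: [3.9, 4.8, 3.4],
}


def randomized_block_anova(blocks):
    """Sums of squares and F values for a randomized block design."""
    rows = list(blocks.values())
    n_blocks = len(rows)
    n_treatments = len(rows[0])
    grand_mean = sum(sum(r) for r in rows) / (n_blocks * n_treatments)

    treatment_means = [
        sum(r[j] for r in rows) / n_blocks for j in range(n_treatments)
    ]
    block_means = [sum(r) / n_treatments for r in rows]

    ssc = n_blocks * sum((m - grand_mean) ** 2 for m in treatment_means)
    ssr = n_treatments * sum((m - grand_mean) ** 2 for m in block_means)
    sst = sum((x - grand_mean) ** 2 for r in rows for x in r)
    sse = sst - ssc - ssr  # what is left after treatments and blocks

    mse = sse / ((n_treatments - 1) * (n_blocks - 1))
    return {
        "SSC": ssc,
        "SSR": ssr,
        "SSE": sse,
        "SST": sst,
        "F_treatment": (ssc / (n_treatments - 1)) / mse,
        "F_block": (ssr / (n_blocks - 1)) / mse,
    }


result = randomized_block_anova(tread_wear)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Carried out without rounding, the treatment F comes to roughly 97.7, slightly above the 97.45 quoted in the slides; the gap is consistent with rounding SSE and MSE before dividing.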

50 Randomized Block Design: Sum of Squares Calculations (Part 1)

51 Randomized Block Design: Sum of Squares Calculations (Part 2)

52 Randomized Block Design: Mean Square Calculations

53 Analysis of Variance for the Tread-Wear Example
Source of Variance   SS      df   MS      F
Treatment            3.484   2    1.742   97.45
Block                1.549   4    0.387   21.72
Error                0.143   8    0.018
Total                5.176   14

54 Randomized Block Design Treatment Effects: Procedural Summary

55 Randomized Block Design Blocking Effects: Procedural Overview

56 Randomized Block Design Blocking Effects: Procedural Overview
Because the observed value of F for treatment (97.45) is greater than this critical F value, the null hypothesis is rejected. At least one of the population means of the treatment levels is not the same as the others. There is a significant difference in tread wear for cars driven at different speeds. The F value for treatment with the blocking was 97.45 and without the blocking was 12.44. By using the randomized block design, a much larger observed F value was obtained.

57 Two-Way ANOVA: Hypotheses

58 Formulas for Computing a Two-Way ANOVA

59 A 2 x 3 Factorial Design: Data and Measurements for CEO Dividend Example
Columns: Location Where Company Stock Is Traded. Rows: How Stockholders Are Informed of Dividends. Entries are cell means.

                            NYSE         AMEX         OTC           Xi
Annual/Quarterly Reports    X11 = 1.5    X12 = 2.5    X13 = 3.5     2.5
Presentations to Analysts   X21 = 2.0    X22 = 3.0    X23 = 3.75    2.9167
Xj                          1.75         2.75         3.625         X = 2.7083

n = 4, N = 24

60 A 2 x 3 Factorial Design: Calculations for the CEO Dividend Example (Part 1)

61 A 2 x 3 Factorial Design: Calculations for the CEO Dividend Example (Part 2)

62 A 2 x 3 Factorial Design: Calculations for the CEO Dividend Example (Part 3)

63 Analysis of Variance for the CEO Dividend Problem
Source of Variance   SS       df   MS       F
Row                  1.0417   1    1.0417   2.4194
Column               14.083   2    7.0417   16.355*
Interaction          0.0833   2    0.0417   0.0968
Error                7.75     18   0.4306
Total                22.958   23

*Denotes significance at α = .01.

64 Excel Output for the CEO Dividend Example (Part 1)
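Anova: Two-Factor With Replication

SUMMARY        NYSE     ASE      OTC      Total
AQReport
Count          4        4        4        12
Sum            6        10       14       30
Average        1.5      2.5      3.5      2.5
Variance       0.3333   0.3333   0.3333   1

Presentation
Count          4        4        4        12
Sum            8        12       15       35
Average        2        3        3.75     2.9167
Variance       0.6667   0.6667   0.25     0.9924

Total
Count          8        8        8
Sum            14       22       29
Average        1.75     2.75     3.625
Variance       0.5      0.5      0.2679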

65 Excel Output for the CEO Dividend Example (Part 2)
ANOVA
Source of Variation   SS       df   MS       F        P-value   F crit
Sample                1.0417   1    1.0417   2.4194   0.1373    4.4139
Columns               14.083   2    7.0417   16.355   9E-05     3.5546
Interaction           0.0833   2    0.0417   0.0968   0.9082    3.5546
Within                7.75     18   0.4306
Total                 22.958   23
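For a balanced design, the row, column, and interaction sums of squares in this output can be recovered from the cell means alone. The sketch below does that for the CEO dividend example, taking the six cell means and n = 4 from the data slide. The within-cell sum of squares (7.75) and its degrees of freedom (18) are taken from the Excel output, because the individual observations are not available here. The function name is hypothetical, and the approach assumes equal cell sizes.

```python
# Cell means: rows = how stockholders are informed, columns = NYSE, AMEX, OTC.
cell_means = [
    [1.5, 2.5, 3.5],    # Annual/Quarterly Reports
    [2.0, 3.0, 3.75],   # Presentations to Analysts
]
N_PER_CELL = 4
SS_WITHIN = 7.75   # "Within" SS from the Excel output
DF_WITHIN = 18     # "Within" df from the Excel output


def two_way_from_cell_means(cells, n, ss_error, df_error):
    """Row, column, and interaction SS and F for a balanced two-way design."""
    n_rows, n_cols = len(cells), len(cells[0])
    row_means = [sum(r) / n_cols for r in cells]
    col_means = [sum(r[j] for r in cells) / n_rows for j in range(n_cols)]
    grand = sum(sum(r) for r in cells) / (n_rows * n_cols)

    ss_row = n * n_cols * sum((m - grand) ** 2 for m in row_means)
    ss_col = n * n_rows * sum((m - grand) ** 2 for m in col_means)
    ss_int = n * sum(
        (cells[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )

    mse = ss_error / df_error
    return {
        "SS_row": ss_row,
        "SS_col": ss_col,
        "SS_int": ss_int,
        "F_row": (ss_row / (n_rows - 1)) / mse,
        "F_col": (ss_col / (n_cols - 1)) / mse,
        "F_int": (ss_int / ((n_rows - 1) * (n_cols - 1))) / mse,
    }


anova = two_way_from_cell_means(cell_means, N_PER_CELL, SS_WITHIN, DF_WITHIN)
for name, value in anova.items():
    print(f"{name}: {value:.4f}")
```

The resulting values line up with the Sample, Columns, and Interaction rows of the Excel table: only the column effect (where the stock is traded) produces a large F value.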

66 MINITAB Output for the Demonstration Problem 11.4:

