Categorical Variables

Slides:



Advertisements
Similar presentations
Multiple-choice question
Advertisements

Week 2 – PART III POST-HOC TESTS. POST HOC TESTS When we get a significant F test result in an ANOVA test for a main effect of a factor with more than.
Analysis of Variance (ANOVA) ANOVA methods are widely used for comparing 2 or more population means from populations that are approximately normal in distribution.
Inference for Regression
Analysis of Variance (ANOVA) ANOVA can be used to test for the equality of three or more population means We want to use the sample results to test the.
Design of Experiments and Analysis of Variance
Independent Sample T-test Formula
Part I – MULTIVARIATE ANALYSIS
Statistics 350 Lecture 16. Today Last Day: Introduction to Multiple Linear Regression Model Today: More Chapter 6.
Lesson #23 Analysis of Variance. In Analysis of Variance (ANOVA), we have: H 0 :  1 =  2 =  3 = … =  k H 1 : at least one  i does not equal the others.
Chapter 3 Analysis of Variance
Analysis of Variance Introduction to Business Statistics, 5e Kvanli/Guynes/Pavur (c)2000 South-Western College Publishing.
Intro to Statistics for the Behavioral Sciences PSYC 1900
Statistical Methods in Computer Science Hypothesis Testing II: Single-Factor Experiments Ido Dagan.
Analysis of Variance Introduction The Analysis of Variance is abbreviated as ANOVA The Analysis of Variance is abbreviated as ANOVA Used for hypothesis.
F-Test ( ANOVA ) & Two-Way ANOVA
1 1 Slide © 2005 Thomson/South-Western Chapter 13, Part A Analysis of Variance and Experimental Design n Introduction to Analysis of Variance n Analysis.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 13 Experimental Design and Analysis of Variance nIntroduction to Experimental Design.
Chapter 11 HYPOTHESIS TESTING USING THE ONE-WAY ANALYSIS OF VARIANCE.
ANOVA One Way Analysis of Variance. ANOVA Purpose: To assess whether there are differences between means of multiple groups. ANOVA provides evidence.
Chapter 10 Analysis of Variance.
Copyright © 2004 Pearson Education, Inc.
Chapter 14 – 1 Chapter 14: Analysis of Variance Understanding Analysis of Variance The Structure of Hypothesis Testing with ANOVA Decomposition of SST.
Chapter 19 Analysis of Variance (ANOVA). ANOVA How to test a null hypothesis that the means of more than two populations are equal. H 0 :  1 =  2 =
Lecture 9-1 Analysis of Variance
Chapter 12: Analysis of Variance. Chapter Goals Test a hypothesis about several means. Consider the analysis of variance technique (ANOVA). Restrict the.
Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D.
1 ANALYSIS OF VARIANCE (ANOVA) Heibatollah Baghi, and Mastee Badii.
Chapter Seventeen. Figure 17.1 Relationship of Hypothesis Testing Related to Differences to the Previous Chapter and the Marketing Research Process Focus.
Environmental Modeling Basic Testing Methods - Statistics III.
Copyright © Cengage Learning. All rights reserved. 12 Analysis of Variance.
Comparison of groups The purpose of analysis is to compare two or more population means by analyzing sample means and variances. One-way analysis is used.
The general linear test approach to regression analysis.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved Lecture Slides Elementary Statistics Eleventh Edition and the Triola.
EDUC 200C Section 9 ANOVA November 30, Goals One-way ANOVA Least Significant Difference (LSD) Practice Problem Questions?
Slide Slide 1 Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley. Overview and One-Way ANOVA.
1/54 Statistics Analysis of Variance. 2/54 Statistics in practice Introduction to Analysis of Variance Analysis of Variance: Testing for the Equality.
Chapter 8 Analysis of METOC Variability. Contents 8.1. One-factor Analysis of Variance (ANOVA) 8.2. Partitioning of METOC Variability 8.3. Mathematical.
Slide Slide 1 Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley. Lecture Slides Elementary Statistics Tenth Edition and the.
Slide 1 Copyright © 2004 Pearson Education, Inc. Chapter 11 Multinomial Experiments and Contingency Tables 11-1 Overview 11-2 Multinomial Experiments:
1 Pertemuan 19 Analisis Varians Klasifikasi Satu Arah Matakuliah: I Statistika Tahun: 2008 Versi: Revisi.
Analyze Of VAriance. Application fields ◦ Comparing means for more than two independent samples = examining relationship between categorical->metric variables.
Chapter 11 Analysis of Variance
Comparing Multiple Groups:
Chapter 14 Introduction to Multiple Regression
Copyright © 2008 by Hawkes Learning Systems/Quant Systems, Inc.
Chapter 10 Two-Sample Tests and One-Way ANOVA.
SEMINAR ON ONE WAY ANOVA
Week 2 – PART III POST-HOC TESTS.
Why is this important? Requirement Understand research articles
Factorial Experiments
ANOVA Econ201 HSTS212.
Basic Practice of Statistics - 5th Edition
Post Hoc Tests on One-Way ANOVA
Chapter 11 – Analysis of Variance
Statistics for Business and Economics (13e)
Comparing Multiple Groups: Analysis of Variance ANOVA (1-way)
Categorical Variables
Comparing k Populations
Comparing Several Means: ANOVA
Chapter 11 Analysis of Variance
Chapter 11: The ANalysis Of Variance (ANOVA)
Hypothesis testing and Estimation
Lecture Slides Elementary Statistics Eleventh Edition
One way ANALYSIS OF VARIANCE (ANOVA)
Comparing k Populations
Indicator Variables Response: Highway MPG
The Analysis of Variance
ANOVA: Analysis of Variance
ANalysis Of VAriance Lecture 1 Sections: 12.1 – 12.2
Presentation transcript:

Categorical Variables Response: Highway MPG Explanatory: Type of drive All Wheel Rear Wheel Front Wheel

Indicator Variables We have used indicator variables so that we can trick JMP into analyzing the data using multiple regression.

Categorical Variables There is a more straight forward analysis that can be done with categorical explanatory variables.

Categorical Variables The analysis is an extension of the two independent sample analysis we did at the beginning of the semester. Body mass index for men and women (Lectures 4 and 5).

JMP Response, Y: Highway MPG (numerical) Explanatory, X: Drive (categorical) Fit Y by X

JMP Fit Y by X From the red triangle pull down menu next to Oneway select Means/Anova Display options Uncheck Mean Diamonds Check Mean Lines

Analysis of Variance Response: numerical, Y Explanatory: categorical, X Total Sum of Squares

Analysis of Variance Partition the Total Sum of Squares into two parts. Due to differences among the sample means for the categories. Due to variation within categories, i.e error variation.

Sum of Squares Factor

Sum of Squares Error

Mean Square A mean square is the sum of squares divided by its associated degrees of freedom. A mean square is an estimate of variability.

Mean Square Factor The mean square factor estimates the variability due to differences among category sample means. If the mean square factor is large, this indicates the category sample means are quite different.

Mean Square Error The mean square error estimates the naturally occurring variability, i.e. the error variance, . This is the ruler used to assess the statistical significance of the differences among category sample means.

Test of Hypothesis H0: all the category population means are equal. HA: some of the category population means are not equal. Similar to the test of model utility.

Test Statistic F = MSFactor/ MSError P-value = Prob > F If the P-value is small, reject H0 and declare that at least two of the categories have different population means.

Analysis of Variance Source df SS MS F Factor (Model) k – 1 SSFactor MSFactor MSFactor/ MSError Error N – k SSError MSError Total N – 1 SSTotal

Analysis of Variance Source df SS MS F Factor (Model) 2 932.3 466.15 16.52 Error 97 2736.7 28.213 Total 99 3669.0

Test of Hypothesis F = 16.52, P-value < 0.0001 Because the P-value is so small, there are some categories that have different population means.

Test of Significance The test of significance, like the test of model utility, is very general. We know there are some categories with different population means but which categories are they?

Multiple Comparisons In ANOVA, a statistically significant F test is often followed up by a procedure for comparing the sample means of the categories.

Least Significant Difference One multiple comparison method is called Fisher’s Least Significant Difference, LSD. This is the smallest difference in sample means that would be declared statistically significant.

Least Significant Difference

Least Significant Difference When the number of observations in each category is the same, there is one value of LSD for all comparisons. When the numbers of observations in each category are different, there is one value of LSD for each comparison.

Compare All to Rear Wheel All Wheel: ni = 23 Rear Wheel: nj = 17 t* = 1.98472 RMSE = 5.311626

Compare All to Rear Wheel

Compare All to Rear Wheel All Wheel: mean = 22.6087 Rear Wheel: mean = 26.5294 Difference in means = 3.9207 3.921 is bigger than the LSD = 3.372, therefore the difference between All Wheel and Rear Wheel is statistically significant.

JMP – Fit Y by X From the red triangle pull down menu next to Oneway select Compare Means – Each Pair Student’s t.

Regression vs ANOVA Note that the P-values for the comparisons are the same as the P-values for the slope estimates in the regression on indicator variables.

Regression vs ANOVA Multiple regression with indicator variables and ANOVA give you exactly the same analysis.