Experimental Statistics - week 3


Experimental Statistics - week 3 Chapter 8: Inferences about More Than 2 Population Central Values

PC SAS on Campus
- Library
- BIC
- Student Center
SAS Learning Edition: $125
http://support.sas.com/rnd/le/index.html

Hypothetical Sample Data
Scenario A          Scenario B
Pop 1   Pop 2       Pop 1   Pop 2
  5       8           3       7
  7       9          10       4
  6       6           3      12
  3       8           1       4
  4       9           8      13
For one scenario, | t | = 1.17
For the other scenario, | t | = 3.35

In general, for 2-sample t-tests: To show significance, we want the difference between groups to be ___________ compared to the variability within groups
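The two |t| values quoted for the hypothetical data can be checked with a short calculation. The following is a minimal Python sketch, assuming the data are read row-wise as (Pop 1, Pop 2) pairs and that the pooled-variance form of the 2-sample t statistic is intended; the function name `pooled_t` is only illustrative.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic using the pooled variance estimate."""
    nx, ny = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))


# Hypothetical sample data (assumed layout: each row is one Pop 1 / Pop 2 pair).
scenario_a = ([5, 7, 6, 3, 4], [8, 9, 6, 8, 9])
scenario_b = ([3, 10, 3, 1, 8], [7, 4, 12, 4, 13])

t_a = pooled_t(*scenario_a)
t_b = pooled_t(*scenario_b)
print(f"Scenario A: |t| = {abs(t_a):.2f}")
print(f"Scenario B: |t| = {abs(t_b):.2f}")
```

Both scenarios have the same difference in sample means (5 vs 8), so any difference in |t| comes entirely from the variability within the groups.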

Completely Randomized Design
1-Factor Analysis of Variance (ANOVA)
Setting (Assumptions):
- t populations
- populations are normal
- $\mu_i$ and $\sigma_i^2$ denote the mean and variance of the ith population
- mutually independent random samples are taken from the populations
- the sample sizes do not have to all be equal

1-Factor ANOVA
[Figure: population distributions with means $\mu_1, \mu_2, \ldots, \mu_k$, each with standard deviation $\sigma$]

Question: Test $H_0: \mu_1 = \mu_2 = \cdots = \mu_t$ vs. $H_a$: the means are not all equal
Notes:
- not directional, i.e. no “1-sided / 2-sided” issues
- alternative doesn’t say that all means are distinct

Completely Randomized Design 1-Factor Analysis of Variance Example data setup where t = 5 and n = 4

Notation:

A Sum-of-Squares Identity
Note: This is for the case in which all sample sizes are equal ( n )
The 3 sums of squares measure:
- variability between samples
- variability within samples
- total variability
Question: Which measures what?

In words: Total SS = SS between samples + within sample SS
where
- TSS (total SS) = total sample variability
- SSB (SS between samples) = variability due to factor effects
- SSW (within sample SS) = variability due to uncontrolled error
Note: Formula for unequal sample sizes given on page 388
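Written out for equal sample sizes, the identity described in words above takes the following standard form. The notation is an assumption, since the original symbols did not survive in this transcript: $y_{ij}$ is the jth observation in sample i, $\bar{y}_{i\cdot}$ is the ith sample mean, and $\bar{y}_{\cdot\cdot}$ is the overall mean of all $tn$ observations.

```latex
\underbrace{\sum_{i=1}^{t}\sum_{j=1}^{n} \left(y_{ij} - \bar{y}_{\cdot\cdot}\right)^2}_{\text{TSS}}
  \;=\;
\underbrace{n\sum_{i=1}^{t} \left(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}\right)^2}_{\text{SSB}}
  \;+\;
\underbrace{\sum_{i=1}^{t}\sum_{j=1}^{n} \left(y_{ij} - \bar{y}_{i\cdot}\right)^2}_{\text{SSW}}
```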

Pop 1: 5 5 5 5
Pop 2: 9 9 9 9
Pop 3: 7 7 7 7

Pop 1: 4 8 3 9
Pop 2: 6 10 2 6
Pop 3: 5 8 7 4
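The two small data sets above look like the two extreme cases of the sum-of-squares identity. A minimal Python sketch (the helper name `sums_of_squares` is illustrative, not from the slides) computes the three sums of squares for each:

```python
from statistics import mean


def sums_of_squares(groups):
    """Return (SSB, SSW, TSS) for a list of samples."""
    all_obs = [y for g in groups for y in g]
    grand = mean(all_obs)
    # Between samples: squared distance of each sample mean from the overall mean.
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within samples: squared distance of each observation from its own sample mean.
    ssw = sum((y - mean(g)) ** 2 for g in groups for y in g)
    # Total: squared distance of each observation from the overall mean.
    tss = sum((y - grand) ** 2 for y in all_obs)
    return ssb, ssw, tss


first = [[5, 5, 5, 5], [9, 9, 9, 9], [7, 7, 7, 7]]
second = [[4, 8, 3, 9], [6, 10, 2, 6], [5, 8, 7, 4]]

ss_first = sums_of_squares(first)
ss_second = sums_of_squares(second)
print("first :  SSB, SSW, TSS =", ss_first)
print("second:  SSB, SSW, TSS =", ss_second)
```

In the first data set all variability is between samples (SSW = 0); in the second, the three sample means are all equal to 6, so all variability is within samples (SSB = 0).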

Recall: For the 2-sample t-test, we tested $H_0: \mu_1 = \mu_2$ using $t = \dfrac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{1/n_1 + 1/n_2}}$
To show significance, we want the difference between groups to be large compared to the variability within groups

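Note: Our test statistic for testing $H_0: \mu_1 = \mu_2 = \cdots = \mu_t$ will be of the form
F = (mean square between samples) / (mean square within samples)
This has an F distribution
Question: What type of F values lead you to believe the null is NOT TRUE?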

Analysis of Variance Table Note:

Note:

CAR DATA Example
For this analysis, 5 gasoline types (A - E) were to be tested. Twenty cars were selected for testing and were assigned randomly to the groups (i.e. the gasoline types). Thus, in the analysis, each gasoline type was tested on 4 cars. A performance-based octane reading was obtained for each car, and the question is whether the gasolines differ with respect to this octane reading.
Gasoline   Octane readings           Mean
A          91.7  91.2  90.9  90.6    91.10
B          91.7  91.9  90.9  90.9    91.35
C          92.4  91.2  91.6  91.0    91.55
D          91.8  92.2  92.0  91.4    91.85
E          93.1  92.9  92.4  92.4    92.70

ANOVA Table Output - car data
Source             SS       df    MS       F       p-value
Between samples    6.108     4    1.527    6.80    0.0025
Within             3.370    15    0.225
Totals             9.478    19
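The entries of this table can be reproduced directly from the octane readings. The sketch below is a plain-Python illustration, not the SAS analysis used in the course. One assumption is flagged in the code: only three readings for gasolines B and E survive in this transcript, and the fourth (90.9 and 92.4) is inferred from the reported group means and standard deviations. The p-value is not computed because the standard library has no F distribution.

```python
from statistics import mean

car = {
    "A": [91.7, 91.2, 90.9, 90.6],
    "B": [91.7, 91.9, 90.9, 90.9],  # fourth reading inferred (assumption)
    "C": [92.4, 91.2, 91.6, 91.0],
    "D": [91.8, 92.2, 92.0, 91.4],
    "E": [93.1, 92.9, 92.4, 92.4],  # fourth reading inferred (assumption)
}


def one_way_anova(groups):
    """Return the pieces of a 1-factor ANOVA table as a dict."""
    all_obs = [y for g in groups for y in g]
    grand = mean(all_obs)
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_b = len(groups) - 1             # t - 1
    df_w = len(all_obs) - len(groups)  # total observations minus t
    msb, msw = ssb / df_b, ssw / df_w
    return {"SSB": ssb, "SSW": ssw, "TSS": ssb + ssw,
            "df_b": df_b, "df_w": df_w,
            "MSB": msb, "MSW": msw, "F": msb / msw}


car_table = one_way_anova(list(car.values()))
for name, value in car_table.items():
    print(f"{name:5s} {value:.3f}")
```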

F-table -- p.1106

Extracted from Ex. 8.2, pages 390-391
3 Methods for Reducing Hostility
12 students displaying similar hostility were randomly assigned to 3 treatment methods. Scores (HLT) at end of study recorded.
Method 1: 96 79 91 85
Method 2: 77 76 74 73
Method 3: 66 73 69 66
Test: $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: the means are not all equal

ANOVA Table Output - hostility data
Source             SS    df    MS    F    p-value
Between samples
Within
Totals
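The blank table above is meant to be filled in by hand. As a way to check the hand calculation, here is a minimal Python sketch that computes the same quantities from the 12 hostility scores. The helper name `anova_table` is illustrative, and the p-value is left out because it needs an F distribution, which the standard library does not provide.

```python
from statistics import mean

hostility = {
    "Method 1": [96, 79, 91, 85],
    "Method 2": [77, 76, 74, 73],
    "Method 3": [66, 73, 69, 66],
}


def anova_table(groups):
    """Sums of squares, degrees of freedom, mean squares and F for 1-factor ANOVA."""
    all_obs = [y for g in groups for y in g]
    grand = mean(all_obs)
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_b = len(groups) - 1
    df_w = len(all_obs) - len(groups)
    return {"SSB": ssb, "SSW": ssw, "TSS": ssb + ssw,
            "df_b": df_b, "df_w": df_w,
            "MSB": ssb / df_b, "MSW": ssw / df_w,
            "F": (ssb / df_b) / (ssw / df_w)}


host_table = anova_table(list(hostility.values()))
for name, value in host_table.items():
    print(f"{name:5s} {value:.3f}")
```

The resulting F value is compared with the F-table critical value for 2 and 9 degrees of freedom.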

SPSS ANOVA Table for Hostility Data

ANOVA Models
Note:
Example: Population has mean $\mu = 5$. Consider the random sample

For 1-factor ANOVA

Alternative form of the 1-Factor ANOVA Model (pages 394-395)
General Form of Model:
- random errors follow a Normal distribution, are independently distributed, and have zero mean and constant variance, i.e. variability does not change from group to group
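The model equations themselves did not survive in this transcript. In the usual textbook presentation, which is assumed here rather than taken from the slides, the 1-factor model and its alternative form are written as follows, where $\mu$ is an overall mean, $\alpha_i = \mu_i - \mu$ is the effect of the ith factor level, and $\varepsilon_{ij}$ is the random error described above.

```latex
y_{ij} = \mu_i + \varepsilon_{ij}
\qquad\text{or, equivalently,}\qquad
y_{ij} = \mu + \alpha_i + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma^2) \text{ independently.}
```

In this form, the null hypothesis of equal population means corresponds to $\alpha_1 = \alpha_2 = \cdots = \alpha_t = 0$, i.e. no factor effects.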

Analysis of Variance Table
Recall:
Note:
- if no factor effects, we expect F _____
- if factor effects, we expect F _____

The CAR data set as SAS needs to see it:
A 91.7   A 91.2   A 90.9   A 90.6
B 91.7   B 91.9   B 90.9   B 90.9
C 92.4   C 91.2   C 91.6   C 91.0
D 91.8   D 92.2   D 92.0   D 91.4
E 93.1   E 92.9   E 92.4   E 92.4
(one gas / octane pair per data line)

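SAS file for CAR data
Case 1: Data within SAS FILE
DATA one;
  /* gas is a character variable ($), octane is numeric */
  INPUT gas$ octane;
  DATALINES;
A 91.7
A 91.2
. . .
E 92.4
;
/* 1-factor ANOVA with gas as the classification factor */
PROC GLM;
  CLASS gas;
  MODEL octane=gas;
  TITLE 'Gasoline Example - Completely Randomized Design';
  MEANS gas;
RUN;
/* mean and variance of octane for each gasoline */
PROC MEANS mean var;
  CLASS gas;
RUN;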

The SAS Output for CAR data:
Gasoline Example - Completely Randomized Design
General Linear Models Procedure
Dependent Variable: OCTANE
                           Sum of          Mean
Source             DF     Squares        Square    F Value    Pr > F
Model               4  6.10800000    1.52700000       6.80    0.0025
Error              15  3.37000000    0.22466667
Corrected Total    19  9.47800000
R-Square        C.V.     Root MSE    OCTANE Mean
0.644440    0.516836    0.4739902      91.710000
Source             DF   Type I SS   Mean Square    F Value    Pr > F
GAS                 4  6.10800000    1.52700000       6.80    0.0025
Textbook Format for ANOVA Table Output - car data
Source             SS       df    MS       F       p-value
Between samples    6.108     4    1.527    6.80    0.0025
Within             3.370    15    0.225
Totals             9.478    19

Problem 1. Descriptive Statistics for CAR Data
The MEANS Procedure
Analysis Variable : octane
      Mean       Std Dev       Minimum       Maximum
------------------------------------------------------
91.7100000     0.7062876    90.6000000    93.1000000

Problem 3. Descriptive Statistics by Gasoline
The MEANS Procedure
Analysis Variable : octane
      Mean       Std Dev       Minimum       Maximum
------------------------ gas=A ------------------------
91.1000000     0.4690416    90.6000000    91.7000000
------------------------ gas=B ------------------------
91.3500000     0.5259911    90.9000000    91.9000000
------------------------ gas=C ------------------------
91.5500000     0.6191392    91.0000000    92.4000000
------------------------ gas=D ------------------------
91.8500000     0.3415650    91.4000000    92.2000000
------------------------ gas=E ------------------------
92.7000000     0.3559026    92.4000000    93.1000000

Question 1: Which gasolines are different?
Question 2: Why didn’t we just do t-tests to compare all combinations of gasolines? i.e. compare
A vs B
A vs C
. . .
D vs E

Simulation: using a computer to generate data under certain known conditions and observing the outcomes

Setting: Normal population with $\mu = 20$ and $\sigma = 5$
Simulation Experiment: Generate 2 samples of size n = 10 from this population and run a t-test to compare the sample means, i.e. test $H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 \neq \mu_2$
Question: What do we expect to happen?

Simulation Results:
t-test procedure ($\alpha = .05$): Reject H0 if | t | > 2.101
Sample   Mean   Std Dev
1        21.6   4.0
2        21.1   5.4
t = .235, so we do not reject H0 (which is what we expected)

Simulation results: Now suppose we obtain 10 samples and test pairs of means
Sample   Mean   Std Dev
1        21.6   4.0
2        21.1   5.4
3        20.9   6.2
4        18.3   3.2
5        23.1   6.7
6        18.6   4.8
7        22.2   5.8
8        19.1   5.9
9        20.3   2.5
10       19.3   3.2
Note: Comparing means 4 vs 5 we get t = 2.33, i.e. we reject the null (but it’s true!!)

Suppose we run all possible t-tests at significance level $\alpha = .05$ to compare the means of 10 samples of size n = 10 from this population.
- It can be shown that there is a 63% chance that at least one pair of means will be declared significantly different from each other.
- The F-test in ANOVA controls the overall significance level.

Probability of finding at least 2 of k means significantly different using multiple t-tests at the $\alpha = .05$ level when all means are actually equal.
k     Prob.
2     .05
3     .13
4     .21
5     .29
10    .63
20    .92
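The simulation experiment described on the preceding slides can be imitated with a few lines of code. The sketch below is an illustration in Python, not the program used to produce the table. It assumes pooled-variance t-tests on every pair of samples, the slide's critical value of 2.101 (18 df), and a hypothetical choice of 1000 replications with a fixed seed. The estimates are subject to simulation error, so they should land near the tabled probabilities rather than match them exactly.

```python
import random
from itertools import combinations
from math import sqrt
from statistics import mean, variance

T_CRIT = 2.101  # two-sided 5% critical value for 18 df, as quoted on the slide


def any_pair_rejected(k, rng, n=10, mu=20.0, sigma=5.0):
    """Draw k samples from the SAME normal population and run all pairwise t-tests."""
    samples = [[rng.gauss(mu, sigma) for _ in range(n)] for _ in range(k)]
    stats = [(mean(s), variance(s)) for s in samples]
    for (m1, v1), (m2, v2) in combinations(stats, 2):
        sp2 = (v1 + v2) / 2                 # pooled variance (equal sample sizes)
        t = (m1 - m2) / sqrt(sp2 * 2 / n)
        if abs(t) > T_CRIT:
            return True                     # at least one false rejection
    return False


def estimate_error_rate(k, reps=1000, seed=1):
    """Estimated chance of at least one false rejection among all pairwise tests."""
    rng = random.Random(seed)
    hits = sum(any_pair_rejected(k, rng) for _ in range(reps))
    return hits / reps


rates = {k: estimate_error_rate(k) for k in (2, 5, 10)}
for k, rate in rates.items():
    print(f"k = {k:2d}: estimated probability = {rate:.3f}")
```

Even though every null hypothesis is true by construction, the chance of at least one rejection grows quickly with k, which is the motivation for using a single overall F-test.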

Fisher’s Least Significant Difference (LSD)
- Protected LSD: Preceded by an F-test for overall significance. Only use the LSD if F is significant.
- Unprotected: Not preceded by an F-test (like individual t-tests).

Gasoline Example - Completely Randomized Design -- All 5 Gasolines
The GLM Procedure
Dependent Variable: octane
                           Sum of
Source             DF     Squares   Mean Square    F Value    Pr > F
Model               4  6.10800000    1.52700000       6.80    0.0025
Error              15  3.37000000    0.22466667
Corrected Total    19  9.47800000
R-Square    Coeff Var    Root MSE    octane Mean
0.644440     0.516836    0.473990       91.71000
Source             DF   Type I SS   Mean Square    F Value    Pr > F
gas                 4  6.10800000    1.52700000       6.80    0.0025
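Since the overall F-test is significant (p = 0.0025), the protected LSD can be applied to the five gasoline means. The following Python sketch uses the standard equal-sample-size formula LSD = t(α/2, error df) × sqrt(2 × MSE / n). The group means and the error mean square are taken from the output above; the critical value t(.025, 15) = 2.131 comes from a t-table and assumes α = .05.

```python
from itertools import combinations
from math import sqrt

means = {"A": 91.10, "B": 91.35, "C": 91.55, "D": 91.85, "E": 92.70}
n = 4                 # cars per gasoline
mse = 0.22466667      # Error mean square from the ANOVA table (15 df)
t_crit = 2.131        # t(.025, 15) from a t-table; assumes alpha = .05

# Least significant difference for comparing two means based on n observations each.
lsd = t_crit * sqrt(2 * mse / n)

# Pairs of gasolines whose sample means differ by more than the LSD.
different = [(a, b) for a, b in combinations(means, 2)
             if abs(means[a] - means[b]) > lsd]

print(f"LSD = {lsd:.3f}")
for a, b in different:
    print(f"{a} vs {b}: difference = {abs(means[a] - means[b]):.2f}")
```

Under these assumptions, any two gasolines whose mean octane readings differ by more than about 0.71 are declared different.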