5-3 Inference on the Means of Two Populations, Variances Unknown.

Presentation transcript:

5-3 Inference on the Means of Two Populations, Variances Unknown


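5-3 Inference on the Means of Two Populations, Variances Unknown
OPTIONS NOOVP NODATE NONUMBER LS=80;
PROC FORMAT;
  VALUE MR 0='PHX' 1='RuralAZ';      /* labels for the two areas */
DATA ARSENIC;
  INPUT AREA ARSENIC;
  FORMAT AREA MR.;
  /* the data lines that followed CARDS are not reproduced in this transcript */
  CARDS;
;
PROC TTEST DATA=ARSENIC;             /* two-sample t-test of ARSENIC by AREA */
  CLASS AREA;
  VAR ARSENIC;
  TITLE 'EXAMPLE 5-5';
RUN; QUIT;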

5-3 Inference on the Means of Two Populations, Variances Unknown
EXAMPLE 5-5
The TTEST Procedure
Variable: ARSENIC
(Only the layout of the output survives in this transcript; the numeric values are not preserved.)
Summary statistics: AREA (PHX, RuralAZ, Diff (1-2)) | N | Mean | Std Dev | Std Err | Minimum | Maximum
Confidence limits: AREA (PHX, RuralAZ, Diff (1-2) Pooled, Diff (1-2) Satterthwaite) | Method | Mean | 95% CL Mean | Std Dev | 95% CL Std Dev
t tests: Method (Pooled, Satterthwaite) | Variances (Equal, Unequal) | DF | t Value | Pr > |t|
Equality of Variances: Method (Folded F) | Num DF | Den DF | F Value | Pr > F

5-3 Inference on the Means of Two Populations, Variances Unknown Hypothesis Testing on the Difference in Means
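For reference, the pooled procedure usually presented under this heading can be sketched as follows. It assumes two independent normal populations with a common unknown variance, and it uses \Delta_0 for the hypothesized difference in means (a symbol that does not appear in the transcript).

```latex
H_0:\ \mu_1 - \mu_2 = \Delta_0
\qquad
S_p^2 = \frac{(n_1 - 1)\,S_1^2 + (n_2 - 1)\,S_2^2}{n_1 + n_2 - 2}
\qquad
T_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
```

Under H_0, T_0 follows a t distribution with n_1 + n_2 - 2 degrees of freedom, so a two-sided test rejects when |t_0| > t_{\alpha/2,\, n_1+n_2-2}. This is the statistic reported in the "Pooled" row of the PROC TTEST output.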

5-3 Inference on the Means of Two Populations, Variances Unknown
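When the two variances cannot be assumed equal, the usual approach is an approximate t-test with Satterthwaite degrees of freedom. The sketch below uses the form that matches the "Satterthwaite" row of PROC TTEST; \Delta_0 and \nu are notation assumed here, not taken from the slides.

```latex
T_0^{*} = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}
\qquad
\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^{2}}
           {\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}}
```

T_0^{*} is compared with a t distribution on \nu degrees of freedom. \nu is generally not an integer, and some textbooks round it down.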

5-3 Inference on the Means of Two Populations, Variances Unknown Type II Error and Choice of Sample Size
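Textbook treatments that read sample sizes from operating characteristic curves for the two-sample t-test typically use the following conventions, given here as an assumption about what the missing slide content showed. They apply when both samples have the same size n and the variances are equal, with \sigma replaced by a prior estimate.

```latex
d = \frac{\lvert \mu_1 - \mu_2 - \Delta_0 \rvert}{2\sigma}
\qquad
n^{*} = 2n - 1
\quad\Longrightarrow\quad
n = \frac{n^{*} + 1}{2}
```

The curves are entered with d and the desired \beta to read n^{*}, which is then converted back to the per-sample size n.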

Chart V. Operating Characteristic Curves for the t-Test
(a) OC curves for a two-sided t-test (α = 0.05). Horizontal axis: standardized difference, d.

(b) OC curves for a two-sided t-test (α = 0.01). Horizontal axis: standardized difference, d.

5-3 Inference on the Means of Two Populations, Variances Unknown Confidence Interval on the Difference in Means
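Under the equal-variance assumption, the standard 100(1 - \alpha)% two-sided interval is built from the pooled standard deviation s_p. The sketch below uses lowercase symbols for observed values, a notational choice made here.

```latex
\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,n_1+n_2-2}\; s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
\;\le\; \mu_1 - \mu_2 \;\le\;
\bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,n_1+n_2-2}\; s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
```

This corresponds to the "95% CL Mean" columns of the "Diff (1-2) Pooled" row in PROC TTEST when \alpha = 0.05.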

5-3 Inference on the Means of Two Populations, Variances Unknown Confidence Interval on the Difference in Means
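When the variances are not assumed equal, the usual approximate interval replaces the pooled standard error with the unpooled one and uses Satterthwaite degrees of freedom. Here \nu denotes those degrees of freedom, a symbol assumed for this sketch.

```latex
\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,\nu} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
\;\le\; \mu_1 - \mu_2 \;\le\;
\bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,\nu} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
```

This corresponds to the "Diff (1-2) Satterthwaite" row in PROC TTEST.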


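5-3 Inference on the Means of Two Populations, Variances Unknown
EX 5-20 (P235)
OPTIONS NOOVP NODATE NONUMBER LS=80;
DATA EX520;
  INPUT TYPE TEMP;
  /* the data lines that followed CARDS are not reproduced in this transcript */
  CARDS;
;
PROC SORT; BY TYPE;                  /* BY-group processing needs sorted data */
PROC UNIVARIATE NORMAL PLOT;         /* normality tests and plots for each TYPE */
  VAR TEMP;
  BY TYPE;
  TITLE 'NORMALITY CHECK';
PROC TTEST DATA=EX520 SIDES=U;       /* upper-tailed two-sample t-test */
  CLASS TYPE;
  VAR TEMP;
  TITLE 'EXERCISE 520';
RUN; QUIT;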

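5-3 Inference on the Means of Two Populations, Variances Unknown
NORMALITY CHECK
(Output shown for each TYPE; only the values listed below survive in this transcript.)
First TYPE group: The UNIVARIATE Procedure, Variable: TEMP
  Moments: N 15 | Sum Weights 15 | Mean | Sum Observations 2946 | Std Deviation | Variance | Skewness | Kurtosis | Uncorrected SS | Corrected SS | Coeff Variation | Std Error Mean
  Tests for Normality (Test | Statistic | p Value): Shapiro-Wilk, W, Pr < W | Kolmogorov-Smirnov, D, Pr > D | Cramer-von Mises, W-Sq, Pr > W-Sq | Anderson-Darling, A-Sq, Pr > A-Sq
Second TYPE group: The UNIVARIATE Procedure, Variable: TEMP
  Moments: N 15 | Sum Weights 15 | Mean | Sum Observations 2881 | Std Deviation | Variance | Skewness | Kurtosis | Uncorrected SS | Corrected SS | Coeff Variation | Std Error Mean
  Tests for Normality (Test | Statistic | p Value): Shapiro-Wilk, W, Pr < W | Kolmogorov-Smirnov, D, Pr > D > (digits lost) | Cramer-von Mises, W-Sq, Pr > W-Sq > (digits lost) | Anderson-Darling, A-Sq, Pr > A-Sq >0.2500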

5-3 Inference on the Means of Two Populations, Variances Unknown
[PROC UNIVARIATE PLOT output for TEMP, shown once per TYPE: stem-and-leaf plot (Stem, Leaf, #), box plot, and normal probability plot. Stem-and-leaf scale: multiply Stem.Leaf by 10**+1. The plotted points are not recoverable from this transcript.]

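5-3 Inference on the Means of Two Populations, Variances Unknown
Variable: TEMP
(Only the layout of the output survives in this transcript, apart from the infinite upper confidence limits of the one-sided test.)
Summary statistics: TYPE (each group, Diff (1-2)) | N | Mean | Std Dev | Std Err | Minimum | Maximum
Confidence limits: TYPE | Method | Mean | 95% CL Mean | Std Dev | 95% CL Std Dev
  Diff (1-2) Pooled: upper 95% CL Mean = Infty
  Diff (1-2) Satterthwaite: upper 95% CL Mean = Infty
t tests: Method (Pooled, Satterthwaite) | Variances (Equal, Unequal) | DF | t Value | Pr > t
Equality of Variances: Method (Folded F) | Num DF | Den DF | F Value | Pr > F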

Inference on Two Populations
H0: μ1 = μ2
- Both σ's known: Z-test (normal distribution).
- Both n's large: Z-test; use S for σ if σ is unknown.
- Otherwise, are both X's normal?
  - NO: Wilcoxon-Mann-Whitney test.
  - YES: F test of σ1 = σ2.
    - YES: t-test with pooled variance.
    - NO: t-test with the Satterthwaite approximation.

Inference on Two Populations
Table columns: Visitors; Week of July 4, 2009 | Visitors; Week of July 4, (second year and all counts not preserved in this transcript)

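Inference on Two Populations
Table columns: X1 | R1 | X2 | R2 (observations and their ranks; values not preserved in this transcript)
S_R1^2 = (value lost), S_R1 = 3.29
S_R2^2 = (value lost), S_R2 = 5.11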

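Inference on Two Populations
DATA CARLSBAD;
  INPUT YEAR COUNT;
  /* the data lines that followed CARDS are not reproduced in this transcript */
  CARDS;
;
PROC UNIVARIATE DATA=CARLSBAD NORMAL;  /* normality tests for each YEAR */
  VAR COUNT;
  BY YEAR;
  TITLE 'PROBLEM ASSUMING NORMALITY';
PROC TTEST DATA=CARLSBAD;              /* two-sample t-test on the raw counts */
  CLASS YEAR;
  VAR COUNT;
PROC RANK DATA=CARLSBAD OUT=RANKED;    /* replace COUNT by its rank in the combined sample */
  VAR COUNT;
PROC TTEST DATA=RANKED;                /* t-test on the ranks */
  CLASS YEAR;
  VAR COUNT;
  TITLE 'Problem using Wilcoxon-Mann-Whitney test';
RUN; QUIT;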

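Inference on Two Populations
PROBLEM ASSUMING NORMALITY
(Output shown for each YEAR; only the values listed below survive in this transcript.)
First YEAR group: The UNIVARIATE Procedure, Variable: COUNT
  Moments: N 7 | Sum Weights 7 | Mean | Sum Observations 2764 | Std Deviation | Variance | Skewness | Kurtosis | Uncorrected SS | Corrected SS | Coeff Variation | Std Error Mean
  Tests for Normality (Test | Statistic | p Value): Shapiro-Wilk, W, Pr < W | Kolmogorov-Smirnov, D, Pr > D > (digits lost) | Cramer-von Mises, W-Sq, Pr > W-Sq | Anderson-Darling, A-Sq, Pr > A-Sq
Second YEAR group: The UNIVARIATE Procedure, Variable: COUNT
  Moments: N 7 | Sum Weights 7 | Mean | Sum Observations 2613 | Std Deviation | Variance | Skewness | Kurtosis | Uncorrected SS | Corrected SS | Coeff Variation | Std Error Mean
  Tests for Normality (Test | Statistic | p Value): Shapiro-Wilk, W, Pr < W | Kolmogorov-Smirnov, D, Pr > D | Cramer-von Mises, W-Sq, Pr > W-Sq | Anderson-Darling, A-Sq, Pr > A-Sq <0.0050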

Inference on Two Populations
(Only the layout of the two outputs survives in this transcript; the numeric values are not preserved.)
The TTEST Procedure
Variable: COUNT
Summary statistics: YEAR (each year, Diff (1-2)) | N | Mean | Std Dev | Std Err | Minimum | Maximum
Confidence limits: YEAR (each year, Diff (1-2) Pooled, Diff (1-2) Satterthwaite) | Method | Mean | 95% CL Mean | Std Dev | 95% CL Std Dev
t tests: Method (Pooled, Satterthwaite) | Variances (Equal, Unequal) | DF | t Value | Pr > |t|
Equality of Variances: Method (Folded F) | Num DF | Den DF | F Value | Pr > F
Problem using Wilcoxon-Mann-Whitney test
The TTEST Procedure
Variable: COUNT (Values of COUNT Were Replaced by Ranks)
Same four tables as above, computed on the ranks.

5-4 The Paired t-Test
A special case of the two-sample t-tests of Section 5-3 occurs when the observations on the two populations of interest are collected in pairs. Each pair of observations, say (X_1j, X_2j), is taken under homogeneous conditions, but these conditions may change from one pair to another. In the hardness-testing example, for instance, the test procedure consists of analyzing the differences between the hardness readings on each specimen.
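Analyzing the differences reduces the problem to a one-sample t-test. A sketch of the standard statistic follows, assuming the differences are normally distributed and writing \Delta_0 for the hypothesized mean difference (notation assumed here).

```latex
D_j = X_{1j} - X_{2j}, \quad j = 1, \dots, n
\qquad
\bar{D} = \frac{1}{n}\sum_{j=1}^{n} D_j
\qquad
S_D^2 = \frac{1}{n-1}\sum_{j=1}^{n} \left(D_j - \bar{D}\right)^2
\qquad
T_0 = \frac{\bar{D} - \Delta_0}{S_D/\sqrt{n}}
```

Under H_0: \mu_D = \Delta_0, T_0 follows a t distribution with n - 1 degrees of freedom, where n is the number of pairs.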


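OPTIONS NOOVP NODATE NONUMBER LS=80;
DATA STRENGTH;
  INPUT K L;
  DIFF = K-L;                          /* paired difference */
  /* the data lines that followed CARDS are not reproduced in this transcript */
  CARDS;
;
PROC UNIVARIATE DATA=STRENGTH NORMAL;  /* t-test of mean DIFF = 0 plus normality tests */
  VAR DIFF;
  TITLE 'PAIRED T-TEST BY PROC UNIVARIATE';
PROC TTEST DATA=STRENGTH;              /* same test via the PAIRED statement */
  PAIRED K*L;
  TITLE 'PAIRED TTEST BY PROC TTEST';
RUN; QUIT;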

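5-4 The Paired t-Test
PAIRED T-TEST BY PROC UNIVARIATE
(Only the values listed below survive in this transcript.)
The UNIVARIATE Procedure, Variable: DIFF
  Moments: N 9 | Sum Weights 9 | Mean | Sum Observations | Std Deviation | Variance | Skewness | Kurtosis | Uncorrected SS | Corrected SS | Coeff Variation | Std Error Mean
  Tests for Location: Mu0=0 (Test | Statistic | p Value): Student's t, t, Pr > |t|
  Tests for Normality (Test | Statistic | p Value): Shapiro-Wilk, W, Pr < W | Kolmogorov-Smirnov, D, Pr > D > (digits lost)
PAIRED TTEST BY PROC TTEST
The TTEST Procedure
Difference: K - L
  N | Mean | Std Dev | Std Err | Minimum | Maximum
  Mean | 95% CL Mean | Std Dev | 95% CL Std Dev
  DF | t Value | Pr > |t|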


Paired Versus Unpaired Comparisons

5-4 The Paired t-Test
Confidence Interval for μ_D
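The standard 100(1 - \alpha)% two-sided interval for the mean difference can be sketched as follows, where \bar{d} and s_D are the sample mean and standard deviation of the n paired differences (lowercase notation assumed here) and the differences are taken to be normally distributed.

```latex
\bar{d} - t_{\alpha/2,\,n-1}\,\frac{s_D}{\sqrt{n}}
\;\le\; \mu_D \;\le\;
\bar{d} + t_{\alpha/2,\,n-1}\,\frac{s_D}{\sqrt{n}}
```

With \alpha = 0.05 this is the "95% CL Mean" interval that PROC TTEST reports for a PAIRED analysis.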


Example: An insurance adjuster wants to compare estimates from two different repair garages for minor repairs on automobiles. Thirteen pairs of estimates are available. (The data table, with columns Sample, First, Second, and D, is not preserved in this transcript.)
(a) State the appropriate null and alternative hypotheses to see if there is any difference in the mean estimates of the two garages. Let α = 0.05 and test the null hypothesis with the Wilcoxon signed ranks test. State the p-value.
(b) Check the differences in estimates from the two garages for normality.
(c) Based on the results of part (b), the paired t-test should not be applied to these data; however, compute the paired t-test for the null hypothesis in part (a) and compare it with the results of the Wilcoxon signed ranks test.

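5-4 The Paired t-Test
Table columns: First | Second | D | |D| | |R| | R (estimates, differences, absolute differences, ranks of the absolute differences, and signed ranks; values not preserved in this transcript)
S_R = 6.63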

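5-4 The Paired t-Test
OPTIONS NOOVP NODATE NONUMBER LS=80;
DATA INSURE;
  INPUT FIRST SECOND;
  DIFF=FIRST-SECOND;
  IF DIFF<0 THEN IND=1; ELSE IND=0;    /* flag negative differences */
  ABSDIFF=ABS(DIFF);
  /* the data lines that followed CARDS are not reproduced in this transcript */
  CARDS;
;
PROC UNIVARIATE DATA=INSURE NORMAL;    /* normality tests and t-test on DIFF */
  VAR DIFF;
  TITLE 'normality check and t-test';
PROC RANK DATA=INSURE OUT=RINSURE;     /* rank the absolute differences */
  VAR ABSDIFF;
DATA RINSURE;
  SET RINSURE;
  IF IND=1 THEN ABSDIFF=-ABSDIFF;      /* restore the sign to obtain signed ranks */
PROC UNIVARIATE DATA=RINSURE;          /* location tests on the signed ranks */
  VAR ABSDIFF;
  TITLE 'Wilcoxon Signed Ranks Test';
RUN; QUIT;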

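5-4 The Paired t-Test
normality check and t-test
(Only the values listed below survive in this transcript.)
The UNIVARIATE Procedure, Variable: DIFF
  Moments: N 13 | Sum Weights 13 | Mean 7 | Sum Observations 91 | Std Deviation | Variance 136 | Skewness | Kurtosis | Uncorrected SS 2269 | Corrected SS 1632 | Coeff Variation | Std Error Mean
  Tests for Location: Mu0=0 (Test | Statistic | p Value): Student's t, t, Pr > |t| | Sign, M 3, Pr >= |M| | Signed Rank, S 27, Pr >= |S|
  Tests for Normality (Test | Statistic | p Value): Shapiro-Wilk, W, Pr < W | Kolmogorov-Smirnov, D, Pr > D < (digits lost) | Cramer-von Mises, W-Sq, Pr > W-Sq < (digits lost) | Anderson-Darling, A-Sq, Pr > A-Sq <0.0050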
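The Student's t value is missing from this output, but it can be recomputed from the moments that did survive. The sketch below assumes the listed mean (7), variance (136), and N (13) all describe DIFF, and it uses \bar{d} and s_D as notation for the sample mean and standard deviation of the differences.

```latex
t_0 = \frac{\bar{d}}{s_D/\sqrt{n}}
    = \frac{7}{\sqrt{136/13}}
    \approx \frac{7}{3.234}
    \approx 2.16,
\qquad
\text{df} = n - 1 = 12
```

For comparison, the two-sided 5% critical value is t_{0.025,12} \approx 2.179, so the recomputed statistic falls just below it.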

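5-4 The Paired t-Test
Wilcoxon Signed Ranks Test
(Only the values listed below survive in this transcript.)
The UNIVARIATE Procedure, Variable: ABSDIFF (Values of ABSDIFF Were Replaced by Ranks)
  Moments: N 13 | Sum Weights 13 | Mean | Sum Observations 61 | Std Deviation | Variance | Skewness | Kurtosis | Uncorrected SS | Corrected SS | Coeff Variation | Std Error Mean
  Tests for Location: Mu0=0 (Test | Statistic | p Value): Student's t, t, Pr > |t| | Sign, M 3.5, Pr >= |M| | Signed Rank, S 30.5, Pr >= |S|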
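The Student's t line of this output is a t-test applied to the signed ranks, which serves as an approximation to the Wilcoxon signed ranks test. Its value can be roughly reconstructed from the surviving sum of the signed ranks (61) and N (13). This assumes that S_R = 6.63 on the earlier ranks slide is the standard deviation of these same signed ranks; \bar{R} is notation introduced here for their mean.

```latex
\bar{R} = \frac{61}{13} \approx 4.69
\qquad
t_0 = \frac{\bar{R}}{S_R/\sqrt{n}}
    \approx \frac{4.69}{6.63/\sqrt{13}}
    \approx \frac{4.69}{1.839}
    \approx 2.55,
\qquad
\text{df} = n - 1 = 12
```

Under that assumption the rank-based statistic exceeds t_{0.025,12} \approx 2.179, which is the kind of contrast with the ordinary paired t-test that part (c) of the example asks about.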