Introduction to SAS Essentials Mastering SAS for Data Analytics


Introduction to SAS Essentials Mastering SAS for Data Analytics Alan Elliott and Wayne Woodward SAS ESSENTIALS -- Elliott & Woodward

Chapter 11: COMPARING MEANS USING T-TESTS

LEARNING OBJECTIVES
• To understand the use of the one-sample t-test using PROC UNIVARIATE and PROC TTEST
• To understand the use of the two-sample t-test using PROC TTEST
• To understand the use of the paired t-test using PROC MEANS and PROC TTEST

11.1 PERFORMING A ONE-SAMPLE T-TEST
A one-sample t-test is often used to compare an observed mean with a known or "gold standard" value. In general, for a one-sample t-test you obtain a random sample from some population and then compare the observed sample mean to some fixed value. The typical hypotheses for a one-sample t-test are as follows:
H0: μ = μ0 The population mean is equal to a hypothesized value, μ0.
Ha: μ ≠ μ0 The population mean is not equal to μ0.
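For reference, the standard test statistic behind these hypotheses can be written as follows. The symbols for the sample mean, sample standard deviation, and sample size are notation introduced here and do not appear on the slide.

```latex
% One-sample t statistic: \bar{x} is the sample mean, s the sample
% standard deviation, N the sample size, \mu_0 the hypothesized mean.
t = \frac{\bar{x} - \mu_0}{s / \sqrt{N}}, \qquad \text{df} = N - 1
```

Under H0, and if the normality assumption holds, this statistic follows a t distribution with N - 1 degrees of freedom.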

t-test Assumptions The key assumption underlying the one-sample t-test is that the population from which the random sample is selected is normal. If the data are non-normal, then nonparametric tests such as the sign test and the signed rank test are available (see Chapter 15). However, because of the central limit theorem, whenever the sample size is sufficiently large, the distribution of the sample mean is often approximately normal even when the population is non-normal. (See next slide)

Sample Size Recommendations
The following are general guidelines (see Moore, McCabe, and Craig, 2014).
• Small sample size (N < 15): You should not use the one-sample t-test if the data are clearly skewed or if outliers are present.
• Moderate sample size (15 ≤ N < 40): The one-sample t-test can be safely used except when there are severe outliers.
• Large sample size (N ≥ 40): The one-sample t-test can be safely used without regard to skewness or outliers.

Running the One-Sample t-Test in SAS
Here are two ways to perform a one-sample t-test. First, using PROC UNIVARIATE, specify the value of μ0 with the MU0= option; the test is reported in the "Tests for Location" table. For example,
    PROC UNIVARIATE MU0=4; VAR LENGTH; RUN;
performs a t-test of the null hypothesis that μ = 4 (that is, μ0 = 4). A second method for performing this one-sample t-test is to use PROC TTEST. For this procedure, the corresponding SAS code is
    PROC TTEST H0=4; VAR LENGTH; RUN;
Do Hands On Exercise p 272 (ATTEST1.SAS)

Example Code for One Sample t-test
    TITLE 'Using PROC UNIVARIATE';
    PROC UNIVARIATE DATA=ONESAMPLE MU0=4;
      VAR LENGTH;
    RUN;
    TITLE 'Using PROC TTEST';
    PROC TTEST DATA=ONESAMPLE H0=4;
      VAR LENGTH;
    RUN;
Note the indication of the hypothesized value in each procedure: MU0=4 in PROC UNIVARIATE and H0=4 in PROC TTEST.

Results for a One-Sample t-test – Two Ways
PROC UNIVARIATE RESULTS – See the line for Student's t, where p=0.1759.
PROC TTEST RESULTS – See where p=0.1759.
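To try both approaches without the book's data file, a minimal self-contained sketch is shown here. The data values are made up for illustration (they are not the ATTEST1.SAS data), so the p-value will differ from the 0.1759 reported above; the data set and variable names follow the slides.

```sas
/* Hypothetical data: these LENGTH values are illustrative only. */
DATA ONESAMPLE;
  INPUT LENGTH @@;
  DATALINES;
4.1 3.8 4.4 4.0 3.6 4.3 3.9 4.2 3.7 4.5
;
RUN;

/* Method 1: see the Student's t row of "Tests for Location". */
TITLE 'Using PROC UNIVARIATE';
PROC UNIVARIATE DATA=ONESAMPLE MU0=4;
  VAR LENGTH;
RUN;

/* Method 2: the same test of H0: mu = 4 using PROC TTEST. */
TITLE 'Using PROC TTEST';
PROC TTEST DATA=ONESAMPLE H0=4;
  VAR LENGTH;
RUN;
```

Both procedures should report the same t statistic and two-sided p-value for the same data, which is a quick way to confirm that MU0= and H0= were set consistently.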

11.2 PERFORMING A TWO-SAMPLE T-TEST
The SAS PROC TTEST procedure is used to test for the equality of means for a two-sample (independent group) t-test. The purpose of the two-sample t-test is to determine whether your data provide you with enough evidence to conclude that there is a difference in means. For a two-sample t-test you obtain independent random samples of size N1 and N2 from the two populations, and compare the observed sample means. The typical hypotheses for a two-sample t-test are as follows:
H0: μ1 = μ2 The population means of the two groups are equal.
Ha: μ1 ≠ μ2 The population means are not equal.

Key Assumptions for a Two-Sample t-test
Key assumptions underlying the two-sample t-test are:
1. The random samples are independent.
2. The populations are normally distributed with equal variances.
If the data are non-normal, then nonparametric tests such as the Mann-Whitney U are available (see Chapter 15: Nonparametric Statistics).

Guidelines regarding normality and equal variance assumptions
Normality: As in the one-sample case, rules of thumb are available to help you determine whether to go ahead and trust the results of a two-sample t-test even when your data are non-normal. The sample size guidelines given earlier in this chapter for the one-sample test can be used in the two-sample case by replacing N with N1 + N2.
Equal variances: There are two t-tests reported by SAS in this setting: one based on the assumption that the variances of the two groups are equal (and thus using a pooled estimate of the common variance) and one (Satterthwaite) not making that assumption. Both methods make the same assumptions about normality. There is no universal agreement concerning which t-test to use. See text for more guidelines.
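The two versions of the test differ only in how the standard error and the degrees of freedom are computed. The standard textbook forms are sketched here; the symbols for the group means and the group variances are notation introduced here, not taken from the slides.

```latex
% Pooled (equal-variance) version
s_p^2 = \frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{N_1 + N_2 - 2}, \qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}}, \qquad
\text{df} = N_1 + N_2 - 2

% Satterthwaite (unequal-variance) version
t' = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}}, \qquad
\text{df} \approx \frac{\left(\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}\right)^2}
{\frac{(s_1^2/N_1)^2}{N_1 - 1} + \frac{(s_2^2/N_2)^2}{N_2 - 1}}
```

When the two sample variances and sample sizes are similar, the two versions give nearly the same result; they diverge as the variances and group sizes become more unequal.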

Running the Two-Sample t-Test in SAS
Two-sample t-tests can be obtained using PROC TTEST, which was previously introduced in the context of a one-sample test. In this section, we'll describe PROC TTEST more completely. The syntax for the TTEST procedure is as follows:
    PROC TTEST <options>;
      CLASS variable;
      <Statements>;

Common Options for PROC TTEST
Table 11.3 Common Options for PROC TTEST
Option              Explanation
DATA=datasetname    Specifies which data set to use.
COCHRAN             Uses the Cochran and Cox probability approximation for unequal variances.
H0=n                Specifies the hypothesized value under H0.
WEIGHT var;         Counts observations a number of times according to the WEIGHT variable.

Common statements for PROC TTEST (Table 11.3 Continued)
Statement                   Explanation
CLASS variable;             For a two-sample t-test, specifies the grouping variable for the analysis. Example: PROC TTEST; CLASS GROUP; VAR SCORE;
VAR variables;              Specifies the observed variables for the test. Example: CLASS GROUP; VAR SCORE WEIGHT HEIGHT;
PAIRED x*y;                 Specifies that a paired t-test is to be performed and which variables to use. Example: PAIRED BEFORE*AFTER;
BY, FORMAT, LABEL, WHERE    These statements are common to most procedures and may be used here.
Do Hands On Example p 276 (ATTEST2.SAS)
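As an illustration of how the options and statements in Table 11.3 fit together, here is a minimal sketch. The data set name MYDATA, the variable AGE, and all data values are hypothetical; GROUP and SCORE are the variable names used in the table's examples.

```sas
/* Hypothetical data: names and values are illustrative only. */
DATA MYDATA;
  INPUT GROUP $ AGE SCORE;
  DATALINES;
A 25 71
A 31 68
A 17 80
A 40 75
B 22 82
B 35 79
B 16 90
B 29 85
;
RUN;

/* DATA=, COCHRAN, and H0= are options on the PROC TTEST statement. */
/* WHERE, CLASS, and VAR are separate statements.                   */
PROC TTEST DATA=MYDATA COCHRAN H0=0;
  WHERE AGE >= 18;   /* analyze only the adult subjects     */
  CLASS GROUP;       /* grouping variable with two levels   */
  VAR SCORE;         /* observed variable compared by group */
RUN;
```

The CLASS variable must have exactly two levels among the observations that remain after the WHERE filter is applied.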

Performing a Two-Sample t-test
    PROC TTEST;
      CLASS BRAND;
      VAR HEIGHT;
      TITLE 'Independent Group t-Test Example';
    RUN;
The CLASS statement identifies the grouping variable. The VAR statement identifies the observed variable.
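A self-contained version of this example, for readers without the ATTEST2.SAS file, might look like the following sketch. The data set name BRANDS and the HEIGHT values are assumptions made up for illustration; only the BRAND and HEIGHT variable names come from the slide.

```sas
/* Hypothetical data: values are illustrative, not the book's data. */
DATA BRANDS;
  INPUT BRAND $ HEIGHT @@;
  DATALINES;
A 20.0 A 23.0 A 32.0 A 24.0 A 25.0 A 28.0 A 27.5
B 25.0 B 46.0 B 56.0 B 45.0 B 46.0 B 51.0 B 34.0
;
RUN;

PROC TTEST DATA=BRANDS;
  CLASS BRAND;    /* grouping variable: brand A versus brand B */
  VAR HEIGHT;     /* observed variable                         */
  TITLE 'Independent Group t-Test Example';
RUN;
```

The output includes both the Pooled and the Satterthwaite t-tests along with the Equality of Variances test.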

Understanding the output
FIRST - If there is a concern about equal variances, use the Equality of Variances test in the output to determine whether your variances are equal. A low p-value (p < 0.05) indicates that the variances are unequal.
SECOND - If the variances are equal, use the standard (Pooled) t-test. If you consider the variances unequal, use the unequal-variances (Satterthwaite) version of the t-test.
THIRD - In this example, using either t-test, the p-value is small enough to reject the null hypothesis and conclude that there is a difference in means.

Graphical Results from Two Sample t-test Graphical results provide a visual confirmation that the mean HEIGHT of group B is larger than the mean of group A

Using PROC BOXPLOT
A complementary graphical procedure that can be used to compare means visually is PROC BOXPLOT. Although box plots are produced by default, as shown in the preceding example, you might want to create a plot that includes only the side-by-side box plots. You can use PROC BOXPLOT to create this type of graph (see Chapter 18).
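A minimal sketch of side-by-side box plots with PROC BOXPLOT follows. The data set name BRANDS and its values are hypothetical; BRAND and HEIGHT are the variable names from the two-sample example. PROC BOXPLOT expects the input to be sorted by the group variable, so a PROC SORT step is included.

```sas
/* Hypothetical data: values are illustrative only. */
DATA BRANDS;
  INPUT BRAND $ HEIGHT @@;
  DATALINES;
A 20.0 A 23.0 A 32.0 A 24.0 A 25.0 A 28.0 A 27.5
B 25.0 B 46.0 B 56.0 B 45.0 B 46.0 B 51.0 B 34.0
;
RUN;

/* Sort by the group variable before calling PROC BOXPLOT. */
PROC SORT DATA=BRANDS;
  BY BRAND;
RUN;

/* PLOT analysis-variable*group-variable draws one box per group. */
PROC BOXPLOT DATA=BRANDS;
  PLOT HEIGHT*BRAND;
  TITLE 'Side-by-Side Box Plots of HEIGHT by BRAND';
RUN;
```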

11.3 PERFORMING A PAIRED T-TEST
To perform a paired t-test to compare two repeated measures (such as in a before–after situation) where both observations are taken from the same or matched subjects, use PROC TTEST with the PAIRED statement. Suppose your data contain the variables WBEFORE and WAFTER (before and after weight on a diet) for eight subjects. The hypotheses for this test are:
H0: μLoss = 0 The population average weight loss is zero.
Ha: μLoss ≠ 0 The population average weight loss is not zero.
Do Hands on Example p 279 (ATTEST3.SAS)
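The paired test is a one-sample t-test applied to the within-subject differences. In standard notation (the symbols for the differences, their mean, and their standard deviation are introduced here, not taken from the slides):

```latex
% d_i is the weight loss for subject i; N is the number of pairs.
d_i = \text{WBEFORE}_i - \text{WAFTER}_i, \qquad
t = \frac{\bar{d}}{s_d / \sqrt{N}}, \qquad
\text{df} = N - 1
```

Here N is the number of pairs (eight in this example), so the test has N - 1 = 7 degrees of freedom.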

Example code for a Paired t-test
    PROC TTEST;
      PAIRED WBEFORE*WAFTER;
      TITLE 'Paired t-test example';
    RUN;
Note the PAIRED statement, which indicates the two paired variables WBEFORE and WAFTER.

Understanding the Paired t-test output
Note that the test is performed on the difference of the paired variables. The t-test reports (in this case) a p-value of p=0.0056, which indicates that you should reject the null hypothesis of no difference and conclude that there was a statistically significant weight loss.
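Because the paired test is performed on the differences, the same test can also be obtained by computing the difference in a DATA step and testing whether its mean is zero with PROC MEANS, the other procedure named in the learning objectives. The sketch below assumes a data set named DIET with made-up weights for eight subjects, so its p-value will not match the 0.0056 reported above; the variable LOSS is also an assumption introduced here.

```sas
/* Hypothetical data: eight subjects with illustrative weights. */
DATA DIET;
  INPUT WBEFORE WAFTER;
  LOSS = WBEFORE - WAFTER;   /* within-subject difference */
  DATALINES;
162 168
170 136
184 147
164 159
172 143
176 161
159 143
170 145
;
RUN;

/* Paired t-test using the PAIRED statement. */
PROC TTEST DATA=DIET;
  PAIRED WBEFORE*WAFTER;
  TITLE 'Paired t-test example';
RUN;

/* Equivalent test on the difference: T is the t statistic for */
/* H0: mean = 0 and PRT is its two-sided p-value.              */
PROC MEANS DATA=DIET N MEAN STDERR T PRT;
  VAR LOSS;
  TITLE 'Paired t-test using PROC MEANS on the difference';
RUN;
```

The two procedures should report the same t statistic and p-value, since both test whether the mean of WBEFORE - WAFTER is zero.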

Graphical Results for Paired t-test Notice that the 95% confidence interval (indicated by the green highlight) does not include 0, which is another way to conclude that the mean difference is greater than 0.

11.4 SUMMARY
This chapter illustrated how to perform tests on one or two means using PROC UNIVARIATE and PROC TTEST. In particular, the chapter emphasizes that it is important to keep in mind the distinction between an independent-groups comparison and a repeated (paired) observations comparison.
Continue to Chapter 12: CORRELATION AND REGRESSION

These slides are based on the book:
Introduction to SAS Essentials: Mastering SAS for Data Analytics, 2nd Edition
By Alan C. Elliott and Wayne A. Woodward
Paperback: 512 pages
Publisher: Wiley; 2 edition (August 3, 2015)
Language: English
ISBN-10: 111904216X
ISBN-13: 978-1119042167
These slides are provided for you to use to teach SAS using this book. Feel free to modify them for your own needs. Please send comments about errors in the slides (or suggestions for improvements) to acelliott@smu.edu. Thanks.