Comparing Two Groups’ Means or Proportions: Independent Samples t-tests.

Review: Confidence Interval for a Mean. Slap a sampling distribution* over a sample mean to determine a range in which the population mean has a particular probability of lying, such as a 95% CI. If our sample is one of the middle 95%, the population mean falls within the CI: 95% CI = Y-bar +/- 1.96*(s.e.), leaving 2.5% of samples in each tail.

Review: Significance Test for a Mean. Slap a sampling distribution* over a guess of the population mean (the null value, µo) to determine whether the sample has a very low probability of having come from a population where the guess is true, such as α-level = .05. If our sample mean falls in the outer 5% (beyond +/- 1.96 z), we reject the guess: our sample has a low chance of having come from a population with the mean we guessed. Test statistic: z or t = (Y-bar - µo)/s.e.

*Sampling distribution: the way a statistic from samples of a certain size is distributed across all possible random samples drawn with replacement.

Review: Let’s collect some data on educational aspirations and produce a 95% confidence interval to tell us what the population parameter likely is; then let’s do a significance test, guessing that average aspiration will be 16 years. We have a sample of 625 kids who reported their educational aspirations (where 12 = high school, 16 = 4 years of college, and so forth). The sample mean is 15 years with a standard deviation of 2 years.

95% confidence interval = Sample Mean +/- z * s.e.
1. Calculate the standard error (s.e.) of the sampling distribution: s.e. = s/√n = 2/√625 = 2/25 = 0.08
2. Build the width of the interval, using the s.e. and the z that corresponds with the percent confidence. 95% corresponds with a z of +/- 1.96: width = +/- z * s.e. = +/- 1.96 * 0.08 = +/- 0.157
3. Center the interval width on the mean (add to and subtract from the mean): 95% CI = Sample Mean +/- z * s.e. = 15 +/- 0.157

The 95% CI: 14.84 to 15.16. We are 95% confident that the population mean falls between these values. (What does this say about my guess of 16???)
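The arithmetic in these steps can be checked with a few lines of Python (illustrative only, not part of the original slides; the numbers are the ones above):

```python
import math

n = 625       # sample size
mean = 15.0   # sample mean (years of aspired education)
sd = 2.0      # sample standard deviation
z = 1.96      # z for 95% confidence

se = sd / math.sqrt(n)           # standard error = 2/25 = 0.08
width = z * se                   # half-width of the interval
lo, hi = mean - width, mean + width

print(round(se, 2), round(lo, 2), round(hi, 2))  # 0.08 14.84 15.16
```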

Review: Same data, now the significance test of the guess that average aspiration is 16 years. Sample: n = 625 kids, mean = 15 years, s.d. = 2 years.

Significance Test: z or t = (Y-bar - µo) / s.e.
1. Decide the α-level (α = .05) and the nature of the test (two-tailed)
2. Set the critical z or t: +/- 1.96
3. Make a guess, the null hypothesis: Ho: µ = 16; Ha: µ ≠ 16
4. Collect and analyze data
5. Calculate z or t: s.e. = s/√n = 2/√625 = 2/25 = .08, so z or t = (Y-bar - µo)/s.e. = (15 – 16)/.08 = -1/.08 = -12.5
6. Make a decision about the null hypothesis: reject the null (-12.5 < -1.96)
7. Find the P-value (look up 12.5 in a z or t table): P < .0001

It is extremely unlikely that our sample came from a population where the mean is 16.
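The same test statistic as a Python sketch (values from this example):

```python
import math

n, mean, sd = 625, 15.0, 2.0
mu0 = 16.0                   # null-hypothesis guess

se = sd / math.sqrt(n)       # 0.08
t = (mean - mu0) / se        # (15 - 16)/0.08 = -12.5

# |t| = 12.5 is far beyond the critical value of 1.96,
# so we reject the null (P < .0001).
print(round(t, 1))  # -12.5
```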

Other Probability Distributions. A note: not all theoretical probability distributions are Normal. One example of many is the binomial distribution. The binomial distribution gives the discrete probability of obtaining exactly k successes out of N trials, where each trial is a success with a known probability and a failure with the inverse probability. The binomial distribution has a formula and changes shape with each probability of success and number of trials. However, in this class the normal probability distribution is the most useful! (Chart: a binomial distribution of successes, used with proportions.)
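As a sketch, the binomial formula can be computed directly with Python's standard library (the fair-coin example is hypothetical, just to show the shape of the calculation):

```python
from math import comb

def binom_pmf(k, N, p):
    """P(exactly k successes in N independent trials, success prob p)."""
    return comb(N, k) * p**k * (1 - p)**(N - k)

# Probability of exactly 5 heads in 10 fair coin flips:
print(round(binom_pmf(5, 10, 0.5), 4))  # 0.2461
```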

Why t instead of z? We use t instead of z to be more accurate. t curves are symmetric and bell-shaped like the normal distribution, but their spread is greater than that of the standard normal distribution: the tails are fatter. A t score is the number of standard errors on a t distribution. There is a separate t curve for each degrees of freedom (df = 1, 2, 5, and so on), approaching the normal curve as df exceeds 120.

t  The reason for using t is due to the fact that we use sample standard deviation (s) rather than population standard deviation (σ) to calculate standard error. Since s, standard deviations, will vary from sample to sample, the variability in the sampling distribution ought to be greater than in the normal curve. t has a larger spread, more accurately reflecting the likelihood of extreme samples, especially when sample size is small.  The larger the degrees of freedom (n – 1 when estimating the mean), the closer the t curve is to the normal curve. This reflects the fact that the standard deviation s approaches σ for large sample size n.  Even though z-scores based on the normal curve will work for larger samples (n > 120) SPSS uses t for all tests because it works for small samples and large samples alike. (df = the number of scores that are free to vary when calculating a statistic... n - ?) Tea Tests?

Comparing Two Groups We’re going to move forward to more sophisticated statistics, building on what we have learned about confidence intervals and significance tests. Social scientists look for relationships between concepts in the social world. For example: Does one’s sex affect income? Focus on the relationship between the concepts: Sex and Income Does one’s race affect educational attainment? Focus on the relationship between the concepts: Race and Educational Attainment I love sophisticated statistics!

Comparing Two Groups In this section of the course, you will learn ways to infer from a sample whether two concepts are related in a population. Independent variable (X): that which causes another variable to change when it changes. Dependent variable (Y): that which changes in response to change in another variable. X → Y (X = Sex or Race; Y = Income or Education). The statistical technique you use will depend on the level of measurement of your independent and dependent variables; the statistical test must match the variables! Levels of measurement: Nominal, Ordinal, Interval-Ratio. Causality requires three necessary conditions: 1. Association 2. Time order 3. Nonspuriousness

Comparing Two Groups The test you choose depends on level of measurement:

Independent                        Dependent                       Statistical Test
Dichotomous                        Interval-ratio                  Independent Samples t-test
Nominal/Ordinal/Dichotomous        Nominal/Ordinal/Dichotomous     Cross Tabs
Nominal/Ordinal/Dichotomous        Interval-ratio                  ANOVA
Interval-ratio (or Dichotomous)    Interval-ratio                  Correlation and OLS Regression

Comparing Two Groups With a dichotomous independent variable and an interval-ratio dependent variable, the test is the independent samples t-test. An independent samples t-test is concerned with whether a mean or proportion is equal between two groups. For example, does sex affect income? Is women’s mean income equal to men’s mean income in the population (µ♀ = µ♂)???

Comparing Two Groups Independent Samples t-tests: Earlier, our focus was on the mean. We used the mean of the sample (statistic) to infer a range for what our population mean (parameter) might be (confidence interval) or whether it was like some guess or not (significance test). Now, our focus is on the difference in the mean for two groups. We will use the difference in the sample means (statistic) to infer a range for what our population difference in means (parameter) might be (confidence interval) or whether it is like some guess (significance test).

Comparing Two Groups The difference is calculated as follows: D-bar = Y-bar 2 – Y-bar 1. For example: Average Difference in Income by Sex = Male Average Income – Female Average Income. (What would it mean if men’s income minus women’s income equaled zero?)

Comparing Two Groups Like the mean, if one were to take random sample after random sample from two groups—with normal population distributions—and calculate and record the difference between the group means each time, one would see the formation of a Sampling Distribution for D-bar that is normal and centered on the difference between the two populations’ means. (Chart: the sampling distribution of D-bar, the average difference between two groups’ samples, with its 95% range marked in z-scores.)

Comparing Two Groups So the rules and techniques we learned for means apply to the differences in groups’ means: one creates sampling distributions to build confidence intervals and do significance tests in the same ways. However, the standard error of D-bar has to be calculated slightly differently.

For Means: s.e. (s.d. of the sampling distribution) = √( (s 1 )²/n 1 + (s 2 )²/n 2 )   (used with equal sample sizes)

For Proportions: s.e. = √( π 1 (1 - π 1 )/n 1 + π 2 (1 - π 2 )/n 2 )

df = n1 + n2 - 2

Variance Sum Law: the variance of the difference between two independent variables is the sum of their variances.
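Both standard-error formulas as a Python sketch (function names are mine, purely for illustration; the example numbers are the GPA figures used later in these slides):

```python
import math

def se_diff_means(s1, n1, s2, n2):
    """Unpooled s.e. of the difference between two sample means."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def se_diff_props(p1, n1, p2, n2):
    """s.e. of the difference between two sample proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

print(round(se_diff_means(0.5, 50, 0.4, 50), 4))  # 0.0906
```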

Comparing Two Groups When variances are assumed to be equal and sample sizes differ, we use the pooled estimate of variance for the standard error. Start with the pooled variance, which weights each group’s variance by its degrees of freedom:

(s p )² = [ (n 1 - 1)(s 1 )² + (n 2 - 1)(s 2 )² ] / (n 1 + n 2 - 2)

Then, for Means: s.e. = √( (s p )²/n 1 + (s p )²/n 2 )   (assumes equal variances)

df = n1 + n2 - 2
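The pooled version as a sketch in the same illustrative style; note that with equal n’s it coincides with the unpooled result:

```python
import math

def pooled_se(s1, n1, s2, n2):
    """Pooled s.e.: weight each group's variance by its df, then pool."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(sp2 / n1 + sp2 / n2)

# With equal sample sizes the pooled and unpooled formulas agree:
print(round(pooled_se(0.5, 50, 0.4, 50), 4))  # 0.0906
```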

Comparing Two Groups Calculating a Confidence Interval for the Difference between Two Groups’ Means: By slapping the sampling distribution for the difference over our sample’s difference between groups, D-bar, we can find the values between which the population difference is likely to be.

95% C.I. = D-bar +/- 1.96 * (s.e.) = (Y-bar 2 – Y-bar 1 ) +/- 1.96 * (s.e.)
99% C.I. = D-bar +/- 2.58 * (s.e.) = (Y-bar 2 – Y-bar 1 ) +/- 2.58 * (s.e.)

For proportions, replace (Y-bar 2 – Y-bar 1 ) with the difference in sample proportions. Remember: when sample sizes are small, t ≠ z, and +/- 1.96 may not be appropriate; use the t value for your df.

Comparing Two Groups Confidence Interval Example: We want to know the likely difference between male and female GPAs in a population of college students, with 95% confidence. Sample: 50 men, average GPA = 2.9, s.d. = 0.5; 50 women, average GPA = 3.1, s.d. = 0.4. (Equal sample sizes, so the standard error formula need not be pooled.)

95% C.I. = (Y-bar 2 – Y-bar 1 ) +/- 1.96 * s.e.
1. Find the standard error of the sampling distribution: s.e. = √( (.5)²/50 + (.4)²/50 ) = √(.005 + .0032) = √.0082 = .0906
2. Build the width of the interval. 95% corresponds with a z of +/- 1.96: +/- z * s.e. = +/- 1.96 * .0906 = +/- .1775
3. Insert the mean difference to build the interval: 95% C.I. = (3.1 – 2.9) +/- .1775 = 0.2 +/- .1775

The interval: .02 to .38. We are 95% confident that the difference between men’s and women’s GPAs in the population is between .02 and .38. (If we had guessed zero difference, would the difference be a significant difference?)
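Checking the interval in Python (numbers from this example):

```python
import math

n1, m1, s1 = 50, 2.9, 0.5   # men:   n, mean GPA, s.d.
n2, m2, s2 = 50, 3.1, 0.4   # women: n, mean GPA, s.d.

se = math.sqrt(s1**2 / n1 + s2**2 / n2)   # ≈ 0.0906
d = m2 - m1                               # 0.2
lo, hi = d - 1.96 * se, d + 1.96 * se

print(round(lo, 3), round(hi, 3))  # 0.023 0.377
```

Zero falls outside the interval, which previews the significance test on the next slide.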

Comparing Two Groups We can also use the standard error (the standard deviation of the sampling distribution for differences between means) to conduct a t-test.

Independent Samples t-test (pooled):

t = (Y-bar 2 - Y-bar 1 ) / √( (s p )²/n 1 + (s p )²/n 2 ),  df = n1 + n2 - 2

Comparing Two Groups Conducting a Test of Significance for the Difference between Two Groups’ Means: By slapping the sampling distribution for the difference over a guess of the difference between groups, Ho, we can find out whether our sample could have been drawn from a population where the difference is equal to our guess.

1. Two-tailed significance test for α-level = .05
2. Critical z or t = +/- 1.96
3. To find if there is a difference in the population: Ho: µ 2 - µ 1 = 0; Ha: µ 2 - µ 1 ≠ 0
4. Collect data
5. Calculate z or t: z or t = [ (Y-bar 2 – Y-bar 1 ) – (µ 2 - µ 1 ) ] / s.e.
6. Make a decision about the null hypothesis (reject or fail to reject)
7. Report the P-value

Comparing Two Groups Significance Test Example: We want to know whether there is a difference in male and female GPAs in a population of college students.

1. Two-tailed significance test for α-level = .05
2. Critical z or t = +/- 1.96
3. To find if there is a difference in the population: Ho: µ 2 - µ 1 = 0; Ha: µ 2 - µ 1 ≠ 0
4. Collect data. Sample: 50 men, average GPA = 2.9, s.d. = 0.5; 50 women, average GPA = 3.1, s.d. = 0.4. (Equal sample sizes, so the standard error formula need not be pooled.) s.e. = √( (.5)²/50 + (.4)²/50 ) = √.0082 = .0906
5. Calculate z or t: z or t = (3.1 – 2.9 – 0)/.0906 = 0.2/.0906 = 2.21
6. Make a decision about the null hypothesis: reject the null (2.21 > 1.96). There is enough difference between groups in our sample to say that there is a difference in the population.
7. Find the P-value: p or (sig.) = .0136 x 2 (the table gives one tail only) = .027

There is only a 2.7% chance that the difference in our sample could have come from a population where there is no difference between men and women. That chance is low enough to reject the null, for sure!
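The full test as a Python sketch (numbers from this example; the p-value uses the normal approximation to the t curve, which is reasonable at df = 98):

```python
import math

n1, m1, s1 = 50, 2.9, 0.5   # men
n2, m2, s2 = 50, 3.1, 0.4   # women

se = math.sqrt(s1**2 / n1 + s2**2 / n2)
t = (m2 - m1 - 0) / se      # null-hypothesis difference is 0

# Two-tailed p-value from the normal tail area: P(|Z| > t).
p = math.erfc(abs(t) / math.sqrt(2))

print(round(t, 2), round(p, 3))  # 2.21 0.027
```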

Comparing Two Groups The steps outlined above for confidence intervals and significance tests for differences in means are the same ones you would use for differences in proportions. Just note the difference in the calculation of the standard error for the difference.
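A sketch for proportions, with hypothetical numbers (the survey figures below are invented purely for illustration):

```python
import math

# Hypothetical: 0.60 of 200 men vs 0.48 of 200 women favor a policy.
p1, n1 = 0.60, 200
p2, n2 = 0.48, 200

se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = (p1 - p2) / se
p_two_tail = math.erfc(abs(z) / math.sqrt(2))

print(round(z, 2), round(p_two_tail, 3))  # z ≈ 2.43 > 1.96: reject the null
```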

Comparing Two Groups The steps outlined above for confidence intervals and significance tests for independent groups are the same ones you would use for differences between dependent (paired) groups. Just note the difference in the calculation of the standard error, which is based on the within-pair differences D: s.e. = s D / √n, with df = n - 1.
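A paired sketch with hypothetical before/after scores (the data are invented for illustration):

```python
import math

# Hypothetical paired data: each subject measured before and after.
before = [10, 12, 9, 11, 13, 10, 12, 11]
after  = [12, 13, 9, 14, 15, 11, 13, 12]

d = [a - b for a, b in zip(after, before)]   # within-pair differences
n = len(d)
d_bar = sum(d) / n
s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))

se = s_d / math.sqrt(n)   # s.e. = s_D / sqrt(n), df = n - 1
t = d_bar / se
print(round(t, 2))        # t ≈ 4.25 on df = 7
```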

Comparing Two Groups  Now let’s do an example with SPSS, using the General Social Survey.