Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 10: Hypothesis Tests for Two Means: Related & Independent Samples.

Clarification of Estimating Standard Errors
- Although the sample variance is an unbiased estimator of the population variance, any single sample sd is likely to underestimate the population sd.
- Standard error calculations using the sample sd will therefore usually produce probability values that are too low (i.e., z scores that are too high).
- Consequently, we use the t distribution, rather than the normal, to adjust for this bias.

One More Example for When the Population Mean Is Known
- One common case is testing whether participants' responses are greater than chance.
- For example, can participants identify subliminally presented stimuli?
- Responses on each trial are scored 0 (incorrect) or 1 (correct), so the comparison mean under the null is .50.

One More Example for When the Population Mean Is Known
- We do not, however, know the population variance; we must estimate it using the sample variance.
- When we do this, we tend to underestimate it, yielding standard errors that are too low and z scores that are too high (i.e., Type I errors).
- Therefore, we use the t distribution.

One More Example for When the Population Mean Is Known
- Let's assume the sample is 25 people, mean accuracy = .56, and sample sd = .09.
- Then t = (.56 - .50) / (.09 / √25) = .06 / .018 = 3.33 with df = 24, which exceeds the two-tailed .05 critical value of about 2.06.
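A minimal sketch of this one-sample t test, using only Python's standard library; the numbers come directly from the slide:

```python
import math

n, mean, sd, mu0 = 25, 0.56, 0.09, 0.50

se = sd / math.sqrt(n)        # estimated standard error: .09 / 5 = .018
t = (mean - mu0) / se         # .06 / .018 = 3.33
df = n - 1
print(f"t({df}) = {t:.2f}")   # exceeds the two-tailed .05 critical value (~2.06)
```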

Confidence Limits
- What is the 95% confidence interval for accuracy?
- CI = mean ± t(crit) × SE = .56 ± 2.064(.018), i.e., roughly [.523, .597].
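A minimal sketch of the interval computation, assuming SciPy is available for the critical t value; the summary statistics are the same as above:

```python
import math
from scipy import stats

n, mean, sd = 25, 0.56, 0.09
se = sd / math.sqrt(n)
t_crit = stats.t.ppf(0.975, df=n - 1)   # two-tailed critical value, about 2.064

lower, upper = mean - t_crit * se, mean + t_crit * se
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")   # roughly [.523, .597]
```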

Comparing Means from Related Samples
- A more frequent case in behavioral research is the comparison of two sets of scores that are related (i.e., not independent), such as pre-test/post-test designs or dyads.
- Dependence implies that knowing a score in one distribution allows better-than-chance prediction of the related score in the other distribution.

Comparing Means from Related Samples
- The null hypothesis in all cases is H₀: μ₁ = μ₂, i.e., the population mean difference is zero.
- This can be recast using difference scores, calculated as the difference between each subject's performance on the two occasions (or between related data points).

Comparing Means from Related Samples
- Once we compute difference scores, we are again working with a "single" sample with a known prediction for the mean (zero under the null).
- Thus, we can use a t test as we did previously, with minor modifications: we calculate the sd of the distribution of difference scores and use it to estimate the associated standard error (a worked sketch follows below).
- Note that df again equals N - 1, where N is the number of pairs.
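A minimal sketch of the difference-score approach, with made-up pre/post scores for illustration:

```python
import math

pre  = [12, 15, 11, 14, 13, 16, 12, 15]
post = [14, 17, 12, 15, 15, 18, 13, 16]

diffs = [b - a for a, b in zip(pre, post)]
n = len(diffs)
mean_d = sum(diffs) / n
# Sample sd of the difference scores (denominator N - 1)
sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
se_d = sd_d / math.sqrt(n)           # standard error of the mean difference
t = mean_d / se_d                    # H0: population mean difference = 0
df = n - 1
print(f"t({df}) = {t:.2f}")
```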

Advantages and Disadvantages of Using Related Samples
- Greatly reduces variability: variability is only with respect to change in the dv.
- Provides perfect control for extraneous variables: the comparison group is perfectly matched.
- Requires fewer participants.
- But there are problems of order and carry-over effects: experience at time 1 may alter scores at time 2 irrespective of any manipulation.

Effect Size
- Can we use p-values to quantify the magnitude of an effect?
- No: any given difference between means will be more or less significant as a function of sample size (all else being equal).
- We need a measure of the magnitude of the difference that is separate from sample size.

Effect Size
- Cohen's d is a common effect size measure for comparing two means: d = (mean difference) / sd.
- By convention, d = .2 is small, d = .5 medium, and d = .8 large.
- d can be interpreted in terms of the "non-overlap" of the two distributions.
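A quick worked instance, applying d to the earlier accuracy example (mean = .56, chance = .50, sd = .09):

```python
mean, mu0, sd = 0.56, 0.50, 0.09

d = (mean - mu0) / sd
print(f"d = {d:.2f}")   # about .67: between medium and large by convention
```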

Comparing Means from Independent Samples
- This represents one of the most frequent cases encountered in behavioral research.
- No specific information about the population means or variances is known.
- We randomly sample two groups and provide one with the relevant manipulation.
- We then wish to determine whether any difference in group means is more likely attributable to the manipulation or to sampling error.

Comparing Means from Independent Samples
- In this case, we have two independent distributions, each with its own mean and variance.
- We can easily compute the difference between the two means, but we need a measure of sampling error against which to compare it.
- Unlike previous examples, this requires a standard error for the difference between two means.

Standard Errors for Mean Differences Between Independent Samples
- The logic is similar to what we have done before: assume two distinct population distributions, then repeatedly sample a pair of means, one from each.
- The distribution of the mean differences constitutes the appropriate sampling distribution, and its sd is the standard error for the t test.
- The variance sum law dictates that the variance of the sum (or difference) of two independent variables equals the sum of their variances.

The sampling distribution of mean differences has mean μ₁ - μ₂ and, by the variance sum law, standard error √(σ₁²/n₁ + σ₂²/n₂). We know from the central limit theorem that this sampling distribution will be normal. But the true population sds are again unknown, so we substitute the sample sds and use the t, rather than the normal, distribution.
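A minimal sketch of the variance sum law applied to the standard error of a mean difference; the group sizes and sds here are made up:

```python
import math

n1, s1 = 25, 4.0   # hypothetical n and sample sd for group 1
n2, s2 = 25, 5.0   # hypothetical n and sample sd for group 2

# Variance sum law: the variance of the difference between two independent
# means is the sum of the variances of each mean.
se_diff = math.sqrt(s1**2 / n1 + s2**2 / n2)
print(f"SE of the mean difference = {se_diff:.3f}")
```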

t Tests for Independent Samples
- The formula is a generalization of the previous one: t = (M₁ - M₂) divided by the standard error of the difference, with the null that the mean difference between the samples is zero.
- df = (n₁ - 1) + (n₂ - 1) = n₁ + n₂ - 2.

t Tests for Independent Samples with Unequal n's
- The previous formula assumed equal n's in each condition.
- Sometimes, however, the n of one sample exceeds the other's, in which case its variance is a better approximation of the population variance.
- In such cases, we pool the variances using a weighted average: s²pooled = [(n₁ - 1)s₁² + (n₂ - 1)s₂²] / (n₁ + n₂ - 2). A sketch follows below.
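A minimal sketch of the pooled-variance t test; the two groups of scores are made up for illustration:

```python
import math

group1 = [21, 25, 19, 23, 22, 24, 20]
group2 = [18, 20, 17, 21, 19, 16, 18, 20]

def mean_and_var(xs):
    """Return the sample mean and variance (denominator n - 1)."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v

m1, v1 = mean_and_var(group1)
m2, v2 = mean_and_var(group2)
n1, n2 = len(group1), len(group2)

# Pooled variance: a weighted average of the two sample variances.
v_pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
se_diff = math.sqrt(v_pooled / n1 + v_pooled / n2)

t = (m1 - m2) / se_diff
df = n1 + n2 - 2
print(f"t({df}) = {t:.2f}")
```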

Assumptions for t Tests
- Homogeneity of variance: the population variances of the two distributions are equal, which implies that the variances of the two samples should be roughly equal.
- Heterogeneity is usually not a problem unless the variance of one sample is more than 3 times that of the other.
- If this occurs, SPSS and other programs will provide both a standard and an adjusted t value; the adjustment lowers the df, which reduces the chance of a Type I error.
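A hedged sketch of how such an adjusted test can be obtained with SciPy, whose ttest_ind function applies Welch's correction when equal_var=False; the data are made up, with one group deliberately far more variable:

```python
from scipy import stats

group1 = [21, 25, 19, 23, 22, 24, 20]
group2 = [18, 30, 12, 27, 9, 33, 15, 24]   # much more variable than group1

# Standard t test assumes homogeneity of variance...
print(stats.ttest_ind(group1, group2, equal_var=True))
# ...while the Welch-adjusted test relaxes that assumption (lower df).
print(stats.ttest_ind(group1, group2, equal_var=False))
```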

Assumptions for t Tests
- Normality of distributions: we assume that the sampled data are normally distributed.
- They need not be exactly normal, but should be unimodal and symmetric.
- Non-normality is really only a problem for small samples, since the central limit theorem protects large-sample inferences.

Effect Size
- Cohen's d is also used for independent samples.
- The only difference is that the denominator is the pooled sd: d = (M₁ - M₂) / s(pooled).
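A minimal sketch of pooled-sd Cohen's d from summary statistics; all values below are hypothetical:

```python
import math

m1, v1, n1 = 22.0, 4.3, 7    # hypothetical mean, variance, n for group 1
m2, v2, n2 = 18.6, 2.8, 8    # hypothetical mean, variance, n for group 2

# Pooled sd: square root of the weighted average of the two variances.
sd_pooled = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
d = (m1 - m2) / sd_pooled
print(f"Cohen's d = {d:.2f}")   # about 1.8 here: a large effect by convention
```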

Confidence Limits
- What is the 95% confidence interval for accuracy?
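A hedged sketch of the analogous 95% confidence interval for a difference between two independent means, with hypothetical summary statistics:

```python
import math
from scipy import stats

m1, v1, n1 = 22.0, 4.3, 7    # hypothetical mean, variance, n for group 1
m2, v2, n2 = 18.6, 2.8, 8    # hypothetical mean, variance, n for group 2

v_pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
se_diff = math.sqrt(v_pooled / n1 + v_pooled / n2)
t_crit = stats.t.ppf(0.975, df=n1 + n2 - 2)   # two-tailed critical value

diff = m1 - m2
print(f"95% CI: [{diff - t_crit * se_diff:.2f}, {diff + t_crit * se_diff:.2f}]")
```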