Statistics for the Social Sciences Psychology 340 Spring 2005 Sampling distribution
Outline: review of 138 stuff – What are sample distributions? – Central limit theorem – Standard error (and estimates of it) – Test statistic distributions as transformations
Flipping a coin example: n = 3 flips give $2^n = 2^3 = 8$ total outcomes (HHH, HHT, HTH, HTT, THH, THT, TTH, TTT). Number of heads in each outcome, in the same order: 3, 2, 2, 1, 2, 1, 1, 0.
Flipping a coin example: distribution of possible outcomes (n = 3 flips), where X = number of heads, f = frequency, p = probability.
X   f   p
0   1   .125
1   3   .375
2   3   .375
3   1   .125
[Figure: probability plotted against number of heads]
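The distribution of possible outcomes can be built by brute-force enumeration. The following is a minimal sketch, assuming a fair coin; the function name `heads_distribution` is illustrative and not part of the course material.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_flips):
    """Return {number_of_heads: (frequency, probability)} for a fair coin."""
    # Every sequence of H/T of length n_flips is equally likely: 2**n_flips of them.
    outcomes = list(product("HT", repeat=n_flips))
    counts = Counter(seq.count("H") for seq in outcomes)
    total = len(outcomes)
    return {x: (f, Fraction(f, total)) for x, f in sorted(counts.items())}


dist = heads_distribution(3)
for x, (f, p) in dist.items():
    print(x, f, float(p))
```

Changing the argument to `heads_distribution` shows how the same counting argument extends to other numbers of flips.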
Hypothesis testing: The distribution of possible outcomes (for a particular sample size, n) lets us make predictions about the likelihood of outcomes. This distribution of possible outcomes is often Normally distributed. In hypothesis testing, we compare our observed samples with the distribution of possible samples (transformed into standardized distributions).
Distribution of sample means: The comparison distributions considered so far were distributions of individual scores. – When what we observe is the mean of a group of scores, the comparison distribution is a distribution of means.
Distribution of sample means: A simple case – Population: 2, 4, 6, 8 – All possible samples of size n = 2 – Assumption: sampling with replacement
Distribution of sample means: A simple case – Population: 2, 4, 6, 8 – All possible samples of size n = 2, each with its own mean – There are $4^2 = 16$ of them
Distribution of sample means: In the long run, the random selection of tiles leads to a predictable pattern in the sample means.
Distribution of sample means: sample problem – What’s the probability of getting a sample with a mean of 6 or more? Of the 16 equally likely samples of size n = 2 from the population 2, 4, 6, 8, six have a mean of 6 or more, so $P(\bar{X} \ge 6) = 6/16 = 0.375$. – Same as before, except now we’re asking about sample means rather than single scores.
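The sample problem can be checked by listing every possible sample. This is a sketch that assumes the population 2, 4, 6, 8 and samples of size n = 2 drawn with replacement, as in the simple case; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]  # assumed population from the simple case
n = 2                      # assumed sample size

# Sampling with replacement, order matters: 4**2 = 16 equally likely samples.
samples = list(product(population, repeat=n))
means = [Fraction(sum(s), n) for s in samples]

# Frequency table of the sample means (the X and f columns).
freq = Counter(means)
for m in sorted(freq):
    print(float(m), freq[m], float(Fraction(freq[m], len(samples))))

# Probability of drawing a sample whose mean is 6 or more.
p_at_least_6 = Fraction(sum(f for m, f in freq.items() if m >= 6), len(samples))
print(float(p_at_least_6))
```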
Distribution of sample means: The distribution of sample means is a “virtual” distribution between the sample and the population. [Figure: Population, Distribution of sample means, Sample]
Properties of the distribution of sample means: Shape – If the population is Normal, then the distribution of sample means will be Normal. – If the sample size is large (n > 30), the distribution of sample means will be approximately Normal regardless of the shape of the population.
Properties of the distribution of sample means: Center – The mean of the distribution of sample means is equal to the mean of the population. – The two have the same numeric value but are conceptually different values.
Properties of the distribution of sample means: Center – The mean of the distribution of sample means is equal to the mean of the population. – Consider our earlier example, with population 2, 4, 6, 8. Population mean: $\mu = \frac{2 + 4 + 6 + 8}{4} = 5$. Mean of the distribution of sample means (16 samples of size n = 2): $\mu_{\bar{X}} = \frac{80}{16} = 5$.
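The center property can be verified directly for the earlier example. A minimal sketch, assuming the population 2, 4, 6, 8 and all ordered samples of size 2 drawn with replacement:

```python
from itertools import product
from statistics import mean

population = [2, 4, 6, 8]  # assumed population from the earlier example

pop_mean = mean(population)

# One mean per possible sample of size 2 (16 samples in total).
sample_means = [mean(s) for s in product(population, repeat=2)]
mean_of_means = mean(sample_means)

print(pop_mean, mean_of_means)
```

Both values come out equal, even though one describes individual scores and the other describes sample means.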
Properties of the distribution of sample means: Spread – The standard deviation of the distribution of sample means depends on two things: the standard deviation of the population and the sample size.
Properties of the distribution of sample means: Spread – Standard deviation of the population: the smaller the population variability, the closer the sample means are to the population mean. [Figure: two panels, each marking $X_1$, $X_2$, $X_3$]
Properties of the distribution of sample means: Spread – Sample size: the larger the sample size, the smaller the spread. [Figures: distribution of sample means for n = 1, n = 10, and n = 100]
Properties of the distribution of sample means: Spread – Standard deviation of the population – Sample size – Putting them together, we get the standard deviation of the distribution of sample means: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ – Commonly called the standard error
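The usual formula for the standard error, the population standard deviation divided by the square root of n, can be compared against the spread of an exhaustively enumerated distribution of sample means. This sketch assumes the population 2, 4, 6, 8 and n = 2 from the earlier example; `standard_error` is an illustrative helper name.

```python
import math
from itertools import product
from statistics import mean, pstdev


def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means."""
    return sigma / math.sqrt(n)


population = [2, 4, 6, 8]  # assumed population from the earlier example
n = 2

sigma = pstdev(population)  # population standard deviation
sample_means = [mean(s) for s in product(population, repeat=n)]
sd_of_means = pstdev(sample_means)  # spread of all 16 sample means

print(sd_of_means, standard_error(sigma, n))

# The same population spread gives a smaller standard error as n grows.
for size in (1, 10, 100):
    print(size, standard_error(sigma, size))
```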
Standard error: The standard error is the average amount that you’d expect a sample mean (for samples of size n) to deviate from the population mean. – In other words, it is an estimate of the error that you’d expect by chance (or by sampling).
Distribution of sample means: Keep your distributions straight by taking care with your notation. – Sample: $s$, $\bar{X}$ – Population: $\sigma$, $\mu$ – Distribution of sample means: $\sigma_{\bar{X}}$, $\mu_{\bar{X}}$
Properties of the distribution of sample means: All three of these properties are combined to form the Central Limit Theorem. – For any population with mean $\mu$ and standard deviation $\sigma$, the distribution of sample means for sample size n will approach a Normal distribution with a mean of $\mu$ and a standard deviation of $\sigma / \sqrt{n}$ as n approaches infinity (a good approximation if n > 30).
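A small simulation can illustrate the theorem for a population that is clearly not Normal. This is a sketch under stated assumptions: the exponential population (mean 1, standard deviation 1), the sample size of 30, the 5000 replications, and the fixed seed are all illustrative choices, not part of the course material.

```python
import math
import random
from statistics import fmean, pstdev


def simulate_sample_means(draw, n, n_samples, rng):
    """Draw n_samples samples of size n and return the mean of each."""
    return [fmean(draw(rng) for _ in range(n)) for _ in range(n_samples)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 30

# Skewed population: exponential with rate 1, so mu = 1 and sigma = 1.
sample_means = simulate_sample_means(lambda r: r.expovariate(1.0), n, 5000, rng)

observed_center = fmean(sample_means)    # should be close to mu = 1
observed_spread = pstdev(sample_means)   # should be close to sigma / sqrt(n)
predicted_spread = 1.0 / math.sqrt(n)

print(observed_center, observed_spread, predicted_spread)
```

A histogram of `sample_means` would show the roughly symmetric, bell-like shape that the theorem predicts, despite the skew of the population.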
Performing your statistical test: What are we doing when we test the hypotheses? – Computing a test statistic. Generic test: test statistic = observed difference / difference expected by chance. – The observed difference could be the difference between a sample and a population, or between different samples. – The difference expected by chance is based on the standard error or an estimate of the standard error.
Hypothesis Testing With a Distribution of Means: The distribution of means is the comparison distribution when a sample has more than one individual. – Find a Z score of your sample’s mean on a distribution of means.
“Generic” statistical test. An example: one-sample z-test. Memory example experiment: We give n = 16 memory patients a memory improvement treatment. How do they compare to the general population of memory patients, who have a distribution of memory errors that is Normal with $\mu = 60$, $\sigma = 8$? After the treatment they have an average score of $\bar{X} = 55$ memory errors. Step 1: State your hypotheses. $H_0$: the memory treatment sample is the same as (or worse than) the population of memory patients, $\mu_{\text{Treatment}} \ge \mu_{\text{pop}}$, that is, $\mu_{\text{Treatment}} \ge 60$. $H_A$: their memory is better than that of the population of memory patients, $\mu_{\text{Treatment}} < \mu_{\text{pop}}$, that is, $\mu_{\text{Treatment}} < 60$.
“Generic” statistical test, one-sample z-test (memory example). Step 2: Set your decision criteria. $H_0: \mu_{\text{Treatment}} \ge 60$; $H_A: \mu_{\text{Treatment}} < 60$; $\alpha = 0.05$, one-tailed.
“Generic” statistical test, one-sample z-test (memory example). Step 3: Collect your data. After the treatment, the n = 16 patients have an average score of $\bar{X} = 55$ memory errors.
“Generic” statistical test, one-sample z-test (memory example). Step 4: Compute your test statistic. $z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{55 - 60}{8 / \sqrt{16}} = \frac{-5}{2} = -2.5$
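The test statistic in Step 4 can be reproduced in a few lines. This is a sketch using the values from the memory example; the function name `one_sample_z` is illustrative.

```python
import math


def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """z score of a sample mean on the distribution of sample means."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error


# Memory example: n = 16, population mean 60, population SD 8, sample mean 55.
z = one_sample_z(sample_mean=55, pop_mean=60, pop_sd=8, n=16)
print(z)
```

The denominator is the standard error, so the result says how many standard errors the sample mean lies from the population mean.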
“Generic” statistical test, one-sample z-test (memory example). Step 5: Make a decision about your null hypothesis. With $\alpha = 0.05$, one-tailed, the rejection region is the lowest 5% of the comparison distribution (z below about -1.645), and the observed z = -2.5 falls inside it. – Reject $H_0$. – This supports our $H_A$: the evidence suggests that the treatment decreases the number of memory errors.
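The decision in Step 5 can be sketched with the standard Normal distribution from Python's `statistics` module (Python 3.8 or later). The observed z of -2.5 and the one-tailed alpha of 0.05 come from the example; reporting a p-value alongside the critical value is an addition for illustration.

```python
from statistics import NormalDist

alpha = 0.05       # one-tailed decision criterion from Step 2
z_observed = -2.5  # test statistic from Step 4

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Lower-tail critical value: 5% of the distribution lies below it.
z_critical = standard_normal.inv_cdf(alpha)

# Probability of a z this low or lower if the null hypothesis were true.
p_value = standard_normal.cdf(z_observed)

reject_null = z_observed <= z_critical
print(z_critical, p_value, reject_null)
```

Comparing z to the critical value and comparing the p-value to alpha are two views of the same decision rule, so they always agree.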