Sampling Distributions Suppose I throw a die 10000 times and count the number of times each face turns up: each score has a similar frequency (uniform distribution).

Presentation transcript:

Sampling Distributions Suppose I throw a die 10000 times and count the number of times each face turns up: each score has a similar frequency (uniform distribution).

Sampling Distributions If instead you throw the die 10 times (or throw ten dice) and take the average score each time, you get something like this: [Figure: distribution of the average score over 10 throws]

Sampling Distributions Compare averaging 10 vs 20 throws each go:

Note what happens to the spread and shape of the distribution of average scores. [Figure panels: 1×, 10×, 20× throws]

1. Central Limit Theorem This is a theorem of statistics and probability that implies that the distribution of a sum (or average) of a set of independent scores approaches a Normal Distribution as the number of scores involved in the sum (or average) gets larger and larger, whatever the shape of the distribution of the individual scores (provided it has a finite variance). [Figure panels: Single Throws vs. Average of N Throws; Light Bulb Life vs. Average Life]
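The dice demonstration from the earlier slides can be reproduced with a short simulation. This is only a sketch: the function name `sample_means`, the 10000 repeats and the fixed seed are illustrative assumptions, not part of the original slides.

```python
import random
import statistics

def sample_means(n_throws, n_repeats=10000, seed=0):
    """Return n_repeats averages, each taken over n_throws throws of a fair die."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    means = []
    for _ in range(n_repeats):
        throws = [rng.randint(1, 6) for _ in range(n_throws)]
        means.append(sum(throws) / n_throws)
    return means

# Spread (standard deviation) of the averages for 1, 10 and 20 throws per average.
spreads = {}
for n in (1, 10, 20):
    means = sample_means(n)
    spreads[n] = statistics.pstdev(means)
    print(f"average of {n:2d} throws: mean = {statistics.fmean(means):.2f}, "
          f"SD = {spreads[n]:.2f}")
```

The averages stay centred near 3.5, while their spread shrinks as more throws go into each average. Plotting a histogram of `sample_means(10)` should also show the bell shape replacing the flat single-throw distribution.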

2. Relation between the variation between individual scores and the variation between the averages of several scores. If the individual scores (values) in a population have a variance of X, then the averages of samples of size n have a variance of X/n. This is intuitive: think of individual heights.

Population distribution of individual heights: mean μ = 5’5”, axis running from 4’0” to 7’0”; σ (population SD) is approximately 7”.

Population distribution of raw scores: 68% of scores lie within 1 standard deviation of the mean (μ = 5’5”, σ = 7”, axis from 4’0” to 7’0”), so 68% of people have a height between 4’10” and 6’0”.

Suppose we take a random sample of 10 people and measure their heights. The mean of the sample (x̄) will tend to be quite close to the average height (μ = 5’5”).

Keep taking samples of 10 people and measure the average height (x̄) of each. [Figure: sample means marked on the height axis, 4’0” to 7’0”, μ = 5’5”]

Keep taking samples of 50 people and measure the average height (x̄) of each. [Figure: sample means marked on the height axis, 4’0” to 7’0”, μ = 5’5”]

Distribution of Sample Means The sample means (x̄) cluster around the population mean (μ = 5’5”) more closely than the raw scores do.

The degree of spread (standard deviation of the sample means) around the population mean depends on the number (n) in each sample. [Figure: distributions of sample means for n=10, n=20 and n=30, centred on μ = 5’5”, axis from 4’0” to 7’0”]

Variance and SD As we observed before, the Variance of the sample means is the Variance of the population of individual scores divided by the sample size. Because the Standard Deviation is the square root of the Variance, the Standard Deviation of the sample means is equal to the Standard Deviation of the individual scores divided by the square root of the sample size.
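A small sketch can check this rule against simulated samples, using the heights example (mean 5’5”, i.e. 65 inches, and SD 7”). It assumes heights are normally distributed, which the slides suggest but do not state; the function names, the 5000 repeats and the seed are illustrative choices.

```python
import math
import random
import statistics

POP_MEAN_IN = 65.0  # 5'5" expressed in inches
POP_SD_IN = 7.0     # population SD from the heights example

def standard_error(sd, n):
    """SD of the means of samples of size n: population SD divided by sqrt(n)."""
    return sd / math.sqrt(n)

def simulated_se(n, n_samples=5000, seed=1):
    """Draw many samples of size n (normal heights assumed) and return the SD of their means."""
    rng = random.Random(seed)
    means = [
        sum(rng.gauss(POP_MEAN_IN, POP_SD_IN) for _ in range(n)) / n
        for _ in range(n_samples)
    ]
    return statistics.pstdev(means)

for n in (10, 20, 30, 50):
    print(f"n = {n:2d}: formula = {standard_error(POP_SD_IN, n):.2f} in, "
          f"simulated = {simulated_se(n):.2f} in")
```

The simulated values should sit close to the formula values, and both shrink as n grows.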

The amount of variation (standard deviation of the sample means) around the population mean depends on the number (n) in each sample. The standard deviation of the means of samples of size n around the population mean μ is equal to the population standard deviation divided by √n, and is called the standard error of the mean (se).

Raw scores: SD = 7”. Samples of size 10: SD of the sample means = 7/√10 = 7/3.16 = 2.2”. [Figure: both distributions centred on μ = 5’5”, axis from 4’0” to 7’0”]

Quick Summary We get an idea of the amount of variation in the population of individual scores from the variation within our sample (i.e. the data). Given that our sample average is based on n scores, we know how much the sample averages would be expected to vary from one sample to the next.

T-Test The T-Test works by assuming the data collected in two conditions are equivalent to two samples drawn from the same ‘parent’ population (this is the null hypothesis). The variation within the data is a good estimate of the variation in the parent population. This, together with the size of the samples, allows one to predict how much variation to expect in the means from one sample to the next.

T Test If the two sample means obtained in the experimental conditions differ by more than we’d expect from this simple relation between the variation of individual scores and sample averages, then it is unlikely that the data in the two conditions are equivalent to two samples from the same parent population. It is more likely they reflect two samples from different parent populations (i.e. ones with different means).

I.e. if the data do reflect samples from the same population, we expect the means of our samples, say of size 10, to cluster around the population mean quite closely. [Figure: parent population of individual scores (μ = 5’5”, axis from 4’0” to 7’0”) with the narrower expected variation of samples of size 10]

Not: [Figure: the same parent population of individual scores and expected variation of samples of size 10, μ = 5’5”, axis from 4’0” to 7’0”]

It is more likely that the real situation is that the two samples come from different parent populations, one with mean 5’5” and one with mean 6’5”. [Figure: two population curves on the height axis, 4’0” to 7’0”]
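The reasoning in the last few slides can be sketched as a pooled-variance two-sample t statistic. The heights below are hypothetical values in inches, invented purely for illustration (group means of 5’5” and 6’5”), and `pooled_t` is an illustrative name. The critical value 2.447 is the standard two-tailed 5% table value for 6 degrees of freedom.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Two-sample t statistic using a pooled estimate of the parent variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Variation within the data estimates the variation in the parent population.
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / df
    # Expected spread of the difference between two sample means under H0.
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / se_diff
    return t, df

# Hypothetical heights in inches (not data from the slides).
condition_a = [64.0, 66.0, 67.0, 63.0]  # mean 65 in = 5'5"
condition_b = [76.0, 78.0, 75.0, 79.0]  # mean 77 in = 6'5"

t, df = pooled_t(condition_a, condition_b)
T_CRIT_DF6 = 2.447  # two-tailed 5% critical value for df = 6
reject_h0 = abs(t) > T_CRIT_DF6
print(f"t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```

With these made-up numbers the difference between the means is far larger than the standard error of the difference, so the "same parent population" hypothesis is rejected.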

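So an experiment selects 8 babies at random and feeds half Marmite and half Bovril. Heights are measured at 20 years, and the Marmite group is compared with the Bovril group.

It is more likely that the real situation is that the two samples come from different parent populations, one with mean 5’5” and one with mean 6’5”. [Figure: two population curves on the height axis, 4’0” to 7’0”]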

T-Test & ANOVA The T-Test works by computing the likelihood of getting a certain difference between two sample means. If you have experiments with more than 2 conditions there is no single distance between two means. Instead you can examine the ‘average’ distance or variation between them. The Variance of those condition means is just such a measure. ANOVA works out how likely it is to get the observed amount of variation (Variance) between several sample means if they really had been drawn from the same parent population.

In a nutshell The data from the conditions of an experiment can be conceptualised as samples from a parent population. The null hypothesis assumes that these samples have been drawn from a single population. If the variation (or just the difference, in the case of a T-Test) between the means of these ‘samples’ is greater than we would expect given the sample sizes used, then we conclude that it is unlikely that they can be thought of as having been drawn from a single population but instead come from separate ones (i.e. ones that have different means).

Some minor details: The T-test actually works out the sampling distribution of the difference between two means. When the probability of getting a difference at least as large as the one observed, if H0 were true, is less than 5%, H0 is rejected, i.e. the two populations from which the samples were drawn are assumed not to have equal means. ANOVA works out: 1. How the sample means vary and 2. How they should vary given their size and the individual variation. If the first estimate is much larger than the second, then H0 is rejected.
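The two ANOVA estimates can be sketched as the usual one-way F ratio: variation between the condition means divided by the variation expected from the scores within the conditions. The three groups are hypothetical heights in inches invented for illustration, and `one_way_f` is an illustrative name. The critical value 4.26 is the standard 5% table value for (2, 9) degrees of freedom.

```python
import statistics

def one_way_f(groups):
    """Return the one-way ANOVA F ratio and its (between, within) degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # 1. How the sample means actually vary around the grand mean.
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # 2. Individual variation within each group, which sets the expected variation.
    ss_within = sum(sum((x - statistics.fmean(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, (k - 1, n_total - k)

# Hypothetical heights in inches for three conditions (not data from the slides).
groups = [
    [64.0, 66.0, 67.0, 63.0],
    [65.0, 67.0, 64.0, 68.0],
    [70.0, 72.0, 69.0, 73.0],
]

f_ratio, dfs = one_way_f(groups)
F_CRIT_2_9 = 4.26  # 5% critical value for (2, 9) degrees of freedom
reject_h0 = f_ratio > F_CRIT_2_9
print(f"F = {f_ratio:.2f}, df = {dfs}, reject H0: {reject_h0}")
```

An F ratio near 1 means the condition means vary about as much as the within-group variation predicts. A ratio well above the critical value, as with these made-up numbers, leads to rejecting H0.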