Lecture 6: Let’s Start Inferential Stats. Probability and Samples: The Distribution of Sample Means


Let’s Do an Experiment
Imagine a jar filled with marbles. 2/3 of the marbles are one color and the remaining 1/3 are a different color.
–Sample 1: N = 5; red = 4, white = 1 (80% red).
–Sample 2: N = 20; red = 12, white = 8 (60% red).
Which sample makes you more confident that it came from a population of 2/3 red and 1/3 white marbles? Why?
Tversky and Kahneman (1974) found that most people tend to focus more on the sample proportion than on the sample size, but when asked how many marbles they would like to draw to make their decision, people preferred the opportunity to draw 20 rather than 5.
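One way to make the jar question concrete, offered as a sketch rather than the lecture's own method: compare how strongly each sample favors a "mostly red" jar (2/3 red) over a "mostly white" jar (2/3 white). This assumes every draw is independent with the same chance of red (drawing with replacement, or from a very large jar). The function name odds_mostly_red is made up for illustration.

```python
from fractions import Fraction

def odds_mostly_red(red, white):
    """Likelihood ratio of a 2/3-red jar versus a 2/3-white jar (assumed setup)."""
    p = Fraction(2, 3)   # chance of the majority color on one draw
    q = 1 - p            # chance of the minority color on one draw
    like_red_jar = p**red * q**white      # sample likelihood if the jar is 2/3 red
    like_white_jar = q**red * p**white    # sample likelihood if the jar is 2/3 white
    # The binomial coefficient is the same for both jars, so it cancels in the ratio.
    return like_red_jar / like_white_jar

odds_sample_1 = odds_mostly_red(4, 1)     # N = 5, 80% red
odds_sample_2 = odds_mostly_red(12, 8)    # N = 20, 60% red
print("Sample 1 odds for mostly red:", odds_sample_1)
print("Sample 2 odds for mostly red:", odds_sample_2)
```

Under these assumptions the larger sample favors the mostly-red jar more strongly than the smaller one, even though its proportion of red is lower.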

Today’s Goal
First, from z-scores and probabilities we KNOW:
–How scores relate to each other in a distribution
–How an individual score relates to its population
–Where scores fit into their distributions (probabilities): Are they representative? Are they extreme?
Today’s goal is to understand the relationship between samples and populations.
But so far we only know about samples that are made up of a single individual score.
–Most researchers take much larger samples, e.g. 100 specimens, 30 dogs, 50 math scores

Populations
Members must share at least 1 trait.
The more traits members must share,
–the lower the ability to generalize
–the smaller the population size
Samples
The greater the n, the more accurate the parameter estimate (the more chances you’ve got to accurately represent the population).
Representative sample: a sample that possesses all the defining characteristics of the population from which it was drawn.

For Example:
Say we want to learn about college students at the UA. We randomly choose 30 students at the UA.
–Because we chose our sample randomly, it should be fairly representative of the population of students at the UA, but we may be missing some segments of the population (e.g. what if, by chance, our sample includes no Christians or no international students?)
–Any stats we compute for the sample will not be identical to the corresponding parameters
–What if we choose another random sample?

How do we know how closely our sample represents our population?
Z-scores: where a single score lies in its population AND where a sample mean lies in the distribution of sample means.
Samples give us an incomplete and often inaccurate picture of our population, so we keep track of sampling error:
–Sampling error: the discrepancy, or amount of error, between a sample statistic and its population parameter.

Sampling Error
Samples never precisely reflect the population.
The difference between the parameter and the statistic is sampling error (μ − M).
Sampling error is expected and normal.
p(+ sampling error) = p(− sampling error)
–Some samples overestimate and some underestimate
–This error should be random

Distribution of Sample Means
2 samples taken from the same population will probably be different
–Different individuals, different means
–Different scores, different standard deviations
Given that we can take some extremely large # of samples, what pattern might these samples show?
Distribution of sample means (or sampling distribution): the collection of sample means for all the possible random samples of a particular size (n) that can be taken from a population.
So we can compute probabilities: p(particular sample mean) = # of samples with that mean / # of all possible samples

Distribution of Sample Means
Sampling distribution: a distribution of statistics (here, means of samples).
Consider a population of 4 scores: 2, 4, 6, 8
[Figure: frequency (f) distribution of the 4 population scores]
* See also Box 7.1 in the book, page 205

Predictions
What would we expect if we created a distribution of all the possible n = 2 samples of our data set?
–Sample means won’t always be perfect, but should pile up around the pop. mean
–Should start to form a normal distribution b/c most of the sample means should pile up around the pop. mean; only a few should be extreme
–The larger the sample size, the closer the sample mean should be to the population mean b/c a larger sample should be more representative

If we choose samples of n = 2 (drawing with replacement), then we can have a total of 16 different possible samples. μ = 5
[Table: Sample, 1st score, 2nd score, M]
[Figure: frequency (f) distribution of the 16 sample means]
(1) Sample means pile up around the pop. mean (they are representative).
(2) The distribution is ~normal.
(3) We can use this distribution of sample means to answer probability questions, e.g. what is the probability of obtaining a sample mean less than 3? p(M < 3) = 1/16, or 0.06
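The 16 samples can be listed by hand, or enumerated as in this short sketch. It assumes ordered samples drawn with replacement from the population 2, 4, 6, 8 (which is what gives 4 × 4 = 16 samples), then tallies the sample means and answers the p(M < 3) question by counting.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]
n = 2

# Every ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))
means = [Fraction(sum(s), n) for s in samples]

print("Sample  1st  2nd  M")
for i, (s, m) in enumerate(zip(samples, means), start=1):
    print(f"{i:>6}  {s[0]:>3}  {s[1]:>3}  {float(m):g}")

# Frequency (f) of each sample mean.
freq = Counter(means)
for m in sorted(freq):
    print(f"M = {float(m):g}: f = {freq[m]}")

# Probability = number of samples with M < 3 divided by the number of all samples.
p_below_3 = Fraction(sum(1 for m in means if m < 3), len(means))
print("p(M < 3) =", p_below_3, "=", float(p_below_3))
```

Counting this way matches the slide: only one of the 16 samples has a mean below 3.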

Central Limit Theorem
It is not reasonable to take all the possible samples from a pop. Usually we just take one.
Central Limit Theorem: describes general characteristics of the distribution of sample means
–For any population with mean μ and standard deviation σ, the distribution of sample means for sample size n will have a mean of μ and a standard deviation of σ/√n, and will approach a normal distribution as n approaches infinity.

Central Limit Theorem
Perks: describes the distribution of sample means for any population, regardless of its original shape, mean or standard deviation
–Shape
–Central tendency
–Variability
Important mathematical finding:
–The sampling distribution of the mean has a mean = population mean and a variance = population variance/n
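The mean and variance claims can be checked exactly on the small population used earlier (2, 4, 6, 8). This is only a sketch: it assumes ordered samples drawn with replacement and enumerates every possible sample for a few small values of n.

```python
from itertools import product
from math import isclose, sqrt
from statistics import fmean, pstdev, pvariance

population = [2, 4, 6, 8]
mu = fmean(population)            # population mean
sigma2 = pvariance(population)    # population variance
sigma = pstdev(population)        # population standard deviation

results = {}
for n in (1, 2, 3, 4):
    # Means of all possible ordered samples of size n (with replacement).
    sample_means = [fmean(s) for s in product(population, repeat=n)]
    results[n] = (fmean(sample_means), pvariance(sample_means), pstdev(sample_means))
    mean_of_means, var_of_means, sd_of_means = results[n]
    print(f"n = {n}: mean of M = {mean_of_means:.3f} (mu = {mu:.3f}), "
          f"variance of M = {var_of_means:.3f} (sigma^2/n = {sigma2 / n:.3f}), "
          f"SD of M = {sd_of_means:.3f} (sigma/sqrt(n) = {sigma / sqrt(n):.3f})")
```

For each n, the printed pairs should agree: the mean of the sample means equals μ, and their variance equals σ²/n.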

A little more about the shape and mean of the distribution of sample means
Shape:
–Normal if the samples come from a population that is normal
–Approximately normal if the number of scores (n) in each sample is around 30 or more
–What does this mean for research?
Mean:
–Average of all sample means = population mean
–This mean value is called the expected value of M
–It is written μ_M (b/c this value will always be equal to μ, this book will just use μ to refer to both the mean of the pop. and the mean of the distribution of sample means)
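A small simulation can illustrate the "around 30 or more" rule of thumb. Everything specific here is an assumption chosen for illustration: a strongly skewed (exponential) population with mean 1, 5000 samples per sample size, and skewness as a rough measure of how far the shape is from symmetric.

```python
import random
from statistics import fmean, pstdev

def skewness(values):
    """Mean cubed z-score: 0 for a symmetric shape, positive for a right skew."""
    m, s = fmean(values), pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

rng = random.Random(0)    # fixed seed so the run is repeatable
num_samples = 5000        # samples drawn per sample size (arbitrary choice)

skew_by_n = {}
mean_by_n = {}
for n in (1, 5, 30):
    # Each sample mean averages n draws from the skewed population.
    sample_means = [
        fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]
    skew_by_n[n] = skewness(sample_means)
    mean_by_n[n] = fmean(sample_means)
    print(f"n = {n:>2}: mean of sample means = {mean_by_n[n]:.3f}, "
          f"skewness = {skew_by_n[n]:.2f}")
```

The skewness of the sample means should shrink toward 0 as n grows, while their average stays near the population mean.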

Standard Error of M
The standard deviation of a distribution of sample means is called the standard error of M.
Just as individual scores vary around the sample mean (standard deviation), the sample means that make up the sampling distribution vary around μ.
–This measures the average error between the sample mean and the population mean. IMPORTANT!
–Standard error = σ_M = standard distance btwn M and μ.

Standard Error
How do we determine standard error (SE)?
(1) Sample size: law of large numbers = the larger the sample size (n), the more probable it is that the sample mean will be close to the population mean.
–The larger the sample, the smaller the SE
(2) Standard deviation: standard error = σ_M = σ/√n = √(σ²/n)
–By definition, σ is the standard distance between X and μ, so when n = 1 the SD and SE are the same.
–So SD should be the starting point for standard error. When n = 1, SD = SE, and as n increases, SE decreases.

More about Standard Error
In sum, the standard error provides a way to measure the “average” distance between the sample mean and the population mean.
Research:
–Typically uses only 1 sample
–What if we had chosen a different sample? See standard error as a measure of reliability
–When we calculate the standard error of the sample mean, we get an idea of how closely our sample mean represents the population mean
–Critical element in the inferential process

Illustration - Let’s do it!
We start with IQ, a population that is normally distributed with μ = 100 and σ = 15. Samples are taken where:
–n = 1
–n = 5
–n = 30
–n = 100
First calculate the SE = σ/√n

Let’s Do it!
n = 1; SE = 15/√1 = 15
n = 5; SE = 15/√5 = 6.71
n = 30; SE = 15/√30 = 2.74
n = 100; SE = 15/√100 = 1.5
Volunteer to come up and sketch each of these normal distributions with a line denoting the mean and a line denoting the standard error.
What does this tell us about how researchers should select their samples?
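The four standard errors can be reproduced with a few lines, using σ = 15 and SE = σ/√n from the slides. The helper name standard_error is just for illustration.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / sqrt(n)

sigma = 15    # population SD of IQ, as given in the illustration
se_by_n = {n: standard_error(sigma, n) for n in (1, 5, 30, 100)}
for n, se in se_by_n.items():
    print(f"n = {n:>3}: SE = {se:.2f}")
```

Going from n = 1 to n = 100 cuts the SE by a factor of 10, because SE shrinks with the square root of n rather than with n itself.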

Z-scores for sample means
We can compute the probability of obtaining a particular sample mean using z-scores.
Just like last week, only now with sample means located in a distribution of sample means.
z = (M − μ) / σ_M

Let’s Do it
The distribution of home prices for metro Boston has a mean price of $250K, with a standard deviation of $100K. What is the probability of randomly selecting a sample of n = 4 houses whose mean price is below $185K?
* Remember probability is equivalent to proportion…use your unit normal (z) table.
* Also, start by drawing the distribution.
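A possible worked solution, as a sketch: it assumes $250K and $100K are the population mean and standard deviation, that sample means are normally distributed, and it uses Python's NormalDist in place of the printed z table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 250, 100, 4    # prices in $K; treated as population values (assumption)
M = 185                       # sample mean of interest, in $K

se = sigma / sqrt(n)              # standard error of M
z = (M - mu) / se                 # z-score of the sample mean
p_below = NormalDist().cdf(z)     # proportion of sample means below M
print(f"SE = {se:g}, z = {z:g}, p(M < 185) = {p_below:.4f}")
```

The cdf call plays the role of looking up the tail proportion for z in the table.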

One More…
An automobile manufacturer claims that a newly introduced model will average 45 MPG with a standard deviation of 2. A sample of n = 4 cars is tested and averages M = 42 MPG. Is this sample mean likely to occur if the manufacturer’s claim is true? More specifically, is the sample mean within the range of values that would be expected 95% of the time?
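One way to work this, again only a sketch: treat the manufacturer's claim as the population values, assume sample means are normally distributed, and find the middle 95% of sample means as μ ± z × SE, with the critical z taken from NormalDist rather than a table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 45, 2, 4    # manufacturer's claim, treated as population values
M = 42                     # observed sample mean (MPG)

se = sigma / sqrt(n)                      # standard error of M
z = (M - mu) / se                         # z-score of the observed sample mean
z_crit = NormalDist().inv_cdf(0.975)      # cutoff for the middle 95% (about 1.96)
lower, upper = mu - z_crit * se, mu + z_crit * se
within_95 = lower <= M <= upper
print(f"z = {z:g}; middle 95% of sample means: {lower:.2f} to {upper:.2f}; "
      f"M inside that range? {within_95}")
```

If M falls outside the printed range, a sample mean like this would be unusual when the claim is true.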

Z-Scores and Sample Means
It is possible to find the probability associated with any specific sample mean.
We can make quantitative predictions about the kinds of samples obtained from any population.
You could use z-scores for your projects:
–Look up the average price of a car of a particular year on the net. Go car hunting and find a few cars made in that year. What was the probability of finding cars at that price?
–Do people in my family (using a sample of grandparents and great-grandparents) live longer than average (look up average lifespan on the net)?

Further application of z-tests
z problems can be used in conjunction with probability estimates to determine whether a sample is from a known population.
The .05 convention in statistics states that if a sample like this would be obtained 5% of the time or less from a particular population, then it can be concluded that this sample does not represent this population.

Class Problem
The population of IQ scores has μ = 100 & σ = 15. You have randomly sampled 50 students and found their mean IQ = 107. Are these students smarter than average? Was this sample mean larger than the population mean due to chance, or does this sample represent students who are truly smarter?
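A sketch of one way to approach the class problem, assuming μ = 100 and σ = 15 describe the population of individual IQ scores and that sample means are normally distributed. It computes the z-score for M = 107 and the chance of a sample mean at least that high, then compares it with the .05 convention.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100, 15, 50    # population values and sample size from the problem
M = 107                       # observed sample mean

se = sigma / sqrt(n)                 # standard error of M
z = (M - mu) / se                    # z-score of the sample mean
p_upper = 1 - NormalDist().cdf(z)    # chance of a sample mean this high or higher
unlikely_by_chance = p_upper <= 0.05
print(f"SE = {se:.2f}, z = {z:.2f}, p(M >= 107) = {p_upper:.5f}")
print("At or below the .05 cutoff?", unlikely_by_chance)
```

Whether to use one tail (as here) or both tails depends on the question being asked; the one-tailed choice is an assumption of this sketch.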

In the Literature
Because standard error plays an important role in inferential stats, you’ll see it reported in scientific papers:
Symbols: SE and SEM
Tables: columns n, Mean, SE; rows Control, Treatment

In the Literature

Homework: Chapter 7, problems 3, 5, 6, 7, 9, 10, 14, 16, 20, 24, 25, 26
Please also read Box 7.3 (page 211) for more on the difference between standard error and standard deviation.