Psyc 235: Introduction to Statistics DON’T FORGET TO SIGN IN FOR CREDIT!
Announcements (1of2) Special Lecture Thurs March 27th: Reviewing: Random Variables & Distributions Mandatory for invited students Everyone is welcome No OH; Go to lab for Qs/help. Target Dates: Should have completed Distributions and CLT before Spring Break Be well through Confidence Intervals and Hypothesis Tests (Target date April 7th)
Announcements (2of2) Assessment Next Week Same procedure as last time. AL1: Monday, Rm 289 between 9-5 BL1: Wednesday, Rm 289 between 9-5 Can schedule a specific time by attending lab section or contacting TA Remember: Bring photo ID Questions?
Also… Thank you to those who completed feedback forms. We’ll all be meeting to discuss what changes to the course we can implement immediately. More details on that soon…
So where were we?? We’d been discussing random variables and distributions. Last week, you talked about confidence intervals. Today, we’re going to start talking about hypothesis testing. But first let’s remember what we know about populations and sampling distributions.
Population ---> Sample (size = n) ---> Sampling Distribution (of the mean) We know from the CLT that, no matter what the underlying population distribution, the sampling distribution of the mean will approach normal as n gets large. This is an important underlying assumption that makes most of inferential statistics possible.
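A quick way to see the CLT claim for yourself is to simulate it. The sketch below is only an illustration: the exponential population, the sample size of 50, and the 5000 repetitions are all assumptions chosen for the demo, not values from the course.

```python
import random
import statistics

def sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from a skewed (exponential) population
    with mean 1 and SD 1, and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

# Assumed demo settings: n = 50 per sample, 5000 repeated samples.
means = sample_means(n=50, reps=5000)
center = statistics.fmean(means)   # should sit near the population mean (1.0)
spread = statistics.stdev(means)   # should sit near sigma / sqrt(n)

print(f"mean of sample means: {center:.3f} (population mean = 1.0)")
print(f"SD of sample means:   {spread:.3f} (sigma/sqrt(n) = {1 / 50 ** 0.5:.3f})")
```

Even though the population here is strongly skewed, a histogram of `means` would look roughly bell-shaped and centered on the population mean.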
So far… When talking about distributions, we did problems like… Given a certain population, what’s the probability of getting a sample statistic above/below a certain value? Population--->Sample Now we’ve shifted to… Using our Sample to reason about the POPULATION. Sample--->Population We’ve made the jump to Inferential Statistics!
But, there’s a small problem… Do you remember what it was? The sample statistic is a point estimate of the population parameter. Because it’s just an estimator, it could be off by a little… or a lot.
Population ---> Sample (size = n) ---> Sampling Distribution (of the mean) We only have one sample statistic, and we don’t know where in the sampling distribution it falls.
Confidence Intervals One way to resolve this is by creating a Confidence Interval. An interval around the sample statistic that would capture the true population parameter a certain percent of the time (e.g., 95%) in the long run. (i.e., over all samples of the same size, from the same population)
CONFIDENCE INTERVAL A (1 - α)·100% confidence interval for a population parameter: Point estimate ± (critical value) · (Std. dev. of point estimate), where p(C.I. encloses true population parameter) = 1 - α. The point estimate is the sample statistic. The standard deviation of the point estimate is the standard deviation of the sampling distribution (aka the “Standard Error”). The critical value times the standard error is the Margin of Error. Note: This is the structure of all CIs. Just plug in the appropriate critical value and standard deviation.
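For example, for the Population Mean when σ is known, the CI uses the Standard Normal Distribution: $\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$, where the term after the ± is the Margin of Error.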
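Here is a minimal sketch of that interval in Python, followed by a check of the "long run" interpretation. The sample mean of 102, σ = 15, and n = 36 are made-up numbers for illustration, and the function names are hypothetical helpers rather than anything from the course materials.

```python
import random
import statistics
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, conf=0.95):
    """CI for a population mean when sigma is known (standard normal)."""
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # critical value z_{alpha/2}
    margin = z_crit * sigma / n ** 0.5                 # margin of error
    return xbar - margin, xbar + margin

def coverage(mu, sigma, n, reps=2000, conf=0.95, seed=1):
    """Fraction of simulated intervals that capture the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        low_i, high_i = z_confidence_interval(xbar, sigma, n, conf)
        if low_i <= mu <= high_i:
            hits += 1
    return hits / reps

# Hypothetical data: sample mean 102 from n = 36, known sigma = 15.
low, high = z_confidence_interval(xbar=102.0, sigma=15.0, n=36)
print(f"95% CI: ({low:.2f}, {high:.2f})")

# Long-run check with an assumed true mean of 100: should be close to 0.95.
rate = coverage(mu=100.0, sigma=15.0, n=36)
print(f"share of intervals capturing mu: {rate:.3f}")
```

The first part is just "point estimate ± critical value · standard error". The second part repeats the procedure over many samples to show what "95% of the time in the long run" means.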
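Decision Tree for Confidence Intervals
Population Standard Deviation known?
  Yes: Pop. Distribution normal?
    Yes: z-score
    No: n large? (CLT)
      Yes: z-score
      No: Can’t do it
  No: Pop. Distribution normal?
    Yes: t-score
    No: n large? (CLT)
      Yes: t-score
      No: Can’t do it
Critical Score: a z-score comes from the standard normal distribution; a t-score comes from the t distribution.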
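One reading of the decision tree can be written as a small function. This is a sketch under the assumption that the tree asks the same "normal?" and "n large?" questions on both branches. The function name and its return values are hypothetical.

```python
def critical_score_type(sigma_known, population_normal, n_large):
    """Return "z", "t", or None ("can't do it") for a CI on the mean.

    Assumed reading of the tree: we need either a normal population or a
    large n (CLT); then sigma known -> z-score, sigma unknown -> t-score.
    """
    if not (population_normal or n_large):
        return None  # small sample from a non-normal population
    return "z" if sigma_known else "t"

print(critical_score_type(sigma_known=True, population_normal=False, n_large=True))
print(critical_score_type(sigma_known=False, population_normal=True, n_large=False))
print(critical_score_type(sigma_known=False, population_normal=False, n_large=False))
```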
Hypothesis Testing With Confidence Intervals, we used the sample statistic to create a range within which the true population mean was likely to fall. In hypothesis testing, we pose a question (usually whether a sample falls within a specified population or belongs to a different population), set a decision criterion, complete calculations, and then either reject or fail to reject the null hypothesis.
Defining our hypothesis H0 = the Null hypothesis. Usually designed to be the situation of no difference. The hypothesis we test. H1 = the alternative hypothesis. Usually the research-related hypothesis. That’s funny. The hypothesis we test is the opposite of what we hope to show. We can never prove something to be true, only false (Everyone has 2 arms?). It also provides a standard starting point.
For example, What is the Null hypothesis in the following situations? A principal believes that an extra 20 minutes of recess will make her 3rd graders perform better on standardized tests than other third graders. You want to test whether people living west of the Mississippi are taller than those living east.
Then, Once you’ve chosen your hypothesis, you determine a decision criterion or cutoff value (c). If your observed mean falls beyond that criterion, then you reject the null. But how do we determine the criterion?
Decision Criterion You can just choose a cutoff value that is reasonable or makes sense. However, typically we select a level of α and calculate what the cutoff value should be given our distribution (in many ways this is a lot like finding a z-score at a certain p-value). Quick check… What does α mean, again?
Example: An experimenter wants to know if individuals’ pupils are more dilated when viewing pictures of the opposite sex. He knows that the average pupil dilation is 4 millimeters with a standard deviation of 1.5 mm. The experimenter decides to set a cutoff value so that 99% of observations fall below that cutoff value. How would we set this problem up? Define Hypothesis One-tailed or Two? Determine cutoff value
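One possible setup, sketched in Python. It assumes pupil dilation is normally distributed and that the cutoff applies to a single observation (for a sample mean you would use σ/√n in place of σ). Under that reading, H0 is "dilation is the usual 4 mm", H1 is "dilation is greater than 4 mm", and the test is one-tailed.

```python
from statistics import NormalDist

mu0 = 4.0      # average pupil dilation under H0 (mm), from the example
sigma = 1.5    # population standard deviation (mm), from the example
alpha = 0.01   # 99% of observations should fall below the cutoff

# One-tailed critical z: the point with 99% of the standard normal below it.
z_crit = NormalDist().inv_cdf(1 - alpha)

# Convert the critical z back to millimeters.
cutoff = mu0 + z_crit * sigma

print(f"critical z = {z_crit:.3f}")
print(f"cutoff     = {cutoff:.2f} mm")
```

With these numbers the cutoff comes out to about 7.49 mm, so an observed dilation above that value would lead us to reject H0.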
But what about the other 1%? If you noticed, 1% of the time observations that are in the population will fall above the cutoff value. This is a type of error in statistical tests… and it is very, very important to keep in mind. (It’s also why we try to limit the number of statistical tests we perform, and why we replicate studies.)
ERRORS Type I errors (α): rejecting the null hypothesis given that it is actually true; e.g., a court finding a person guilty of a crime that they did not actually commit. Type II errors (β): failing to reject the null hypothesis given that the alternative hypothesis is actually true; e.g., a court finding a person not guilty of a crime that they did actually commit.
Type I and Type II errors Power (1 - β)
Possible Outcomes of the Decision-Making Process
                     State of the True World
Decision             H0 True                        H0 False
Reject H0            Type I Error, p = α            Correct decision, p = 1 - β = Power
Don’t reject H0      Correct decision, p = 1 - α    Type II Error, p = β
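To make the four cells concrete, here is a sketch that computes β and power for a one-tailed z test on a sample mean. The true mean of 5 mm, n = 25, and α = .05 are hypothetical values picked only for illustration, and `one_tailed_power` is a made-up helper name.

```python
from statistics import NormalDist

def one_tailed_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Cutoff, beta, and power for an upper-tailed z test on a sample mean."""
    se = sigma / n ** 0.5                             # standard error of the mean
    cutoff = NormalDist(mu0, se).inv_cdf(1 - alpha)   # reject H0 above this mean
    beta = NormalDist(mu_true, se).cdf(cutoff)        # miss: below cutoff though H0 is false
    return cutoff, beta, 1 - beta

# Hypothetical scenario: H0 mean 4 mm, true mean 5 mm, sigma 1.5 mm, n = 25.
cutoff, beta, power = one_tailed_power(mu0=4.0, mu_true=5.0, sigma=1.5, n=25)
print(f"cutoff = {cutoff:.3f} mm")
print(f"beta   = {beta:.3f}  (Type II error rate)")
print(f"power  = {power:.3f}  (1 - beta)")
```

Moving the cutoff higher lowers α but raises β, which is the trade-off the table summarizes.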
Calculating p, instead of cutoff If we wanted to find the probability of obtaining a specific sample mean from a population, we just calculate that t or z value as before: $Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
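A short sketch of that calculation for the z case, with made-up numbers (sample mean 4.6 mm, n = 25) plugged into the pupil example's μ and σ. The helper name `z_test` is hypothetical, and the p-value shown is one-tailed (upper tail).

```python
from statistics import NormalDist

def z_test(xbar, mu, sigma, n):
    """Return the z statistic and the upper-tail probability of a sample mean."""
    z = (xbar - mu) / (sigma / n ** 0.5)   # Z = (x-bar - mu) / (sigma / sqrt(n))
    p_upper = 1 - NormalDist().cdf(z)      # P(sample mean this large or larger)
    return z, p_upper

# Hypothetical sample: mean 4.6 mm from n = 25, population mu = 4, sigma = 1.5.
z, p = z_test(xbar=4.6, mu=4.0, sigma=1.5, n=25)
print(f"z = {z:.2f}, one-tailed p = {p:.4f}")
```

Instead of comparing the sample mean to a cutoff, we compare p to α: here p is about .023, which is below α = .05 but not below α = .01.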
Selecting a distribution
Remember: Assessment. Special Lecture. Contact us with problems. Good luck everyone!