Statistics: Unlocking the Power of Data Lock 5 Inference for Means STAT 250 Dr. Kari Lock Morgan Sections 6.4, 6.5, 6.6, 6.10, 6.11, 6.12, 6.13 t-distribution.

Slides:



Advertisements
Similar presentations
Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 10/23/12 Sections , Single Proportion, p Distribution (6.1)
Advertisements

Hypothesis Testing: Intervals and Tests
INFERENCE: SIGNIFICANCE TESTS ABOUT HYPOTHESES Chapter 9.
Hypothesis Testing I 2/8/12 More on bootstrapping Random chance
Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: p-value STAT 250 Dr. Kari Lock Morgan SECTION 4.2 Randomization distribution p-value.
Statistics: Unlocking the Power of Data Lock 5 Inference Using Formulas STAT 101 Dr. Kari Lock Morgan Chapter 6 t-distribution Formulas for standard errors.
© 2013 Pearson Education, Inc. Active Learning Lecture Slides For use with Classroom Response Systems Introductory Statistics: Exploring the World through.
Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 10/25/12 Sections , Single Mean t-distribution (6.4) Intervals.
Significance Testing Chapter 13 Victor Katch Kinesiology.
Comparing Two Population Means The Two-Sample T-Test and T-Interval.
© 2010 Pearson Prentice Hall. All rights reserved Least Squares Regression Models.
T-tests Computing a t-test  the t statistic  the t distribution Measures of Effect Size  Confidence Intervals  Cohen’s d.
PSY 307 – Statistics for the Behavioral Sciences
Today Today: Chapter 10 Sections from Chapter 10: Recommended Questions: 10.1, 10.2, 10-8, 10-10, 10.17,
Inferences About Means of Single Samples Chapter 10 Homework: 1-6.
STAT 101 Dr. Kari Lock Morgan Exam 2 Review.
Chapter 11: Inference for Distributions
5-3 Inference on the Means of Two Populations, Variances Unknown
Quantitative Business Methods for Decision Making Estimation and Testing of Hypotheses.
Inference for Categorical Variables 2/29/12 Single Proportion, p Distribution Intervals and tests Difference in proportions, p 1 – p 2 One proportion or.
Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: p-value STAT 101 Dr. Kari Lock Morgan 9/25/12 SECTION 4.2 Randomization distribution.
Confidence Intervals and Hypothesis Tests
Statistics: Unlocking the Power of Data Lock 5 Inference for Proportions STAT 250 Dr. Kari Lock Morgan Chapter 6.1, 6.2, 6.3, 6.7, 6.8, 6.9 Formulas for.
4.1 Introducing Hypothesis Tests 4.2 Measuring significance with P-values Visit the Maths Study Centre 11am-5pm This presentation.
Overview Definition Hypothesis
More Randomization Distributions, Connections
Copyright © Cengage Learning. All rights reserved. 13 Linear Correlation and Regression Analysis.
Statistics: Unlocking the Power of Data Lock 5 Normal Distribution STAT 250 Dr. Kari Lock Morgan Chapter 5 Normal distribution Central limit theorem Normal.
Normal Distribution Chapter 5 Normal distribution
Section 10.1 ~ t Distribution for Inferences about a Mean Introduction to Probability and Statistics Ms. Young.
More About Significance Tests
Essential Synthesis SECTION 4.4, 4.5, ES A, ES B
+ Chapter 9 Summary. + Section 9.1 Significance Tests: The Basics After this section, you should be able to… STATE correct hypotheses for a significance.
Statistics: Unlocking the Power of Data Lock 5 Synthesis STAT 250 Dr. Kari Lock Morgan SECTIONS 4.4, 4.5 Connecting bootstrapping and randomization (4.4)
Comparing Two Population Means
Inference for Means (C23-C25 BVD). * Unless the standard deviation of a population is known, a normal model is not appropriate for inference about means.
Estimation: Sampling Distribution
Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 9/18/12 Confidence Intervals: Bootstrap Distribution SECTIONS 3.3, 3.4 Bootstrap.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 10 Comparing Two Populations or Groups 10.2.
Inference for Quantitative Variables 3/12/12 Single Mean, µ t-distribution Intervals and tests Difference in means, µ 1 – µ 2 Distribution Matched pairs.
Statistics: Unlocking the Power of Data Lock 5 Normal Distribution STAT 101 Dr. Kari Lock Morgan 10/18/12 Chapter 5 Normal distribution Central limit theorem.
The Practice of Statistics Third Edition Chapter 13: Comparing Two Population Parameters Copyright © 2008 by W. H. Freeman & Company Daniel S. Yates.
Inference We want to know how often students in a medium-size college go to the mall in a given year. We interview an SRS of n = 10. If we interviewed.
7. Comparing Two Groups Goal: Use CI and/or significance test to compare means (quantitative variable) proportions (categorical variable) Group 1 Group.
Lesson Comparing Two Means. Knowledge Objectives Describe the three conditions necessary for doing inference involving two population means. Clarify.
Statistics: Unlocking the Power of Data Lock 5 Exam 2 Review STAT 101 Dr. Kari Lock Morgan 11/13/12 Review of Chapters 5-9.
Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 12/6/12 Synthesis Big Picture Essential Synthesis Bayesian Inference (continued)
Monday, October 22 Hypothesis testing using the normal Z-distribution. Student’s t distribution. Confidence intervals.
Inferring the Mean and Standard Deviation of a Population.
Statistics: Unlocking the Power of Data Lock 5 Section 4.5 Confidence Intervals and Hypothesis Tests.
Applied Quantitative Analysis and Practices LECTURE#14 By Dr. Osman Sadiq Paracha.
Copyright © 1998, Triola, Elementary Statistics Addison Wesley Longman 1 Assumptions 1) Sample is large (n > 30) a) Central limit theorem applies b) Can.
Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Synthesis and Review for Exam 2.
Statistics: Unlocking the Power of Data Lock 5 Section 6.4 Distribution of a Sample Mean.
Statistics: Unlocking the Power of Data Lock 5 Inference for Means STAT 250 Dr. Kari Lock Morgan Sections 6.4, 6.5, 6.6, 6.10, 6.11, 6.12, 6.13 t-distribution.
Monday, October 21 Hypothesis testing using the normal Z-distribution. Student’s t distribution. Confidence intervals.
Synthesis and Review 2/20/12 Hypothesis Tests: the big picture Randomization distributions Connecting intervals and tests Review of major topics Open Q+A.
Statistics: Unlocking the Power of Data Lock 5 Inference for Proportions STAT 250 Dr. Kari Lock Morgan Chapter 6.1, 6.2, 6.3, 6.7, 6.8, 6.9 Formulas for.
Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Estimation: Confidence Intervals SECTION 3.2 Confidence Intervals (3.2)
Two Sample Problems  Compare the responses of two treatments or compare the characteristics of 2 populations  Separate samples from each population.
Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Simple Linear Regression SECTION 9.1 Inference for correlation Inference for.
Statistics: Unlocking the Power of Data Lock 5 Section 6.11 Confidence Interval for a Difference in Means.
16/23/2016Inference about µ1 Chapter 17 Inference about a Population Mean.
4-1 Statistical Inference Statistical inference is to make decisions or draw conclusions about a population using the information contained in a sample.
Chapter 9 Roadmap Where are we going?.
Section 4.5 Making Connections.
Paired Difference in Means
Lesson Comparing Two Means.
Chapter 11: Inference About a Mean
Presentation transcript:

Statistics: Unlocking the Power of Data Lock 5 Inference for Means STAT 250 Dr. Kari Lock Morgan Sections 6.4, 6.5, 6.6, 6.10, 6.11, 6.12, 6.13 t-distribution Formulas for standard errors t based inference

Statistics: Unlocking the Power of Data Lock 5 Central Limit Theorem For a sufficiently large sample size, the distribution of sample statistics for a mean or a proportion is normal For means, “sufficiently large” is often n ≥ 30 If the data are normal, smaller n will be sufficient If the data are skewed and/or have outliers, n may have to be much higher than 30

Statistics: Unlocking the Power of Data Lock 5 ParameterDistributionStandard Error Proportion Normal Difference in Proportions Normal Meant, df = n – 1 Difference in Meanst, df = min(n 1, n 2 ) – 1 Standard Error Formulas

Statistics: Unlocking the Power of Data Lock 5 The standard error for a sample mean can be calculated by SE of a Mean

Statistics: Unlocking the Power of Data Lock 5 The standard deviation of the population is a) σ b) s c) Standard Deviation

Statistics: Unlocking the Power of Data Lock 5 The standard deviation of the sample is a) σ b) s c) Standard Deviation

Statistics: Unlocking the Power of Data Lock 5 The standard deviation of the sample mean is a) σ b) s c) Standard Deviation

Statistics: Unlocking the Power of Data Lock 5 For quantitative data, we use a t-distribution instead of the normal distribution This arises because we have to estimate the standard deviations The t distribution is very similar to the standard normal, but with slightly fatter tails (to reflect the uncertainty in the sample standard deviations) t-distribution

Statistics: Unlocking the Power of Data Lock 5 The t-distribution is characterized by its degrees of freedom (df) Degrees of freedom are based on sample size Single mean: df = n – 1 Difference in means: df = min(n 1, n 2 ) – 1 Correlation: df = n – 2 The higher the degrees of freedom, the closer the t-distribution is to the standard normal Degrees of Freedom

Statistics: Unlocking the Power of Data Lock 5 t-distribution

Statistics: Unlocking the Power of Data Lock 5 Aside: William Sealy Gosset

Statistics: Unlocking the Power of Data Lock 5 Question of the Day How do pheromones in female tears affect men?

Statistics: Unlocking the Power of Data Lock 5 Pheromones in Tears Tears were collected from human females (they watched a sad movie in isolation) Cotton pads were created that had either real female tears or a salt solution that had been dripped down the same female’s face 50 men had a pad attached to their upper lip twice, once with tears and once without, order randomized. (matched pairs design!) Many variables were measured; we will look first at testosterone level Gelstein, et. al. (2011) “Human Tears Contain a Chemosignal," Science, 1/6/11.Human Tears Contain a Chemosignal

Statistics: Unlocking the Power of Data Lock 5 Pheromones in Tears

Statistics: Unlocking the Power of Data Lock 5 Matched Pairs For a matched pairs experiment, we look at the differences for each pair, and do analysis on this one quantitative variable Inference for a single mean (mean difference)

Statistics: Unlocking the Power of Data Lock 5 Pheromones in Tears The average difference in testosterone levels between tears and no tears was pg/ml The average level before sniffing the pads (tears or saline) was 155 pg/ml The standard deviation of these differences was 46.5 pg/ml The sample size was 50 men “pg” = picogram = nanogram = gram

Statistics: Unlocking the Power of Data Lock 5 Pheromones in Tears Do tears lower testosterone levels?  Hypothesis test! By how much?  Confidence interval!

Statistics: Unlocking the Power of Data Lock 5 Pheromones in Tears: Test This provides strong evidence that female tears decrease testosterone levels in men, on average. 1. State hypotheses: 2. Check conditions: 3. Calculate standard error: 5. Compute p-value: 5. Interpret in context: 4. Calculate t-statistic:

Statistics: Unlocking the Power of Data Lock 5 Pheromones in Tears: CI We are 95% confident that female tears on a cotton pad on a man’s upper lip decrease testosterone levels between 8.54 and pg/ml, on average. 1. Check conditions: 2. Find t * : 3. Compute the confidence interval: 4. Interpret in context: t* = 2

Statistics: Unlocking the Power of Data Lock 5 Pheromones in Tears Can we conclude that something in the tears (as opposed to just saline trickled down the face) causes this decrease in testosterone levels? a) Yes b) No

Statistics: Unlocking the Power of Data Lock 5 Sexual Attraction They also had 24 men rate faces on two attributes: sad and sexually arousing For sexual arousal, the mean was 439 VAS (visual-analog scale) for tears and 463 VAS for saline The standard deviation of the differences was 47 VAS Is this evidence that tears change sexual arousal ratings? (a)Yes(b) No Give a 90% CI.

Statistics: Unlocking the Power of Data Lock 5 Sexual Arousal: Test

Statistics: Unlocking the Power of Data Lock 5 Sexual Arousal: CI

Statistics: Unlocking the Power of Data Lock 5 Testosterone Revisited What if we had (accidentally) ignored the paired structure of the data, and analyzed it as two separate groups? Two separate groups: one quantitative and one categorical (look at difference in means) Paired data: one quantitative (differences), look at mean difference How does this affect inference?

Statistics: Unlocking the Power of Data Lock 5 Paired vs Unpaired If we were to mistakenly analyze this data as two separate groups rather than paired data, we would expect the sample statistic to a) Increase b) Decrease c) Not change

Statistics: Unlocking the Power of Data Lock 5 Paired vs Unpaired If we were to mistakenly analyze this data as two separate groups rather than paired data, we would expect the standard error to a) Increase b) Decrease c) Not change

Statistics: Unlocking the Power of Data Lock 5 Paired vs Unpaired If we were to mistakenly analyze this data as two separate groups rather than paired data, we would expect the p-value to a) Increase b) Decrease c) Not change

Statistics: Unlocking the Power of Data Lock 5 Paired vs Unpaired If we were to mistakenly analyze this data as two separate groups rather than paired data, we would expect the width of the confidence interval to a) Increase b) Decrease c) Not change

Statistics: Unlocking the Power of Data Lock 5 PairedUnpaired This provides strong evidence that female tears decrease testosterone levels in men. that female tears decrease testosterone levels in men. ≠ df = 49 p-value = df = 49 p-value = Test

Statistics: Unlocking the Power of Data Lock 5 Paired Unpaired statistic ± t* × SE 95%: statistic ± 2 × SE ± 2 × 6.58 (-34.86, -8.54) ± 2 × Confidence Interval

Statistics: Unlocking the Power of Data Lock 5 Paired vs Unpaired What if we had ignored the paired structure of the data, and analyzed it as two separate groups? How does this affect inference? If a study is paired, analyze the differences!

Statistics: Unlocking the Power of Data Lock 5 To Do Read Sections 6.4, 6.5, 6.6, 6.10, 6.11, 6.12, 6.13 Do HW 6 (due Friday, 11/6)

Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Intervals and Tests Section 4.5 Connecting intervals and tests

Statistics: Unlocking the Power of Data Lock 5 Bootstrap and Randomization Distributions Bootstrap DistributionRandomization Distribution Our best guess at the distribution of sample statistics Our best guess at the distribution of sample statistics, if H 0 were true Centered around the observed sample statistic Centered around the null hypothesized value Simulate sampling from the population by resampling from the original sample Simulate samples assuming H 0 were true Big difference: a randomization distribution assumes H 0 is true, while a bootstrap distribution does not

Statistics: Unlocking the Power of Data Lock 5 Which Distribution?

Statistics: Unlocking the Power of Data Lock 5 Which Distribution? Intro stat students are surveyed, and we find that 152 out of 218 are female. Let p be the proportion of intro stat students at that university who are female. A bootstrap distribution is generated for a confidence interval for p, and a randomization distribution is generated to see if the data provide evidence that p > 1/2. Which distribution is the randomization distribution?

Statistics: Unlocking the Power of Data Lock 5 Intervals and Tests A confidence interval represents the range of plausible values for the population parameter If the null hypothesized value IS NOT within the CI, it is not a plausible value and should be rejected If the null hypothesized value IS within the CI, it is a plausible value and should not be rejected

Statistics: Unlocking the Power of Data Lock 5 Intervals and Tests If a 95% CI misses the parameter in H 0, then a two-tailed test should reject H 0 at a 5% significance level. If a 95% CI contains the parameter in H 0, then a two-tailed test should not reject H 0 at a 5% significance level.

Statistics: Unlocking the Power of Data Lock 5 Both Father and Mother “Does a child need both a father and a mother to grow up happily?” Let p be the proportion of adults aged in 2010 who say yes. A 95% CI for p is (0.487, 0.573). Testing H 0 : p = 0.5 vs H a : p ≠ 0.5 with α = 0.05, we a) Reject H 0 b) Do not reject H 0 c) Reject H a d) Do not reject H a millennials-parenthood-trumps-marriage/#fn

Statistics: Unlocking the Power of Data Lock 5 Both Father and Mother “Does a child need both a father and a mother to grow up happily?” Let p be the proportion of adults aged in 1997 who say yes. A 95% CI for p is (0.533, 0.607). Testing H 0 : p = 0.5 vs H a : p ≠ 0.5 with α = 0.05, we a) Reject H 0 b) Do not reject H 0 c) Reject H a d) Do not reject H a millennials-parenthood-trumps-marriage/#fn

Statistics: Unlocking the Power of Data Lock 5 Intervals and Tests Confidence intervals are most useful when you want to estimate population parameters Hypothesis tests and p-values are most useful when you want to test hypotheses about population parameters Confidence intervals give you a range of plausible values; p-values quantify the strength of evidence against the null hypothesis

Statistics: Unlocking the Power of Data Lock 5 Interval, Test, or Neither? Is the following question best assessed using a confidence interval, a hypothesis test, or is statistical inference not relevant? On average, how much more do adults who played sports in high school exercise than adults who did not play sports in high school? a) Confidence interval b) Hypothesis test c) Statistical inference not relevant

Statistics: Unlocking the Power of Data Lock 5 Interval, Test, or Neither? Is the following question best assessed using a confidence interval, a hypothesis test, or is statistical inference not relevant? Do a majority of adults take a multivitamin each day? a) Confidence interval b) Hypothesis test c) Statistical inference not relevant

Statistics: Unlocking the Power of Data Lock 5 Interval, Test, or Neither? Is the following question best assessed using a confidence interval, a hypothesis test, or is statistical inference not relevant? Did the Penn State football team score more points in 2014 or 2013? a) Confidence interval b) Hypothesis test c) Statistical inference not relevant

Statistics: Unlocking the Power of Data Lock 5 To Do HW Intervals & Tests (due Friday, 11/6)