7.1 Lecture 10/29.

Slides:



Advertisements
Similar presentations
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 9 Hypothesis Testing Developing Null and Alternative Hypotheses Developing Null and.
Advertisements

Objectives (BPS chapter 18) Inference about a Population Mean  Conditions for inference  The t distribution  The one-sample t confidence interval 
Inference for a population mean BPS chapter 18 © 2006 W. H. Freeman and Company.
Inference for distributions: - for the mean of a population IPS chapter 7.1 © 2006 W.H Freeman and Company.
© 2013 Pearson Education, Inc. Active Learning Lecture Slides For use with Classroom Response Systems Introductory Statistics: Exploring the World through.
1 Matched Samples The paired t test. 2 Sometimes in a statistical setting we will have information about the same person at different points in time.
Inference for distributions: - Comparing two means IPS chapter 7.2 © 2006 W.H. Freeman and Company.
BCOR 1020 Business Statistics Lecture 17 – March 18, 2008.
Inference for Distributions - for the Mean of a Population IPS Chapter 7.1 © 2009 W.H Freeman and Company.
Chapter 7 Sampling and Sampling Distributions
Tests of significance Confidence intervals are used when the goal of our analysis is to estimate an unknown parameter in the population. A second goal.
Lecture 6 Outline: Tue, Sept 23 Review chapter 2.2 –Confidence Intervals Chapter 2.3 –Case Study –Two sample t-test –Confidence Intervals Testing.
Elementary hypothesis testing Purpose of hypothesis testing Type of hypotheses Type of errors Critical regions Significant levels Hypothesis vs intervals.
Chapters 9 (NEW) Intro to Hypothesis Testing
Inference for Distributions - for the Mean of a Population
8-4 Testing a Claim About a Mean
Horng-Chyi HorngStatistics II41 Inference on the Mean of a Population - Variance Known H 0 :  =  0 H 0 :  =  0 H 1 :    0, where  0 is a specified.
BPS - 5th Ed. Chapter 171 Inference about a Population Mean.
Chapter 11: Inference for Distributions
Chapter 9 Hypothesis Testing.
Two-sample problems for population means BPS chapter 19 © 2006 W.H. Freeman and Company.
5-3 Inference on the Means of Two Populations, Variances Unknown
Hypothesis Testing Using The One-Sample t-Test
Lecture Unit 5 Section 5.7 Testing Hypotheses about Means 1.
Chapter 8 Testing Hypotheses about Means 1. Sweetness in cola soft drinks Cola manufacturers want to test how much the sweetness of cola drinks is affected.
AM Recitation 2/10/11.
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 9 Hypothesis Testing.
Statistical inference: confidence intervals and hypothesis testing.
+ DO NOW What conditions do you need to check before constructing a confidence interval for the population proportion? (hint: there are three)
Copyright © 2013, 2010 and 2007 Pearson Education, Inc. Chapter Inference on the Least-Squares Regression Model and Multiple Regression 14.
T-distribution & comparison of means Z as test statistic Use a Z-statistic only if you know the population standard deviation (σ). Z-statistic converts.
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Statistical Inferences Based on Two Samples Chapter 9.
September 15. In Chapter 11: 11.1 Estimated Standard Error of the Mean 11.2 Student’s t Distribution 11.3 One-Sample t Test 11.4 Confidence Interval for.
The Hypothesis of Difference Chapter 10. Sampling Distribution of Differences Use a Sampling Distribution of Differences when we want to examine a hypothesis.
1 CSI5388: Functional Elements of Statistics for Machine Learning Part I.
Week 111 Power of the t-test - Example In a metropolitan area, the concentration of cadmium (Cd) in leaf lettuce was measured in 7 representative gardens.
Chapter 11 Inference for Distributions AP Statistics 11.1 – Inference for the Mean of a Population.
IPS Chapter 7 © 2012 W.H. Freeman and Company  7.1: Inference for the Mean of a Population  7.2: Comparing Two Means  7.3: Optional Topics in Comparing.
6.1 - One Sample One Sample  Mean μ, Variance σ 2, Proportion π Two Samples Two Samples  Means, Variances, Proportions μ 1 vs. μ 2.
Mid-Term Review Final Review Statistical for Business (1)(2)
1 Happiness comes not from material wealth but less desire.
Inference for a population mean BPS chapter 18 © 2006 W.H. Freeman and Company.
Inference for a population mean BPS chapter 16 © 2006 W.H. Freeman and Company.
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 8 Hypothesis Testing.
BPS - 3rd Ed. Chapter 161 Inference about a Population Mean.
Chapter 20 Confidence Intervals and Hypothesis Tests for a Population Mean  ; t distributions t distributions confidence intervals for a population mean.
Essential Statistics Chapter 161 Inference about a Population Mean.
Week111 The t distribution Suppose that a SRS of size n is drawn from a N(μ, σ) population. Then the one sample t statistic has a t distribution with n.
Lecture Presentation Slides SEVENTH EDITION STATISTICS Moore / McCabe / Craig Introduction to the Practice of Chapter 7 Inference for Distributions.
Chapter 8 Parameter Estimates and Hypothesis Testing.
Inference for Distributions - for the Mean of a Population IPS Chapter 7.1 © 2009 W.H Freeman and Company.
+ The Practice of Statistics, 4 th edition – For AP* STARNES, YATES, MOORE Unit 5: Estimating with Confidence Section 11.1 Estimating a Population Mean.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
Statistical Inference Statistical inference is concerned with the use of sample data to make inferences about unknown population parameters. For example,
Lecture Slides Elementary Statistics Twelfth Edition
1 Testing Statistical Hypothesis The One Sample t-Test Heibatollah Baghi, and Mastee Badii.
Inference for distributions: - Comparing two means.
Chapter 9: Introduction to the t statistic. The t Statistic The t statistic allows researchers to use sample data to test hypotheses about an unknown.
Chapter 7: The Distribution of Sample Means
+ Unit 5: Estimating with Confidence Section 8.3 Estimating a Population Mean.
Objectives (BPS chapter 12) General rules of probability 1. Independence : Two events A and B are independent if the probability that one event occurs.
When  is unknown  The sample standard deviation s provides an estimate of the population standard deviation .  Larger samples give more reliable estimates.
Chapter 14 Single-Population Estimation. Population Statistics Population Statistics:  , usually unknown Using Sample Statistics to estimate population.
Chapter 9 Hypothesis Testing Understanding Basic Statistics Fifth Edition By Brase and Brase Prepared by Jon Booze.
16/23/2016Inference about µ1 Chapter 17 Inference about a Population Mean.
4-1 Statistical Inference Statistical inference is to make decisions or draw conclusions about a population using the information contained in a sample.
Chapter 9 Introduction to the t Statistic
Inference for Distributions Inference for the Mean of a Population PBS Chapter 7.1 © 2009 W.H Freeman and Company.
Inference for distributions: - for the mean of a population IPS chapter 7.1 © 2006 W.H Freeman and Company.
Objectives 7.1 Inference for the mean of a population
Presentation transcript:

7.1 Lecture 10/29

When n is very large, s is a very good estimate of s, and the corresponding t distributions are very close to the normal distribution. The t distributions become wider for smaller sample sizes, reflecting the lack of precision in estimating s from s.

Standardizing the data before using Table D As with the normal distribution, the first step is to standardize the data. Then we can use Table D to obtain the area under the curve. t(m,s/√n) df = n − 1 t(0,1) df = n − 1 1 s/√n m t Here, m is the mean (center) of the sampling distribution, and the standard error of the mean s/√n is its standard deviation (width). You obtain s, the standard deviation of the sample, with your calculator.

Table D When σ is unknown, we use a t distribution with “n−1” degrees of freedom (df). Table D shows the z-values and t-values corresponding to landmark P-values/ confidence levels. When σ is known, we use the normal distribution and the standardized z-value.

Table A vs. Table D Table A gives the area to the LEFT of hundreds of z-values. It should only be used for Normal distributions. (…) (…) Table D gives the area to the RIGHT of a dozen t or z-values. It can be used for t distributions of a given df and for the Normal distribution. Table D Table D also gives the middle area under a t or normal distribution comprised between the negative and positive value of t or z.

The one-sample t-confidence interval The level C confidence interval is an interval with probability C of containing the true population parameter. We have a data set from a population with both m and s unknown. We use to estimate m and s to estimate s, using a t distribution (df n−1). Practical use of t : t* C is the area between −t* and t*. We find t* in the line of Table D for df = n−1 and confidence level C. The margin of error m is: C t* −t* m m

!!! Warning: do not use the function =CONFIDENCE(alpha, stdev, size) Excel Menu: Tools/DataAnalysis: select “Descriptive statistics” s/√n m !!! Warning: do not use the function =CONFIDENCE(alpha, stdev, size) This assumes a normal sampling distribution (stdev here refers to σ) and uses z* instead of t* !!!

The P-value is the probability, if H0 is true, of randomly drawing a sample like the one obtained or more extreme, in the direction of Ha. The P-value is calculated as the corresponding area under the curve, one-tailed or two-tailed depending on Ha: One-sided (one-tailed) Two-sided (two-tailed)

Table D For df = 9 we only look into the corresponding row. thus 0.02 > upper tail p > 0.01 For df = 9 we only look into the corresponding row. The calculated value of t is 2.7. We find the 2 closest t values. For a one-sided Ha, this is the P-value (between 0.01 and 0.02); for a two-sided Ha, the P-value is doubled (between 0.02 and 0.04).

Table D For df = 9 we only look into the corresponding row. thus 0.02 > upper tail p > 0.01 For df = 9 we only look into the corresponding row. The calculated value of t is 2.7. We find the 2 closest t values. For a one-sided Ha, this is the P-value (between 0.01 and 0.02); for a two-sided Ha, the P-value is doubled (between 0.02 and 0.04).

Excel TDIST(x, degrees_freedom, tails) TDIST = P(X > x) for a random variable X following the t distribution (x positive). Use it in place of Table C or to obtain the p-value for a positive t-value. X   is the standardized value at which to evaluate the distribution (i.e., “t”). Degrees_freedom   is an integer indicating the number of degrees of freedom. Tails   specifies the number of distribution tails to return. If tails = 1, TDIST returns the one-tailed p-value. If tails = 2, TDIST returns the two-tailed p-value. TINV(probability,degrees_freedom) Gives the t-value (e.g., t*) for a given probability and degrees of freedom. Probability   is the probability associated with the two-tailed t distribution. Degrees_freedom   is the number of degrees of freedom of the t distribution.

Does lack of caffeine increase depression? How many subjects should we include in our new study? Would 16 subjects be enough? Let’s compute the power of the t-test for against the alternative µ = 3. For a significance level α 5%, the t-test with n observations rejects H0 if t exceeds the upper 5% significance point of t(df:15) = 1.729. For n = 16 and s = 7: H0: mdifference = 0 ; Ha: mdifference > 0 The power for n = 16 would be the probability that ≥ 1.068 when µ = 3, using σ = 7. Since we have σ, we can use the normal distribution here: The power would be about 86%.

Inference for non-normal distributions What if the population is clearly non-normal and your sample is small? If the data are skewed, you can attempt to transform the variable to bring it closer to normality (e.g., logarithm transformation). The t- procedures applied to transformed data are quite accurate for even moderate sample sizes. A distribution other than a normal distribution might describe your data well. Many non-normal models have been developed to provide inference procedures too. You can always use a distribution-free (“nonparametric”) inference procedure (see chapter 15) that does not assume any specific distribution for the population. But it is usually less powerful than distribution-driven tests (e.g., t test).

Normal quantile plots for 46 car CO emissions Transforming data The most common transformation is the logarithm (log), which tends to pull in the right tail of a distribution. Instead of analyzing the original variable X, we first compute the logarithms and analyze the values of log X. However, we cannot simply use the confidence interval for the mean of the logs to deduce a confidence interval for the mean µ in the original scale. Normal quantile plots for 46 car CO emissions

Nonparametric method: the sign test A distribution-free test usually makes a statement of hypotheses about the median rather than the mean (e.g., “are the medians different”). This makes sense when the distribution may be skewed. A simple distribution-free test is the sign test for matched pairs. Calculate the matched difference for each individual in the sample. Ignore pairs with difference 0. The number of trials n is the count of the remaining pairs. The test statistic is the count X of pairs with a positive difference. P-values for X are based on the binomial B(n, 1/2) distribution. H0: population median = 0 vs. Ha: population median > 0 H0: p = 1/2 vs. Ha: p > 1/2