Ch7 Statistics. Dr. Deshi Ye, yedeshi@zju.edu.cn
Outline: State what is estimated; Point estimation; Interval estimation; Compute sample size; Tests of hypotheses; Null hypotheses and tests of hypotheses; Hypotheses concerning one mean
Why Statistics? The purpose of most statistical investigations is to generalize from information contained in random samples about the populations from which the samples were obtained. How: estimation and tests of hypotheses.
Thinking Suppose you’re interested in the average amount of money that students in this class (the population) have on them. How would you find out?
Introduction to Estimation
Statistical Methods
Estimation Process. Population: the mean, $\mu$, is unknown. Random sample: mean $\bar{X} = 50$. Conclusion: I am 95% confident that $\mu$ is between 40 & 60.
Estimation Methods: Point Estimation and Interval Estimation
Point estimation. We want to know the mean of a population, but it is unavailable. Hence, we draw a sample, calculate the mean of the sample data, and use it as our estimate of the population mean. Point estimation concerns the choosing of a statistic, that is, a single number calculated from sample data, for which we have some expectation or assurance that it is reasonably close to the parameter it is supposed to estimate.
Point Estimation 1. Provides Single Value Based on Observations from 1 Sample 2. Gives No Information about How Close Value Is to the Unknown Population Parameter 3. Example: Sample Mean $\bar{X} = 3$ Is Point Estimate of Unknown Population Mean $\mu$
Point estimation of a mean. Parameter: Population mean $\mu$. Data: A random sample $X_1, \dots, X_n$. Estimator: $\bar{X}$. Estimate of standard error: $s/\sqrt{n}$.
EX Scientists need to be able to detect small amounts of contaminants in the environment. Sample data is listed as follows: 2.4 2.9 2.7 2.6 2.9 2.0 2.8 2.2 2.4 2.4 2.0 2.5
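EX of point estimation. Compute the point estimator $\bar{x}$ and estimate its standard deviation (also called the estimated standard error of $\bar{X}$). Solution: $\bar{x} = 29.8/12 = 2.483$ and $s^2 = 0.0979$, so $s = 0.3129$. Hence the estimated standard deviation is $s/\sqrt{n} = 0.3129/\sqrt{12} = 0.0903$.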
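As a check on the arithmetic, the point estimate and its estimated standard error can be computed directly from the twelve readings listed in the example. This is a minimal Python sketch using only the standard library; the variable names are illustrative and not part of the slides.

```python
import math
import statistics

# The twelve readings listed in the example.
readings = [2.4, 2.9, 2.7, 2.6, 2.9, 2.0, 2.8, 2.2, 2.4, 2.4, 2.0, 2.5]

n = len(readings)
x_bar = statistics.mean(readings)   # point estimate of the population mean
s = statistics.stdev(readings)      # sample standard deviation (divisor n - 1)
std_error = s / math.sqrt(n)        # estimated standard error of the mean

print(f"n = {n}, mean = {x_bar:.3f}, s = {s:.4f}, s/sqrt(n) = {std_error:.4f}")
```

Note that `statistics.stdev` uses the $n - 1$ divisor, which matches the sample standard deviation $s$ used in the estimate of the standard error.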
Biased or unbiased? Question: is the estimate good enough? EX. In the previous example, as a check on the current capabilities, the measurements were made on test specimens spiked with a known concentration of 1.25 ug/l of lead. That is, the readings should average 1.25 if there is no background lead in the samples. Because the readings average well above 1.25, there appears to be either a bias due to laboratory procedure or some lead already in the samples before they were spiked.
Unbiased estimator. Let $\theta$ be the parameter of interest and $\hat{\theta}$ be a statistic. A statistic $\hat{\theta}$ is said to be an unbiased estimator, or its value an unbiased estimate, if and only if the mean of the sampling distribution of the estimator equals $\theta$, whatever the value of $\theta$.
More efficient unbiased estimator. An unbiased estimator is not unique: for example, it can be shown that for a random sample of size n=2, the mean $\frac{X_1 + X_2}{2}$ as well as the weighted mean $\frac{aX_1 + bX_2}{a + b}$, where a, b are positive constants, are unbiased estimates of the mean of the population. A statistic $\hat{\theta}_1$ is said to be a more efficient unbiased estimator of the parameter $\theta$ than the statistic $\hat{\theta}_2$ if 1. $\hat{\theta}_1$ and $\hat{\theta}_2$ are both unbiased estimators of $\theta$; 2. the variance of the sampling distribution of the first estimator is no larger than that of the second and is smaller for at least one value of $\theta$.
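A short derivation may help show why the weighted mean is unbiased yet no more efficient than the plain mean. It assumes, beyond what the slide states, that $X_1$ and $X_2$ are independent with common mean $\mu$ and common variance $\sigma^2$.

```latex
E\!\left[\frac{aX_1 + bX_2}{a + b}\right]
  = \frac{a\mu + b\mu}{a + b} = \mu ,
\qquad
\operatorname{Var}\!\left(\frac{aX_1 + bX_2}{a + b}\right)
  = \frac{a^2 + b^2}{(a + b)^2}\,\sigma^2 .

% Compare with Var((X_1 + X_2)/2) = \sigma^2 / 2:
\frac{a^2 + b^2}{(a + b)^2} - \frac{1}{2}
  = \frac{(a - b)^2}{2\,(a + b)^2} \;\ge\; 0 .
```

Under these assumptions the two variances agree only when $a = b$, so the ordinary sample mean is the more efficient of the two estimators.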
Finding Sample Sizes. I don’t want to sample too much or too little!
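The error. The error between the estimator and the quantity it is supposed to estimate is $\bar{X} - \mu$. $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is a random variable having approximately the standard normal distribution. We can assert with probability $1 - \alpha$ that the inequality $-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}$ holds. Recall that $P(Z > z_{\alpha/2}) = \alpha/2$.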
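Maximum error of estimate. The error $|\bar{X} - \mu|$ will be less than $E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$ with probability $1 - \alpha$. In particular, $z_{0.025} = 1.96$ and $z_{0.005} = 2.575$.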
EX. An industrial engineer intends to use the mean of a random sample of size n=150 to estimate the average mechanical aptitude of assembly line workers in a large industry. If, on the basis of experience, the engineer can assume a known value of $\sigma$ for such data, what can he assert with probability 0.99 about the maximum size of his error? Solution: substitute $n = 150$ and $z_{0.005} = 2.575$ into $E = z_{\alpha/2} \cdot \sigma/\sqrt{n}$. Thus, the engineer can assert with probability 0.99 that his error is at most 1.3.
Determination of sample size. Suppose that we want to use the mean of a large random sample to estimate the mean of a population, and want to be able to assert with probability $1 - \alpha$ that the error will be at most some prescribed quantity E. As before, we get $n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$.
EX. A research worker wants to determine the average time it takes a mechanic to rotate the tires of a car, and she wants to be able to assert with 95% confidence that the mean of her sample is off by at most 0.5 minute. If she can presume from past experience that $\sigma$ (in minutes) is known, how large a sample will she have to take?
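The maximum-error and sample-size formulas can be sketched in a few lines of Python. The helper names below are hypothetical, and the two values of $\sigma$ are assumptions made only for illustration, since the slide text does not state them: $\sigma = 6.2$ is chosen because it reproduces the stated maximum error of about 1.3 for $n = 150$, and $\sigma = 1.6$ minutes is simply a plausible value for the tire example.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Return z_{alpha/2} for a two-sided statement at the given confidence."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def max_error(sigma, n, confidence):
    """Maximum error E = z_{alpha/2} * sigma / sqrt(n)."""
    return z_critical(confidence) * sigma / math.sqrt(n)


def required_sample_size(sigma, max_err, confidence):
    """Smallest integer n for which z_{alpha/2} * sigma / sqrt(n) <= max_err."""
    return math.ceil((z_critical(confidence) * sigma / max_err) ** 2)


# ASSUMPTION: sigma = 6.2 is illustrative, not taken from the slides.
e_aptitude = max_error(sigma=6.2, n=150, confidence=0.99)

# ASSUMPTION: sigma = 1.6 minutes is illustrative, not taken from the slides.
n_tires = required_sample_size(sigma=1.6, max_err=0.5, confidence=0.95)

print(f"E = {e_aptitude:.2f}, required n = {n_tires}")
```

The sample size is rounded up rather than to the nearest integer, because rounding down would leave the maximum error slightly above the prescribed E.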
Continued. The method discussed so far requires that $\sigma$ be known or that it can be approximated with the sample standard deviation s, thus requiring that n be large. Another approach: if it is reasonable to assume that we are sampling from a normal population, then $t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ is a random variable having the t distribution with n-1 degrees of freedom, and the maximum error becomes $E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$.
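EX. In the first example, $n = 12$ and $s = 0.3129$. For $n - 1 = 11$ degrees of freedom, $t_{0.01} = 2.718$, so $E = 2.718 \cdot 0.3129/\sqrt{12} \approx 0.25$. Thus, one can assert with 98% confidence that the maximum error is at most 0.25.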
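The small-sample maximum error for the first example can be checked numerically. This is a sketch under one stated assumption: the Python standard library has no t distribution, so the critical value $t_{0.01} = 2.718$ for 11 degrees of freedom is hard-coded from a standard t table rather than computed.

```python
import math
import statistics

# The twelve readings from the first example.
readings = [2.4, 2.9, 2.7, 2.6, 2.9, 2.0, 2.8, 2.2, 2.4, 2.4, 2.0, 2.5]

# ASSUMPTION: tabulated critical value t_{0.01} for 11 degrees of freedom,
# copied from a t table (98% two-sided confidence).
T_CRIT_98_DF11 = 2.718

n = len(readings)
s = statistics.stdev(readings)                 # sample standard deviation
max_err = T_CRIT_98_DF11 * s / math.sqrt(n)    # E = t_{alpha/2} * s / sqrt(n)

print(f"df = {n - 1}, s = {s:.4f}, E = {max_err:.3f}")
```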
Interval Estimation 1. Provides Range of Values Based on Observations from 1 Sample 2. Gives Information about Closeness to Unknown Population Parameter Stated in terms of Probability 3. Example: Unknown Population Mean Lies Between 50 & 70 with 95% Confidence
Interval Estimation. [Figure: a confidence interval centered on the sample statistic (point estimate), running from the lower confidence limit to the upper confidence limit.] A probability that the population parameter falls somewhere within the interval.
Confidence Interval Estimates. Intervals for: Mean ($\sigma$ known or $\sigma$ unknown), Proportion, Variance
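Confidence Interval Mean ($\sigma$ Known). Assumptions: population standard deviation is known; population is normally distributed; if not normal, can be approximated by normal distribution ($n \ge 30$). Confidence Interval Estimate: $\bar{X} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$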
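Interval Estimation. Interval estimation deals with intervals for which we can assert with a reasonable degree of certainty that they will contain the parameter under consideration. For a large random sample (n > 30) from a population with the unknown mean $\mu$ and the known variance $\sigma^2$, $P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha$. When the observed value $\bar{x}$ becomes available, we obtain $\bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$.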
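Confidence interval. We can claim with $(1 - \alpha)100\%$ confidence that the interval $\bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$ contains $\mu$. It is customary to refer to an interval of this kind as a confidence interval for $\mu$ having the degree of confidence $1 - \alpha$.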
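A large-sample confidence interval for the mean with known $\sigma$ can be sketched as a small function. The function name and the input values ($\bar{x} = 50$, $\sigma = 10$, $n = 36$) are hypothetical and chosen only to illustrate the formula.

```python
import math
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) for x_bar -/+ z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# ASSUMPTION: illustrative inputs, not taken from the slides.
low, high = z_confidence_interval(x_bar=50.0, sigma=10.0, n=36, confidence=0.95)
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")
```

With these inputs the half-width is about 3.27, giving an interval of roughly (46.73, 53.27).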
Solution for Small Samples 1. Assumptions: Population of X Is Normally Distributed 2. Use Student’s t Distribution. Define the variable $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$; T has the Student distribution with n - 1 degrees of freedom (when X is normally distributed). There’s a different Student distribution for each number of degrees of freedom. As n gets large, the Student distribution approaches a normal distribution with mean = 0 and sigma = 1.
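Small sample (n<30). For a small sample, we assume that we are sampling from a normally distributed population. We get the confidence interval formula $\bar{x} - t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$, where $t_{\alpha/2}$ is based on n - 1 degrees of freedom.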
EX. The mean weight loss of n=16 grinding balls after a certain length of time in mill slurry is 3.42 grams with a standard deviation of 0.68 gram. Construct a 99% confidence interval for the true mean weight loss of such grinding balls under the stated condition.
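One way to work the grinding-ball example is sketched below. It assumes the small-sample t interval applies (normal population) and hard-codes the critical value $t_{0.005} = 2.947$ for 15 degrees of freedom from a standard t table, since the Python standard library does not provide t quantiles.

```python
import math

n = 16          # number of grinding balls
x_bar = 3.42    # mean weight loss in grams
s = 0.68        # sample standard deviation in grams

# ASSUMPTION: tabulated critical value t_{0.005} for n - 1 = 15 degrees of
# freedom, copied from a t table (99% two-sided confidence).
T_CRIT_99_DF15 = 2.947

half_width = T_CRIT_99_DF15 * s / math.sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width

print(f"99% confidence interval: {lower:.2f} < mu < {upper:.2f}")
```

Under these assumptions the interval is roughly 2.92 to 3.92 grams.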
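Confidence Depends on Interval (z). $\bar{X} = \mu \pm Z\sigma_{\bar{x}}$. [Figure: sampling distribution of $\bar{X}$ centered at $\mu$.] 90% of samples fall within $\mu \pm 1.65\sigma_{\bar{x}}$, 95% within $\mu \pm 1.96\sigma_{\bar{x}}$, and 99% within $\mu \pm 2.58\sigma_{\bar{x}}$.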
Confidence Level 1. Probability that the Unknown Population Parameter Falls Within Interval 2. Denoted $(1 - \alpha)100\%$; $\alpha$ Is Probability That Parameter Is Not Within Interval 3. Typical Values Are 99%, 95%, 90%
Intervals & Confidence Level. Sampling distribution of the mean: notice that the interval width is determined only by $1 - \alpha$ in the sampling distribution. Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$. Of the intervals derived from many samples, $(1 - \alpha)100\%$ contain $\mu$; $\alpha \cdot 100\%$ do not.
Confidence interval & level. It is useful to think of confidence intervals as a range of "plausible" values for the parameter. Interpreting a confidence interval is different from interpreting the confidence level. Suppose we've taken a random sample of 10 ice-cream cones, and determined that a 95% confidence interval for the mean caloric content of a single scoop of ice-cream is (260, 310). Interpret the confidence level: if we repeatedly took samples of size 10 and then formed confidence intervals, we would expect 95% of them to contain the true (but unknown) mean. Interpret this particular confidence interval: we are 95% confident that the true mean caloric content lies between 260 and 310.
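The repeated-sampling interpretation of the confidence level can be illustrated with a small simulation. This is only a sketch: the population (normal with mean 285 calories and $\sigma = 40$) is a made-up assumption, and for simplicity the intervals use the known-$\sigma$ formula rather than the t interval.

```python
import math
import random
from statistics import NormalDist


def coverage(true_mean, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated z-intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and form its confidence interval.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - half_width <= true_mean <= x_bar + half_width:
            hits += 1
    return hits / trials


# ASSUMPTION: hypothetical population parameters, chosen only for illustration.
rate = coverage(true_mean=285.0, sigma=40.0, n=10, confidence=0.95, trials=4000)
print(f"share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95: each individual interval either contains the true mean or it does not, and the 95% describes the long-run behavior of the procedure.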
Confidence interval & level. The wider the confidence interval you are willing to accept, the more certain you can be that the answers of the whole population would fall within that range. For example, if you asked a sample of 1000 people in a city which brand of cola they preferred, and 60% said Brand A, you can be very certain that between 40 and 80% of all the people in the city actually do prefer that brand, but you cannot be so sure that between 59 and 61% of the people in the city prefer the brand.
Factors Affecting Interval Width 1. Data Dispersion Measured by $\sigma$ 2. Sample Size n, through $\sigma_{\bar{X}} = \sigma/\sqrt{n}$ 3. Level of Confidence $(1 - \alpha)$ Affects Z. Intervals Extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$
Statistical Methods: Descriptive Statistics and Inferential Statistics (Estimation, Hypothesis Testing)
Hypothesis Testing. Population: I believe the population mean age is 50 (hypothesis). Random sample: mean $\bar{X} = 20$. Reject hypothesis! Not close.
What’s a Hypothesis? A belief about a population parameter. Parameter is population mean, proportion, variance. Must be stated before analysis. Example: I believe the mean GPA of this class is 3.8!
Tests of Hypotheses. Suppose that a consumer protection agency wants to test a paint manufacturer’s claim that the average drying time of his new “fast-drying” paint is 20 minutes. It instructs a member of its research staff to paint each of 36 boards using a different 1-gallon can of the paint, with the intention of rejecting the claim if the mean of the drying times exceeds 20.75 minutes. Otherwise, it will accept the claim. Question: Is it an infallible criterion for accepting or rejecting the claim?
EX cont. Let us investigate the possibility that the sample mean may exceed 20.75 minutes even though the true mean is 20 minutes. Assuming that the standard deviation $\sigma$ is known from past experience, the probability of erroneously rejecting the hypothesis that $\mu = 20$ is $P(\bar{X} > 20.75 \mid \mu = 20) = P\left(Z > \frac{20.75 - 20}{\sigma/\sqrt{36}}\right)$.
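Another possibility: the procedure fails to detect that $\mu > 20$. Suppose that the true mean of drying time is some $\mu_1 > 20$; the probability of erroneously accepting the claim is then $P(\bar{X} \le 20.75 \mid \mu = \mu_1)$.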
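Type of errors. H: the hypothesis (in the drying-time example, $\mu = 20$ minutes).
H is true: Accept H is the correct decision; Reject H is a Type I error.
H is false: Accept H is a Type II error; Reject H is the correct decision.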
Type I error: the hypothesis is true but rejected. Its probability is denoted by the letter $\alpha$. Type II error: the hypothesis is false but not rejected. Its probability is denoted by the letter $\beta$.
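The two error probabilities for the drying-time criterion (reject the claim if the mean of 36 boards exceeds 20.75 minutes) can be sketched numerically. Two inputs are assumptions made only for illustration, because the slide text does not give them: the population standard deviation $\sigma = 2.4$ minutes and the alternative true mean $\mu = 21$ minutes.

```python
import math
from statistics import NormalDist

n = 36            # boards painted
cutoff = 20.75    # reject the claim if the sample mean exceeds this
mu_null = 20.0    # claimed mean drying time in minutes

# ASSUMPTIONS: neither value is stated in the slides; both are illustrative.
sigma = 2.4       # population standard deviation in minutes
mu_alt = 21.0     # a true mean under which the claim is false

std_error = sigma / math.sqrt(n)

# Type I error: the sample mean exceeds the cutoff although mu = 20.
alpha = 1.0 - NormalDist(mu_null, std_error).cdf(cutoff)

# Type II error: the sample mean stays at or below the cutoff although mu = 21.
beta = NormalDist(mu_alt, std_error).cdf(cutoff)

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

With these assumed values, $\alpha$ is about 0.03 and $\beta$ is about 0.27; moving the cutoff changes the two probabilities in opposite directions.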
7.4 Null Hypotheses
Question: Can we formulate the hypothesis as $\mu > 20$ minutes, where $\mu$ can take on more than one possible value?
Null hypothesis $H_0$ (pronounced H-nought): we usually hypothesize the opposite of what we hope to prove. EX. If we want to show that one method of teaching computer programming is more efficient than another, we hypothesize that the two methods are equally effective. The null hypothesis proposes something initially presumed true. It is rejected only when it becomes evidently false.
Null hypotheses can arise for consideration in a number of different ways, the main ones being as follows:
• $H_0$ may correspond to the prediction of some scientific (economic) theory or to some model of the system thought quite likely to be true or nearly so.
• $H_0$ may represent some simple set of circumstances which, in the absence of evidence to the contrary, we wish to assume holds. For example, the null hypothesis might assert the ineffectiveness of a newly developed medicine for AIDS. We want to play safe by assuming ineffectiveness unless we can find significant evidence against our presumption.
• $H_0$ may assert complete absence of structure in some sense. So long as the data are consistent with the null hypothesis, it cannot be justified to claim that the data provide clear evidence in favor of some particular kind of structure. Testing the joint significance of slope coefficients in a linear regression model is an example.
Null Hypothesis 1. What Is Tested 2. Has Serious Outcome if Incorrect Decision Made 3. Designated $H_0$ 4. Specified as $H_0$: $\mu \ge$ Some Numeric Value; Specified with = Sign Even if $\le$ or $\ge$. Example, $H_0$: $\mu \ge 3$
Alternative Hypothesis 1. Opposite of Null Hypothesis 2. Always Has Inequality Sign: $\ne$, <, or > 3. Designated $H_a$ 4. Specified $H_a$: $\mu$ < Some Value. Example, $H_a$: $\mu < 3$. $\ne$ will lead to two-sided tests; <, > will lead to one-sided tests
Alternative. One-sided alternative: in the drying time example, the null hypothesis is $\mu = 20$ minutes and the alternative hypothesis is $\mu > 20$ minutes. Two-sided alternative: $\mu \ne \mu_0$, where $\mu_0$ is the value assumed under the null hypothesis.
Selecting the null hypothesis Guideline for selecting the null hypothesis: When the goal of an experiment is to establish an assertion, the negation of the assertion should be taken as the null hypothesis. The assertion becomes the alternative hypothesis.
Identifying Hypotheses Steps 1. Example Problem: Test That the Population Mean Is Not 3 2. Steps: State the Question Statistically ($\mu \ne 3$); State the Opposite Statistically ($\mu = 3$), Must Be Mutually Exclusive & Exhaustive; Select the Alternative Hypothesis ($\mu \ne 3$), Has the $\ne$, <, or > Sign; State the Null Hypothesis ($\mu = 3$)
What Are the Hypotheses? Is the population average amount of TV viewing 12 hours? State the question statistically: $\mu = 12$. State the opposite statistically: $\mu \ne 12$. Select the alternative hypothesis: $H_a$: $\mu \ne 12$. State the null hypothesis: $H_0$: $\mu = 12$.
What Are the Hypotheses? Is the population average amount of TV viewing different from 12 hours? State the question statistically: $\mu \ne 12$. State the opposite statistically: $\mu = 12$. Select the alternative hypothesis: $H_a$: $\mu \ne 12$. State the null hypothesis: $H_0$: $\mu = 12$.
What Are the Hypotheses? Is the average cost per hat less than or equal to $20? State the question statistically: $\mu \le 20$. State the opposite statistically: $\mu > 20$. Select the alternative hypothesis: $H_a$: $\mu > 20$. State the null hypothesis: $H_0$: $\mu \le 20$.
What Are the Hypotheses? Is the average amount spent in the bookstore greater than $25? State the question statistically: $\mu > 25$. State the opposite statistically: $\mu \le 25$. Select the alternative hypothesis: $H_a$: $\mu > 25$. State the null hypothesis: $H_0$: $\mu \le 25$.
Decision Making Risks
Errors in making decision 1. Type I Error: Reject True Null Hypothesis; Has Serious Consequences; Probability of Type I Error Is $\alpha$ (Alpha), Called Level of Significance 2. Type II Error: Do Not Reject False Null Hypothesis; Probability of Type II Error Is $\beta$ (Beta)
Decision Results H0: Innocent
$\alpha$ & $\beta$ have an inverse relationship. For a fixed sample size, you can’t reduce both errors simultaneously!
Factors Affecting $\beta$ 1. True Value of Population Parameter: $\beta$ Increases When Difference With Hypothesized Parameter Decreases 2. Significance Level, $\alpha$: $\beta$ Increases When $\alpha$ Decreases 3. Population Standard Deviation, $\sigma$: $\beta$ Increases When $\sigma$ Increases 4. Sample Size, n: $\beta$ Increases When n Decreases
Basic Idea. [Figure: sampling distribution of the sample mean centered at the hypothesized value $\mu = 50$ ($H_0$), with the observed sample mean of 20 far out in the tail.] It is unlikely that we would get a sample mean of this value (20) if in fact this (50) were the population mean; therefore, we reject the hypothesis that $\mu = 50$.
Hypothesis testing 1. We formulate a null hypothesis and an appropriate alternative hypothesis which we accept when the null hypothesis must be rejected. 2. We specify the probability of a Type I error. If possible, desired, or necessary, we may also specify the probabilities of Type II errors for particular alternatives. 3. Based on the sampling distribution of an appropriate statistic, we construct a criterion for testing the null hypothesis against the given alternative. 4. We calculate from the data the value of the statistic on which the decision is to be based. 5. We decide whether to reject the null hypothesis or whether to fail to reject it.
Level of Significance 1. Probability 2. Defines Unlikely Values of Sample Statistic if Null Hypothesis Is True; Called Rejection Region of Sampling Distribution 3. Designated $\alpha$ (alpha) 4. Selected by Researcher at Start
Level of significance. The probability of a Type I error is also called the level of significance. Usually, we set $\alpha = 0.05$ or $\alpha = 0.01$. Step 2 can often be performed even when the null hypothesis specifies a range of values for the parameter. Ex. The null hypothesis: $\mu \le \mu_0$. Then we can claim that the probability of a Type I error is at most its value at $\mu = \mu_0$. In general, we can only specify the maximum probability of a Type I error, again by protecting against the worst possibility.
Criterion. One-sided criterion (one-sided test or one-tailed test): ex. one-sided alternative $\mu > \mu_0$. Two-sided criterion (two-sided test or two-tailed test): ex. two-sided alternative $\mu \ne \mu_0$.
Rejection Region (One-Tail Test). [Figure: sampling distribution of the sample statistic centered at the $H_0$ value; the rejection region of area $\alpha$ lies in one tail beyond the critical value, the nonrejection region of area $1 - \alpha$ (level of confidence) covers the rest, and the observed sample statistic is marked.] Rejection region does NOT include critical value.
Rejection Region (Two-Tailed Test). [Figure: sampling distribution of the sample statistic centered at the $H_0$ value; rejection regions of area $\frac{1}{2}\alpha$ lie in each tail beyond the two critical values, the nonrejection region of area $1 - \alpha$ (level of confidence) lies between them, and the observed sample statistic is marked.] Rejection region does NOT include critical value.
$H_0$ Testing Steps 1. State $H_0$ 2. State $H_a$ 3. Choose $\alpha$ 4. Choose n 5. Choose test 6. Set up critical values 7. Collect data 8. Compute test statistic 9. Make statistical decision 10. Express decision
One Population Tests. Mean: Z Test (1 & 2 tail), t Test (1 & 2 tail). Proportion: Z Test (1 & 2 tail). Variance: $\chi^2$ Test (1 & 2 tail).
7.5 Hypothesis concerning one mean. Suppose we want to test, on the basis of n=35 determinations and at the 0.05 level of significance, whether the thermal conductivity of a certain kind of cement brick is 0.340, as has been claimed. We can expect that the variability of such determinations is given by a known standard deviation $\sigma$.
Solution 1) Null hypothesis: $\mu = 0.340$; alternative hypothesis: $\mu \ne 0.340$ 2) Level of significance: $\alpha = 0.05$ 3) Criterion:
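Criterion Region for testing $\mu = \mu_0$ (Normal population and $\sigma$ known), using $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
Alternative hypothesis $\mu < \mu_0$: reject null hypothesis if $Z < -z_{\alpha}$
Alternative hypothesis $\mu > \mu_0$: reject null hypothesis if $Z > z_{\alpha}$
Alternative hypothesis $\mu \ne \mu_0$: reject null hypothesis if $Z < -z_{\alpha/2}$ or $Z > z_{\alpha/2}$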
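Critical values
Level of significance $\alpha = 0.05$: one-sided alternatives $-1.645$ or $1.645$; two-sided alternatives $-1.96$ and $1.96$
Level of significance $\alpha = 0.01$: one-sided alternatives $-2.33$ or $2.33$; two-sided alternatives $-2.575$ and $2.575$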
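The tabulated critical values can be reproduced from the standard normal distribution. This is a small sketch using `statistics.NormalDist` from the Python standard library; the dictionary name is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Map each level of significance to (one-sided, two-sided) critical values.
critical_values = {}
for alpha in (0.05, 0.01):
    one_sided = std_normal.inv_cdf(1.0 - alpha)        # z_alpha
    two_sided = std_normal.inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    critical_values[alpha] = (one_sided, two_sided)
    print(f"alpha = {alpha}: z_alpha = {one_sided:.3f}, z_alpha/2 = {two_sided:.3f}")
```

The computed values differ from the table only in rounding (for example 2.326 against the tabulated 2.33, and 2.576 against 2.575).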
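EX. Cont. 3) Criterion: Reject the null hypothesis if $Z < -1.96$ or $Z > 1.96$, where $Z = \frac{\bar{X} - 0.340}{\sigma/\sqrt{35}}$. 4) Calculations: the observed value of $Z$ falls between $-1.96$ and $1.96$. 5) Decision: The null hypothesis cannot be rejected.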
P-value. P value (or tail probability): the probability of getting a difference between $\bar{x}$ and $\mu_0$ greater than or equal to that actually observed. EX. In the above example the alternative is two-sided, so each tail contributes ½ of the P-value.
P-Value 1. Probability of Obtaining a Test Statistic More Extreme ($\le$ or $\ge$) than Actual Sample Value Given $H_0$ Is True 2. Called Observed Level of Significance; Smallest Value of $\alpha$ for Which $H_0$ Can Be Rejected 3. Used to Make Rejection Decision: If p-Value $\ge \alpha$, Do Not Reject $H_0$; If p-Value < $\alpha$, Reject $H_0$
P value P value for a given test statistic and null hypothesis: The P value is the probability of obtaining a value for the test statistic that is as extreme or more extreme than the value actually observed. Probability is calculated under the null hypothesis.
EX. A process for producing vinyl floor covering has been stable for a long period of time, and the surface hardness measurement of the flooring produced has a normal distribution with mean 4.5 and standard deviation 1.5. A second shift has been hired and trained and their production needs to be monitored. Consider testing the hypothesis $H_0$: $\mu = 4.5$ versus $H_1$: $\mu \ne 4.5$.
A random sample of hardness measurements is made of n=25 vinyl specimens produced by the second shift. Calculate the P value when using the test statistic $Z = \frac{\bar{X} - 4.5}{1.5/\sqrt{25}}$ for the observed sample mean $\bar{x}$.
Solution. The observed value of the test statistic is $z = \frac{\bar{x} - 4.5}{1.5/\sqrt{25}}$. Since the alternative hypothesis is two-sided, we must consider large negative values of Z as well as large positive values. Consequently, the P value is $P(Z \le -|z|) + P(Z \ge |z|) = 2P(Z \ge |z|)$. The small P value suggests the mean of the second shift is not at the target value of 4.5.
P value. To understand P values, you have to understand fixed level testing. With fixed level testing, a null hypothesis is proposed (usually, specifying no treatment effect) along with a level for the test, usually 0.05. All possible outcomes of the experiment are listed in order to identify extreme outcomes that would occur less than 5% of the time in aggregate if the null hypothesis were true. This set of values is known as the critical region. They are critical because if any of them are observed, something extreme has occurred. Data are now collected, and if any one of those extreme outcomes occurs, the results are said to be significant at the 0.05 level. The null hypothesis is rejected at the 0.05 level of significance and one star (*) is printed somewhere in a table. Some investigators note extreme outcomes that would occur less than 1% of the time and print two stars (**) if any of those are observed. Many researchers quickly realized the limitations of reporting only whether a result achieved the 0.05 level of significance. Was a result just barely significant or wildly so? Would data that were significant at the 0.05 level be significant at the 0.01 level? At the 0.001 level? Even if the results are wildly statistically significant, is the effect large enough to be of any practical importance?
P value. Observed significance level (or P value): the smallest fixed level at which the null hypothesis can be rejected. If your personal fixed level is greater than or equal to the P value, you would reject the null hypothesis. If your personal fixed level is less than the P value, you would fail to reject the null hypothesis. For example, if a P value is 0.027, the results are significant for all fixed levels greater than 0.027 (such as 0.05) and not significant for all fixed levels less than 0.027 (such as 0.01). A person who uses the 0.05 level would reject the null hypothesis while a person who uses the 0.01 level would fail to reject it.
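The decision rule described here is simple enough to express as a one-line comparison. The function name `decide` is hypothetical; the rule it encodes (reject when the fixed level is greater than or equal to the P value) is the one stated in the text.

```python
def decide(p_value, alpha):
    """Apply the fixed-level rule: reject H0 when alpha >= p_value."""
    return "reject H0" if alpha >= p_value else "fail to reject H0"


# The example from the text: a P value of 0.027 judged at two fixed levels.
for level in (0.05, 0.01):
    print(f"alpha = {level}: {decide(0.027, level)}")
```

Reporting the P value itself, rather than only the decision, lets each reader apply their own fixed level.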
P-Value Thinking Challenge. You’re an analyst for Ford. You want to find out if the average miles per gallon of Escorts is at least 32 mpg. Similar models have a standard deviation of 3.8 mpg. You take a sample of 60 Escorts & compute a sample mean of 30.7 mpg. What is the value of the observed level of significance (p-Value)?
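p-Value Solution. $H_0$: $\mu \ge 32$; $H_a$: $\mu < 32$. Z value of sample statistic: $z = \frac{30.7 - 32}{3.8/\sqrt{60}} = -2.65$. Use the alternative hypothesis to find direction: p-Value is $P(Z \le -2.65) = .5000 - .4960 = .0040$ (from Z table: look up 2.65 to get .4960). p-Value < $\alpha$ ($\alpha = .01$). Reject $H_0$.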
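The same p-value can be obtained without a Z table. This is a minimal sketch using the numbers given in the challenge; it treats the test as lower-tailed, as the solution does.

```python
import math
from statistics import NormalDist

mu_0 = 32.0     # hypothesized mean (mpg)
sigma = 3.8     # standard deviation of similar models (mpg)
n = 60          # sample size
x_bar = 30.7    # observed sample mean (mpg)

# Z value of the sample statistic.
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Lower-tail probability, because the alternative is mu < 32.
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The result agrees with the table lookup to the precision shown on the slide, and it falls below $\alpha = .01$.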
t Test for Mean ($\sigma$ Unknown) 1. Assumptions: Population Is Normally Distributed; If Not Normal, Only Slightly Skewed & Large Sample ($n \ge 30$) Taken 2. Parametric Test Procedure
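Statistic for small sample. The test of the null hypothesis $\mu = \mu_0$ is based on the statistic $t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$, which has the t distribution with n - 1 degrees of freedom.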
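Criterion Region for testing $\mu = \mu_0$ (Statistic for small sample), with $t_{\alpha}$ and $t_{\alpha/2}$ based on n - 1 degrees of freedom
Alternative hypothesis $\mu < \mu_0$: reject null hypothesis if $t < -t_{\alpha}$
Alternative hypothesis $\mu > \mu_0$: reject null hypothesis if $t > t_{\alpha}$
Alternative hypothesis $\mu \ne \mu_0$: reject null hypothesis if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$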
Two-Tailed t Test: Finding Critical t Values. Given: n = 3; $\alpha$ = .10. Then df = n - 1 = 2 and $\alpha/2$ = .05 in each tail. From the Critical Values of t Table (row df = 2, column .05), the critical values are $\pm 2.920$.
One-Tailed t Test. You’re a marketing analyst for Wal-Mart. Wal-Mart had teddy bears on sale last week. The weekly sales ($ 00) of bears sold in 10 stores were: 8 11 0 4 7 8 10 5 8 3. At the .05 level, is there evidence that the average bear sales per store is more than 5 ($ 00)? Assume that the population is normally distributed. Allow students about 10 minutes to solve this.
One-Tailed t Test Solution. $H_0$: $\mu = 5$; $H_a$: $\mu > 5$; $\alpha = .05$; df = 10 - 1 = 9. Critical Value(s): $t_{.05} = 1.833$ (reject $H_0$ if $t > 1.833$). Test Statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{6.4 - 5}{3.373/\sqrt{10}} \approx 1.31$. Decision: Do not reject at $\alpha = .05$. Conclusion: There is no evidence average is more than 5. Note: More than 5 have been sold (6.4), but not enough to be significant.
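The teddy bear test can be worked through step by step from the ten sales figures. This is a sketch under one stated assumption: the critical value $t_{.05} = 1.833$ for 9 degrees of freedom is hard-coded from a standard t table, since the Python standard library does not provide t quantiles.

```python
import math
import statistics

# Weekly bear sales in the 10 stores.
sales = [8, 11, 0, 4, 7, 8, 10, 5, 8, 3]
mu_0 = 5  # hypothesized mean under H0

# ASSUMPTION: tabulated critical value t_{.05} for 9 degrees of freedom,
# copied from a t table (one-tailed test at the .05 level).
T_CRIT_05_DF9 = 1.833

n = len(sales)
x_bar = statistics.mean(sales)                  # sample mean
s = statistics.stdev(sales)                     # sample standard deviation
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))    # test statistic
reject = t_stat > T_CRIT_05_DF9                 # upper-tail rejection rule

print(f"mean = {x_bar}, s = {s:.3f}, t = {t_stat:.3f}, reject H0: {reject}")
```

The test statistic of about 1.31 stays below the critical value, which matches the decision not to reject $H_0$ at the .05 level.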