Chapter 5 STATISTICAL INFERENCE: ESTIMATION AND HYPOTHESES TESTING

Statistical inference draws conclusions about a population [i.e., its probability density function (PDF)] from a random sample that has supposedly been drawn from that population.

5.1 THE MEANING OF STATISTICAL INFERENCE
Statistical inference: the study of the relationship between a population and a sample drawn from that population. The process of generalizing from the sample value (X̄) to the population value E(X) is the essence of statistical inference.

5.2 ESTIMATION AND HYPOTHESIS TESTING: TWIN BRANCHES OF STATISTICAL INFERENCE
1. Estimation: the first step in statistical inference.
Estimator (statistic): X̄, the sample mean, is an estimator of the population parameter E(X).
Estimate: the particular numerical value of the estimator.
Sampling variation (sampling error): the variation in the estimate from sample to sample.
2. Hypothesis testing: in hypothesis testing we may have a prior judgment or expectation about what value a particular parameter may assume, and we ask whether the sample evidence is compatible with it.

5.3 ESTIMATION OF PARAMETERS
The usual procedure of estimation: assume that we have a random sample of size n from a known probability distribution and use the sample to estimate its unknown parameters; that is, use the sample mean as an estimate of the population mean (or expected value) and the sample variance as an estimate of the population variance.
1. Point estimate
A point estimator, or statistic, is an r.v.; its value will vary from sample to sample. How, then, can we rely on just one estimate of the true population mean?
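To make sampling variation concrete, here is a minimal sketch that draws several samples from one population and computes the point estimates for each. The population (normal, mean 13, standard deviation 4), the sample size of 28, and the function name `point_estimates` are assumptions chosen for illustration; they do not come from the slides.

```python
import random
import statistics

def point_estimates(sample):
    """Return the sample mean and the sample variance (n - 1 divisor)."""
    return statistics.mean(sample), statistics.variance(sample)

# Hypothetical population: normal with mean 13 and standard deviation 4.
rng = random.Random(0)
estimates = []
for _ in range(5):
    sample = [rng.gauss(13, 4) for _ in range(28)]
    estimates.append(point_estimates(sample))

# Each sample yields a different estimate of the same population mean.
for mean, var in estimates:
    print(f"mean = {mean:.2f}, variance = {var:.2f}")
```

The five means differ from one another even though the population never changes, which is why a single point estimate is usually reported together with some measure of its precision.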

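2. Interval estimate
Although X̄ is the single "best" guess of the true population mean, it may be more informative to say that an interval, say, from 8 to 14, most likely includes the true μx. This is interval estimation.
Sampling or probability distributions:
$$\bar{X} \sim N\left(\mu_X, \frac{\sigma^2}{n}\right)$$
$$Z = \frac{\bar{X} - \mu_X}{\sigma/\sqrt{n}} \sim N(0, 1)$$
$$t = \frac{\bar{X} - \mu_X}{S/\sqrt{n}} \sim t(n-1)$$
$$P\left(-t_{\alpha/2,\,n-1} \le t \le t_{\alpha/2,\,n-1}\right) = 1 - \alpha$$
critical t values: ±t(α/2, n-1)
confidence interval: the range from the lower limit to the upper limit
confidence coefficient: 1 - α
level of significance (the prob. of committing a type I error): α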

FIGURE 5-1 The t distribution
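The t-based interval described above can be sketched with the standard library alone. The critical value is found numerically (Simpson's rule for the t CDF, then bisection), which is an implementation choice made here to avoid external dependencies, not a method from the slides. The function names and the sample data are hypothetical.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry around zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(alpha, df):
    """Two-sided critical value c with P(|t| > c) = alpha, by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_confidence_interval(sample, alpha=0.05):
    """Interval X-bar +/- t_crit * S / sqrt(n) with coefficient 1 - alpha."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    c = t_critical(alpha, n - 1)
    return mean - c * se, mean + c * se

# Hypothetical sample of n = 10 observations.
sample = [11.2, 14.8, 9.5, 13.1, 12.4, 15.0, 10.7, 12.9, 13.6, 11.8]
lower, upper = t_confidence_interval(sample, alpha=0.05)
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

In practice a statistics library or a printed t table would supply the critical value; the numerical routine only stands in for that lookup.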

Note: The interval is random, not the parameter μx.
The confidence interval is a random interval, because it is based on X̄ and S, which will vary from sample to sample. The population mean, although unknown, is some fixed number and is not random. You should not say: the probability is 0.95 (that is, 1 - α) that μx lies in this interval. You should say: the probability is 0.95 that the random interval contains the true μx.

Interval estimation, in contrast to point estimation, provides a range of values that will include the true value with a certain degree of confidence or probability (such as 0.95).
P(L ≤ μx ≤ U) = 1 - α, 0 < α < 1
That is, the prob. is (1 - α) that the random interval from L to U contains the true μx. If we construct a confidence interval with a confidence coefficient of 0.95, then in repeated such constructions 95 out of 100 intervals can be expected to include the true μx.
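The repeated-sampling reading of the confidence coefficient can be checked by simulation. The sketch below assumes, for simplicity, that σ is known, so the interval uses the Z statistic rather than t; the population values, the number of repetitions, and the function name `coverage` are all hypothetical.

```python
import math
import random
from statistics import NormalDist

def coverage(mu=13.0, sigma=4.0, n=28, alpha=0.05, reps=2000, seed=1):
    """Share of simulated intervals X-bar +/- z * sigma / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        # mu stays fixed; only the interval moves from sample to sample.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps

print(f"share of intervals containing mu: {coverage():.3f}")
```

The printed share should sit close to 0.95, although any finite simulation will deviate from it somewhat.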

The sample mean is the most frequently used estimator of the population mean because it satisfies several properties that statisticians deem desirable.
1. Linearity
An estimator is said to be a linear estimator if it is a linear function of the sample observations.
2. Unbiasedness
An estimator X̄ is an unbiased estimator of μx if E(X̄) = μx. If we draw repeated samples of size n from the normal population and compute X̄ for each sample, then on the average X̄ will coincide with μx. Unbiasedness is a repeated sampling property.
3. Efficiency
If we consider only unbiased estimators of a parameter, the one with the smallest variance is called the best, or efficient, estimator.

4. Best Linear Unbiased Estimator (BLUE)
If an estimator is linear, is unbiased, and has minimum variance in the class of all linear unbiased estimators of a parameter, it is called a best linear unbiased estimator.
5. Consistency
An estimator (e.g., X*) is said to be a consistent estimator if it approaches the true value of the parameter as the sample size gets larger and larger.
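Unbiasedness and consistency are both statements about behaviour over many samples, so a small simulation can illustrate them for the sample mean. This is a sketch under assumed values (normal population with mean 13 and standard deviation 4, 500 repetitions per sample size); the function name is hypothetical.

```python
import random
import statistics

def sample_mean_behaviour(n, mu=13.0, sigma=4.0, reps=500, seed=2):
    """Average and variance of X-bar across repeated samples of size n."""
    rng = random.Random(seed)
    means = [sum(rng.gauss(mu, sigma) for _ in range(n)) / n for _ in range(reps)]
    return statistics.mean(means), statistics.variance(means)

results = {n: sample_mean_behaviour(n) for n in (10, 100, 1000)}
for n, (avg, var) in results.items():
    print(f"n = {n:4d}: average of X-bar = {avg:.3f}, variance of X-bar = {var:.4f}")
```

The average of X̄ stays near the population mean for every n (unbiasedness), while its variance shrinks as n grows, so X̄ concentrates around the true value (consistency).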

5.5 STATISTICAL INFERENCE: HYPOTHESIS TESTING
Hypothesis testing: instead of establishing a confidence interval, we hypothesize that the true μx takes a particular numerical value, e.g., μx = 13. Our task is to "test" this hypothesis.
Null hypothesis (H0): the hypothesis being tested, e.g., μx = 13.
Alternative hypothesis (H1): the hypothesis against which the null hypothesis is tested.
H1: μx > 13, one-sided alternative hypothesis
H1: μx < 13, one-sided alternative hypothesis
H1: μx ≠ 13, two-sided alternative hypothesis

1. The Confidence Interval Approach to Hypothesis Testing
In hypothesis testing, the 95% confidence interval is called the acceptance region and the area outside the acceptance region is called the critical region/the region of rejection, of the null hypothesis. The boundaries of the acceptance region are called critical values. The null hypothesis is rejected if the value of the parameter under the null hypothesis either exceeds the upper critical value or is less than the lower critical value of the acceptance region.

2. Type I and Type II Errors: A Digression
Type I error: the error of rejecting a hypothesis when it is true.
Type II error: the error of accepting a false hypothesis.
P(type I error) = α = prob.(rejecting H0 | H0 is true)
P(type II error) = β = prob.(accepting H0 | H0 is false)
The classical approach to the type I and type II error problem: assume that a type I error is more serious than a type II error, keep the prob. of committing a type I error at a fairly low level, and then minimize the prob. of a type II error as much as possible. That is, simply specify the value of α without worrying too much about β.
The decision to accept or reject a null hypothesis depends critically on both the d.f. and the probability of committing a type I error.
A 95% confidence coefficient, a 5% level of significance, and a 95% level or degree of confidence all mean the same thing: we are prepared to accept at most a 5 percent probability of committing a type I error.
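The two error probabilities can be estimated by simulation. The sketch below assumes a two-sided Z test with known σ, a null value of 13, and one specific false-null scenario (true mean 15). These numbers and the function name `rejection_rate` are hypothetical, and β in particular depends on which alternative is assumed to be true.

```python
import math
import random
from statistics import NormalDist

def rejection_rate(true_mu, mu0=13.0, sigma=4.0, n=28, alpha=0.05, reps=4000, seed=3):
    """Share of simulated samples in which H0: mu = mu0 is rejected (two-sided Z test)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(true_mu, sigma) for _ in range(n)) / n
        if abs((xbar - mu0) / se) > z_crit:
            rejections += 1
    return rejections / reps

type_1 = rejection_rate(true_mu=13.0)      # H0 is true: rejecting it is a type I error
type_2 = 1 - rejection_rate(true_mu=15.0)  # H0 is false: not rejecting is a type II error
print(f"estimated alpha = {type_1:.3f}, estimated beta = {type_2:.3f}")
```

The estimated α should land near the chosen 0.05, while β is whatever the sample size and the distance between the null and true means happen to produce, which is why the classical approach fixes α first.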

3. The Test of Significance Approach to Hypothesis Testing
The test statistic, with μx set to its value under the null hypothesis, is
$$t = \frac{\bar{X} - \mu_X}{S/\sqrt{n}} \sim t(n-1)$$
If the difference between X̄ and μx is small (in absolute terms), then the |t| value will also be small. If X̄ = μx, t will be zero, and we can accept the null hypothesis. As the |t| value deviates from zero, we will increasingly tend to reject the null hypothesis. If the computed t value lies in either of the rejection regions, we can reject the null hypothesis.
When we reject the null hypothesis, we say that our finding is statistically significant. When we do not reject the null hypothesis, we say that our finding is not statistically significant.
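The confidence interval approach and the test of significance approach use the same ingredients, so they should lead to the same decision. The sketch below checks this for hypothetical summary statistics (X̄ = 11.5, S = 4, n = 28); the critical value 2.052 is the tabulated two-sided 5% value for 27 d.f., and the function name `compare_approaches` is made up for this example.

```python
import math

def compare_approaches(xbar, s, n, mu0, t_crit):
    """Apply both approaches to H0: mu = mu0 and return their decisions."""
    se = s / math.sqrt(n)
    # Confidence interval approach: reject if mu0 falls outside the interval.
    lower, upper = xbar - t_crit * se, xbar + t_crit * se
    reject_by_interval = not (lower <= mu0 <= upper)
    # Test of significance approach: reject if |t| exceeds the critical value.
    t_stat = (xbar - mu0) / se
    reject_by_t = abs(t_stat) > t_crit
    return t_stat, (lower, upper), reject_by_interval, reject_by_t

for mu0 in (13, 14):
    t_stat, interval, by_ci, by_t = compare_approaches(11.5, 4.0, 28, mu0, t_crit=2.052)
    print(f"H0: mu = {mu0}: t = {t_stat:.3f}, "
          f"interval = ({interval[0]:.2f}, {interval[1]:.2f}), "
          f"reject by interval = {by_ci}, reject by t = {by_t}")
```

With these numbers the hypothesis μx = 13 is not rejected and μx = 14 is rejected, and the two approaches agree in both cases.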

4. A Word on Choosing the Level of Significance, α, and the p Value
p value: the exact significance level of the test statistic, that is, the lowest significance level at which a null hypothesis can be rejected. The smaller the p value, the stronger the evidence against the null hypothesis.
5. The χ² test of significance
(1) The confidence interval approach: establish a 100(1 - α)% confidence interval for the true but unknown σ² using the χ² distribution.
(2) The hypothesis testing approach: compute the χ² value and test its significance against the critical χ² value.
6. The F test of significance
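The slides name the χ² and F tests without writing out the statistics. For reference, the standard forms are sketched below, assuming random samples from normal populations. The subscripted symbols S₁², S₂², n₁, and n₂ (sample variances and sizes of two independent samples) are notation introduced here, and χ² with subscript α/2 denotes the upper-tail critical value.

```latex
% Chi-square statistic for a single variance, under H_0: \sigma^2 = \sigma_0^2
\chi^2 = \frac{(n-1)S^2}{\sigma_0^2} \sim \chi^2(n-1)

% 100(1-\alpha)\% confidence interval for \sigma^2
P\left[\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le
       \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}\right] = 1 - \alpha

% F statistic for the equality of two variances, under H_0: \sigma_1^2 = \sigma_2^2
F = \frac{S_1^2}{S_2^2} \sim F(n_1 - 1,\; n_2 - 1)
```

As with the t test, the computed value is compared with the critical value from the relevant table, or its p value is reported.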

Conclusion: summarizing the steps involved in testing a statistical hypothesis:
Step 1: State the null hypothesis H0 and the alternative hypothesis H1, e.g., H0: μX = 13 and H1: μX ≠ 13.
Step 2: Select the test statistic (e.g., X̄).
Step 3: Determine the probability distribution of the test statistic (e.g., X̄ ~ N(μX, σ²/n)).
Step 4: Choose the level of significance α, that is, the probability of committing a type I error. (Keep in mind our discussion about the p value.)
Step 5: Choose the confidence interval or the test of significance approach.

Step 6: Accept or reject the null hypothesis?
(1) The confidence interval approach: using the probability distribution of the test statistic, establish a 100(1 - α)% confidence interval. If this interval (the acceptance region) includes the null-hypothesized value, do not reject the null hypothesis. If this interval does not include it, reject the null hypothesis.
(2) The test of significance approach: obtain the relevant test statistic (e.g., the t statistic) under the null hypothesis and find the probability of obtaining a value at least as extreme as the computed one from the appropriate probability distribution. If the probability is less than the prechosen value of α, reject the null hypothesis. If the probability is greater than α, do not reject it. If you do not want to preselect α, just present the p value of the statistic.
Note: Whether you choose the confidence interval or the test of significance approach, keep in mind that any decision carries a risk of error: at level α, a true null hypothesis will be rejected 100α% of the time.
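The probability-based decision rule in (2) can be sketched as follows. The two-sided p value of a t statistic is computed by integrating the t density numerically with Simpson's rule, a stand-in chosen here to keep the example dependency-free. The summary statistics (X̄ = 11.5, S = 4, n = 28, null value 13) and the function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|) for T ~ t(df), via Simpson's rule on [0, |t_stat|]."""
    a = abs(t_stat)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # P(-|t_stat| <= T <= |t_stat|)
    return 1 - central_area

def decide(t_stat, df, alpha=0.05):
    """Return the p value and the decision at significance level alpha."""
    p = two_sided_p_value(t_stat, df)
    return p, ("reject H0" if p < alpha else "do not reject H0")

# Hypothetical summary statistics for H0: mu = 13 against H1: mu != 13.
xbar, s, n, mu0 = 11.5, 4.0, 28, 13.0
t_stat = (xbar - mu0) / (s / math.sqrt(n))
p, decision = decide(t_stat, n - 1)
print(f"t = {t_stat:.3f}, p value = {p:.4f}, decision at 5%: {decision}")
```

Reporting the p value alongside the decision lets a reader who prefers a different α reach their own conclusion from the same statistic.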
