4-1 Statistical Inference
The field of statistical inference consists of those methods used to make decisions or draw conclusions about a population. These methods use the information contained in a random sample from the population to draw those conclusions.
4-2 Point Estimation
4-3 Hypothesis Testing
4-3.1 Statistical Hypotheses
We like to think of statistical hypothesis testing as the data analysis stage of a comparative experiment, in which the engineer is interested, for example, in comparing the mean of a population to a specified value (e.g., mean pull strength).
Two-sided Alternative Hypothesis: H0: μ = 50 centimeters per second versus H1: μ ≠ 50 centimeters per second
One-sided Alternative Hypotheses: H1: μ < 50 centimeters per second, or H1: μ > 50 centimeters per second
Test of a Hypothesis: a procedure leading to a decision about a particular hypothesis.
Hypothesis-testing procedures rely on using the information in a random sample from the population of interest. If this information is consistent with the hypothesis, we will not reject the hypothesis; if this information is inconsistent with the hypothesis, we will conclude that the hypothesis is false.
4-3.2 Testing Statistical Hypotheses
Sometimes the type I error probability is called the significance level, or the α-error, or the size of the test.
Reducing α pushes the critical regions further into the tails. Increasing the sample size increases the magnitude of the Z value for a given difference between the sample mean and the hypothesized mean.
Increasing the sample size results in a decrease in β. (Figure: the probability of type II error when μ = 52 and n = 16.)
β increases as the true value of μ approaches the hypothesized value. (Figure: the probability of type II error when μ = 50.5 and n = 10.)
The power is computed as 1 − β, and power can be interpreted as the probability of correctly rejecting a false null hypothesis. We often compare statistical tests by comparing their power properties. For example, consider the propellant burning rate problem when we are testing H0: μ = 50 centimeters per second against H1: μ ≠ 50 centimeters per second. Suppose that the true value of the mean is μ = 52. When n = 10, we found that β = 0.2643, so the power of this test is 1 − β = 1 − 0.2643 = 0.7357 when μ = 52.
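The type II error and power values discussed above can be recomputed with a short DATA step. This is only a sketch under assumptions that are not stated in this section: a population standard deviation of 2.5 centimeters per second and an acceptance region of 48.5 ≤ x̄ ≤ 51.5. The data set and variable names are illustrative.

```sas
/* Sketch: type II error and power for H0: mu = 50 vs a two-sided H1.        */
/* ASSUMED (not given in this section): sigma = 2.5 and the acceptance       */
/* region 48.5 <= xbar <= 51.5.                                              */
DATA power_sketch;
  sigma = 2.5; lower = 48.5; upper = 51.5;
  INPUT mu_true n;
  sem   = sigma/sqrt(n);               /* standard error of the mean          */
  /* beta = P(lower <= xbar <= upper) when the true mean is mu_true           */
  beta  = probnorm((upper - mu_true)/sem) - probnorm((lower - mu_true)/sem);
  power = 1 - beta;
  DATALINES;
52 10
52 16
50.5 10
;
PROC PRINT DATA=power_sketch; VAR mu_true n beta power;
  TITLE 'Type II error and power (sketch)';
RUN;
```

Under these assumptions the first row should come out close to the β = 0.2643 quoted above; a small difference is expected because table-based values use a z-value rounded to two decimals.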
4-3.3 P-Values in Hypothesis Testing
The P-value carries information about the weight of evidence against H0. The smaller the P-value, the greater the evidence against H0.
4-3.4 One-Sided and Two-Sided Hypotheses
Two-Sided Test: H0: μ = μ0 versus H1: μ ≠ μ0
One-Sided Tests: H0: μ = μ0 versus H1: μ > μ0, or H0: μ = μ0 versus H1: μ < μ0
4-3.5 General Procedure for Hypothesis Testing
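OPTIONS NODATE NONUMBER;
DATA EX415;
  mu_h0 = 12; mu_h1 = 11.25; sigma = 0.5; n = 4; alpha = 0.05; xbar = 11.7;
  sem = sigma/sqrt(n);                /* standard error of the mean */
  Z = (xbar - mu_h0)/sem;             /* observed test statistic */
  z_alpha = probit(alpha);            /* lower-tail critical value */
  xbar_crit = mu_h0 + z_alpha*sem;    /* boundary of the rejection region for xbar */
  z_beta = (xbar_crit - mu_h1)/sem;   /* boundary standardized under mu_h1 */
  PVAL = PROBNORM(Z);                 /* lower-tail P-value */
  beta = 1 - probnorm(z_beta);        /* P(fail to reject H0 when mu = mu_h1) */
  power = 1 - beta;
PROC PRINT; VAR PVAL beta power;
  TITLE '4-15 (page 168)';
RUN; QUIT;

Output:
4-15 (page 168)
OBS  PVAL     beta      power
1    0.11507  0.087685  0.91231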
4-4 Inference on the Mean of a Population, Variance Known
Assumptions
4-4.1 Hypothesis Testing on the Mean
We wish to test: $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$
The test statistic is: $Z_0 = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
Reject H0 if the observed value of the test statistic z0 is either $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$.
Fail to reject H0 if $-z_{\alpha/2} \le z_0 \le z_{\alpha/2}$.
OPTIONS NODATE NONUMBER;
DATA EX43;
  MU = 50; SD = 2; N = 25; ALPHA = 0.05; XBAR = 51.3;
  SEM = SD/SQRT(N);                  /* standard error of the mean */
  Z = (XBAR - MU)/SEM;               /* observed test statistic */
  PVAL = 2*(1 - PROBNORM(ABS(Z)));   /* two-sided P-value; ABS keeps it valid when Z < 0 */
PROC PRINT; VAR Z PVAL ALPHA;
  TITLE 'TEST OF MU=50 VS NOT=50';
RUN; QUIT;

Output:
TEST OF MU=50 VS NOT=50
OBS  Z     PVAL        ALPHA
1    3.25  .001154050  0.05
4-4.2 Type II Error and Choice of Sample Size
Finding the Probability of Type II Error
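For reference, the standard expression for the type II error of the two-sided z-test with known σ can be written as below. The symbols are introduced here for illustration: δ = μ − μ0 is the difference between the true and hypothesized means, and Φ is the standard normal cumulative distribution function.

```latex
\beta = \Phi\!\left(z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right)
      - \Phi\!\left(-z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right)
```

As δ moves away from zero or n grows, both arguments shift in the same direction and β shrinks, which matches the behavior described for the type II error earlier.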
Sample Size Formulas
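A minimal sketch of a sample size calculation for the z-test, assuming the usual formulas n ≈ (z_{α/2} + z_β)² σ² / δ² for a two-sided alternative (an approximation) and n = (z_α + z_β)² σ² / δ² for a one-sided alternative. The numeric values and all names below are illustrative and do not come from the slides.

```sas
/* Sketch: sample size needed to detect a shift delta with type II error beta. */
/* All numeric values are ILLUSTRATIVE assumptions.                             */
DATA n_sketch;
  sigma = 2; delta = 1; alpha = 0.05; beta = 0.10;
  z_beta = probit(1 - beta);                    /* upper beta quantile          */
  /* two-sided alternative (approximate formula), rounded up                    */
  n_two = ceil((probit(1 - alpha/2) + z_beta)**2 * sigma**2 / delta**2);
  /* one-sided alternative, rounded up                                          */
  n_one = ceil((probit(1 - alpha) + z_beta)**2 * sigma**2 / delta**2);
PROC PRINT DATA=n_sketch; VAR sigma delta alpha beta n_two n_one;
  TITLE 'Sample size for the z-test (sketch)';
RUN;
```

Rounding up with CEIL is a deliberate choice: the sample size must be an integer, and rounding down would leave β slightly above the target.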
4-4.3 Large Sample Test
In general, if n ≥ 30, the sample variance s² will be close to σ² for most samples, and so s can be substituted for σ in the test procedures with little harmful effect.
4-4.4 Some Practical Comments on Hypothesis Testing
The Seven-Step Procedure
Only three steps are really required:
Statistical versus Practical Significance
4-4.5 Confidence Interval on the Mean
Two-sided confidence interval: l ≤ μ ≤ u
One-sided confidence intervals: l ≤ μ (lower bound) and μ ≤ u (upper bound)
Confidence coefficient: 1 − α
Relationship between Tests of Hypotheses and Confidence Intervals
If [l, u] is a 100(1 − α) percent confidence interval for the parameter, then the test of significance level α of the hypothesis will lead to rejection of H0 if and only if the hypothesized value is not in the 100(1 − α) percent confidence interval [l, u].
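The equivalence between the two-sided z-test and the two-sided confidence interval can be checked numerically. The sketch below reuses the summary values from the EX43 program (μ0 = 50, σ = 2, n = 25, x̄ = 51.3) and assumes the interval x̄ ± z_{α/2} σ/√n; the data set and flag names are illustrative.

```sas
/* Sketch: a two-sided z-interval and the matching two-sided z-test agree.   */
DATA ci_vs_test;
  mu0 = 50; sd = 2; n = 25; alpha = 0.05; xbar = 51.3;  /* values from EX43  */
  sem    = sd/sqrt(n);
  z_crit = probit(1 - alpha/2);
  lcl = xbar - z_crit*sem;                    /* lower confidence limit       */
  ucl = xbar + z_crit*sem;                    /* upper confidence limit       */
  z0  = (xbar - mu0)/sem;
  reject_by_ci   = (mu0 < lcl or mu0 > ucl);  /* 1 if mu0 is outside [lcl, ucl] */
  reject_by_test = (abs(z0) > z_crit);        /* 1 if the z-test rejects H0   */
PROC PRINT DATA=ci_vs_test; VAR lcl ucl z0 reject_by_ci reject_by_test;
  TITLE 'Confidence interval versus test (sketch)';
RUN;
```

For these values both flags should equal 1: the hypothesized mean 50 lies below the lower limit, and z0 = 3.25 exceeds the critical value.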
Confidence Level and Precision of Estimation
The length of the two-sided 95% confidence interval is $2(1.96\,\sigma/\sqrt{n}) = 3.92\,\sigma/\sqrt{n}$, whereas the length of the two-sided 99% confidence interval is $2(2.58\,\sigma/\sqrt{n}) = 5.16\,\sigma/\sqrt{n}$.
Choice of Sample Size
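As a reminder of the standard result for the known-variance case: if x̄ is used to estimate μ and the error |x̄ − μ| should not exceed a specified amount E with 100(1 − α) percent confidence, the required sample size can be written as below. The symbol E is introduced here for illustration.

```latex
n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^{2}
```

If the right-hand side is not an integer, n is rounded up so that the confidence level is not reduced.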
One-Sided Confidence Bounds
4-4.6 General Method for Deriving a Confidence Interval
Ex. 4-32 on p. 185
Ex. 4-37 on p. 185
4-5 Inference on the Mean of a Population, Variance Unknown
4-5.1 Hypothesis Testing on the Mean
Calculating the P-value
Fixed Significance Level Approach
OPTIONS NODATE NONUMBER;
DATA ex47;
  xbar = 0.83725; sd = 0.02456; mu = 0.82; n = 15; alpha = 0.05;
  df = n - 1;
  t0 = (xbar - mu)/(sd/sqrt(n));   /* observed t statistic */
  pval = 1 - PROBT(t0, df);        /* PROBT returns P(T <= x) for a t distribution */
  cv = tinv(1 - alpha, df);        /* critical value; TINV returns a quantile of the t distribution */
PROC PRINT; var t0 pval cv;
  title 'Test of mu=0.82 vs mu>0.82';
  title2 'Ex. 4-7 in pp193';
RUN; QUIT;
OPTIONS NOOVP NODATE NONUMBER;
DATA ex47;
  input coeffs @@;
  cards;
0.8411 0.8191 0.8182 0.8125 0.8750 0.8580 0.8532 0.8483
0.8276 0.7983 0.8042 0.8730 0.8282 0.8359 0.8660
;
proc univariate data=ex47 mu0=0.82 plot normal;
  var coeffs;
  title 'using proc univariate for t-test for one population mean';
  title2 "Example 4-7 in page 191";
RUN; QUIT;
OPTIONS NOOVP NODATE NONUMBER LS=80;
DATA ex47;
  input coeffs @@;
  cards;
0.8411 0.8191 0.8182 0.8125 0.8750 0.8580 0.8532 0.8483
0.8276 0.7983 0.8042 0.8730 0.8282 0.8359 0.8660
;
ods graphics on;
proc ttest data=ex47 h0=0.82 sides=u;
  /* h0= gives the null-hypothesis value; sides=2 for a two-sided test,
     L for a lower one-sided test, U for an upper one-sided test */
  var coeffs;
  title 'using t-test for one population mean';
  title2 "Example 4-7 in page 191";
RUN;
ods graphics off;
QUIT;
4-5.2 Type II Error and Choice of Sample Size
Fortunately, the unpleasant task of computing β for the t-test has already been done, and the results are summarized in a series of graphs in Appendix A, Charts Va, Vb, Vc, and Vd, that plot β for the t-test against a parameter d for various sample sizes n.
These graphs are called operating characteristic (or OC) curves. Curves are provided for two-sided alternatives on Charts Va and Vb. The abscissa scale factor d on these charts is defined as $d = \dfrac{|\mu - \mu_0|}{\sigma} = \dfrac{|\delta|}{\sigma}$
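As an alternative to reading an OC chart, β for the t-test can be approximated directly from the noncentral t distribution, which PROBT supports through its optional third argument. The sketch below is for an upper one-sided test like the one in Ex. 4-7, taking d = |δ|/σ with δ = μ − μ0. Two inputs are assumptions chosen for illustration: the sample standard deviation 0.02456 stands in for the unknown σ, and the shift δ = 0.02 is hypothetical.

```sas
/* Sketch: beta and power of an upper one-sided t-test via the noncentral t. */
DATA t_power_sketch;
  n = 15; alpha = 0.05;
  sigma = 0.02456;   /* ASSUMED: sample sd from Ex. 4-7 used as a guess for sigma */
  delta = 0.02;      /* ASSUMED: illustrative shift mu - mu0                      */
  df = n - 1;
  d  = abs(delta)/sigma;            /* abscissa value used on the OC charts       */
  nc = delta*sqrt(n)/sigma;         /* noncentrality parameter                    */
  t_crit = tinv(1 - alpha, df);     /* reject H0 when t0 > t_crit                 */
  beta  = probt(t_crit, df, nc);    /* P(T <= t_crit) under the shifted mean      */
  power = 1 - beta;
PROC PRINT DATA=t_power_sketch; VAR d nc t_crit beta power;
  TITLE 'Type II error of the t-test (sketch)';
RUN;
```

Because σ is unknown in practice, the result is only as good as the guess for σ; trying a few plausible values of sigma shows how sensitive β is to that guess.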
Chart V(c) on p. 496
4-5.3 Confidence Interval on the Mean
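A minimal sketch of the two-sided t confidence interval x̄ ± t_{α/2, n−1} s/√n, assuming a normal population. It reuses the summary values from the Ex. 4-7 program; the data set and variable names are illustrative.

```sas
/* Sketch: two-sided 100(1-alpha)% t confidence interval on the mean.        */
DATA t_ci_sketch;
  xbar = 0.83725; sd = 0.02456; n = 15; alpha = 0.05;  /* Ex. 4-7 summary values */
  df = n - 1;
  t_crit = tinv(1 - alpha/2, df);   /* upper alpha/2 quantile of t with df d.f. */
  half = t_crit*sd/sqrt(n);         /* half-width of the interval               */
  lcl = xbar - half;
  ucl = xbar + half;
PROC PRINT DATA=t_ci_sketch; VAR t_crit lcl ucl;
  TITLE 'Two-sided t confidence interval (sketch)';
RUN;
```

For a one-sided bound, the same code applies with tinv(1 - alpha, df) in place of tinv(1 - alpha/2, df) and only one of the two limits reported.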
OPTIONS NOOVP NODATE NONUMBER LS=80;
ods pdf file='ols_out.pdf';
DATA ex60;
  input diam @@;
  cards;
8.23 8.31 8.42 8.29 8.19 8.24 8.19 8.29 8.30 8.14 8.32 8.40
;
ods graphics on;
proc univariate data=ex60 plot normal mu0=8.2;
  var diam;
  title 'univariate for t-test with H0: mu=8.2';
RUN;
proc ttest data=ex60 h0=8.2;
  var diam;
  title 't-test for one population mean';
  title2 "Example 4-60 in page 198";
RUN;
ods graphics off;
QUIT;
ods pdf close;
4-6 Inference on the Variance of a Normal Population
4-6.1 Hypothesis Testing on the Variance of a Normal Population
4-6.2 Confidence Interval on the Variance of a Normal Population
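A minimal sketch of the two-sided confidence interval on σ², assuming a normal population and the usual chi-square form (n − 1)s²/χ²_{α/2, n−1} ≤ σ² ≤ (n − 1)s²/χ²_{1−α/2, n−1}. The sample variance and sample size are the ones used in the Ex. 4-10 program; the data set and variable names are illustrative.

```sas
/* Sketch: two-sided 100(1-alpha)% confidence interval on the variance.      */
DATA var_ci_sketch;
  s_var = 0.0153; n = 20; alpha = 0.05;   /* summary values from Ex. 4-10     */
  df = n - 1;
  chi_hi = cinv(1 - alpha/2, df);   /* upper alpha/2 percentage point          */
  chi_lo = cinv(alpha/2, df);       /* lower alpha/2 percentage point          */
  lcl_var = df*s_var/chi_hi;        /* lower limit on sigma squared            */
  ucl_var = df*s_var/chi_lo;        /* upper limit on sigma squared            */
  lcl_sd = sqrt(lcl_var);           /* matching limits on sigma                */
  ucl_sd = sqrt(ucl_var);
PROC PRINT DATA=var_ci_sketch; VAR lcl_var ucl_var lcl_sd ucl_sd;
  TITLE 'Confidence interval on the variance (sketch)';
RUN;
```

Note that CINV takes a lower-tail probability, so the upper percentage point used in the lower limit is requested as cinv(1 - alpha/2, df).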
OPTIONS NODATE NONUMBER;
DATA ex410;
  p_var = 0.01; s_var = 0.0153; n = 20; alpha = 0.05;
  df = n - 1;
  chisq = (n - 1)*s_var/p_var;        /* chi-square test statistic */
  p_value = 1 - probchi(chisq, df);   /* upper-tail P-value */
  cv1 = cinv(1 - alpha, df);          /* upper critical value */
  cv2 = cinv(alpha, df);              /* lower alpha quantile */
  lcl = sqrt((n - 1)*s_var/cv1);      /* lcl = lower confidence bound on sigma */
PROC PRINT; var chisq p_value cv1 cv2 lcl;
  title 'Test of H0: var=0.01 vs H1: var>0.01';
  title2 'Ex. 4-10 in pp202';
RUN; QUIT;
4-8 Other Interval Estimates for a Single Sample
4-8.1 Prediction Interval
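A minimal sketch of a two-sided prediction interval for a single future observation from a normal population, assuming the usual form x̄ ± t_{α/2, n−1} s √(1 + 1/n). The summary values are borrowed from the Ex. 4-7 program purely for illustration; the data set and variable names are hypothetical.

```sas
/* Sketch: 100(1-alpha)% prediction interval for one future observation.     */
DATA pi_sketch;
  xbar = 0.83725; sd = 0.02456; n = 15; alpha = 0.05;  /* ILLUSTRATIVE inputs */
  df = n - 1;
  t_crit = tinv(1 - alpha/2, df);
  half = t_crit*sd*sqrt(1 + 1/n);   /* wider than the CI half-width t*sd/sqrt(n) */
  lpl = xbar - half;                /* lower prediction limit                    */
  upl = xbar + half;                /* upper prediction limit                    */
PROC PRINT DATA=pi_sketch; VAR t_crit lpl upl;
  TITLE 'Prediction interval (sketch)';
RUN;
```

The extra 1 under the square root accounts for the variability of the future observation itself, so the prediction interval does not shrink to zero width as n grows.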
4-8.2 Tolerance Intervals for a Normal Distribution