Modeling and Analysis of Uncertainty


1 Modeling and Analysis of Uncertainty
Chapter 4 Modeling and Analysis of Uncertainty

2 Statistical Inference
Statistical inference uses the information contained in a sample to make decisions about the population. Statistical inference has two subareas: Parameter estimation – the estimation of a parameter of a population probability distribution (point and interval estimates). Hypothesis testing – the testing of a hypothesis made regarding a parameter of a probability distribution.

3 Point Estimation If X is a random variable with a probability distribution f(x) that has an unknown parameter θ, and if X1, …, Xn is a random sample of size n from f(x), then a statistic computed from the sample is called a point estimator of θ.

4 Population Mean μ – Sample mean
The population mean μ is estimated by the sample mean X̄. The population variance σ² is estimated by the sample variance S². The difference in two means μ1 – μ2 is estimated by the difference in two sample means X̄1 – X̄2. All of these point estimates are statistics.
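As a small sketch of the point estimates just listed, computed with Python's standard library (the data values here are made up for illustration):

```python
from statistics import mean, variance

# Hypothetical measurements (made-up values for illustration).
sample = [4.9, 5.1, 5.0, 4.8, 5.2, 5.0]

x_bar = mean(sample)   # point estimate of the population mean mu
s2 = variance(sample)  # sample variance S^2 (n - 1 denominator)

print(x_bar, s2)
```

Both `x_bar` and `s2` are statistics: functions of the sample only.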

5 Estimator for Percentiles
A. Let x_p denote the p-quantile of X and z_p denote the p-quantile of the standard normal. Assume X ~ N(μ, σ²). Then x_p = μ + z_p σ. (3) Thus, we would estimate it by X̄ + z_p S. (4)

6 Estimators for Percentiles
B. When sampling from X ~ Exp(λ), μ = σ = 1/λ, so x_p = –μ ln(1 – p). Thus, we would estimate the p-quantile by –X̄ ln(1 – p). What if we wanted to estimate σ²? We wouldn’t use S²; we would use X̄², since σ = μ for the exponential.

7 Estimators for Percentiles
Example 1: The data (ordered) are thicknesses of copper foil from a rolling operation (units in mils). THICKNESS 4.72 4.84 4.87 4.94 4.98 5.00 5.01 5.05 5.06 5.07 5.11 5.12 5.21 5.24 5.38

8 Estimators for Percentiles
The normal distribution provides a good fit to the data. Thus, assume the data come from a normal distribution, with X̄ = 5.04 and S ≈ 0.166. The 90th percentile is then estimated by X̄ + z_0.90 S = 5.04 + 1.282(0.166) ≈ 5.25. (To obtain standard normal quantiles, use normal tables or MINITAB, or find normal quantiles directly from MINITAB – we’ll demonstrate.) The sample 90th percentile (see Chapter 2) is 5.30.
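Instead of MINITAB, the same percentile estimate can be sketched with Python's standard library, using the thickness data from the previous slide:

```python
from statistics import NormalDist, mean, stdev

# Copper-foil thickness data from the slide (mils).
thickness = [4.72, 4.84, 4.87, 4.94, 4.98, 5.00, 5.01, 5.05,
             5.06, 5.07, 5.11, 5.12, 5.21, 5.24, 5.38]

x_bar = mean(thickness)
s = stdev(thickness)
z_90 = NormalDist().inv_cdf(0.90)  # standard normal 0.90 quantile

# Normal-theory estimate of the 90th percentile: x_bar + z_{0.90} * s
p90_hat = x_bar + z_90 * s
print(round(p90_hat, 2))
```

The result, about 5.25, agrees with the slide and is close to the sample 90th percentile of 5.30.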

9 Unbiased Estimators Bias is a measure of the closeness of an estimator to the parameter. The point estimator Θ̂ is an unbiased estimator of the parameter θ if E(Θ̂) = θ. If the estimator is not unbiased, then the difference E(Θ̂) – θ is called the bias of the estimator Θ̂. In words: “An estimator is unbiased when its expected value is equal to the parameter it estimates.”

10 Point Estimation
[Figure: the pdf of an unbiased estimator is centered at θ, while the pdf of a biased estimator is centered away from θ, so “on the average” an error equal to the bias is made.]

11 Example 2: Recall that for a random sample from a population with mean μ and variance σ²,
E(X̄) = μ and E(S²) = σ². Thus, the sample mean (variance) is always unbiased for the population mean (variance). But note – the sample standard deviation is never unbiased for σ: E(S) ≠ σ. However, when sampling from the normal distribution, for sample sizes ≥ 10 the bias is very small.

12 Remark: In the expression for sample variance,
why not use n in the denominator? Because by using n – 1, we get an unbiased estimator for the population variance. Whether or not an estimator is unbiased, the smaller the variance the better.
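The effect of the n versus n – 1 denominator can be checked by simulation; this is a rough sketch with an arbitrary standard normal population:

```python
import random
from statistics import mean

random.seed(1)
n, reps = 5, 20000  # small samples, many replications

est_n = []    # variance estimates with n in the denominator
est_nm1 = []  # with n - 1 (the usual sample variance)
for _ in range(reps):
    x = [random.gauss(0.0, 1.0) for _ in range(n)]  # true variance = 1
    xb = mean(x)
    ss = sum((xi - xb) ** 2 for xi in x)
    est_n.append(ss / n)
    est_nm1.append(ss / (n - 1))

# The n-1 version averages close to the true variance 1;
# the n version averages near (n-1)/n = 0.8, i.e. it is biased low.
print(mean(est_nm1), mean(est_n))
```

The average of the n – 1 estimates lands near the true variance, while the n version systematically underestimates it.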

13 Point Estimation

14 Estimator Variance When there are two unbiased estimators for the same parameter, a decision as to which is better can be made by examining the variance of each. The estimator with the least variance is more likely to result in an estimate that is closer to the population parameter. The unbiased estimator with the least variance is referred to as the minimum variance unbiased estimator (MVUE).

15 Mean Square Error A measure that incorporates both the bias and the variance of an estimator is the mean square error. The mean square error of an estimator Θ̂ of the parameter θ is defined as: MSE(Θ̂) = E(Θ̂ – θ)². The mean square error can be rewritten as: MSE(Θ̂) = V(Θ̂) + (bias)².
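The decomposition MSE = V + (bias)² can be verified numerically; this sketch uses a deliberately biased estimator of the mean (0.9·X̄, an arbitrary shrinkage choice, not from the slides):

```python
import random
from statistics import mean

random.seed(2)

# Monte Carlo check of MSE = Var + bias^2 for a biased estimator.
theta = 10.0
estimates = []
for _ in range(5000):
    x = [random.gauss(theta, 2.0) for _ in range(8)]
    estimates.append(0.9 * mean(x))  # biased: expectation is 0.9 * theta

mse = mean((e - theta) ** 2 for e in estimates)
bias = mean(estimates) - theta
var = mean((e - mean(estimates)) ** 2 for e in estimates)

# The identity holds exactly for these empirical moments.
print(mse, var + bias ** 2)
```

The two printed numbers agree, since the identity is an algebraic fact about the empirical moments as well as the theoretical ones.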

16 Relative Efficiency The mean square error can be used to compare the ability of two estimators to estimate a parameter. For two estimators Θ̂1, Θ̂2, the relative efficiency of Θ̂1 to Θ̂2 is MSE(Θ̂1)/MSE(Θ̂2). A ratio less than 1 indicates that Θ̂2 is not as efficient as Θ̂1 in estimating the parameter value.

17 The MSE of an unbiased estimator is simply its variance
An estimator with small variance and small bias may be more efficient than one with a larger variance and no bias, as depicted below.
[Figure: pdf of a biased estimator with small variance alongside the pdf of an unbiased estimator with larger variance.] Which estimator is better? Why?

18 Standard Error The standard error of a statistic is the standard deviation of its sampling distribution. If the expression for the standard error contains unknown parameters that have estimators, substitute the estimators; the result is the estimated standard error of the statistic.

19 Point Estimation

20 Hypothesis Testing Statistical Hypotheses
We like to think of statistical hypothesis testing as the data analysis stage of a comparative experiment, in which the engineer is interested, for example, in comparing the mean of a population to a specified value (e.g. mean pull strength).

21 Hypothesis Testing A statistical hypothesis is a statement about the parameters of one or more populations. They are expressed H0: μ = 50cm/s the null hypothesis H1: μ ≠ 50cm/s the alternative hypothesis Generally the interest is in whether or not a parameter has changed, is in accordance with theory, or is in conformance with a requirement.

22 Hypothesis Testing cont.
Testing the hypothesis involves taking a random sample, computing a test statistic, and, from the test statistic, making a decision about the null hypothesis. The value of the test statistic is compared to the critical region, and if it falls in the critical region, the null hypothesis is rejected.

23 Type I Error The limits of the critical region are referred to as the critical values. These critical values depend on the desired probability of rejecting the null hypothesis when it is in fact true. Rejecting the null hypothesis H0 when it is true is defined as a type I error. The probability of a type I error is denoted α = P(reject H0 when H0 is true). The type I error probability is also referred to as the significance level or size of the test. Generally the α value is decided prior to conducting the test.

24 Type II Error Failing to reject the null hypothesis when it is false is referred to as a type II error. The probability of making a type II error is denoted β = P(fail to reject H0 when H0 is false). This probability is a function of two things: first, how close the hypothesized value of the parameter is to the actual value, and second, the sample size. The closer the two parameter values and the smaller the sample size, the larger the value of β. To calculate β, a true population parameter value is required.

25 Power of a test The power of a statistical test is the probability of rejecting the null hypothesis when it is false. This probability is 1 – β. If the power is deemed too low, increase α or the sample size.
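For the z-test of the later slides, β and the power can be computed directly from the normal cdf. A sketch (the numeric inputs below, including σ = 2.5, are illustrative assumptions, not taken from the slides):

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of the two-sided z-test of H0: mu = mu0 at level alpha,
    when the true mean is mu_true (all inputs hypothetical)."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # beta = P(-z_crit <= Z0 <= z_crit) when the true mean is mu_true
    beta = nd.cdf(z_crit - shift) - nd.cdf(-z_crit - shift)
    return 1 - beta

# Power grows with sample size for a fixed true shift.
print(z_test_power(50, 52, 2.5, 10), z_test_power(50, 52, 2.5, 25))
```

This makes the slide's point concrete: for the same true shift, the larger sample gives substantially higher power.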

26 Hypothesis Testing Statistical Hypotheses
For example, suppose that we are interested in the burning rate of a solid propellant used to power aircrew escape systems. Now burning rate is a random variable that can be described by a probability distribution. Suppose that our interest focuses on the mean burning rate (a parameter of this distribution). Specifically, we are interested in deciding whether or not the mean burning rate is 50 centimeters per second.

27 Hypothesis Testing 4-3.1 Statistical Hypotheses
Two-sided alternative hypothesis: H1: μ ≠ μ0. One-sided alternative hypotheses: H1: μ > μ0 or H1: μ < μ0.

28 Hypothesis Testing 4-3.1 Statistical Hypotheses Test of a Hypothesis
A procedure leading to a decision about a particular hypothesis Hypothesis-testing procedures rely on using the information in a random sample from the population of interest. If this information is consistent with the hypothesis, then we will conclude that the hypothesis is true; if this information is inconsistent with the hypothesis, we will conclude that the hypothesis is false.

29 Hypothesis Testing 4-3.2 Testing Statistical Hypotheses

30 Hypothesis Testing 4-3.2 Testing Statistical Hypotheses

31 Hypothesis Testing 4-3.2 Testing Statistical Hypotheses
Sometimes the type I error probability is called the significance level, or the α-error, or the size of the test.

32 Hypothesis Testing 4-3.2 Testing Statistical Hypotheses

33 Hypothesis Testing 4-3.2 Testing Statistical Hypotheses

34 4-3 Hypothesis Testing 4-3.2 Testing Statistical Hypotheses

35 Hypothesis Testing 4-3.2 Testing Statistical Hypotheses

36 Hypothesis Testing 4-3.2 Testing Statistical Hypotheses
The power is computed as 1 – β, and power can be interpreted as the probability of correctly rejecting a false null hypothesis. We often compare statistical tests by comparing their power properties. For example, consider the propellant burning rate problem when we are testing H0: μ = 50 centimeters per second against H1: μ ≠ 50 centimeters per second. Suppose that the true value of the mean is μ = 52. For n = 10 we found the value of β earlier, so the power of this test is 1 – β when μ = 52.

37 Hypothesis Testing 4-3.3 One-Sided and Two-Sided Hypotheses
Two-sided test: H0: μ = μ0 vs. H1: μ ≠ μ0. One-sided tests: H0: μ = μ0 vs. H1: μ > μ0, or H0: μ = μ0 vs. H1: μ < μ0.

38 Hypothesis Testing 4-3.4 General Procedure for Hypothesis Testing
[Slide lists the numbered steps of the general hypothesis-testing procedure; steps 4 and 5 determine step 6.]

39 Case 1: Inference on the Mean of a Population, Variance Known
Assumptions

40 4-4 Inference on the Mean of a Population, Variance Known
4-4.1 Hypothesis Testing on the Mean We wish to test H0: μ = μ0 versus H1: μ ≠ μ0. The test statistic is Z0 = (X̄ – μ0)/(σ/√n). This statistic follows the standard normal distribution when H0 is true.

41 4-4 Inference on the Mean of a Population, Variance Known
4-4.1 Hypothesis Testing on the Mean Reject H0 if the observed value of the test statistic z0 is either z0 > zα/2 or z0 < –zα/2. Fail to reject H0 if –zα/2 < z0 < zα/2. The values ±zα/2 are the critical values that define the borders of the critical (rejection) region of the test statistic.
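The decision rule can be sketched as a small function (the sample values passed in below are illustrative, not from the slides):

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_sided(x_bar, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test decision: reject H0 when |z0| > z_{alpha/2}."""
    z0 = (x_bar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z0, "reject H0" if abs(z0) > z_crit else "fail to reject H0"

# Illustrative numbers (not from the slides):
print(z_test_two_sided(51.3, 50.0, 2.0, 16))  # z0 = 2.6, in the critical region
print(z_test_two_sided(50.4, 50.0, 2.0, 16))  # z0 = 0.8, not in it
```

With α = 0.05, the critical values are ±1.96, so the first sample mean leads to rejection and the second does not.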

42 4-4 Inference on the Mean of a Population, Variance Known
4-4.1 Hypothesis Testing on the Mean

43 4-4 Inference on the Mean of a Population, Variance Known
4-4.1 Hypothesis Testing on the Mean

44 Inference on the Mean of a Population, Variance Known
4-4.2 P-Values in Hypothesis Testing The P-value is the smallest level of significance that would lead to rejection of H0 with the given data. Alternatively, it is the probability that the test statistic will assume a value at least as extreme as the calculated z0 when the null hypothesis is true.
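For the z-test, that probability is a direct normal-cdf computation; a sketch:

```python
from statistics import NormalDist

def z_p_value(z0, alternative="two-sided"):
    """P-value for an observed z statistic under H0 (sketch)."""
    nd = NormalDist()
    if alternative == "two-sided":
        return 2 * (1 - nd.cdf(abs(z0)))  # both tails beyond |z0|
    if alternative == "greater":
        return 1 - nd.cdf(z0)             # upper tail only
    return nd.cdf(z0)                      # alternative == "less"

print(round(z_p_value(2.6), 4))  # small value: strong evidence against H0
```

Rejecting H0 whenever the P-value falls below α gives exactly the same decisions as the critical-region rule.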

45 Inference on the Mean of a Population, Variance Known
4-4.2 P-Values in Hypothesis Testing

46 Inference on the Mean of a Population, Variance Known
4-4.3 Type II Error and Choice of Sample Size Finding The Probability of Type II Error 

47 Inference on the Mean of a Population, Variance Known
4-4.3 Type II Error and Choice of Sample Size Finding The Probability of Type II Error 

48 Inference on the Mean of a Population, Variance Known
4-4.3 Type II Error and Choice of Sample Size Sample Size Formulas

49 4-4 Inference on the Mean of a Population, Variance Known
4-4.3 Type II Error and Choice of Sample Size

50

51 Example 3: A. Let B. For the data of Example 1 (thickness of copper foil)

52 Recall that MINITAB displays this:
Thus, the estimated standard error of the mean is S/√n = 0.166/√15 ≈ 0.043. Recall that MINITAB displays this:

Descriptive Statistics

Variable    N     Mean    Median   TrMean   StDev    SE Mean
THICKNESS   15    5.040   5.050    5.038    0.166    0.043

Variable    Minimum   Maximum   Q1       Q3
THICKNESS   4.720     5.380     4.940    5.120

53 Example 4: The mean force exerted at a particular point in a newly designed mechanical assembly should be less than 4000 psi. Based on previous designs, the engineer “knows” that the force is normally distributed with standard deviation 65 psi. He will obtain data on 20 prototype assemblies to test at the 0.01 level of significance. A. He obtained

Thus, the engineer cannot (as he “hoped” to) conclude that the mean force is less than 4000 psi. Do the problem in MINITAB, but note that you must have raw, not summarized, data. The engineer collected the data on the next slide.

55 Stat > Basic Statistics > 1-Sample Z
P-value > 0.01. Do not reject H0. MINITAB Output:

Test of mu = 4000 vs mu < 4000
The assumed sigma = 65.0

Variable   N    Mean   StDev   SE Mean   Z   P
FORCE      20

56 4-4 Inference on the Mean of a Population, Variance Known
4-4.4 Large-Sample Test In general, if n ≥ 30, the sample variance s² will be close to σ² for most samples, and so s can be substituted for σ in the test procedures with little harmful effect. That is, for large sample sizes, use Z0 = (X̄ – μ0)/(S/√n).

The results we have developed for the case of known variance are used for large-sample tests with unknown variance, with two important differences. Wherever you see a σ, substitute the sample standard deviation s. The confidence intervals and hypothesis tests become approximate rather than exact. If the underlying distribution is not too non-normal, the approximation should be adequate if n ≥ 30. We’ll do an example.

58 Example 5: A civil engineer working for NYSDOT tests 50 standard-sized specimens of concrete for moisture content after 5 hours of drying time. She collects the following data (appropriate units). She also constructs the histogram shown below. 11.7 15.3 29.1 13.1 39.1 9.4 14.9 11.2 25.0 26.4 31.0 20.6 12.4 19.6 16.9 5.2 15.8 17.8 21.1 19.6 41.7 39.3 22.7 33.1 23.4 23.0 19.3 17.2 5.1 59.4 5.7 12.1 27.7 6.5 16.9 21.9 32.0 20.9 18.6 10.4 5.7 14.2 11.7 15.3 15.2 19.5 11.4 13.6

59 Stat > Basic Statistics > Normality Test
The histogram indicates that the distribution is quite skewed. She also does a normality test: in MINITAB, do Stat > Basic Statistics > Normality Test. The P-value is very low.

60 She just wants a 95% confidence interval on the mean moisture content
She just wants a 95% confidence interval on the mean moisture content. Since the sample size is fairly large, an approximate 95% confidence interval is X̄ ± 1.96 S/√n. You can do this in MINITAB by “making believe” that σ is known and σ ≈ s.
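The same approximate interval can be sketched in Python with the values as they appear on the data slide (48 of the stated 50 specimens survive in this transcript):

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Moisture-content values as listed in the transcript.
moisture = [11.7, 15.3, 29.1, 13.1, 39.1, 9.4, 14.9, 11.2, 25.0, 26.4,
            31.0, 20.6, 12.4, 19.6, 16.9, 5.2, 15.8, 17.8, 21.1, 19.6,
            41.7, 39.3, 22.7, 33.1, 23.4, 23.0, 19.3, 17.2, 5.1, 59.4,
            5.7, 12.1, 27.7, 6.5, 16.9, 21.9, 32.0, 20.9, 18.6, 10.4,
            5.7, 14.2, 11.7, 15.3, 15.2, 19.5, 11.4, 13.6]

n = len(moisture)
x_bar = mean(moisture)
s = stdev(moisture)
z = NormalDist().inv_cdf(0.975)

# Large-sample (approximate) 95% CI: substitute s for sigma.
lo, hi = x_bar - z * s / sqrt(n), x_bar + z * s / sqrt(n)
print(round(lo, 2), round(hi, 2))
```

Because n is large, substituting s for σ costs little even though the underlying distribution is skewed.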

61 4-4 Inference on the Mean of a Population, Variance Known
4-4.5 Some Practical Comments on Hypothesis Testing Statistical versus Practical Significance

62 4-4 Inference on the Mean of a Population, Variance Known
4-4.5 Some Practical Comments on Hypothesis Testing Statistical versus Practical Significance

63 Interval Estimation The standard error provides a measure of the precision of an estimate. In fact, very often in the engineering literature a point on a graph will have so-called “error bars” around it. The plotted point is actually the mean of a sample, and the error bar extends one estimated standard error on either side of it. This is actually a special case of a confidence interval, formally defined on the next slide.

64 4-4 Inference on the Mean of a Population, Variance Known
4-4.6 Confidence Interval on the Mean Two-sided confidence interval: One-sided confidence intervals: Confidence coefficient:

65 4-4 Inference on the Mean of a Population, Variance Known
4-4.6 Confidence Interval on the Mean

66 4-4 Inference on the Mean of a Population, Variance Known
4-4.6 Confidence Interval on the Mean

α       z_α
0.005   2.576
0.01    2.326
0.025   1.960
0.05    1.645

67 Example 6: A. Let L = X̄ – zα/2 σ/√n and U = X̄ + zα/2 σ/√n. Then [by (10)] P(L ≤ μ ≤ U) = 1 – α. (11)

68 Comparison of (9) and (11) implies that a 100(1 – α)% confidence interval for μ is
X̄ ± zα/2 σ/√n. (12) B. The output temperature of a gas turbine is An engineer has “tweaked” the design in order to increase it by at least 5%. The standard deviation will not be affected. A test of 16 turbines gives a sample mean of 678º. By (12), a 95% confidence interval for the (population) mean is

69 4-4 Inference on the Mean of a Population, Variance Known
4-4.6 Confidence Interval on the Mean

70 4-4 Inference on the Mean of a Population, Variance Known
4-4.6 Confidence Interval on the Mean Relationship between Tests of Hypotheses and Confidence Intervals If [l, u] is a 100(1 – α) percent confidence interval for the parameter θ, then the significance-level-α test of the hypothesis H0: θ = θ0 will lead to rejection of H0 if and only if the hypothesized value θ0 is not in the 100(1 – α) percent confidence interval [l, u].

71 4-4 Inference on the Mean of a Population, Variance Known
4-4.6 Confidence Interval on the Mean Confidence Level and Precision of Estimation The length of the two-sided 95% confidence interval is 2(1.96)σ/√n = 3.92σ/√n, whereas the length of the two-sided 99% confidence interval is 2(2.576)σ/√n = 5.15σ/√n.

72 4-4 Inference on the Mean of a Population, Variance Known
4-4.6 Confidence Interval on the Mean Choice of Sample Size

73 4-4 Inference on the Mean of a Population, Variance Known
4-4.6 Confidence Interval on the Mean Choice of Sample Size

74 Example 7: The tensile strength of a fiber used in the manufacture of insulation material is normally distributed with a standard deviation of 0.04gm. Having made some process adjustments, however, the mean has not been established. In a test of 40 fibers, the data shown on the next slide was collected. Find a 99% confidence interval for the population mean.

75 The 99% confidence interval is:
Tensile Strength (gms) 9.99 10.05 10.02 9.96 10.00 10.03 9.98 10.01 9.90 10.04 9.93 9.97 9.96 10.00 10.07 9.95 10.01 9.99 The 99% confidence interval is: Let’s see how to do it using MINITAB. Do Basic Statistics > 1-Sample Z

76

77 The result of Example 7 is: “With 99% confidence, the mean tensile strength is between and grams.” However, for a variable like tensile strength, an engineer would perhaps like to say: “With 99% confidence, the mean tensile strength is at least (whatever).” This brings us to the notion of one-sided confidence intervals, defined next.

78 Definition: Suppose θ is an unknown parameter
Definition: Suppose θ is an unknown parameter. (i) Suppose L is a statistic based on a sample such that P(L ≤ θ) = 1 – α; (23a) then the observed value of L, l say, is called a 100(1 – α)% one-sided lower confidence limit or bound for θ. (ii) Suppose U is a statistic based on a sample such that P(θ ≤ U) = 1 – α; (23b) then the observed value of U, u say, is called a 100(1 – α)% one-sided upper confidence limit for θ.

We said we would define one-sided confidence intervals, but in the definition we refer to one-sided limits or bounds. Well – we really have defined intervals. Corresponding to the limits l and u in (23a,b) are the intervals (–∞, u) and (l, ∞). For the present case of sampling from a normal distribution with σ known, one-sided lower and upper 100(1 – α)% confidence limits are respectively X̄ – z_α σ/√n and X̄ + z_α σ/√n. (24)
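The one-sided limits of (24) can be sketched directly (the numeric inputs below are hypothetical, chosen only for illustration):

```python
from math import sqrt
from statistics import NormalDist

def one_sided_bounds(x_bar, sigma, n, conf=0.95):
    """Lower and upper one-sided 100*conf% confidence limits for mu,
    sigma known, per (24). Inputs here are hypothetical."""
    z = NormalDist().inv_cdf(conf)     # z_alpha with alpha = 1 - conf
    half = z * sigma / sqrt(n)
    return x_bar - half, x_bar + half  # lower limit l, upper limit u

l, u = one_sided_bounds(100.0, 4.0, 16, conf=0.95)
print(round(l, 3), round(u, 3))
```

Note the one-sided limits use z_α (1.645 at 95%), not zα/2 (1.96), so each one-sided bound sits closer to X̄ than the corresponding two-sided endpoint.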

80 Example 8: In Example 7, it would probably have been more appropriate to report a one-sided confidence limit. Suppose there is a customer-mandated spec that specifies that the mean strength be at least 10. The lower 99% one-sided confidence limit for the mean tensile strength of this fiber is Remark: MINITAB Student doesn’t do one-sided limits directly. Can you figure out how to do it indirectly?

81 4-4 Inference on the Mean of a Population, Variance Known
4-4.6 Confidence Interval on the Mean One-Sided Confidence Bounds

82 Problem 4-27 In the production of airbag inflators for automotive safety systems, a company is interested in ensuring that the mean distance of the foil to the edge of the inflator is at least 2 cm. Measurements on 20 inflators yielded an average value of 2.02 cm. Assume a standard deviation of .05 on the distance measurements and a significance level of .01. Test for conformance to the company’s requirement.

83 Airbag Inflator and Assembly

84 Problem 4-27 What is the P-value of this test?
What is the β of this test if the true mean is 1.97 cm? What sample size would be necessary to detect a true mean of 1.97 with probability of at least .90? Find a 99% lower confidence bound on the true mean. Use the CI found in part e to test the hypothesis in part a. What sample size is required to be 95% confident that the error in estimating the mean is less than .01?
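The part (a) test and its P-value can be sketched numerically (reading the conformance requirement as the one-sided test H0: μ = 2 vs. H1: μ > 2, which is an interpretive assumption):

```python
from math import sqrt
from statistics import NormalDist

# Problem 4-27(a): H0: mu = 2 vs H1: mu > 2 (assumed direction),
# n = 20, x_bar = 2.02, sigma = 0.05, alpha = 0.01.
n, x_bar, mu0, sigma, alpha = 20, 2.02, 2.0, 0.05, 0.01

z0 = (x_bar - mu0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
p_value = 1 - NormalDist().cdf(z0)

print(round(z0, 3), round(z_crit, 3), round(p_value, 4))
# z0 < z_crit, so H0 is not rejected at the 0.01 level.
```

The observed z0 of about 1.79 falls short of the 0.01-level critical value of about 2.33, so the data do not demonstrate conformance at that significance level.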

85 4.4 Case II: Inference on the Mean of a Population, Variance Unknown
Let X1, …, Xn be a random sample from N(μ, σ²) with σ² unknown. This case differs from Case I in that now we consider the more realistic situation in which the variance is unknown. The primary reason for considering Case I at all is that it is a convenient way of illustrating the basic concepts of confidence interval estimation and hypothesis testing.

86 and for the exact hypothesis test, under H0
When sampling from the normal distribution, the exact confidence interval for μ is based solely on the fact that (X̄ – μ)/(σ/√n) ~ N(0, 1), (27a) and for the exact hypothesis test, under H0, (X̄ – μ0)/(σ/√n) ~ N(0, 1). (27b)

Although (27a,b) are true whether or not the variance is known, the problem is that if σ is not known, this knowledge does us no good with respect to confidence intervals and hypothesis tests for the mean. Intuitively, what you don’t know, you estimate. In particular, suppose in (27a) we estimate the population standard deviation by the sample standard deviation. This leads to the question: what is the distribution of T = (X̄ – μ)/(S/√n)? (28)

The pdf of the random variable T defined in (28) has been derived [see eqn. (4-40), pg. 163, text]. The distribution is called (Student’s) t distribution. The parameter of a t distribution is its degrees of freedom, say k. In particular, the random variable defined by (28) has n – 1 degrees of freedom (write T ~ t_{n–1}). Thus, for each sample size, T has a different pdf. Intuitively, it would seem that the pdfs of the t distributions should look like standard normal pdfs. They do, but there is a difference. Some representative t pdfs are shown on the next slide.

89 Higher tails than the pdf of the standard normal
Pdfs of several t distributions, where k denotes the degrees of freedom. They look much like standard normal pdfs, but the tails are higher (more variance); note that the t pdf approaches that of the standard normal as k → ∞.

90 Percentage points of t distributions are defined similarly to those of the standard normal as shown in the figure below. Percentage points of the t distribution, where k denotes the degrees of freedom. To obtain percentage points, use MINITAB or Table II in Appendix A. We’ll demonstrate.
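The heavier tails can be checked by simulation, building t variates straight from the definition T = Z/√(V/k) with V a sum of k squared standard normals (k = 5 here is an arbitrary choice):

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(3)
k = 5  # degrees of freedom (arbitrary illustrative choice)

def t_variate(k):
    """One t_k draw from the definition T = Z / sqrt(V / k)."""
    z = random.gauss(0.0, 1.0)
    v = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))  # chi-square_k
    return z / sqrt(v / k)

draws = [t_variate(k) for _ in range(100_000)]
tail_t = sum(abs(t) > 2 for t in draws) / len(draws)
tail_z = 2 * (1 - NormalDist().cdf(2))  # normal mass beyond +/-2

# The t_5 tails are noticeably heavier than the standard normal's.
print(tail_t, round(tail_z, 4))
```

Roughly 10% of the t_5 draws fall beyond ±2, versus about 4.6% for the standard normal, which is exactly the "higher tails" feature the figure shows.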

91 4-5 Inference on the Mean of a Population, Variance Unknown
4-5.1 Hypothesis Testing on the Mean

92 4-5 Inference on the Mean of a Population, Variance Unknown
4-5.1 Hypothesis Testing on the Mean

93 4-5 Inference on the Mean of a Population, Variance Unknown
4-5.1 Hypothesis Testing on the Mean

94 Example 4-7 A sample of size 15 was taken on the coefficient of restitution of golf club drivers. The variance of the coefficient is unknown. Does the mean coefficient of restitution exceed 0.82?
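The one-sample t statistic for this kind of question can be sketched as follows. The transcript omits the 15 restitution values, so the data below are made-up stand-ins; the comparison value t_{0.05,14} = 1.761 is the standard table entry for the one-sided test at α = 0.05:

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """One-sample t statistic T0 = (x_bar - mu0) / (s / sqrt(n))."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

# Made-up stand-in data (the slide's actual values were lost).
sample = [0.84, 0.83, 0.81, 0.85, 0.86, 0.82, 0.84, 0.83,
          0.85, 0.84, 0.82, 0.86, 0.83, 0.84, 0.85]
t0 = t_statistic(sample, 0.82)
print(round(t0, 2), "reject H0" if t0 > 1.761 else "fail to reject H0")
```

With these illustrative values, t0 is well above 1.761, so H0: μ = 0.82 would be rejected in favor of μ > 0.82.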

95 4-5 Inference on the Mean of a Population, Variance Unknown
4-5.1 Hypothesis Testing on the Mean

96 4-5 Inference on the Mean of a Population, Variance Unknown
4-5.2 P-value for a t-Test

97 β – P(Type II Error) We need to know the distribution of the test statistic under H1. The answer is: T0 has a so-called non-central t distribution. The probability of making a type II error, and the corresponding power of the test, can be found for hypothesis tests with unknown variance in a fashion identical to tests with known variance. However, numerically this is a cumbersome computational procedure, and we will find β values and the power of a test using MINITAB only.

98 β - P( Type II Error) Example 4-8
If the mean differs from .82 by as much as .02, is the sample size of 15 adequate to ensure the null hypothesis will be rejected with probability of at least .80? Stat > Power and Sample Size > 1-Sample t. Calculate power for each sample size: specify a sample size, difference, and sigma, click Options, and insert the type of alternative hypothesis and level of significance.

99 Sample size The sample size required to detect a difference from the hypothesized mean, with a specified power, when the variance of the characteristic of interest is unknown will be determined using MINITAB. Example 4-8: Stat > Power and Sample Size > 1-Sample t. Calculate sample size for each power value: specify a power, difference, and sigma, click Options, and insert the type of alternative hypothesis and level of significance.
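As a rough stand-in for MINITAB's power calculation, the power of the t-test can be estimated by simulation; the σ value below is a guess supplied for illustration, and t_{0.05,14} = 1.761 is the table critical value:

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(4)

def sim_power_t(n, mu0, mu_true, sigma, t_crit, reps=4000):
    """Monte Carlo power of the one-sided t-test (reject when t0 > t_crit).
    A rough stand-in for Minitab's power calculation."""
    rejections = 0
    for _ in range(reps):
        x = [random.gauss(mu_true, sigma) for _ in range(n)]
        t0 = (mean(x) - mu0) / (stdev(x) / sqrt(n))
        rejections += t0 > t_crit
    return rejections / reps

# n = 15, true shift 0.02 with sigma = 0.02 (sigma is a guess here).
power = sim_power_t(15, 0.82, 0.84, 0.02, 1.761)
print(power)
```

Under this assumed σ, a shift of one standard deviation is detected with high probability at n = 15; a smaller assumed shift or larger σ would lower the estimate.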

100 4-5 Inference on the Mean of a Population, Variance Unknown
4-5.3 Type II Error and Choice of Sample Size Fortunately, this unpleasant task has already been done, and the results are summarized in a series of graphs in Appendix A, Charts Va, Vb, Vc, and Vd, that plot β for the t-test against a parameter d for various sample sizes n.

101 4-5 Inference on the Mean of a Population, Variance Unknown
4-5.3 Type II Error and Choice of Sample Size These graphs are called operating characteristic (or OC) curves. Curves are provided for two-sided alternatives on Charts Va and Vb. The abscissa scale factor d on these charts is defined as d = |μ – μ0|/σ.

102 Confidence Intervals for the Mean
We can derive a confidence interval for the t-distributed statistic in a manner similar to the case when the test statistic follows the standard normal distribution. We can write P(–tα/2,n–1 ≤ T ≤ tα/2,n–1) = 1 – α. (29) By the definition of confidence intervals (see (9)), we see from (29) that

103 A 100(1 – )% confidence interval for the mean of a normal distribution, variance unknown, is
X̄ ± tα/2,n–1 S/√n. (30) To obtain one-sided lower (upper) 100(1 – α)% limits, in (30) change α/2 to α and take the lower (upper) limit. Example 9: Suppose in Example 7 neither the mean nor the standard deviation is known. A. Use the data given in Example 7 to find a 99% confidence interval for the mean.

104 In Example 7, confidence interval when the variance is unknown is thus Recall when the variance was known in ex. 7, the CI was B. Now use MINITAB. Go to: Stat>Basic Statistics > 1-Sample t The results appear on the next slide.

105

106 Sample Size Determination for Confidence Intervals (Variance Unknown)
Recall that in the case where the variance is known, the sample size for a 100(1 – α)% confidence interval of half-width E is given by (26), repeated below: n = (zα/2 σ/E)². (26) In the present case, with the variance unknown, the best that can be done is to use (26) with your best guess of σ. You can also take a pilot sample and estimate σ.
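Formula (26) can be sketched as a one-line computation (rounding up, since n must be an integer):

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, E, conf=0.95):
    """Sample size so a 100*conf% CI has half-width at most E, per (26):
    n = (z_{alpha/2} * sigma / E)^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z * sigma / E) ** 2)

# E.g. with the Example 7 sigma of 0.04 gm and a desired half-width
# of 0.01 gm at 99% confidence:
print(sample_size(0.04, 0.01, conf=0.99))
```

Tightening either the half-width or the confidence level drives the required n up quadratically in z·σ/E.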

107 4-5 Inference on the Mean of a Population, Variance Unknown
4-5.4 Confidence Interval on the Mean

108 4-5 Inference on the Mean of a Population, Variance Unknown
4-5.4 Confidence Interval on the Mean

109 4-5 Inference on the Mean of a Population, Variance Unknown
4-5.4 Confidence Interval on the Mean

