Statistics & Design of Experiments (Statistika & Rancangan Percobaan) Dicky Dermawan www.dickydermawan.net78.net dickydermawan@gmail.com 2 - Statistics
Nature & Purpose of Statistics
In statistics we are concerned with methods for designing and evaluating experiments to obtain information about practical problems. In most cases inspecting every item of a population would be too expensive, too time-consuming, or even impossible. Hence a few samples are drawn at random, and conclusions about the population are inferred from their inspection.
Parameter Estimation
Population | Sample
Size: large number N | Size: small number n
Mean $\mu$ | Average $\bar{y}$
Variance $\sigma^2$ | Variance $S^2$
Probability function/density f(x) | Relative frequency function
Distribution function F(x) | Cumulative frequency function
Processing of Sample: Frequency Table
Sample of 100 values of the splitting tensile strength (lb/in^2): 320 380 340 410 360 350 370 300 420 390 440 330 400
Processing of Sample: Absolute Frequency Function (sample of 100 values of the splitting tensile strength, lb/in^2)
Processing of Sample: Relative Frequency Function (sample of 100 values of the splitting tensile strength, lb/in^2)
Processing of Sample: Cumulative Absolute Frequency (sample of 100 values of the splitting tensile strength, lb/in^2)
Processing of Sample: Cumulative Relative Frequency (sample of 100 values of the splitting tensile strength, lb/in^2)
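The four frequency functions named on these slides can be tabulated in a few lines. The sketch below is only an illustration: the ten values in `sample` are hypothetical and are not the 100 tensile-strength values from the slide.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical sample in lb/in^2 (NOT the 100 values shown on the slide).
sample = [320, 380, 340, 410, 380, 340, 360, 350, 320, 370]

n = len(sample)
counts = Counter(sample)
values = sorted(counts)                        # distinct values, ascending
absolute = [counts[v] for v in values]         # absolute frequency
relative = [a / n for a in absolute]           # relative frequency
cum_absolute = list(accumulate(absolute))      # cumulative absolute frequency
cum_relative = [c / n for c in cum_absolute]   # cumulative relative frequency

print("value  abs    rel  cum    cum_rel")
for row in zip(values, absolute, relative, cum_absolute, cum_relative):
    print("{:>5} {:>4} {:>6.2f} {:>4} {:>10.2f}".format(*row))
```

The relative frequencies sum to 1 and the last cumulative absolute frequency equals the sample size, which is a quick sanity check on any frequency table.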
Processing of Sample: Box & Whisker Plot
Plot elements: minimum, lower quartile, middle quartile (= median), upper quartile, interquartile range, maximum.
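The quantities drawn in a box & whisker plot can be computed directly. This is a minimal sketch on hypothetical data. It uses the "inclusive" quartile convention of Python's `statistics.quantiles`, and other textbooks or packages may use slightly different quartile rules.

```python
import statistics

# Hypothetical sample (assumption, for illustration only).
sample = [320, 380, 340, 410, 380, 340, 360, 350, 320, 370]

# Quartiles by linear interpolation between the ordered observations.
q1, median, q3 = statistics.quantiles(sample, n=4, method="inclusive")

summary = {
    "min": min(sample),
    "lower quartile": q1,
    "median": median,
    "upper quartile": q3,
    "max": max(sample),
    "interquartile range": q3 - q1,
}

for name, value in summary.items():
    print(f"{name:>20}: {value}")
```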
Assignment: Box & Whisker Plot Portland Cement Formulation DOX 6E Montgomery
Inferential Statistics
- Experimental error
- Hypothesis testing: null hypothesis H0, alternative hypothesis H1
- Type I error (probability α): rejecting a true null hypothesis
- Type II error (probability β): failing to reject a false null hypothesis
- One-tailed test vs. two-tailed test
- Confidence level = 1 - significance level α
- P-value
- Confidence interval
Confidence interval of a normal distribution if σ is known: comparing a single mean to a specified ('standard') value
So far we have regarded the values y1, y2, ..., yn of a sample as n observed values of a single random variable Y. We may equally well regard these n values as single observations of n random variables Y1, Y2, ..., Yn that have the same distribution and are independent.
If Y1, ..., Yn are independent normal random variables, each with mean $\mu$ and variance $\sigma^2$, then the random variable
$\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$
is normal with mean $\mu$ and variance $\sigma^2/n$, and the random variable
$Z = \frac{\bar{Y} - \mu}{\sigma/\sqrt{n}}$
is normal with mean 0 and variance 1.
The $100(1-\alpha)\%$ confidence interval for $\mu$ is
$\bar{y} - z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{y} + z_{\alpha/2}\,\sigma/\sqrt{n}$
Problem: Example 2.1
A vendor submits lots of fabric to a textile manufacturer. The manufacturer wants to know if the lot average breaking strength exceeds 200 psi. If so, she wants to accept the lot. Past experience indicates that a reasonable value for the variance of breaking strength is 100 (psi)^2. Four specimens are randomly selected, and the average breaking strength observed is $\bar{y} = 214$ psi.
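Example 2.1
The hypotheses to be tested are:
$H_0: \mu = 200$
$H_1: \mu > 200$
This is a one-sided alternative hypothesis.
The value of the test statistic is:
$Z_0 = \frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} = \frac{214 - 200}{10/\sqrt{4}} = 2.80$
If a confidence level of 95% is chosen, i.e. type I error α = 0.05, we find $Z_\alpha = Z_{0.05} = 1.645$. Since $Z_0 = 2.80 > 1.645$, the difference is significant: H0 is rejected and we conclude that the lot average breaking strength exceeds 200 psi. Thus, we accept the lot.
The interval $205.8 \le \mu \le 222.2$ is built with z = 1.645, so as a two-sided interval it has a 90% confidence level. Its lower limit, $\mu \ge 205.8$, is the one-sided 95% lower confidence bound that matches this test. The two-sided 95% interval (z = 1.96) is $204.2 \le \mu \le 223.8$. In every case, 200 lies outside the interval.
The P-value is 0.0026.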
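The arithmetic of Example 2.1 can be checked with the standard library. This sketch assumes an observed average of 214 psi (the midpoint of the quoted interval), σ = 10 psi and n = 4. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Assumed inputs for Example 2.1: y_bar = 214 psi, sigma^2 = 100 (psi)^2, n = 4.
y_bar, mu0, sigma, n = 214.0, 200.0, 10.0, 4
alpha = 0.05

std_normal = NormalDist()            # standard normal: mean 0, variance 1
se = sigma / sqrt(n)                 # standard error of the sample mean
z0 = (y_bar - mu0) / se              # test statistic

z_alpha = std_normal.inv_cdf(1 - alpha)     # one-sided critical value
p_value = 1 - std_normal.cdf(z0)            # upper-tail P-value
reject_h0 = z0 > z_alpha

# One-sided 95% lower confidence bound and two-sided 95% interval.
lower_bound = y_bar - z_alpha * se
z_half = std_normal.inv_cdf(1 - alpha / 2)
two_sided = (y_bar - z_half * se, y_bar + z_half * se)

print(f"Z0 = {z0:.2f}, Z_alpha = {z_alpha:.3f}, P-value = {p_value:.4f}")
print(f"reject H0: {reject_h0}")
print(f"one-sided 95% lower bound: {lower_bound:.1f}")
print(f"two-sided 95% interval: {two_sided[0]:.1f} to {two_sided[1]:.1f}")
```

Both the lower bound and the two-sided interval exclude 200, in agreement with rejecting H0.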
Problems
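Comparing a single mean to a specified ('standard') value
Confidence interval of a normal distribution if σ is unknown
The same as before, but we use:
- the t distribution instead of the normal distribution
- the sample standard deviation S instead of σ
The test statistic is
$t_0 = \frac{\bar{y} - \mu_0}{S/\sqrt{n}}$
The confidence interval is
$\bar{y} - t_{\alpha/2,\,n-1}\,S/\sqrt{n} \le \mu \le \bar{y} + t_{\alpha/2,\,n-1}\,S/\sqrt{n}$
with (n - 1) degrees of freedom.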
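A minimal sketch of the single-mean t procedure follows. The data are hypothetical, and because the Python standard library has no t distribution, the critical value is passed in from a t table (here t with α/2 = 0.025 and 4 degrees of freedom, 2.776).

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0, t_crit):
    """Return the t statistic and two-sided confidence interval for the mean.

    t_crit is the tabulated value t_{alpha/2, n-1}; it is supplied by the
    caller because the standard library has no t distribution.
    """
    n = len(sample)
    y_bar = mean(sample)
    se = stdev(sample) / sqrt(n)     # S / sqrt(n), with S the sample std. dev.
    t0 = (y_bar - mu0) / se
    ci = (y_bar - t_crit * se, y_bar + t_crit * se)
    return t0, ci

# Hypothetical breaking-strength data (assumption, for illustration only).
sample = [203.0, 198.0, 207.0, 201.0, 196.0]
T_CRIT = 2.776                       # t_{0.025, 4} from a t table

t0, ci = one_sample_t(sample, mu0=200.0, t_crit=T_CRIT)
reject_h0 = abs(t0) > T_CRIT

print(f"t0 = {t0:.3f}, reject H0: {reject_h0}")
print(f"95% CI: {ci[0]:.2f} to {ci[1]:.2f}")
```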
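Comparing 2 Treatment Means If Variances Are Known
The test statistic is
$Z_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
The confidence interval is
$\bar{y}_1 - \bar{y}_2 - z_{\alpha/2}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2} \le \mu_1 - \mu_2 \le \bar{y}_1 - \bar{y}_2 + z_{\alpha/2}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$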
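Comparing 2 Treatment Means If Variances Are Unknown, but $\sigma_1^2 = \sigma_2^2$
The test statistic is
$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{1/n_1 + 1/n_2}}$, where $S_p^2 = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}$
- Choose a confidence level, usually 95%, then find the critical t value at the associated degrees of freedom, i.e. $t_{\alpha/2,\,\nu}$ with $\nu = n_1 + n_2 - 2$.
- If $|t_0| > t_{\alpha/2,\,\nu}$, we have enough reason to reject the null hypothesis and conclude that the two methods differ significantly.
- Alternatively, calculate the P-value, i.e. the smallest significance level (risk of wrongly rejecting the null hypothesis) at which the data would lead to rejection.
- Or construct the confidence interval for $\mu_1 - \mu_2$ and reject the null hypothesis if 0 is not included in the interval.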
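The pooled two-sample procedure can be sketched as follows. The two data sets are hypothetical (they are not Table 2.1), and the critical value 2.306 is the tabulated t for α/2 = 0.025 and 8 degrees of freedom, passed in by hand.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_test(y1, y2, t_crit):
    """Two-sample t test assuming equal but unknown variances."""
    n1, n2 = len(y1), len(y2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / df
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = mean(y1) - mean(y2)
    t0 = diff / se
    ci = (diff - t_crit * se, diff + t_crit * se)   # interval for mu1 - mu2
    return t0, df, ci

# Hypothetical strength data for two formulations (assumption).
modified = [16.8, 16.4, 17.2, 16.6, 17.0]
unmodified = [17.5, 17.6, 18.2, 17.9, 17.8]
T_CRIT = 2.306                        # t_{0.025, 8} from a t table

t0, df, ci = pooled_t_test(modified, unmodified, T_CRIT)
reject_h0 = abs(t0) > T_CRIT

print(f"t0 = {t0:.3f} with {df} degrees of freedom, reject H0: {reject_h0}")
print(f"95% CI for mu1 - mu2: {ci[0]:.3f} to {ci[1]:.3f}")
```

The two decision rules agree: H0 is rejected exactly when the interval for the difference excludes 0.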
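Comparing 2 Treatment Means If Variances Are Unknown, $\sigma_1^2 \ne \sigma_2^2$
The test statistic is
$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$
which is compared with the t distribution using the approximate degrees of freedom
$\nu = \frac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}{\frac{(S_1^2/n_1)^2}{n_1-1} + \frac{(S_2^2/n_2)^2}{n_2-1}}$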
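For unequal variances, the statistic and its approximate degrees of freedom (the Welch-Satterthwaite approximation) can be computed as below. The data are hypothetical, and the function name `welch_t` is only illustrative.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(y1, y2):
    """Return (t0, nu) for two samples with unequal, unknown variances."""
    n1, n2 = len(y1), len(y2)
    v1 = variance(y1) / n1            # S1^2 / n1
    v2 = variance(y2) / n2            # S2^2 / n2
    t0 = (mean(y1) - mean(y2)) / sqrt(v1 + v2)
    # Approximate degrees of freedom; usually not an integer.
    nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t0, nu

# Hypothetical data with clearly different spreads (assumption).
y1 = [16.8, 16.4, 17.2, 16.6, 17.0]
y2 = [17.0, 18.6, 17.4, 18.2]

t0, nu = welch_t(y1, y2)
print(f"t0 = {t0:.3f}, approximate degrees of freedom = {nu:.2f}")
```

The resulting ν lies between the smaller of n1 - 1 and n2 - 1 and the pooled value n1 + n2 - 2. When reading a t table, a common choice is to round ν down to the nearest integer.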
Problems
Example: Portland Cement Formulation: Dot Diagram
Tension bond strength of portland cement mortar is an important characteristic of the product. An engineer is interested in comparing the strength of a modified formulation, in which polymer latex emulsions have been added during mixing, to the strength of the unmodified mortar. He collected 10 observations for each formulation (Table 2.1).
- Plot the dot diagram.
- Plot the box & whisker plot.
- Are the two formulations really different? Or is the observed difference perhaps the result of sampling fluctuation, with the two formulations really identical?
DOX 6E Montgomery
Problems
Inference about the difference in means
Blocking: Paired Comparison Design
Blocking is a design technique used to improve the precision with which comparisons among the factors of interest are made. Often blocking is used to reduce or eliminate the variability transmitted from nuisance factors, i.e. factors that may influence the experimental response but in which we are not interested.
The term block refers to a relatively homogeneous experimental unit, and the block represents a restriction on complete randomization because the treatment combinations are only randomized within the block.
Blocking is carried out by making comparisons within matched pairs of experimental material.
The confidence interval based on the paired analysis is usually much narrower than that from the independent analysis. This illustrates the noise reduction property of blocking.
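Inference about the difference in means
Blocking: Paired Comparison Design
Statistical model for complete randomization:
$y_{ij} = \mu_i + \varepsilon_{ij}, \quad i = 1, 2; \; j = 1, \dots, n$
with 2(n - 1) = 2n - 2 degrees of freedom
Statistical model with blocking:
$y_{ij} = \mu_i + \beta_j + \varepsilon_{ij}, \quad i = 1, 2; \; j = 1, \dots, n$
with only (n - 1) degrees of freedom, where n is the number of pairs
The test statistic:
$t_0 = \frac{\bar{d}}{S_d/\sqrt{n}}$, where $d_j = y_{1j} - y_{2j}$, $\bar{d}$ is the average of the differences and $S_d$ is their sample standard deviation
The confidence interval for the 2-sided test:
$\bar{d} - t_{\alpha/2,\,n-1}\,S_d/\sqrt{n} \le \mu_1 - \mu_2 \le \bar{d} + t_{\alpha/2,\,n-1}\,S_d/\sqrt{n}$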
Inference about the difference in means
Blocking: Paired Comparison Design
Example: The Story
Consider a hardness testing machine that presses a rod with a pointed tip into a metal specimen with a known force. Two different tips are available for this machine, and it is suspected that one tip produces different hardness readings than the other.
The test could be performed as follows: a number of metal specimens could be randomly selected. Half are tested with tip 1 and the other half with tip 2.
The metal specimens might be cut from different bar stock that was not exactly identical in hardness. To protect against this possibility, an alternative experimental design should be considered: divide each specimen into two parts and randomly assign each tip to one half of each specimen.
Inference about the difference in means
Blocking: Paired Comparison Design
Example: Data
Specimen Tip 1 Tip 2 1 7 6 2 3 5 4 8 9 10
- Use the paired data to determine a 95% confidence interval for the difference.
- What if we use pooled or independent analysis?
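The two questions on this slide can be explored with a short sketch. The hardness readings below are assumed for illustration (ten specimens, each tested with both tips) and should be replaced by the values in the slide's table. The critical values 2.262 (9 degrees of freedom) and 2.101 (18 degrees of freedom) are tabulated t values for α/2 = 0.025.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Assumed hardness readings, one pair per specimen (illustrative only).
tip1 = [7, 3, 3, 4, 8, 3, 2, 9, 5, 4]
tip2 = [6, 3, 5, 3, 8, 2, 4, 9, 4, 5]
n = len(tip1)

# Paired analysis: work with the within-specimen differences d_j.
d = [a - b for a, b in zip(tip1, tip2)]
d_bar = mean(d)
s_d = stdev(d)
t0_paired = d_bar / (s_d / sqrt(n))
T_PAIRED = 2.262                      # t_{0.025, 9}, n - 1 degrees of freedom
half_paired = T_PAIRED * s_d / sqrt(n)

# Independent (pooled) analysis of the same numbers, for comparison.
sp2 = ((n - 1) * variance(tip1) + (n - 1) * variance(tip2)) / (2 * n - 2)
T_POOLED = 2.101                      # t_{0.025, 18}, 2n - 2 degrees of freedom
half_pooled = T_POOLED * sqrt(sp2 * (2 / n))

print(f"paired: t0 = {t0_paired:.2f}, 95% CI = {d_bar:.2f} +/- {half_paired:.2f}")
print(f"pooled: 95% CI = {d_bar:.2f} +/- {half_pooled:.2f}")
```

With these numbers the paired interval is much narrower than the pooled one even though it has fewer degrees of freedom, because specimen-to-specimen variation cancels in the differences.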
Problems
Inference about The Variances of Normal Distributions
In some experiments it is the comparison of variability in the data that is important. For example, in chemical laboratories, we may wish to compare the variability of two analytical methods. Unlike the tests on means, the procedures for tests on variances are rather sensitive to the normality assumption.
Suppose we wish to test the hypothesis of whether or not the variance of a normal population equals a constant, viz. $\sigma_0^2$. The test statistic is:
$\chi_0^2 = \frac{(n-1)S^2}{\sigma_0^2}$
The appropriate distribution for $\chi_0^2$ is the chi-square distribution with (n - 1) degrees of freedom.
The confidence interval for $\sigma^2$ is
$\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}$
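A minimal sketch of the single-variance test follows. The sample and the hypothesized variance are hypothetical, and the chi-square percentage points for 4 degrees of freedom (11.143 and 0.484) are taken from a table and passed in by hand, since the standard library has no chi-square distribution.

```python
from statistics import variance

# Hypothetical sample and hypothesized variance (assumptions).
sample = [203, 198, 207, 201, 196]
sigma0_sq = 25.0

n = len(sample)
s2 = variance(sample)                 # sample variance S^2
chi0 = (n - 1) * s2 / sigma0_sq       # test statistic

# Tabulated upper percentage points for n - 1 = 4 degrees of freedom.
CHI2_UPPER = 11.143                   # chi-square_{0.025, 4}
CHI2_LOWER = 0.484                    # chi-square_{0.975, 4}

reject_h0 = chi0 > CHI2_UPPER or chi0 < CHI2_LOWER
ci = ((n - 1) * s2 / CHI2_UPPER, (n - 1) * s2 / CHI2_LOWER)

print(f"S^2 = {s2}, chi0^2 = {chi0:.2f}, reject H0: {reject_h0}")
print(f"95% CI for sigma^2: {ci[0]:.2f} to {ci[1]:.1f}")
```

The interval is far from symmetric around S^2, which reflects the skewness of the chi-square distribution at small sample sizes.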
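Inference about The Variances of Normal Distributions
Suppose we wish to test equality of the variances of two normal populations. If independent random samples of size n1 and n2 are taken from populations 1 & 2, respectively, the test statistic for:
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \ne \sigma_2^2$
is the ratio of the sample variances:
$F_0 = \frac{S_1^2}{S_2^2}$
The appropriate distribution for $F_0$ is the F distribution with (n1 - 1) numerator degrees of freedom and (n2 - 1) denominator degrees of freedom. The null hypothesis would be rejected if $F_0 > F_{\alpha/2,\,n_1-1,\,n_2-1}$ or if $F_0 < F_{1-\alpha/2,\,n_1-1,\,n_2-1}$.
The confidence interval for $\sigma_1^2/\sigma_2^2$ is
$\frac{S_1^2}{S_2^2}\,F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2}\,F_{\alpha/2,\,n_2-1,\,n_1-1}$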
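The variance-ratio test can be sketched as below. The data are hypothetical, and the F percentage point 9.60 (α/2 = 0.025 with 4 and 4 degrees of freedom) is read from a table. Lower-tail points are obtained from the reciprocal relation F(1-α/2; ν1, ν2) = 1 / F(α/2; ν2, ν1).

```python
from statistics import variance

# Hypothetical samples from two analytical methods (assumption).
y1 = [16.8, 16.4, 17.2, 16.6, 17.0]
y2 = [17.5, 17.6, 18.2, 17.9, 17.8]

f0 = variance(y1) / variance(y2)      # ratio of the sample variances

# Tabulated upper 2.5% points of the F distribution.
F_UPPER_12 = 9.60                     # F_{0.025, n1-1, n2-1} = F_{0.025, 4, 4}
F_UPPER_21 = 9.60                     # F_{0.025, n2-1, n1-1} = F_{0.025, 4, 4}

# Two-sided rejection region, using the reciprocal rule for the lower tail.
reject_h0 = f0 > F_UPPER_12 or f0 < 1 / F_UPPER_21

# Confidence interval for sigma1^2 / sigma2^2.
ci = (f0 / F_UPPER_12, f0 * F_UPPER_21)

print(f"F0 = {f0:.3f}, reject H0: {reject_h0}")
print(f"95% CI for the variance ratio: {ci[0]:.3f} to {ci[1]:.2f}")
```

Here the interval contains 1, so equal variances cannot be ruled out with these samples.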
Checking for Normality: Normal Probability Plot
Probability plotting is a graphical technique for determining whether sample data conform to a hypothesized distribution, based on a subjective visual examination of the data.
To construct a probability plot, the observations in the sample are first ranked from smallest to largest. That is, the sample y1, y2, ..., yn is arranged as y(1), y(2), ..., y(n), where y(1) is the smallest observation and y(n) the largest. The ordered observations y(j) are then plotted against their observed cumulative frequency (j - 0.5)/n.
The cumulative frequency scale is arranged so that, if the hypothesized distribution adequately describes the data, the plotted points fall approximately along a straight line. Judging whether they do is usually subjective.
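The plotting positions described above can be generated as follows. Instead of special probability paper, this sketch converts each cumulative frequency (j - 0.5)/n to a standard normal score, so that plotting the scores against the ordered data on ordinary linear axes gives an equivalent picture. The data are hypothetical.

```python
from statistics import NormalDist

def normal_plot_points(sample):
    """Return (y_(j), cumulative frequency, normal score) for each observation."""
    ordered = sorted(sample)          # y(1) <= y(2) <= ... <= y(n)
    n = len(ordered)
    std_normal = NormalDist()
    points = []
    for j, y in enumerate(ordered, start=1):
        p = (j - 0.5) / n             # observed cumulative frequency
        z = std_normal.inv_cdf(p)     # standard normal score for p
        points.append((y, p, z))
    return points

# Hypothetical sample (assumption, for illustration only).
sample = [16.8, 16.4, 17.2, 16.6, 17.0]
points = normal_plot_points(sample)

for y, p, z in points:
    print(f"y = {y:5.1f}   (j - 0.5)/n = {p:.2f}   z = {z:+.3f}")
```

If the sample is roughly normal, the (y, z) pairs fall close to a straight line.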
Problems
Introduction to DOX
- An experiment is a test or a series of tests
- Experiments are used widely in the engineering world
  - Process characterization & optimization
  - Evaluation of material properties
  - Product design & development
  - Component & system tolerance determination
- “All experiments are designed experiments, some are poorly designed, some are well-designed”
DOX 6E Montgomery
The Basic Principles of DOX
- Randomization
  - Running the trials in an experiment in random order
  - Notion of balancing out effects of “lurking” variables
- Replication
  - Sample size (improving precision of effect estimation, estimation of error or background noise)
  - Replication versus repeat measurements? (see page 13)
- Blocking
  - Dealing with nuisance factors
DOX 6E Montgomery
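Randomization and blocking are easy to put into practice with a random number generator. The sketch below is one possible way to do it, with hypothetical helper names: the first function shuffles a fully replicated run list, and the second randomizes the treatment order separately inside each block.

```python
import random

def randomized_run_order(treatments, replicates, seed=None):
    """Completely randomized design: shuffle all replicated runs together."""
    rng = random.Random(seed)
    runs = [t for t in treatments for _ in range(replicates)]
    rng.shuffle(runs)
    return runs

def randomize_within_blocks(treatments, blocks, seed=None):
    """Blocked design: every block receives all treatments in random order."""
    rng = random.Random(seed)
    return {b: rng.sample(treatments, k=len(treatments)) for b in blocks}

# Illustrative use: two tips, three replicates, then three specimens as blocks.
run_order = randomized_run_order(["tip 1", "tip 2"], replicates=3, seed=1)
block_plan = randomize_within_blocks(["tip 1", "tip 2"], blocks=[1, 2, 3], seed=1)

print(run_order)
for block, order in block_plan.items():
    print(f"specimen {block}: {order}")
```

Fixing `seed` makes the plan reproducible. Leaving it as `None` draws a fresh random order on each call.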
Strategy of Experimentation
- “Best-guess” experiments
  - Used a lot
  - More successful than you might suspect, but there are disadvantages…
- One-factor-at-a-time (OFAT) experiments
  - Sometimes associated with the “scientific” or “engineering” method
  - Devastated by interaction, also very inefficient
- Statistically designed experiments
  - Based on Fisher’s factorial concept
DOX 6E Montgomery