2/15/2016 ENGM 720: Statistical Process Control
ENGM Lecture 03: Describing & Using Distributions, SPC Process

Assignment:
- Reading:
  - Chapter 2: finish reading
  - Chapter 3: start reading
- Assignment 2:
  - Obtain access to MS Excel
  - Verify access to the Data Analysis Add-In
  - Access the class website:
    - Download the Normal Plot data spreadsheet
    - Download the Assignment 2 instructions (Materials page)

What is Quality?
Many definitions:
- Better performance
- Better service
- Better value
- Whatever the customer says it is…
For SPC, quality means better:
- Understanding of process variation,
- Control of the variation in the process, and
- Improvement in the process variation.

Understanding Process Variation
Three aspects:
- Location
- Spread
- Shape
Basic statistics:
- Quantify
- Communicate

Location: Mode
The mode is the value (or values) that occurs most frequently in a distribution.
To find the mode:
1. Sort the values into order (with no repeats).
2. Tally how many times each value appears in the original distribution.
3. The mode (or modes) has the largest tally.
Dist. 1 has two modes: 20 and 15 (four times each).
Dist. 2 has one mode: 15 (appearing seven times).

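The tally procedure for the mode can be sketched in a few lines of Python. This is only an illustration: the function name `modes` and the sample values are hypothetical and are not the slide's Dist. 1 or Dist. 2 data.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the largest tally, in sorted order."""
    tally = Counter(values)        # distinct values and how often each appears
    top = max(tally.values())      # the largest tally
    return sorted(v for v, count in tally.items() if count == top)

# Hypothetical data, chosen so that two values tie for the largest tally.
sample = [15, 20, 15, 18, 20, 17]
print(modes(sample))
```

Returning a list rather than a single number keeps the multi-mode case (as in Dist. 1) visible instead of silently picking one value.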
Location: Median
Half of the values fall above the median and half fall below it.
To estimate the median:
- Sort the values (keeping the duplicates in the list), then count from one end until you reach one half (rounding down) of the total number of values.
- For an odd number of values, the median is the next value.
- For an even number of values, the median is half of the sum of the current value and the next sorted value.
Dist. 1 median is 19.5.
Dist. 2 median is 15.

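The counting rule for the median translates directly into code. The sketch below is one possible reading of that rule, with a hypothetical function name and made-up values; it is cross-checked against the standard library's `statistics.median`.

```python
import statistics

def median_by_counting(values):
    """Median found by sorting and counting halfway, as in the manual procedure."""
    ordered = sorted(values)           # duplicates stay in the list
    half = len(ordered) // 2           # half the count, rounded down
    if len(ordered) % 2 == 1:
        return ordered[half]           # odd count: the next value after counting `half` values
    # even count: average the last value counted and the next sorted value
    return (ordered[half - 1] + ordered[half]) / 2

# Hypothetical data (not the slide's distributions).
odd_sample = [19, 15, 20, 17, 18]
even_sample = [19, 15, 20, 17]
print(median_by_counting(odd_sample), statistics.median(odd_sample))
print(median_by_counting(even_sample), statistics.median(even_sample))
```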
Location: Mean
The mean has a special notation: x̄ for a sample (μ for the entire population).
To calculate the mean:
1. Add up all of the values.
2. Divide the sum by the number of values.
Dist. 1 mean is 18.6; Dist. 2 mean is 15.0.
The mean is influenced by outliers.

Spread: Range
Range is the difference between the maximum and the minimum values, denoted R. This value gives us the extreme limits of the distribution spread.
- Much easier to calculate than other measures
- Very sensitive to outliers
Range of Dist. 1 is 11.
Range of Dist. 2 is 4.

Spread: Variance
Variance has the symbol σ² when referring to the entire population (s² for a sample variance).
The formula for the sample variance is:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
- Measures the dispersion with less emphasis on outliers than the range
- Units for variance aren't very intuitive (they are the square of the measurement units)
- Manual calculation is unpleasant (a calculating equation could be used)
The variance for Dist. 1 is 10.58; for Dist. 2 it is 1.63.
If the entire population is known, use n in the denominator!

Spread: Standard Deviation
The standard deviation (σ for the population, or s for a sample) is the square root of the variance.
Definition:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
Special calculating formula:
$$s = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n - 1}}$$
- Not as easily influenced by outliers
- Has the same units as the measure of location
Std deviation for Dist. 1 is 3.25.
Std deviation for Dist. 2 is 1.28.
If the entire population is known, use n in the denominator!

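A small sketch may help tie the three measures of spread together and show the effect of the n versus n - 1 denominator. The helper names and the data are hypothetical; the second function uses the usual sum-of-squares shortcut as the "calculating" form, which is an assumption about what the slide intends.

```python
import math

def spread_summary(values, population=False):
    """Range, variance, and standard deviation of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    squared_devs = sum((x - mean) ** 2 for x in values)
    denominator = n if population else n - 1     # n only when the whole population is known
    variance = squared_devs / denominator
    return {
        "range": max(values) - min(values),
        "variance": variance,
        "std_dev": math.sqrt(variance),
    }

def sample_variance_shortcut(values):
    """Sample variance from the sums of x and x squared (calculating form)."""
    n = len(values)
    return (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)

# Hypothetical data (not the slide's distributions).
data = [15, 17, 18, 20, 20]
print(spread_summary(data))                    # sample statistics (n - 1)
print(spread_summary(data, population=True))   # population statistics (n)
print(sample_variance_shortcut(data))
```

Both forms give the same sample variance; the shortcut only avoids computing each deviation from the mean by hand.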
Shape: Prob. Density Functions
The shape of a distribution is a function that maps each potential x-value to the likelihood that it would appear if we sampled at random from the distribution. This is the probability density function (PDF).
Area Under the Normal Curve:
- μ ± 1σ: 68.26% of the total area
- μ ± 2σ: 95.46% of the total area
- μ ± 3σ: 99.73% of the total area

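The areas under the Normal curve can be checked numerically. The sketch below uses the identity that the area within k standard deviations of the mean equals erf(k/√2); the function name is hypothetical. The computed values should agree with the percentages listed for the Normal curve to within table rounding.

```python
import math

def area_within(k):
    """Area under a Normal PDF between mu - k*sigma and mu + k*sigma."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {area_within(k):.2%}")
```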
Shape: Stem-and-Leaf Plot
Divide each number into:
- Stem: one or more of the leading digits
- Leaf: the remaining digits (may be ordered)
Choose between 4 and 20 stems.
Example stems: 4| 5| 6| 7|

Shape: Box (and Whisker) Plot
Visual display of central tendency, variability, symmetry, and outliers.
[Figure: box plot labeled with max value, third quartile, median, mean, first quartile, and min value]

Shape: Histogram
A histogram is a vertical bar chart that takes the shape of the distribution of the data. The process for creating a histogram depends on the purpose for making the histogram.
- One purpose of a histogram is to see the shape of a distribution. To do this, we would like to have as much data as possible, and use a fine resolution.
- A second purpose of a histogram is to observe the frequency with which a class of problems occurs. The resolution is controlled by the number of problem classes.

Histogram Example (Excel)

Goals of Statistical Quality Improvement
- Find special causes
- Head off shifts in process
- Obtain predictable output
- Continually improve the process
[Figure: Statistical Quality Control and Improvement, process output plotted over time between LSL and USL. Labels: Improving Process Capability and Performance; Characterize Stable Process Capability; Head Off Shifts in Location, Spread; Identify Special Causes - Good (Incorporate); Identify Special Causes - Bad (Remove); Reduce Variability; Center the Process; Continually Improve the System]

Distributions
- Distributions quantify the probability of an event.
- Events near the mean are most likely to occur; events further away are less likely to be observed.
[Figure: Normal curve with axis labels 35.0 (μ), 39.2 (μ+σ), 34.8 (μ-σ), 41.4 (μ+2σ), 32.6 (μ-2σ), 43.6 (μ+3σ), 30.4 (μ-3σ)]

Normal Distribution
Notation: x ~ N(μ, σ) r.v.
This is read: “x is normally distributed with mean μ and standard deviation σ.”
Standard Normal Distribution: z ~ N(0, 1) r.v. (z represents a Standard Normal r.v.)

Simple Interpretation of Standard Deviation of Normal Distribution

Standard Normal Distribution
- The Standard Normal Distribution has a mean (μ) of 0 and a standard deviation (σ) of 1.
- The total area under the curve, Φ(z), from z = -∞ to z = +∞ is exactly 1.
- The curve is symmetric about the mean.
- Half of the total area lies on either side, so: Φ(-z) = 1 - Φ(z).

Standard Normal Distribution
How likely is it that we would observe a data point more than 2.57 standard deviations beyond the mean?
- The area under the curve from -∞ to z = 2.57 is the cumulative area Φ(2.57), found by looking up z = 2.57 in the Standard Normal table.
- Subtracting that cumulative area from 1 gives the area beyond z = 2.57.
- Answer: 1 - 0.99492 = 0.00508, or about 5 times in 1000.

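The table lookup for z = 2.57 can be reproduced with Python's standard library, which is a convenient way to check a hand calculation. This is a sketch, not a replacement for learning to read the table; the variable names are arbitrary.

```python
from statistics import NormalDist

z = 2.57
cumulative = NormalDist().cdf(z)   # area from -infinity up to z (the table entry)
upper_tail = 1 - cumulative        # area beyond z

print(f"cumulative area = {cumulative:.5f}")
print(f"area beyond z   = {upper_tail:.5f}")
```

`NormalDist()` with no arguments is the Standard Normal distribution (mean 0, standard deviation 1).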
What if the distribution isn't a Standard Normal Distribution?
If it is from any Normal Distribution, we can express the difference from an observation to the mean in units of the standard deviation, and this converts it to a Standard Normal Distribution.
Conversion formula is:
$$z = \frac{x - \mu}{\sigma}$$
where:
- x is the point in the interval,
- μ is the population mean, and
- σ is the population standard deviation.

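As a quick check on the conversion to z, the sketch below standardizes one observation and confirms that the cumulative area is the same whether it is computed on the original scale or on the z scale. The function name and the values of mu, sigma, and x are hypothetical.

```python
from statistics import NormalDist

def to_z(x, mu, sigma):
    """Distance from x to the mean, in units of the standard deviation."""
    return (x - mu) / sigma

# Hypothetical process values, for illustration only.
mu, sigma, x = 50.0, 4.0, 56.0

z = to_z(x, mu, sigma)
direct = NormalDist(mu, sigma).cdf(x)   # area to the left of x on the original scale
via_z = NormalDist().cdf(z)             # same area, read from the Standard Normal

print(z, direct, via_z)
```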
What if the distribution isn't even a Normal Distribution?
The Central Limit Theorem allows us to take the sum of several means, regardless of their distribution, and approximate this sum using the Normal Distribution if the number of observations is large enough.
- Most assemblies are the result of adding together components, so if we take the sum of the means for each component as an estimate for the entire assembly, we meet the CLT criteria.
- If we take the mean of a sample from a distribution, we meet the CLT criteria (think of how the mean is computed).

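A short simulation can make the Central Limit Theorem concrete. The sketch below draws samples from a Uniform(0, 1) population, which is clearly not Normal, and looks at how the sample means behave. The sample size, trial count, seed, and names are all arbitrary choices made for illustration.

```python
import random
import statistics

def sample_means(n, trials, rng):
    """Means of `trials` samples, each of size n, from a Uniform(0, 1) population."""
    return [statistics.fmean([rng.random() for _ in range(n)]) for _ in range(trials)]

rng = random.Random(720)                      # fixed seed so the run is repeatable
means = sample_means(n=30, trials=5000, rng=rng)

center = statistics.fmean(means)
spread = statistics.stdev(means)
predicted_spread = (1 / 12) ** 0.5 / 30 ** 0.5   # sigma / sqrt(n) for Uniform(0, 1)

# Fraction of sample means within one standard deviation of their center;
# for a Normal distribution this should be close to 68%.
within_one = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"center of the means: {center:.4f} (population mean is 0.5)")
print(f"spread of the means: {spread:.4f} (predicted {predicted_spread:.4f})")
print(f"fraction within one std dev: {within_one:.3f}")
```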
Example: Process Yield
Specifications are often set irrespective of the process distribution, but if we understand our process we can estimate yield / defects.
- Assume a specification calls for a value of 35.0 ± 2.5.
- Assume the process is Normally distributed, with a mean of 37.0 and a standard deviation of σ.
- Estimate the proportion of the process output that will meet specifications.

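One way to set up the yield estimate is sketched below: the yield is the area under the process distribution between the lower and upper specification limits. The target (35.0), tolerance (2.5), and mean (37.0) come from the example; the standard deviation of 2.2 is a hypothetical stand-in chosen only so the code runs, and the function name is likewise an assumption.

```python
from statistics import NormalDist

def process_yield(mu, sigma, target, tolerance):
    """Proportion of Normally distributed output falling inside target +/- tolerance."""
    process = NormalDist(mu, sigma)
    lsl, usl = target - tolerance, target + tolerance
    return process.cdf(usl) - process.cdf(lsl)

ASSUMED_SIGMA = 2.2   # hypothetical value, not taken from the example

off_center = process_yield(mu=37.0, sigma=ASSUMED_SIGMA, target=35.0, tolerance=2.5)
centered = process_yield(mu=35.0, sigma=ASSUMED_SIGMA, target=35.0, tolerance=2.5)

print(f"yield with mean 37.0: {off_center:.3f}")
print(f"yield if centered:    {centered:.3f}")
```

Comparing the two calls shows how much yield is lost simply because the process mean sits away from the specification target.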
Continuous & Discrete Distributions
- Continuous: probability of a range of outcomes is the area under the PDF (integration).
- Discrete: probability of a range of outcomes is the area under the PDF (sum the discrete outcomes).
[Figure: Normal curve with axis labels 35.0 (μ), 39.2 (μ+σ), 34.8 (μ-σ), 41.4 (μ+2σ), 32.6 (μ-2σ), 43.6 (μ+3σ), 30.4 (μ-3σ)]

Discrete Distribution Example
Sum of two six-sided dice:
- Outcomes range from 2 to 12.
- Count the possible ways to obtain each individual sum; this forms a histogram.
What is the most frequently occurring sum that you could roll?
- The most likely outcome is a sum of 7 (there are 6 ways to obtain it).
What is the probability of obtaining the most likely sum in a single roll of the dice?
- 6/36 = 0.167
What is the probability of obtaining a sum greater than 2 and less than 11?
- 32/36 = 0.889

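The dice counts can be verified by enumerating all 36 equally likely rolls. This is a minimal sketch; the variable names are arbitrary.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Tally the number of ways to roll each sum with two six-sided dice.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))
total = sum(ways.values())                       # 36 equally likely rolls

most_likely = max(ways, key=ways.get)            # the sum with the largest tally
p_most_likely = Fraction(ways[most_likely], total)

# Sums strictly greater than 2 and strictly less than 11, i.e. 3 through 10.
p_between = Fraction(sum(c for s, c in ways.items() if 2 < s < 11), total)

for s in sorted(ways):
    print(f"{s:2d} {'*' * ways[s]}")             # a text histogram of the tallies
print(most_likely, float(p_most_likely), float(p_between))
```

Using `Fraction` keeps the probabilities exact until the final conversion to a decimal.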
How do we know what the distribution is when all we have is a sample?
- Theory: “CLT applies to measurements taken consisting of many assemblies…”
- Experience: “past use of a distribution has generated very good results…”
- “Testing”: combination of the above … in this case, anyway!
  - If we know the generating function for a distribution, we can construct a grid (probability paper) that will allow us to observe a straight line when sufficient data from that distribution are plotted on the grid.
  - The easiest grid to create is the Standard Normal Distribution … because it is an easy transformation to “standard” parameters.

Normal Probability Plots
- Take the raw data and count the observations (n).
- Set up a column of j values (1 to n).
- Compute Φ(z_j) for each j value: Φ(z_j) = (j - 0.5)/n.
- Get the z_j value for each Φ(z_j) from the Standard Normal table: find the table entry Φ(z_j), then read the index value z_j.
- Set up a column of the observed data, sorted in increasing value.
- Plot the z_j values versus the sorted data values.
- Approximate with a sketched line through the 25% and 75% points.

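The plotting positions for a Normal probability plot can be computed without the table by using the inverse of the Standard Normal cumulative function. The sketch below only builds the (sorted value, z_j) pairs; the function name and the data are hypothetical, and the plotting itself is left to Excel or any charting tool.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Pair each sorted observation with its Standard Normal score z_j."""
    ordered = sorted(data)                       # observed data, increasing
    n = len(ordered)
    std_normal = NormalDist()
    points = []
    for j, x in enumerate(ordered, start=1):
        cumulative = (j - 0.5) / n               # cumulative area assigned to rank j
        z_j = std_normal.inv_cdf(cumulative)     # reverse table lookup
        points.append((x, z_j))
    return points

# Hypothetical measurements, for illustration only.
measurements = [12.1, 9.8, 10.4, 11.0, 10.7]
for x, z_j in normal_plot_points(measurements):
    print(f"{x:6.2f}  {z_j:+.3f}")
```

If the points fall close to a straight line when plotted, the Normality assumption is supported.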
Interpreting Normal Plots
Assess the Equal-Variance and Normality assumptions:
- Data from a Normal sample should tend to fall along the line, so if a “fat pencil” covers almost all of the points, then a normality assumption is supported.
- The slope of the line reflects the variance of the sample, so equal slopes support the equal variance assumption.
Theoretically:
- The sketched line should intercept the z_j = 0 axis at the mean value.
Practically:
- Close is good enough for comparing means.
- Closer is better for comparing variances.
- If the slopes differ much for two samples, use a test that assumes the variances are not the same.

Relationship with Hypothesis Tests
Assuming that our process is Normally Distributed and centered at the mean, how far apart should our specification limits be to obtain 99.5% yield?
- The proportion defective will be 1 - 0.995 = 0.005, and if the process is centered, half of those defectives will occur on the right tail (0.0025), and half on the left tail.
- To get 1 - 0.0025 = 99.75% yield before the right tail requires the upper specification limit to be set at μ + 2.81σ.
- By symmetry, the remaining 0.25% defective should occur at the left side, with the lower specification limit set at μ - 2.81σ.
- If we specify our process in this manner and make a lot of parts, we would only produce bad parts 0.5% of the time.

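The 2.81 multiplier can be recovered by running the table lookup in reverse: split the allowed defect proportion between the two tails, then ask for the z value with that cumulative area. The sketch below assumes a centered, Normally distributed process; the function name is hypothetical.

```python
from statistics import NormalDist

def symmetric_limit_z(target_yield):
    """z multiplier so that mu +/- z*sigma contains `target_yield` of a centered Normal process."""
    tail = (1 - target_yield) / 2            # defect proportion allowed in each tail
    return NormalDist().inv_cdf(1 - tail)    # z with cumulative area 1 - tail

z = symmetric_limit_z(0.995)
achieved = NormalDist().cdf(z) - NormalDist().cdf(-z)

print(f"limits at mu +/- {z:.2f} sigma, yield = {achieved:.4f}")
```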
Questions & Issues