Basic Statistics
Six Sigma Foundations Continuous Improvement Training
Six Sigma Simplicity
Key Learning Points
Simple statistics can:
Increase your understanding of process behavior
Help identify improvement opportunities for 6σ
Statistics
Common statistics: miles per gallon or liter (mpg, mpl), median home prices, the consumer price index, the inflation rate, the stock market average, and the airline on-time arrival rate.
Statistics are computed using data. Statistics summarize the data and help us to predict future performance.
Basic Statistics
Basic statistics:
Serve as a means to analyze data collected in the Measure phase.
Allow us to numerically describe the data that characterizes our process's Xs and Ys.
Use past process and performance data to make inferences about the future.
Serve as a foundation for advanced statistical problem-solving methodologies.
Create a universal language based on numerical facts rather than intuition.
Data Visualization
Before any statistical tools are applied, visually display and look at your data. A histogram allows us to look at how the data is distributed across our Y scale of measure.
[Figure: histogram of Number of Wins for National Football League Teams (1998). X-axis: Number of Games Won. Y-axis: Number of Teams. Annotation: five teams won eight games. Source: AOLSports]
Building a Histogram
The following data came from our bicycle test facility: the stopping distances required to bring a 150 lb weight to a complete stop with the rear brake applied from a 10 mph cruising speed.
[Table and histogram: stopping distance in feet on the X-axis, frequency on the Y-axis]
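Building a histogram comes down to counting how often each value occurs and drawing one bar per value. The sketch below shows that counting step in Python. The stopping distances in it are hypothetical values chosen for illustration, not the test-facility data, and the function names are our own.

```python
from collections import Counter

# Hypothetical stopping distances in feet (illustrative values only;
# these are NOT the original bicycle test-facility measurements).
distances = [8, 9, 9, 10, 10, 10, 10, 11, 11, 12, 13, 7, 10, 9, 11]


def build_histogram(values):
    """Map each distinct value (X-axis) to its frequency (Y-axis), sorted by value."""
    counts = Counter(values)
    return dict(sorted(counts.items()))


def render_histogram(freq):
    """Return a text histogram: one row per value, one '#' per occurrence."""
    return "\n".join(
        f"{value:>3} ft | {'#' * count}" for value, count in freq.items()
    )


histogram = build_histogram(distances)
print(render_histogram(histogram))
```

Each row of the printed output corresponds to one bar: the value on the left is the X-axis position and the number of `#` marks is the frequency.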
In addition to counting occurrences and graphing the results, we can describe processes in terms of central tendency and dispersion.
Measures of Central Tendency
Mean (X bar): the arithmetic average of a set of values. It uses the quantitative value of each data point and is strongly influenced by extreme values.
Median (M): the number that reflects the middle of a set of values. It is the 50th percentile, is identified as the middle number after all the values are sorted from high to low, and is largely unaffected by extreme values.
Mode: the most frequently occurring value in a data set.
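The three measures can be computed with Python's standard `statistics` module. This is a minimal sketch on hypothetical stopping distances (not the test-facility data); it also adds one extreme value to show how the mean moves while the median stays put.

```python
import statistics

# Hypothetical stopping distances in feet (illustrative values only).
distances = [7, 8, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 12, 13]

mean = statistics.mean(distances)      # arithmetic average
median = statistics.median(distances)  # middle value of the sorted data
mode = statistics.mode(distances)      # most frequently occurring value

print(f"mean={mean}, median={median}, mode={mode}")

# Add a single extreme value (hypothetical) and recompute.
with_outlier = distances + [40]
mean_with_outlier = statistics.mean(with_outlier)
median_with_outlier = statistics.median(with_outlier)

print(f"with outlier: mean={mean_with_outlier}, median={median_with_outlier}")
```

With these values, the single reading of 40 ft pulls the mean upward while the median is unchanged, which is why the median is often preferred when extreme values are present.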
Central Tendency Exercise
Determine the mean, median, and mode for the bicycle stopping distances used to create the histograms.
Mean = ________ Median = ________ Mode = ________
Mean, Median, Mode
[Figure: three frequency curves titled Positive Skew, Negative Skew, and Normal, each marked with the positions of the mean, median, and mode]
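The effect of skew on the three measures can be checked numerically. The sketch below uses two small hypothetical data sets, one with a long right tail and its mirror image with a long left tail; the values are invented for illustration.

```python
import statistics

# Hypothetical data with a long right tail (positive skew).
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

# Mirror image of the same data: long left tail (negative skew).
negative_skew = [15 - x for x in positive_skew]


def summarize(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )


pos_mean, pos_median, pos_mode = summarize(positive_skew)
neg_mean, neg_median, neg_mode = summarize(negative_skew)

print(f"positive skew: mode={pos_mode} < median={pos_median} < mean={pos_mean}")
print(f"negative skew: mean={neg_mean} < median={neg_median} < mode={neg_mode}")
```

In these two examples the mean is pulled furthest toward the long tail, the mode stays at the peak, and the median falls between them.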
Measures of Dispersion
Range (R): the difference between the highest and lowest values.
Sample variance (s^2): the average squared distance of each point from the average (X bar), computed with n - 1 in the denominator.
Sample standard deviation (s): the square root of the variance.
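The definitions above translate directly into a few lines of arithmetic. The sketch below works through them step by step on hypothetical stopping distances (not the test-facility data) and cross-checks the result against Python's `statistics` module. It assumes the usual sample convention of dividing by n - 1.

```python
import math
import statistics

# Hypothetical stopping distances in feet (illustrative values only).
distances = [7, 8, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 12, 13]

n = len(distances)
x_bar = sum(distances) / n

# Range: highest value minus lowest value.
data_range = max(distances) - min(distances)

# Sample variance: sum of squared distances from X bar, divided by n - 1.
sum_of_squares = sum((x - x_bar) ** 2 for x in distances)
sample_variance = sum_of_squares / (n - 1)

# Sample standard deviation: square root of the variance.
sample_std_dev = math.sqrt(sample_variance)

print(f"R = {data_range}, s^2 = {sample_variance:.3f}, s = {sample_std_dev:.3f}")

# Cross-check against the standard library, which uses the same n - 1 convention.
assert math.isclose(sample_variance, statistics.variance(distances))
assert math.isclose(sample_std_dev, statistics.stdev(distances))
```

Because the standard deviation is in the same units as the data (feet here), it is usually easier to interpret than the variance.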
Example of Measures of Dispersion
[Figure: frequency histogram of Number of Wins for National Football League Teams (1998). Source: AOLSports]
Range = 12; X bar = 8; s^2 = (value not preserved); s = 3.42
Dispersion Exercise
Find measures of dispersion for the stopping distance data. Fill in the table at the right.
Range (R) = ________ Variance (s^2) = ________ Std Dev (s) = ________
Population vs. Sample (Certainty vs. Uncertainty)
A sample is just a subset of all possible values. Since the sample does not contain all the possible values, there is some uncertainty about the population. Hence any statistics, such as the mean and standard deviation, are just estimates of the true population parameters.
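A small simulation makes the uncertainty concrete. The sketch below builds a hypothetical population of 10,000 values, draws a sample of 30 from it, and compares the sample statistics with the population parameters. The population shape, sizes, and seed are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values centered near 10 with spread near 2.
population = [rng.gauss(10, 2) for _ in range(10_000)]

# A sample is a subset of the population, drawn without replacement.
sample = rng.sample(population, 30)

# Population parameters (known here only because we built the population).
population_mean = statistics.fmean(population)
population_sd = statistics.pstdev(population)  # divides by N

# Sample statistics: estimates of the parameters above.
sample_mean = statistics.fmean(sample)
sample_sd = statistics.stdev(sample)  # divides by n - 1

print(f"population: mean={population_mean:.3f}, sd={population_sd:.3f}")
print(f"sample:     mean={sample_mean:.3f}, sd={sample_sd:.3f}")
```

Changing the seed or the sample size gives different sample statistics each time, while the population parameters stay the same; larger samples tend to land closer to them.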
Symbols
Mean: X bar for the sample (n = number of samples); μ for the population.
Standard deviation: s (little "s") for the sample; σ for the population.
The Normal Curve
In 80 to 90% of problems worked, the data will follow a normal bell curve or can be transformed to look like a normal curve. This curve is described by the statistics X bar and s. The area under this curve is 1, or 100%.
For the normal curve, mean = median = mode.
Normal Bell Curve Properties
Histograms (bar charts) are developed from samples. Sample statistics (X bar and s) are calculated from representatives of the population. From the histogram and sample statistics, we form a curve that represents the population from which these samples were drawn.
68.27% of the data falls within 1 standard deviation from the mean (1sd).
99.73% of the data falls within 3 standard deviations from the mean (3sd).
Nearly all of the data falls within 6 standard deviations from the mean (6sd).
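These coverage figures can be reproduced from the normal curve itself. For a normal distribution, the fraction of data within k standard deviations of the mean is erf(k / sqrt(2)). The sketch below evaluates that with Python's `math.erf`; the function name is our own, and the calculation assumes perfectly normal data.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3, 6):
    print(f"within {k} sd of the mean: {fraction_within(k) * 100:.7f}%")
```

Rounded to two decimals, the 1sd and 3sd results are 68.27% and 99.73%. Real process data is only approximately normal, so these percentages are a guide rather than a guarantee.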
Other Data Distributions
[Figure: frequency curves for four distributions: Normal, Uniform, Exponential, and Log Normal]
Normal Curve Exercise
Here is a histogram of the bike stopping distance data (X bar = 10, s = 2).
Does the histogram appear normal?
Draw vertical lines at 1sd, 2sd, and 4sd.
Discuss.
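The positions of the vertical lines follow from X bar and s: each line sits at X bar plus or minus k times s. A minimal sketch of that arithmetic, using the X bar = 10 and s = 2 given in the exercise (the variable names are our own):

```python
x_bar = 10  # sample mean given in the exercise, in feet
s = 2       # sample standard deviation given in the exercise, in feet

# For each multiple k, the lines sit at X bar - k*s and X bar + k*s.
sd_lines = {k: (x_bar - k * s, x_bar + k * s) for k in (1, 2, 4)}

for k, (lower, upper) in sd_lines.items():
    print(f"{k}sd: {lower} ft and {upper} ft")
```

Comparing how much of the histogram falls between each pair of lines with the percentages expected for a normal curve is one way to judge whether the data appears normal.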