1 Statistical Analysis - Graphical Techniques Dr. Jerrell T. Stracener, SAE Fellow Leadership in Engineering EMIS 7370/5370 STAT 5340 : PROBABILITY AND.

Presentation transcript:

1 Statistical Analysis - Graphical Techniques Dr. Jerrell T. Stracener, SAE Fellow Leadership in Engineering EMIS 7370/5370 STAT 5340 : PROBABILITY AND STATISTICS FOR SCIENTISTS AND ENGINEERS Systems Engineering Program Department of Engineering Management, Information and Systems Stracener_EMIS 7370/STAT 5340_Sum 08_

- Time Series Graph or Run Chart
- Box Plot
- Histogram and Relative Frequency Histogram
- Frequency Distribution
- Probability Plotting

Time Series Graph or Run Chart
A plot of the data set x_1, x_2, …, x_n in the order in which the data were obtained. Used to detect trends or patterns in the data over time.

Box Plot
A pictorial summary used to describe the most prominent statistical features of the data set x_1, x_2, …, x_n, including its:
- Center or location
- Spread or variability
- Extent and nature of any deviation from symmetry
- Identification of 'outliers'

Box Plot
Shows only certain statistics rather than all the data, namely:
- median
- quartiles
- smallest and greatest values in the sample
The immediate visuals of a box plot are the center, the spread, and the overall range of the data.

Box Plot
Given the following random sample of size 25:
38, 10, 60, 90, 88, 96, 1, 41, 86, 14, 25, 5, 16, 22, 29, 34, 55, 36, 37, 36, 91, 47, 43, 30, 98
Arranged in order from least to greatest:
1, 5, 10, 14, 16, 22, 25, 29, 30, 34, 36, 36, 37, 38, 41, 43, 47, 55, 60, 86, 88, 90, 91, 96, 98

Box Plot
First, find the median, the value exactly in the middle of an ordered set of numbers. The median is 37.
Next, we consider only the values to the left of the median:
1, 5, 10, 14, 16, 22, 25, 29, 30, 34, 36, 36
We now find the median of this set of numbers. The median for this group is (22 + 25)/2 = 23.5, which is the lower quartile.

Box Plot
Now consider the values to the right of the median:
38, 41, 43, 47, 55, 60, 86, 88, 90, 91, 96, 98
The median for this set is (60 + 86)/2 = 73, which is the upper quartile.
We are now ready to find the interquartile range (IQR), which is the difference between the upper and lower quartiles: 73 - 23.5 = 49.5 is the interquartile range.

Box Plot
- The lower quartile is 23.5
- The median is 37
- The upper quartile is 73
- The interquartile range is 49.5
- The mean is 45.1
[Figure: box plot of the sample, marking the lower extreme, lower quartile, median, mean, upper quartile, and upper extreme]
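The box plot statistics above can be reproduced with a few lines of Python. This is a minimal sketch that follows the median-of-halves rule used in the slides (the median itself is excluded from both halves when n is odd); the function name `box_plot_summary` is an illustrative choice, not something from the original deck. Other quartile conventions, such as the interpolation defaults in spreadsheet or numerical packages, can give slightly different quartiles.

```python
from statistics import mean, median

def box_plot_summary(data):
    """Return the statistics a box plot displays (median-of-halves quartiles)."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]           # values to the left of the median
    upper_half = ordered[half + n % 2:]   # values to the right of the median
    q1 = median(lower_half)
    q3 = median(upper_half)
    return {
        "lower_extreme": ordered[0],
        "lower_quartile": q1,
        "median": median(ordered),
        "upper_quartile": q3,
        "upper_extreme": ordered[-1],
        "iqr": q3 - q1,
        "mean": mean(ordered),
    }

sample = [38, 10, 60, 90, 88, 96, 1, 41, 86, 14, 25, 5, 16,
          22, 29, 34, 55, 36, 37, 36, 91, 47, 43, 30, 98]
summary = box_plot_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For the sample of 25 values this gives a median of 37, quartiles of 23.5 and 73, and an interquartile range of 49.5, matching the slide.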

Histogram
A graph of the observed frequencies in the data set x_1, x_2, …, x_n versus data magnitude, to visually indicate its statistical properties, including:
- shape
- location or central tendency
- scatter or variability

Guidelines for Constructing Histograms – Discrete Data
If the data x_1, x_2, …, x_n are from a discrete random variable with possible values y_1, y_2, …, y_k, count the number of occurrences of each value of y and associate the frequency f_i with y_i, for i = 1, …, k. Note that $\sum_{i=1}^{k} f_i = n$.

Guidelines for Constructing Histograms – Continuous Data
If the data x_1, x_2, …, x_n are from a continuous random variable:
- select the number of intervals or cells, r, to be a number between 3 and 20; as an initial value use r = n^(1/2), where n is the number of observations
- establish r intervals of equal width, starting just below the smallest value of x
- count the number of values of x within each interval to obtain the frequency associated with each interval
- construct the graph by plotting (f_i, i) for i = 1, 2, …, r
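The guidelines for continuous data can be sketched as follows. The helper name `frequency_table` and the `margin` parameter (how far "just below the smallest value" the first interval starts) are assumptions made for illustration; the data are the 25 values from the box plot example, since the battery data are not reproduced in this transcript.

```python
import math

def frequency_table(data, r=None, margin=0.5):
    """Build r equal-width intervals and count the observations in each."""
    n = len(data)
    if r is None:
        # Initial choice r = sqrt(n), kept between 3 and 20 as the guideline suggests.
        r = min(20, max(3, round(math.sqrt(n))))
    start = min(data) - margin           # assumed offset below the smallest value
    width = (max(data) - start) / r
    counts = [0] * r
    for x in data:
        index = min(int((x - start) / width), r - 1)  # largest value goes in the last cell
        counts[index] += 1
    return [
        (start + i * width, start + (i + 1) * width, f, f / n)
        for i, f in enumerate(counts)
    ]

sample = [38, 10, 60, 90, 88, 96, 1, 41, 86, 14, 25, 5, 16,
          22, 29, 34, 55, 36, 37, 36, 91, 47, 43, 30, 98]
table = frequency_table(sample)
for left, right, freq, rel_freq in table:
    print(f"[{left:5.1f}, {right:5.1f})  f = {freq:2d}  relative f = {rel_freq:.2f}")
```

The relative frequencies (f divided by n) are what a relative frequency histogram plots; they sum to 1 by construction.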

Histogram and Relative Frequency Example
To illustrate the construction of a relative frequency distribution, consider the following data, which represent the lives of 40 car batteries of a given type, recorded to the nearest tenth of a year. The batteries were guaranteed to last 3 years.

Histogram and Relative Frequency Example
For this example, using the guidelines for constructing a histogram, the number of classes selected is 7, with a class width of 0.5. The frequency and relative frequency distribution for the data are shown in the following table.

Histogram and Relative Frequency
The following diagram is a relative frequency histogram of the battery lives, with an approximate estimate of the probability density function superimposed.

Probability Plotting
Data are plotted on special graph paper designed for a particular distribution:
- Normal
- Weibull
- Lognormal
- Exponential
If the assumed model is adequate, the plotted points will tend to fall in a straight line. If the model is inadequate, the plot will not be linear, and the type and extent of the departures can be seen. Once a model appears to fit the data reasonably well, percentiles and parameters can be estimated from the plot.
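Probability paper works by rescaling the axes so that the assumed distribution's cumulative function becomes a straight line. The sketch below shows the standard axis transformations for the four distributions listed; the function name `plotting_coordinates` and its string-valued `model` argument are illustrative assumptions, not part of the original slides.

```python
import math
from statistics import NormalDist

def plotting_coordinates(t, F, model):
    """Map an observation t and its cumulative probability F (0 < F < 1)
    to (x, y) axes on which the named model plots as a straight line."""
    if model == "normal":
        return t, NormalDist().inv_cdf(F)            # y is the standard normal quantile
    if model == "lognormal":
        return math.log(t), NormalDist().inv_cdf(F)  # same y, logarithmic x
    if model == "exponential":
        return t, -math.log(1.0 - F)                 # y = t / mean for an exponential model
    if model == "weibull":
        return math.log(t), math.log(-math.log(1.0 - F))
    raise ValueError(f"unknown model: {model}")

# An exponential model with mean 4 gives F(2) = 1 - exp(-2/4), which maps to y = 0.5.
x, y = plotting_coordinates(2.0, 1.0 - math.exp(-0.5), "exponential")
print(x, y)
```

Plotting the transformed points on ordinary linear axes is equivalent to plotting the raw values on the corresponding probability paper.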

Probability Plotting Procedure
Step 1: Obtain special graph paper, known as probability paper, designed for the distribution under examination. Weibull, Lognormal and Normal paper are available at:
Step 2: Rank the sample values from smallest to largest in magnitude, i.e., X_1 ≤ X_2 ≤ … ≤ X_n.

Probability Plotting General Procedure
Step 3: Plot the X_i's on the paper versus the corresponding cumulative plotting positions, expressed either as a percentage or as a proportion, depending on whether the marked axis on the paper refers to the % or the proportion of observations. The axis of the graph paper on which the X_i's are plotted will be referred to as the observational scale, and the axis for the plotting positions as the cumulative scale.
Step 4: If a straight line appears to fit the data, draw a line on the graph 'by eye'.
Step 5: Estimate the model parameters from the graph.

Weibull Probability Plotting Paper
If the cumulative probability distribution function, with shape parameter β and scale parameter η, is
$$F(T) = 1 - e^{-(T/\eta)^{\beta}}$$
we now need to linearize this function into the form y = ax + b.

Weibull Probability Plotting Paper
Then
$$\ln\left[1 - F(T)\right] = -\left(\frac{T}{\eta}\right)^{\beta}$$
$$\ln\left[\ln\frac{1}{1 - F(T)}\right] = \beta \ln T - \beta \ln \eta$$
which is the equation of a straight line of the form y = ax + b,

Weibull Probability Plotting Paper
where
$$y = \ln\left[\ln\frac{1}{1 - F(T)}\right]$$
and
$$x = \ln T$$

Weibull Probability Plotting Paper
so that
$$y = \beta x - \beta \ln \eta$$
which is a linear equation with a slope of β and an intercept of $-\beta \ln \eta$. Now the x- and y-axes of the Weibull probability plotting paper can be constructed. The x-axis is simply logarithmic, since x = ln(T), and the y-axis is scaled according to
$$y = \ln\left[\ln\frac{1}{1 - F(T)}\right]$$
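A quick numerical check of the linearization: if points are generated from an exact two-parameter Weibull distribution and transformed with x = ln(T) and y = ln(ln(1/(1 - F(T)))), the slope between any two transformed points should equal the shape parameter. The parameter symbols (shape β, scale η), the values β = 2 and η = 100, and the chosen times are all hypothetical, selected only for illustration.

```python
import math

def weibull_cdf(t, beta, eta):
    """Two-parameter Weibull CDF with shape beta and scale eta."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def weibull_paper_xy(t, F):
    """Coordinates on which a Weibull CDF becomes a straight line."""
    return math.log(t), math.log(math.log(1.0 / (1.0 - F)))

beta, eta = 2.0, 100.0               # hypothetical parameters
times = [20.0, 50.0, 100.0, 200.0]   # hypothetical observation times
points = [weibull_paper_xy(t, weibull_cdf(t, beta, eta)) for t in times]

# Slope between consecutive points, and the intercept implied by each point.
slopes = [(y2 - y1) / (x2 - x1)
          for (x1, y1), (x2, y2) in zip(points, points[1:])]
intercepts = [y - beta * x for x, y in points]

print("slopes:", slopes)
print("intercepts:", intercepts)
print("-beta * ln(eta):", -beta * math.log(eta))
```

Every slope agrees with β and every intercept with -β ln(η) up to floating-point rounding, which is why a Weibull sample plots as a straight line on this paper.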

Weibull Probability Plotting Paper
[Figure: Weibull probability plotting paper; axes: x and cumulative probability (in %)]

Probability Plotting - Example
To illustrate the process, let 10, 20, 30, 40, 50, and 80 be a random sample of size n = 6.

Probability Plotting - Example
We need an estimate of the cumulative probability F(x) corresponding to each of the sample values in order to plot the data on the Weibull probability paper. These estimates are obtained with what are called median ranks.

Probability Plotting - Example
Median ranks represent the 50% confidence level ("best guess") estimate for the true value of F(t), based on the total sample size and the order number (first, second, etc.) of the data.

Probability Plotting - Example
There is an approximation that can be used to estimate median ranks, called Benard's approximation. It has the form:
$$\hat{F}(x_i) = \frac{i - 0.3}{n + 0.4}$$
where n is the sample size and i is the sample order number. Tables of median ranks can be found in many statistics and reliability texts.

Probability Plotting - Example
Based on Benard's approximation, we can now calculate $\hat{F}(x)$ for each observed value of X. These are shown in the following table:
For example, for x_2 = 20,
$$\hat{F}(20) = \frac{2 - 0.3}{6 + 0.4} \approx 0.266$$
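The median rank calculation for the whole sample can be sketched in a few lines, using Benard's approximation (i - 0.3)/(n + 0.4). The function name `benard_median_rank` is an illustrative choice; the rounding shown when printing is also a choice made here and may differ from the table in the original slides.

```python
def benard_median_rank(i, n):
    """Benard's approximation to the median rank of the i-th of n ordered values."""
    return (i - 0.3) / (n + 0.4)

sample = [10, 20, 30, 40, 50, 80]
n = len(sample)
ranks = [(x, benard_median_rank(i, n))
         for i, x in enumerate(sorted(sample), start=1)]

for x, rank in ranks:
    print(f"x = {x:3d}   estimated F(x) = {rank:.4f}   ({100 * rank:.1f}%)")
```

These pairs are the points plotted on the Weibull probability paper: x on the observational scale and the estimated F(x), in percent, on the cumulative scale.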

Weibull Probability Plotting Paper
[Figure: Weibull probability plotting paper; axes: x and cumulative probability (in %)]

Probability Plotting - Example
Now that we have y-coordinate values to go with the x-coordinate sample values, we can plot the points on Weibull probability paper.
[Figure: sample points plotted on Weibull probability paper; axes: x and $\hat{F}(x)$ (in %)]

Probability Plotting - Example
The line represents the estimated relationship between x and F(x).
[Figure: fitted line through the sample points on Weibull probability paper; axes: x and $\hat{F}(x)$ (in %)]

Probability Plotting - Example
In this example, the points on Weibull probability paper fall in a fairly linear fashion, indicating that the Weibull distribution provides a good fit to the data. If the points did not seem to follow a straight line, we might want to consider using another probability distribution to analyze the data.
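The slides draw the line 'by eye'. As a numerical stand-in, the sketch below fits the line by ordinary least squares (regressing y on x) in the linearized Weibull coordinates and reads the parameters off the slope and intercept. This is a swapped-in technique, not the method of the original deck; the notation (shape β, scale η) and the function names are assumptions.

```python
import math

def benard_median_rank(i, n):
    """Benard's approximation to the median rank of the i-th of n ordered values."""
    return (i - 0.3) / (n + 0.4)

def fit_weibull_by_least_squares(sample):
    """Fit y = a*x + b with x = ln(t), y = ln(-ln(1 - F)); return (beta, eta)."""
    ordered = sorted(sample)
    n = len(ordered)
    xs = [math.log(t) for t in ordered]
    ys = [math.log(-math.log(1.0 - benard_median_rank(i, n)))
          for i in range(1, n + 1)]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    beta = slope                          # slope of the line is the shape parameter
    eta = math.exp(-intercept / slope)    # intercept equals -beta * ln(eta)
    return beta, eta

beta_hat, eta_hat = fit_weibull_by_least_squares([10, 20, 30, 40, 50, 80])
print(f"estimated shape: {beta_hat:.2f}, estimated scale: {eta_hat:.1f}")
```

A line drawn by eye on the paper would give similar but not identical estimates, since the least-squares line weights every point equally in the transformed coordinates.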


[Figure: Probability Paper - Normal]

[Figure: Probability Paper - Lognormal]

[Figure: Probability Paper - Exponential]

Example - Probability Plotting
Given the following random sample of size n = 8, which probability distribution provides the best fit?

40 Specimens
40 specimens are cut from a plate for tensile tests. The tensile tests were made, resulting in tensile strength, x, as follows:
Perform a statistical analysis of the tensile strength data.

40 Specimens
Time series plot: by visual inspection of the scatter plot, there seems to be no trend.

40 Specimens
Using the descriptive statistics function in Excel, the following were calculated:

40 Specimens
Using the histogram feature of Excel, the frequency data were calculated and graphed. From looking at the histogram and the normal probability plot, we see that the tensile strength can be estimated by a normal distribution.

40 Specimens: Box Plot
[Figure: box plot of the tensile strength data, marking the lower extreme, lower quartile, median, mean (52.6), upper quartile (55.3), and upper extreme, together with the interquartile range]


40 Specimens
The tensile strength distribution can be estimated by
[Figure: estimated density $\hat{f}(x)$ and estimated cumulative distribution $\hat{F}(x)$]