Probability and Statistics for Engineers
  Descriptive Statistics
    Measures of Central Tendency
    Measures of Variability
  Probability Distributions
    Discrete
    Continuous
  Statistical Inference
  Design of Experiments
  Regression
JMB Chapter 1 EGR 252.001 Spring 2009

Descriptive Statistics
Numerical values that help to characterize the nature of data for the experimenter.
Example: The absolute error in the readings from a radar navigation system was measured with the following results:
    17 22 39 31 28 52 147
The sample mean, $\bar{x}$ = ?

Calculation of Mean
Example: The absolute error in the readings from a radar navigation system was measured with the following results:
    17 22 39 31 28 52 147
The sample mean: $\bar{x} = (17 + 22 + 39 + 31 + 28 + 52 + 147) / 7 = 48$

Calculation of Median
Example: The absolute error in the readings from a radar navigation system was measured with the following results:
    17 22 39 31 28 52 147
The sample median, $\tilde{x}$ = ?
Arrange in increasing order: 17 22 28 31 39 52 147
  n odd:  $\tilde{x} = x_{(n+1)/2}$; here n = 7, so $\tilde{x} = x_4 = 31$
  n even: $\tilde{x} = (x_{n/2} + x_{n/2+1}) / 2$
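
The mean and median rules above can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides; the function name sample_median is hypothetical, and the list simply repeats the radar readings from the example.

```python
# Radar navigation absolute errors from the example above.
errors = [17, 22, 39, 31, 28, 52, 147]


def sample_median(values):
    """Median by the slide's rule: middle value for odd n,
    average of the two middle values for even n."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # x_{(n+1)/2} in 1-based indexing
    return (ordered[mid - 1] + ordered[mid]) / 2  # (x_{n/2} + x_{n/2+1}) / 2


x_bar = sum(errors) / len(errors)   # sample mean
x_tilde = sample_median(errors)     # sample median

print("mean   =", x_bar)
print("median =", x_tilde)
```

The single large reading (147) pulls the mean up to 48, while the median stays at 31.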

Descriptive Statistics: Variability
A measure of variability
Recall the example: The absolute error in the readings from a radar navigation system was measured with the following results:
    17 22 39 31 28 52 147
Sample range: max - min = 147 - 17 = 130
  (a useful measure, but very susceptible to extreme values, and it says little about what happens in between)
Variance: measures the spread of the data around the mean (more on next page ...)
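
To see how sensitive the range is to one extreme value, the short sketch below computes it with and without the largest reading. Dropping the maximum is only an illustration chosen here, not a step from the slides, and sample_range is a hypothetical helper name.

```python
# Radar navigation absolute errors from the example above.
errors = [17, 22, 39, 31, 28, 52, 147]


def sample_range(values):
    """Range = max - min."""
    return max(values) - min(values)


full_range = sample_range(errors)

# Illustration only: remove the single largest reading and recompute.
largest = max(errors)
without_largest = [x for x in errors if x != largest]
trimmed_range = sample_range(without_largest)

print("range with all readings:    ", full_range)
print("range without the largest:  ", trimmed_range)
```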

Calculations: Variability of the Data
Sample variance: $s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1)$
Sample standard deviation: $s = \sqrt{s^2}$
Data: 17 22 39 31 28 52 147
  mean      48
  median    31
  variance  2037.3
  std dev   45.137
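
The tabulated variance and standard deviation can be reproduced as follows. This sketch assumes the usual sample variance with an n - 1 divisor, which is the divisor that yields the 2037.3 shown above; variable names are illustrative.

```python
import math

# Radar navigation absolute errors from the example above.
errors = [17, 22, 39, 31, 28, 52, 147]

n = len(errors)
x_bar = sum(errors) / n

# Sum of squared deviations from the mean.
sum_sq_dev = sum((x - x_bar) ** 2 for x in errors)

# Sample variance uses n - 1 in the denominator (assumption stated above).
s2 = sum_sq_dev / (n - 1)
s = math.sqrt(s2)

print("variance =", round(s2, 1))
print("std dev  =", round(s, 3))
```

Note that the standard deviation (about 45) is nearly as large as the mean itself, again reflecting the one extreme reading.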

Other Descriptors
Discrete vs. continuous
  discrete: countable
  continuous: measurable
Distribution of the data
  "What does it look like?"

Graphical Methods - Stem and Leaf
Stem and leaf plot for radar data

  Stem  Leaf  Frequency
     1  7     1
     2  2 8   2
     3  1 9   2
     4
     5  2     1
     6
     7
     8
     9
    10
    11
    12
    13
    14  7     1
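
A stem-and-leaf table like the one above can be generated programmatically. The sketch below assumes integer data with the tens as the stem and the units as the leaf; stem_and_leaf is a hypothetical helper, and it reports a frequency of 0 for empty stems where the slide leaves them blank.

```python
from collections import defaultdict

# Radar navigation absolute errors from the example above.
errors = [17, 22, 39, 31, 28, 52, 147]


def stem_and_leaf(values):
    """Return (stem, leaves, frequency) rows for integer data,
    using tens as the stem and units as the leaf (an assumption)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    rows = []
    # Include empty stems so gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row_leaves = leaves.get(stem, [])
        rows.append((stem, row_leaves, len(row_leaves)))
    return rows


rows = stem_and_leaf(errors)
print("Stem  Leaf   Frequency")
for stem, row_leaves, freq in rows:
    leaf_text = " ".join(str(leaf) for leaf in row_leaves)
    print(f"{stem:>4}  {leaf_text:<6} {freq}")
```

Keeping the empty stems 6 through 13 in the output makes the isolated reading at 147 stand out.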

Graphical Methods - Histogram
Frequency distribution (histogram)
  Develop equal-size class intervals ("bins")
    "Rules of thumb" for number of intervals:
      7-15 intervals per data set
      Square root of n
    Interval width = range / number of intervals
  Build table
    Identify interval or bin starting at low point
    Determine frequency of occurrence in each bin
    Calculate relative frequency
  Build graph
    Plot frequency vs. interval midpoint
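
The square-root rule and the interval-width formula can be sketched as below. The helper name bin_plan is hypothetical, rounding the square root to the nearest integer is an assumption (the slide does not say how to round), and the inputs are those of the stride-length example in this deck (25 readings between 25.7 and 31.4 inches).

```python
import math


def bin_plan(n, low, high):
    """Number of intervals by the square-root rule and the resulting width.

    Rounding sqrt(n) to the nearest integer is an assumption of this sketch.
    """
    k = max(1, round(math.sqrt(n)))
    width = (high - low) / k  # interval width = range / number of intervals
    return k, width


k, width = bin_plan(n=25, low=25.7, high=31.4)
print("intervals:", k)
print("width:    ", round(width, 2))
```

This gives a starting point only; the bin edges are often adjusted afterward to convenient values that cover all of the data.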

Data for Histogram
Example: stride lengths (in inches) of 25 male students were determined, with the following results:

  Stride Length
  28.60 26.50 30.00 27.10 27.80
  26.10 29.70 27.30 28.50 29.30
  28.60 28.60 26.80 27.00 27.30
  26.60 29.50 27.00 27.30 28.00
  29.00 27.30 25.70 28.80 31.40

What can we learn about the distribution of stride lengths for this sample?

Constructing a Histogram
Determining frequencies and relative frequencies

  Lower   Upper   Midpoint  Frequency  Relative Frequency
  24.85   26.20   25.525     2         0.08
  26.20   27.55   26.875    10         0.40
  27.55   28.90   28.225     7         0.28
  28.90   30.25   29.575     5         0.20
  30.25   31.60   30.925     1         0.04

Class intervals: divide (max - min) by 5 (the square root of 25), then either add that number to successive intervals OR let Excel find the bins. (Caution: Excel determines its bins differently; you may want to experiment a little to get a good picture of the data.)
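
The frequency table can be rebuilt from the raw data. The sketch below takes the bin edges directly from the table and uses the 25 stride lengths listed with the dot diagram in this deck; frequency_table is a hypothetical helper, and treating each interval as closed on the left and open on the right is an assumption that does not matter here, since no reading falls exactly on an edge.

```python
# The 25 stride lengths (inches) from the example data.
strides = [
    28.6, 26.5, 30.0, 27.1, 27.8,
    26.1, 29.7, 27.3, 28.5, 29.3,
    28.6, 28.6, 26.8, 27.0, 27.3,
    26.6, 29.5, 27.0, 27.3, 28.0,
    29.0, 27.3, 25.7, 28.8, 31.4,
]

# Bin edges copied from the table above.
edges = [24.85, 26.20, 27.55, 28.90, 30.25, 31.60]


def frequency_table(values, edges):
    """Rows of (lower, upper, midpoint, frequency, relative frequency).

    Intervals are taken as [lower, upper), an assumption of this sketch.
    """
    n = len(values)
    rows = []
    for lower, upper in zip(edges, edges[1:]):
        count = sum(1 for v in values if lower <= v < upper)
        rows.append((lower, upper, (lower + upper) / 2, count, count / n))
    return rows


table = frequency_table(strides, edges)
print("Lower  Upper  Midpoint  Freq  RelFreq")
for lower, upper, mid, count, rel in table:
    print(f"{lower:5.2f}  {upper:5.2f}  {mid:7.3f}  {count:4d}  {rel:7.2f}")
```

Plotting the frequency (or relative frequency) column against the midpoint column gives the histogram.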

Computer-Generated Histograms
There is no single convention for the x-axis.

Relative Frequency Graph
The preferred method is to use cell midpoints on the x-axis.

Graphical Methods - Dot Diagram
Dot diagram (text); Dotplot (Minitab)
General rule: one dot for each data point

  28.6 26.5 30.0 27.1 27.8
  26.1 29.7 27.3 28.5 29.3
  28.6 28.6 26.8 27.0 27.3
  26.6 29.5 27.0 27.3 28.0
  29.0 27.3 25.7 28.8 31.4
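
A rough text version of a dot diagram can be produced by counting repeated values. This is only a sketch: it prints one row per distinct value with one "o" per data point, rather than stacking dots above a horizontal axis as Minitab or the textbook figure would, and dot_plot is a hypothetical helper name.

```python
from collections import Counter

# The 25 stride lengths (inches) listed above.
strides = [
    28.6, 26.5, 30.0, 27.1, 27.8,
    26.1, 29.7, 27.3, 28.5, 29.3,
    28.6, 28.6, 26.8, 27.0, 27.3,
    26.6, 29.5, 27.0, 27.3, 28.0,
    29.0, 27.3, 25.7, 28.8, 31.4,
]


def dot_plot(values):
    """One text line per distinct value, with one 'o' per data point."""
    counts = Counter(values)
    return [f"{value:5.1f} " + "o" * counts[value] for value in sorted(counts)]


lines = dot_plot(strides)
for line in lines:
    print(line)
```

Repeated values such as 27.3 (four dots) and 28.6 (three dots) show up as longer rows.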