Statistics. Principles of Engineering. © 2012 Project Lead The Way, Inc.
Statistics: The collection, evaluation, and interpretation of data.
Statistics. Descriptive Statistics: Describe collected data. Inferential Statistics: Generalize and evaluate a population based on sample data.
Data. Categorical or Qualitative Data: Values that possess names or labels (color of M&M’s, breed of dog, etc.). Numerical or Quantitative Data: Values that represent a measurable quantity (population, number of M&M’s, number of defective parts, etc.).
Data Collection. Sampling: Random, Systematic, Stratified, Cluster, Convenience.
Graphic Data Representation. Histogram and Frequency Polygons: Frequency distribution graphs. Bar Chart: Categorical data graph. Pie Chart: Categorical data graph (%).
Measures of Central Tendency. Mean: Arithmetic average. Sum of all data values divided by the number of data values within the array. Most frequently used measure of central tendency. Strongly influenced by outliers (very large or very small values).
Measures of Central Tendency. Determine the mean value of 48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55. Mean = 524 / 11 ≈ 47.64.
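The mean calculation for this data array can be sketched in a few lines of Python. This is only an illustration of "sum of all data values divided by the number of data values"; the variable names are chosen here and are not part of the original slides.

```python
# Data array from the slide
data = [48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55]

# Mean = sum of all data values / number of data values
total = sum(data)
count = len(data)
mean = total / count

print(total, count)    # 524 11
print(round(mean, 2))  # 47.64
```

Note how the two small values (2 and 5) pull the mean well below most of the data, which is the outlier sensitivity described for the mean.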
Measures of Central Tendency. Median: Data value that divides a data array into two equal groups. Data values must be ordered from lowest to highest. Useful in situations with skewed data and outliers (e.g., wealth management).
Measures of Central Tendency. Determine the median value of 48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55. Organize the data array from lowest to highest value: 2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63. Select the data value that splits the data set evenly: Median = 58. What if the data array had an even number of values? For 5, 48, 49, 55, 58, 59, 60, 62, 63, 63, no single value splits the set evenly, so the median is the average of the two middle values: (58 + 59) / 2 = 58.5.
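The median procedure (order the values, then pick the middle one) can be sketched as follows. The function name `median` is chosen here for illustration. For an even number of values it averages the two middle values, which is the usual convention and is assumed here.

```python
def median(values):
    """Return the middle value of the ordered data (average of the two middle values if even)."""
    ordered = sorted(values)      # data must be ordered from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # odd count: a single middle value exists
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle values

odd_data = [48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55]
even_data = [5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

print(median(odd_data))   # 58
print(median(even_data))  # 58.5
```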
Measures of Central Tendency. Mode: Most frequently occurring response within a data array. Usually the highest point of the curve. May not be typical. May not exist at all. Modal, bimodal, and multimodal.
Measures of Central Tendency. Determine the mode of 48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55: Mode = 63. Determine the mode of 48, 63, 62, 59, 58, 2, 63, 5, 60, 59, 55: Mode = 63 & 59 (bimodal). Determine the mode of 48, 63, 62, 59, 48, 2, 63, 5, 60, 59, 55: Mode = 63, 59, & 48 (multimodal).
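The three mode examples can be checked with a short sketch built on `collections.Counter` from the Python standard library. The helper name `modes` is hypothetical, and returning an empty list when no value repeats is an assumption made here to represent "may not exist at all".

```python
from collections import Counter

def modes(values):
    """Return all values tied for the highest frequency, or [] if no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:                  # every value occurs once: no mode
        return []
    return sorted(v for v, c in counts.items() if c == top)

one_mode = [48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55]
two_modes = [48, 63, 62, 59, 58, 2, 63, 5, 60, 59, 55]
three_modes = [48, 63, 62, 59, 48, 2, 63, 5, 60, 59, 55]

print(modes(one_mode))     # [63]
print(modes(two_modes))    # [59, 63]
print(modes(three_modes))  # [48, 59, 63]
```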
Data Variation: Measure of data scatter. Range: Difference between the lowest and highest data value. Standard Deviation: Square root of the variance.
Range. Calculate the range for the data array 2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63. Calculate by subtracting the lowest value from the highest value: Range = 63 − 2 = 61.
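As a minimal sketch, the range of the data array is the highest value minus the lowest value. The variable names are illustrative only.

```python
data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

# Range = highest value - lowest value
data_range = max(data) - min(data)

print(data_range)  # 61
```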
Standard Deviation: Sample vs. Population. In practice, only the sample standard deviation can be measured, so it is the more useful of the two for applications. Population Standard Deviation: A population standard deviation is a parameter, not a statistic. It gives researchers the amount of dispersion of the data for an entire population of survey respondents. Sample Standard Deviation: A sample standard deviation estimates the standard deviation of a population based on a random sample. Unlike the population standard deviation, it is a statistic that measures the dispersion of the data around the sample mean.
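For reference, the two standard deviations are conventionally written as below. The symbols are the usual textbook notation and are assumed here, since the formulas did not survive in the slide text: $x_i$ is a data value, $\bar{x}$ the sample mean, $n$ the sample size, $\mu$ the population mean, and $N$ the population size.

```latex
% Sample standard deviation (a statistic): divide by n - 1
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}

% Population standard deviation (a parameter): divide by N
\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}
```

The only structural difference is the divisor: $n - 1$ for a sample and $N$ for a population.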
Sample Standard Deviation (s for a sample, not a population). 1. Calculate the mean. 2. Subtract the mean from each value and then square each difference. 3. Sum all squared differences. 4. Divide the summation by the number of values in the array minus 1. 5. Calculate the square root of the result.
Sample Standard Deviation. Calculate the sample standard deviation for the data array 2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63. Subtract the mean (≈ 47.64) from each value and square the difference, for example (48 − 47.64)² = 0.13 and (49 − 47.64)² = 1.85; the remaining values are handled the same way.
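Sample Standard Deviation. Calculate the standard deviation for the data array 2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63. Sum of squared differences = 5,024.55. Dividing by 11 − 1 = 10 gives 502.45, and the square root gives s ≈ 22.42.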
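The five steps for the sample standard deviation can be traced in code. This is a sketch that follows the listed steps directly, using variable names chosen here. The result is cross-checked against `statistics.stdev` from the Python standard library, which also divides by n − 1.

```python
import math
import statistics

data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]
n = len(data)

mean = sum(data) / n                               # 1. calculate the mean
squared_diffs = [(x - mean) ** 2 for x in data]    # 2. subtract the mean, square each difference
sum_sq = sum(squared_diffs)                        # 3. sum all squared differences
sample_variance = sum_sq / (n - 1)                 # 4. divide by (number of values - 1)
s = math.sqrt(sample_variance)                     # 5. take the square root

print(round(sum_sq, 2))           # 5024.55
print(round(sample_variance, 2))  # 502.45
print(round(s, 2))                # 22.42
```

The sum here uses the unrounded mean, so the individual squared differences differ slightly in the last digit from values computed with the rounded mean 47.64.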
Population Standard Deviation. Calculate the population standard deviation for the data array 2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63. 1. Calculate the mean. 2. Subtract the mean from each data value and square each difference.
Population Standard Deviation. 3. Sum all squared differences: 5,024.55. Note that this is the sum of the unrounded squared differences. 4. Divide the summation by the number of data values: 5,024.55 / 11 ≈ 456.78. 5. Calculate the square root of the result: σ ≈ 21.37.
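The population and sample results can be compared using Python's standard `statistics` module, where `pstdev` divides by the number of data values and `stdev` divides by that number minus 1. This is a sketch for checking the hand calculation, not part of the original slides.

```python
import statistics

data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

sigma = statistics.pstdev(data)  # population standard deviation (divide by n)
s = statistics.stdev(data)       # sample standard deviation (divide by n - 1)

print(round(sigma, 2))  # 21.37
print(round(s, 2))      # 22.42
```

For the same data the sample value is always slightly larger, because the sum of squared differences is divided by a smaller number.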
Graphing Frequency Distribution. Numerical assignment of each outcome of a chance experiment. A coin is tossed three times. Assign the variable X to represent the number of heads occurring in the three tosses. Toss outcome (x value): HHH (3), HHT (2), HTH (2), THH (2), HTT (1), THT (1), TTH (1), TTT (0). x = 1 when? HTT, THT, TTH.
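Graphing Frequency Distribution. Probability: The calculated likelihood that an outcome will occur within an experiment. Toss outcome (x value): HHH (3), HHT (2), HTH (2), THH (2), HTT (1), THT (1), TTH (1), TTT (0). Since the eight outcomes are equally likely for a fair coin, the values of x and P(x) are: x = 0, P(x) = 1/8; x = 1, P(x) = 3/8; x = 2, P(x) = 3/8; x = 3, P(x) = 1/8.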
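The coin-toss table can be generated by enumerating all eight outcomes and counting heads. This sketch assumes a fair coin, so every outcome is equally likely; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All outcomes of tossing a coin three times: HHH, HHT, ..., TTT
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

# X = number of heads in each outcome
x_values = {outcome: outcome.count("H") for outcome in outcomes}

# P(x) = (number of outcomes with that x) / (total number of outcomes)
counts = Counter(x_values.values())
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)
# 0 1/8
# 1 3/8
# 2 3/8
# 3 1/8
```

Plotting P(x) against x for these four values gives the histogram of the distribution.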
Histogram. [Figure: histogram of P(x) plotted against x.]
Histogram. Available airplane passenger seats one week before departure. [Figure: histogram with axes "open seats" and "percent of the time".] What information does the histogram provide the airline carriers? What information does the histogram provide prospective customers?