EGR Statistical Inference
“Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.” (H.G. Wells, 1946)
“There are three kinds of lies: white lies, which are justifiable; common lies, which have no justification; and statistics.” (Benjamin Disraeli)
“Statistics is no substitute for good judgment.” (unknown)
Statistical Inference
Suppose:
–A mechanical engineer is considering the use of a new composite material in the design of a vehicle suspension system and needs to know how the material will react under a variety of conditions (heat, cold, vibration, etc.).
–An electrical engineer has designed a radar navigation system to be used in high performance aircraft and needs to be able to validate performance in flight.
–An industrial engineer needs to validate the effect of a new roofing product on installation speed.
–A motorist must decide whether to drive through a long stretch of flooded road after being assured that the average depth is only 6 inches.
Statistical Inference
What do all of these situations have in common?
How can we address the uncertainty involved in decision making?
–a priori
–a posteriori
Probability
A mathematical means of determining how likely an event is to occur.
–Classical (a priori): Given $N$ equally likely outcomes, the probability of an event $A$ is given by $P(A) = n/N$, where $n$ is the number of different ways $A$ can occur.
–Empirical (a posteriori): If an experiment is repeated $M$ times and the event $A$ occurs $m_A$ times, then the probability of event $A$ is defined as $P(A) = m_A/M$.
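The two definitions can be compared numerically. The sketch below is illustrative only: the fair six-sided die, the event "the roll is even", the seed, and all function names are assumptions chosen for the example, not part of the course material. It computes the classical value n/N directly and the empirical value m_A/M from simulated rolls.

```python
import random


def classical_probability(n_ways, n_outcomes):
    """Classical (a priori): n / N for N equally likely outcomes."""
    return n_ways / n_outcomes


def roll_die(rng):
    """One repetition of the (hypothetical) experiment: roll a fair die."""
    return rng.randint(1, 6)


def is_even(face):
    """The (hypothetical) event A: the roll is even."""
    return face % 2 == 0


def empirical_probability(repetitions, rng):
    """Empirical (a posteriori): m_A / M over M repetitions."""
    m_a = sum(1 for _ in range(repetitions) if is_even(roll_die(rng)))
    return m_a / repetitions


rng = random.Random(0)  # fixed seed so the run is repeatable

p_classical = classical_probability(3, 6)         # faces 2, 4, 6 out of 6
p_empirical = empirical_probability(10_000, rng)  # M = 10,000 simulated rolls

print("classical:", p_classical)
print("empirical:", p_empirical)
```

With a large number of repetitions the empirical value should land close to the classical one, though it will rarely match it exactly.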
The Role of Probability in Statistics
In statistical inference, we want to make general statements about the population based on measurements taken from a sample.
–How will all suspension systems produced with the new composite behave?
–How will the radar navigation system perform in all aircraft?
–What speed improvements will we obtain for all roofing applications using the new product?
To answer these questions, we ___________ from the population and hope to generalize the results.
Observations & Statistical Inference
Example:
–An experiment is designed to determine how long it takes to install a roof using a new product.
Experiment Design
–Result: $t = 2.32$ sec/ft$^2$, P =
–p-value:
Descriptive Statistics
Numerical values that help to characterize the nature of data for the experimenter.
–Example: The absolute error in the readings from a radar navigation system was measured with the following results:
–the sample mean, $\bar{x}$ = _________________________
–the sample median, $\tilde{x}$ = _____________
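A minimal sketch of both measures of location follows. The radar readings are not reproduced here, so the `errors` list holds hypothetical values invented for illustration; the helper names are also assumptions. The hand-written versions are checked against Python's standard `statistics` module.

```python
import statistics

# Hypothetical absolute-error readings (NOT the data from the example above).
errors = [2.1, 3.4, 1.9, 5.6, 2.8, 3.1, 4.0]


def sample_mean(data):
    """x-bar: the sum of the observations divided by the sample size n."""
    return sum(data) / len(data)


def sample_median(data):
    """x-tilde: the middle value of the sorted data
    (the average of the two middle values when n is even)."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


x_bar = sample_mean(errors)
x_tilde = sample_median(errors)

print("mean:  ", x_bar, "| library:", statistics.mean(errors))
print("median:", x_tilde, "| library:", statistics.median(errors))
```

The median depends only on the ordering of the values, so a single extreme reading shifts the mean but leaves the median nearly unchanged.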
Descriptive Statistics
Measure of variability
–(Recall) Example: The absolute error in the readings from a radar navigation system was measured with the following results:
–sample range:
–sample variance:
Variability of the Data
–sample variance, $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
–sample standard deviation, $s = \sqrt{s^2}$
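The measures of variability can be sketched the same way. The readings below are again hypothetical placeholders, and the function names are assumptions. The variance uses the usual n - 1 divisor for a sample and is compared with `statistics.variance`, which uses the same divisor.

```python
import math
import statistics

# Hypothetical absolute-error readings (NOT the data from the example above).
errors = [2.1, 3.4, 1.9, 5.6, 2.8, 3.1, 4.0]


def sample_range(data):
    """Largest observation minus smallest observation."""
    return max(data) - min(data)


def sample_variance(data):
    """s^2: sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)


def sample_std_dev(data):
    """s: the positive square root of the sample variance."""
    return math.sqrt(sample_variance(data))


r = sample_range(errors)
s2 = sample_variance(errors)
s = sample_std_dev(errors)

print("range:   ", r)
print("variance:", s2, "| library:", statistics.variance(errors))
print("std dev: ", s, "| library:", statistics.stdev(errors))
```

The standard deviation is in the same units as the data, which usually makes it easier to interpret than the variance.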
Other Descriptors
Discrete vs Continuous
–discrete:
–continuous:
Distribution of the data
–“What does it look like?”
Graphical Methods
Dot diagram
–useful for understanding relationships between factor settings and output
–example (pp )
Stem and leaf plot
–example (radar data)
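A stem and leaf plot is simple enough to build in a few lines. The sketch below assumes non-negative whole-number data, with the tens as the stem and the units as the leaf. The values and the function name are hypothetical, chosen only to show the layout.

```python
from collections import defaultdict


def stem_and_leaf(data):
    """Return one text row per stem: stem = tens digit(s), leaf = units digit.

    Assumes non-negative integers. Stems with no observations are kept
    (as empty rows) so gaps in the data stay visible.
    """
    leaves = defaultdict(list)
    for value in sorted(data):
        leaves[value // 10].append(value % 10)

    rows = []
    for stem in range(min(leaves), max(leaves) + 1):
        leaf_text = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem:>2} | {leaf_text}".rstrip())
    return rows


# Hypothetical readings, for illustration only.
readings = [23, 25, 31, 31, 38, 47, 52, 52, 59]

plot_rows = stem_and_leaf(readings)
for row in plot_rows:
    print(row)
```

Unlike a histogram, this display keeps every individual value recoverable while still showing the overall shape.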
Graphical Methods (cont.)
Frequency Distribution (histogram)
–equal-size class intervals – “bins”
–‘rules of thumb’ for interval size
    –7-15 intervals per data set
    –$\sqrt{n}$
    –more complicated rules
–Identify midpoint
–Determine frequency of occurrence in each bin
–Calculate relative frequency
–Plot frequency vs midpoint
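The steps above can be sketched as a small routine. Everything here is an assumption made for illustration: the function name, the choice of the $\sqrt{n}$ rule as the default number of bins, the convention that the last bin includes its upper endpoint, and the 25 data values, which are invented and are not the measurements used in the lecture example.

```python
import math


def frequency_table(data, n_bins=None):
    """Build equal-width bins and return one row per bin:
    (lower, upper, midpoint, frequency, relative frequency)."""
    n = len(data)
    if n_bins is None:
        n_bins = max(1, round(math.sqrt(n)))  # sqrt(n) rule of thumb
    low, high = min(data), max(data)
    if high == low:
        raise ValueError("all observations are equal; cannot form bins")
    width = (high - low) / n_bins

    counts = [0] * n_bins
    for x in data:
        # The largest observation is placed in the last bin.
        index = min(int((x - low) / width), n_bins - 1)
        counts[index] += 1

    rows = []
    for i, freq in enumerate(counts):
        lower = low + i * width
        upper = lower + width
        rows.append((lower, upper, (lower + upper) / 2, freq, freq / n))
    return rows


# 25 hypothetical observations, for illustration only.
sample = [26, 27, 28, 28, 29, 29, 29, 30, 30, 30, 30, 31, 31,
          31, 31, 31, 32, 32, 32, 33, 33, 34, 34, 35, 36]

table = frequency_table(sample)
for lower, upper, midpoint, freq, rel in table:
    # Text version of "plot frequency vs midpoint": one '#' per observation.
    print(f"{lower:5.1f}-{upper:5.1f}  mid={midpoint:5.1f}  "
          f"F={freq:2d}  rel={rel:.2f}  {'#' * freq}")
```

Changing `n_bins` and rerunning is a quick way to see how sensitive the apparent shape of the distribution is to the choice of interval size.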
Relative Frequency Histogram
Example: Stride lengths (in inches) of 25 male students were determined, with the following results:
What can we learn about the distribution of stride lengths for this sample?
Stride Length
Constructing a Histogram
Determining relative frequencies
Class Interval | Class Midpt. | Frequency, F | Relative frequency
Relative Frequency Graph
What can you see?
Unimodal, bimodal, or multi-modal distributions
Recognizable distribution?
Skewness
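Skewness is usually judged by eye from the histogram, but a number can back up the impression. The sketch below uses the moment-based skewness coefficient (third central moment divided by the second central moment raised to the 3/2 power). That choice, the function name, and both data sets are assumptions for illustration. A value near zero suggests symmetry, a positive value a longer right tail, and a negative value a longer left tail.

```python
def sample_skewness(data):
    """Moment-based skewness coefficient: m3 / m2**1.5,
    where m2 and m3 are the second and third central moments."""
    n = len(data)
    x_bar = sum(data) / n
    m2 = sum((x - x_bar) ** 2 for x in data) / n
    m3 = sum((x - x_bar) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


# Hypothetical data sets, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

skew_symmetric = sample_skewness(symmetric)
skew_right = sample_skewness(right_skewed)

print("symmetric data:   ", skew_symmetric)
print("right-skewed data:", skew_right)
```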
Your turn ** …
Look at problem 3 on page 20
–do parts a & b
–for each data set, construct the following
    –dot diagram
    –stem and leaf plot
    –relative frequency histogram
–construct the above for the combined data set
–draw conclusions
** - time permitting (Note: this also makes a good study problem)