Quartiles: Divide a data set into fourths, or four equal parts.
[Figure: the ordered data from the smallest data value to the largest data value, split at Q1, Q2, and Q3, with 25% of the data in each of the four parts.]
The median breaks a set of data into two halves. To break the data into quarters, the appropriate measures are the quartiles. The first quartile (Q1) is the value that separates the first quarter of a data set from the rest. The third quartile (Q3) is the value that separates the last quarter of a data set from the rest.
The median divides an ordered set of data into two equal groups. The first quartile is the median of the first (lower) group. The third quartile is the median of the second (upper) group.
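The median-of-halves rule above can be sketched in a few lines of Python. This is only a sketch under one assumption the slides do not state: when the number of values is odd, the middle value is left out of both groups. Other textbooks and software use slightly different conventions, so results may differ a little. The function name `quartiles` is made up for this illustration.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption (not stated in the slides): when the number of values is odd,
    the middle value belongs to neither group. Needs at least two values.
    """
    values = sorted(data)              # quartiles are defined on ordered data
    n = len(values)
    lower = values[: n // 2]           # first group
    upper = values[(n + 1) // 2 :]     # second group (skips the middle value if n is odd)
    return median(lower), median(values), median(upper)

# Illustrative values only, not taken from the slides.
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))   # (2.5, 4.5, 6.5)
```

With eight values the groups are 1 to 4 and 5 to 8, so Q1 = 2.5, the median is 4.5, and Q3 = 6.5.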
The interquartile range (IQR) for a set of measurements is the distance between the first and third quartiles: IQR = Q3 - Q1.
Outliers: Extreme observations or unusual measurements. They can occur because of an error in measuring a variable, an error during data entry, or errors in sampling. An outlier may result from transposing digits when recording a measurement or from incorrectly reading an instrument dial. Outliers may also contain important information not shared with the other measurements in the set.
Therefore, isolating outliers if they are present is an important step in any preliminary analysis of a data set. The boxplot is designed expressly for this purpose.
Checking for outliers by using quartiles
Step 1: Determine the first and third quartiles of the data.
Step 2: Compute the interquartile range (IQR).
Step 3: Determine the fences. Fences serve as cutoff points for determining outliers.
Step 4: If a data value is less than the lower fence or greater than the upper fence, it is considered an outlier.
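The steps above can be sketched in Python as follows. The slides do not show the fence formulas, so this sketch assumes the common convention: lower fence = Q1 - 1.5(IQR) and upper fence = Q3 + 1.5(IQR). The function name `find_outliers`, the parameter `k`, and the sample values are made up for illustration.

```python
def find_outliers(data, q1, q3, k=1.5):
    """Return (lower_fence, upper_fence, outliers) for already-known quartiles.

    Assumption: fences sit k * IQR beyond the quartiles, with k = 1.5 by
    convention; the slides do not state the multiplier.
    """
    iqr = q3 - q1                      # Step 2: interquartile range
    lower_fence = q1 - k * iqr         # Step 3: fences
    upper_fence = q3 + k * iqr
    # Step 4: values strictly outside the fences are flagged as outliers
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, outliers

# Illustrative values only, not taken from the slides.
sample = [2, 4, 5, 6, 7, 8, 9, 30]
print(find_outliers(sample, q1=4.5, q3=8.5))   # (-1.5, 14.5, [30])
```

Here IQR = 4, so the fences are -1.5 and 14.5, and only the value 30 lies outside them.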
The Five-Number Summary; Boxplots: Compute the five-number summary: the smallest data value, Q1, the median, Q3, and the largest data value. The five-number summary can be used to create a simple graph called a boxplot, which visually describes the distribution of the data. From the boxplot, we can quickly detect whether there are any outliers in the data set. The boxplot uses the IQR to create imaginary fences that separate outliers from the rest of the data set.
The five-number summary can be used to create a simple graph called a boxplot. From the boxplot, you can quickly detect any skewness in the shape of the distribution and see whether there are any outliers in the data set.
[Figure: boxplot with the lower fence, the upper fence, and an outlier labeled.]
- Symmetric
- Skewed left, because the tail is to the left
- Skewed right, because the tail is to the right
Characteristics Of Skewed Distributions
TO CONSTRUCT A BOXPLOT
Step 1: Determine the lower and upper fences.
Step 2: Draw vertical lines at Q1, the median, and Q3, and enclose them in a box.
Step 3: Label the lower and upper fences.
Step 4: Draw a line from Q1 to the smallest data value that is larger than the lower fence. Draw a line from Q3 to the largest data value that is smaller than the upper fence.
Step 5: Any data values less than the lower fence or greater than the upper fence are outliers and are marked with an asterisk (*).
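As a rough sketch, the numbers needed to draw the boxplot (fences, whisker ends, and outliers) can be computed like this. The fence multiplier 1.5 is the usual convention and is assumed here, since the slides do not show the formula. The function name `boxplot_parts` and the sample values are made up for illustration.

```python
def boxplot_parts(data, q1, q3, k=1.5):
    """Compute fences, whisker ends, and outliers for a boxplot.

    Assumption: fences are Q1 - k*IQR and Q3 + k*IQR with k = 1.5.
    A value lying exactly on a fence is treated as a non-outlier here.
    """
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "whisker_low": min(inside),    # line drawn from Q1 down to this value
        "whisker_high": max(inside),   # line drawn from Q3 up to this value
        "outliers": sorted(x for x in data if x < lower_fence or x > upper_fence),
    }

# Illustrative values only, not taken from the slides.
parts = boxplot_parts([2, 4, 5, 6, 7, 8, 9, 30], q1=4.5, q3=8.5)
print(parts)
```

For these values the whiskers run from 2 to 9, and 30 would be marked with an asterisk because it lies beyond the upper fence of 14.5.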
Example: Sketch the boxplot and interpret its shape.
Solution: 0.97, 1.14, 1.85, 2.47, 3.41, 3.94, 3.97, 4.02, 4.11, 5.22
- The distribution is skewed left
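The conclusion can be checked numerically with a short Python sketch. It assumes the median-of-halves quartiles and the conventional 1.5(IQR) fences, neither of which is spelled out with this example. The skewness check (comparing the distance from Q1 to the median with the distance from the median to Q3) is only a rough heuristic, not a formal test.

```python
from statistics import median

data = [0.97, 1.14, 1.85, 2.47, 3.41, 3.94, 3.97, 4.02, 4.11, 5.22]

values = sorted(data)
n = len(values)
lower, upper = values[: n // 2], values[(n + 1) // 2 :]   # the two halves
q1, q2, q3 = median(lower), median(values), median(upper)

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr      # assumed convention: 1.5 * IQR beyond Q1
upper_fence = q3 + 1.5 * iqr      # assumed convention: 1.5 * IQR beyond Q3
outliers = [x for x in values if x < lower_fence or x > upper_fence]

five_number = (values[0], q1, q2, q3, values[-1])

# Rough heuristic: the longer side of the box points toward the tail.
if q2 - q1 > q3 - q2:
    shape = "skewed left"
elif q2 - q1 < q3 - q2:
    shape = "skewed right"
else:
    shape = "roughly symmetric"

print("five-number summary:", tuple(round(v, 3) for v in five_number))
print("fences:", round(lower_fence, 3), round(upper_fence, 3))
print("outliers:", outliers)
print("shape:", shape)
```

By hand, Q1 = 1.85, the median is 3.675, and Q3 = 4.02, so IQR = 2.17 and the fences are -1.405 and 7.275. No data value falls outside the fences, and the median sits much closer to Q3 than to Q1, which agrees with the skewed-left reading.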
Example 1: Here are the SAT math scores for 19 randomly selected students. Find the median, first quartile, and third quartile.
Example 2: Here are the heights in inches of 12 randomly selected college females. Find the median, first quartile, and third quartile.
Example 3: As American consumers become more careful about the foods they eat, food processors try to stay competitive by avoiding excessive amounts of fat, cholesterol, and sodium in the foods they sell. The following data are the amounts of sodium per slice (in milligrams) for each of eight brands of regular American cheese. Construct a boxplot for the data and look for outliers.