Describing Distributions with Numbers (Section 1.2)
Two Ways to Measure the Center
  The two measures are the mean and the median.
  Mean: add up all the observations and divide by the number of observations.
  The mean is not resistant to extreme values.
  Median: arrange all observations in order from smallest to largest.
  If the number of observations is odd, the median is the middle value.
  If the number of observations is even, the median is the mean of the two center values.
  The median is resistant to extreme values.
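The two definitions above can be sketched in a few lines of Python. The data values are hypothetical and were chosen only to show how a single extreme observation pulls the mean but leaves the median alone.

```python
def mean(values):
    """Add up all observations and divide by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the ordered data; mean of the two center values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data: the same values with and without one extreme observation.
typical = [2, 3, 4, 5, 6]
with_extreme = [2, 3, 4, 5, 60]

print(mean(typical), median(typical))            # 4.0 4
print(mean(with_extreme), median(with_extreme))  # 14.8 4
```

Replacing 6 with 60 moves the mean from 4.0 to 14.8, while the median stays at 4.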
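Ways to Measure the Spread
  The measures are the range, the quartiles, and the standard deviation.
  Range: the difference between the highest and lowest values.
  Q1 (first quartile): the median of the observations to the left of the overall median in the ordered list.
  Q3 (third quartile): the median of the observations to the right of the overall median in the ordered list.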
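A minimal sketch of the range and the quartiles follows. It assumes that when the number of observations is odd, the middle observation itself is left out of both halves; other textbooks and software use slightly different quartile rules, so results may differ. The data are hypothetical.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd number of observations, the overall median
    is excluded from both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # observations left of the median
    upper = ordered[half + (n % 2):]    # observations right of the median
    return median(lower), median(upper)


data = [1, 3, 4, 6, 7, 9, 10, 12, 15]  # hypothetical observations
data_range = max(data) - min(data)     # highest value minus lowest value
q1, q3 = quartiles(data)

print(data_range, q1, q3)  # 14 3.5 11.0
```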
IQR (Interquartile Range)
  The distance between the first and third quartiles: IQR = Q3 - Q1.
  The IQR measures the spread of the middle half of the data.
Identifying Outliers
  Multiply the IQR by 1.5.
  An observation is an outlier if it falls more than 1.5 × IQR above the third quartile or more than 1.5 × IQR below the first quartile.
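The 1.5 × IQR rule can be sketched as below. The function names `outlier_fences` and `find_outliers` are illustrative, and the data and quartile values are hypothetical (the quartiles shown are the medians of the lower and upper halves of this particular list).

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs given by the 1.5 x IQR rule."""
    iqr = q3 - q1          # spread of the middle half of the data
    step = 1.5 * iqr
    return q1 - step, q3 + step


def find_outliers(values, q1, q3):
    """List every observation that falls outside the two cutoffs."""
    low, high = outlier_fences(q1, q3)
    return [x for x in values if x < low or x > high]


data = [1, 3, 4, 6, 7, 9, 10, 12, 40]  # hypothetical observations
q1, q3 = 3.5, 11.0                     # quartiles of the data above

print(q3 - q1)                      # 7.5
print(outlier_fences(q1, q3))       # (-7.75, 22.25)
print(find_outliers(data, q1, q3))  # [40]
```

Here 40 lies above the upper cutoff of 22.25, so it is flagged; no value lies below the lower cutoff.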
Five Number Summary
  Minimum, Q1, Median, Q3, Maximum
  Used to create a boxplot or modified boxplot.
  Use it for skewed distributions or distributions with strong outliers.
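As a rough illustration, the five numbers can be collected in one function. This sketch again assumes the overall median is excluded from both halves when the number of observations is odd; `five_number_summary` is an illustrative name and the data are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # observations left of the median
    upper = ordered[half + (n % 2):]    # observations right of the median
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])


data = [12, 1, 7, 3, 10, 4, 15, 6, 9]  # hypothetical, deliberately unsorted
print(five_number_summary(data))       # (1, 3.5, 7, 11.0, 15)
```

These five values are the ones a boxplot draws: the ends of the whiskers, the edges of the box, and the line inside the box.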
Variance
  The average of the squares of the deviations of the observations from the mean.
  The standard deviation is the square root of the variance.
Degrees of Freedom
  n - 1, where n is the number of observations in your data set.
  This is the divisor used when averaging the squared deviations to get the variance.
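A short sketch of the variance calculation, dividing by the n - 1 degrees of freedom, with the standard deviation taken as its square root. The data are hypothetical and chosen so the arithmetic is easy to follow by hand.

```python
from math import sqrt


def variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    m = sum(values) / n
    squared_deviations = [(x - m) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)   # n - 1 degrees of freedom


def standard_deviation(values):
    """Square root of the variance."""
    return sqrt(variance(values))


data = [1, 2, 3, 4, 5]   # hypothetical: mean is 3
# Squared deviations are 4, 1, 0, 1, 4, which sum to 10; 10 / (5 - 1) = 2.5
print(variance(data))    # 2.5
print(standard_deviation(data))
```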
Linear Transformation
  Changes the original variable x into the new variable xnew: xnew = a + bx.
  a shifts all values of x up or down (changes the center, not the spread).
  b changes the size of the unit of measurement (changes both the center and the spread).
  A linear transformation does not change the shape of the distribution.
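The effect of a and b can be checked numerically. The sketch below uses hypothetical data and applies a shift only (a = 5, b = 1) and a rescaling only (a = 0, b = 2), then compares the mean and standard deviation before and after.

```python
from statistics import mean, stdev


def linear_transform(values, a, b):
    """Apply xnew = a + b * x to every observation."""
    return [a + b * x for x in values]


x = [10, 20, 30, 40, 50]                   # hypothetical observations
shifted = linear_transform(x, a=5, b=1)    # adds 5 to every value
rescaled = linear_transform(x, a=0, b=2)   # doubles every value

# Shifting moves the center but leaves the spread unchanged.
print(mean(x), mean(shifted))
print(stdev(x), stdev(shifted))

# Rescaling multiplies both the center and the spread by b.
print(mean(x), mean(rescaled))
print(stdev(x), stdev(rescaled))
```

For a negative b the spread is multiplied by the absolute value of b, since a standard deviation cannot be negative.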