Descriptive statistics Describing data with numbers: measures of variability
What to describe? What is the “location” or “center” of the data? How do the data vary?
Measures of Variability: range; interquartile range; variance and standard deviation; coefficient of variation. All of these measures are appropriate for measurement data only.
Range The difference between largest and smallest data point. Highly affected by outliers.
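As a minimal sketch of the definition above, the range can be computed directly from the largest and smallest values. The numbers below are invented for illustration and are not the GPA data from the slides; the second list adds one extreme value to show how strongly a single outlier moves the range.

```python
# Hypothetical values, made up for illustration only.
data = [2.1, 2.8, 3.0, 3.3, 3.6]

# Range = largest value minus smallest value.
data_range = max(data) - min(data)

# Adding a single extreme value changes the range dramatically.
with_outlier = data + [0.2]
range_with_outlier = max(with_outlier) - min(with_outlier)

print(round(data_range, 2), round(range_with_outlier, 2))
```

One low value more than doubles the range here, even though the other five points are unchanged.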
Range?
Range Descriptive Statistics for GPA (table columns: N, Mean, Median, TrMean, StDev, SE Mean, Minimum, Maximum, Q1, Q3; table values not preserved). Range = Maximum − Minimum = 1.96
Interquartile range The difference between the third quartile and the first quartile. So, the “middle-half” of the values. Robust to outliers.
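The robustness claim above can be illustrated with a short sketch using Python's standard `statistics` module. The data are hypothetical, and the helper name `iqr` is introduced here only for illustration. Software packages use slightly different quartile conventions, so the exact Q1 and Q3 may differ from other tools on the same data.

```python
import statistics

# Hypothetical values, made up for illustration only.
values = [2, 4, 5, 7, 8, 10, 12]
# Same data with the largest value replaced by an extreme outlier.
with_outlier = [2, 4, 5, 7, 8, 10, 120]

def iqr(data):
    """Interquartile range: third quartile minus first quartile."""
    q1, _median, q3 = statistics.quantiles(data, n=4)
    return q3 - q1

iqr_plain = iqr(values)
iqr_outlier = iqr(with_outlier)

range_plain = max(values) - min(values)
range_outlier = max(with_outlier) - min(with_outlier)

print(iqr_plain, iqr_outlier)      # IQR is unchanged by the outlier
print(range_plain, range_outlier)  # range is not
```

In this example the IQR is 6 in both cases, while the range jumps from 10 to 118.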
Interquartile range?
Interquartile range Descriptive Statistics for GPA (table columns: N, Mean, Median, TrMean, StDev, SE Mean, Minimum, Maximum, Q1, Q3; table values not preserved). IQR = Q3 − Q1 = 0.795
Variance Sample variance is denoted by s². Measures the average squared deviation of the data points from their mean (the sum of squared deviations is divided by n − 1 rather than n). Highly affected by outliers.
Standard deviation Sample standard deviation is the square root of the sample variance, and so is denoted by s. Roughly, it measures the typical deviation of the data points from their mean, in the same units as the data. Also highly affected by outliers.
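A small sketch of how the sample variance and sample standard deviation are computed, first by hand and then with Python's standard `statistics` module. The data are hypothetical, and the n − 1 divisor is the usual sample-variance convention, assumed here to be the one the slides intend.

```python
import math
import statistics

# Hypothetical values, made up for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n

# Squared deviation of each data point from the mean.
squared_deviations = [(x - mean) ** 2 for x in data]

# Sample variance divides the sum of squared deviations by n - 1.
s2_manual = sum(squared_deviations) / (n - 1)
# Sample standard deviation is the square root of the sample variance.
s_manual = math.sqrt(s2_manual)

# The standard library uses the same n - 1 convention.
s2 = statistics.variance(data)
s = statistics.stdev(data)

print(round(s2_manual, 3), round(s_manual, 3))
```

Note that s² is in squared units of the data, while s is in the original units.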
Variance or standard deviation?
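Variance or standard deviation Descriptive statistics by sex, in mph (table columns: N, Mean, Median, TrMean, StDev, SE Mean, Minimum, Maximum, Q1, Q3; table values not preserved). Females: s = 11.32 mph and s² = 11.32² ≈ 128.14 mph². Males: s = 17.39 mph and s² = 17.39² ≈ 302.41 mph².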
Variance or standard deviation?
Variance or standard deviation Descriptive statistics by sex, in kph (table columns: N, Mean, Median, TrMean, StDev, SE Mean, Minimum, Maximum, Q1, Q3; table values not preserved). Females: s = 18.86 kph and s² = 18.86² ≈ 355.70 kph². Males: s = 28.98 kph and s² = 28.98² ≈ 839.84 kph².
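The mph and kph slides describe the same measurements in different units. A minimal sketch of what a change of units does to s and s²: multiplying every value by a constant multiplies s by that constant and s² by its square. The speeds below are hypothetical, and the factor 1.609344 km per mile is the standard conversion, which may not be the exact factor used to produce the slides' figures.

```python
import statistics

KM_PER_MILE = 1.609344  # standard conversion factor (assumption, see lead-in)

# Hypothetical speeds in mph, made up for illustration only.
speeds_mph = [85, 90, 95, 100, 110]
speeds_kph = [v * KM_PER_MILE for v in speeds_mph]

s_mph = statistics.stdev(speeds_mph)
s_kph = statistics.stdev(speeds_kph)
var_mph = statistics.variance(speeds_mph)
var_kph = statistics.variance(speeds_kph)

# s scales by the conversion factor, s^2 by the factor squared.
sd_ratio = s_kph / s_mph
var_ratio = var_kph / var_mph

print(round(sd_ratio, 4), round(var_ratio, 4))
```

Because s stays in the units of the data, it is usually easier to interpret than s².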
Coefficient of Variation Ratio of sample standard deviation to sample mean multiplied by 100. Measures relative variability, that is, variability relative to the magnitude of the data. Unitless, so good for comparing variation between two groups.
Coefficient of variation (MPH) Descriptive statistics by sex (table columns: N, Mean, Median, TrMean, StDev, SE Mean, Minimum, Maximum, Q1, Q3; table values not preserved). Females: CV = (11.32/91.23) × 100 = 12.4. Males: CV = (17.39/106.79) × 100 = 16.3.
Coefficient of variation (KPH) Descriptive statistics by sex (table columns: N, Mean, Median, TrMean, StDev, SE Mean, Minimum, Maximum, Q1, Q3; table values not preserved). Females: CV = (18.86/152.05) × 100 = 12.4. Males: CV = (28.98/177.98) × 100 = 16.3.
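The CV arithmetic on the MPH and KPH slides can be checked with a short sketch. The function name `coefficient_of_variation` is introduced here for illustration; the standard deviations and means are the ones quoted on the slides.

```python
def coefficient_of_variation(sd, mean):
    """CV = (standard deviation / mean) * 100, a unitless percentage."""
    return sd / mean * 100

# Standard deviations and means as quoted on the slides.
cv_female_mph = coefficient_of_variation(11.32, 91.23)
cv_male_mph = coefficient_of_variation(17.39, 106.79)
cv_female_kph = coefficient_of_variation(18.86, 152.05)
cv_male_kph = coefficient_of_variation(28.98, 177.98)

# Rounded to one decimal, the CV is the same in either unit.
print(round(cv_female_mph, 1), round(cv_female_kph, 1))
print(round(cv_male_mph, 1), round(cv_male_kph, 1))
```

The unit conversion multiplies both s and the mean by the same factor, so it cancels in the ratio. That is why the CV comes out the same in mph and kph.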
The most appropriate measure of variability depends on … the shape of the data’s distribution.
Choosing Appropriate Measure of Variability If data are symmetric, with no serious outliers, use range and standard deviation. If data are skewed, and/or have serious outliers, use IQR. If comparing variation across two data sets, use coefficient of variation.