Summary of Previous Lecture
- Central Tendency
  - Mode: highest frequency; used with nominal or categorical data
  - Median: middle value; avoids the influence of outliers
  - Mean
    - Arithmetic Mean: First and Second Moment
    - Geometric Mean
    - Weighted Mean
Distribution Descriptor 2
Geography
Jinmu Choi
- Measure of Dispersion (2)
- Range and Percentile (2)
- Mean Deviation, Variance, Std. Dev. (3)
- Weighted Var. and Std. Dev., CV (3)
- Skewness and Kurtosis (2)
- Summary and Next…
Dispersion
- Dispersion: how the values are concentrated or scattered around the mean and along the value line
  - Very similar to the mean
  - Quite different from the mean
  - Just scattered around
- Xa: 1, 3, 5, 7, 9, 11, 13: Mean = 7, Range = 12
- Xb: -11, -5, 1, 7, 13, 19, 25: Mean = 7
Dispersion Measures
- Magnitude of dispersion
  - Range: Maximum - Minimum
  - Percentiles
  - Mean deviations
  - Standard deviations
- Direction and Sharpness
  - Skewness
  - Kurtosis
Range
- Range: Maximum - Minimum
- The greater the range in a data series, the more dispersed the data are
- The range shows only how far the values are scattered, not how they are arranged within that span
  - Xb: -11, -5, 1, 7, 13, 19, 25: Mean = 7, Range = 36
  - Xc: -11, -10, 6, 7, 8, 24, 25: Mean = 7, Range = 36
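
As a minimal sketch in Python, the mean and range of the three example series can be computed directly. The helper names `mean` and `value_range` are illustrative assumptions, not part of the lecture. The output suggests why the range alone is a coarse measure: Xb and Xc share the same mean and range although their values are arranged differently.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def value_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)


# Example series from the slides
xa = [1, 3, 5, 7, 9, 11, 13]
xb = [-11, -5, 1, 7, 13, 19, 25]
xc = [-11, -10, 6, 7, 8, 24, 25]

for name, data in (("Xa", xa), ("Xb", xb), ("Xc", xc)):
    print(name, "mean =", mean(data), "range =", value_range(data))
```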
Percentiles
- Milestones within the range of data
- Sorting and counting ¼, ½, ¾ of the total observations from the minimum
- Median = ½ from the minimum = 50%
- Xb: -11, -5, 1, 7, 13, 19, 25: Mean = 7, Range = 36, Percentile =
- Xc: -11, -10, 6, 7, 8, 24, 25: Mean = 7
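
The quartiles (25%, 50%, 75%) of the two series can be sketched with Python's standard `statistics.quantiles`. Quartile values depend on the interpolation convention. This example assumes Python's default "exclusive" method, which may not match the rule used in the course textbook, so the numbers are illustrative rather than definitive.

```python
import statistics

# Example series from the slides (quantiles() sorts the data internally)
xb = [-11, -5, 1, 7, 13, 19, 25]
xc = [-11, -10, 6, 7, 8, 24, 25]

# n=4 asks for the three cut points that split the data into quarters.
# Assumption: default method="exclusive"; other conventions give other values.
q_xb = statistics.quantiles(xb, n=4)
q_xc = statistics.quantiles(xc, n=4)

print("Xb quartiles:", q_xb)
print("Xc quartiles:", q_xc)

# Interquartile range: the spread of the middle half of the observations
iqr_xb = q_xb[2] - q_xb[0]
iqr_xc = q_xc[2] - q_xc[0]
print("IQR Xb =", iqr_xb, "IQR Xc =", iqr_xc)
```

Under this convention the two series have the same median but different quartiles, which the mean and range alone do not reveal.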
Mean Deviation
- Dispersion using all values
- The average absolute difference between each value and the mean
  - Xa: 1, 3, 5, 7, 9, 11, 13: Mean Dev. = 3.4286
  - Xb: -11, -5, 1, 7, 13, 19, 25: Mean Dev. = 10.2857
- Concerns only the distance of the values from the mean, not the direction
- 1 2 3 4 5 6 7 8 9: Mean = 5, Mean Dev. = 2.22…
- 1 2 3 4 5 6 7 8 18: Mean = 6, Mean Dev. = 3.33…
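
A short Python sketch of the mean deviation, assuming it is the average of the absolute differences from the mean (the function name `mean_deviation` is illustrative). It reproduces the values quoted on the slide.

```python
def mean_deviation(values):
    """Average absolute distance of the values from their mean."""
    m = sum(values) / len(values)
    return sum(abs(v - m) for v in values) / len(values)


# Example series from the slides
xa = [1, 3, 5, 7, 9, 11, 13]
xb = [-11, -5, 1, 7, 13, 19, 25]
no_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 18]

print(round(mean_deviation(xa), 4))            # 3.4286
print(round(mean_deviation(xb), 4))            # 10.2857
print(round(mean_deviation(no_outlier), 2))    # 2.22
print(round(mean_deviation(with_outlier), 2))  # 3.33
```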
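Variance
- Average of the squared differences from the mean
- Population variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$
- Sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$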
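
The difference between the population and the sample variance can be sketched as follows. The function names are illustrative, and the standard definitions are assumed: divide the sum of squared deviations by N for a population and by n - 1 for a sample.

```python
def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / n


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1)


# Example series Xa from the slides
xa = [1, 3, 5, 7, 9, 11, 13]

print(population_variance(xa))        # 16.0
print(round(sample_variance(xa), 4))  # 18.6667
```

The sample variance is always the larger of the two, and the gap shrinks as the number of observations grows.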
Standard Deviation
- Square root of the variance (the averaged squared deviation)
- Restores the magnitude or scale of the original dataset
  - Mean: 201.23, Var.: 88432.30, Std. Dev.: 297.38
- For data resembling a normal distribution, the standard deviation gives:
  - About 68% of the data values: within mean ± 1 std. dev.
  - About 95% of the data values: within mean ± 2 std. dev.
  - About 99% of the data values: within mean ± 3 std. dev.
- 1 2 3 4 5 6 7 8 9: Mean = 5, Std. Dev. = 2.58…
- 1 2 3 4 5 6 7 8 18: Mean = 6, Std. Dev. = 4.76…
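
A minimal Python sketch of the standard deviation and of the intervals of 1, 2 and 3 standard deviations around the mean. The population form (dividing by N) is assumed here because it reproduces the slide values 2.58 and 4.76. The helper `sigma_bands` is a hypothetical name introduced for illustration.

```python
import math


def population_std(values):
    """Square root of the population variance."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((v - m) ** 2 for v in values) / n)


def sigma_bands(mean, std):
    """Intervals (low, high) for mean +/- 1, 2 and 3 standard deviations."""
    return {k: (mean - k * std, mean + k * std) for k in (1, 2, 3)}


no_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 18]

print(round(population_std(no_outlier), 2))    # 2.58
print(round(population_std(with_outlier), 2))  # 4.76

# Mean and standard deviation quoted on the slide
bands = sigma_bands(201.23, 297.38)
for k, (low, high) in bands.items():
    print(f"mean +/- {k} std. dev.: {low:.2f} to {high:.2f}")
```

The 68/95/99% reading of these intervals applies only when the data are roughly normally distributed.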
Weighted Variance
- Variance for grouped data
  - Get the range for each group (class)
  - Get the mid value for each group (class)
  - Assign the group's mid value to each observation in that group
  - Calculate the variance using the list of mid values
- Example classes
  Range      4~50   50~200   200~1000
  Mid value  27     125      600
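
The grouped-data steps above can be sketched in Python. The class ranges come from the slide, but the number of observations per class (`counts`) is hypothetical because the slide does not give it. The population form of the variance is also an assumption.

```python
def midpoint(low, high):
    """Mid value of a class range."""
    return (low + high) / 2


def grouped_variance(mid_values, counts):
    """Variance where every observation is replaced by its class mid value."""
    n = sum(counts)
    mean = sum(m * c for m, c in zip(mid_values, counts)) / n
    return sum(c * (m - mean) ** 2 for m, c in zip(mid_values, counts)) / n


# Class ranges from the slide
classes = [(4, 50), (50, 200), (200, 1000)]
mid_values = [midpoint(low, high) for low, high in classes]  # [27.0, 125.0, 600.0]

# Hypothetical observation counts per class (not from the lecture)
counts = [5, 3, 2]

weighted_var = grouped_variance(mid_values, counts)
weighted_std = weighted_var ** 0.5
print(mid_values, weighted_var, round(weighted_std, 2))
```

Weighting each mid value by its count gives the same result as expanding the data into a list of mid values, one per observation, and taking the ordinary variance of that list.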
Weighted Standard Deviation
- Square root of the weighted variance
- Unweighted vs. weighted statistics
  - Unweighted variance: 88432.30
  - Unweighted std. dev.: 297.38
  - Weighted variance: 1537.7615
  - Weighted std. dev.: 39.21
- Why do they differ? The variation within each group has been removed
Coefficient of Variation
- Problem of mean and variance: sensitive to scale
- Coefficient of variation (CV) = standard deviation / mean
- Used to check whether two datasets differ only in scale
  - Mean: the center of the data
  - Standard deviation: how much dispersion the data have
  - Both (CV): difference in magnitude, for comparing multiple datasets
- X: 1 3 5 7 9 11 13: mean 7, std. dev. 4
- Y: 10 30 50 70 90 110 130: mean 70, std. dev. 40
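
As a brief Python sketch, the coefficient of variation can be computed as the standard deviation divided by the mean. The population standard deviation (`statistics.pstdev`) is assumed here because it matches the slide values of 4 and 40. The function name is illustrative.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


# Example series from the slide; Y is X multiplied by 10
x = [1, 3, 5, 7, 9, 11, 13]
y = [10, 30, 50, 70, 90, 110, 130]

cv_x = coefficient_of_variation(x)
cv_y = coefficient_of_variation(y)
print(round(cv_x, 4), round(cv_y, 4))  # 0.5714 0.5714
```

Both series give the same CV, which suggests that they differ only in scale.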
Skewness
- Third moment statistic: directional bias of the distribution of the data
- Use frequency distribution (histogram)
  - X axis: numerical range
  - Y axis: frequency
- Positive skewness: Bulk < Mean
- Negative skewness: Mean < Bulk
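
The slide does not give a formula, so the following Python sketch assumes the common moment-based definition: the third central moment divided by the second central moment raised to the power 1.5. The function names are illustrative. Software packages may apply additional sample-size corrections, so their values can differ slightly.

```python
def central_moment(values, k):
    """k-th central moment: average of (value - mean) ** k."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** k for v in values) / n


def skewness(values):
    """Moment-based skewness (assumed definition): m3 / m2 ** 1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
right_tail = [1, 2, 3, 4, 5, 6, 7, 8, 18]   # bulk below the mean
left_tail = [-v for v in right_tail]         # mirror image, bulk above the mean

print(skewness(symmetric))   # zero for symmetric data
print(skewness(right_tail))  # positive
print(skewness(left_tail))   # negative
```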
Kurtosis
- Fourth moment statistic: sharpness of the distribution of the data
- Use histogram
  - X axis: numerical range
  - Y axis: frequency
- Kurtosis of normal dist.: 3; subtracting 3 gives the excess kurtosis K
  - Normal distribution: K = 0
  - High kurtosis (sharp peak): K > 0
  - Low kurtosis (flat): K < 0
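
A minimal Python sketch of the excess kurtosis K, assuming the moment-based definition: the fourth central moment divided by the squared second central moment, minus 3 so that a normal distribution gives K = 0. The function names are illustrative, and statistical packages may apply sample-size corrections.

```python
def central_moment(values, k):
    """k-th central moment: average of (value - mean) ** k."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** k for v in values) / n


def excess_kurtosis(values):
    """Moment-based excess kurtosis (assumed definition): m4 / m2 ** 2 - 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


flat = [1, 3, 5, 7, 9, 11, 13]            # evenly spread values
peaked = [1, 2, 3, 4, 5, 6, 7, 8, 18]     # one extreme value

print(excess_kurtosis(flat))    # negative: flatter than normal
print(excess_kurtosis(peaked))  # positive: heavier tail than normal
```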
Summary
- Dispersion
  - Range: gives boundary
  - Percentile: gives clustering of observations
  - Mean Deviation: magnitude of dispersion
  - Variance and Standard Deviation: magnitude of dispersion
  - Weighted Variance and Standard Deviation: dispersion of grouped values
  - Coefficient of Variation: removes scale differences
- Direction and Sharpness
  - Skewness: direction from mean
  - Kurtosis: sharpness compared to normal distribution
Next
- Lab3: Additional Statistics and MAUP
- Lecture 4: Relationship Descriptor 1. Correlation Analysis (Ch 3, pp.94-107)