Dispersion
Basic problem
Population 1: 3, 3, 3, 3, 3
Population 2: 1, 2, 3, 4, 5
Population 3: 0, 0, 15, 0, 0
Population 4: -294, -24, 3, 30, 300
What are the similarities and the differences among these examples? How can we measure dispersion?
Dispersion
The differences between the values, and the deviations of the values from a measure of central tendency.
Difference: $x_i - x_j$
Deviation: $x_i - \bar{x}$ (here measured from the mean)
How can we measure dispersion?
Difference:
  Range
  Gini's average absolute difference
Deviation:
  Standard deviation
  Variance
  Coefficient of variation
Range
Range: $R = x_{\max} - x_{\min}$
Advantages? Disadvantages? Interpretation?
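As a minimal sketch, the range of the four populations from the basic problem can be computed as below. It assumes the range is the largest value minus the smallest; the helper name `value_range` is illustrative only.

```python
# The four populations from the basic problem.
populations = {
    "Population 1": [3, 3, 3, 3, 3],
    "Population 2": [1, 2, 3, 4, 5],
    "Population 3": [0, 0, 15, 0, 0],
    "Population 4": [-294, -24, 3, 30, 300],
}


def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)


# Mean and range of every population.
means = {name: sum(v) / len(v) for name, v in populations.items()}
ranges = {name: value_range(v) for name, v in populations.items()}

for name in populations:
    print(name, "mean:", means[name], "range:", ranges[name])
```

All four populations share the same mean, yet their ranges differ widely. The range uses only the two extreme values, so in Population 3 a single value of 15 determines it even though the other four values are identical.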
Variance
Calculation from ungrouped data: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}$
Numerator: total sum of squares (TSS)
Denominator: number of elements (N)
Standard deviation
Calculation from ungrouped data: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}}$
Interpretation: how much the individual values deviate from the mean on average.
Example
Ages (years): 20, 20, 20, 25, 25, 32, 33, 33, 33, 33
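The example can be worked through as in the sketch below, assuming the population form of the variance (denominator N, the number of elements) described on the slides. The variable names are illustrative.

```python
import math

ages = [20, 20, 20, 25, 25, 32, 33, 33, 33, 33]

n = len(ages)                             # number of elements, N
mean = sum(ages) / n                      # arithmetic mean
tss = sum((x - mean) ** 2 for x in ages)  # total sum of squares (numerator)
variance = tss / n                        # population variance
std_dev = math.sqrt(variance)             # standard deviation

print(f"mean={mean:.2f} TSS={tss:.2f} variance={variance:.2f} sd={std_dev:.2f}")
```

The mean is 27.4 years, the total sum of squares is 322.4, the variance is 32.24 and the standard deviation is about 5.68 years. Python's `statistics.pvariance` and `statistics.pstdev` use the same denominator N.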
Calculation from frequency distribution
Ages (years): 20, 20, 20, 25, 25, 32, 33, 33, 33, 33
$\sigma = \sqrt{\frac{\sum_j f_j (x_j - \bar{x})^2}{\sum_j f_j}} = \sqrt{\sum_j g_j (x_j - \bar{x})^2}$

Age (years)   Number of individuals (f_j)   Ratio (g_j)
20            3                             0.3
25            2                             0.2
32            1                             0.1
33            4                             0.4
Total         10                            1.0
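A short sketch of the same calculation from the frequency distribution follows. It assumes each squared deviation is weighted by the relative frequency g_j = f_j / N; the names `values`, `freqs` and `ratios` are illustrative.

```python
import math

# Distinct ages x_j and their frequencies f_j from the table.
values = [20, 25, 32, 33]
freqs = [3, 2, 1, 4]

n = sum(freqs)                   # total number of individuals
ratios = [f / n for f in freqs]  # relative frequencies g_j

# Weighted mean and weighted variance using the ratios g_j.
mean = sum(g * x for g, x in zip(ratios, values))
variance = sum(g * (x - mean) ** 2 for g, x in zip(ratios, values))
std_dev = math.sqrt(variance)

print(f"mean={mean:.2f} variance={variance:.2f} sd={std_dev:.2f}")
```

The result agrees with the calculation from the ten individual values, because grouping identical values changes only the bookkeeping, not the sums.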
Coefficient of variation
$V = \frac{\sigma}{\bar{x}}$
Interpretation: how much the individual values deviate from the mean on average, expressed as a percentage of the mean.
Advantages?
Example
Ages (years): 20, 20, 20, 25, 25, 32, 33, 33, 33, 33
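For this example the coefficient of variation can be sketched as below, assuming it is the population standard deviation divided by the mean. The variable name `cv` is illustrative.

```python
import statistics

ages = [20, 20, 20, 25, 25, 32, 33, 33, 33, 33]

mean = statistics.mean(ages)       # arithmetic mean
std_dev = statistics.pstdev(ages)  # population standard deviation (denominator N)
cv = std_dev / mean                # coefficient of variation as a ratio

print(f"coefficient of variation: {cv * 100:.1f}%")
```

The ages deviate from the mean by roughly 20.7% of the mean on average. Because the result has no unit, it can be compared across data sets measured on different scales.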
Calculation of dispersion from intervals
Start from the interval midpoints: $x_i = (x_{i0} + x_{i1})/2$

Age (years)   Number of flats (f_i)   Midpoint (x_i)
-11           273 730                 6.0
12-21         685 541                 17.0
22-31         861 297                 27.0
32-41         550 961                 37.0
42-56         424 911                 49.5
57-81         442 392                 69.5
82-           484 677                 94.5
Total         3 723 509               -
x_i     f_i          f_i (x_i - \bar{x})^2   f_i x_i^2
6.0     273 730      344 968 233             9 854 280
17.0    685 541      411 495 985             198 121 349
27.0    861 297      181 087 694             627 885 513
37.0    550 961      11 156 960              754 265 609
49.5    424 911      27 194 304              1 041 138 178
69.5    442 392      346 835 328             2 136 863 958
94.5    484 677      1 361 457 693           4 328 286 779
Total   3 723 509    2 684 196 197           9 096 415 666
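The totals in the table can be turned into a mean, variance and standard deviation as in the sketch below. It assumes every flat in an interval sits at the interval midpoint and uses the population denominator (total number of flats). The table's squared-deviation column appears to use the mean rounded to 41.5, so the sum computed here may differ from the printed total by a few units.

```python
import math

# Interval midpoints x_i and numbers of flats f_i, copied from the table.
midpoints = [6.0, 17.0, 27.0, 37.0, 49.5, 69.5, 94.5]
flats = [273_730, 685_541, 861_297, 550_961, 424_911, 442_392, 484_677]

n = sum(flats)                                           # total number of flats
mean = sum(f * x for f, x in zip(flats, midpoints)) / n  # weighted mean age

# Sum of f_i * (x_i - mean)^2, the numerator of the variance.
tss = sum(f * (x - mean) ** 2 for f, x in zip(flats, midpoints))
variance = tss / n
std_dev = math.sqrt(variance)

# Equivalent shortcut using the f_i * x_i^2 column: sum(f x^2) / N - mean^2.
sum_fx2 = sum(f * x ** 2 for f, x in zip(flats, midpoints))
variance_shortcut = sum_fx2 / n - mean ** 2

cv = std_dev / mean  # coefficient of variation

print(f"mean={mean:.2f} variance={variance:.2f} sd={std_dev:.2f} cv={cv:.3f}")
```

Both routes give the same variance, which is presumably why the table carries the f_i x_i^2 column alongside the squared deviations.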
Properties of Dispersion
What happens if we add the same number to all of our values, or multiply all of them by the same number?

Measure of dispersion      Adding the same number A to all values   Multiplying all values by the same number A (≠ 0)
Range                      Remains unchanged                        Is multiplied by $|A|$
Variance                   Remains unchanged                        Is multiplied by $A^2$
Standard deviation         Remains unchanged                        Is multiplied by $|A|$
Coefficient of variation   Changes (the mean shifts by A)           Remains unchanged (for A > 0)
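These properties can be checked numerically with a small sketch such as the one below, which reuses the ages example. The helper `summary`, the shift of 10 and the factor `A = -2` are illustrative choices; a negative factor is used to show why the absolute value matters.

```python
import math
import statistics


def summary(values):
    """Range, population variance, standard deviation and coefficient of variation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return {
        "range": max(values) - min(values),
        "variance": statistics.pvariance(values),
        "std_dev": sd,
        "cv": sd / mean,
    }


ages = [20, 20, 20, 25, 25, 32, 33, 33, 33, 33]
shift = 10  # number added to every value
A = -2      # number every value is multiplied by

base = summary(ages)
shifted = summary([x + shift for x in ages])
scaled = summary([A * x for x in ages])

for label, result in (("base", base), ("shifted", shifted), ("scaled", scaled)):
    print(label, {k: round(v, 3) for k, v in result.items()})
```

Adding a constant leaves the range, variance and standard deviation untouched but changes the coefficient of variation, because only the mean moves. Multiplying by A scales the range and standard deviation by |A| and the variance by A squared; the coefficient of variation keeps its magnitude, and with a negative A its sign flips because the mean becomes negative.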