SPREAD MEASURES
Prof. Dr. Hamit ACEMOĞLU
The Aim
By the end of this lecture, students will be aware of spread measures and be able to calculate the extent of spread using SPSS.
The Goals
-To be able to list the spread measures
-To be able to calculate, using SPSS, the interval (range), percentiles, variance and standard deviation
-To state the variance and standard deviation formulas
-To explain the variation within individuals and between individuals
Spread measures
-Interval (range)
-Percentile range
-Variance
-Standard deviation
-Variations within and between individuals
If we specify two properties of a numerical data set, its central tendency and its spread, we summarise the structure of our data well enough. Measures of central tendency were covered in a previous lecture. It is now time to discuss spread measures.
T.C. İstanbul Valiliği, İstanbul Halk Sağlığı Müdürlüğü (Istanbul Public Health Directorate): Patient Laboratory Results
Test | Result | Unit | Reference range
WBC | 5.8 | K/µ | 3.0 - 12.0
NE% | 51.7 | % | 35 - 80
NE# | 3 | | 1.1 - 9.6
LYM% | 38.9 | | 15 - 50
LYM# | 2.3 | | 0.5 - 6.0
MONO% | 7.4 | | 2 - 12
MONO# | 0.4 | | 0.1 - 1.4
EOS% | 1.3 | | 0 - 6
EOS# | 0.1 | | 0 - 0.4
BASO% | 0.7 | | 0 - 2
BASO# | | | 0.0 - 0.3
RBC | 3.8 | M/uL | 3.2 - 6.0
HGB | 11.9 | g/dL | 10 - 18
HCT | 35 | | 30 - 55
MCV | 91.9 | fL | 78 - 105
MCH | 31.3 | pg | 25 - 33
MCHC | 34 | | 30 - 36
RDW | 13 | | 9 - 18
PLT | 316 | | 150 - 500
MPV | 7.8 | | 0 - 15
Iron (serum) | 89 | ug/dL | 37 - 145
Iron binding capacity | 233 | | 155 - 355
Ferritin | 28.9 | ng/mL | 11.0 - 306.8
Vitamin B12 | 240 | pg/mL | 126.5 - 505
LDL cholesterol | ▲178 | mg/dL | 0 - 130
Cholesterol | ▲256 | | 0 - 200
HDL cholesterol | 53 | | 35 - 70
Triglyceride | 125 | | 0 - 150
Alanine aminotransferase (ALT) | 14 | U/L | 0 - 35
Aspartate transaminase (AST) | 19 | IU/L |
Bilirubin (total) - Indirect bilirubin | 0.6 | | 0.1 - 1.0
Bilirubin (direct) - Direct bilirubin | | | 0.0 - 0.2
Bilirubin (total) - Total bilirubin | | | 0.3 - 1.2
Gamma-glutamyl transferase (GGT) | 20 | | 0 - 38
Lactate dehydrogenase (LDH) | 193 | | 0 - 247
Glucose | 87 | | 74 - 106
Glycosylated hemoglobin (Hb A1C) | 5.5 | | 4.0 - 6.0
Free T3 | 3.1 | | 2.1 - 3.8
Interval (range)
The difference between the largest and the smallest value in our data is called the range.
R = Xmax - Xmin
Usually the smallest (min.) and largest (max.) values are reported instead of the range itself. Note that the range is not a reliable measure of spread when the data contain outliers.
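As a minimal sketch of the range calculation, the snippet below uses the eight height values from the quartile example in this lecture. The extra value 250 is a hypothetical outlier added only to illustrate how strongly a single extreme observation changes the range.

```python
# Heights taken from the quartile example in this lecture.
heights = [145, 148, 154, 160, 166, 170, 176, 182]

smallest = min(heights)
largest = max(heights)
data_range = largest - smallest  # R = Xmax - Xmin
print(f"min={smallest} max={largest} range={data_range}")  # min=145 max=182 range=37

# Hypothetical outlier (not in the lecture data): one extreme value
# is enough to change the range substantially.
with_outlier = heights + [250]
range_with_outlier = max(with_outlier) - min(with_outlier)
print(f"range with outlier={range_with_outlier}")  # range with outlier=105
```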
Percentile ranges
When we sort our data from smallest to largest,
-the value below which 1% of the data lie is called the 1st percentile,
-the value below which 50% of the data lie is called the 50th percentile.
Position of the 1st quartile: (n+1)/4
Position of the 3rd quartile: 3 x position of the 1st quartile = 3(n+1)/4
No | Height
1 | 145
2 | 148
3 | 154
4 | 160
5 | 166
6 | 170
7 | 176
8 | 182
-Position of the 1st quartile in this data set: (8+1)/4 = 2.25
-1st quartile = 148 + (154-148) x 0.25 = 149.5
-Position of the 3rd quartile = 3 x position of the 1st quartile = 3 x 2.25 = 6.75
-3rd quartile = 170 + (176-170) x 0.75 = 174.5
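The position rule used in this example can be sketched in a few lines of Python. The function name `quartile` and its handling of positions that fall outside the data are assumptions made for illustration; they are not part of the lecture.

```python
def quartile(sorted_values, k):
    """Return the k-th quartile (k = 1, 2 or 3) using the (n+1)/4 position rule.

    `sorted_values` must already be sorted from smallest to largest.
    """
    n = len(sorted_values)
    position = k * (n + 1) / 4      # 1-based position, e.g. 2.25
    lower = int(position)           # whole part: index of the lower neighbour
    fraction = position - lower     # fractional part used for interpolation
    # Assumed edge handling: clamp to the ends for very small samples.
    if lower < 1:
        return sorted_values[0]
    if lower >= n:
        return sorted_values[-1]
    low_value = sorted_values[lower - 1]
    high_value = sorted_values[lower]
    return low_value + (high_value - low_value) * fraction


heights = [145, 148, 154, 160, 166, 170, 176, 182]
q1 = quartile(heights, 1)
q3 = quartile(heights, 3)
print(q1, q3)  # 149.5 174.5
```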
The value that divides the sorted data exactly in half (the 50th percentile) is called the "median". The difference between the 25th and 75th percentiles is called the interquartile range. When the data are sorted, the interquartile range covers the middle 50% of the observations.
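For a quick check outside SPSS, Python's standard library offers `statistics.quantiles`. Its "exclusive" method places the cut points at multiples of (n+1)/4, which matches the position rule used in this lecture. The sketch below reuses the height data from the quartile example.

```python
import statistics

# Heights from the quartile example in this lecture.
heights = [145, 148, 154, 160, 166, 170, 176, 182]

# n=4 asks for the three cut points that split the data into quarters.
q1, median, q3 = statistics.quantiles(heights, n=4, method="exclusive")
iqr = q3 - q1  # interquartile range: spread of the middle 50% of the data

print(q1, median, q3, iqr)  # 149.5 163.0 174.5 25.0
```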
When the data come from a sample large enough to represent the population, the interval that remains after excluding the 2.5% of values at each end is called the reference interval, reference range or normal range. For measurements such as laboratory tests, we compare an individual's result with this range to decide whether it is normal or not.
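The comparison against a reference range can be sketched as below. The three results and their ranges are taken from the sample laboratory report shown earlier in this lecture. The function name `classify` and its three labels are assumptions made for illustration.

```python
def classify(value, low, high):
    """Compare a result with its reference range (limits count as normal)."""
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "within"


# (result, lower limit, upper limit) from the sample laboratory report.
results = {
    "HGB": (11.9, 10, 18),
    "Glucose": (87, 74, 106),
    "LDL cholesterol": (178, 0, 130),
}

labels = {test: classify(*values) for test, values in results.items()}
for test, label in labels.items():
    print(f"{test}: {label} reference range")
```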
Variance
One way of measuring the spread of the data is to look at how far each observation deviates from the arithmetic mean. We cannot simply average these deviations, because the positive and negative deviations cancel each other out. Instead, we square the distance of each value from the arithmetic mean, sum these squares and divide by the sample size minus one (n-1). This is called the variance. It is denoted s².
When calculating the variance, unlike the arithmetic mean, we divide by (n-1). The reason is that we work with a particular sample, not the entire population. It can be shown theoretically that dividing by (n-1) gives a value closer to the population variance.
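Written out as formulas, the description above corresponds to the standard sample variance and standard deviation. The symbols $x_i$ (the i-th observation), $\bar{x}$ (the arithmetic mean) and $n$ (the sample size) are notation assumed here. The second line is the algebraically equivalent shortcut form that works directly from the sums of the values and of their squares.

```latex
\[
  s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1},
  \qquad
  s = \sqrt{s^2}
\]
\[
  s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}
\]
```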
Example
No | Age (X) | X²
1 | 14 | 196
2 | 25 | 625
3 | 38 | 1444
4 | 41 | 1681
5 | 22 | 484
6 | |
7 | 26 | 676
8 | |
9 | 33 | 1089
10 | 35 | 1225
Sum | 300 | 9726
s² = (9726 - 300²/10) / 9 = 80.67
s = 8.98
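The shortcut calculation in this example needs only three numbers: the sample size, the sum of the values and the sum of their squares. A minimal sketch using the totals from the example (the variable names are chosen here for illustration):

```python
import math

n = 10          # sample size
sum_x = 300     # sum of the ages
sum_x2 = 9726   # sum of the squared ages

# Shortcut form: (sum of squares - (sum)^2 / n) / (n - 1)
variance = (sum_x2 - sum_x**2 / n) / (n - 1)
std_dev = math.sqrt(variance)

print(f"variance={variance:.2f} sd={std_dev:.2f}")  # variance=80.67 sd=8.98
```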
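No | Age (Xi) | Mean | Xi - Mean | (Xi - Mean)²
1 | 14 | 30 | -16 | 256
2 | 25 | 30 | -5 | 25
3 | 38 | 30 | 8 | 64
4 | 41 | 30 | 11 | 121
5 | 22 | 30 | -8 | 64
6 | | 30 | |
7 | 26 | 30 | -4 | 16
8 | | 30 | |
9 | 33 | 30 | 3 | 9
10 | 35 | 30 | 5 | 25
Sum | 300 | | | 726
n - 1 = 10 - 1 = 9
Variance = 726/9 = 80.67
Standard deviation = 8.98
The variance and standard deviation can be calculated as shown in the table.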
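The deviation-based calculation can be sketched as follows. Note the assumption: two of the ten ages are not legible in the table, and the values 25 and 41 are assumed here for them because they are the only pair consistent with the stated totals (sum of ages 300, sum of squared deviations 726). Their position in the list is a guess and does not affect the result.

```python
import math
import statistics

# ASSUMPTION: the 6th and 8th ages (25 and 41) are inferred from the totals,
# not read from the table.
ages = [14, 25, 38, 41, 22, 25, 26, 41, 33, 35]

n = len(ages)
mean = sum(ages) / n                              # 30.0
squared_deviations = [(x - mean) ** 2 for x in ages]
sum_of_squares = sum(squared_deviations)          # 726.0

variance = sum_of_squares / (n - 1)               # divide by n-1, not n
std_dev = math.sqrt(variance)
print(f"mean={mean} SS={sum_of_squares} variance={variance:.2f} sd={std_dev:.2f}")

# Cross-check against the standard library, which also divides by n-1.
assert math.isclose(variance, statistics.variance(ages))
assert math.isclose(std_dev, statistics.stdev(ages))
```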
Standard deviation
The standard deviation is the square root of the variance. Dividing the standard deviation by the arithmetic mean and expressing the result as a percentage gives the coefficient of variation. The advantage of the coefficient of variation is that it does not depend on the unit of the variable (it is expressed as %). However, it is not preferred because of its theoretical disadvantages.
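As a small illustration, the coefficient of variation for the age example in this lecture can be computed from its mean (300/10 = 30) and its sum of squared deviations (726). The variable names are chosen here for illustration.

```python
import math

mean_age = 300 / 10                  # arithmetic mean from the age example
variance = 726 / 9                   # sum of squared deviations / (n - 1)
std_dev = math.sqrt(variance)        # about 8.98

# Coefficient of variation: standard deviation as a percentage of the mean.
cv_percent = std_dev / mean_age * 100
print(f"CV = {cv_percent:.1f}%")  # CV = 29.9%
```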
Variations within and between individuals
We can get different results if we make multiple measurements on the same individual (intra-individual differences). These differences may arise because the individual does not give the same response every time, or because of measurement error. However, intra-individual differences are usually smaller than inter-individual differences. These differences are important when designing a study.
Spread criteria | Positive properties | Negative properties
Interval (range) | Easily determined | Uses only two observations; affected by outliers; tends to increase as the sample size increases
Interval based on percentiles | Generally not affected by outliers; independent of the sample size; suitable for skewed data | Cumbersome to calculate; cannot be calculated for small samples; not defined algebraically
Variance | Considers every observation; defined algebraically | The measurement unit is the square of the unit of the raw data; affected by outliers; not suitable for skewed data
Standard deviation | Has the same advantages as the variance; the measurement unit is the same as that of the raw data; easily interpreted | Not suitable for skewed data
Exercises
Sample data: 3, 5, 8, 9, 11, 13, 23
-Range of the sample data?
-1st and 3rd quartiles?
-Variance?
-Standard deviation?
Spread measures: Summary
-Interval (range)
-Percentile range
-Variance
-Standard deviation
-Variations within and between individuals