Theme 4. Measures of individual position
1. Introduction.
2. Quantiles: Percentile Ranks, Percentiles, Deciles and Quartiles.
3. Standard Scores: Introduction, calculation and main features.
4. Derived scales.
(Appendix: Non-linear transformations)
Introduction
Here we will study statistical indices that describe characteristic points of a distribution. In particular, these indices provide information about the position of specific values within the data set.
-- A person with a very high percentile on an IQ test is well above most people in intelligence.
-- If we know that a person has a high positive standard score on an intelligence test, this tells us that the person's intelligence is high relative to the group.
Quantiles: Percentile Ranks, Percentiles, Deciles and Quartiles
Percentiles divide a (sorted) data distribution into 100 parts, each containing 1/100 of the scores. Percentile 60, for example, is the score that leaves 60% of the data below it; Percentile 15 leaves 15% of the data below it.
There are other quantiles. One is the median, which divides the distribution into two parts (Median = Percentile 50). Others are the deciles (Decile 1 = Percentile 10) and the quartiles (Quartile 1 = Percentile 25, Quartile 2 = Median, Quartile 3 = Percentile 75).
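The relationships between quartiles, deciles, percentiles and the median can be checked with a small sketch. The scores below are hypothetical, and `statistics.quantiles` uses one particular interpolation convention (its default "exclusive" method), which may differ slightly from the formula used in class.

```python
import statistics

# Hypothetical test scores (illustrative data, not taken from the slides).
scores = [12, 15, 17, 18, 20, 21, 23, 25, 28, 30, 31, 35]

# quantiles(data, n=k) returns the k - 1 cut points that split the data into k parts.
quartiles = statistics.quantiles(scores, n=4)       # Q1, Q2, Q3
deciles = statistics.quantiles(scores, n=10)        # D1 ... D9
percentiles = statistics.quantiles(scores, n=100)   # P1 ... P99

median = statistics.median(scores)

print("Q1, Q2, Q3:", quartiles)
print("Median:", median, "| P50:", percentiles[49])
print("D1:", deciles[0], "| P10:", percentiles[9])
```

With the same convention applied throughout, Quartile 2 and Percentile 50 coincide with the median, and Decile 1 coincides with Percentile 10.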
Percentiles
Computation of percentiles
Percentile k:
Median (Percentile 50): order position =
Median (Percentile 50): order position =
Percentile Rank (PR)
It is the inverse of the percentile: given a score, it returns the percentage of scores below it. It can be used, for example, to indicate the position of a score on an aptitude test. Suppose a score has a percentile rank of 78. That means that 78% of the other people have a lower score.
Calculation (ungrouped data): It can easily be computed with Excel (see next slide).
Example in Microsoft Excel
The values are given as proportions, not percentages.
Function RANGO.PERCENTIL(matriz;x) (PERCENTRANK(array, x) in English-language versions of Excel)
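Outside Excel, the percentile rank for ungrouped data can be sketched in a few lines. The function `percentile_rank` and the scores are hypothetical, and the definition used (percentage of scores strictly below x) is only one of several conventions, so spreadsheet functions may return slightly different values.

```python
def percentile_rank(scores, x):
    """Percentage of values in `scores` that are strictly lower than x.

    Assumption: ties and interpolation are ignored; other conventions exist.
    """
    below = sum(1 for s in scores if s < x)
    return 100 * below / len(scores)


# Hypothetical aptitude-test scores.
scores = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

# Six of the ten scores are below 70, so the percentile rank is 60.0.
print(percentile_rank(scores, 70))
```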
Linear transformations: standardized scores
They have the form y = a + bx. This is how we can convert degrees Celsius to degrees Fahrenheit (F = 32 + 1.8C).
IMPORTANT: linear transformations DO NOT change the shape of the distribution. (They can change the mean and SD, but not the shape of the distribution.) The skewness and kurtosis are not altered.
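A quick numerical check of this claim, using the Celsius to Fahrenheit conversion (a = 32, b = 1.8). The temperatures are hypothetical, and `skewness` is a helper written for this sketch that uses the population SD; only skewness is checked here, not kurtosis.

```python
import statistics


def skewness(data):
    """Population skewness: the mean of the cubed standardized deviations."""
    m = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(((x - m) / sd) ** 3 for x in data) / len(data)


# Hypothetical temperatures in degrees Celsius (note the long right tail).
celsius = [12.0, 14.5, 15.0, 16.0, 18.5, 21.0, 30.0]

# Linear transformation y = a + b*x with a = 32 and b = 1.8.
fahrenheit = [32 + 1.8 * c for c in celsius]

print("Mean:", statistics.fmean(celsius), "->", statistics.fmean(fahrenheit))
print("SD:", statistics.pstdev(celsius), "->", statistics.pstdev(fahrenheit))
print("Skewness:", skewness(celsius), "->", skewness(fahrenheit))
```

The mean becomes a + b times the old mean and the SD is multiplied by b, while the skewness stays the same (up to floating-point rounding).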
Standardized (z) scores
They indicate the number of SDs that separate an observation from the group mean.
The mean of standard scores is 0.
The variance (and SD) is 1.
Important: z-scores are abstract (unit-free); this allows the comparison of variables with different units.
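These properties can be verified with a minimal sketch. The heights are hypothetical, `z_scores` is a helper written for this example, and the population SD (dividing by n) is assumed.

```python
import statistics


def z_scores(data):
    """Standardize: (x - mean) / SD, using the population SD (an assumption)."""
    m = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return [(x - m) / sd for x in data]


# Hypothetical heights, expressed in two different units.
heights_cm = [150, 160, 165, 170, 175, 180, 190]
heights_m = [h / 100 for h in heights_cm]

z_cm = z_scores(heights_cm)
z_m = z_scores(heights_m)

print("Mean of z:", statistics.fmean(z_cm))
print("SD of z:", statistics.pstdev(z_cm))
print("z (cm):", [round(z, 3) for z in z_cm])
print("z (m): ", [round(z, 3) for z in z_m])
```

The z-scores have mean 0 and SD 1 (up to rounding), and they are the same whether the heights are recorded in centimetres or metres.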
Standardized scores (example)
Students A and B have taken a test, and we know that the standard score of A is 1 and the standard score of B is 0. Which one got the better grade? Obviously A: her/his score is 1 SD above the group mean, whereas B's score is exactly the group mean.
Standardized scores and outliers
A common criterion is to consider values with |z| > 3 atypical. (This criterion does not necessarily coincide with the atypical scores flagged in box-and-whisker plots.)
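The outlier criterion can be sketched as follows. The data are hypothetical, the cut-off of 3 is applied to the absolute value of z, and the population SD is assumed.

```python
import statistics

# Hypothetical scores: 19 values close to 50 and one extreme value.
scores = [47, 48, 49, 50, 51, 52, 53, 48, 49, 50,
          51, 52, 49, 50, 51, 50, 50, 49, 51, 95]

mean = statistics.fmean(scores)
sd = statistics.pstdev(scores)  # population SD (an assumption)

# Keep the scores whose standardized value exceeds 3 in absolute value.
outliers = [x for x in scores if abs((x - mean) / sd) > 3]

print("Atypical scores:", outliers)
```

Note that the extreme value itself inflates the mean and the SD, which is one reason this criterion and the box-plot criterion need not agree.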
4.4 Transformations (on standardized scores)
A small drawback of standardized scores is that they involve very small values (usually decimals) and negative values. Therefore, we can apply linear transformations to standard scores. The examples we are going to see are T scores (mean 50 and SD 10) and IQ scales (mean 100 and SD 15).
T-scores are a transformation of the standard score z, which has been shifted and scaled to have a mean of 50 and a standard deviation of 10: T = 10z + 50
Generically: y = a·z + b
Observe that the new mean is given by b, and the SD is given by the absolute value of a. In the case of T, a = 10 and b = 50.
IQ scales
For IQ scales: IQ = 15z + 100
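As a small illustration, the derived scales can be computed from a z-score with the general form implied by a = 10 and b = 50 for T (new score = a times z, plus b). The function names are hypothetical, and the IQ constants are taken from the mean of 100 and SD of 15 stated earlier.

```python
def rescale(z, a, b):
    """Linear transformation of a z-score: the new SD is |a|, the new mean is b."""
    return a * z + b


def t_score(z):
    """T scale: mean 50, SD 10."""
    return rescale(z, 10, 50)


def iq_score(z):
    """IQ scale: mean 100, SD 15."""
    return rescale(z, 15, 100)


for z in (-1.5, 0, 1, 2):
    print(f"z = {z:>4}: T = {t_score(z)}, IQ = {iq_score(z)}")
```

A person at the group mean (z = 0) gets T = 50 and IQ = 100; a person 1 SD above the mean gets T = 60 and IQ = 115.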
Appendix: Non-linear transformations
Why apply (nonlinear) transformations to the data?
- To make distributions more symmetrical.
- To obtain a more linear relationship between variables (when there is more than one variable; see Themes 5 and 6).
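How to correct positive/negative asymmetry?
To correct negative asymmetry: use powers p > 1 (such as the square of x).
To correct positive asymmetry: use powers p < 1 (such as the square root of x, or log x).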
Example. RTs from a participant
Observe not only that there are some atypical scores on both sides, but also that there is a clear positive asymmetry.
Transformed data (square root)
Now the distribution is more symmetrical. Observe that there is still some positive asymmetry. With the log, we can further reduce positive skewness.
Log-transformed data
Observe that the positive asymmetry has disappeared (if anything, there is some negative asymmetry caused by a few atypical scores).
This family of transformations has important properties:
1. They preserve the order of values; i.e., the highest values on the original scale remain the highest values on the transformed scale.
2. They modify the distance between values. With powers p < 1 (root x or log x), the data are compressed at the top of the distribution relative to the lower values; with powers p > 1 (such as the square of x), we have the opposite effect.
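Both properties can be illustrated with a short sketch. The RTs below are hypothetical (positive values with a long right tail), and `skewness` is a helper written for this example that uses the population SD.

```python
import math
import statistics


def skewness(data):
    """Population skewness: the mean of the cubed standardized deviations."""
    m = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(((x - m) / sd) ** 3 for x in data) / len(data)


# Hypothetical RTs in ms, already sorted, with a long right tail.
rts = [320, 350, 360, 380, 400, 410, 430, 450, 480, 520, 600, 750, 1100]

squared = [x ** 2 for x in rts]          # power p > 1
sqrt_rts = [math.sqrt(x) for x in rts]   # power p < 1
log_rts = [math.log(x) for x in rts]     # log

# Property 1: the order of the values is preserved.
print("Order preserved:", sqrt_rts == sorted(sqrt_rts), log_rts == sorted(log_rts))

# Property 2: distances change, and so does the skewness.
for name, data in [("x squared", squared), ("x", rts),
                   ("sqrt(x)", sqrt_rts), ("log(x)", log_rts)]:
    print(f"{name:>10}: skewness = {skewness(data):.3f}")
```

For these data the skewness decreases from the square, to the raw values, to the square root, to the log, which is the compression at the top of the distribution described in property 2.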
In short, these nonlinear transformations make the transformed variable less asymmetrical. Why is that important?
- Distributions that show a clear asymmetry are difficult to study.
- Outliers may now be a bit closer to the bulk of the data.
- Statistical methods often employ the arithmetic mean, but the mean of an asymmetric distribution is not a perfect index of central tendency.