Quantitative Methods in HPELS 440:210 Variability
Agenda Introduction Frequency Range Interquartile range Variance/SD of population Variance/SD of sample Selection
Introduction Statistics of variability: Describe how values are spread out Describe how values cluster around the middle Several statistics Appropriate measurement depends on: Scale of measurement Distribution
Basic Concepts Measures of variability: Frequency Range Interquartile range Variance and standard deviation Each statistic has its advantages and disadvantages
Frequency Definition: The number/count of times a given value or category of a variable occurs Scale of measurement: Appropriate for all scales Only statistic appropriate for nominal data Statistical notation: f
Frequency Advantages: Ease of determination Only statistic appropriate for nominal data Disadvantages: Terminal statistic
Calculation of the Frequency Instat Statistics tab Summary tab Group tab Select group Select column(s) of interest OK
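Outside Instat, the same count can be sketched in a few lines of Python. This is a minimal illustration, not the course's method: the `activities` list is hypothetical nominal data invented for the example, and `Counter` comes from Python's standard library.

```python
from collections import Counter

# Hypothetical nominal data: preferred activity reported by eight students.
activities = ["run", "swim", "run", "bike", "swim", "run", "bike", "run"]

# Counter tallies how many times each category occurs (the frequency, f).
frequencies = Counter(activities)

# List the categories from most to least frequent.
for category, f in frequencies.most_common():
    print(f"{category}: f = {f}")
```

Because the categories have no numeric meaning, a count per category is the only summary that applies here, which matches the point that frequency is the only statistic appropriate for nominal data.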
Range Definition: The difference between the highest and lowest values in a distribution Scale of measurement: Ordinal, interval or ratio
Range Advantages: Ease of determination Disadvantages: Terminal statistic Disregards all data except extreme scores
Calculation of the Range Instat Statistics tab Summary tab Describe tab Calculates range automatically OK
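As a rough sketch of the definition, the range can be computed by hand in Python. The `scores` values are hypothetical, and the second list adds one extreme score to show why the range disregards everything except the extremes.

```python
# Hypothetical interval-scale scores.
scores = [12, 7, 3, 15, 9]

# Range = highest value minus lowest value.
score_range = max(scores) - min(scores)
print(score_range)  # 15 - 3 = 12

# Adding a single extreme score changes the range completely,
# even though the other five scores are untouched.
scores_with_extreme = scores + [40]
range_with_extreme = max(scores_with_extreme) - min(scores_with_extreme)
print(range_with_extreme)  # 40 - 3 = 37
```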
Interquartile Range Definition: The difference between the 1st quartile and the 3rd quartile Scale of measurement: Ordinal, interval or ratio Example: Figure 4.3, p 107
Interquartile Range Advantages: Ease of determination More stable than range Disadvantages: Disregards all values except 1st and 3rd quartiles
Calculation of the Interquartile Range Instat Statistics tab Summary tab Describe tab Choose additional statistics Choose interquartile range OK
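A minimal Python sketch of the same calculation follows. The data are hypothetical, and the quartile convention is an assumption: `statistics.quantiles` with `method="inclusive"` is only one of several accepted ways to locate quartiles, so Instat or the textbook may give slightly different values for the same data.

```python
import statistics

# Hypothetical scores, sorted for readability.
scores = [2, 4, 4, 5, 7, 8, 9, 11, 12]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
# ASSUMPTION: the "inclusive" method; other conventions exist.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

# Interquartile range = 3rd quartile minus 1st quartile.
iqr = q3 - q1
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```

With these nine scores the inclusive method places Q1 at 4 and Q3 at 9, so the interquartile range is 5.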
Variance/SD Population Variance: The average squared distance/deviation of all raw scores from the mean The standard deviation squared Statistical notation: σ² Scale of measurement: Interval or ratio Advantages: Considers all data Not a terminal statistic Disadvantages: Not appropriate for nominal or ordinal data Sensitive to extreme outliers
Variance/SD Population Standard deviation: The average distance/deviation of all raw scores from the mean The square root of the variance Statistical notation: σ Scale of measurement: Interval or ratio Advantages and disadvantages: Similar to variance
Calculation of the Variance Population Why square the deviations? If all deviations from the mean are summed, the answer always equals 0
Calculation of the Variance Population Example: 1, 2, 3, 4, 5 Mean = 3 Deviations: 1 – 3 = -2, 2 – 3 = -1, 3 – 3 = 0, 4 – 3 = 1, 5 – 3 = 2 Sum of all deviations = 0 Squared deviations: (1 – 3)² = (-2)² = 4, (2 – 3)² = (-1)² = 1, (3 – 3)² = 0² = 0, (4 – 3)² = 1² = 1, (5 – 3)² = 2² = 4 Sum of all squared deviations = 10 Variance = Average squared deviation of all points = 10/5 = 2
Calculation of the Variance Population Step 1: Calculate deviation of each point from mean Step 2: Square each deviation Step 3: Sum all squared deviations Step 4: Divide sum of squared deviations by N
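The four steps can be sketched in Python using the same scores as the worked example (1, 2, 3, 4, 5). The variable names are illustrative only; this is a sketch of the definitional procedure, not a replacement for Instat.

```python
# Population scores from the worked example.
scores = [1, 2, 3, 4, 5]
N = len(scores)
mean = sum(scores) / N  # 3.0

# Step 1: deviation of each score from the mean.
deviations = [x - mean for x in scores]

# The raw deviations always sum to 0, which is why they must be squared.
print(sum(deviations))  # 0.0

# Step 2: square each deviation.
squared_deviations = [d ** 2 for d in deviations]

# Step 3: sum the squared deviations (SS).
ss = sum(squared_deviations)

# Step 4: divide SS by N to get the population variance.
variance = ss / N
print(ss, variance)  # 10.0 2.0
```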
Calculation of the Variance Population σ² = SS / N, where SS = Σ(X – μ)² (definitional formula; Example 4.3, p 112) or SS = ΣX² – (ΣX)²/N (computational formula; Example 4.4, p 112)
Computational formula Step 4: Divide by N
Computation of the Standard Deviation Population Take the square root of the variance
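As a check, the sketch below computes SS two ways for the scores 1, 2, 3, 4, 5: with the definitional formula, SS = Σ(X – μ)², and with the standard computational formula, SS = ΣX² – (ΣX)²/N. It then divides by N and takes the square root. The variable names are illustrative assumptions.

```python
import math

# Population scores from the worked example.
scores = [1, 2, 3, 4, 5]
N = len(scores)
mu = sum(scores) / N

# Definitional formula: sum of squared deviations from the mean.
ss_definitional = sum((x - mu) ** 2 for x in scores)

# Computational formula: sum of X squared minus (sum of X) squared over N.
sum_x = sum(scores)                        # 15
sum_x_squared = sum(x ** 2 for x in scores)  # 55
ss_computational = sum_x_squared - sum_x ** 2 / N  # 55 - 225/5

# Both routes give the same SS.
print(ss_definitional, ss_computational)  # 10.0 10.0

# Population variance and standard deviation.
sigma_squared = ss_computational / N
sigma = math.sqrt(sigma_squared)
print(sigma_squared, round(sigma, 4))  # 2.0 1.4142
```

The computational formula avoids calculating each deviation, which is convenient when the mean is not a whole number.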
Variance/SD Sample Process is similar with two distinctions: Statistical notation Formula
Statistical Notation Distinctions Population vs. Sample Number of scores: N (population) vs. n (sample)
Formula Distinctions Population vs. Sample s² = SS / (n – 1), where SS = Σ(X – M)² (definitional formula) or SS = ΣX² – (ΣX)²/n (computational formula) Why n – 1?
N vs. (n – 1) First Reason General underestimation of population variance A sample variance (s²) computed by dividing by n tends to underestimate the population variance (σ²) Dividing by (n – 1) instead inflates s² to correct for this Example 4.8, p 121
Actual population σ² = 14 Average biased s² = 63/9 = 7 Average unbiased s² = 126/9 = 14
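The figures above can be reproduced with a short enumeration. The population values 0, 3, 9 are an assumption: they are not given on the slide and were chosen only because they yield a population variance of 14 and the totals 63 and 126, so the textbook example may use different numbers. The sketch draws every possible sample of n = 2 with replacement and averages the two versions of the sample variance.

```python
from itertools import product

# ASSUMED population (not stated on the slide); it gives sigma^2 = 14.
population = [0, 3, 9]
N = len(population)
mu = sum(population) / N
sigma_squared = sum((x - mu) ** 2 for x in population) / N


def sum_of_squares(sample):
    """SS: sum of squared deviations from the sample mean M."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)


# Every possible sample of n = 2 drawn with replacement: 3 * 3 = 9 samples.
n = 2
samples = list(product(population, repeat=n))

# Biased version divides SS by n; unbiased version divides SS by (n - 1).
biased = [sum_of_squares(s) / n for s in samples]
unbiased = [sum_of_squares(s) / (n - 1) for s in samples]

mean_biased = sum(biased) / len(samples)      # 63 / 9
mean_unbiased = sum(unbiased) / len(samples)  # 126 / 9

print(sigma_squared, mean_biased, mean_unbiased)  # 14.0 7.0 14.0
```

Averaged over all nine samples, dividing by n gives 7, half the true variance, while dividing by (n – 1) recovers 14.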
N vs. (n – 1) Second Reason Degrees of freedom (df) df = number of scores “free” to vary Example: Assume n = 3, with M = 5 The sum of values = 15 (n*M) Assume two of the values = 8, 3 The third value has to be 4 Two values are “free” to vary df = (n – 1) = (3 – 1) = 2
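The degrees-of-freedom example can be checked with a few lines of Python, using the same numbers (n = 3, M = 5, two known values of 8 and 3). The variable names are illustrative.

```python
n = 3  # sample size
M = 5  # sample mean

# Once M is fixed, the scores must sum to n * M.
required_total = n * M  # 15

# Two scores are free to take any value.
free_values = [8, 3]

# The last score is then fully determined by the other two.
last_value = required_total - sum(free_values)
print(last_value)  # 4

# Degrees of freedom: number of scores free to vary.
df = n - 1
print(df)  # 2
```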
Computation of the Standard Deviation of Sample Instat Statistics tab Summary tab Describe tab Calculates standard deviation automatically OK
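For comparison with Instat, here is a minimal Python sketch of the sample calculation. It treats the scores 1, 2, 3, 4, 5 as a sample rather than a population (an assumption made for illustration) and checks the hand calculation against `statistics.variance` and `statistics.stdev`, which both divide by (n – 1).

```python
import math
import statistics

# The same five scores, now treated as a sample.
sample = [1, 2, 3, 4, 5]
n = len(sample)
M = sum(sample) / n

# SS is computed exactly as for a population, but around the sample mean M.
ss = sum((x - M) ** 2 for x in sample)

# Sample variance divides by (n - 1), not n.
s_squared = ss / (n - 1)
s = math.sqrt(s_squared)
print(s_squared, round(s, 4))  # 2.5 1.5811

# The standard library's sample statistics use the same (n - 1) divisor.
print(statistics.variance(sample), round(statistics.stdev(sample), 4))
```

The sample variance (2.5) is larger than the population variance of the same scores (2), which reflects the (n – 1) correction.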
Selection When to use the frequency Nominal data With the mode When to use the range or interquartile range Ordinal data With the median When to use the variance/SD Interval or ratio data With the mean
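The selection rules can be summarized as a simple lookup. This is only a sketch of the slide's guidance; the dictionary name and the `recommend` function are hypothetical, and it ignores the shape of the distribution, which the introduction notes also matters.

```python
# Scale of measurement -> (measure of variability, paired measure of center).
RECOMMENDED = {
    "nominal": ("frequency", "mode"),
    "ordinal": ("range or interquartile range", "median"),
    "interval": ("variance/SD", "mean"),
    "ratio": ("variance/SD", "mean"),
}


def recommend(scale):
    """Return the suggested statistics for a scale of measurement."""
    key = scale.strip().lower()
    if key not in RECOMMENDED:
        raise ValueError(f"Unknown scale of measurement: {scale!r}")
    return RECOMMENDED[key]


variability, center = recommend("Ordinal")
print(f"Use the {variability} with the {center}")
```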
Textbook Problem Assignment Problems: 4, 6, 8, 14.