Published by Barbara Ramsey. Modified over 6 years ago.
1
Summary descriptive statistics: means and standard deviations
Measures of central tendency ("averages")
Measures of dispersion (spread of scores)
2
1. The Mode: The most frequent score in a set of scores. 6, 11, 22, 22, 96, 98. Mode = 22
3
Advantages of the mode:
(i) Simple to calculate, easy to understand.
(ii) The only average which can be used with nominal data.
Disadvantages of the mode:
(i) May be unrepresentative and hence misleading. e.g.: 3, 4, 4, 5, 6, 7, 8, 8, 96, 96, 96. Mode is 96, but most of the scores are low numbers.
(ii) There may be more than one mode in a set of scores. e.g.: 3, 3, 3, 4, 4, 4, 6, 6, 6 has three modes!
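The mode examples above can be checked with Python's standard `statistics` module. This is only a minimal sketch: the variable names are illustrative and not part of the original slides, and `multimode` requires Python 3.8 or later.

```python
from statistics import mode, multimode

# The first example: 22 is the only score that occurs twice.
scores = [6, 11, 22, 22, 96, 98]
single_mode = mode(scores)

# An unrepresentative mode: most scores are low, yet 96 is the most frequent.
skewed = [3, 4, 4, 5, 6, 7, 8, 8, 96, 96, 96]
skewed_mode = mode(skewed)

# Several modes: multimode returns every value tied for most frequent.
tied = [3, 3, 3, 4, 4, 4, 6, 6, 6]
all_modes = multimode(tied)

print(single_mode, skewed_mode, all_modes)
```

Using `multimode` rather than `mode` makes ties visible, which matters when a set of scores has more than one mode.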
4
2. The Median: When scores are arranged in order of size, the median is either
(a) the middle score (if there is an odd number of scores). 4, 5, 6, 7, 8, 8, 96. Median = 7.
or (b) the average of the middle two scores (if there is an even number of scores). 4, 5, 6, 7, 8, 8, 96, 96. Median = (7+8)/2 = 7.5.
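The two cases (odd and even number of scores) can be sketched in Python as follows. The function name `median_by_hand` is a hypothetical helper introduced here for illustration; the standard library's `statistics.median` is used only as a cross-check.

```python
from statistics import median

def median_by_hand(scores):
    """Return the median by sorting and picking the middle score(s)."""
    ordered = sorted(scores)          # arrange scores in order of size
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd number of scores: the middle one
        return ordered[mid]
    # even number of scores: average of the middle two
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_scores = [4, 5, 6, 7, 8, 8, 96]
even_scores = [4, 5, 6, 7, 8, 8, 96, 96]

print(median_by_hand(odd_scores), median(odd_scores))
print(median_by_hand(even_scores), median(even_scores))
```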
5
Advantages of the median:
(i) Resistant to the distorting effects of extreme high or low scores.
Disadvantages of the median:
(i) Ignores scores' numerical values, which is wasteful if data are interval or ratio.
(ii) More susceptible to sampling fluctuations than the mean.
(iii) Less mathematically useful than the mean.
6
3. The Mean: Add all the scores together and divide by the total number of scores: $\bar{X} = \frac{\sum X}{n}$
7
Advantages of the mean:
(i) Uses information from every single score.
(ii) Resistant to sampling fluctuation, i.e. varies the least from sample to sample. (Important since we normally want to extrapolate from samples to populations.)
Disadvantages of the mean:
(i) Susceptible to distortion from extreme scores. e.g.: 4, 5, 5, 6: mean = 5. 4, 5, 5, 106: mean = 30.
(ii) Can only be used with interval or ratio data, not with ordinal or nominal data.
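The distortion caused by a single extreme score can be seen by computing the mean and the median of the two example sets side by side. This is a small sketch using the standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

typical = [4, 5, 5, 6]
with_outlier = [4, 5, 5, 106]   # same set, but one extreme score

# The mean adds all scores and divides by how many there are,
# so the extreme score pulls it upward.
mean_typical = mean(typical)
mean_outlier = mean(with_outlier)

# The median only looks at the middle of the ordered scores,
# so it is unaffected here.
median_typical = median(typical)
median_outlier = median(with_outlier)

print(mean_typical, mean_outlier)
print(median_typical, median_outlier)
```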
8
1. The Range: The difference between the highest and lowest scores (i.e. range = highest - lowest).
Advantages: Quick and easy to calculate, easy to understand.
Disadvantages:
Unduly influenced by extreme scores. 3, 4, 4, 5, 100: range = (100-3) = 97. 3, 4, 4, 5, 5: range = (5-3) = 2.
Conveys no information about the spread of scores between the highest and lowest scores. e.g. 2, 2, 2, 2, 2, 20 and 2, 20, 20, 20, 20, 20 have exactly the same range (18) but very different distributions.
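Both weaknesses of the range can be reproduced with a one-line function. The name `score_range` is a hypothetical helper chosen here to avoid shadowing Python's built-in `range`.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

# One extreme score changes the range drastically.
range_with_extreme = score_range([3, 4, 4, 5, 100])
range_without_extreme = score_range([3, 4, 4, 5, 5])

# Two very different distributions share exactly the same range.
mostly_low = [2, 2, 2, 2, 2, 20]
mostly_high = [2, 20, 20, 20, 20, 20]

print(range_with_extreme, range_without_extreme)
print(score_range(mostly_low), score_range(mostly_high))
```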
9
2. The Standard Deviation (S.D.):
The "average difference of scores from the mean". The bigger the s.d., the more scores differ from the mean and between themselves, and the less satisfactory the mean is as a summary of the data. Advantages: Like the mean, uses information from every score. Disadvantages: Not intuitively easy to understand! Can only be used with interval or ratio data.
10
How to calculate the standard deviation:
For the set of scores 5, 6, 7, 9, 11:
(a) Work out the mean: $\bar{X} = 38 / 5 = 7.6$
11
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$
(b) Subtract the mean from each score:
5 - 7.6 = -2.6
6 - 7.6 = -1.6
7 - 7.6 = -0.6
9 - 7.6 = 1.4
11 - 7.6 = 3.4
12
(c) Square the differences just obtained:
$(-2.6)^2 = 6.76$
$(-1.6)^2 = 2.56$
$(-0.6)^2 = 0.36$
$1.4^2 = 1.96$
$3.4^2 = 11.56$
13
(d) Add up the squared differences:
6.76 + 2.56 + 0.36 + 1.96 + 11.56 = 23.20
14
(e) Divide this by the total number of scores, to get the variance: 23.20 / 5 = 4.64
15
(f) Standard deviation is the square root of the variance (we do this to get back to the original units): $\sqrt{4.64} = 2.15$ is our sample standard deviation.
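Steps (a) to (f) can be followed line by line in Python. This is a sketch of the divide-by-n calculation walked through on these slides; the variable names are illustrative, and results are rounded for display because floating-point arithmetic is not exact.

```python
from math import sqrt

scores = [5, 6, 7, 9, 11]
n = len(scores)

mean = sum(scores) / n                         # (a) work out the mean
deviations = [x - mean for x in scores]        # (b) subtract the mean from each score
squared = [d ** 2 for d in deviations]         # (c) square the differences
sum_of_squares = sum(squared)                  # (d) add up the squared differences
variance = sum_of_squares / n                  # (e) divide by the number of scores
sd = sqrt(variance)                            # (f) square root of the variance

print(round(mean, 2), round(sum_of_squares, 2), round(variance, 2), round(sd, 2))
```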
16
Complications in using the mean and s.d.:
We usually obtain the mean and s.d. from a sample - very rarely from the parent population. Sometimes we are content to describe our sample per se, but usually we want to extrapolate to the population from our sample. A sample mean is a good estimate of the population mean. A sample s.d. tends to underestimate the population s.d. Hence, when using the sample s.d. as a description of the sample, divide by n. When using the sample s.d. as an estimate of the population s.d., divide by n-1 (to make the s.d. larger than it would otherwise have been).
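The difference between dividing by n and by n-1 can be illustrated with Python's `statistics` module, where `pstdev` divides by n and `stdev` divides by n-1. The scores reused here are the ones from the worked standard deviation example; the variable names are illustrative.

```python
from statistics import pstdev, stdev

scores = [5, 6, 7, 9, 11]

# Describing the sample itself: divide by n.
sd_n = pstdev(scores)

# Estimating the population s.d. from the sample: divide by n-1,
# which always gives a slightly larger value.
sd_n_minus_1 = stdev(scores)

print(round(sd_n, 2), round(sd_n_minus_1, 2))
```

The gap between the two values shrinks as the number of scores grows, since n and n-1 become proportionally closer.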
17
sample s.d. as a description of a sample: $\sigma_n$ ("sigma n") on calculators
sample s.d. as an estimate of the population s.d.: $\sigma_{n-1}$ on calculators
population s.d., if you measure every member of the population: $\sigma_n$ on calculators
sample mean, as a description of a sample or as an estimate of the population mean: $\bar{X}$
population mean: $\mu$ ("mu")
18
The Standard Error of the Mean: This is the standard deviation of a set of sample means. It shows how much variation there is within a set of sample means, and hence how likely our particular sample mean is to be in error as an estimate of the true population mean.
[Figure: means of different samples; actual population mean]
19
Formula for the standard error: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
We normally estimate this from our obtained data: $s_{\bar{X}} = \frac{s}{\sqrt{n}}$
20
So - find the standard deviation; then divide this by the square root of the number of scores.
If the S.E. is small, our obtained sample mean is more likely to be similar to the true population mean than if the S.E. is large. Increasing n reduces the size of the S.E. A sample mean based on 100 scores is probably closer to the population mean than a sample mean based on 10 scores.
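A rough sketch of the standard error calculation in Python follows. It assumes the n-1 version of the standard deviation (`statistics.stdev`) is the one used when estimating from a sample; the function names and the fixed s.d. of 2.0 in the second part are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import stdev

def standard_error(scores):
    """Estimated S.E. of the mean: s.d. divided by the square root of n.

    Assumption: the s.d. is the n-1 estimate of the population s.d.
    """
    return stdev(scores) / sqrt(len(scores))

def se_from_sd(sd, n):
    """S.E. for a given s.d. and sample size n."""
    return sd / sqrt(n)

scores = [5, 6, 7, 9, 11]
se_scores = standard_error(scores)

# With the s.d. held at an illustrative 2.0, a larger n gives a smaller S.E.
se_n10 = se_from_sd(2.0, 10)
se_n100 = se_from_sd(2.0, 100)

print(round(se_scores, 2), round(se_n10, 2), round(se_n100, 2))
```

Because n enters through its square root, going from 10 to 100 scores shrinks the S.E. by a factor of about 3.16 rather than 10.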