Central Tendency
A statistical measure that serves as a descriptive statistic.
Determines a single value that:
– summarizes or condenses a large set of data
– accurately describes the center of the distribution
– represents the entire distribution of scores
Used to compare two (or more) sets of data:
– for example, comparing the average score for one set with the average score for another set
Figure 3-1 Frequency distribution for ratings of attractiveness of a female face shown in a photograph for two groups of male participants: those who had consumed no alcohol and those who had consumed moderate amounts of alcohol.
Figure 3-2 (p. 74) Three distributions demonstrating the difficulty of defining central tendency. In each case, try to locate the “center” of the distribution.
The Mean, the Median, and the Mode
It is essential that central tendency be determined by an objective and well-defined procedure so that others will understand exactly how the "average" value was obtained and can duplicate the process.
No single procedure always produces a good, representative value. Therefore, researchers have developed three commonly used techniques for measuring central tendency:
– the mean
– the median
– the mode
The Mean
Most commonly used measure of central tendency.
Requires scores on an interval or ratio scale.
Computation:
– find the sum, or total, for the entire set of scores
– then divide this sum by the number of scores
– Population formula: µ = ΣX/N
– for example, for scores of 3, 7, 4, 6: ΣX = 20, N = 4, µ = 20/4 = 5
– Sample formula: M = ΣX/n (written as X̄ in some books)
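The computation on this slide can be sketched in a few lines of Python. This is only an illustration; the function name `mean` is a choice made here, not something taken from the slides.

```python
def mean(scores):
    """Return the arithmetic mean: the sum of the scores divided by their count."""
    if not scores:
        raise ValueError("mean requires at least one score")
    return sum(scores) / len(scores)

scores = [3, 7, 4, 6]   # example scores from the slide
total = sum(scores)     # ΣX = 20
n = len(scores)         # N = 4
print(total, n, mean(scores))  # 20 4 5.0
```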
Figure 3-3 (p. 76) The frequency distribution shown as a seesaw balanced at the mean.
Conceptually, the mean can also be defined as:
– the amount that each individual receives when the total is divided equally among all individuals; for example, n = 6 boys with 180 baseball cards divided equally: M = ΣX/n = 180/6 = 30
– the balance point of the distribution
Weighted Mean
When combining two sets of scores with different sample sizes:
– First sample: n = 12, ΣX = 72, M = 6
– Second sample: n = 8, ΣX = 56, M = 7
Overall mean:
– WRONG: (6 + 7)/2 = 6.5 (averaging the two means is correct only if the samples are the same size)
– CORRECT: (72 + 56)/(12 + 8) = 128/20 = 6.4
– combined sum divided by combined n: M = (ΣX₁ + ΣX₂)/(n₁ + n₂)
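A minimal sketch of the weighted-mean calculation, using the two samples from this slide. The helper name `combined_mean` is hypothetical.

```python
def combined_mean(sums, sizes):
    """Overall mean of several samples: combined ΣX divided by combined n."""
    return sum(sums) / sum(sizes)

overall = combined_mean([72, 56], [12, 8])  # (72 + 56) / (12 + 8) = 128 / 20
naive = (6 + 7) / 2                         # simple average of the two sample means
print(overall, naive)  # 6.4 6.5
```

The two results differ because the first sample contributes 12 of the 20 scores, so its mean carries more weight in the overall mean.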
Table 3.1 (p. 78) Calculating the mean from a frequency distribution table. Statistics quiz scores for a section of n = 8 students: ΣX = 66, n = 8, M = 66/8 = 8.25.
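The table itself is not reproduced here, so the sketch below uses a hypothetical frequency table whose values were chosen only so that the totals match ΣX = 66 and n = 8. It shows the general procedure: n is the sum of the frequencies, and ΣX is the sum of each score multiplied by its frequency.

```python
# Hypothetical frequency table {score X: frequency f}.
# These values are an assumption chosen to match the slide's totals,
# not a copy of Table 3.1.
freq_table = {10: 1, 9: 2, 8: 4, 7: 0, 6: 1}

n = sum(freq_table.values())                       # n = Σf
sum_x = sum(x * f for x, f in freq_table.items())  # ΣX = Σ(f · X)
m = sum_x / n
print(n, sum_x, m)  # 8 66 8.25
```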
Changing the Mean
Calculation of the mean involves every score in the distribution, so modifying a set of scores
– by discarding scores
– by adding new scores
will usually change the value of the mean.
To determine how the mean will be affected:
– determine how the number of scores is affected
– determine how the sum of the scores is affected
Figure 3-4 (p. 80) Adding a Score. A distribution of n = 5 scores that is balanced with a mean of µ = 7. What if a new score X = 10 is added to the distribution?
– Original sample: n = 5, ΣX = 35, M = 35/5 = 7
– New sample: n = 6, ΣX = 45, M = 45/6 = 7.5
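The two-step reasoning (update n, update ΣX, then divide) can be checked with a short sketch. The function name `mean_after_adding` is hypothetical.

```python
def mean_after_adding(n, sum_x, new_score):
    """Update n and ΣX for one added score, then recompute the mean."""
    new_n = n + 1
    new_sum = sum_x + new_score
    return new_n, new_sum, new_sum / new_n

print(mean_after_adding(5, 35, 10))  # (6, 45, 7.5)
```

Because the new score (10) is above the original mean (7), the mean rises, but only to 7.5, since the extra 3 points are shared among six scores.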
Changing the Mean
Changing the value of any score will change the value of the mean.
– If a constant value is added to every score in a distribution, then the same constant value is added to the mean
– If every score is multiplied by a constant value, then the mean is also multiplied by the same constant value
Table 3.2 Adding or subtracting a constant. Number of sentences recalled for humorous and nonhumorous sentences (constant: +2).
Table 3.3 Multiplying or dividing by a constant. Measurements of five pieces of wood in inches, converted to centimeters (1 inch = 2.54 cm). The mean in centimeters is therefore 2.54 times the mean in inches.
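Both rules (adding a constant, multiplying by a constant) can be verified numerically. The lengths below are hypothetical values made up for illustration, not the measurements from Table 3.3, and `mean` is a helper defined here.

```python
import math

def mean(scores):
    """Arithmetic mean of a non-empty list of scores."""
    return sum(scores) / len(scores)

inches = [10, 9, 12, 8, 11]               # hypothetical lengths, not the table's data
shifted = [x + 2 for x in inches]         # add a constant (2) to every score
centimeters = [x * 2.54 for x in inches]  # multiply every score by a constant (2.54)

# The mean shifts by the same constant and scales by the same constant.
assert math.isclose(mean(shifted), mean(inches) + 2)
assert math.isclose(mean(centimeters), mean(inches) * 2.54)
print(mean(inches), mean(shifted), mean(centimeters))
```

`math.isclose` is used for the comparisons because multiplying by 2.54 involves floating-point rounding.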
The Median
The midpoint of the scores when they are listed in order from smallest to largest.
– 50% of the scores are equal to or less than the median
– same as the 50th percentile
Computation:
– requires scores measured on an ordinal, interval, or ratio scale
– simple counting procedure
With an odd number of scores:
– list the values in order
– the median is the middle score in the list
With an even number of scores:
– list the values in order
– the median is halfway between the middle two scores
Example 3.5 The median divides the area in the graph exactly in half. Scores of 3, 5, 8, 10, 11, organized by value. There is an odd number of scores, so the median is the middle score, 8.
Example 3.6 (modified) The median divides the area in the graph exactly in half. Scores of 3, 3, 4, 5, 7, 8, organized by value. There is an even number of scores, so the median lies between the middle two scores, 4 and 5: median = (4 + 5)/2 = 9/2 = 4.5.
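The counting procedure for odd and even numbers of scores can be sketched as follows, using the scores from Examples 3.5 and 3.6. The function name `median` is a choice made here; Python's standard library offers `statistics.median` for the same purpose.

```python
def median(scores):
    """Median by counting: the middle score (odd n) or the midpoint
    of the two middle scores (even n)."""
    if not scores:
        raise ValueError("median requires at least one score")
    ordered = sorted(scores)   # list the values in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: the middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: halfway between

print(median([3, 5, 8, 10, 11]))   # 8
print(median([3, 3, 4, 5, 7, 8]))  # 4.5
```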
The Median
If the scores are measurements of a continuous variable, it is possible to find the median by first placing the scores in a frequency distribution histogram with each score represented by a box in the graph. Then, draw a vertical line through the distribution so that exactly half the boxes are on each side of the line. The median is defined by the location of the line.
Figure 3-5 (page 84) A distribution with several scores clustered at the median. The median for this distribution is positioned so that each of the four boxes above X = 4 is divided into two sections, with 1/4 of each box below the median (to the left) and 3/4 of each box above the median (to the right). As a result, there are exactly four boxes, 50% of the distribution, on each side of the median.
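Using a frequency distribution table to calculate the 50th percentile, which is the median, for the scores from Example 3.7
A continuous variable has upper and lower real limits.
– 50% falls between the cumulative percentages 37.5% (at X = 3.5) and 87.5% (at X = 4.5), a distance of 50 percentage points
– 50% is 37.5 points below 87.5%, so the fraction is 37.5/50 = 0.75
– the distance between 3.5 and 4.5 is 1, so 1(0.75) = 0.75
– median = 4.5 – 0.75 = 3.75
See Box 3.2 on page 86.
(Frequency distribution table with columns X, f, cf, and c%.)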
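The interpolation steps can be sketched as a small function. It assumes the 50% point falls inside a single interval whose real limits and cumulative percentages are known; the function and parameter names are hypothetical.

```python
def interpolated_median(lower_limit, upper_limit, cum_pct_lower, cum_pct_upper):
    """Locate the 50% point inside an interval by linear interpolation.

    cum_pct_lower and cum_pct_upper are the cumulative percentages at the
    interval's lower and upper real limits.
    """
    if not (cum_pct_lower <= 50 <= cum_pct_upper and cum_pct_lower < cum_pct_upper):
        raise ValueError("the 50% point must lie inside the interval")
    width = upper_limit - lower_limit
    # Fraction of the interval between the 50% point and the upper limit.
    fraction_from_top = (cum_pct_upper - 50) / (cum_pct_upper - cum_pct_lower)
    return upper_limit - fraction_from_top * width

print(interpolated_median(3.5, 4.5, 37.5, 87.5))  # 3.75
```

With the slide's values, the fraction is 37.5/50 = 0.75, the interval width is 1, and the result is 4.5 − 0.75 = 3.75.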
Figure 3-6 (page 86) A population of N = 6 scores with a mean of µ = 4. Notice that the mean does not necessarily divide the scores into two equal groups. In this example, 5 out of the 6 scores have values less than the mean. For these six scores (2, 2, 2, 3, 3, 12), the median is the midpoint of the ordered scores, in this case 2.5.
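A quick check of this example, as a sketch using Python's standard `statistics` module for the median:

```python
import statistics

scores = [2, 2, 2, 3, 3, 12]             # the six scores from the figure
mu = sum(scores) / len(scores)           # 24 / 6 = 4.0
med = statistics.median(scores)          # midpoint of the middle two scores, 2 and 3
below_mean = sum(1 for x in scores if x < mu)
print(mu, med, below_mean)  # 4.0 2.5 5
```

The single high score (12) pulls the mean above five of the six scores, while the median stays at the midpoint of the ordered list.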
The Mode
The most frequently occurring category or score in the distribution.
– the peak in a frequency distribution graph
– can be used for data measured on any scale of measurement: nominal, ordinal, interval, or ratio
Table 3.4 (p. 87) Favorite restaurants named by a sample of n = 100 students. Caution: The mode is a score or category, not a frequency. For this example, the mode is Luigi’s, not f = 42.
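The distinction between the mode (a category) and its frequency can be illustrated with `collections.Counter`. Only "Luigi's" and f = 42 come from the slide; the other category names and counts are hypothetical placeholders chosen so the total is n = 100.

```python
from collections import Counter

# Hypothetical frequency table: only Luigi's (f = 42) is from the slide.
# The remaining names and counts are assumed for illustration.
counts = Counter({
    "Luigi's": 42,
    "Restaurant B": 25,
    "Restaurant C": 18,
    "Restaurant D": 15,
})

mode, frequency = counts.most_common(1)[0]
print(sum(counts.values()))  # 100
print(mode)                  # Luigi's  (the mode is the category)
print(frequency)             # 42       (its frequency, which is not the mode)
```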
Bimodal Distributions
It is possible for a distribution to have more than one mode. A distribution with two modes is called bimodal; one with more than two is called multimodal.
In addition, the term "mode" is often used to describe a peak in a distribution that is not really the highest point. Thus, a distribution may have a major mode at the highest peak and a minor mode at a secondary peak in a different location.
Figure 3-7 (page 89) A frequency distribution for tone identification scores. An example of a bimodal distribution.
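One simple way to find major and minor modes is to look for local peaks in the frequency distribution. The sketch below is an assumption made here, not a method from the slides: it treats a score as a peak when its frequency is strictly higher than both neighbours, and the frequencies are hypothetical, not the data behind Figure 3-7.

```python
def local_peaks(freqs):
    """Return the indices whose frequency is strictly higher than both neighbours."""
    peaks = []
    for i, f in enumerate(freqs):
        left = freqs[i - 1] if i > 0 else -1
        right = freqs[i + 1] if i < len(freqs) - 1 else -1
        if f > left and f > right:
            peaks.append(i)
    return peaks

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
freqs = [1, 3, 6, 3, 1, 2, 4, 2, 1]   # hypothetical frequencies

peak_idx = local_peaks(freqs)
major_idx = max(peak_idx, key=lambda i: freqs[i])   # highest peak
major_mode = scores[major_idx]
minor_modes = [scores[i] for i in peak_idx if i != major_idx]
print(major_mode, minor_modes)  # 3 [7]
```

Flat-topped peaks (equal neighbouring frequencies) are not detected by this strict comparison; handling them would need an extra rule.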
Selecting a Measure of Central Tendency
The mean is usually preferred:
– it uses every score in the distribution
– it is commonly used in inferential statistics
Situations where you cannot or should not compute a mean at all:
– nominal data
– ordinal data (usually inappropriate)
Situations where the mean does not provide a good, representative value:
– extreme scores (see Figure 3-8)
Figure 3-8 (p. 90) Frequency distribution of errors committed before reaching learning criterion. Example of the effect of an extreme score on the mean.
– Mean: M = ΣX/n = 203/10 = 20.3
– Median: 11.5
– Mode: 11
This is an obvious example, but what if the scores were only slightly skewed? Statistics to the rescue: there are tests for skewness.
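The effect of one extreme score can be reproduced with a small sketch. The individual scores below are hypothetical: they were chosen only so that they reproduce the summary values on this slide (ΣX = 203, n = 10, median 11.5, mode 11), and they are not read from the figure.

```python
import statistics

# Hypothetical scores chosen to match the slide's summary values.
errors = [10, 10, 11, 11, 11, 12, 12, 13, 13, 100]

m = sum(errors) / len(errors)      # 203 / 10
med = statistics.median(errors)    # midpoint of the 5th and 6th ordered scores
mode = statistics.mode(errors)     # most frequent score
print(m, med, mode)  # 20.3 11.5 11

# Dropping the extreme score shows how much it alone moved the mean.
without_extreme = [x for x in errors if x != 100]
print(round(sum(without_extreme) / len(without_extreme), 2))  # 11.44
```

The median and mode barely notice the extreme score, while the mean nearly doubles.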
Selecting a Measure of Central Tendency
Situations where the mean does not provide a good, representative value:
Missing values
– random missing scores from errors, equipment failure, and so on
– usually, remove all scores for that person
– if a large number of scores are missing, stop the research
Undetermined values (see Table 3.5)
Open-ended distributions
– for example, a score category of "5 or more pizzas"
– the mean cannot be calculated
– plan ahead and try to get quantitative values
Table 3.5 (p. 91) Amount of time to complete puzzle. There is an undetermined value in the data set because Person 6 did not complete the puzzle; after 60 minutes the researcher stopped the test. There is no value for the 6th person, so the mean cannot be calculated. The median is 12.5, which lies between the 3rd and 4th scores.
Note 1: Some researchers record the maximum time, referring to it as “timed out”; in this case that is 60, which becomes an extreme score instead of a missing value. This is generally a bad idea, even for experienced researchers, because the true value is unknown.
Note 2: When only one or two scores out of a set of one hundred are affected, some researchers treat them as random missing values and remove the person (i.e., remove #6). However, person #6 really did work on the puzzle, and this person is part of the sample.
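The sketch below illustrates why the median survives an undetermined value while the mean does not. Two things here are assumptions: the five finishing times are hypothetical values chosen to give a median of 12.5, and representing "did not finish" as infinity is an illustrative device, valid only because the unknown time is known to exceed every recorded time.

```python
import math
import statistics

# Hypothetical times in minutes; the last entry stands for Person 6,
# who did not finish (time unknown but longer than all the others).
times = [8, 11, 12, 13, 17, math.inf]

med = statistics.median(times)   # midpoint of the 3rd and 4th scores
m = sum(times) / len(times)      # no finite mean exists
print(med, m)  # 12.5 inf

# Note 1's "timed out" approach: record 60 for Person 6.
timed_out = [8, 11, 12, 13, 17, 60]
print(statistics.median(timed_out))                # 12.5
print(round(sum(timed_out) / len(timed_out), 2))   # 20.17
```

Recording 60 leaves the median unchanged but produces a mean that depends entirely on when the researcher chose to stop the test.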
The Median
One advantage of the median is that it is relatively unaffected by extreme scores. The median tends to stay in the "center" of the distribution even when:
– the distribution is very skewed by a few extreme scores
– there are undetermined values (see Table 3.5)
– the distribution is open-ended
Use the median for data measured on an ordinal scale.
In these situations, the median serves as a good alternative to the mean. It is also used as a supplemental measure of central tendency that is reported along with the mean.
The Mode
The only measure of central tendency that can be used for data measured on a nominal scale.
Discrete variables take whole-number values:
– such as the number of children in a family
– calculating the mean can produce fractions: "families have 2.33 children"
– the mode is more sensible but lacks accuracy: "a family has 2 children"
Used as a supplemental measure of central tendency that is reported along with the mean or the median:
– helps to describe the shape of the distribution
Central Tendency and the Shape of the Distribution
The mean, the median, and the mode are systematically related to each other.
In a unimodal symmetrical distribution, the mode, mean, and median will all have the same value (see Figure 3-11).
In a skewed distribution (see Figure 3-12):
– the mode will be located at the peak on one side
– the mean usually will be displaced toward the tail on the other side
– the median is usually located between the mean and the mode
Figure 3-11 (p. 96) Measures of central tendency for three symmetrical distributions: normal, bimodal, and rectangular.
Figure 3-12 (p. 96) Measures of central tendency for skewed distributions.