Descriptive Statistics
My immediate family includes my wife Barbara, my sons Adam and Devon, and myself. I am 62, Barbara is 61, and the boys are both 30. Barbara and Devon have master’s degrees, Adam has a bachelor’s degree, and I have a doctorate.
Descriptive Variables
Name
Age
Gender
Education
Relationship to the family
Why Use Statistics
It is not easy to describe all the characteristics (lots of variables) of all of the members of a group (lots of people).
Summaries of the characteristics of the members of a group are called descriptive statistics.
Problems with Gathering Information About Characteristics
Did you get the same information from each respondent?
Is the information appropriate to your problem?
Can you transform the information into numbers?
Are the numbers in a form that can be analyzed?
The solution to these problems is to use measurement scales.
A Quick Review: Measures of Central Tendency
Mode: the response that occurs most frequently
Median: the point where half of the scores are above and half below
Mean: the average
Nominal Scales
Unordered classification
  – Think of this as a group of containers into which you will sort data.
Allows comparison of group sizes
  – Which container has the most in it?
No information is embedded in the order of the categories
Mode (the only measure of central tendency)
Example: What color is your car?
Nominal: Mode
Ordinal Scales
Ordered classification
  – Containers where it makes sense that they are in order
Allows comparison of both group sizes and relative position of categories
Categories are ordered but not evenly spaced
  – Some containers may be larger or smaller than others
  – The distance between the containers may not be equal
Median (best measure of central tendency) or Mode
Example: Do you like chocolate?
Ordinal: Median or Mode
Interval Scales
Ordered classification
  – Just like ordinal, the order makes sense
Categories are ordered and evenly spaced
  – Unlike ordinal, all of the containers are of equal size and spaced evenly
Mean (best measure of central tendency)
Ratio scales are the same as interval scales except that they have a true zero point
Example: How many ham sandwiches did you eat last week?
Ratio/Interval: Mean, Median, or Mode
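The pairing of scale type and measure of central tendency can be sketched in a few lines of Python. This is only an illustration: the response lists below are invented for the example (they are not data from these slides), the 1-to-3 coding of the chocolate answers is an assumption, and `mode_of` is a hypothetical helper name.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical responses, invented for illustration only.
car_colors = ["red", "blue", "white", "blue", "black", "blue"]  # nominal
chocolate = [1, 3, 3, 2, 3, 1, 2]  # ordinal, assumed coding: 1 = no, 2 = somewhat, 3 = yes
sandwiches = [0, 2, 5, 1, 2]       # ratio: counts with a true zero


def mode_of(values):
    """Return the most frequent value (the first one seen if there is a tie)."""
    return Counter(values).most_common(1)[0][0]


nominal_mode = mode_of(car_colors)   # the only sensible measure for categories
ordinal_median = median(chocolate)   # the order matters, the spacing does not
ratio_mean = mean(sandwiches)        # even spacing makes the average meaningful

print(nominal_mode, ordinal_median, ratio_mean)
```

Averaging the car colors would be meaningless, which is why only the mode applies to the nominal list.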
Think up a nominal, ordinal, and interval scale related to each of the following:
Political affiliation
Restaurant ratings
Temperature
Shoe size
Teaching assignments
Teacher effectiveness
Income
Test Scores
First: What kind of variable is Test Score?
Computing Measures of Central Tendency
1. Lay out all of the scores in numerical order
2. Compute the mode by finding the number that occurs most often
3. Compute the median by finding the middle number in the list of scores (with an even number of scores, average the two middle numbers)
4. Compute the mean by adding up all of the numbers and dividing by the number of numbers
Computing Measures of Central Tendency
Mode = 27 (most frequent)
Median = 27 (midpoint of responses)
Mean = 334/13 = 25.69 (average)
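As a quick check on these figures, here is a minimal Python sketch that uses the standard `statistics` module. The `scores` list is the set of 13 test scores expanded from the frequency distribution in this deck. The variable names are illustrative.

```python
from statistics import median, mode

# The 13 test scores, expanded from the deck's frequency distribution.
scores = [20, 23, 23, 24, 24, 26, 27, 27, 27, 27, 28, 28, 30]

ordered = sorted(scores)                  # lay the scores out in numerical order
score_mode = mode(ordered)                # the score that occurs most often
score_median = median(ordered)            # the middle score of the ordered list
score_total = sum(ordered)                # add up all of the scores
score_mean = score_total / len(ordered)   # divide by the number of scores

print(score_mode, score_median, score_total, round(score_mean, 2))
```

The total of the scores is the numerator of the mean, so a slip in adding them up carries straight through to the average.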
Frequency Distribution
The number of scores at each possible level
Score   Frequency
20      1
21      0
22      0
23      2
24      2
25      0
26      1
27      4
28      2
29      0
30      1

Histogram
Bar chart of a frequency distribution
[Figure: histogram of the frequency distribution above, with Score and Frequency axes and the mode, median, and mean (25.69) marked]
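The frequency distribution and a rough text version of the histogram can be produced from the raw scores. The sketch below assumes the same 13 scores. `text_histogram` is a hypothetical helper written for this example, not part of any library.

```python
from collections import Counter

# The 13 test scores behind the frequency distribution.
scores = [20, 23, 23, 24, 24, 26, 27, 27, 27, 27, 28, 28, 30]

counts = Counter(scores)

# Include every possible level from the lowest to the highest score,
# even the levels where no one scored (frequency 0).
frequency = {
    level: counts.get(level, 0)
    for level in range(min(scores), max(scores) + 1)
}


def text_histogram(freq):
    """Return one line per level, with one '#' for each score at that level."""
    return "\n".join(f"{level} | {'#' * n}" for level, n in freq.items())


print(text_histogram(frequency))
```

Keeping the empty levels (21, 22, 25, 29) matters: leaving them out would hide the gaps that give the histogram its shape.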
Histogram Exercise
On a piece of paper:
1. Make a histogram
2. Compute measures of central tendency
Histogram
Mode = 27, Median = 27, Mean = 25.69
Histogram
Mode = 27, Median = 27, Mean =
Mode = 27, Median = 27, Mean = 25.69
Measures of Variability
Range: the distance between the highest and lowest score
Standard Deviation: the average distance all the scores are from the mean
Well, kind of…
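One way to see why "average distance from the mean" is only roughly right is to compute that literal average next to the standard deviation. The sketch below assumes the 13 test scores from this deck. The literal average of the distances (often called the mean absolute deviation) is brought in here for comparison only and is not a measure the slides use.

```python
from statistics import stdev

# The 13 test scores used throughout the deck.
scores = [20, 23, 23, 24, 24, 26, 27, 27, 27, 27, 28, 28, 30]

# Range: distance between the highest and lowest score.
score_range = max(scores) - min(scores)

mean_score = sum(scores) / len(scores)

# Literal "average distance from the mean" (mean absolute deviation).
mean_abs_dev = sum(abs(s - mean_score) for s in scores) / len(scores)

# Sample standard deviation (divides by N - 1), from the standard library.
sd = stdev(scores)

print(score_range, round(mean_abs_dev, 2), round(sd, 2))
```

The two spread figures are close but not equal, because the standard deviation squares the distances before averaging and so gives more weight to scores far from the mean.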
Standard Deviation
Computing Standard Deviation
X = mean
n = each score
N = total number of scores
∑ = sum (in this case, the sum of the differences of each score from the mean, squared)

$$\text{Standard Deviation} = \sqrt{\frac{\sum (X - n)^2}{N - 1}}$$
Standard Deviation
Subtract each score from the mean (X - n, with X = 25.69), then square each difference to make them positive.
Score   Frequency   Difference   Squared difference
20      1            5.69        32.40
23      2            2.69         7.25
24      2            1.69         2.86
26      1           -0.31         0.09
27      4           -1.31         1.71
28      2           -2.31         5.33
30      1           -4.31        18.56
Sum of squared differences (each squared difference counted once per score, 13 scores in all) = 88.77
Average of squared differences = 88.77 / (13 - 1) = 7.40
Square root of the average of squared differences = √7.40 = 2.72
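The same arithmetic can be traced step by step in Python. This is a sketch that assumes the 13 test scores from this deck. `sample_standard_deviation` is a hypothetical helper name, and the comparison with `statistics.stdev` is included only as a sanity check.

```python
import math
from statistics import stdev

# The 13 test scores used in the worked example.
scores = [20, 23, 23, 24, 24, 26, 27, 27, 27, 27, 28, 28, 30]


def sample_standard_deviation(values):
    """Return (sum of squared differences, their N-1 average, standard deviation)."""
    n_scores = len(values)
    mean_value = sum(values) / n_scores                 # compute the mean
    differences = [mean_value - v for v in values]      # subtract each score from the mean
    squared = [d ** 2 for d in differences]             # square each difference
    sum_of_squares = sum(squared)                       # add up the squares
    variance = sum_of_squares / (n_scores - 1)          # divide by N - 1
    return sum_of_squares, variance, math.sqrt(variance)  # take the square root


sum_sq, variance, sd = sample_standard_deviation(scores)
print(round(sum_sq, 2), round(variance, 2), round(sd, 2))

# Sanity check against the standard library's sample standard deviation.
print(math.isclose(sd, stdev(scores)))
```

Every one of the 13 scores contributes its own squared difference, so a score that appears four times (27) is counted four times in the sum.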
Histogram
Mode = 27, Median = 27, Mean = 25.69, Range = 10, SD = 2.72
Mode = 27, Median = 27, Mean = , Range = 7, SD = 2.43
Compute the Standard Deviation
1. Compute the mean
2. Subtract each score from the mean (9 scores, 9 differences)
3. Square each difference
4. Add up the squares
5. Divide by N-1 (8)
6. Compute the square root

$$\text{Standard Deviation} = \sqrt{\frac{\sum (X - n)^2}{N - 1}}$$