Ch. 10: Summarizing the Data
Criteria for Good Visual Displays
Clarity: Data are represented in a way closely integrated with their numerical meaning.
Precision: Data are not exaggerated.
Efficiency: Data are presented in a reasonably compact space.
Frequency Distribution Example
Bar Graphs Example
Stem-and-Leaf Chart [chart: Stems column, Leaves column]
Back-to-Back Stem-and-Leaf Chart [chart: Depression leaves, Stems, Hypomania leaves]
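Measures of Central Tendency: Determining the Median
Arrange the scores in order.
Determine the position of the midmost score: (N + 1) × .50.
Count up (or down) the number of scores to reach the midmost position.
The median is the score in this (N + 1) × .50 position. When N is even, the position falls halfway between two scores, and the median is the average of those two scores.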
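The positional rule for the median can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `median_by_position` and the example scores are illustrative assumptions, not part of the slides.

```python
def median_by_position(scores):
    """Find the median by locating the (N + 1) * .50 position."""
    if not scores:
        raise ValueError("need at least one score")
    ordered = sorted(scores)                  # arrange scores in order
    position = (len(ordered) + 1) * 0.50      # 1-based midmost position
    lower = ordered[int(position) - 1]        # score at or just below the position
    if position == int(position):             # odd N: position is a whole number
        return lower
    upper = ordered[int(position)]            # even N: position is halfway between two scores
    return (lower + upper) / 2


print(median_by_position([2, 4, 4, 5, 7, 8]))  # position 3.5, prints 4.5
print(median_by_position([2, 4, 5, 7, 8]))     # position 3.0, prints 5
```

With six scores the position is 3.5, so the result is the average of the third and fourth ordered scores.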
Measures of Central Tendency: The Arithmetic Mean
The balancing point in the distribution
Sum of the scores divided by the number of scores, or $M = \frac{\sum X}{N}$
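One way to see why the mean is called the balancing point is that the deviations of the scores from the mean sum to zero. The sketch below checks this for a small set of illustrative scores (the values are an assumption chosen for the example).

```python
scores = [2, 4, 4, 5, 7, 8]                  # illustrative scores

mean = sum(scores) / len(scores)             # sum of scores divided by N
deviations = [x - mean for x in scores]      # distance of each score from the mean

print(mean)             # 5.0
print(sum(deviations))  # 0.0: negative and positive deviations cancel
```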
Measures of Central Tendency: The Mode
The most frequently occurring score
Problem: There may not be one unique mode
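The uniqueness problem is easy to demonstrate. The sketch below uses `statistics.multimode` from the Python standard library (Python 3.8 or later), which returns every score tied for the highest frequency; the two score lists are illustrative assumptions.

```python
from statistics import multimode

one_mode = [2, 4, 4, 5, 7, 8]         # 4 occurs twice, everything else once
two_modes = [2, 4, 4, 5, 7, 7, 8]     # 4 and 7 both occur twice

print(multimode(one_mode))   # [4]
print(multimode(two_modes))  # [4, 7]
```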
Symmetry and Asymmetry
Symmetrical: panel (b)
Asymmetrical or Skewed: Positively Skewed, panel (a); Negatively Skewed, panel (c)
Comparing the Measures of Central Tendency
If symmetrical: M = Mdn = Mo
If negatively skewed: M < Mdn ≤ Mo
If positively skewed: M > Mdn ≥ Mo
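These orderings can be checked on small data sets. In the sketch below, the three score lists are made-up illustrations (not data from the slides), and `summarize` is a hypothetical helper built on the standard `statistics` module.

```python
from statistics import mean, median, mode

def summarize(scores):
    """Return (M, Mdn, Mo) for a list of scores."""
    return mean(scores), median(scores), mode(scores)

symmetrical = [1, 2, 3, 3, 4, 5]
positively_skewed = [1, 2, 2, 3, 4, 12]     # long tail toward high scores
negatively_skewed = [1, 9, 10, 11, 11, 12]  # long tail toward low scores

for label, data in [("symmetrical", symmetrical),
                    ("positively skewed", positively_skewed),
                    ("negatively skewed", negatively_skewed)]:
    m, mdn, mo = summarize(data)
    print(f"{label}: M = {m}, Mdn = {mdn}, Mo = {mo}")
```

In the skewed examples the mean is pulled furthest toward the tail, which is why it ends up on the tail side of the median and mode.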
Measures of Spread: Types of Ranges
Crude Range: High score minus low score
Extended Range: (High score plus ½ unit) minus (low score minus ½ unit)
Interquartile Range: Range of the midmost 50% of scores
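The three ranges can be sketched as follows. The function names and example scores are illustrative assumptions. The interquartile range depends on how quartiles are interpolated, and textbooks differ on this; the sketch uses the "inclusive" method of `statistics.quantiles`, which may not match the convention used in the chapter.

```python
from statistics import quantiles

def crude_range(scores):
    """High score minus low score."""
    return max(scores) - min(scores)

def extended_range(scores, unit=1):
    """(High + 1/2 unit) minus (low - 1/2 unit); unit is the measurement unit."""
    return (max(scores) + unit / 2) - (min(scores) - unit / 2)

def interquartile_range(scores):
    """Spread of the midmost 50% of scores (assumes the 'inclusive' quartile method)."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return q3 - q1

scores = [2, 4, 4, 5, 7, 8]          # illustrative scores
print(crude_range(scores))           # 6
print(extended_range(scores))        # 7.0
print(interquartile_range(scores))   # 2.5
```

The extended range is always one measurement unit wider than the crude range, because it extends half a unit past each extreme score.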
Measures of Spread: Variance and Standard Deviation
Variance: Mean of the squared deviations of the scores from their mean
Standard Deviation: Square root of the variance
Summary Data for Computing the Variance and Standard Deviation
Raw scores (X)    X - M    (X - M)²
      2            -3         9
      4            -1         1
      4            -1         1
      5             0         0
      7             2         4
      8             3         9
ΣX = 30    Σ(X - M) = 0    Σ(X - M)² = 24
M = 5
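The summary quantities can be recomputed step by step. The sketch below assumes six raw scores, 2, 4, 4, 5, 7, 8, which are consistent with the totals on this slide (ΣX = 30, M = 5, Σ(X - M)² = 24), and it divides by N, following the definition of variance as the mean of the squared deviations.

```python
scores = [2, 4, 4, 5, 7, 8]   # assumed raw scores, consistent with the slide's totals

n = len(scores)
m = sum(scores) / n                                    # M = 30 / 6 = 5.0
rows = [(x, x - m, (x - m) ** 2) for x in scores]      # X, X - M, (X - M)^2

sum_dev = sum(dev for _, dev, _ in rows)               # sum of deviations
sum_sq = sum(sq for _, _, sq in rows)                  # sum of squared deviations

variance = sum_sq / n                                  # mean of squared deviations
sd = variance ** 0.5                                   # square root of the variance

for x, dev, sq in rows:
    print(f"{x:>3} {dev:>6.1f} {sq:>6.1f}")
print(f"sum X = {sum(scores)}, sum dev = {sum_dev}, sum sq = {sum_sq}")
print(f"variance = {variance}, SD = {sd}")
```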
Descriptive vs. Inferential Formulas
Use the descriptive formula when:
  Describing a complete population of scores or events
  Symbolized with Greek letters
Use the inferential formula when:
  Generalizing from a sample of known scores to a population of unknown scores
  Symbolized with Roman letters
Variance: Descriptive vs. Inferential Formulas
Descriptive Formula: $\sigma^2 = \frac{\sum (X - M)^2}{N}$
Inferential Formula: $S^2 = \frac{\sum (X - M)^2}{N - 1}$, called the “unbiased estimator of the population value”
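Python's standard `statistics` module happens to provide both versions: `pvariance` and `pstdev` divide by N (descriptive), while `variance` and `stdev` divide by N - 1 (inferential). The sketch below compares them on illustrative scores; the data are an assumption for the example.

```python
from statistics import pvariance, pstdev, variance, stdev

scores = [2, 4, 4, 5, 7, 8]      # illustrative scores, sum of squared deviations = 24

descriptive_var = pvariance(scores)   # 24 / 6
inferential_var = variance(scores)    # 24 / 5

print(descriptive_var, pstdev(scores))
print(inferential_var, stdev(scores))
```

The inferential value is always the larger of the two, and the gap shrinks as N grows.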
Confidence Interval for a Mean
Values of x (for df = 5) for Five Different Confidence Intervals
CI       x      t(x) (for df = 5)
99.9%   .001    6.87
99%     .01     4.03
95%     .05     2.57
90%     .10     2.02
80%     .20     1.48
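The tabled t values can be turned into an interval. The sketch below assumes the usual form of a confidence interval for a mean, M ± t(x) · S / √N, with S taken from the inferential (N - 1) formula; the slides as extracted do not show the formula, so treat this form as an assumption. The function name and the six scores (N = 6, so df = 5) are also illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# t(x) values for df = 5, copied from the table above
T_DF5 = {"99.9%": 6.87, "99%": 4.03, "95%": 2.57, "90%": 2.02, "80%": 1.48}

def confidence_interval_df5(scores, level="95%"):
    """Return (lower, upper) for the mean, assuming M +/- t * S / sqrt(N)."""
    n = len(scores)
    if n - 1 != 5:
        raise ValueError("this table only covers df = 5 (N = 6)")
    m = mean(scores)
    half_width = T_DF5[level] * stdev(scores) / sqrt(n)   # stdev uses N - 1
    return m - half_width, m + half_width

scores = [2, 4, 4, 5, 7, 8]       # illustrative scores, M = 5
for level in T_DF5:
    lower, upper = confidence_interval_df5(scores, level)
    print(f"{level}: {lower:.2f} to {upper:.2f}")
```

Higher confidence levels use larger t values, so the interval widens as the level rises.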
The Normal Distribution
Standard Normal Distribution: The mean is set equal to 0 and the standard deviation is set equal to 1
Standard Scores or z-scores
A raw score is transformed to a standard score corresponding to a location on the abscissa (x-axis) of a standard normal curve
Allows for comparison of scores from different data sets
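The transformation itself is not written out on this slide. The standard definition, given here as an assumption about what the chapter intends, subtracts the mean from the raw score and divides by the standard deviation. The worked line uses Student 1's Exam 1 values from the table on the next slide (X = 42, M = 21.2, σ = 11.69).

```latex
z = \frac{X - M}{\sigma}
\qquad\text{e.g.}\qquad
z_1 = \frac{42 - 21.2}{11.69} = \frac{20.8}{11.69} \approx +1.78
```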
Raw Scores (X) and Standard Scores (z) on Two Exams
Student ID           Exam 1                Exam 2           Average of z1
and gender     X1 score  z1 score    X2 score  z2 score    and z2 scores
 1 (M)            42      +1.78         90      +1.21         +1.50
 2 (M)             9      -1.04         40      -1.65         -1.34
 3 (F)            28      +0.58         92      +1.33         +0.96
 4 (M)            11      -0.87         50      -1.08         -0.98
 5 (M)             8      -1.13         49      -1.13         -1.13
 6 (F)            15      -0.53         63      -0.33         -0.43
 7 (M)            14      -0.62         68      -0.05         -0.34
 8 (F)            25      +0.33         75      +0.35         +0.34
 9 (F)            40      +1.61         89      +1.16         +1.38
10 (F)            20      -0.10         72      +0.18         +0.04
Sum (Σ)          212                   688
Mean (M)        21.2                  68.8
SD (σ)         11.69       1.0       17.47       1.0          0.98
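The table's z-scores can be recomputed from the raw scores. In the sketch below, Student 9's Exam 1 score is taken as 40, a value inferred from the column sum of 212 rather than read directly, and the descriptive (divide by N) standard deviation is assumed because it reproduces the tabled SDs of 11.69 and 17.47. The helper name `z_scores` is illustrative.

```python
from statistics import mean, pstdev

exam1 = [42, 9, 28, 11, 8, 15, 14, 25, 40, 20]   # Student 9's 40 is inferred from the sum of 212
exam2 = [90, 40, 92, 50, 49, 63, 68, 75, 89, 72]

def z_scores(scores):
    """Standardize scores using the descriptive (population) SD."""
    m = mean(scores)
    sd = pstdev(scores)
    return [(x - m) / sd for x in scores]

z1 = z_scores(exam1)
z2 = z_scores(exam2)
average_z = [(a + b) / 2 for a, b in zip(z1, z2)]

for student, (a, b, avg) in enumerate(zip(z1, z2, average_z), start=1):
    print(f"{student:>2}  z1 = {a:+.2f}  z2 = {b:+.2f}  average = {avg:+.2f}")
```

Because the averages here come from unrounded z-scores, an entry can differ from the table in the last decimal place. Putting both exams on the z scale is what makes an 8 on Exam 1 and a 49 on Exam 2 directly comparable: both sit about 1.13 standard deviations below their exam's mean.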