Displaying data Seminar 2
Today’s Question Once we have collected a large number of measurements, how can we summarize or describe those measurements most effectively by using visual techniques?
Part 1 Conveying Summary data
Pie Charts Pie charts are commonly used to represent scores that have a fixed total (e.g., 100%); that is why the chart is a full circle. Here: income spent on necessities. Rarely used in psychology research. Why?
Bar charts The same pie chart can be translated into a bar chart. But bar charts do not assume that there is a fixed total. Hence, a bar chart does not necessarily convert back into a pie chart.
Line graphs X-axis: usually, but not always, a continuous variable. Are there any differences in interpretation between the two graphs?
Scatterplots X-axis: almost always a continuous variable. A line of best fit is usually added. Very useful for inspecting outliers.
Let’s take a look at Excel. Advantages: high-resolution graphics tool; easy to use. Disadvantages: cannot plot broken axes; plotting of error bars is complicated; no histograms (you need to install the Analysis ToolPak add-in).
What does Excel offer? You probably need only three: column, line & scatter
WARNING: Never plot 3D graphs
Part 2 Conveying Distributional information
An Example How stressed have you been in the last 2 weeks? Scale: 0 (not at all) to 10 (feels like exploding). 4 7 7 7 8 8 7 8 9 4 7 3 6 9 10 5 7 10 6 8 7 8 7 8 7 4 5 10 10 0 9 8 3 7 9 7 9 5 8 5 0 4 6 6 7 5 3 2 8 5 10 9 10 6 4 8 8 8 4 8 7 3 7 8 8 8 7 9 7 5 6 3 4 8 7 5 7 3 3 6 5 7 5 7 8 8 7 10 5 4 3 7 6 3 9 7 8 5 7 9 9 3 1 8 6 6 4 8 5 10 4 8 10 5 5 4 9 4 7 7 7 6 6 4 4 4 9 7 10 4 7 5 10 7 9 2 7 5 9 10 7 2 5 9 8 10 10 6 8 3 How can you create some order in the chaos?
Frequency Tables A frequency table shows how often each value of the variable occurs.
Stress rating  Frequency
10             14
9              15
8              26
7              31
6              13
5              18
4              16
3              12
2              1
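A frequency table can also be tallied by a short script rather than by hand. The sketch below is only an illustration, assuming Python; it uses just the first 20 ratings from the example, and the names `ratings` and `frequency_table` are made up for this sketch.

```python
from collections import Counter

# First 20 stress ratings from the example (0 = not at all, 10 = feels like exploding).
ratings = [4, 7, 7, 7, 8, 8, 7, 8, 9, 4, 7, 3, 6, 9, 10, 5, 7, 10, 6, 8]


def frequency_table(values):
    """Return (value, frequency) pairs, highest value first."""
    counts = Counter(values)
    return sorted(counts.items(), reverse=True)


table = frequency_table(ratings)

print("Stress rating  Frequency")
for rating, freq in table:
    print(f"{rating:>13}  {freq:>9}")
```

Note that values that never occur are simply absent from the result; they would need to be added with a frequency of 0 if the table should list every point on the scale.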
Frequency Polygon The stress-rating frequencies from the frequency table, plotted as connected points.
Histograms Another way of visually representing the information contained in a frequency table. Histograms are like bar charts; bars are used instead of connected points. The bars typically cover “intervals” (also called “bins”) of values. The first bar here covers scores > 0 and < 1.
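The idea of bins can be sketched in a few lines. This is an illustration only, assuming Python, the first 20 ratings from the stress example, and a bin width of 2; the function `bin_counts` is hypothetical, and the convention that a bin includes its lower edge but not its upper edge is an assumption of this sketch (software packages differ on this).

```python
from collections import Counter

# First 20 stress ratings from the example.
ratings = [4, 7, 7, 7, 8, 8, 7, 8, 9, 4, 7, 3, 6, 9, 10, 5, 7, 10, 6, 8]


def bin_counts(values, width):
    """Count values per bin; a bin starting at `lo` covers lo <= value < lo + width."""
    counts = Counter((v // width) * width for v in values)
    lo, hi = min(counts), max(counts)
    # Include empty bins between the lowest and highest occupied bin.
    return {start: counts.get(start, 0) for start in range(lo, hi + width, width)}


bins = bin_counts(ratings, width=2)

# Crude text histogram: one '#' per observation in the bin.
for start, count in bins.items():
    print(f"{start:>2} to <{start + 2:<2} | {'#' * count}")
```

Changing `width` changes the picture: wide bins hide detail, narrow bins make the shape look noisy.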
Shapes of Distributions These representational aids all describe frequency distributions: the way score frequencies are distributed with respect to the values of the variable. Distributions can take on a number of shapes or forms.
Unimodal Distributions The mode of a distribution is the most frequently occurring score (mode = “peak”). In a unimodal distribution, one score occurs much more frequently than the others.
Multimodal Distributions In a multimodal distribution, more than one mode exists (or approximately so). In a bimodal distribution, two modes exist. What could cause a bimodal distribution? Note: this is not a binomial distribution.
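Finding the mode or modes of a set of scores can be sketched as follows. This assumes Python 3.8 or later, where `statistics.multimode` is available; the data sets are illustrative. Note that `multimode` only reports exact ties for the highest frequency, so it will not detect two peaks that are merely approximately equal.

```python
from statistics import multimode

# First 20 stress ratings from the example: one clear peak.
ratings = [4, 7, 7, 7, 8, 8, 7, 8, 9, 4, 7, 3, 6, 9, 10, 5, 7, 10, 6, 8]

# Made-up scores with two equally frequent values (for illustration only).
two_peaks = [2, 2, 5, 9, 9]

unimodal_modes = multimode(ratings)
bimodal_modes = multimode(two_peaks)

print("Modes of ratings:", unimodal_modes)
print("Modes of two_peaks:", bimodal_modes)
```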
Rectangular or Uniform Distributions In a uniform distribution, all values are observed equally often
Question Suppose you throw a die. What will the shape of the distribution of the numbers be? *** Hint for Tutorial 4 ***
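One way to explore the question is to simulate a large number of throws and look at the resulting frequencies. The sketch below assumes Python and a fair six-sided die; the number of throws (6000) and the seed are arbitrary choices made for this illustration.

```python
import random
from collections import Counter

random.seed(1)  # fixed seed so the run is repeatable

# Simulate 6000 throws of a fair six-sided die.
throws = [random.randint(1, 6) for _ in range(6000)]
counts = Counter(throws)

# Crude text bar chart: one '#' per 50 throws.
for face in range(1, 7):
    print(face, "#" * (counts[face] // 50))
```

Try a much smaller number of throws as well and compare how even the bars look.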
Symmetrical and Skewed Distributions A symmetrical distribution is balanced: if we cut it in half, the two sides are mirror images of one another. Normal distribution: a particular kind of symmetrical distribution that resembles a bell (bell-shaped distribution).
Skewed Distributions A skewed distribution is unbalanced; there may be a cluster of scores piling up at one end of the scale.
Question: What are some possible reasons for skewed distributions? (Panels: negative skew, positive skew)
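The direction of skew can also be expressed as a number. The sketch below assumes Python and uses the moment coefficient of skewness (third central moment divided by the 1.5th power of the variance), which is one common definition and is not taken from these slides. A negative value indicates a tail toward low scores (negative skew), a positive value a tail toward high scores (positive skew). All three data sets are made up for illustration.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Illustrative data sets (not from the seminar).
tail_to_low = [0, 5, 8, 9, 9, 10, 10, 10]   # scores pile up at the high end
tail_to_high = [0, 0, 0, 1, 1, 2, 5, 10]    # scores pile up at the low end
balanced = [1, 2, 3, 4, 5]

for name, data in [("tail_to_low", tail_to_low),
                   ("tail_to_high", tail_to_high),
                   ("balanced", balanced)]:
    print(f"{name}: skewness = {skewness(data):.2f}")
```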
Boxplots An efficient way to display five attributes: smallest non-outlier, lower quartile, median, upper quartile, largest non-outlier. More in Seminar 3.
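The five attributes can be computed directly. This is a sketch under stated assumptions, in Python: outliers are defined by the common 1.5 x IQR rule, which the slides do not specify, and quartiles are computed with `statistics.quantiles` using `method="inclusive"`; other software may give slightly different quartiles. The function `boxplot_summary` and the data are hypothetical.

```python
from statistics import quantiles


def boxplot_summary(values, k=1.5):
    """Five boxplot attributes plus outliers, using the k * IQR fence rule (an assumption)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "min_non_outlier": min(inside),
        "lower_quartile": q1,
        "median": median,
        "upper_quartile": q3,
        "max_non_outlier": max(inside),
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


# Made-up scores with one extreme value.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
summary = boxplot_summary(scores)

for key, value in summary.items():
    print(f"{key}: {value}")
```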
Boxplot conveys even more information! What does it convey? __________
Part 3 Tables
Why use tables? Efficient for presenting simple lists (e.g., demographic characteristics). But can be difficult to comprehend (any solutions?). Chan et al. (2012). What do love and jealousy taste like? Emotion.
Contingency tables A matrix format that displays the data (e.g., frequency or mean) of the variables. This is a 2 (Sex: male vs. female) x 2 (Handedness: right vs. left) contingency table.
          Right-handed  Left-handed  Total
Males               43            9     52
Females             44            4     48
Total               87           13    100
The Total row and Total column hold the marginal totals/means; the bottom-right cell is the grand total or mean.
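The marginal totals and the grand total follow from the four cell frequencies. A minimal sketch, assuming Python and a nested dictionary as the table representation (the variable names are made up for this illustration):

```python
# Cell frequencies from the 2 x 2 Sex by Handedness table.
table = {
    "Males": {"Right-handed": 43, "Left-handed": 9},
    "Females": {"Right-handed": 44, "Left-handed": 4},
}
columns = ["Right-handed", "Left-handed"]

# Marginal totals: sum across each row and down each column.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}
col_totals = {col: sum(table[row][col] for row in table) for col in columns}

# Grand total: sum of all cells (equivalently, of either set of marginals).
grand_total = sum(row_totals.values())

print("Row totals:", row_totals)
print("Column totals:", col_totals)
print("Grand total:", grand_total)
```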
Part 4 What makes a good or bad graph?
Your task Get into groups of about 5. Identify any problems with the following graphs. Devise a solution. Think whether your solution creates other problem(s).
Nummenmaa et al. (2014). Bodily maps of emotions. PNAS.
World Bank (2015). World development indicators.
Take home messages Graphs should be as simple as possible (even colors should be avoided; why?). Plots communicate information easily, but sometimes tables do a better job. Think about what you want to communicate, and about how readers will interpret your data.