Lecture 4: overplotting and multipanel plots


1 Lecture 4: overplotting and multipanel plots
Trevor A. Branch Beautiful graphics in R, FISH554 SAFS, University of Washington

2 Class readings Course website: files\readings
All are relevant and cover a range from classic papers to reports to guides, for example:
Wong ( ): one-page classic tips published in Nature Methods
Wainer (1984): how to display data badly
Gelman (2011): why tables are so much better than graphs (plus responses)
Rougier (2014): ten rules for better figures
McCauley et al. (2015): wonderful graphics in a paper

3 McCauley et al. (2015) paper (paid graphic designer Nicole Fuller to illustrate paper)
McCauley DJ et al. (2015) Marine defaunation: animal loss in the global ocean. Science 347:

4 McCauley et al. (2015) (combination of R and Illustrator to create lovely graphics in paper itself)
McCauley DJ et al. (2015) Marine defaunation: animal loss in the global ocean. Science 347:

5 McCauley DJ et al. (2015) Marine defaunation: animal loss in the global ocean. Science 347:236-247

6

7 Classic data visualization references
Bertin J (1983) Semiology of graphics, 2nd Edition. University of Wisconsin Press, Madison, Wisconsin. (Lays out in exquisite detail everything you need to create beautiful plots.)
Cleveland WS (1994) The elements of graphing data, 2nd Edition. AT&T Bell Laboratories, Murray Hill, New Jersey. (Conducted experiments to see how people interpret graphics and transformed the field, though the figures in the book are rather poor!)

8 Area plots and polygon() (from last lecture)

9 Source: http://www. informationisbeautiful
Created by David McCandless & Lee Byron, taken from The Visual Miscellaneum
Searches for “we broke up because…”

10 Advanced use of polygon()

11 Line shading using col="black" and col="gray50"
Branch et al. (2006) Marine Policy 30:
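
As a minimal sketch of solid shading with polygon(), the block below stacks two filled areas using col="black" and col="gray50". The data and the helper `area_coords()` are made up for illustration and are not taken from the cited figure.

```r
# Illustrative data (invented for this sketch): two catch series over time
years  <- 1990:2005
catchA <- c(5, 7, 9, 12, 15, 14, 13, 11, 10, 9, 8, 8, 7, 6, 6, 5)
catchB <- c(2, 3, 3, 4, 6, 8, 9, 10, 9, 8, 7, 5, 4, 4, 3, 3)

# Hypothetical helper: trace the upper edge left to right, then the lower
# edge right to left, so the outline forms a closed shape
area_coords <- function(x, lower, upper) {
  list(x = c(x, rev(x)), y = c(upper, rev(lower)))
}

polyA <- area_coords(years, rep(0, length(years)), catchA)  # bottom layer
polyB <- area_coords(years, catchA, catchA + catchB)        # stacked on top

# Empty plot first, then add the shaded areas
plot(years, catchA + catchB, type = "n",
     ylim = c(0, max(catchA + catchB)),
     xlab = "Year", ylab = "Catch", las = 1)
polygon(polyA$x, polyA$y, col = "black",  border = NA)
polygon(polyB$x, polyB$y, col = "gray50", border = NA)
```

Reversing the second half of the coordinates is what keeps the outline from crossing itself.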

12 Line shading using angle=45 and density=20
Branch (2006) Bulletin of Marine Science 78:
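
A small sketch of line (hatch) shading follows. In base R, polygon() draws shading lines when `density` is positive, and `angle` sets their slope; the sketch assumes the value 20 on the slide refers to `density`. The data are invented for illustration.

```r
# Illustrative data (invented for this sketch)
x <- 1:10
y <- c(2, 4, 5, 7, 8, 8, 6, 5, 3, 2)

# Close the shape down to y = 0 so the hatching fills the area under the line
px <- c(x, rev(x))
py <- c(y, rep(0, length(y)))

plot(x, y, type = "n", ylim = c(0, max(y)), xlab = "x", ylab = "y", las = 1)

# density = shading lines per inch; angle = slope of those lines in degrees
polygon(px, py, density = 20, angle = 45, col = "black", border = "black")
```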

13 The overplotting problem
Fig. 1. Number of publications per capita per year published by Czech avian ecologists up to 2006 plotted against their beer consumption per capita per year in litres. Both data sets shown are Box-Cox transformed (thus neither the output score nor the consumption score values enable the identification of particular persons included in this research). The negative relationship between beer consumption and publication success is significant not only for the whole data set (rs = -0.55, n = 34, p = 0.0008) but also for "past" (included in the first survey in 2002) and "present" researchers (included in 2006) analyzed separately ("past": rs = -0.68, n = 18, p = 0.002; "present": rs = -0.52, n = 16, p = 0.04).
Grim T (2008) A possible role of social activity to explain differences in publication output among ecologists. Oikos 117:

14 Source: http://www. destination360

15 Old Faithful predictions: hexbin()
Axis labels: Next interval (min); This interval (min)
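
The following sketch shows one way to bin an interval-versus-next-interval scatter into hexagons. It assumes R's built-in `faithful` data set as a stand-in for the lecture's data, assumes "this interval" belongs on the x-axis, and requires the hexbin package, which is not part of base R.

```r
# Assumption: the built-in `faithful` data stand in for the lecture's data
waiting <- faithful$waiting
n <- length(waiting)
this_interval <- waiting[-n]   # interval i
next_interval <- waiting[-1]   # interval i + 1

if (requireNamespace("hexbin", quietly = TRUE)) {
  suppressPackageStartupMessages(library(hexbin))
  # Count points per hexagonal cell instead of drawing every point
  hb <- hexbin(this_interval, next_interval, xbins = 20,
               xlab = "This interval (min)", ylab = "Next interval (min)")
  plot(hb)   # darker hexagons hold more points
} else {
  message("Package 'hexbin' is not installed; skipping the hexbin plot.")
}
```

Because each hexagon reports a count, heavily overlapping points show up as dark cells rather than as one indistinct blob.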

16 Multipanel plots: key lessons
Shrink figures
Small multiples
Delete extra axes
One caption per axis
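
These four lessons can be sketched in base R as follows. The data are simulated and the panel names are hypothetical; only par(), plot(), axis(), box(), and mtext() from base graphics are used.

```r
set.seed(1)                                  # reproducible simulated data
years   <- 1981:2010
groups  <- paste("Fishery", LETTERS[1:6])    # hypothetical panel names
revenue <- sapply(groups, function(g) cumsum(rnorm(length(years))) + 10)

par(mfrow = c(2, 3),          # small multiples: 2 rows x 3 columns
    mar = c(0, 0, 0, 0),      # no space between panels (shrink figures)
    oma = c(4, 4, 1, 1))      # outer margin holds the shared axes and labels
ylim <- range(revenue)        # common y scale so panels are comparable

for (i in seq_along(groups)) {
  plot(years, revenue[, i], type = "l", ylim = ylim,
       axes = FALSE, xlab = "", ylab = "")
  box(col = "gray50")
  mtext(groups[i], side = 3, line = -1.5, cex = 0.8)  # label inside the panel
  if (i > 3)       axis(1)             # x-axis on the bottom row only
  if (i %% 3 == 1) axis(2, las = 1)    # y-axis on the left column only
}

# One caption per axis, written once in the outer margin
mtext("Year",    side = 1, outer = TRUE, line = 2.5)
mtext("Revenue", side = 2, outer = TRUE, line = 2.5)
```

Setting `mar` to zero and moving everything shared into `oma` is what removes the repeated axes and captions.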

17 Tufte: small multiples
Consumer Reports 47 (April 1982) p
Tufte (2001) The visual display of quantitative information, p. 174

18 Tufte: graphics can be shrunk way down
Figure: Bertin, Semiology of Graphics, English translation 1983, p. 214
Tufte (2001) The visual display of quantitative information, p. 169

19 Well-designed small multiples
Comparative
Multivariate
Shrunken, high-density
Large data matrix
Almost entirely data-ink
Efficient in interpretation
Narrative in content

20 Data density: 0.02 per cm²
Data density is 0.02 numbers per square centimeter
Executive Office of the President (1973) Office of Management and Budget, Social Indicators, Washington DC, p. 86
Tufte (2001) The visual display of quantitative information, p
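
Data density is the number of values shown divided by the area of the graphic. The arithmetic can be sketched as below; the function name `data_density()` and the example dimensions are hypothetical and are not measurements of the figure on the slide.

```r
# Hypothetical helper: numbers displayed per square centimeter of graphic
data_density <- function(n_numbers, width_cm, height_cm) {
  n_numbers / (width_cm * height_cm)
}

# Invented example: a 20 cm x 10 cm chart that shows only 4 numbers
sparse <- data_density(4, 20, 10)      # 4 / 200 = 0.02 per cm^2

# Invented example: the same area showing 2000 numbers
dense <- data_density(2000, 20, 10)    # 2000 / 200 = 10 per cm^2
```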

21 The communes of France Data density 17,000 per cm²
Bertin (1983) Semiology of Graphics, English translation, p. 152 In: Tufte (2001) The visual display of quantitative information, p. 166

22 This is a dashboard, which allows for very high data density
Source: Performance Dashboards: Measuring, Monitoring, and Managing Your Business

23 For non-data-ink, less is more. For data-ink, less is a bore.
Tufte (2001) The visual display of quantitative information, p. 175

24 How can this figure be improved?

25 Exercise 1a Plot the data in "OldFaithfulDuration.csv"
Then solve the overplotting problem using different symbols, sizes, colors, and jittering; also try hexbin()
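
One possible starting point, not a model answer, combines jittering with small translucent symbols. It assumes R's built-in `faithful` data as a stand-in, because the column layout of the course file is not shown here.

```r
# Assumption: built-in `faithful` data stand in for the course file
w <- faithful$waiting
this_interval <- head(w, -1)   # interval i
next_interval <- tail(w, -1)   # interval i + 1

# Intervals are whole minutes, so many points share exact coordinates.
# Jitter adds uniform noise of at most +/- 0.4 min to separate them.
set.seed(42)                   # make the jitter reproducible
jx <- jitter(this_interval, amount = 0.4)
jy <- jitter(next_interval, amount = 0.4)

plot(jx, jy, pch = 16, cex = 0.8,
     col = rgb(0, 0, 0, alpha = 0.25),   # translucent: overlaps look darker
     xlab = "This interval (min)", ylab = "Next interval (min)", las = 1)
```

Keeping the jitter amount below 0.5 means every point still rounds back to its true value.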

26 Exercise 2a Create a small multiple plot of the data in “CalCurrRevenue.csv” like the plot below (or better!).

27 Exercise 1b (advanced) Plot the data in "OldFaithfulDuration.csv"
Solve the overplotting problem with circle size proportional to number of points at each intersection of x and y value
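
A hedged sketch of the counting approach: tally the points at each (x, y) combination, then draw one circle per combination. It again assumes the built-in `faithful` data rather than the course file, and the name `lagged` is only illustrative.

```r
# Assumption: built-in `faithful` data stand in for the course file
w <- faithful$waiting
lagged <- data.frame(this_interval = head(w, -1),
                     next_interval = tail(w, -1),
                     n = 1)

# Count how many points fall on each (x, y) combination
counts <- aggregate(n ~ this_interval + next_interval, data = lagged, FUN = sum)

# sqrt() makes circle AREA, rather than radius, proportional to the count
symbols(counts$this_interval, counts$next_interval,
        circles = sqrt(counts$n),
        inches = 0.1,                    # radius of the largest circle, in inches
        bg = "gray50", fg = "black",
        xlab = "This interval (min)", ylab = "Next interval (min)")
```

Scaling by area is a design choice: scaling the radius directly would exaggerate the larger counts.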

28 Exercise 2b (advanced) Create a small multiple plot of the data in “CalCurrRevenue.csv”, using polygon() to shade the area of the catches.

