ggplot2 Merrill Rudd TAs: Brooke Davis and Megsie Siple


1 ggplot2 Merrill Rudd TAs: Brooke Davis and Megsie Siple
FISH 512: Super-Advanced R

2 Goals for the lecture
If you have used ggplot2 before:
  Learn some new tricks
  Obtain a useful handout
  Share your tricks with others
If you have not used ggplot2 before:
  Learn the basics
  Start using it for easy data exploration

3 Figure Credit: Sean Anderson

4 Sources
http://docs.ggplot2.org/current/
Elegant Graphics for Data Analysis:
Sean Anderson – notes for ggplot2, FSH 554
Cookbook for R:

5 Fundamentals to ggplot2
Data massaging
Layering
Themes

6 Data massaging AGRRA database

7 Data format requirements
“long” format data, “tidy data”
Each aesthetic or facet variable in its own column
Useful packages: reshape2, plyr, dplyr (Lecture 6)

8 Merging multiple data sets with shared columns
base R
  merge(): joins 2 data frames by matching ID variables
plyr
  Join functions: merge data frames by ID variables, more flexible than merge()
  join_all(): takes a list of data frames
  Can specify rows to use (1st data frame only, rows from all data frames, etc.)
reshape2
  melt(): reshapes data into long form
    Option to specify ID and measurement variables
  Cast functions: reshape data into wide form
    dcast() or acast(), depending on whether you want data frame or array output
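A minimal sketch of the functions listed above, run on a small made-up data set. The data frames `coral` and `fish` and their column names are hypothetical stand-ins, not the AGRRA files used in the lecture.

```r
library(plyr)      # join_all()
library(reshape2)  # melt(), dcast()

# Hypothetical toy data: sites A and B appear in both tables, site C only in coral
coral <- data.frame(Site = c("A", "B", "C"), Year = 2010,
                    CoralCount = c(12, 30, 7))
fish  <- data.frame(Site = c("A", "B"), Year = 2010,
                    FishBiomass = c(850, 1200))

# base R: merge() does an inner join by default, so site C is dropped
merged_base <- merge(coral, fish, by = c("Site", "Year"))

# plyr: type = "full" keeps rows from all data frames and fills gaps with NA
merged_all <- join_all(list(coral, fish), by = c("Site", "Year"), type = "full")

# reshape2: wide to long, naming the ID variables explicitly
long <- melt(merged_all, id.vars = c("Site", "Year"),
             variable.name = "Measure", value.name = "Value")

# ...and back to wide, one column per measurement variable
wide <- dcast(long, Site + Year ~ Measure, value.var = "Value")
```

Printing `merged_base` next to `merged_all` shows the practical difference between the join types: only the full join keeps site C.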

9 Atlantic and Gulf Rapid Reef Assessment (AGRRA)

10 Variables in Data
3 data files
Simplified variables for this example
Average and standard deviation for most variables
Shared ID Variables: Country, Year, Site, Date, Depth, Zone, Number of Transects
Coral Mortality: Number of Corals, Total Standing Dead
Algae Abundance: Crustose, Turf, Macro
Fish Biomass: Total Fish Biomass, Herbivore Biomass, Invertivore Biomass, Piscivore Biomass

11 DataMassageAGRRA.R

12 DataMassageAGRRA.R Wide form to long form

13 DataMassageAGRRA.R Wide form to long form
Specify variables that should have separate value columns

14 DataMassageAGRRA.R Wide form to long form
Specify variables that should have separate value columns
Get 2 separate value columns for each factor

15 DataMassageAGRRA.R Wide form to long form
Specify variables that should have separate value columns
Get 2 separate value columns for each factor
+ a couple of other boring steps in code…
And repeat for algae and fish biomass data
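The contents of DataMassageAGRRA.R are not reproduced in this transcript, so the following is only one possible way to end up with two value columns (average and standard deviation) per factor. The data frame and its `_Avg` / `_SD` column names are hypothetical.

```r
library(reshape2)

# Hypothetical wide data: one average and one SD column per measurement
wide <- data.frame(Site = c("A", "B"),
                   Turf_Avg = c(20, 35), Turf_SD = c(4, 6),
                   Macro_Avg = c(10, 15), Macro_SD = c(2, 3))

# Melt averages and SDs separately so each gets its own value column
avg <- melt(wide, id.vars = "Site", measure.vars = c("Turf_Avg", "Macro_Avg"),
            variable.name = "Variable", value.name = "Average")
sdv <- melt(wide, id.vars = "Site", measure.vars = c("Turf_SD", "Macro_SD"),
            variable.name = "Variable", value.name = "SD")

# Strip the suffixes so both pieces share the same labels, then merge them
avg$Variable <- sub("_Avg$", "", avg$Variable)
sdv$Variable <- sub("_SD$", "", sdv$Variable)
long <- merge(avg, sdv, by = c("Site", "Variable"))
```

The result has one row per Site and Variable combination, with `Average` and `SD` side by side.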

16 DataMassageAGRRA.R Join multiple data frames from a list

17 DataMassageAGRRA.R Join multiple data frames from a list
Include rows from all data frames; this results in NAs for ID variable combinations that lack data for some measurement variables

18 Take a look at the data structure
“SelectAGRRA_long.csv” read into “Explore_ggplot2.R”
“DataMassageAGRRA.R” and original csv files for reference

19 Layering ggplot2 functions

20 Difference between plot and ggplot2 functions
qplot() = wrapper around ggplot(): less syntax for common tasks
ggplot() will work in all cases
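A short sketch of the two styles, using R's built-in `mtcars` data rather than the lecture data (an assumption made so the example is self-contained). The `qplot()` call is left as a comment because `qplot()` has been deprecated since ggplot2 3.4.0.

```r
library(ggplot2)

# Full ggplot() call: data, aesthetic mapping, and geom are stated explicitly
p_full <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# qplot() shorthand for the same scatterplot (commented out because
# qplot() is deprecated in recent ggplot2 releases):
# p_quick <- qplot(wt, mpg, data = mtcars)
```

Calling `print(p_full)` or typing `p_full` at the console draws the plot.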

21 ggplot function
Layered grammar
data + geometric representation + aesthetics + layout
See file: ggplot2_explore.R
Hadley Wickham – Elegant Graphics for Data Analysis

22 Data and aesthetic mapping
Data frame
Columns in data frame

23 Data and aesthetic mapping
Data frame
Columns in data frame
Can add aesthetics to the initial aes() call, or add them later as another layer on the base plot
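The two places an aesthetic can go, sketched with the built-in `mtcars` data (an assumption; the slides use the AGRRA data). Both plots come out the same here, since there is only one layer.

```r
library(ggplot2)

# Aesthetics in the initial ggplot() call are inherited by every layer
p_global <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()

# The same colour mapping added later, only in the layer that needs it
base    <- ggplot(mtcars, aes(x = wt, y = mpg))
p_layer <- base + geom_point(aes(colour = factor(cyl)))
```

Keeping a plain `base` object around makes it easy to try different layers on top of the same data and axes.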

24 Geoms
Slide courtesy of Sean Anderson

25

26

27

28

29 Position adjustment

30

31 Scales
Controls the mapping from data to aesthetic attributes
scale_xxx_yyy()
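As an illustration of the scale_xxx_yyy() naming pattern (aesthetic first, then scale type), here is a sketch on the built-in `mtcars` data. The colours and axis title are arbitrary choices for the example.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  # y aesthetic, log10 scale
  scale_y_log10(name = "Miles per gallon (log scale)") +
  # colour aesthetic, manually chosen values for each level
  scale_colour_manual(name = "Cylinders",
                      values = c("4" = "darkgreen", "6" = "orange", "8" = "purple"))
```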

32

33

34 Faceting
Investigating whether patterns hold across all conditions
Discrete variables in data frame
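A minimal faceting sketch on the built-in `mtcars` data (an assumption; the lecture facets the AGRRA data), splitting the same scatterplot by one or two discrete variables.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# One panel per level of a single discrete variable
p_wrap <- p + facet_wrap(~ cyl)

# Rows by one discrete variable, columns by another
p_grid <- p + facet_grid(am ~ cyl)
```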

35

36

37

38 Subsetting data
So far, all examples used data that was available for every ID variable
  Country, Depth, Zone, Number of Transects (all data points)
  Number of Corals, Total Fish Biomass (not all data points, but for all of the above categorical variables)
Need to subset the data to choose what to plot when you’re only interested in 1 factor from a list of possible factors
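A sketch of subsetting long-format data before plotting. The data frame below, including its country names and values, is invented for illustration and only mimics the shape of the lecture's long-format file.

```r
library(ggplot2)

# Hypothetical long-format data: several measurement variables stacked in one column
agrra_long <- data.frame(
  Country  = rep(c("Belize", "Mexico"), each = 4),
  Variable = rep(c("Turf", "Macro"), times = 4),
  Value    = c(20, 10, 25, 12, 35, 15, 30, 18)
)

# Keep only the rows for the one variable of interest before plotting
turf <- subset(agrra_long, Variable == "Turf")

p <- ggplot(turf, aes(x = Country, y = Value)) +
  geom_boxplot()
```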

39

40

41 Specifying data in geom
OR
This example just used a different subset from the same dataset
Useful if working with multiple datasets or if you would like to refer to any other object
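A sketch of giving a geom its own data, using two subsets of the built-in `mtcars` data (an assumption standing in for the lecture's subsets).

```r
library(ggplot2)

# Two subsets of the same data set (in mtcars, am = 1 is manual, am = 0 is automatic)
manual    <- subset(mtcars, am == 1)
automatic <- subset(mtcars, am == 0)

# The base plot uses one data frame; a later geom is handed a different one
p <- ggplot(manual, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_point(data = automatic, colour = "red")
```

The second layer still inherits the x and y mappings from `ggplot()`, so the other data frame needs columns with the same names.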

43 Adding lines and rectangles to plots
See also: geom_abline(), geom_vline(), geom_hline(), geom_rect()
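A sketch combining the four geoms named above on the built-in `mtcars` data. The line positions and the rectangle's bounds are arbitrary values picked for the example.

```r
library(ggplot2)

# A one-row data frame describing the rectangle to shade
band <- data.frame(xmin = 3, xmax = 4, ymin = -Inf, ymax = Inf)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # Shaded band drawn first so the points sit on top of it
  geom_rect(data = band,
            aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
            inherit.aes = FALSE, alpha = 0.2) +
  geom_point() +
  geom_hline(yintercept = mean(mtcars$mpg), linetype = "dashed") +  # horizontal line
  geom_vline(xintercept = 3.5, colour = "blue") +                   # vertical line
  geom_abline(intercept = 37, slope = -5, colour = "red")           # slope/intercept line
```

`inherit.aes = FALSE` stops the rectangle layer from looking for `wt` and `mpg` in `band`.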

44 Statistics
stat_smooth
Fewer than 1000 points: default “loess”
1000 or more points: default “gam”
This would take a lot more time to create in base R!

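A sketch of the smoother from the Statistics slide, on the built-in `mtcars` data (an assumption). With only 32 rows the default would choose loess anyway; the method is spelled out here to make that choice visible.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # loess fit with a confidence band (se = TRUE)
  stat_smooth(method = "loess", formula = y ~ x, se = TRUE)
```
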
45

46

47

48 Publication quality figures
Adjusting the theme
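One way to adjust a theme, sketched on the built-in `mtcars` data. The specific settings are illustrative choices and are not taken from the lecture's custom theme.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon") +
  theme_bw(base_size = 14) +                  # start from a built-in theme
  theme(panel.grid.minor = element_blank(),   # then override individual elements
        axis.title = element_text(face = "bold"),
        legend.position = "bottom")
```

Later `theme()` calls override earlier ones, so a complete theme such as `theme_bw()` should come before the element-level tweaks.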

49

50 See example code for creating a custom theme in “ggplot2_explore.R”
Courtesy of:

51

52 ggthemes package

53 Exercises in code ending with… Recreating this plot!

54 Coordinate System
Maps position of objects onto the plane of the plot
(x, y) coordinates: potential for more dimensions, but not yet capable
Cartesian
Semi-log
Polar
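A sketch of the three cases listed on the slide, on the built-in `mtcars` data (an assumption). The semi-log version here uses `scale_y_log10()`, which is a scale rather than a coordinate function; it stands in as a simple way to get a log-scaled y axis.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Cartesian: zoom in on part of the x axis without dropping any data
p_cartesian <- p + coord_cartesian(xlim = c(2, 5))

# Semi-log: log-scaled y axis, linear x axis
p_semilog <- p + scale_y_log10()

# Polar: a stacked bar chart bent around the y axis becomes a pie chart
p_polar <- ggplot(mtcars, aes(x = factor(1), fill = factor(cyl))) +
  geom_bar(width = 1) +
  coord_polar(theta = "y")
```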

