Published by Kelley Quinn. Modified over 6 years ago.
1
ggplot2 Merrill Rudd TAs: Brooke Davis and Megsie Siple
FISH 512: Super-Advanced R
2
Goals for the lecture
If you have used ggplot2 before:
  Learn some new tricks
  Obtain a useful handout
  Share your tricks with others
If you have not used ggplot2 before:
  Learn the basics
  Start using it for easy data exploration
3
Figure Credit: Sean Anderson
4
Sources
http://docs.ggplot2.org/current/
Elegant Graphics for Data Analysis:
Sean Anderson: notes for ggplot2, FSH 554
Cookbook for R:
5
Fundamentals of ggplot2
Data massaging
Layering
Themes
6
Data massaging AGRRA database
7
Data format requirements
“Long” format data, “tidy data”
Each aesthetic or facet variable in its own column
Useful packages: reshape2, plyr, dplyr (Lecture 6)
8
Merging multiple data sets with shared columns
base R
  merge(): joins 2 data frames by matching ID variables
plyr
  Join functions: merge data frames by ID variables, more flexible than merge()
  join_all(): takes a list of data frames
  Can specify rows to use (1st data frame only, rows from all data frames, etc.)
reshape2
  melt(): reshapes data into long form
  Option to specify ID and measurement variables
  Cast functions: reshape data into wide form
  dcast() or acast(), depending on whether you want data frame or array output
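A minimal sketch of the base R merge() behaviour described above. The two data frames and their values are hypothetical toy data, not the AGRRA files; only base R is used.

```r
# Hypothetical toy data frames that share the ID columns Country and Site
coral <- data.frame(Country   = c("Belize", "Belize", "Mexico"),
                    Site      = c("B1", "B2", "M1"),
                    NumCorals = c(120, 95, 140))
fish  <- data.frame(Country          = c("Belize", "Mexico", "Mexico"),
                    Site             = c("B1", "M1", "M2"),
                    TotalFishBiomass = c(3.4, 5.1, 2.8))

# Default merge: keep only ID combinations present in both data frames
both <- merge(coral, fish, by = c("Country", "Site"))

# all = TRUE: keep every ID combination and fill the gaps with NA
everything <- merge(coral, fish, by = c("Country", "Site"), all = TRUE)

print(both)
print(everything)
```

plyr's join() covers the same cases through its type argument ("left", "right", "inner", "full").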
9
Atlantic and Gulf Rapid Reef Assessment (AGRRA)
10
Variables in Data
3 data files
Simplified variables for this example
Average and standard deviation for most variables
Shared ID Variables: Country, Year, Site, Date, Depth, Zone, Number of Transects
Coral Mortality: Number of Corals, Total, Standing Dead
Algae Abundance: Crustose, Turf, Macro
Fish Biomass: Total Fish Biomass, Herbivore Biomass, Invertivore Biomass, Piscivore Biomass
11
DataMassageAGRRA.R
12
DataMassageAGRRA.R Wide form to long form
13
DataMassageAGRRA.R Wide form to long form
Specify variables that should have separate value columns
14
DataMassageAGRRA.R Wide form to long form
Specify variables that should have separate value columns Get 2 separate value columns for each factor
15
DataMassageAGRRA.R Wide form to long form
Specify variables that should have separate value columns
Get 2 separate value columns for each factor
+ a couple of other boring steps in code…
And repeat for algae and fish biomass data
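One possible way to get from wide form to long form with two value columns (average and standard deviation) per factor. This is a sketch only: the contents of DataMassageAGRRA.R are not reproduced here, and the column names (Total_Avg, Total_Std, and so on) and values are hypothetical.

```r
library(reshape2)

# Hypothetical wide-form coral data: one Avg and one Std column per variable
coral_wide <- data.frame(
  Country          = c("Belize", "Belize", "Mexico"),
  Site             = c("B1", "B2", "M1"),
  Total_Avg        = c(10.2, 8.5, 12.1),
  Total_Std        = c(2.1, 1.7, 3.0),
  StandingDead_Avg = c(1.1, 0.4, 2.2),
  StandingDead_Std = c(0.5, 0.2, 0.9),
  stringsAsFactors = FALSE
)
id_vars <- c("Country", "Site")

# Melt the averages and the standard deviations separately
avg_long <- melt(coral_wide, id.vars = id_vars,
                 measure.vars = c("Total_Avg", "StandingDead_Avg"),
                 variable.name = "Variable", value.name = "Avg")
std_long <- melt(coral_wide, id.vars = id_vars,
                 measure.vars = c("Total_Std", "StandingDead_Std"),
                 variable.name = "Variable", value.name = "Std")

# Strip the suffixes so both pieces share the same Variable labels
avg_long$Variable <- sub("_Avg$", "", avg_long$Variable)
std_long$Variable <- sub("_Std$", "", std_long$Variable)

# One row per ID combination and variable, with Avg and Std side by side
coral_long <- merge(avg_long, std_long, by = c(id_vars, "Variable"))
print(coral_long)
```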
16
DataMassageAGRRA.R Join multiple data frames from a list
17
DataMassageAGRRA.R Join multiple data frames from a list
Include rows from all data frames; this results in NAs for ID variable combinations that lack data for some measurement variables
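A small sketch of joining a list of data frames with plyr::join_all(), keeping rows from all of them. The three data frames are hypothetical stand-ins for the coral, algae, and fish data.

```r
library(plyr)

# Hypothetical toy data: each data frame covers a different set of sites
coral <- data.frame(Country = c("Belize", "Belize"), Site = c("B1", "B2"),
                    CoralTotal = c(10.2, 8.5), stringsAsFactors = FALSE)
algae <- data.frame(Country = c("Belize", "Mexico"), Site = c("B1", "M1"),
                    Turf = c(30, 42), stringsAsFactors = FALSE)
fish  <- data.frame(Country = c("Belize", "Mexico"), Site = c("B2", "M1"),
                    FishTotal = c(3.4, 5.1), stringsAsFactors = FALSE)

# type = "full" keeps every ID combination found in any data frame;
# combinations missing from one data frame get NA in its columns
combined <- join_all(list(coral, algae, fish),
                     by = c("Country", "Site"), type = "full")
print(combined)
```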
18
Take a look at the data structure
“SelectAGRRA_long.csv” read into “Explore_ggplot2.R”
“DataMassageAGRRA.R” and original csv files for reference
19
Layering ggplot2 functions
20
Difference between qplot and ggplot functions
qplot = ggplot wrapper: less syntax for common tasks
ggplot will work in all cases
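A quick side-by-side sketch of the two interfaces on hypothetical toy data. Note that qplot() has been deprecated since ggplot2 3.4.0, so recent versions may print a deprecation warning for the first call.

```r
library(ggplot2)

# Hypothetical toy data
reef <- data.frame(Depth       = c(2, 5, 8, 12, 15),
                   FishBiomass = c(4.1, 5.3, 3.8, 2.9, 2.2))

# qplot: short syntax, picks a point geom when x and y are supplied
p_quick <- qplot(Depth, FishBiomass, data = reef)

# ggplot: the same scatterplot written out as data + aesthetics + geom
p_full <- ggplot(reef, aes(x = Depth, y = FishBiomass)) +
  geom_point()
```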
21
ggplot function
Layered grammar: data + geometric representation + aesthetics + layout
See file: ggplot2_explore.R
Hadley Wickham, Elegant Graphics for Data Analysis
22
Data and aesthetic mapping
Data frame Columns in data frame
23
Data and aesthetic mapping
Data frame
Columns in data frame
Can add data aesthetics to the initial aes function, or add them as another layer later to the base plot
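The two options for supplying aesthetics can be sketched as follows, using a hypothetical toy data frame (column names and values are illustrative, not from the AGRRA file).

```r
library(ggplot2)

# Hypothetical toy data
reef <- data.frame(
  Depth       = c(2, 5, 8, 12, 15, 18),
  FishBiomass = c(4.1, 5.3, 3.8, 2.9, 2.2, 1.9),
  Country     = rep(c("Belize", "Mexico"), each = 3)
)

# Option 1: every aesthetic in the initial aes() call,
# inherited by all layers added afterwards
p1 <- ggplot(reef, aes(x = Depth, y = FishBiomass, colour = Country)) +
  geom_point()

# Option 2: build a base plot, then add an aesthetic inside a later layer
base <- ggplot(reef, aes(x = Depth, y = FishBiomass))
p2 <- base + geom_point(aes(colour = Country))

# Option 3: add the aesthetic to the base plot itself before the geom
p3 <- base + aes(colour = Country) + geom_point()
```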
24
Geoms
Slide courtesy of Sean Anderson
29
Position adjustment
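A short sketch of three common position adjustments for bars, on hypothetical toy counts. The original slide's example is not preserved, so this only illustrates the general idea.

```r
library(ggplot2)

# Hypothetical toy data: transects per country and zone
counts <- data.frame(
  Country   = rep(c("Belize", "Mexico"), each = 2),
  Zone      = rep(c("Back reef", "Fore reef"), times = 2),
  Transects = c(6, 10, 4, 8)
)
base <- ggplot(counts, aes(x = Country, y = Transects, fill = Zone))

p_stack <- base + geom_col()                    # default: bars stacked
p_dodge <- base + geom_col(position = "dodge")  # bars side by side
p_fill  <- base + geom_col(position = "fill")   # stacked, rescaled to proportions
```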
31
Scales
Controls the mapping from data to aesthetic attributes
scale_xxx_yyy()
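In scale_xxx_yyy(), xxx is the aesthetic and yyy is the type of scale. A brief sketch with hypothetical toy data and arbitrary colour choices:

```r
library(ggplot2)

# Hypothetical toy data
reef <- data.frame(
  Depth       = c(2, 5, 8, 12, 15, 18),
  FishBiomass = c(4.1, 5.3, 3.8, 2.9, 2.2, 1.9),
  Country     = rep(c("Belize", "Mexico"), each = 3)
)

p <- ggplot(reef, aes(x = Depth, y = FishBiomass, colour = Country)) +
  geom_point() +
  # x aesthetic, continuous scale: set the axis title and limits
  scale_x_continuous(name = "Depth", limits = c(0, 20)) +
  # y aesthetic, log10 scale
  scale_y_log10(name = "Fish biomass") +
  # colour aesthetic, manual scale: choose a colour per country
  scale_colour_manual(values = c(Belize = "steelblue", Mexico = "darkorange"))
```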
34
Faceting Investigating whether patterns hold across all conditions
Discrete variables in data frame
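A minimal faceting sketch on hypothetical toy data, splitting the same plot by one or two discrete variables:

```r
library(ggplot2)

# Hypothetical toy data: every Country x Zone x Depth combination
reef <- expand.grid(Country = c("Belize", "Mexico"),
                    Zone    = c("Back reef", "Fore reef"),
                    Depth   = c(3, 9, 15))
reef$FishBiomass <- 6 - 0.2 * reef$Depth + as.integer(reef$Country)

base <- ggplot(reef, aes(x = Depth, y = FishBiomass)) + geom_point()

# One panel per country
p_wrap <- base + facet_wrap(~ Country)

# Rows by zone, columns by country
p_grid <- base + facet_grid(Zone ~ Country)
```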
38
Subsetting data
So far, all examples used data that was available for every ID variable:
  Country, Depth, Zone, Number of Transects (all data points)
  Number of Corals, Total Fish Biomass (not all data points, but for all of the above categorical variables)
Need to subset the data to control which data to plot when you are only interested in 1 factor from a list of possible factors
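A sketch of subsetting long-form data to a single factor before plotting. The data frame and its Variable and Avg columns are hypothetical, chosen to resemble a melted table.

```r
library(ggplot2)

# Hypothetical long-form data: one row per country and algae variable
long <- data.frame(
  Country  = rep(c("Belize", "Mexico"), times = 3),
  Variable = rep(c("Crustose", "Turf", "Macro"), each = 2),
  Avg      = c(12, 9, 30, 42, 25, 18)
)

# Keep only the rows for the one factor of interest
turf <- subset(long, Variable == "Turf")

p <- ggplot(turf, aes(x = Country, y = Avg)) +
  geom_col()

# The subset can also be written inline in the ggplot() call
p_inline <- ggplot(long[long$Variable == "Turf", ], aes(x = Country, y = Avg)) +
  geom_col()
```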
41
Specifying data in geom
OR
This example just uses a different subset of the same dataset
Useful when working with multiple datasets, or when a layer should refer to any other object
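The original code images are not preserved; one way the two alternatives might look, on hypothetical toy data, is:

```r
library(ggplot2)

# Hypothetical long-form data
long <- data.frame(
  Country  = rep(c("Belize", "Mexico"), times = 3),
  Variable = rep(c("Crustose", "Turf", "Macro"), each = 2),
  Avg      = c(12, 9, 30, 42, 25, 18)
)
turf  <- subset(long, Variable == "Turf")
macro <- subset(long, Variable == "Macro")

# Data given to ggplot(): every layer inherits it
p1 <- ggplot(turf, aes(x = Country, y = Avg)) + geom_point()

# OR: data (and aesthetics) given to the geom itself
p2 <- ggplot() + geom_point(data = turf, aes(x = Country, y = Avg))

# A layer can override the plot data with a different subset or dataset
p3 <- p1 + geom_point(data = macro, colour = "darkgreen", shape = 17)
```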
43
Adding lines and rectangles to plots
See also: geom_abline(), geom_vline(), geom_hline(), geom_rect()
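A sketch combining the four geoms listed above on hypothetical toy data. The positions of the lines and the rectangle are arbitrary.

```r
library(ggplot2)

# Hypothetical toy data
reef <- data.frame(Depth       = c(2, 5, 8, 12, 15, 18),
                   FishBiomass = c(4.1, 5.3, 3.8, 2.9, 2.2, 1.9))

# One-row data frame describing the shaded band (arbitrary depth range)
band <- data.frame(xmin = 5, xmax = 10, ymin = -Inf, ymax = Inf)

p <- ggplot(reef, aes(x = Depth, y = FishBiomass)) +
  # Shaded rectangle drawn first so the points sit on top of it
  geom_rect(data = band,
            aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
            fill = "grey80", alpha = 0.5, inherit.aes = FALSE) +
  geom_point() +
  # Horizontal line at the mean biomass
  geom_hline(yintercept = mean(reef$FishBiomass), linetype = "dashed") +
  # Vertical line at an arbitrary depth
  geom_vline(xintercept = 10, colour = "red") +
  # Straight line given by intercept and slope
  geom_abline(intercept = 5, slope = -0.2, colour = "blue")
```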
44
Statistics
stat_smooth
< 1000 points: default “loess”
> 1000 points: default “gam”
This would take a lot more time to create in base R!
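A minimal stat_smooth sketch on simulated data (the numbers are made up for illustration). The method is stated explicitly here rather than left to the default choice.

```r
library(ggplot2)

# Simulated toy data: biomass declining with depth, plus noise
set.seed(1)
reef <- data.frame(Depth = runif(50, min = 1, max = 20))
reef$FishBiomass <- 6 - 0.2 * reef$Depth + rnorm(50, sd = 0.5)

base <- ggplot(reef, aes(x = Depth, y = FishBiomass)) + geom_point()

# Loess smoother with a confidence band
p_loess <- base + stat_smooth(method = "loess", formula = y ~ x, se = TRUE)

# Straight-line fit instead of a smoother
p_lm <- base + stat_smooth(method = "lm", formula = y ~ x)
```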
48
Publication quality figures
Adjusting the theme
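A sketch of adjusting the theme, first with individual theme() settings and then wrapped in a reusable function. The name theme_pub and every setting in it are illustrative assumptions, not the custom theme from the course code.

```r
library(ggplot2)

# Hypothetical toy data
reef <- data.frame(
  Depth       = c(2, 5, 8, 12, 15, 18),
  FishBiomass = c(4.1, 5.3, 3.8, 2.9, 2.2, 1.9),
  Country     = rep(c("Belize", "Mexico"), each = 3)
)
p <- ggplot(reef, aes(x = Depth, y = FishBiomass, colour = Country)) +
  geom_point()

# Start from a built-in theme and override individual elements
p_bw <- p + theme_bw() +
  theme(panel.grid.minor = element_blank(),
        axis.title       = element_text(size = 14),
        legend.position  = "bottom")

# Hypothetical reusable theme: a function returning theme settings
theme_pub <- function(base_size = 12) {
  theme_bw(base_size = base_size) +
    theme(panel.grid   = element_blank(),
          legend.title = element_blank())
}
p_pub <- p + theme_pub(base_size = 14)
```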
50
See example code for creating custom theme in “ggplot2_explore.R”
Courtesy of:
52
ggthemes package
53
Exercises in code ending with… Recreating this plot!
54
Coordinate System
Maps position of objects onto plane of the plot
(x,y) coordinates: potential for more dimensions but not yet capable
Cartesian
Semi-log
Polar
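The three coordinate options listed above can be sketched as follows on hypothetical toy data. The semi-log version here uses a log10 y scale, which is one of several ways to get a semi-log plot.

```r
library(ggplot2)

# Hypothetical toy data
reef <- data.frame(
  Depth       = c(2, 5, 8, 12, 15, 18),
  FishBiomass = c(4.1, 5.3, 3.8, 2.9, 2.2, 1.9),
  Country     = rep(c("Belize", "Mexico"), each = 3)
)
scatter <- ggplot(reef, aes(x = Depth, y = FishBiomass)) + geom_point()

# Cartesian: the default; xlim here zooms without dropping data
p_cart <- scatter + coord_cartesian(xlim = c(0, 10))

# Semi-log: linear x axis, log10 y axis
p_semilog <- scatter + scale_y_log10()

# Polar: bars become wedges
p_polar <- ggplot(reef, aes(x = Country, y = FishBiomass, fill = Country)) +
  geom_col() +
  coord_polar()
```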