R for Epi Workshop Module 3: Data visualization with ggplot


R for Epi Workshop Module 3: Data visualization with ggplot Mike Dolan Fliss, MSW, MPS PhD Candidate in Epidemiology UNC Gillings School of Global Public Health

Module Outline The grammar of graphics ggplot syntax Data-aes-geoms Facets & Themes Preparing data to Plot Scales Colors Labels & legends Stats ggplot extension packages Bit of theory to start, then we’ll dig back into our births data. Broadly divided into required and not-required parts of ggplot.

Same MO as before: Code along! Recommend typing it out Tab-complete is your speed and typo friend Copy and paste always available Mostly follow-along, with a little bit of free-work at the end of the module. A little bit of live coding. Minimum to start: library(tidyverse) setwd("D:/<YOUR DIR HERE>/R Workshop/") births_sm = readRDS("births_final.rds") Note about readRDS and save/load.

To start: New script, load data, headers
#........................................
# IPH Workshop: R for Epi
# Examples for modules 3 and 4
# Mike Dolan Fliss
library(tidyverse)
setwd("D:/<YOUR FILE PATH HERE!>/R Workshop/")
# ^ Above is Mike’s – make your own!
births_sm = readRDS("births_final.rds")
# Module 3 ####
# _ Slide: ggplot components ####

To start: New script, load data, headers

1. Grammar of Graphics Underlying theory

Grammar of Graphics A consistent, theoretical framework for describing data visualizations. Based on the book The Grammar of Graphics (1999) by Leland Wilkinson. Helps think about and plan graphics outside of R… but implemented deeply in R’s ggplot2 package. May also be familiar if you’ve worked in Tableau – Wilkinson now works for Tableau.

Grammar of graphics components https://medium.com/tdebeus/think-about-the-grammar-of-graphics-when-improving-your-graphs-18e3744d8d18

Grammar of graphics components data aesthetic mapping geometric object statistical transformations scales coordinate system position adjustments faceting

Anatomy of a ggplot data aesthetic mapping geometric object statistical transformations scales coordinate system position adjustments faceting

2. ggplot2 syntax An implementation of gg!

ggplot components (are the same!) data aesthetic mapping geometric object statistical transformations scales coordinate system position adjustments faceting

ggplot components

ggplot components Or minimally, ggplot(data=<DATA>)+ <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>)) e.g., using mpg dataset ggplot(mpg)+ geom_point(aes(displ, hwy, color=class)) * Read the “+” here as “and” … And a note on overloaded operators!
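
A minimal sketch of the template above, using the built-in mpg data so it runs without the births file. The second plot (p2), with its smoother and facet, is an illustrative addition and not part of the slide; it only shows that each "+" adds one more component to the same plot object.

```r
library(ggplot2)

# Data + aesthetic mapping, then one geometry layer added with "+"
p <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

# Further components are more "+" steps on the same object.
# The mapping is repeated here because p set it inside geom_point() only.
p2 <- p +
  geom_smooth(aes(x = displ, y = hwy), method = "lm", se = FALSE) +
  facet_wrap(~ drv)

length(p$layers)   # number of layers in the first plot
length(p2$layers)  # number of layers after adding geom_smooth()

print(p2)
```

Reading "+" as "and" matches how the object grows: data and points, and a smoother, and facets.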

Versus base / qplot shorthand hist(mpg$cyl) ggplot(mpg)+geom_histogram(aes(x=cyl)) ggplot(mpg, aes(cyl))+geom_histogram() qplot(mpg$cyl, geom="histogram") Simple vs. Easy

3. Data, aes, geoms The flexible power of aesthetic mappings

data ggplot likes “long”, well structured data.frames ggplot “stats” can make quick transformations dplyr will help with complicated transformations tidyr will help go from wide to long Extensions allow ggplot to understand other kinds of data (e.g. maps, network data) Stats – don’t do this.
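
A small sketch of the wide-to-long step mentioned above. The table `wide` and its numbers are made up for illustration (they are not from the births data), and pivot_longer() is used here as the current tidyr spelling of the gather() call that appears later in this module.

```r
library(tidyr)
library(ggplot2)

# Hypothetical wide table: one row per county, one column per measure
wide <- data.frame(
  county  = c("A", "B", "C"),
  preterm = c(10.2, 12.5, 9.8),
  pnc5    = c(78.0, 81.3, 74.6)
)

# Long form: one row per county-measure pair, which ggplot can map directly
long <- pivot_longer(wide, cols = c(preterm, pnc5),
                     names_to = "measure", values_to = "percent")

ggplot(long, aes(county, percent, fill = measure)) +
  geom_col(position = "dodge")
```

Once the measure name lives in a column, it can be mapped to fill, color, or a facet like any other variable.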

geoms …and many more in sister packages. Like other key packages, a cheat sheet is built into R.

Aesthetic Mappings
Data: numbers & factors (characters coerced), e.g. meduc, mage, cigdur, wksgest, preterm_f, pnc5_f, county_name, raceeth_f …and more
Aesthetics: x, y, color (name, rgb), fill, size, linetype (int or name), alpha, height, width, shape, angle …and more
Geometry: x and y are the minimum for geom_point(); the rest are defaulted.
https://cran.r-project.org/web/packages/ggplot2/vignettes/ggplot2-specs.html

Cheatsheets Cheatsheet: https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf ... Or right in RStudio! (One more benefit of using the current RStudio IDE version)

Sampling for prototyping speed Graphs can take a few moments, and for prototyping we want speed. Let’s use some dplyr to generate a sample to play with, and transfer to a dplyr tibble for the dataset viewing benefits.
set.seed(1) # for replicability
table(complete.cases(births_sm)) # test how it works
births_10k = births_sm %>%
  filter(complete.cases(.)) %>% # dot syntax
  as_tibble() %>%
  sample_n(10000)
births_10k %>% head(2) # helpful when doing EDA

Basic geometries and mappings Let’s create a scatterplot of wksgest and mage using ggplot and geom_point. ggplot(births_10k, aes(mage, wksgest))+geom_point()

Basic geometries and mappings D’oh! Overplotting! Use the geom_jitter() geometry instead. ggplot(births_10k, aes(mage, wksgest))+geom_jitter()

Basic geometries and mappings Let’s try colors. Map cigdur to color. That’s it! ggplot(births_10k, aes(mage, wksgest, color=cigdur))+ geom_jitter()

Basic geometries and mappings Fancier: change color to preterm_f, change the point character to a “.” and alpha to 0.5. Note global aes! ggplot(births_10k, aes(mage, wksgest, color=preterm_f))+ geom_jitter(pch=".", alpha=0.5) # ^ Typical chained spacing in last two examples, like dplyr

Aesthetic inheritance Subsequent geometric layers will inherit the aesthetic mappings from the original ggplot() call unless they’re overwritten. Meaning these are equivalent: ggplot(births_10k, aes(mage, wksgest))+ geom_jitter() ggplot(births_10k)+ geom_jitter(aes(mage, wksgest)) #^ equivalent

Aesthetic inheritance 2 …but there’s good reason to be intentional about this – like when we want multiple geometries to use the same mappings ggplot(births_10k, aes(mage, wksgest, color=cigdur))+ geom_jitter(alpha=0.1) + geom_smooth(method="lm") We’d like geom_jitter and geom_smooth to BOTH inherit mappings. They can be overridden in a specific geometry.
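
A sketch of overriding an inherited mapping in one geometry, using the built-in mpg data rather than the births sample (the variable choices are illustrative). The color mapping is set globally, so the points inherit it; the smoother drops it with aes(color = NULL) and therefore fits a single line across all groups.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
  # Inherits x, y and color from ggplot()
  geom_point(alpha = 0.3) +
  # Overrides the inherited color mapping in this layer only
  geom_smooth(aes(color = NULL), method = "lm", se = FALSE)

# Inspect what each layer ended up with
point_colours <- unique(layer_data(p, 1)$colour)  # one colour per drv level
smooth_groups <- unique(layer_data(p, 2)$group)   # a single group for the smoother

print(p)
```

Removing the aes(color = NULL) override gives one fitted line per drv level instead, which is the inherited behavior the slide describes.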

Connecting dplyr to ggplot While we’re at it, we may want to send dplyr output (without saving) directly into ggplot. Example:
ggplot(births_10k, aes(mage, wksgest, color=cigdur))+
  geom_jitter(alpha=0.1) +
  geom_smooth() # smooth is unhappy. Why?
births_10k %>%
  filter(cigdur %in% c("Y", "N")) %>% # Note new logical operator, %in%!
  ggplot(aes(mage, wksgest, color=cigdur))+
  geom_smooth()
# Sends the dplyr'ed tibble / data.frame into
# the first param of ggplot... the data param!

Common Visual EDA* workflow: Pick some variables: mage, wksgest Consider a geometry: e.g. geom_bin2d Check the help file (F1, or type it in) Peruse, maybe run the example code (can run right within help window) Write our own! * EDA: Exploratory data analysis

Common Visual EDA workflow: Pick some variables: mage, wksgest Consider a geometry: e.g. geom_bin2d Check the help file (F1, or type it in) Peruse, maybe run the example code (can run right within help window). Write our own!

Common Visual EDA workflow: Pick some variables: e.g. mage, wksgest Consider a geometry: e.g. geom_bin2d Check the help file (F1, or type it in) Peruse, maybe run the example code (can run right within help window). Write our own! ggplot(births_10k, aes(mage, wksgest))+ geom_bin2d() Note: Some blanks seem to be a function of our unassigned (so, default) bin width. We could do something like bins = max(births_10k$wksgest)-min(births_10k$wksgest) to fix this!
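
One way to act on the bin-width note, sketched on simulated data (the data frame `d` is a stand-in with integer-valued columns, not the births data). Setting binwidth = c(1, 1) gives one bin per whole year of age and whole week of gestation, so integer data cannot straddle bin edges and leave striped gaps. This is an alternative to computing `bins` from the range as the slide suggests.

```r
library(ggplot2)

set.seed(1)
# Simulated stand-in for integer-valued mage / wksgest
d <- data.frame(
  mage    = sample(15:45, 2000, replace = TRUE),
  wksgest = sample(25:42, 2000, replace = TRUE)
)

# One bin per integer value on each axis
p <- ggplot(d, aes(mage, wksgest)) +
  geom_bin2d(binwidth = c(1, 1))

# Every observation should land in exactly one bin
binned <- layer_data(p)
sum(binned$count)

print(p)
```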

4. Facets & Themes Small multiples and overall look and feel

Facets (wrap/grid): small multiples Facets take an R formula object (e.g. . ~ x, y ~ x) and split your graph into small multiples based on that. Can also “free the scales” so they aren’t shared across plots with scales = "free_x" and/or "free_y" ggplot(births_10k, aes(mage, wksgest))+ geom_jitter(aes(color=preterm_f), pch=".", alpha=0.5)+ geom_smooth()+ facet_wrap( ~ raceeth_f)

Facets (grid/wrap): small multiples Facets take an R formula object (e.g. . ~ x, y ~ x) and split your graph into small multiples based on that. Can also “free the scales” so they aren’t shared across plots with scales = "free_x" and/or "free_y" ggplot(births_10k, aes(mage, wksgest))+ geom_jitter(aes(color=preterm_f), pch=".", alpha=0.5)+ geom_smooth()+ facet_grid( ~ raceeth_f)

Facets (grid/wrap): small multiples Facets take an R formula object (e.g. . ~ x, y ~ x) and split your graph into small multiples based on that. Can also “free the scales” so they aren’t shared across plots with scales = "free_x" and/or "free_y" ggplot(births_10k, aes(mage, wksgest))+ geom_jitter(aes(color=preterm_f), pch=".", alpha=0.5)+ geom_smooth()+ facet_grid(methnic ~ mrace)
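
The facet slides mention freeing the scales but do not show it, so here is a short sketch on the built-in mpg data (the faceting variables drv and year are illustrative choices).

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(alpha = 0.5)

# facet_wrap: one panel per drv, each with its own y axis range
p_wrap <- base + facet_wrap(~ drv, scales = "free_y")

# facet_grid: rows by year, columns by drv, with x freed per column
p_grid <- base + facet_grid(year ~ drv, scales = "free_x")

print(p_wrap)
print(p_grid)
```

Free scales make each panel easier to read on its own but harder to compare across panels, so shared scales remain the safer default for comparisons.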

Themes Change the theme with theme_NAME(), e.g. theme_minimal(). Can define your own themes, or tweak existing ones. See https://cran.r-project.org/web/packages/ggthemes/vignettes/ggthemes.html for more themes. More on ggplot extensions later! ggplot(births_10k, aes(mage, wksgest))+ geom_jitter(aes(color=preterm_f), pch=".", alpha=0.5)+ geom_smooth()+ facet_grid( ~ raceeth_f)+ theme_minimal()
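
A sketch of "tweaking an existing theme": start from a complete theme and override individual elements with theme(). The specific elements changed here (legend position, bold title) are illustrative choices, shown on the built-in mpg data.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
  geom_point() +
  labs(title = "Engine size vs. highway mileage") +
  theme_minimal() +                       # complete theme first
  theme(legend.position = "bottom",       # then targeted overrides
        plot.title = element_text(face = "bold"))

print(p)
```

Order matters: a complete theme such as theme_minimal() added after the theme() call would reset the overrides.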

5. Preparing data to plot Thinking ahead

Back to data Sometimes we need to transform our data to get at the question we have in mind. Saw with cigdur earlier. e.g. What is the association of preterm birth and maternal age?* * Again, we’re not approaching this from a stricter causal inference epidemiology frame in this workshop… Before, we did a minor transform with a filter for cigdur

Preparing to plot: preterm/mage Might first try this as a plot of the relationship of preterm birth and maternal age using: ggplot(births_sm)+ geom_histogram(aes(x=mage, fill=preterm_f), position="fill") Maybe suggestive of some meaning… but YUCK!

Preparing to plot: preterm/mage Let’s try again, but first create a new dataset to plot with. Use dplyr, group the original dataset by (mage), and summarize it into two variables: pct_preterm and n. Then create a plot of the relationship of preterm birth and maternal age. magepreterm_df = births_sm %>% group_by(mage) %>% summarise(pct_preterm = mean(preterm_f == "Preterm", na.rm=T)*100, n=n()) ggplot(magepreterm_df, aes(mage, pct_preterm))+ geom_point(aes(size = n))+ geom_smooth(aes(weight=n)) # ^ Note the weight & size aesthetics!

Preparing to plot: preterm/mage Could do this all in one pass, if we don’t need to keep working with the temporary dataset. Readable! births_sm %>% group_by(mage) %>% summarise(pct_preterm = mean(preterm_f == "Preterm", na.rm=T)*100, n=n()) %>% ggplot(aes(mage, pct_preterm))+ geom_point(aes(size = n))+ geom_smooth(aes(weight=n))

Other plots to explore Starting to speak to some of the reality of the data. How might we interpret these? What do you see?

Plots & Interpretation ggplot(births_10k, aes(wksgest))+ geom_histogram(bins=25) * Note we don’t (yet) know how to clean up axes, labels, legends, etc. Just aiming for the functional look of the plot this time (see upper right.) But this is typical of iterative development – aim for visual communication, then hone details.

Plots & Interpretation births_10k %>% filter(cigdur %in% c("N", "Y")) %>% ggplot(aes(wksgest, fill=cigdur))+ geom_density(adjust=2, alpha=.5)+theme_minimal()

Plots & Interpretation births_10k %>% filter(cigdur %in% c("N", "Y")) %>% ggplot(aes(cigdur, wksgest, color=cigdur, fill=cigdur))+ geom_boxplot(alpha=.5)+ facet_wrap(~pnc5_f)+ theme_minimal()

Plots & Interpretation county_stats = births_sm %>% group_by(county_name, cores) %>% summarise(pct_preterm = mean(preterm, na.rm=T)*100, pct_earlyPNC = mean(pnc5, na.rm=T)*100, n=n()) ggplot(county_stats, aes(pct_earlyPNC, pct_preterm, size=n))+ geom_point()+ geom_smooth()

Sidenote (other tools later)

Plots & Interpretation births_sm %>% group_by(raceeth_f) %>% summarise(preterm = mean(preterm, na.rm=T), pnc5 = mean(pnc5, na.rm=T)) %>% gather(measure, percent, -raceeth_f) %>% mutate(percent = percent * 100) %>% ggplot(aes(measure, percent, fill=raceeth_f, group=raceeth_f, label=round(percent, 1))) + geom_bar(stat="identity", position="dodge")+ geom_text(position=position_dodge(.9), aes(y=percent+3), hjust=0.5)

Plots & Interpretation births_10k %>% filter(cigdur %in% c("N", "Y")) %>% ggplot(aes(wksgest, mage, color=cigdur))+ geom_density_2d(alpha=.5)+theme_minimal()

Worth a skim: http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html http://chartmaker.visualisingdata.com/

Pause for a brief breather and questions! Next: Colors, themes, ggplot extension packages

Module Outline The grammar of graphics ggplot syntax Data-aes-geoms Facets & Themes Preparing data to Plot Scales: Limits, colors, positions Labels & legends Stats ggplot extension packages

6. Scales Limits, colors, positions

ggplot components

Scales: axes Some form of: scale_x_continuous(limits=c(a,b), breaks=a:b) …is pretty common A note on clipping: Most common

Scales: axes births_sm %>% group_by(mage) %>% summarise(pct_preterm = mean(preterm_f == "Preterm", na.rm=T)*100, n=n()) %>% ggplot(aes(mage, pct_preterm))+ geom_point(aes(size = n))+ geom_smooth(aes(weight=n))+ scale_y_continuous(limits=c(0,30)) # coord_cartesian(ylim = c(0,30)) #change the "window"
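
The difference between the two options in the code above can be checked directly. This is a sketch on the built-in mpg data: limits set on a scale turn out-of-range values into missing values before anything is drawn or fitted, while coord_cartesian() keeps all the data and only changes the visible window.

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ, hwy)) + geom_point()

p_scale <- base + scale_y_continuous(limits = c(0, 30))  # drops hwy > 30
p_coord <- base + coord_cartesian(ylim = c(0, 30))       # zooms only

# Points that still have a usable y value in each version
kept_scale <- sum(!is.na(layer_data(p_scale)$y))
kept_coord <- sum(!is.na(layer_data(p_coord)$y))

kept_scale
kept_coord
```

This matters most with geom_smooth() or other stats: with scale limits the fit is computed on the reduced data, with coord_cartesian() it is computed on all of it.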

Scales : color Typically using brewer or manual Will focus on discrete scales, but continuous works very similarly (and is typically easier)

Scales: Colors
births_10k %>%
  ggplot(aes(mage, wksgest, color=preterm_f))+
  geom_jitter()+
  scale_color_discrete() # default
# Alternatives to swap in for the last line, one at a time:
#   scale_color_manual(values = c("Blue", "Red"))
#   scale_color_manual(values = c("Preterm" = "Red", "Term" = "Blue"))
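
A runnable sketch of the named form of scale_color_manual(), on the built-in mpg data (the colors and the drv variable are illustrative). Naming the values ties each color to a factor level explicitly, so the assignment does not depend on level order.

```r
library(ggplot2)

drv_colours <- c("4" = "steelblue", "f" = "firebrick", "r" = "darkgreen")

p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
  geom_point() +
  scale_color_manual(values = drv_colours)

# Colours actually assigned to the points
used <- unique(layer_data(p)$colour)
used

print(p)
```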

Scales: color brewer http://colorbrewer2.org RColorBrewer::display.brewer.all()

Scales: viridis Use the color scales in this package to make plots that are pretty, better represent your data, easier to read by those with colorblindness, and print well in grey scale. Compare, for instance, with rainbow / spectral perceptual “jumps.”

Scales: Colors
births_10k %>%
  ggplot(aes(mage, wksgest, color=preterm_f))+
  geom_jitter()+
  scale_color_brewer(palette = "Paired")
# Alternative to swap in for the last line:
#   scale_color_viridis_d(begin=0, end=.4) # (0->1)

positions

positions ggplot(mpg, aes(fl, fill=drv))+ geom_bar(position="<POSITION>") Where <POSITION> is one of: dodge, fill, jitter, nudge, stack. Can be tweaked further with the function equivalents, e.g. position = position_dodge(width=1). Or a few more detailed versions: …+geom_label(aes(label=mylabs), nudge_y=1)
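
A sketch comparing three of the position adjustments on the slide's own mpg example, with the string shorthand and the function form side by side.

```r
library(ggplot2)

base <- ggplot(mpg, aes(fl, fill = drv))

p_stack <- base + geom_bar(position = "stack")                      # counts stacked
p_dodge <- base + geom_bar(position = position_dodge(width = 0.9))  # side by side
p_fill  <- base + geom_bar(position = "fill")                       # stacked to 100%

# With "fill", each stack is rescaled so its top sits at 1
top_of_fill <- max(layer_data(p_fill)$ymax)
top_of_fill

print(p_dodge)
```

"fill" answers "what share?", "stack" answers "how many in total?", and "dodge" makes the groups directly comparable within each x value.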

coordinate systems Rarely used, except for coord_flip() Other solutions in helper packages. Check ggstance in particular. (Also used for radial geometries, maps, etc.)

coordinate systems
# Horizontal with coord_flip()
ggplot(mpg, aes(class, hwy, fill = factor(cyl))) +
  geom_boxplot() +
  coord_flip()
# In ggstance, you supply aesthetics in their natural order:
# Horizontal with ggstance (needs library(ggstance))
ggplot(mpg, aes(hwy, class, fill = factor(cyl))) +
  geom_boxploth()

7. Labels & Legends

Labels labs() covers almost every label: x y title subtitle caption And can also do legend titles… …or do in the scale_*() function Use g+annotate() to just stick something (anything!) right where you want it.

Labels

Labels
ggplot(county_stats, aes(pct_earlyPNC, pct_preterm, size=n, color=county_name, label=county_name))+
  geom_point()+
  geom_text(nudge_y = .5, size=3)+
  # ^ behind the scenes, this is a position adjustment!
  geom_smooth(aes(weight=n), color="black", show.legend = F, se=F)+
  scale_color_discrete(guide="none")+
  # ^ turn color guide off another way
  labs(x="% early prenatal care", y="% preterm",
       title="% preterm vs. % early prenatal care by county",
       subtitle="Roughly inverse relationship, but very rough!")+
  annotate(geom="rect", xmin=75, xmax=80, ymin=0, ymax=10, alpha=.2, fill="black")+
  annotate(geom="text", x=77, y=5, angle=90, label="nothing here!", vjust=0.5)

8. Stats More advanced! Put on your thinking caps / this is just an introduction.

Layer=data+stats+geoms Hadley’s (package author) underlying theory from grammar of graphics: layer=data+stats+geoms. Often stat or geom imply the other, so each has a default parameter of the other. http://vita.had.co.nz/papers/layered-grammar.pdf Excellent Stack Overflow Review https://stackoverflow.com/questions/38775661/what-is-the-difference-between-geoms-and-stats-in-ggplot2 Default stats for geoms: http://sape.inf.usi.ch/quick-reference/ggplot2/geom

Let’s Try http://ggplot2.tidyverse.org/reference/

Stat variables geoms, behind the scenes, often calculate a new dataframe to actually plot on screen. stat_<thing> calculates a new dataframe explicitly That dataframe has some “secret” (documented) variable names, accessible by special inline variables ..count.. ..ncount.. ..density.. ..ndensity.. ..count.. ..prop.. Etc. What if I want to refer to those new variables elsewhere in the ggplot call? (rare use – dplyr instead!)

Example g = ggplot(births_10k, aes(cigdur, fill=preterm_f))+ geom_bar(position="dodge") ggplot_build(g)$data

Example ggplot(births_10k, aes(cigdur, fill=preterm_f))+ geom_bar(position="dodge", aes(y=..count../sum(..count..))) ggplot(births_10k, aes(cigdur, fill=preterm_f, group=preterm_f))+ geom_bar(position="dodge", aes(y=..prop..)) # ^ less than ideal. Really best to do yourself… Requires careful group-aesthetic assignment.
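
A sketch of the same idea on the built-in mpg data. It uses after_stat(), which recent ggplot2 versions (3.3.0 and later) provide as the replacement spelling for the ..count.. notation; the drv variable is an illustrative choice.

```r
library(ggplot2)

# y is computed from the stat's own output: each bar's share of all rows
p <- ggplot(mpg, aes(drv)) +
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  labs(y = "proportion of cars")

# The data frame the stat computed behind the scenes
computed <- layer_data(p)
computed[, c("x", "count", "y")]

print(p)
```

Here sum(count) runs over every bar in the layer, so the bar heights add up to 1. As the slide says, for anything beyond this it is usually clearer to compute the proportions with dplyr first and plot them with stat = "identity".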

9. ggplot extensions Extending the language

Overview Extensions to ggplot2 add geometries, statistics, themes… all within the language you now know. Dedicated website for them: http://www.ggplot2-exts.org/ And many in development elsewhere. Let’s look at some greatest hits

So,so many themes https://github.com/cttobin/ggthemr ggthemes, etc. library(ggthemes) ggplot(county_stats, aes(pct_preterm, pct_earlyPNC, size=n, label=cores))+ geom_jitter()+ theme_tufte()

ggsci https://ggsci.net/ : ggsci offers a collection of ggplot2 color palettes inspired by scientific journals, data visualization libraries, science fiction movies, and TV shows

plotly https://plot.ly/ Make your plots mouse-interactive. Great for exploring data. Minimal use is: ggplotly() …which makes your last ggplot interactive. Can also send plots to your account online, integrate with shiny interactive websites (renderPlotly and plotlyOutput), etc. Will demo more in Module 4.

ggrepel Already have geom_text() and geom_label (label draws a text box, text is just text) Now have geom_text_repel() and geom_label_repel() to use a “force” to spread out data and labels. library(ggrepel) ggplot(county_stats %>% filter(n>1000), aes(pct_preterm, pct_earlyPNC, size=n, label=county_name))+ geom_jitter(alpha=0.3)+ geom_text_repel(force=0.01, show.legend = F)+ theme_minimal()

ggmosaic library(ggmosaic) ggplot(births_10k)+ geom_mosaic(aes(x=product(preterm_f, raceeth_f), fill=preterm_f))

GGally ggpairs

GGally high level EDA tools. ggcorr()

ggdag For the causal epidemiologists in the room! Works alongside dagitty tools and website. Can download a saved dag from the web as an object, plot and adjust it for publication.

Special mention: broom Broom… tidies data! Nice to get ready for ggplot. Currently broom provides tidying methods for many S3 objects from the built-in stats package, including lm, glm, htest, anova, nls, kmeans, manova, TukeyHSD, arima. It also provides methods for S3 objects in popular third-party packages, including lme4, glmnet, boot, gam, survival, lfe, zoo, multcomp, sp, maps. Will demo just a little in Module 4.
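
A sketch of the broom-to-ggplot handoff, assuming the broom package is installed. The model (hwy on displ and cyl in the built-in mpg data) is purely illustrative. tidy() returns one row per model term, which is already the long shape ggplot likes.

```r
library(broom)
library(ggplot2)

# Illustrative model on built-in data
fit <- lm(hwy ~ displ + factor(cyl), data = mpg)

# One row per term, with estimates and confidence limits as columns
coefs <- tidy(fit, conf.int = TRUE)

# Coefficient plot: point estimate with its confidence interval
ggplot(coefs, aes(estimate, term)) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  geom_linerange(aes(xmin = conf.low, xmax = conf.high)) +
  geom_point()
```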

Survminer (also GGally::ggsurv()) s <- survival::survfit(survival::Surv(wksgest, preterm) ~ pnc5_f, data=births_sm, type="kaplan-meier") ggplot2::autoplot(s)+ labs(title="Rough survival plot, preterm birth as outcome") # autoplot: have ggplot guess what to do by the data type (for a survfit object this needs a package that supplies the method, e.g. ggfortify)

Thinking big: packages off CRAN For example, treemapify https://github.com/wilkox/treemapify Install dev versions of things with devtools, and install_github, etc.

Thinking big: Sankey / river / alluvial diagrams Good example of how many ways to cut it: https://stackoverflow.com/questions/9968433/sankey-diagrams-in-r

10. Putting it together

births_sm %>%
  group_by(mage) %>%
  summarise(pct_preterm = mean(preterm_f == "Preterm", na.rm=T)*100, n=n()) %>%
  ggplot(aes(mage, pct_preterm))+
  geom_point(aes(size = n), alpha=0.5)+
  geom_smooth(aes(weight=n), se=F, lty="dashed", color="red")+
  scale_x_continuous(limits = c(0,NA))+
  scale_y_continuous(limits = c(0,30))+
  labs(x="Maternal age", y="% Preterm", size="# of births",
       title="Relationship between maternal age and % preterm birth",
       subtitle=paste0("Data from NC SCHS, including ", prettyNum(nrow(births_sm), big.mark = ","), " births."),
       caption = "NOTES: The relationship of maternal age and preterm birth \n is roughly quadratic, as modeled with a LOESS curve.")+
  theme_minimal()
ggsave("maternal_age_and_preterm.png", width=6, height=4)

Saving a ggplot The usual GUI / RStudio way

Saving a ggplot Programmatically - super easy. Save my last ggplot: ggsave("filename.png") ggsave("filename.pdf", width=10, height=8) Or can specifically save the ggplot to an object, then send to ggsave() by object name. my_ggplot = ggplot()… etc. ggsave("my_new_ggplot.png", plot = my_ggplot)
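
A complete sketch of saving a named plot object. The plot itself and the use of tempdir() are illustrative (swap in your own plot and folder). Note the argument order in ggsave(): the filename comes first and the plot is passed by name.

```r
library(ggplot2)

my_ggplot <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()

# Write to a temporary folder so the example leaves no files behind
out_file <- file.path(tempdir(), "my_new_ggplot.png")

# width and height are in inches by default
ggsave(out_file, plot = my_ggplot, width = 6, height = 4, dpi = 150)

file.exists(out_file)
```

Passing plot = explicitly avoids relying on "the last plot drawn", which is easy to get wrong in a long script.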

END of MODULE 3!