Published by Logan Parks. Modified over 9 years ago.
1
Introduction to Graphics in R 3/12/2014
2
First, let’s get some data
Load the Duncan dataset. It’s in the car package. Remember how to get it?
– library(car)
– data(Duncan)
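A quick look at the data frame confirms that the load worked. This is a minimal sketch that assumes the car package is installed; in car 3.0 and later the Duncan data live in the companion carData package, which library(car) attaches.

```r
# Assumes the car package is installed: install.packages("car")
library(car)      # attaches carData, which holds the Duncan data in car >= 3.0
data(Duncan)      # makes the Duncan data frame available in the workspace

str(Duncan)             # column names and types
head(Duncan)            # first six occupations
summary(Duncan$income)  # five-number summary plus the mean
```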
3
Getting started
Okay, now plot income levels:
– plot(Duncan$income)
What is this graph? Can you make it a line plot instead?
– plot(Duncan$income, type="l")
4
Histogram
The X axis of that plot is just the row index, so it tells us little. Wouldn’t a histogram be more informative?
Make a histogram. If you’re stuck, use Google.
– hist(Duncan$income)
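To see what the histogram is actually computing, hist() can return its bins without drawing. The sketch below assumes the car package is installed; the breaks value of 20 is an arbitrary choice for illustration, not something from the slides.

```r
library(car)
data(Duncan)

# plot = FALSE returns the binning without drawing anything
h <- hist(Duncan$income, plot = FALSE)
h$breaks   # bin edges chosen by the default rule
h$counts   # number of occupations in each bin

# Every observation lands in exactly one bin
sum(h$counts) == length(Duncan$income)

# A single number for 'breaks' is only a suggestion; R rounds to "pretty" edges
hist(Duncan$income, breaks = 20, main = "Income Distribution in Duncan Dataset")
```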
5
Fix the title
‘Histogram of Duncan$income’ is not a good title. Change it to ‘Income Distribution in Duncan Dataset’
– hist(Duncan$income, main="Income Distribution in Duncan Dataset")
6
Another option
There’s another way to set the title. Maybe some of you will have done this (my crystal ball is murky):
– hist(Duncan$income)
– title("Income Distribution in Duncan Dataset")
But wait. That looks awful: the new title is drawn on top of the default one. We need to not print the title as part of the hist() call. How do we do that?
– hist(Duncan$income, main="")
– title("Income Distribution in Duncan Dataset")
7
Scatterplot
Okay, let’s look at income vs. prestige. Make a scatterplot comparing income (x-axis) to prestige (y-axis)
– plot(Duncan$income, Duncan$prestige)
Did you get the x- and y-axes right? Add a title: Income vs. Prestige
– title("Income vs. Prestige")
8
Scatterplot: Axis labels
The axis labels display the variable names. Can we do better than that? Label the X axis “Income” and the Y axis “Prestige”
– plot(Duncan$income, Duncan$prestige, xlab="Income", ylab="Prestige")
9
Scatterplot: Axis range
How come income doesn’t have ticks at 0 and 100 but prestige does? Make both axes run from 0 to 100
– plot(Duncan$income, Duncan$prestige, xlab="Income", ylab="Prestige", xlim=c(0,100))
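One way to answer the tick question is to ask R what region it plotted and where it put the ticks. Ticks are only drawn inside the plotted region, which by default is the data range padded by about 4% on each side. This is a sketch assuming the car package is installed; par("usr") and axTicks() are standard base graphics calls.

```r
library(car)
data(Duncan)

range(Duncan$income)     # observed minimum and maximum of income
range(Duncan$prestige)   # observed minimum and maximum of prestige

# Default plot: the axes cover the data range plus a small margin
plot(Duncan$income, Duncan$prestige, xlab = "Income", ylab = "Prestige")
par("usr")    # c(x1, x2, y1, y2): the plotted region in data units
axTicks(1)    # tick positions R chose for the X axis
axTicks(2)    # tick positions R chose for the Y axis

# Forcing the X range changes which ticks fit
plot(Duncan$income, Duncan$prestige, xlab = "Income", ylab = "Prestige",
     xlim = c(0, 100))
axTicks(1)
```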
10
Scatterplot: Axis tick marks
Actually, your collaborator wants tick marks every 5 points on the X axis. DO IT
Caveat: this is trickier. Suppress the default X axis with xaxt="n", then draw your own with axis():
– plot(Duncan$income, Duncan$prestige, xlab="Income", ylab="Prestige", xlim=c(0,100), xaxt="n")
– axis(1, at=seq(0,100, by=5))
11
Axis labels sideways
Your collaborator still isn’t happy. Turn the x labels sideways.
– plot(Duncan$income, Duncan$prestige, xlab="Income", ylab="Prestige", xlim=c(0,100), xaxt="n")
– axis(1, las=2, at=seq(0,100, by=5))
12
More columns
Now your collaborator wants to see how education affects this relationship. Create a dichotomous variable named ‘high_education’ categorizing education > 50 as TRUE and <= 50 as FALSE
– Duncan$high_education <- Duncan$education > 50
13
High education: sanity check
How many high and low education jobs are there?
– table(Duncan$high_education)
Plot education (y-axis) by high_education (x-axis)
– plot(Duncan$high_education, Duncan$education)
Does it look right?
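The same sanity check can be done numerically. The sketch below assumes the car package is installed and recreates the high_education split described on the previous slide; tapply() is used here only as one convenient way to summarize by group.

```r
library(car)
data(Duncan)

# Same split as on the previous slide
Duncan$high_education <- Duncan$education > 50

# Range of education within each group: the FALSE group should top out at 50
# or below, and the TRUE group should start above 50
tapply(Duncan$education, Duncan$high_education, range)

# The two group counts should add up to the number of rows
sum(table(Duncan$high_education)) == nrow(Duncan)
```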
14
Adding color
Okay, now color your income/prestige graph so high-education jobs are blue and low-education jobs are red
This is a little tricky
– colors <- as.numeric(Duncan$high_education)+1
– plot(Duncan$income, Duncan$prestige, col=c("red", "blue")[colors], xlab="Income", ylab="Prestige", xlim=c(0,100), xaxt="n")
– axis(1, at=seq(0,100, by=5))
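The tricky part is the indexing: a logical vector is turned into positions 1 and 2, which then pick entries out of a two-color vector. Here is a small sketch of just that step, using a made-up logical vector (high, idx, palette_two and point_colors are illustrative names, not from the slides).

```r
# A made-up logical vector standing in for Duncan$high_education
high <- c(FALSE, TRUE, TRUE, FALSE)

idx <- as.numeric(high) + 1   # FALSE -> 1, TRUE -> 2
idx

palette_two <- c("red", "blue")
point_colors <- palette_two[idx]   # position 1 is "red", position 2 is "blue"
point_colors

# An equivalent, arguably more readable, spelling
ifelse(high, "blue", "red")
```

After drawing the colored scatterplot, a call such as legend("topleft", legend=c("Low education", "High education"), col=c("red", "blue"), pch=1) would tell readers what the colors mean.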
15
Box plot
Okay, now run this code:
– plot(Duncan$type, Duncan$income)
What happened? Why didn't we get a scatterplot? Can you get one?
– plot(as.numeric(Duncan$type), Duncan$income)
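The reason is that plot() chooses what to draw from the class of its first argument, and type is a factor. The sketch below, which assumes the car package is installed, checks the class and then labels the numeric scatterplot with the factor levels so the X axis stays readable.

```r
library(car)
data(Duncan)

class(Duncan$type)     # "factor": plot() draws box plots for a factor x
levels(Duncan$type)    # as.numeric() maps these to 1, 2, 3, in this order

# Scatterplot with the X axis relabelled by occupation type
plot(as.numeric(Duncan$type), Duncan$income,
     xaxt = "n", xlab = "Type", ylab = "Income")
axis(1, at = seq_along(levels(Duncan$type)), labels = levels(Duncan$type))
```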
16
More than one plot at a time
Now your collaborator wants your scatterplot and histogram side-by-side. (Don’t worry about color if you don't want to)
Save only the settable graphics parameters so that restoring them does not trigger warnings:
– opar <- par(no.readonly=TRUE)
– par(mfrow=c(1,2))
– hist(Duncan$income, main="Income Distribution in Duncan Dataset")
– plot(Duncan$income, Duncan$prestige, xlab="Income", ylab="Prestige", xlim=c(0,100), xaxt="n")
– axis(1, at=seq(0,100, by=5))
– par(opar)
17
ggplot
ggplot is a whole different beast from base graphics
ggplot is like R itself – some work to get oriented, but powerful once you do
You don't have to know ggplot to be successful using R
– But you do have to experiment with it for this class
18
Load the ggplot library
Hint: the package name, confusingly, is ggplot2
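For reference, loading it looks like this. The install line is commented out on the assumption that the package only needs installing once per machine.

```r
# One-time install (commented out so the script does not reinstall every run)
# install.packages("ggplot2")

library(ggplot2)            # the package is ggplot2; the main function is ggplot()
packageVersion("ggplot2")   # confirms which version was loaded
```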
19
Plot income vs. prestige
It will be easiest to start using qplot. qplot mimics plot(), but uses the ggplot layout engine.
– qplot(Duncan$income, Duncan$prestige)
20
ggplot
qplot is the training wheels version of ggplot
ggplot's syntax takes some getting used to. Try this:
– ggplot(Duncan) + aes(x=income, y=prestige) + geom_point()
Huh? What are the pluses about?
21
ggplot syntax
ggplot objects are weird
You execute them (like a command) to draw their plot
But you construct them by adding options to them
Options specify data source, data columns, etc., resulting in code like this:
p <- ggplot(Duncan)
p <- p + aes(x=income, y=prestige)
p + geom_point()
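The build-then-draw idea can be made explicit by storing each step and printing only at the end. This is a sketch assuming the car and ggplot2 packages are installed; the name q is just an illustrative variable.

```r
library(car)
library(ggplot2)
data(Duncan)

p <- ggplot(Duncan)                      # data source only; nothing to draw yet
p <- p + aes(x = income, y = prestige)   # map columns to the axes
class(p)                                 # p is an ordinary R object

q <- p + geom_point()                    # add a layer; still nothing is drawn
print(q)                                 # printing the object is what draws it
```

At the console, typing q on its own has the same effect as print(q), because R prints the value of any expression entered at the top level.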
22
Where ggplot shines
In my opinion, it's harder to think about doing simple plots in ggplot
But when I want to do something multifaceted (e.g. with different colors, sizes, etc.), ggplot makes it really easy
I use it a lot to understand 3+-way relationships in data
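As a small illustration with the Duncan data, the sketch below puts four variables on one plot by mapping two more columns to color and size. It assumes the car and ggplot2 packages are installed; the choice of which column goes to which aesthetic is arbitrary.

```r
library(car)
library(ggplot2)
data(Duncan)

# Four variables in one plot: position (income, prestige),
# color (occupation type), and point size (education)
p <- ggplot(Duncan) +
  aes(x = income, y = prestige, color = type, size = education) +
  geom_point() +
  labs(x = "Income", y = "Prestige", color = "Type", size = "Education",
       title = "Income vs. Prestige")
print(p)
```

ggplot builds the legends for color and size automatically, which is the part that takes extra work in base graphics.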
23
ggplot example (one of many)
24
ggplot code for that example
ggplot(data=nycnames) +
  aes(x=as.factor(race), y=n1_013002p, color=as.factor(nbhdarkwalk)) +
  geom_point(position="jitter") +
  scale_x_discrete(breaks=1:7, limits=1:7, name="Subject Race",
    labels=c('Asian', 'Black', 'First\nPeoples', 'Pacific\nIslander', 'Non-Hispanic\nWhite', 'Other', 'Hispanic')) +
  scale_color_discrete(breaks=1:4, limits=1:4, name="Neighborhood Safe After Dark",
    labels=c('Strongly Agree', 'Somewhat Agree', 'Somewhat Disagree', 'Strongly Disagree')) +
  scale_y_continuous(name="Neighborhood percent white (1km buffer)")
25
Exercises