

Introduction to Graphics in R 3/12/2014

First, let's get some data
Load the Duncan dataset. It's in the car package. Remember how to get it?
– library(car)
– data(Duncan)
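
Before plotting, it can help to look at what was loaded. The following is a minimal sketch, assuming the car package is installed (in recent versions the Duncan data are supplied by its companion package carData, which library(car) attaches):

```r
# Load the package and the dataset used throughout these slides
library(car)
data(Duncan)

# Structure of the data frame: one row per occupation, with the columns
# type (a factor), income, education and prestige
str(Duncan)

# First few rows, and a numeric summary of the column plotted next
head(Duncan)
summary(Duncan$income)
```

The row names are the occupation titles, which is why head() prints a label at the start of each row.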

Getting started
Okay, now plot income levels:
– plot(Duncan$income)
What is this graph? Can you make it a line plot instead?
– plot(Duncan$income, type="l")

Histogram
The X axis of that plot is just the row index, so it is useless. Wouldn't a histogram be more informative? Make a histogram. If you're stuck, use Google.
– hist(Duncan$income)

Fix the title
'Histogram of Duncan$income' is not a good title. Change it to 'Income Distribution in Duncan Dataset':
– hist(Duncan$income, main="Income Distribution in Duncan Dataset")

Another option
There's another way to set the title. Maybe some of you will have done this (my crystal ball is murky):
– hist(Duncan$income)
– title("Income Distribution in Duncan Dataset")
But wait. That looks awful: the new title is drawn on top of the default one. We need to not print the title as part of the hist() call. How do we do that?
– hist(Duncan$income, main="")
– title("Income Distribution in Duncan Dataset")

Scatterplot
Okay, let's look at income vs. prestige. Make a scatterplot comparing income (x-axis) to prestige (y-axis):
– plot(Duncan$income, Duncan$prestige)
Did you get the x- and y-axes right? Add a title, "Income vs. Prestige":
– title("Income vs. Prestige")

Scatterplot: Axis labels
The axis labels display the variable names. Can we do better than that? Label the X axis "Income" and the Y axis "Prestige":
– plot(Duncan$income, Duncan$prestige, xlab="Income", ylab="Prestige")

Scatterplot: Axis range
How come income doesn't have ticks at 0 and 100 but prestige does? Make both axes run from 0 to 100:
– plot(Duncan$income, Duncan$prestige, xlab="Income", ylab="Prestige", xlim=c(0,100))
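
One way to answer the question is to compare the ranges of the two variables. By default R pads the data range by about 4% on each side and places ticks at round values inside it, so a variable whose values stop well short of 0 and 100 gets no ticks there. The sketch below assumes the car package is installed; setting ylim as well is an addition that makes the y range explicit rather than relying on the default.

```r
library(car)
data(Duncan)

# Observed range of each variable
income_range   <- range(Duncan$income)
prestige_range <- range(Duncan$prestige)
income_range
prestige_range

# Set both limits explicitly so neither axis depends on the data range
plot(Duncan$income, Duncan$prestige,
     xlab = "Income", ylab = "Prestige",
     xlim = c(0, 100), ylim = c(0, 100))
```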

Scatterplot Axis Tick Marks
Actually, your collaborator wants tick marks every 5 points on the X axis. DO IT.
Caveat: this is trickier. xaxt="n" suppresses the default X axis so that axis() can draw a custom one:
– plot(Duncan$income, Duncan$prestige, xlab="Income", ylab="Prestige", xlim=c(0,100), xaxt="n")
– axis(1, at=seq(0,100, by=5))

Axis labels sideways
Your collaborator still isn't happy. Turn the x labels sideways (las=2 draws the labels perpendicular to the axis):
– plot(Duncan$income, Duncan$prestige, xlab="Income", ylab="Prestige", xlim=c(0,100), xaxt="n")
– axis(1, las=2, at=seq(0,100, by=5))

More columns
Now your collaborator wants to see how education affects this relationship. Create a dichotomous variable named 'high_education' categorizing education > 50 as TRUE and <= 50 as FALSE:
– Duncan$high_education <- Duncan$education > 50

High education: sanity check
How many high and low education jobs are there?
– table(Duncan$high_education)
Plot education (y-axis) by high_education (x-axis):
– plot(Duncan$high_education, Duncan$education)
Does it look right?
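
The same sanity check can be done numerically. A minimal sketch, assuming the car package is installed and using the cutoff of 50 from the previous slide:

```r
library(car)
data(Duncan)

# Dichotomous variable: TRUE when education is above 50
Duncan$high_education <- Duncan$education > 50

# Number of jobs in each group
table(Duncan$high_education)

# Education range within each group: the FALSE group should stop at 50
# or below, and the TRUE group should start above 50
tapply(Duncan$education, Duncan$high_education, range)
```

If the two ranges overlap, the comparison that built the variable is wrong.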

Adding color
Okay, now color your income/prestige graph so high-education jobs are blue and low-education jobs are red. This is a little tricky:
– colors <- as.numeric(Duncan$high_education)+1
– plot(Duncan$income, Duncan$prestige, col=c("red", "blue")[colors], xlab="Income", ylab="Prestige", xlim=c(0,100), xaxt="n")
– axis(1, at=seq(0,100, by=5))
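
The indexing trick works because FALSE becomes 1 (red) and TRUE becomes 2 (blue). An alternative that some may find easier to read is ifelse(), sketched below with a legend added. The name point_col and the legend text are illustrative choices, not part of the original slides.

```r
library(car)
data(Duncan)
Duncan$high_education <- Duncan$education > 50

# Map TRUE/FALSE directly to color names
point_col <- ifelse(Duncan$high_education, "blue", "red")

plot(Duncan$income, Duncan$prestige, col = point_col,
     xlab = "Income", ylab = "Prestige", xlim = c(0, 100), xaxt = "n")
axis(1, at = seq(0, 100, by = 5))

# A legend lets readers decode the colors without seeing the code
legend("topleft",
       legend = c("Education > 50", "Education <= 50"),
       col = c("blue", "red"), pch = 1)
```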

Box plot
Okay, now run this code:
– plot(Duncan$type, Duncan$income)
What happened? Why didn't we get a scatterplot? Can you get one?
– plot(as.numeric(Duncan$type), Duncan$income)
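
What happened is that Duncan$type is a factor, and plot() with a factor on the x axis draws one box-and-whisker plot per level instead of individual points. A minimal sketch of both displays, assuming the car package is installed; the axis() call that restores the level names is an addition for readability:

```r
library(car)
data(Duncan)

# type is a factor, which is why plot() switches to box plots
class(Duncan$type)
levels(Duncan$type)

# The same display, requested explicitly
boxplot(income ~ type, data = Duncan, xlab = "Type", ylab = "Income")

# Scatterplot of the individual points: convert the factor to its integer
# codes, then label the codes with the level names
type_codes <- as.numeric(Duncan$type)
plot(type_codes, Duncan$income, xaxt = "n", xlab = "Type", ylab = "Income")
axis(1, at = seq_along(levels(Duncan$type)), labels = levels(Duncan$type))
```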

More than one plot at a time
Now your collaborator wants your scatterplot and histogram side-by-side. (Don't worry about color if you don't want to.) Save the graphics settings first; no.readonly=TRUE leaves out the read-only parameters, which would otherwise trigger warnings when restored:
– opar <- par(no.readonly=TRUE)
– par(mfrow=c(1,2))
– hist(Duncan$income, main="Income Distribution in Duncan Dataset")
– plot(Duncan$income, Duncan$prestige, xlab="Income", ylab="Prestige", xlim=c(0,100), xaxt="n")
– axis(1, at=seq(0,100, by=5))
– par(opar)
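
A shorter variant relies on the fact that par() returns the previous values of whatever it changes, so the settings can be saved and changed in one call. This is a sketch of that alternative, assuming the car package is installed:

```r
library(car)
data(Duncan)

# par() returns the old value of mfrow while setting the new one
opar <- par(mfrow = c(1, 2))

hist(Duncan$income, main = "Income Distribution in Duncan Dataset")
plot(Duncan$income, Duncan$prestige,
     xlab = "Income", ylab = "Prestige", xlim = c(0, 100), xaxt = "n")
axis(1, at = seq(0, 100, by = 5))

# Restore the single-panel layout
par(opar)
```

Only the parameters that were changed are restored, which is usually all that is needed.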

ggplot
ggplot is a whole different beast from base graphics.
ggplot is like R itself: some work to get oriented, but powerful once you do.
You don't have to know ggplot to be successful using R
– But you do have to experiment with it for this class

Load the ggplot library
Hint: the package name, confusingly, is ggplot2

Plot income vs. prestige
It will be easiest to start using qplot. qplot mimics plot(), but uses the ggplot layout engine.
– qplot(Duncan$income, Duncan$prestige)

ggplot
qplot is the training wheels version of ggplot. ggplot's syntax takes some getting used to. Try this:
– ggplot(Duncan) + aes(x=income, y=prestige) + geom_point()
Huh? What are the pluses about?

ggplot syntax
ggplot objects are weird.
You execute them (like a command) to draw their plot.
But you construct them by adding options to them.
Options specify data source, data columns, etc., resulting in code like this:
– p <- ggplot(Duncan)
– p <- p + aes(x=income, y=prestige)
– p + geom_point()
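
The build-then-draw behavior can be seen by storing each step in a variable. This is a minimal sketch, assuming the car and ggplot2 packages are installed; the name p_points is illustrative.

```r
library(car)
library(ggplot2)
data(Duncan)

# Build the plot object step by step; nothing is drawn by assignment
p <- ggplot(Duncan) + aes(x = income, y = prestige)

# Adding a layer returns a new object and leaves p unchanged
p_points <- p + geom_point()

# Typing the object's name at the console draws it. Inside functions,
# loops, or sourced scripts, call print() explicitly.
print(p_points)
```

Because p is unchanged, it can be reused with a different layer, for example p + geom_line().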

Where ggplot shines
In my opinion, it's harder to think about doing simple plots in ggplot.
But when I want to do something multifaceted (e.g. with different colors, sizes, etc.), ggplot makes it really easy.
I use it a lot to understand 3+-way relationships in data.
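
As an illustration on the Duncan data, the sketch below shows four variables in one plot by mapping occupation type to color and education to point size. It assumes the car and ggplot2 packages are installed; the choice of mappings and the legend titles are illustrative.

```r
library(car)
library(ggplot2)
data(Duncan)

# Income vs. prestige, with type shown by color and education by size
p3 <- ggplot(Duncan,
             aes(x = income, y = prestige, color = type, size = education)) +
  geom_point() +
  labs(x = "Income", y = "Prestige",
       color = "Occupation type", size = "Education")

print(p3)
```

ggplot builds both legends automatically, which base graphics would need separate legend() calls for.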

ggplot example (one of many)

ggplot code for that example
ggplot(data=nycnames) +
  aes(x=as.factor(race), y=n1_013002p, color=as.factor(nbhdarkwalk)) +
  geom_point(position="jitter") +
  scale_x_discrete(breaks=1:7, limits=1:7, name="Subject Race",
    labels=c('Asian', 'Black', 'First\nPeoples', 'Pacific\nIslander', 'Non-Hispanic\nWhite', 'Other', 'Hispanic')) +
  scale_color_discrete(breaks=1:4, limits=1:4, name="Neighborhood Safe After Dark",
    labels=c('Strongly Agree', 'Somewhat Agree', 'Somewhat disagree', 'Strongly Disagree')) +
  scale_y_continuous(name="Neighborhood percent white (1km buffer)")

Exercises