1
R workshop for Advanced users
Sohee Kang, PhD Math and Stats Learning Centre & Computer and Mathematical Sciences
2
Subscripting
Subscripting can be used to access and manipulate the elements of objects such as vectors, matrices, arrays, data frames and lists. Subscripting operations are fast and efficient, and should be the preferred method when dealing with data in R.
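As a quick illustration for the two object types that the later slides do not show in detail, here is a minimal sketch of subscripting a list and a data frame. The objects `lst` and `df` are made up for this example and are not part of the workshop material.

```r
# Hypothetical objects, used only for illustration
lst <- list(id = 1:3, label = c("a", "b", "c"))
df  <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))

sub_list  <- lst["id"]     # single brackets keep the container: a list of length 1
id_vec    <- lst[["id"]]   # double brackets extract the element itself: an integer vector
y_col     <- df$y          # $ extracts one column of a data frame as a vector
first_two <- df[1:2, ]     # row subscripts work as they do for matrices

print(class(sub_list))
print(id_vec)
print(first_two)
```

The difference between `[` and `[[` matters mostly for lists: the first returns a smaller list, the second returns the stored object.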
3
Numeric subscripts
In R, the first element of an object has subscript 1. A vector of subscripts can be used to access multiple elements of an object.
> x <- 1:10
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> x[c(1,3,5)]
[1] 1 3 5
Negative subscripts extract all elements of an object except the ones specified.
> x[-c(1,3,5)]
[1]  2  4  6  7  8  9 10
4
character subscripts
If a subscriptable object has names associated with it, a character string or a vector of character strings can be used as subscripts.
> x <- 1:10
> names(x) <- letters[1:10]
> x[c("a", "b", "c")]
a b c 
1 2 3 
Note: Negative character subscripts are not allowed.
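Since negative character subscripts are not allowed, excluding elements by name needs a small workaround. The sketch below shows two common options; the variable names `drop_names`, `kept` and `kept2` are made up for this example.

```r
x <- 1:10
names(x) <- letters[1:10]

# x[-c("a", "b", "c")] fails: unary minus is not defined for character vectors
drop_names <- c("a", "b", "c")

kept  <- x[!(names(x) %in% drop_names)]     # logical subscript built from the names
kept2 <- x[setdiff(names(x), drop_names)]   # or select the remaining names directly

print(kept)
```

Both forms return the same named vector here; the logical version also works when names are duplicated.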
5
logical subscripts
We can use logical values to choose which elements of the object to access. Elements corresponding to TRUE in the logical vector are included, and elements corresponding to FALSE are ignored.
> x <- 1:10; names(x) <- letters[1:10]
> x > 5
    a     b     c     d     e     f     g     h     i     j 
FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE 
> x[x > 5]
 f  g  h  i  j 
 6  7  8  9 10 
> # using a logical subscript to modify the object
> x[x > 5] <- 0
> x
a b c d e f g h i j 
1 2 3 4 5 0 0 0 0 0 
6
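subscripting multidimensional objects
For multidimensional objects, subscripts can be provided for each dimension. To select all elements of a given dimension, use the "empty" subscript. A single subscript treats the matrix as a vector stored column by column, so mat[5] is the element in row 2, column 2.
> mat <- matrix(1:12, 3, 4, byrow=TRUE)
> mat
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
> mat[5]
[1] 6
> mat[2,2]
[1] 6
> mat[1, ]
[1] 1 2 3 4
> mat[c(1,3),]
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    9   10   11   12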
7
the order() function: sorting rows of a matrix arbitrarily
# sort the iris data frame by Sepal.Length
> iris.sort <- iris[order(iris[,"Sepal.Length"]),]
> head(iris.sort)
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
14          4.3         3.0          1.1         0.1  setosa
9           4.4         2.9          1.4         0.2  setosa
39          4.4         3.0          1.3         0.2  setosa
43          4.4         3.2          1.3         0.2  setosa
42          4.5         2.3          1.3         0.3  setosa
4           4.6         3.1          1.5         0.2  setosa
Try to sort the iris data frame in decreasing order with respect to Sepal.Width.
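For variations on this kind of sorting, the following sketch shows two options of order() on a small made-up data frame: the `decreasing` argument, and sorting by more than one key. The data frame `scores` and its columns are hypothetical.

```r
# Hypothetical data, used only for illustration
scores <- data.frame(name  = c("ann", "bob", "cat", "dan"),
                     group = c("B", "A", "B", "A"),
                     score = c(7, 9, 9, 5))

# Largest score first
by_score_desc <- scores[order(scores$score, decreasing = TRUE), ]

# Sort by group, and within each group by score from high to low
# (negating a numeric key reverses its order)
by_group_then_score <- scores[order(scores$group, -scores$score), ]

print(by_score_desc)
print(by_group_then_score)
```

order() returns row positions rather than sorted values, which is why its result is used as a row subscript.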
8
the drop= argument: avoiding dimension reduction
By default, subscripting operations reduce the dimensions of an array whenever possible. To avoid that, we can use the drop=FALSE argument.
> s1 <- mat[1,]; s1
[1] 1 2 3 4
> dim(s1)
NULL
> s2 <- mat[1,,drop=FALSE]; s2
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
> dim(s2)
[1] 1 4
9
combined selections for matrices
Suppose we want to get all the columns for which the element in the first row is less than 3:
> mat[ , mat[1, ] < 3]
     [,1] [,2]
[1,]    1    2
[2,]    5    6
[3,]    9   10
10
complex logical expressions: subscripting data frames
> dat <- data.frame(a = seq(5, 20, by=3), b = c(8, NA, 12, 15, NA, 21))
> dat
   a  b
1  5  8
2  8 NA
3 11 12
4 14 15
5 17 NA
6 20 21
> dat[dat$b < 10, ]
      a  b
1     5  8
NA   NA NA
NA.1 NA NA
> # removing the missing values
> dat[!is.na(dat$b) & (dat$b < 10), ]
  a b
1 5 8
11
the function subset(): subscripting data frames
The function subset() allows one to select elements of a data frame in a very simple way. Rows where the condition evaluates to NA are dropped.
> dat <- data.frame(a = seq(5, 20, by=3), b = c(8, NA, 12, 15, NA, 21))
> subset(dat, b < 10)
  a b
1 5 8
Note: The subset() function always returns a new data frame, matrix or vector, and is not suitable for modifying elements of a data frame.
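Because subset() returns a copy, changing values in place has to go through subscripts. The sketch below is one way to do that while staying clear of the missing values; the threshold 10 and the replacement value 0 are arbitrary choices for illustration.

```r
dat <- data.frame(a = seq(5, 20, by = 3), b = c(8, NA, 12, 15, NA, 21))

# which() keeps only the positions where the test is TRUE, so NA results are skipped
rows <- which(dat$b > 10)

# Assign through the subscript to modify the data frame itself
dat$b[rows] <- 0

print(dat)
```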
12
Exercise 1
13
Loops and Functions
A loop allows the program to execute commands repeatedly. Loops are common to many programming languages, and their use may facilitate the implementation of many operations.
There are three kinds of loops in R:
`for' loops
`while' loops
`repeat' loops
Note: Loops can be very inefficient in R. For that reason, their use is not advised unless necessary.
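To make the note about efficiency concrete, here is a minimal sketch that computes the same square roots three ways: with an explicit loop, with a vectorized call, and with sapply(). The variable names are made up for this example.

```r
n <- 10

# Loop version: fill a preallocated vector one element at a time
roots_loop <- numeric(n)
for (i in seq_len(n)) {
  roots_loop[i] <- sqrt(i)
}

# Vectorized version: sqrt() works on the whole vector at once
roots_vec <- sqrt(1:n)

# sapply() applies a function to each element and collects the results
roots_apply <- sapply(1:n, sqrt)

print(all.equal(roots_loop, roots_vec))
```

When a vectorized function exists for the task, it is usually both shorter and faster than the equivalent loop.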
14
`for' loops
General form:
for (variable in sequence) {
  set_of_expressions
}
> for(i in 1:10) { print(sqrt(i)) }
[1] 1
[1] 1.414214
[1] 1.732051
...
[1] 3.162278
15
Easy Example:
col.v <- rainbow(100)
cex.v <- seq(1, 10, length.out=100)
plot(0:1, 0:1, type="n")
for(i in 1:200) {
  print(i)
  points(x=runif(1), y=runif(1), pch=16, col=sample(col.v, size=1), cex=sample(cex.v, size=1))
  Sys.sleep(0.1)
}
16
`while' loops
General form:
while (condition) {
  set_of_expressions
}
> a <- 0; b <- 1
> while(b < 10) {
    print(b)
    temp <- a+b
    a <- b
    b <- temp
  }
[1] 1
[1] 1
[1] 2
[1] 3
[1] 5
[1] 8
17
`repeat' loops
General form:
repeat {
  set_of_expressions
  if (condition) { break }
}
> a <- 0; b <- 1
> repeat {
    print(b)
    temp <- a+b
    a <- b
    b <- temp
    if(b>=10){break}
  }
[1] 1
[1] 1
[1] 2
[1] 3
[1] 5
[1] 8
Note: A repeat loop has no condition of its own; it is terminated by the break command.
18
cleaning the mess
To get a cleaner result when working with loops, we can collect the values in a vector and print it once at the end:
# Arithmetic Progression
> x <- 1; d <- 2
> while (length(x) < 10) {
    position <- length(x)
    new <- x[position]+d
    x <- c(x,new)
  }
> print(x)
 [1]  1  3  5  7  9 11 13 15 17 19
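For this particular task the loop can be avoided altogether. The sketch below builds the same arithmetic progression with seq() and with a direct formula; the names `x1`, `n`, `ap_seq` and `ap_formula` are made up for this example.

```r
x1 <- 1   # first term
d  <- 2   # common difference
n  <- 10  # number of terms

# seq() generates the progression directly
ap_seq <- seq(from = x1, by = d, length.out = n)

# The same values from the closed form x1 + d * k, for k = 0, ..., n - 1
ap_formula <- x1 + d * (0:(n - 1))

print(ap_seq)
```

Growing a vector with c() inside a loop copies it on every iteration, so a direct construction like this scales better for long sequences.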
19
writing functions
A function is a collection of commands that performs a specific task.
General form:
function.name <- function (arguments){
  set_of_expressions
  return (answer)
}
20
writing functions
Example: Arithmetic Progression
> AP <- function(a, d, n){
    x <- a
    while (length(x) < n){
      position <- length(x)
      new <- x[position]+d
      x <- c(x, new)
    }
    return(x)
  }
21
writing functions
Once you run this code, a new function called AP is available. To run the function, type on the console:
> AP(1,2,10)
 [1]  1  3  5  7  9 11 13 15 17 19
> AP(1,0,10)
 [1] 1 1 1 1 1 1 1 1 1 1
Note that for d==0 the function returns a sequence of ones. We can easily fix this with an if statement.
22
the `if' statement
General form:
if (condition) {
  set_of_expressions
}
We can also combine the `if' with the `else' statement:
if (condition) {
  set_of_expressions
} else {
  set_of_expressions
}
23
the `if' statement
> AP <- function(a, d, n){
    if(d == 0) {
      return("Error: argument `d' should not be 0")
    } else {
      x <- a
      while (length(x) < n){
        position <- length(x)
        new <- x[position]+d
        x <- c(x, new)
      }
      return(x)
    }
  }
> AP(1, 0, 3)
[1] "Error: argument `d' should not be 0"
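Returning a message as a string works, but the caller cannot easily tell it apart from a normal result. A possible alternative, not taken from the slides, is to signal a real error with stop() and catch it with tryCatch(). The function name `AP2` is hypothetical and chosen so that it does not overwrite AP.

```r
# AP2 is a hypothetical variant of AP that signals an error instead of returning text
AP2 <- function(a, d, n) {
  if (d == 0) {
    stop("argument 'd' should not be 0")   # raises an error condition
  }
  x <- a
  while (length(x) < n) {
    x <- c(x, x[length(x)] + d)
  }
  x
}

ok  <- AP2(1, 2, 5)

# tryCatch() intercepts the error and returns its message instead of halting
msg <- tryCatch(AP2(1, 0, 3), error = function(e) conditionMessage(e))

print(ok)
print(msg)
```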
24
Exercise 2
25
Graphics
R offers an incredible variety of graphs. Type this code to get a sense of what is possible:
demo(graphics)
x <- 10*(1:nrow(volcano))
y <- 10*(1:ncol(volcano))
image(x, y, volcano, col = terrain.colors(100), axes = FALSE)
contour(x, y, volcano, levels = seq(90, 200, by = 5), add = TRUE, col = "peru")
axis(1, at = seq(100, 800, by = 100))
axis(2, at = seq(100, 600, by = 100))
box()
title(main = "Maunga Whau Volcano", font.main = 4)
26
Managing graphics and graphical devices: opening multiple graphical devices
data(iris)
plot(iris$Sepal.Length, iris$Sepal.Width, pch=19)
dev.new()
plot(iris$Sepal.Length, iris$Petal.Length, pch=19)
# you can also use "X11()", but it may not work on some Mac computers
27
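Managing graphics and graphical devices: saving plots to files
jpeg(file="SepalLength_vs_SepalWidth.jpeg")
plot(iris$Sepal.Length, iris$Sepal.Width, pch=19)
dev.off() # closes the graphical device and writes the file
png(file="SepalLength_vs_SepalWidth.png")
plot(iris$Sepal.Length, iris$Sepal.Width, pch=19)
dev.off()
pdf(file="SepalLength_vs_SepalWidth.pdf")
plot(iris$Sepal.Length, iris$Sepal.Width, pch=19)
dev.off()
postscript(file="SepalLength_vs_SepalWidth.ps") # often used for publication
plot(iris$Sepal.Length, iris$Sepal.Width, pch=19)
dev.off()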
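A file device only writes its output once dev.off() is called, so it helps to make that call hard to forget. The sketch below wraps the open, plot, close sequence in a small helper; the function name `save_scatter_pdf` and the temporary output path are assumptions made for this example.

```r
# Hypothetical helper: draw one scatterplot into a PDF file
save_scatter_pdf <- function(path) {
  pdf(file = path)
  on.exit(dev.off())   # closes the device even if plotting fails
  plot(iris$Sepal.Length, iris$Sepal.Width, pch = 19)
  invisible(path)
}

out_file <- tempfile(fileext = ".pdf")   # a path in the session's temporary directory
save_scatter_pdf(out_file)

print(file.exists(out_file))
```

The same pattern should carry over to jpeg(), png() and postscript(), which differ mainly in the device function that opens the file.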
28
High level graphical functions (create a graph)
iris[1:5,]
plot(iris$Sepal.Length)
plot(iris$Sepal.Length, iris$Sepal.Width)
plot(iris$Petal.Length, iris$Petal.Width)
plot(iris$Petal.Length, iris$Petal.Width, xlab="Petal length (cm)", ylab="Petal width (cm)", cex.axis=1.5, cex.lab=1.5, bty="n", pch=19)
boxplot(iris$Sepal.Length ~ iris$Species)
boxplot(iris$Sepal.Length ~ iris$Species, names=expression(italic("Iris setosa"), italic("Iris versicolor"), italic("Iris virginica")), ylab="Sepal length (cm)", cex.axis=1.5, cex.lab=1.5)
coplot(iris$Petal.Length ~ iris$Petal.Width | iris$Sepal.Length, overlap=0, pch=19)
pairs(iris)
hist(iris$Sepal.Length)
29
Low level graphical functions (affect an existing graph)
plot(iris$Petal.Length, iris$Petal.Width, xlab="Petal length (cm)", ylab="Petal width (cm)", cex=1.3, cex.axis=1.5, cex.lab=1.5, bty="n")
points(iris$Petal.Length[iris$Species=="setosa"], iris$Petal.Width[iris$Species=="setosa"], cex=1.3, pch=19, col="red")
points(iris$Petal.Length[iris$Species=="versicolor"], iris$Petal.Width[iris$Species=="versicolor"], cex=1.3, pch=19, col="blue")
points(iris$Petal.Length[iris$Species=="virginica"], iris$Petal.Width[iris$Species=="virginica"], cex=1.3, pch=19, col="green")
legend("topleft", c("Iris setosa", "I. versicolor", "I. virginica"), pch=19, col=c("red", "blue", "green"), cex=1.3)
legend("bottomright", c(expression(italic("Iris setosa")), expression(italic("Iris versicolor")), expression(italic("Iris virginica"))),
legend(1.3, 1.9, c(expression(italic("Iris setosa")),
30
Plotting in 2d
The cars data frame is a two-column data set of car speeds and stopping distances from the 1920s.
> head(cars)
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
6     9   10
31
Plotting in 2d
By default plot() produces a scatterplot.
> plot(cars)
Axis labels are taken from the names in the data frame.
Axis scales are taken from the range of the data.
32
Plotting in 2d
To add details it’s better to use low-level functions.
plot(cars,type="p")
lines(lowess(cars),col="red") # lines() adds the smoothed curve to the existing plot
33
Combining plots
To show multiple plots in one graphics window, use the par() command with the mfrow parameter:
par(mfrow=c(2,2))
plot(cars,type="p")
plot(cars,type="l")
plot(cars,type="h")
plot(cars,type="s")
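The mfrow setting stays in effect for the device until it is changed again. One common pattern, sketched here as a suggestion rather than as part of the slides, is to save the old settings returned by par() and restore them afterwards. The call to pdf(NULL) is only there so the sketch also runs without a display; drop it to draw on screen.

```r
pdf(NULL)                      # null device: nothing is written to disk

op <- par(mfrow = c(2, 2))     # par() returns the previous values of what it changes
plot(cars, type = "p")
plot(cars, type = "l")
plot(cars, type = "h")
plot(cars, type = "s")
par(op)                        # restore one plot per page

restored <- par("mfrow")
print(restored)

invisible(dev.off())
```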
34
Managing graphics and graphical devices: partitioning a graphical device
layout(matrix(1:4, 2, 2)) # see help for "layout"
layout.show(4) # see help for "layout.show"
plot(iris$Sepal.Length, iris$Sepal.Width, pch=19, cex.lab=1.5, cex.axis=1.5, xlab="Sepal length (cm)", ylab="Sepal width (cm)")
plot(iris$Sepal.Length, iris$Petal.Length, pch=19, cex.lab=1.5, cex.axis=1.5, xlab="Sepal length (cm)", ylab="Petal length (cm)")
plot(iris$Sepal.Length, iris$Petal.Width, pch=19, cex.lab=1.5, cex.axis=1.5, xlab="Sepal length (cm)", ylab="Petal width (cm)")
plot(iris$Sepal.Length, iris$Petal.Width, pch=19, type="n", axes=FALSE, bty="n", xlab="", ylab="")
mtext("Sepal length", line=-3, cex=1.5)
mtext("versus", line=-5, cex=1.5)
mtext("other variables", line=-7, cex=1.5)
mtext("in Anderson's Iris", line=-9, cex=1.5)

layout(matrix(1:6, 3, 2))
layout.show(6)
plot(iris$Sepal.Length, iris$Sepal.Width, pch=19)
plot(iris$Sepal.Length, iris$Petal.Length, pch=19)
plot(iris$Sepal.Length, iris$Petal.Width, pch=19)
plot(iris$Sepal.Width, iris$Petal.Length, pch=19)
plot(iris$Sepal.Width, iris$Petal.Width, pch=19)
plot(iris$Petal.Length, iris$Petal.Width, pch=19)
35
Plotting in 3D
Some data are well suited to three-dimensional plots.
The matrix volcano records elevations of the volcano Maunga Whau in New Zealand.
volcano[1:4,1:4]
dim(volcano)
x<-1:87
y<-1:61
# the default is a heat map
image(x,y,volcano)
36
Plotting in 3D
# changing the color map; terrain.colors() expects a single number of colors
image(x,y,volcano,col=terrain.colors(100))
37
Plotting in 3D
# a contour map
contour(x,y,volcano)
# a perspective plot
persp(x,y,volcano)
38
The package ggplot2
The ggplot2 package was created in 2005 by Hadley Wickham.
It implements the object-oriented design ideas of Leland Wilkinson’s The Grammar of Graphics.
39
The package ggplot2
The function qplot() is a ggplot2 version of the regular function plot().
library(ggplot2)
qplot(speed,dist,data=cars)
40
The package ggplot2
Combining geoms can produce more complex plots.
library(ggplot2)
library(foreign)
dat <- read.dta(" ologit.dta")
ggplot(dat, aes(x = apply, y = gpa)) + geom_boxplot(size = .75)
41
The package ggplot2
p <- ggplot(dat, aes(x = apply, y = gpa)) + geom_boxplot(size = .75)
p <- p + geom_jitter(alpha = .5)
p