Ch.6 Efficient calculations
6.1 Vectorized computations

Replacing numbers with a ‘for’ loop

    tmp <- Sys.time()
    x <- rnorm(15000)
    for (i in 1:length(x)) {
      if (x[i] > 1) {
        x[i] <- 1
      }
    }
    Sys.time() - tmp
    Time difference of secs
Replacing numbers with a ‘vectorized’ function

    tmp <- Sys.time()
    x <- rnorm(15000)
    x[x > 1] <- 1
    Sys.time() - tmp
    Time difference of secs
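A single Sys.time() difference can be noisy, so the sketch below compares the two approaches with system.time() on the same data. The helper names clip_loop and clip_vec are hypothetical and only serve this illustration. Timings depend on the machine, so none are shown.

```r
# Hypothetical helpers (names are illustrative, not from the slides)
clip_loop <- function(x) {
  for (i in seq_along(x)) {
    if (x[i] > 1) {
      x[i] <- 1
    }
  }
  x
}

clip_vec <- function(x) {
  x[x > 1] <- 1   # logical indexing replaces all matching elements at once
  x
}

set.seed(1)       # fixed seed so both versions see the same data
x <- rnorm(15000)

t_loop <- system.time(res_loop <- clip_loop(x))
t_vec  <- system.time(res_vec  <- clip_vec(x))

identical(res_loop, res_vec)   # both approaches give the same vector
t_loop["elapsed"]              # elapsed seconds for the loop
t_vec["elapsed"]               # elapsed seconds for the vectorized version
```

Checking that both versions return the same vector before comparing their speed guards against timing two computations that do different things.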
The if-else expression

    tmp <- Sys.time()
    x <- rnorm(15000)
    for (i in 1:length(x)) {
      if (x[i] > 1) {
        x[i] <- 1
      } else {
        x[i] <- -1
      }
    }
    Sys.time() - tmp
    Time difference of secs
A vectorized function: ‘ifelse’

    tmp <- Sys.time()
    x <- rnorm(15000)
    x <- ifelse(x > 1, 1, -1)
    Sys.time() - tmp
    Time difference of secs
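As a quick check that ifelse reproduces the loop with an if-else expression, the following sketch applies both to the same vector. The variable names y_loop and y_vec are chosen here for illustration.

```r
set.seed(2)
x <- rnorm(15000)

# Loop version: 1 where x > 1, otherwise -1
y_loop <- numeric(length(x))
for (i in seq_along(x)) {
  if (x[i] > 1) {
    y_loop[i] <- 1
  } else {
    y_loop[i] <- -1
  }
}

# Vectorized version: the test x > 1 is evaluated for all elements at once
y_vec <- ifelse(x > 1, 1, -1)

all(y_loop == y_vec)   # the two results agree element by element
table(y_vec)           # counts of -1 and 1
```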
The cumsum function

    x <- 1:10
    y <- cumsum(x)
    y
    [1]  1  3  6 10 15 21 28 36 45 55
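To make explicit what cumsum replaces, this sketch computes the same running total with a for-loop and compares the results. The names y_loop, y_vec and total are illustrative.

```r
x <- 1:10

# Explicit running total
y_loop <- numeric(length(x))
total <- 0
for (i in seq_along(x)) {
  total <- total + x[i]
  y_loop[i] <- total
}

# Vectorized running total
y_vec <- cumsum(x)

all(y_loop == y_vec)             # same values
y_vec[length(y_vec)] == sum(x)   # the last cumulative sum equals the total
```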
Matrix multiplication

    C <- A %*% B

If we choose the elements of the matrices A and B ‘cleverly’, explicit for-loops can be avoided. For example, multiplying t(A) by a vector of n weights 1/n gives the column means of A:

    A <- matrix(rnorm(1000), ncol = 10)
    n <- dim(A)[1]
    mat.means <- t(A) %*% rep(1/n, n)
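The product works because entry j of t(A) %*% rep(1/n, n) is the sum over i of A[i, j] / n, which is the mean of column j. The sketch below checks this against colMeans and also shows crossprod, a base R function that computes t(A) %*% w without forming the transpose. The name mat.means2 is illustrative.

```r
set.seed(3)
A <- matrix(rnorm(1000), ncol = 10)   # 100 rows, 10 columns
n <- nrow(A)

mat.means <- t(A) %*% rep(1/n, n)     # 10 x 1 matrix of column means

# crossprod(A, w) computes t(A) %*% w without building t(A) explicitly
mat.means2 <- crossprod(A, rep(1/n, n))

all.equal(as.vector(mat.means), colMeans(A))
all.equal(mat.means, mat.means2)
```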
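6.2 The apply and outer functions

The apply function

To calculate the means of all columns in a matrix, use the following syntax:

    M <- matrix(rnorm(10000), ncol = 100)
    apply(M, 2, mean)   # MARGIN = 2 applies mean to each column
    colMeans(M)         # same result with a dedicated function
    rowMeans(M)         # row means, equivalent to apply(M, 1, mean)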
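The second argument of apply, MARGIN, selects the dimension over which the function is applied: 1 for rows, 2 for columns. A small matrix makes the difference visible; this is a sketch using a 2 x 3 example chosen here for illustration.

```r
M <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns, filled column by column

apply(M, 1, mean)   # MARGIN = 1: one value per row    -> 3 4
apply(M, 2, mean)   # MARGIN = 2: one value per column -> 1.5 3.5 5.5

all.equal(apply(M, 1, mean), rowMeans(M))
all.equal(apply(M, 2, mean), colMeans(M))
```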
The function apply can also be used with a function that you have written yourself. Extra arguments to your function must then be passed through the apply function. The following construction calculates, for each column of a matrix, the number of entries that are larger than a threshold d.
    tresh <- function(x, d) {
      sum(x > d)
    }
    M <- matrix(rnorm(10000), ncol = 100)
    apply(M, 2, tresh, d = 0.6)   # returns one count per column

    ?lapply
    ?sapply
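The sketch below assumes that tresh counts the entries of a vector that exceed d, as the text describes, and compares the apply call with a fully vectorized alternative based on colSums. The names counts_apply and counts_vec are illustrative.

```r
# Assumed definition: number of entries of x larger than the threshold d
tresh <- function(x, d) {
  sum(x > d)
}

set.seed(4)
M <- matrix(rnorm(10000), ncol = 100)

# The extra argument d is passed through apply to tresh
counts_apply <- apply(M, 2, tresh, d = 0.6)

# Alternative: M > 0.6 is a logical matrix, colSums counts the TRUEs per column
counts_vec <- colSums(M > 0.6)

all(counts_apply == counts_vec)
length(counts_apply)   # one count per column
```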
The tapply function

The tapply function is used to run another function on the cells of a so-called ragged array. A ragged array is a pair of two vectors of the same size. One of them contains data and the other contains grouping information. The following data vector x and grouping vector y form an example of a ragged array.

    x <- rnorm(50)
    y <- as.factor(sample(c("A", "B", "C", "D"), size = 50, replace = TRUE))
    tapply(x, y, mean, trim = 0.3)
    A B C D
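One way to see what tapply does is to split the data by group and apply the function to each piece. The following sketch does this with split and sapply and compares the result with tapply. The names m_tapply and m_split are illustrative.

```r
set.seed(5)
x <- rnorm(50)
y <- factor(sample(c("A", "B", "C", "D"), size = 50, replace = TRUE))

# Trimmed mean per group; trim = 0.3 is passed on to mean
m_tapply <- tapply(x, y, mean, trim = 0.3)

# Same computation in two explicit steps: split by group, then apply mean
m_split <- sapply(split(x, y), mean, trim = 0.3)

all.equal(as.vector(m_tapply), as.vector(m_split))
names(m_tapply)   # one entry per factor level
```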
The outer function

The outer function evaluates a function for every combination of the elements of the vectors x and y. Some examples are given by the code below.

    x <- 1:3
    y <- 1:3
    z <- outer(x, y, FUN = "-")
    z
         [,1] [,2] [,3]
    [1,]    0   -1   -2
    [2,]    1    0   -1
    [3,]    2    1    0
    x <- c("A", "B", "C", "D")
    y <- 1:9
    z <- outer(x, y, paste, sep = "")
    z

    # function with grid combination
    x <- seq(-4, 4, length.out = 50)
    y <- x
    myf <- function(x, y) {
      sin(x) + cos(y)
    }
    z <- outer(x, y, FUN = myf)
    persp(x, y, z, theta = 45, phi = 45, shade = 0.45)
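The sketch below spells out what outer computes by filling the same matrix with an explicit double loop, on a smaller grid than the one used for the plot. The names z_outer and z_loop are illustrative. Note that outer calls FUN once with two long vectors, so FUN must be vectorized, as sin(x) + cos(y) is.

```r
myf <- function(x, y) {
  sin(x) + cos(y)
}

x <- seq(-4, 4, length.out = 5)
y <- x

# One vectorized call covers every (x[i], y[j]) combination
z_outer <- outer(x, y, FUN = myf)

# Equivalent explicit double loop: z[i, j] = myf(x[i], y[j])
z_loop <- matrix(0, nrow = length(x), ncol = length(y))
for (i in seq_along(x)) {
  for (j in seq_along(y)) {
    z_loop[i, j] <- myf(x[i], y[j])
  }
}

all.equal(z_outer, z_loop)
dim(z_outer)   # length(x) rows, length(y) columns
```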
6.2 Other vectorized functions

    colSums (x, na.rm = FALSE, dims = 1)
    rowSums (x, na.rm = FALSE, dims = 1)
    colMeans(x, na.rm = FALSE, dims = 1)
    rowMeans(x, na.rm = FALSE, dims = 1)
    rowsum(x, group, reorder = TRUE, ...)
    aggregate(x, by, FUN, ..., simplify = TRUE)
    sweep(x, MARGIN, STATS, FUN = "-", ...)
    scale(x, center = TRUE, scale = TRUE)
    …
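As a brief illustration of a few of these functions, the sketch below centers the columns of a matrix with sweep and with scale, and sums a vector by group with rowsum. The data and the variable names are made up for this example.

```r
set.seed(6)
M <- matrix(rnorm(20), ncol = 4)

# Subtract each column's mean from that column
centered_sweep <- sweep(M, 2, colMeans(M), FUN = "-")

# scale() with scale = FALSE only centers the columns
centered_scale <- scale(M, center = TRUE, scale = FALSE)

# scale() attaches the centers as an attribute, so attributes are ignored here
all.equal(centered_sweep, centered_scale, check.attributes = FALSE)

# Group-wise sums of a vector: rowsum returns a one-column matrix
v <- c(1, 2, 3, 4)
g <- c("a", "b", "a", "b")
sums_rowsum <- rowsum(v, g)
sums_tapply <- tapply(v, g, sum)
all.equal(as.vector(sums_rowsum), as.vector(sums_tapply))
```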
“The quiet statisticians have changed our world - not by discovering new facts or technical developments but by changing the ways that we reason, experiment and form our opinions...” - Ian Hacking

Thank you!