Programming in R: coding, debugging and optimizing
Katia Oleinik
Scientific Computing and Visualization, Boston University
October 12, 2012


if

if (condition) {
   command(s)
} else {
   command(s)
}

Comparison operators:
   ==        equal
   !=        not equal
   >  (<)    greater (less)
   >= (<=)   greater (less) or equal

Logical operators:
   &   and
   |   or
   !   not

if

> # define x
> x <- 7
> # simple if statement
> if ( x < 0 ) print("Negative")
> # simple if-else statement
> if ( x < 0 ) print("Negative") else print("Non-negative")
[1] "Non-negative"
> # an if statement may be used inside other constructions
> y <- if ( x < 0 ) -1 else 0
> y
[1] 0

if

> # multiline if-else statement
> if ( x < 0 ) {
+    x <- x+1
+    print("Add one")
+ } else if ( x == 0 ) {
+    print("Zero")
+ } else {
+    print("Positive value")
+ }
[1] "Positive value"

Note: For multiline if statements, braces are necessary even for single-statement bodies. In an interactive session, the closing brace of one branch and the opening brace of the next must be on the same line as the else keyword.

ifelse

ifelse (test_condition, true_value, false_value)

> # ifelse statement
> y <- ifelse ( x < 0, -1, 0 )
> # nested ifelse statement
> y <- ifelse ( x < 0, -1, ifelse (x > 0, 1, 0) )

ifelse

Best of all: the ifelse statement operates on vectors!

> # ifelse statement on a vector
> digits <- 0 : 9
> ifelse( digits > 4, 1, 0 )
 [1] 0 0 0 0 0 1 1 1 1 1

ifelse

Exercise:
- define a random vector with values ranging from -10 to 10:
     x <- as.integer( runif( 10, -10, 10 ) )
- create a vector y whose elements are equal to the absolute values of x

Note: normally, you would use the abs() function to achieve this result.
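One possible solution to the exercise, given as a sketch rather than the official answer. The set.seed() call is an addition that only makes the run repeatable; it is not part of the original exercise.

```r
# Random integer vector with values between -10 and 10
set.seed(42)   # assumption: seed added only for reproducibility
x <- as.integer(runif(10, -10, 10))

# Keep non-negative elements, flip the sign of the negative ones
y <- ifelse(x < 0, -x, x)

print(x)
print(y)
```

Because ifelse() works element by element, no explicit loop is needed.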

switch

switch (statement, list)

> # simple switch statement
> x <- 3
> switch( x, 2, 4, 6, 8 )
[1] 6
> switch( x, 2, 4 )   # returns NULL since there are only 2 elements in the list

switch

switch (statement, name1 = str1, name2 = str2, … )

> # switch statement with named list
> day <- "Tue"
> switch( day, Sun = 0, Mon = 1, Tue = 2, Wed = 3, … )
[1] 2
> # switch statement with a "default" value
> food <- "meat"
> switch( food, banana="fruit", carrot="veggie", "neither")
[1] "neither"
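The default-value behaviour can be wrapped in a small function. This is a minimal sketch; the name classify is hypothetical and not from the slides.

```r
# Hypothetical helper wrapping the switch() call shown above
classify <- function(food) {
  switch(food,
         banana = "fruit",
         carrot = "veggie",
         "neither")          # the unnamed last value acts as the default
}

print(classify("carrot"))
print(classify("meat"))

# Without a default, an unmatched name yields NULL (returned invisibly)
no_match <- switch("meat", banana = "fruit", carrot = "veggie")
print(is.null(no_match))
```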

loops

There are 3 statements that provide explicit looping:
- repeat
- for
- while

Built-in constructs to control the looping:
- next
- break

Note: Use explicit loops only when they are really necessary. R has functions for implicit looping, which are more concise and often run faster: apply(), sapply(), tapply(), and lapply().

repeat

The repeat { } statement evaluates its body repeatedly until break is requested. Be careful: an infinite loop may occur!

> # find the greatest odd divisor of an integer
> x <- 84
> repeat{
+    print(x)
+    if( x%%2 != 0) break
+    x <- x/2
+ }
[1] 84
[1] 42
[1] 21

for

for (object in sequence) {
   command(s)
}

> # calculate N! - factorial
> x <- 7
> y <- 1
> for( j in 2:x ){
+    y <- y*j
+ }
> y
[1] 5040

for

for (object in sequence) {
   command(s)
   if (…) next     # skip to the next iteration of the loop
   if (…) break    # exit from the (innermost) loop
}

while

while (test_statement) {
   command(s)
}

> # find the largest odd divisor of a given number
> x <- 84
> while (x %% 2 == 0){
+    x <- x/2
+ }
> x
[1] 21

loops

Exercise: Using any of the loop statements, print all the numbers from 0 to 30 that are divisible by 7. Use %% (the modulus operator) to check divisibility.
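A possible solution, sketched with a for loop and next. The vectorized one-liner at the end is an addition for comparison; the variable name multiples is illustrative.

```r
# Print the numbers from 0 to 30 that are divisible by 7
for (n in 0:30) {
  if (n %% 7 != 0) next   # skip numbers that leave a remainder
  print(n)
}

# The same selection without an explicit loop
multiples <- (0:30)[(0:30) %% 7 == 0]
print(multiples)
```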

function

myFun <- function (ARG, OPT_ARGs ){
   statement(s)
}

ARG: vector, matrix, list or a data frame
OPT_ARGs: optional arguments

Functions are powerful R elements. They allow you to expand on existing functions by writing your own custom functions.

function

myFun <- function (ARG, OPT_ARGs ){
   statement(s)
}

Naming: Variable naming rules apply. Avoid reusing the names of existing (built-in) functions.

Arguments: The argument list can be empty. Some (or all) of the arguments can have a default value ( arg1 = TRUE ). The argument '...' can be used to allow one function to pass on argument settings to another function.

Return value: The value returned by the function is the last value computed, but you can also use the return() statement.
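The three points above (default values, the '...' argument, and return()) can be seen together in one short sketch. The function scaled_mean and its scale argument are hypothetical names chosen for illustration.

```r
# Hypothetical example: a default argument, '...' forwarding, and return()
scaled_mean <- function(x, scale = 1, ...) {
  m <- mean(x, ...)      # extra arguments (e.g. na.rm) are passed on to mean()
  return(m * scale)
}

a <- scaled_mean(c(1, 2, 3))                  # uses the default scale = 1
b <- scaled_mean(c(1, 2, NA), na.rm = TRUE)   # na.rm travels through '...'
k <- scaled_mean(scale = 10, x = c(1, 2, 3))  # named arguments, any order
print(c(a, b, k))
```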

function

> # simple function: calculate (x+1)^2
> f1 <- function (x) {
+    x^2 + 2*x + 1
+ }
> f1(3)
[1] 16

function

> # function with default arguments: calculate (x+a)^2
> f2 <- function (x, a=1) {
+    x^2 + 2*x*a + a^2
+ }
> f2(3)
[1] 16
> f2(3,2)
[1] 25
>
> # arguments can be passed by name (and out of order!)
> f2( a = 2, x = 1)
[1] 9

function

> # Some optional arguments can be specified as '...' to pass them to another function
> f3 <- function (x, ... ) {
+    plot (x, ... )
+ }
>
> # print all the words together in one sentence
> f3 <- function ( ... ) {
+    print(paste ( ... ) )
+ }
> f3("Hello", " R! ")
[1] "Hello  R! "

function

Local and global variables: A variable assigned inside a function is local. If a name is read before it has been assigned locally, its value is taken from the global variable of that name (if such a variable exists).

> # define a function
> f <- function (x) {
+    cat ("u=", u, "\n")   # no local u yet: the global value is read
+    u <- u+1              # creates a local u; the variable outside f() is not affected
+    cat ("u=", u, "\n")   # prints the local u
+ }
>
> u <- 2     # define a variable
> f(u)       # execute the function
u= 2
u= 3
>
> cat ("u=", u, "\n")   # print the value of the variable
u= 2

function

Local and global variables: If you want to modify the global variable, you can use the super-assignment operator <<-. You should avoid doing this!

> # define a function
> f <- function (x) {
+    cat ("u=", u, "\n")   # reads the global u
+    u <<- u+1             # this WILL affect the value of the variable outside f()
+    cat ("u=", u, "\n")
+ }
>
> u <- 2     # define a variable
> f(u)       # execute the function
u= 2
u= 3
>
> cat ("u=", u, "\n")   # print the value of the variable
u= 3

function

Call by value: Functions do not change their arguments.

> # define a function
> f <- function (x) {
+    x <- 2
+    print (x)
+ }
>
> x <- 3        # assign value to x
> y <- f(x)     # call the function
[1] 2
>
> print(x)      # print value of x
[1] 3

function

Call by value: If you want to change the value of the function's argument, assign the return value back to the argument.

> # define a function
> f <- function (x) {
+    x <- 2
+    print (x)
+ }
>
> x <- 3        # assign value to x
> x <- f(x)     # call the function
[1] 2
>
> print(x)      # print value of x
[1] 2

function

Finding the source code: You can view the source code of most R functions by typing the function name without parentheses.

> # get the source code of lm() function
> lm
function (formula, data, subset, weights, na.action, method = "qr",
    model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
    contrasts = NULL, offset, ...)
{
    ret.x <- x
    ret.y <- y
    cl <- match.call()
    ...
    z
}

function

Finding the source code: For generic functions there are many methods, depending on the type of the argument.

> # get the source code of mean() function
> mean
function (x, ...)
UseMethod("mean")

function

Finding the source code: You can first explore the available methods and then choose the one you need.

> # list the methods of mean() function
> methods("mean")
[1] mean.Date       mean.POSIXct    mean.POSIXlt    mean.data.frame
[5] mean.default    mean.difftime
>
> # get source code
> mean.default
function (x, trim = 0, na.rm = FALSE, ...)
{
    if (!is.numeric(x) && !is.complex(x) && !is.logical(x)) {
    ...
    z
}
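Some methods are not visible by name from the top level, so typing their name fails. A hedged sketch of one way around this uses getS3method() from base R; the variable name mean_default is illustrative.

```r
# List the S3 methods registered for the generic mean()
print(methods("mean"))

# Fetch one method explicitly; this also works for methods that
# cannot be reached by simply typing their name
mean_default <- getS3method("mean", "default")
print(args(mean_default))
```

Printing mean_default without parentheses then shows its source in the same way as for an ordinary function.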

apply

apply (OBJECT, MARGIN, FUNCTION, ARGs )

   object:    matrix, array or a data frame
   margin:    1 – rows, 2 – columns, c(1,2) – both
   function:  function to apply
   args:      possible arguments

Description: Returns a vector, array or list of values obtained by applying a function to the margins of an array or matrix.

apply

Example: Create a matrix and apply different functions to its rows and columns.

> # create 3x4 matrix
> x <- matrix( 1:12, nrow = 3, ncol = 4)
> x
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> # find median of each row
> apply (x, 1, median)
[1] 5.5 6.5 7.5
> # find mean of each column
> apply (x, 2, mean)
[1]  2  5  8 11
> # create a new matrix with values 0 or 1 for even and odd elements of x
> apply (x, c(1,2), function (x) x%%2)
     [,1] [,2] [,3] [,4]
[1,]    1    0    1    0
[2,]    0    1    0    1
[3,]    1    0    1    0
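The ARGs slot of apply() forwards extra arguments to the applied function. The sketch below uses made-up data with one missing value to show this; the matrix m and the variable names are illustrative only.

```r
# Small matrix with one missing value (made-up data for illustration)
m <- matrix(c(1, NA, 3, 4, 5, 6), nrow = 2)

# Row means; na.rm = TRUE is forwarded to mean() by apply()
row_means <- apply(m, 1, mean, na.rm = TRUE)

# Spread (max - min) of each column with an anonymous function
col_spread <- apply(m, 2, function(col) {
  max(col, na.rm = TRUE) - min(col, na.rm = TRUE)
})

print(row_means)
print(col_spread)
```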

lapply

lapply(X, FUN, ...)

The lapply() function returns a list.

> # create a list
> x <- list(a = 1:10, beta = exp(-3:3), logic = c(TRUE,FALSE,FALSE))
> # compute the mean of each list element
> lapply (x, mean)
$a
[1] 5.5

$beta
[1] 4.535125

$logic
[1] 0.3333333

sapply

sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)

The sapply() function returns a vector or a matrix.

> # create a list
> x <- list(a = 1:10, beta = exp(-3:3), logic = c(TRUE,FALSE,FALSE))
> # compute the mean of each list element
> sapply (x, mean)
        a      beta     logic
5.5000000 4.5351252 0.3333333
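The difference between the two return types is easiest to see when the applied function returns more than one value. A minimal sketch, reusing the list from the slides with range() as the applied function:

```r
x <- list(a = 1:10, beta = exp(-3:3), logic = c(TRUE, FALSE, FALSE))

# Each call to range() returns two values, so sapply() builds a 2 x 3 matrix
r <- sapply(x, range)
print(r)

# lapply() on the same input keeps the results in a list
l <- lapply(x, range)
print(l)
```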

code sourcing

source ("file", … )

   file:  file with the source code to load (usually with the extension .r)
   echo:  if TRUE, each expression is printed after parsing, before evaluation

code sourcing

Linux prompt:
katana:~ % emacs foo_source.r

Text editor (foo_source.r):
# dummy function
foo <- function(x){
   x+1
}

R session:
> # load foo_source.r source file
> source ("foo_source.r")
> # create a vector
> x <- c(3,5,7)
> # call function
> foo(x)
[1] 4 6 8

code sourcing

> # load foo_source.r source file
> source ("foo_source.r", echo = TRUE)

> # dummy function
> foo <- function(x){
+    x+1;
+ }
> # create a vector
> x <- c(3,5,7)
> # call function
> foo(x)
[1] 4 6 8

code sourcing

Exercise:
- write a function that computes the logarithm of the inverse of a number, log(1/x)
- save it in a file with the .r extension
- load it into your workspace
- execute it
- try to execute it with the input vector ( 2, 1, 0, -1 )
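A sketch of one way to work through the exercise from within R itself. Writing the file with writeLines() into tempdir() is an assumption made so the example is self-contained; in the tutorial the file would be created in a text editor.

```r
# Write the function to a file with the .r extension
# (assumption: a temporary directory is used instead of an editor)
path <- file.path(tempdir(), "inv_log.r")
writeLines(c("inv_log <- function(x) {",
             "  log(1 / x)",
             "}"), path)

# Load it into the workspace and execute it
source(path)
print(inv_log(2))

# The vector input produces Inf for 0 and NaN (with a warning) for -1;
# the warning is suppressed here only to keep the output short
res <- suppressWarnings(inv_log(c(2, 1, 0, -1)))
print(res)
```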

debugging

R includes debugging tools:

   cat () & print ()   print out the values
   browser ()          pause the code execution and "browse" the code
   debug (FUN)         execute a function line by line
   undebug (FUN)       stop debugging the function
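The first tool in the list, printing intermediate values with cat(), can be sketched non-interactively. The verbose argument is a hypothetical addition for this illustration and is not part of the tutorial's inv_log function.

```r
# Print-style debugging: show the intermediate value before log() is applied
inv_log <- function(x, verbose = FALSE) {
  y <- 1 / x
  if (verbose) cat("after inversion: y =", y, "\n")  # inspect local variable
  log(y)
}

# The warning about NaNs is suppressed here only to keep the output short
out <- suppressWarnings(inv_log(c(2, 1, 0, -1), verbose = TRUE))
print(out)
```

The printed line shows that y already contains Inf and a negative value before log() is called, which explains the NaN in the result.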

debugging

inv_log.r:
# dummy function
inv_log <- function(x){
   y <- 1/x
   browser()
   y <- log(y)
}

R session:
> # load inv_log.r source file
> source ("inv_log.r", echo = TRUE)

> # dummy function
> inv_log <- function(x){
+    y<-1/x;
+    browser();
+    y<-log(y);
+ }
> inv_log (x)     # call function
Called from: inv_log(x)
Browse[1]> y      # check the values of local variables
[1] Inf

debugging

Commands available at the Browse[n]> prompt:

   Enter       Go to the next statement if the function is being debugged. Continue execution if the browser was invoked.
   c or cont   Continue execution without single stepping.
   n           Execute the next statement in the function. This works from the browser as well.
   where       Show the call stack.
   Q           Halt execution and jump to the top level immediately.

To view the value of a variable whose name matches one of these commands, use the print() function, e.g. print(n).

debugging

R session (inv_log.r sourced as before, with browser() inside the function):
> inv_log (x)     # call function
Called from: inv_log(x)
Browse[1]> y
[1] Inf
Browse[1]> n
debug: y <- log(y)
Browse[2]>
Warning message:
In log(y) : NaNs produced
>

debugging

inv_log.r:
# dummy function
inv_log <- function(x){
   y <- 1/x
   y <- log(y)
}

R session:
> # load inv_log.r source file
> source ("inv_log.r", echo = TRUE)

> # dummy function
> inv_log <- function(x){
+    y<-1/x;
+    y<-log(y);
+ }
> debug(inv_log)      # debug mode
> inv_log (x)         # call function
debugging in: inv_log(x)
debug: {
    y <- 1/x
    y <- log(y)
}
Browse[2]>
...
> undebug(inv_log)    # exit debugging mode

timing

Use the system.time() function to measure the time of execution.

> # make a function
> g <- function(x) {
+    y = vector(length=x)
+    for (i in 1:x) y[i]=i/(i+1)
+    y
+ }
> # execute the function, measuring the time of the execution
> system.time( g(100000) )
   user  system elapsed
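The value returned by system.time() can be stored and inspected, which is handy when timings need to be compared programmatically. A minimal sketch; the variable names timing and res are illustrative, and the actual numbers depend on the machine.

```r
# Same computation as g() in the slides, with a numeric result vector
g <- function(x) {
  y <- numeric(x)
  for (i in 1:x) y[i] <- i / (i + 1)
  y
}

# Keep both the result and the timing
timing <- system.time(res <- g(100000))

print(timing)                # user, system and elapsed times in seconds
print(timing[["elapsed"]])   # wall-clock time only
```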

optimization

How to speed up the code? Use vectors!

> # using loops
> g1 <- function(x) {
+    y = vector(length=x)
+    for (i in 1:x) y[i]=i/(i+1)
+    y
+ }
> # execute the function
> system.time( g1(100000) )
   user  system elapsed

> # using vectors
> x <- (1:100000)
> g2 <- function(x) {
+    x/(x+1)
+ }
> # execute the function
> system.time( g2(x) )
   user  system elapsed
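Before comparing timings it is worth confirming that the loop and the vectorized version compute the same thing. A short sketch, using a smaller n than the slides so it runs instantly:

```r
# Loop version and vectorized version of i / (i + 1)
g1 <- function(x) {
  y <- numeric(x)
  for (i in 1:x) y[i] <- i / (i + 1)
  y
}
g2 <- function(x) {
  x / (x + 1)
}

n <- 1000
same <- all.equal(g1(n), g2(1:n))
print(same)
```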

optimization

How to speed up the code? Avoid dynamically expanding arrays.

> # dynamically expanding vector
> vec1<-NULL
> # execute the command
> system.time(
+    for(i in 1:100000)
+       vec1 <- c(vec1,mean(1:100)))
   user  system elapsed

> # preallocated vector
> vec2 <- vector(
+    mode="numeric",length=100000)
> # execute the command
> system.time(
+    for(i in 1:100000)
+       vec2[i] <- mean(1:100))
   user  system elapsed

The same comparison inside functions:

> f1<-function(x){
+    vec1 <- NULL
+    for(i in 1:100000)
+       vec1 <- c(vec1,mean(1:10))
+ }
> # execute the command
> system.time( f1(0) )
   user  system elapsed

> f2<-function(x){
+    vec2 <- vector(
+       mode="numeric",length=100000)
+    for(i in 1:100000)
+       vec2[i] <- mean(1:10)
+ }
> # execute the command
> system.time( f2(0) )
   user  system elapsed
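The two allocation strategies can be compared side by side in a small sketch. The function names grow and prealloc are hypothetical, and n is kept small so the slow version finishes quickly; timings vary by machine and are therefore not shown.

```r
# Growing the vector: c() copies the whole vector on every iteration
grow <- function(n) {
  v <- NULL
  for (i in 1:n) v <- c(v, i^2)
  v
}

# Preallocating: the vector is created once and filled in place
prealloc <- function(n) {
  v <- numeric(n)
  for (i in 1:n) v[i] <- i^2
  v
}

n <- 5000
print(identical(grow(n), prealloc(n)))   # same result either way
print(system.time(grow(n)))
print(system.time(prealloc(n)))
```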

optimization

How to speed up the code? Use optimized R functions, e.g. rowSums(), rowMeans(), table(), etc.

> matx <- matrix(rnorm(1000000),100000,10)
> # execute the command
> system.time(apply(matx,1,mean))
   user  system elapsed
> # execute the command
> system.time(rowMeans(matx))
   user  system elapsed

In some simple cases it is worth writing your own!

> system.time(
+    for(i in 1:100000)mean(1:100))
   user  system elapsed
> system.time(
+    for(i in 1:100000)
+       sum(1:100) / length(1:100) )
   user  system elapsed
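A quick check that the optimized function and the apply() call are interchangeable, sketched on a smaller matrix than in the slides. The seed is an addition for reproducibility.

```r
set.seed(1)   # assumption: seed added so the example is repeatable
matx <- matrix(rnorm(1000 * 10), 1000, 10)

# Both calls compute the mean of every row
via_apply    <- apply(matx, 1, mean)
via_rowmeans <- rowMeans(matx)

print(all.equal(via_apply, via_rowmeans))
```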

optimization

How to speed up the code?
- Use vectors
- Avoid dynamically expanding arrays
- Use optimized R functions, e.g. rowSums(), rowMeans(), table(), etc.
- In some simple cases it is worth writing your own implementation!
- Use the R compiler or C/C++ code

compiling

Use library(compiler):

   cmpfun()    compile an existing function
   cmpfile()   compile a source file
   loadcmp()   load a compiled source file

compiling

fsum.r:
# dummy function
fsum <- function(x){
   s <- 0
   for ( n in x) s <- s+n
   s
}

R session:
> # load compiler library
> library (compiler)
> # load function from a source file (if necessary)
> source ("fsum.r")
> # compile the function
> fsumcomp <- cmpfun(fsum)

compiling

Using compiled functions can decrease the time of computation.

> # run non-compiled version
> system.time(fsum(1:100000))
   user  system elapsed
> # run compiled version
> system.time(fsumcomp(1:100000))
   user  system elapsed

compiling

A source file can be compiled with cmpfile(). The resulting file then has to be loaded with loadcmp().

> # compile source file
> cmpfile("fsum.r")
saving to file "fsum.Rc"... Done
> # load compiled source
> loadcmp("fsum.Rc")
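A self-contained sketch of the cmpfun() workflow, checking that the compiled function returns the same result as the original. Note that since R 3.4.0 the just-in-time byte-code compiler is enabled by default, so on current versions the timing difference between the two may be small or absent.

```r
library(compiler)

# dummy function from the slides
fsum <- function(x) {
  s <- 0
  for (n in x) s <- s + n
  s
}

# byte-compile it
fsumcomp <- cmpfun(fsum)

# both versions return the same value
print(fsum(1:100))
print(fsumcomp(1:100))

# timings depend on the machine and the R version
print(system.time(fsum(1:100000)))
print(system.time(fsumcomp(1:100000)))
```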

profiling

Profiling is a tool that can be used to find out how much time is spent in each function. Code profiling helps to locate the parts of a program that will benefit most from optimization.

   Rprof()                     turn profiling on
   Rprof(NULL)                 turn profiling off
   summaryRprof("Rprof.out")   summarize the output of the Rprof() function to show the amount of time used by different R functions
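The three calls fit together as sketched below. The workload function slow_work and the use of a temporary output file are assumptions for illustration; the sketch also assumes an R build with profiling support. The summary is wrapped in tryCatch() because summaryRprof() fails when no samples were recorded.

```r
# Hypothetical workload: grows a vector element by element
slow_work <- function(n) {
  v <- NULL
  for (i in 1:n) v <- c(v, sqrt(i))
  v
}

prof_file <- tempfile(fileext = ".out")   # assumption: temporary output file

Rprof(prof_file)                          # turn profiling on
start <- proc.time()[["elapsed"]]
while (proc.time()[["elapsed"]] - start < 0.3) {
  invisible(slow_work(2000))              # keep busy long enough to be sampled
}
Rprof(NULL)                               # turn profiling off

# Summarize the time spent in each function
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.self))
```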

profiling 67
Brownian Motion simulation. Input: x - initial position, steps - number of steps
bm.R:
# slow version of BM function
bmslow <- function (x, steps){
  BM <- matrix(x, nrow=length(x))
  for (i in 1:steps){
    # sample from normal distribution
    z <- rnorm(length(x))
    # attach a new column (previous position plus the step) to the output matrix
    BM <- cbind(BM, BM[,i] + z)
  }
  return(BM)
}

profiling 68
Brownian Motion simulation. Input: x - initial position, steps - number of steps
bm.R:
# a faster version of BM function
bm <- function (x, steps){
  # allocate enough space to hold the output matrix
  BM <- matrix(nrow = length(x), ncol=steps+1)
  # add initial point to the matrix
  BM[,1] <- x
  # sample from normal distribution (delX, delY)
  z <- matrix(rnorm(steps*length(x)), nrow=length(x))
  for (i in 1:steps) BM[,i+1] <- BM[,i] + z[,i]
  return(BM)
}
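Preallocation removes the repeated copying done by cbind(); the remaining loop can also be replaced by cumulative sums. The variant `bm_vec` below is a hypothetical alternative, not part of the original bm.R, shown next to the loop-based version so the two can be compared with the same random seed.

```r
# loop-based version, as in bm.R
bm <- function(x, steps) {
  BM <- matrix(nrow = length(x), ncol = steps + 1)
  BM[, 1] <- x
  z <- matrix(rnorm(steps * length(x)), nrow = length(x))
  for (i in 1:steps) BM[, i + 1] <- BM[, i] + z[, i]
  BM
}

# hypothetical loop-free variant: cumulative sums along each row
bm_vec <- function(x, steps) {
  z <- matrix(rnorm(steps * length(x)), nrow = length(x))
  # first column is the initial position, the rest are the random steps
  t(apply(cbind(x, z, deparse.level = 0), 1, cumsum))
}

# same seed, so both functions draw the same random steps
set.seed(1)
a <- bm(c(0, 0), 100)
set.seed(1)
b <- bm_vec(c(0, 0), 100)

same <- isTRUE(all.equal(a, b, check.attributes = FALSE))
print(dim(b))
print(same)
```

all.equal() is used instead of identical() because the two versions may differ in the last bits of floating-point rounding.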

profiling 69
> # load compiler library (if you have not done it before)
> require(compiler)
> # compile functions from a source file
> cmpfile("bm.R")
> # load functions from a compiled file
> loadcmp("bm.Rc")

profiling 70
> # simulate 100 steps
> BMsmall <- bm(c(0,0),100)
> # plot the result
> plot(BMsmall[1,],BMsmall[2,],…)

profiling 71
> # start profiling slow function
> Rprof("bmslow.out") # optional – provide output file name
> # run function
> BMS <- bmslow(c(0,0), )
> # finish profiling
> Rprof(NULL)

profiling 72
> # start profiling faster function
> Rprof("bm.out") # optional – provide output file name
> # run function
> BM <- bm(c(0,0), )
> # finish profiling
> Rprof(NULL)

profiling 73
> summaryRprof("bmslow.out")
$by.self
          self.time self.pct total.time total.pct
"cbind"
"rnorm"
"bmslow"
…
> summaryRprof("bm.out")
$by.self
          self.time self.pct total.time total.pct
"bm"
"rnorm"
"matrix"
"+"
":"
…

R-C/C++ programming 74
Goal – performance enhancement.
Benefits – use of existing C/C++ libraries and memory management.
Base R provides 3 types of interfaces between R and C/C++:
.C()
.Call()
.External() – used to create R packages
Other R packages also provide an interface between R and C/C++ (and other languages such as FORTRAN and Python), for example Rcpp.

R-C/C++ programming 75 .C() interface
exC1.c:
/* exC1.c – example C function to be called from R */
void exampleC1(int *iVec){
  iVec[0] = 7;
  return;
}
Important:
The function returns no value – it is VOID.
All the values that need to be changed must be passed through the input vector.

R-C/C++ programming 76 .C() interface
katana:~ % R CMD SHLIB exC1.c
gcc -std=gnu99 -I/usr/local/IT/R /lib64/R/include -I/usr/local/include -fpic -g -O2 -c exC1.c -o exC1.o
gcc -std=gnu99 -shared -L/usr/local/lib64 -o exC1.so exC1.o
katana:~ %
Important: In the Linux (and R) environment, commands are case-sensitive!

R-C/C++ programming 77 .C() interface
Note: On Windows the compiled library will be named exC1.dll.
> # load C function to R workspace
> dyn.load("exC1.so")
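Because the library extension differs between platforms, the file name can be built from `.Platform$dynlib.ext` instead of being hard-coded. The sketch below assumes the library was built in the current working directory and simply skips loading when the file is not there.

```r
# build the platform-specific file name of the shared library
lib_file <- paste0("exC1", .Platform$dynlib.ext)
print(lib_file)

# load the library only if it has been built in the working directory
if (file.exists(lib_file)) {
  dyn.load(lib_file)
  # is.loaded() reports whether the C symbol is now visible to R
  print(is.loaded("exampleC1"))
}
```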

R-C/C++ programming 78 .C() interface
> # load C function to R workspace
> dyn.load("exC1.so")
> # create a vector
> iv <- 1:3

R-C/C++ programming 79 .C() interface
> # load C function to R workspace
> dyn.load("exC1.so")
> # create a vector
> iv <- 1:3
> # call C function
> out <- .C("exampleC1", newVec = as.integer(iv))

R-C/C++ programming 80 .C() interface
> # load C function to R workspace
> dyn.load("exC1.so")
> # create a vector
> iv <- 1:3
> # call C function
> out <- .C("exampleC1", newVec = as.integer(iv))
> out
$newVec
[1] 7 2 3

R-C/C++ programming 81 .C() interface
Note:
R has to allocate memory for the arrays passed to and from C.
R has to pass objects of the correct type.
R copies its arguments prior to passing them to C and then creates a copy of the values passed back from C.
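The type requirement is easy to overlook because R prints integer and double vectors the same way. A short check (variable names are illustrative) shows why the as.integer() and as.numeric() coercions appear in every .C() call on these slides:

```r
iv <- 1:3            # integer vector
dv <- c(1, 2, 3)     # double vector, even though the values are whole numbers

print(typeof(iv))    # "integer"
print(typeof(dv))    # "double"

# a C function declared with int* needs the integer form
dv_as_int <- as.integer(dv)
print(identical(dv_as_int, iv))   # TRUE
```

Passing `dv` to a C function that expects `int *` would make C reinterpret the bytes of doubles as integers, which gives meaningless values.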

R-C/C++ programming 82
exC2.c:
/* exC2.c – example C function to be called from R */
/* normalize the vector */
#include <math.h>
#include <string.h>
void exampleC2(char **c, double *A, double *B, int *ierr){
  double len = 0; /* local variable – vector length */
  int i;
  for (i=0; i<3; i++) len += pow(A[i], 2);
  /* check if the vector is degenerate */
  if ( len < ){
    ierr[0] = -1; /* error – null vector */
    strncpy(c[0], "Error", 5);
    return;
  }
  /* calculate output vector */
  len = pow(len, 0.5);
  for (i=0; i<3; i++) B[i] = A[i] / len;
  ierr[0] = 0;
  strncpy(c[0], "OK", 2);
  return;
}

R-C/C++ programming 83 .C() interface
katana:~ % R CMD SHLIB exC2.c
gcc -std=gnu99 -I/usr/local/IT/R /lib64/R/include -I/usr/local/include -fpic -g -O2 -c exC2.c -o exC2.o
gcc -std=gnu99 -shared -L/usr/local/lib64 -o exC2.so exC2.o

R-C/C++ programming 84 .C() interface
> # load C function to R workspace
> dyn.load("exC2.so")
> # create error vector
> ierr_in <- 0
> # create input vector
> A_in <- c(2, 3, 6)
> # create output vector
> B_in <- c(0, 0, 0)
> # create message vector (make sure it is long enough! 10 blanks are used here)
> C_in <- strrep(" ", 10)

R-C/C++ programming 85 .C() interface
> # execute C function
> out <- .C("exampleC2",
+   C_out = as.character(C_in),
+   A_out = as.numeric(A_in),
+   B_out = as.numeric(B_in),
+   ierr_out = as.integer(ierr_in))

R-C/C++ programming 86 .C() interface
> out
$C_out
[1] "OK "
$A_out
[1]
$B_out
[1]
$ierr_out
[1] 0

R-C/C++ programming 87 .C() interface
> # create input vector
> A_in <- c(0, 0, 0)
> # execute C function
> out <- .C("exampleC2",
+   C_out = as.character(C_in),
+   A_out = as.numeric(A_in),
+   B_out = as.numeric(B_in),
+   ierr_out = as.integer(ierr_in))

R-C/C++ programming 88 .C() interface
> out
$C_out
[1] "Error "
$A_out
[1]
$B_out
[1]
$ierr_out
[1] -1
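A pure-R reference for the same calculation is handy for checking what the C function returns. The function `normalize_ref` and its threshold `eps` are assumptions made for this sketch: the threshold used in exC2.c is not shown on the slide, so a small default is chosen here purely for illustration.

```r
# pure-R reference for exampleC2; 'eps' is an assumed threshold,
# because the value used in exC2.c is not shown
normalize_ref <- function(A, eps = 1e-12) {
  len2 <- sum(A^2)   # squared length of the vector
  if (len2 < eps) {
    # degenerate (null) vector: report an error, leave the output at zero
    return(list(C_out = "Error", B_out = rep(0, length(A)), ierr_out = -1L))
  }
  list(C_out = "OK", B_out = A / sqrt(len2), ierr_out = 0L)
}

ok  <- normalize_ref(c(2, 3, 6))
bad <- normalize_ref(c(0, 0, 0))

print(ok$B_out)
print(sqrt(sum(ok$B_out^2)))   # length of the normalized vector
print(bad$C_out)
print(bad$ierr_out)
```

Comparing `ok$B_out` with `out$B_out` from the .C() call is a quick way to confirm that the compiled function received arguments of the right type and in the right order.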

R-C/C++ programming 89 .Call() interface
Does not copy arguments before and after calling the C function
The length of the input vector can be found inside the C function
Easier access to a wide range of R objects
NA (missing values) handling
Access to vectors' attributes

R-C/C++ programming 90 .Call() interface – passing a value
exC3.c:
/* exC3.c – example C function to be called from R with .Call interface */
/* access R object (scalar value) inside C function */
#include <R.h>         /* 2 standard includes for .Call interface */
#include <Rdefines.h>
SEXP exampleC3 ( SEXP iValue ){
  return (R_NilValue); /* "void" function must return "NULL" value */
}
Note:
All objects passed between R and C/C++ are of type SEXP (S-expression).
2 standard includes are needed for the .Call interface.
If the function is void, it should return the R_NilValue object.

R-C/C++ programming 91 .Call() interface – passing a value
exC3.c:
/* exC3.c – example C function to be called from R with .Call interface */
/* access R object (scalar value) inside C function */
#include <R.h>
#include <Rdefines.h>
SEXP exampleC3 ( SEXP iValue ){
  int local_iValue;
  /* convert R object to C-accessible variable */
  local_iValue = INTEGER_VALUE(iValue);
  return (R_NilValue);
}

R-C/C++ programming 92 .Call() interface – passing a value
exC3.c:
/* exC3.c – example C function to be called from R with .Call interface */
/* access R object (scalar value) inside C function */
#include <R.h>
#include <Rdefines.h>
SEXP exampleC3 ( SEXP iValue ){
  int local_iValue;
  /* convert R object to C-accessible variable */
  local_iValue = INTEGER_VALUE(iValue);
  /* print value of the local variable */
  printf(" In exampleC3 iValue = %d\n", local_iValue);
  return (R_NilValue);
}

R-C/C++ programming 93 .Call() interface – passing a value
> # load C function to R workspace – same as before
> dyn.load("exC3.so")
> # call C function
> out <- .Call("exampleC3", 7)
 In exampleC3 iValue = 7
> # explore output
> out
NULL

R-C/C++ programming 94 .Call() interface – passing a vector
exC4.c:
/* exC4.c - example C function to be called from R */
/* normalize the vector and return its length */
#include <R.h>
#include <Rdefines.h>
#include <Rmath.h>
SEXP exampleC4 ( SEXP Vector ){
  SEXP rLen;
  return (rLen); /* return a value */
}
Note:
Including Rmath.h provides access to many R functions, including rnorm(), rgamma(), etc.
The function should return a SEXP object.
The .Call() interface allows changing the function arguments – be careful!

R-C/C++ programming 95
exC4.c:
SEXP exampleC4 ( SEXP Vector ){
  SEXP rLen;        /* output value – length of a vector */
  double * pVector; /* local variable - pointer to the input vector */
  double vLen = 0;  /* local variable to calculate intermediate values */
  int len;          /* local variable – size of the input vector */
  int i;            /* local variable – loop index */
  return (rLen);    /* return a value */
}

R-C/C++ programming 96
exC4.c:
SEXP exampleC4 ( SEXP Vector ){
  SEXP rLen;
  double * pVector;
  double vLen = 0;
  int len;
  int i;
  /* get the pointer to the vector */
  pVector = NUMERIC_POINTER(Vector);
  return (rLen); /* return a value */
}
Note: Use INTEGER_POINTER() and CHARACTER_POINTER() to get pointers to integer and character arrays, respectively.

R-C/C++ programming 97
exC4.c:
SEXP exampleC4 ( SEXP Vector ){
  SEXP rLen;
  double * pVector;
  double vLen = 0;
  int len;
  int i;
  /* get the pointer to the vector */
  pVector = NUMERIC_POINTER(Vector);
  /* number of elements in the array */
  len = length(Vector);
  return (rLen); /* return a value */
}
Note: We can get the size of the input R vector!

R-C/C++ programming 98
exC4.c:
SEXP exampleC4 ( SEXP Vector ){
  SEXP rLen;
  double * pVector;
  double vLen = 0;
  int len;
  int i;
  pVector = NUMERIC_POINTER(Vector);
  len = length(Vector);
  /* allocate storage for a numeric (double) value (an array works also!) */
  PROTECT(rLen = NEW_NUMERIC(1));
  UNPROTECT(1);
  return (rLen); /* return a value */
}
Note:
To allocate integer and character arrays use NEW_INTEGER(len) and NEW_CHARACTER(len), respectively.
PROTECT() and UNPROTECT() calls must be balanced!

R-C/C++ programming 99
exC4.c:
SEXP exampleC4 ( SEXP Vector ){
  SEXP rLen;
  double * pVector;
  double vLen = 0;
  int len;
  int i;
  pVector = NUMERIC_POINTER(Vector);
  len = length(Vector);
  PROTECT(rLen = NEW_NUMERIC(1));
  /* calculate the length */
  for( i=0; i < len; i++ ) vLen += pow(pVector[i], 2);
  if ( vLen > ){
    vLen = pow( vLen, 0.5 );
    /* Here we are working with a pointer - it WILL change R vector */
    for( i=0; i < len; i++ ) pVector[i] /= vLen;
  }
  /* copy the value of local variable into R-object */
  REAL(rLen)[0] = vLen;
  UNPROTECT(1);
  return (rLen); /* return a value */
}

R-C/C++ programming 100 .Call() interface – passing an array
> # load C function to R workspace – same as before
> dyn.load("exC4.so")
> # define an input array
> A_in <- c(2, 3, 6)
> # call C function
> out <- .Call("exampleC4", A_in)
> # input array changed !!!
> A_in
[1]
> out
[1] 7
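For contrast, an ordinary R function works on its own copy of an argument, so the caller's vector is never changed. The function `normalize_copy` below is a hypothetical pure-R counterpart of exampleC4, written only to show that difference.

```r
# pure-R counterpart of exampleC4 (illustrative, not part of exC4.c)
normalize_copy <- function(v) {
  len <- sqrt(sum(v^2))
  if (len > 0) v <- v / len   # modifies only the local copy
  list(length = len, normalized = v)
}

A_in <- c(2, 3, 6)
res <- normalize_copy(A_in)

print(res$length)       # length of the input vector
print(res$normalized)   # normalized copy
print(A_in)             # the caller's vector is unchanged
```

With .Call() no such protection exists: writing through the pointer obtained from the argument changes the caller's object, and also any other variable that shares the same memory.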

101 This tutorial has been made possible by the Scientific Computing and Visualization group at Boston University. Katia Oleinik