Published by Sharon Bryant. Modified over 6 years ago.
1
Programming in R: Intro, data and programming structures
2
About R
R is a comprehensive open-source statistical package, widely used around the world.
R is a real object-oriented programming language (compared to SAS or Minitab).
R is available for Windows, Mac and Linux.
In structure it can be compared to Matlab; by origin it is a free implementation of the S language.
+ Easy and flexible programming.
+ Has a lot of contributed packages covering most statistical and machine learning methods (also recent ones!).
- Slower than C and Fortran.
- Cannot handle as large data sets as, for example, Perl and Python.
3
Installing R
R project web site: http://www.r-project.org/
5
Starting R (Windows)
6
Important menu items
Set working directory, or alternatively use setwd("C:/…")
7
Important menu items
For script mode
For interactive mode
8
Important menu items
Run a script
9
Important menu items
Install contributed packages from CRAN
10
Getting help
Specific function: help(function)
Help browser: help.start()
Search for something in help: help.search("expression")
Quick reminder of function arguments: args(function)
Examples of how to use function: example(function)
If some method is not installed on the computer: RSiteSearch("expression")
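A short sketch of these help calls, using the base function sd() as an arbitrary example. The interactive() guard is an assumption added here so that the snippet also runs as a script without opening a help viewer.

```r
# Quick reminder of the arguments of sd()
args(sd)

if (interactive()) {
  help(sd)                            # help page for a specific function
  help.search("standard deviation")   # search the installed help pages
  example(sd)                         # run the examples from the help page
}
```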
11
Preliminaries
R is case-sensitive.
Comments: start with a hash mark, e.g. #R is a cool language!
Data assignment: use -> or <- or =
a <- 3
3 -> b
c = 3
Variable types: called modes, e.g. numeric, character, logical, complex. Integer is a storage type (reported by typeof()); the mode of an integer is numeric.
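The points on this slide can be tried directly at the prompt. A minimal sketch; the object name A and the literal values are made up for illustration.

```r
# R is a cool language!  (everything after the hash mark is ignored)
a <- 3        # leftward assignment
3 -> b        # rightward assignment
c = 3         # '=' also assigns at the top level

A <- "three"  # R is case-sensitive: A and a are different objects

mode(a)       # "numeric"
mode(A)       # "character"
mode(TRUE)    # "logical"
mode(1+2i)    # "complex"
mode(3L)      # "numeric", although typeof(3L) is "integer"
```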
12
Working with vectors
Vectors are the ‘workhorses’ of R.
The function c() combines individual values (comma-separated) into a vector.
Print on screen by entering the variable name or use the function print().
[1] is the index of the first element shown on that output line; a vector in R has no row or column orientation of its own.
The length of the vector is obtained with the length() function.
The mode of the vector is obtained with the mode() function.
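A minimal sketch of these vector basics; the values in x are arbitrary.

```r
x <- c(2, 4, 6, 8)   # combine individual values into a vector
x                    # entering the name prints the vector
print(x)             # same output: [1] 2 4 6 8
length(x)            # 4
mode(x)              # "numeric"
```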
13
Listing and removing objects
Listing defined objects (vectors, matrices, data frames): use the function ls() with no arguments.
Compact display of an R object, useful for complex data frames and lists: use the function str().
Removing objects: use the function rm().
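A small sketch of listing, inspecting and removing objects. The object names are made up, and str() is assumed to be the function meant by "compact display of an R object".

```r
x <- c(1, 2, 3)
d <- data.frame(id = 1:2, name = c("a", "b"))

ls()                           # names of the objects defined so far
str(d)                         # compact display of the structure of d
rm(x)                          # remove x
still_there <- "x" %in% ls()   # FALSE once x has been removed
```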
14
Sequences
Use of ‘:’, seq() and rep()
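The three ways of building sequences might be used as follows; the numbers are illustrative only.

```r
s1 <- 1:5                          # 1 2 3 4 5
s2 <- seq(0, 1, by = 0.25)         # 0.00 0.25 0.50 0.75 1.00
s3 <- seq(10, 1, length.out = 4)   # 10 7 4 1
s4 <- rep(c(1, 2), times = 3)      # 1 2 1 2 1 2
s5 <- rep(c(1, 2), each = 2)       # 1 1 2 2
```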
15
Indexing
Indexing follows the format vector1[vector2]
Finding elements satisfying specific conditions
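A minimal sketch of the vector1[vector2] format, where vector2 can hold positions, negative positions or logical values. The data are invented for the example.

```r
x <- c(10, 20, 30, 40, 50)

x[2]            # 20
x[c(1, 3)]      # 10 30
x[2:4]          # 20 30 40
x[-1]           # everything except the first element: 20 30 40 50
x[x > 25]       # elements satisfying a condition: 30 40 50
which(x > 25)   # positions of those elements: 3 4 5
```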
16
Filtering
Filtering follows the indexing principles
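Filtering can be sketched as indexing with a logical vector. The vector z and the thresholds below are arbitrary.

```r
z <- c(5, 2, -3, 8)

keep <- z * z > 8          # TRUE FALSE TRUE TRUE
f1 <- z[keep]              # 5 -3 8
f2 <- z[z > 0 & z < 8]     # combine conditions element-wise: 5 2
f3 <- subset(z, z > 3)     # same idea with subset(): 5 8

z[z < 0] <- 0              # filtering on the left-hand side replaces values
z                          # 5 2 0 8
```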
17
Set operations on vectors
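The slide content itself is not preserved in this transcript, so the following is only a sketch of the standard base R set functions, with made-up vectors.

```r
x <- c(1, 2, 5)
y <- c(5, 1, 8, 9)

u  <- union(x, y)             # 1 2 5 8 9
i  <- intersect(x, y)         # 1 5
d1 <- setdiff(x, y)           # in x but not in y: 2
d2 <- setdiff(y, x)           # in y but not in x: 8 9
eq <- setequal(x, c(5, 2, 1)) # TRUE: same elements, order ignored
el <- 2 %in% x                # TRUE: membership test
```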
18
Other operations on vectors
Important: in R, operations on vectors are performed element-by-element.
Some operations:
Element-wise: + - * / ^ log exp sin cos sqrt
length: number of elements
sum: sum of all elements
mean, max, min, order
Logicals: TRUE or FALSE, e.g. a <- TRUE
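A few of these operations on two small, made-up vectors:

```r
x <- c(1, 4, 9)
y <- c(2, 2, 2)

x + y               # 3 6 11
x * y               # 2 8 18
x ^ y               # 1 16 81
sqrt(x)             # 1 2 3

length(x)           # 3
sum(x)              # 14
max(x)              # 9
min(x)              # 1
order(c(3, 1, 2))   # positions that would sort the vector: 2 3 1

a <- TRUE           # a logical value
```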
19
Working with matrices
Use the function matrix(): a <- matrix(values, nrow=m, ncol=n)
values is a vector of values, either written out with c() or already defined.
m is the number of rows and n is the number of columns of the matrix.
The number of values should equal m × n; a shorter vector is recycled, and R warns when the lengths do not divide evenly.
The values are entered column-wise.
The identifiers nrow= and ncol= can be omitted if m and n are given in that order.
Note the double indexing: the first number is for the row and the second number for the column.
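A minimal sketch of matrix() and double indexing. The byrow = TRUE call is an addition not mentioned on the slide, shown only to contrast with the default column-wise filling.

```r
a <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)
a
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

b <- matrix(c(1, 2, 3, 4, 5, 6), 2, 3)   # nrow= and ncol= omitted

a[2, 3]    # row 2, column 3: 6
a[1, ]     # whole first row: 1 3 5
dim(a)     # 2 3

r <- matrix(1:6, nrow = 2, byrow = TRUE)  # fill row-wise instead
r[1, ]     # 1 2 3
```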
20
Matrix filtering
Follows the same principles as for vectors
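Matrix filtering can be sketched as follows: a condition on one column selects whole rows. The matrix m and the threshold are invented for the example.

```r
m <- matrix(c(1, 2, 3, 2, 3, 4), nrow = 3)
m
#      [,1] [,2]
# [1,]    1    2
# [2,]    2    3
# [3,]    3    4

rows <- m[m[, 2] >= 3, ]   # rows whose column 2 is at least 3 (rows 2 and 3)
vals <- m[m > 2]           # single elements, taken column-wise: 3 3 4
```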
21
Editing sizes of matrices
Adding and deleting rows and columns of matrices
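One common way to do this, assuming the slide used rbind(), cbind() and negative indices (the original screenshots are not preserved here):

```r
m <- matrix(1:4, nrow = 2)    # rows are (1, 3) and (2, 4)

m2 <- rbind(m, c(9, 9))       # add a row: now 3 x 2
m3 <- cbind(m, c(7, 8))       # add a column: now 2 x 3

m4 <- m3[, -2]                # delete column 2: rows are (1, 7) and (2, 8)
m5 <- m2[-1, ]                # delete row 1: rows are (2, 4) and (9, 9)
```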
22
Matrix operations
Matrix/vector multiplication
Matrix/matrix multiplication
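A sketch of both products with the %*% operator, contrasted with the element-wise *. The numbers are arbitrary.

```r
a <- matrix(1:4, nrow = 2)   # rows are (1, 3) and (2, 4)
v <- c(1, 1)

av <- a %*% v    # matrix/vector product: a 2 x 1 matrix holding 4 and 6
aa <- a %*% a    # matrix/matrix product: rows are (7, 15) and (10, 22)
ee <- a * a      # element-wise product:  rows are (1, 9) and (4, 16)
```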
23
Matrix operations
Matrix transpose: b = aᵀ
Matrix inverse: b = a⁻¹
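In base R the transpose is computed with t() and the inverse with solve(). A minimal sketch on an invertible 2 x 2 matrix chosen for illustration:

```r
a <- matrix(c(2, 1, 1, 3), nrow = 2)   # rows are (2, 1) and (1, 3)

b_t   <- t(a)        # transpose (a is symmetric here, so t(a) equals a)
b_inv <- solve(a)    # inverse

check <- a %*% b_inv # identity matrix, up to rounding error
```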
24
Lists
A list is a collection of objects.
The objects can be of different types (e.g. "logical", "integer", "double", "character").
List indexing is different from vector and matrix indexing.
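A short sketch of a list with components of different types, and of how list indexing differs from vector indexing. The list emp and its component names are invented.

```r
emp <- list(name = "Joe", salary = 55000, union = TRUE)

emp[["salary"]]     # double brackets return the component itself: 55000
emp$name            # same idea, by name: "Joe"
emp["salary"]       # single brackets return a list of length 1

sapply(emp, typeof) # "character" "double" "logical"
```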
25
Lists
Adding and deleting components of lists
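Components can be added by assigning to a new name and deleted by assigning NULL. A minimal sketch with made-up components:

```r
emp <- list(name = "Joe", salary = 55000)

emp$union <- TRUE       # add a component
emp[["age"]] <- 41      # add another one with double brackets
emp$salary <- NULL      # delete a component

names(emp)              # "name" "union" "age"
```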
26
Lists
Accessing list components and values
unlist() flattens a list into a vector, coercing all values to a common mode
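A sketch of accessing components and of the coercion done by unlist(); the list is illustrative.

```r
lst <- list(a = 1, b = "x", c = TRUE)

names(lst)          # "a" "b" "c"
lst$b               # "x"
lst[[3]]            # TRUE

u <- unlist(lst)    # everything becomes character: "1" "x" "TRUE"
mode(u)             # "character"
```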
27
Data frames
Data frames are two-dimensional analogs of matrices that can contain columns of different modes.
Use the function data.frame(object 1, object 2, … , object k)
Matrices need to be protected, otherwise each column of a matrix becomes a separate column of the data frame. Protection is done with the function I()
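A minimal sketch of data.frame() and of what I() changes for a matrix argument. All object names and values are invented.

```r
kids <- c("Jack", "Jill")
ages <- c(12, 10)
d <- data.frame(kids, ages)       # two columns of different modes

m <- matrix(1:4, nrow = 2)
d1 <- data.frame(id = 1:2, m)         # matrix split up: 3 columns in total
d2 <- data.frame(id = 1:2, m = I(m))  # matrix protected: 2 columns in total

ncol(d1)     # 3
ncol(d2)     # 2
dim(d2$m)    # 2 2, the matrix is kept as one component
```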
28
Data frames
Accessing components and values of data frames (both list and matrix indexing work)
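Both indexing styles side by side, on a small made-up data frame:

```r
d <- data.frame(kids = c("Jack", "Jill"), ages = c(12, 10))

# List-style access
d$ages          # 12 10
d[["ages"]]     # 12 10
d[[2]]          # 12 10

# Matrix-style access
d[, 2]          # 12 10
d[1, ]          # first row as a one-row data frame
d[2, "ages"]    # 10

older <- d[d$ages > 10, ]   # filtering rows works as for matrices
```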
29
Data frames
Combining data frames
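The slide does not say which functions it demonstrated; rbind(), cbind() and merge() are the usual base R choices and are assumed here, with invented data.

```r
d1 <- data.frame(id = 1:3, x = c(10, 20, 30))
d2 <- data.frame(id = c(2, 3, 4), y = c("b", "c", "d"))

more_rows <- rbind(d1, data.frame(id = 4, x = 40))   # append rows
more_cols <- cbind(d1, z = c(TRUE, FALSE, TRUE))     # append a column
joined    <- merge(d1, d2, by = "id")                # match rows on id (2 and 3)
```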
30
Factors
A factor can be seen as a vector with certain levels
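A minimal sketch of creating a factor and looking at its levels; the category labels are made up.

```r
x <- c("low", "high", "medium", "low")
f <- factor(x, levels = c("low", "medium", "high"))

levels(f)       # "low" "medium" "high"
nlevels(f)      # 3
as.integer(f)   # internal level codes: 1 3 2 1
table(f)        # counts per level: low 2, medium 1, high 1
```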
31
Sorting
Sorting according to either numeric values or characters
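A sketch using sort() and order(), which are assumed to be the functions shown on the original slide. The data are invented.

```r
x <- c(3, 1, 2)
sort(x)                          # 1 2 3
sort(x, decreasing = TRUE)       # 3 2 1
sort(c("pear", "apple", "fig"))  # "apple" "fig" "pear"

# order() gives the row positions needed to sort a data frame
d <- data.frame(name = c("Ann", "Bob", "Cy"), age = c(31, 25, 40))
by_age <- d[order(d$age), ]      # Bob, Ann, Cy
```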
32
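Conditional execution
if (expr) { … } else { … }
If you need to connect several conditions, use '&', '&&', '|' or '||'. '&' and '|' work element-by-element on vectors; '&&' and '||' evaluate a single condition and are the ones normally used inside if().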
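A minimal sketch of if/else with connected conditions. The variable names and thresholds are arbitrary.

```r
x <- 7

if (x > 5 && x < 10) {    # && combines two single conditions
  size <- "medium"
} else {
  size <- "other"
}
size                       # "medium"

# & and | work element-by-element on whole vectors
v <- c(1, 6, 12)
inside  <- v > 5 & v < 10  # FALSE TRUE FALSE
outside <- v < 5 | v > 10  # TRUE FALSE TRUE
```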
33
Loops
for (name in expr1) { … }
while (condition) { … }
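Both loop forms in a small, made-up example:

```r
# for: run the body once for each element of 1:5
total <- 0
for (i in 1:5) {
  total <- total + i
}
total    # 15

# while: repeat as long as the condition holds
n <- 1
while (n < 100) {
  n <- n * 2
}
n        # 128
```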
34
Avoiding loops
The *apply() family of functions is among the most important in R for replacing explicit loops with compact, often faster, code.
apply(m, dimcode, f, fargs)
- m: the matrix
- dimcode: 1 applies the function to rows, 2 applies the function to columns
- f: the function to be applied
- fargs: optional arguments to be supplied to f
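A sketch of apply() over rows and columns, including an optional argument passed on to f. The helper scale_sum() and its argument k are hypothetical, defined only for this example.

```r
m <- matrix(1:6, nrow = 2)    # rows are (1, 3, 5) and (2, 4, 6)

# Hypothetical helper: sum of a vector multiplied by k
scale_sum <- function(v, k = 1) k * sum(v)

row_sums <- apply(m, 1, scale_sum)           # over rows: 9 12
col_sums <- apply(m, 2, scale_sum, k = 10)   # over columns, k passed to f: 30 70 110
```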
35
Avoiding loops
Use apply over matrices and data frames.
Use lapply and sapply over lists.
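The difference between the three can be sketched like this: lapply() returns a list, sapply() simplifies the result to a vector where possible. The objects are illustrative.

```r
lst <- list(a = 1:3, b = c(10, 20))

l_res <- lapply(lst, mean)   # a list with components a = 2 and b = 15
s_res <- sapply(lst, mean)   # a named numeric vector: a = 2, b = 15

d <- data.frame(x = 1:3, y = c(4, 5, 6))
col_max <- apply(d, 2, max)  # column maxima of a data frame: x = 3, y = 6
```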
36
Writing your own functions
A function returns the value of its last evaluated expression, so end the function body with the value that should be returned. You may also use return(value) to state explicitly which value the function returns.
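The two styles of returning a value, in a sketch with made-up function names:

```r
# The value of the last expression is returned
sq_sum <- function(x) {
  s <- sum(x^2)
  s
}

# The same function with an explicit return()
sq_sum2 <- function(x) {
  return(sum(x^2))
}

sq_sum(c(1, 2, 3))    # 14
sq_sum2(c(1, 2, 3))   # 14
```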
37
Writing your own functions
If several values need to be returned, a list may be used
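A minimal sketch of returning several values in a list; the function name describe() is hypothetical.

```r
describe <- function(x) {
  list(mean = mean(x), sd = sd(x), n = length(x))
}

res <- describe(c(2, 4, 6))
res$mean   # 4
res$sd     # 2
res$n      # 3
```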
38
Writing your own functions
Required arguments and arguments with default values
Arguments can be given in any order (provided you specify their names) when you call the function
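A sketch with one required argument and one argument with a default value; the function power() is invented for illustration.

```r
power <- function(base, exponent = 2) {
  base^exponent
}

power(3)                        # default exponent is used: 9
power(2, 3)                     # positional arguments: 8
power(exponent = 3, base = 2)   # named arguments in any order: 8
```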
39
Simulation
The key to simulation is random number generation (RNG). R implements several RNGs; the default is the "Mersenne-Twister". The sample() function can be used for random sampling from vectors and matrices.
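A minimal sketch of sample(). The results are random, so no particular output is shown.

```r
x <- 1:10

s1   <- sample(x, 3)                   # 3 values without replacement
s2   <- sample(x, 20, replace = TRUE)  # 20 values with replacement
perm <- sample(x)                      # a random permutation of x

RNGkind()[1]                           # name of the RNG currently in use
```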
40
Simulation
R has functions to generate variates from many distributions, e.g. rnorm(), rbinom(), rchisq(), rpois(), rgamma() and mvrnorm() (the last one is in the MASS package).
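A sketch of a few of these generators with arbitrarily chosen parameters. mvrnorm() is left out here because it needs the MASS package to be loaded first.

```r
n <- 5

z  <- rnorm(n, mean = 0, sd = 1)        # normal
b  <- rbinom(n, size = 10, prob = 0.3)  # binomial: integers between 0 and 10
p  <- rpois(n, lambda = 2)              # Poisson: non-negative integers
ch <- rchisq(n, df = 3)                 # chi-square: non-negative values
g  <- rgamma(n, shape = 2, rate = 1)    # gamma: non-negative values
```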
41
Simulation
The set.seed() function can be used to generate reproducible sequences of random numbers
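Reproducibility can be checked by setting the same seed twice. The seed value 123 is arbitrary.

```r
set.seed(123)
first <- rnorm(3)

set.seed(123)        # same seed again
second <- rnorm(3)

identical(first, second)   # TRUE: the same sequence is generated

third <- rnorm(3)          # continuing without resetting gives new numbers
```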