R Course 1st Lecture
Information about the Course
- Matthew Exact - 3rd year MORSE
- M.Exact@warwick.ac.uk
- Wednesday 1-2pm, R0.41 (except Week 7, which is in R0.39)
- Slides will be emailed out after class
What is R?
- Open source, free language for statistical computing
- Widely used in data science and statistics
- Used to apply statistical techniques and visualise data
- Many packages can be installed to extend the functionality of R
- You can use R on its own or through RStudio, an IDE that runs on top of an R installation
- R is found at https://www.r-project.org/ and RStudio at https://www.rstudio.com/products/rstudio/download/
What can R do?
- According to a 2015 survey of analytical professionals by Rexer Analytics, 76% of respondents report using R
- This is up dramatically from just 23% in 2007
- More than a third of respondents (36%) identify R as their primary tool
Topics we will cover
- Basics
- Data types
- Vectors
- Vector operations
- Matrices
- Matrix operations
- Factors
- Lists
- Data frames
- Graphics
- Functions
- Control structures (if, loops etc.)
- In-depth examples and practice using all of the above skills
The basics
- Console
- Variables, e.g. x, y etc.
- Scripts
- Comments (#)
- Data types
- Operators
- R code can be entered into the command line directly or saved to a script, which can be run inside a session using the source() function.
- To go back to code already entered in the console, use the up arrow.
- You can run a command directly from a script by placing the cursor inside the command, or by highlighting several commands, and hitting Ctrl-Enter. This advances the cursor to the next command, where you can hit Ctrl-Enter again, and so on.
- Commands are separated either by a ; or by a new line.
- R is case sensitive.
- The # character starts a comment: everything from # to the end of the line is not executed.
- Commands can extend beyond one line of text. Put operators like + at the end of lines for multi-line commands.
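The rules about separators, multi-line commands and case sensitivity can be tried out with a short sketch like the one below. The object names a, b and total are made up for illustration.

```r
# Two commands on one line, separated by a semicolon
a <- 1; b <- 2

# A command continued over several lines: the trailing + tells R
# that the expression is not finished yet
total <- a +
  b +
  10

# Typing the name prints the value
total             # 13

# R is case sensitive, so Total would be a different object from total
exists("total")   # TRUE
```

If the + were moved to the start of the following line instead, R would treat `total <- a` as a complete command and stop reading there.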
- R stores both data and output from data analysis (as well as everything else) in objects.
- Data are assigned to and stored in objects using the <- or = operator.
- To print the contents of an object, specify the object's name alone.
- A list of all objects in the current session can be obtained with ls().
- Help files for R functions are accessed by preceding the name of the function with ? (e.g. ?seq).
- In the help file, we will find a list of Arguments to the function, in a specific order.
- Values for arguments to functions can be specified either by name or by position.
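As a small illustration of objects and of passing arguments by name or by position, here is a sketch using seq(), whose first three arguments are from, to and by (see ?seq). The object names are invented for this example.

```r
# Assign with <- or =
x <- 5
y = 10

# Typing the name alone prints the object at the console
x

# List the objects in the current session
ls()

# Arguments given by position must follow the order in the help file
by_position <- seq(1, 9, 2)

# Arguments given by name can appear in any order
by_name <- seq(by = 2, to = 9, from = 1)

identical(by_position, by_name)   # TRUE
```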
Data types
- Logical
- Numeric (stored as double precision by default)
- Character
- Integer
- Other less used types are complex and raw
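One way to check which type a value has is class(), with typeof() showing how it is stored internally. The values below are arbitrary examples.

```r
class(TRUE)       # "logical"
class(3.14)       # "numeric"
class("hello")    # "character"
class(3L)         # "integer" (the L suffix asks for an integer)

# A plain number without L is numeric, stored as a double
class(3)          # "numeric"
typeof(3)         # "double"

class(2+1i)       # "complex"
```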
Coercion
> as.numeric(TRUE)
[1] 1
> as.numeric(FALSE)
[1] 0
> as.character(4)
[1] "4"
> as.numeric("4.5")
[1] 4.5
> as.integer("4.5")
[1] 4
> as.numeric("Hello")
[1] NA
Operators
- Assignment: x <- 10, 2 -> y
- Arithmetic: x+y, x-y, x*y, x/y, x^y, x%%y
- Relational: x<y, x>y, x!=y, x==y, etc.
- Logical: !, &, &&, |, ||
- Precedence and associativity
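Assignment
Operator      Description             Associativity
<-, <<-, =    Leftwards assignment    Right to left
->, ->>       Rightwards assignment   Left to right
Note: = and <- both assign leftwards. <- can be used anywhere, but = is only allowed as assignment at the top level (e.g. in a complete expression typed at the command line) or as a subexpression inside braces. Inside a function call, = names an argument instead of assigning.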
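The assignment operators in the table can be compared in a short sketch. The names counter and increment are hypothetical, and the function syntax used to show <<- is only covered later in the course.

```r
x <- 10        # leftwards assignment
20 -> y        # rightwards assignment
a <- b <- 0    # right to left: b gets 0 first, then a gets the value of b

# Inside a function call, = names an argument and does not assign
mean(x = c(1, 2, 3))   # 2
x                      # still 10

# <<- assigns to a variable in an enclosing environment
counter <- 0
increment <- function() {
  counter <<- counter + 1   # changes the counter defined outside the function
}
increment()
increment()
counter   # 2
```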
Arithmetic
Operator   Description           Associativity
+          Addition              Left to right
-          Subtraction           Left to right
*          Multiplication        Left to right
/          Division              Left to right
%%         Modulus (remainder)   Left to right
^          Exponent              Right to left
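Associativity only matters when the same kind of operator appears more than once in a row. A few arbitrary calculations show the difference:

```r
7 %% 3        # 1, the remainder after dividing 7 by 3

# Subtraction and division group from the left
10 - 4 - 3    # 3, evaluated as (10 - 4) - 3
16 / 4 / 2    # 2, evaluated as (16 / 4) / 2

# Exponentiation groups from the right
2^3^2         # 512, evaluated as 2^(3^2)
(2^3)^2       # 64
```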
# Using R as a calculator
2+3
# assign the number 3 to object called abc
abc <- 3
# print contents
abc
# or
print(abc)
Task
- Create a ‘height’ variable with value 3 and a ‘width’ variable with value 6
- Assign an ‘area’ variable to be height multiplied by width
- Then print the area
Relational
Operator   Description                Associativity
<          Less than                  Left to right
>          Greater than               Left to right
<=         Less than or equal to      Left to right
>=         Greater than or equal to   Left to right
==         Equal to                   Left to right
!=         Not equal to               Left to right
Logical
Operator   Description                Associativity
!          Logical NOT                Left to right
&          Element-wise logical AND   Left to right
&&         Logical AND                Left to right
|          Element-wise logical OR    Left to right
||         Logical OR                 Left to right
x<3 & y>3
TRUE | FALSE
!TRUE
!(x>5)
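To run the expressions above, x and y need values first. The sketch below picks arbitrary values and also contrasts the element-wise & with &&, which expects single values and stops evaluating as soon as the answer is known.

```r
x <- 2; y <- 7

x < 3 & y > 3    # TRUE
TRUE | FALSE     # TRUE
!TRUE            # FALSE
!(x > 5)         # TRUE

# & compares vectors element by element
v <- c(1, 4, 6)
v > 2 & v < 5    # FALSE TRUE FALSE

# && works on single values; the right-hand side is skipped here
# because x > 5 is already FALSE
x > 5 && stop("this is never evaluated")   # FALSE
```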
Precedence (highest first)
::
$ @
^
- + (unary)
:
%xyz%
* /
+ - (binary)
> >= < <= == !=
!
& &&
| ||
~ (unary and binary)
-> ->>
<- <<-
= (as assignment)
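A few cases where the precedence order changes the result are sketched below. The values are arbitrary, and brackets can always be used to make the intended order explicit.

```r
# ^ binds tighter than unary minus
-2^2                   # -4, evaluated as -(2^2)
(-2)^2                 # 4

# Unary minus binds tighter than :, and : binds tighter than *
-1:3                   # -1 0 1 2 3
1:3 * 2                # 2 4 6

# * before +
1 + 2 * 3              # 7

# & before |
TRUE | FALSE & FALSE   # TRUE, evaluated as TRUE | (FALSE & FALSE)

# <- before =, so this is read as a = (b <- 3)
a = b <- 3
a                      # 3
```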
Data structures
                  Homogeneous     Heterogeneous
One-dimensional   Atomic vector   List
2-dimensional     Matrix          Data frame
n-dimensional     Array
Atomic vector
- One-dimensional, all elements must be of the same type
- Created using c()
c("A", "B", "C")
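Because an atomic vector can hold only one type, c() converts mixed inputs to a common type. A short sketch, with made-up object names:

```r
letters_vec <- c("A", "B", "C")
class(letters_vec)    # "character"
length(letters_vec)   # 3

# Mixing types: everything is coerced to character here
mixed <- c(1, "a", TRUE)
mixed                 # "1" "a" "TRUE"
class(mixed)          # "character"
```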
List
- One-dimensional, elements can be of any type
x <- list(c(1,2,3), 100, c(TRUE, FALSE, TRUE), list("a", "b", "c"))
Matrix
- 2-dimensional, 1 type
matrix(data=c(1:9), nrow=3, ncol=3)
# create a 2x3 matrix, filling down columns
matrix(1:6, nrow=2)
# now fill across rows
matrix(5:14, nrow=2, byrow=TRUE)
Data frame
- 2-dimensional, each column can be a different type
df <- data.frame(name = c("Matt", "Joe", "Chris"),
                 age = c(52, 29, 25),
                 relationshipStatus = c("married", "single", "married"))
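The data frame above can be inspected column by column. This sketch rebuilds it and checks its shape. The class of the text columns depends on the R version (character from R 4.0 onwards, factor by default before that), so only the numeric column is checked here.

```r
df <- data.frame(name = c("Matt", "Joe", "Chris"),
                 age = c(52, 29, 25),
                 relationshipStatus = c("married", "single", "married"))

nrow(df)        # 3 rows, one per person
ncol(df)        # 3 columns, one per variable
names(df)       # "name" "age" "relationshipStatus"

# Each column is a vector with its own type
class(df$age)   # "numeric"
mean(df$age)    # about 35.33
```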
Array
- n-dimensional, same type
a <- array(data=c(1:6), dim = c(2,2,2))
- This gives a 2x2x2 array; the 6 values are recycled to fill all 8 cells
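Using dim()
- dim() gives the dimensions of a data structure
- Now try this on the examples of each type of data structure above
- Atomic vectors and lists have no dim attribute, so dim() returns NULL for them; use length() instead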
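As a rough guide to what to expect, the sketch below calls dim() on one small object of each kind. The objects and the data frame columns are invented for this example and are not the ones from the earlier slides.

```r
v  <- c("A", "B", "C")                      # atomic vector
l  <- list(c(1, 2, 3), 100)                 # list
m  <- matrix(1:6, nrow = 2)                 # matrix
a  <- array(1:8, dim = c(2, 2, 2))          # array
df <- data.frame(id = 1:3,                  # data frame with
                 score = c(90, 85, 70))     # hypothetical columns

dim(v)      # NULL
length(v)   # 3
dim(l)      # NULL
length(l)   # 2
dim(m)      # 2 3  (rows, columns)
dim(a)      # 2 2 2
dim(df)     # 3 2  (rows, columns)
```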
rep() and seq() to generate vectors
To create vectors with a predictable sequence of elements, use rep() for repeating elements and seq() for sequential elements. The expression m:n will generate a vector of integers from m to n.
# second argument is number of repetitions
rep(0, times=3)
## [1] 0 0 0
rep("abc", 4)
## [1] "abc" "abc" "abc" "abc"
# from, to, by
seq(from=1, to=5, by=2)
## [1] 1 3 5
seq(10, 0, -5)
## [1] 10 5 0
# colon operator
3:7
## [1] 3 4 5 6 7
# you can nest functions
rep(seq(1,3,1), times=2)
## [1] 1 2 3 1 2 3
# each vs times
rep(seq(1,3,1), each=2)
## [1] 1 1 2 2 3 3
Next class we will do some tasks related to the content covered in this class, so you can practise and become more familiar with it.