Stat 251 (2009, Summer) Lab 1 TA: Yu, Chi Wai
Set up lab accounts
Introduction to R
Lab account
Select any one of the Windows 2000 servers. If you land on a Unix login page, press Ctrl+\ to return to the ezConnect Manager.
User name: the first eight (8) letters of the full name that you used to register at UBC.
Password: a capital "S" followed by the next seven (7) digits of your student ID number.
Lab account (cont'd)
CHANGE YOUR PASSWORD: Press Ctrl+Alt+Del simultaneously, which brings you back to a page similar to the login page. The option for changing your password is at the bottom right of that window.
Introduction to R
R is a free statistical programming language. Download it at: http://www.r-project.org.
Open R by double-clicking its desktop icon.
Basic commands in R:
1. Basic operations:
  2+3       # addition
  2*3       # multiplication
  2/3       # division
  2^3       # power
  log(5)    # natural logarithm
  sqrt(2)   # square root
  exp(0.2)  # exponential function
  abs(-7)   # absolute value
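A few of these operations can be checked against each other. The sketch below is only illustrative: the other-base logarithms (log10() and the base argument of log()) are base-R additions that do not appear on the slide, and the variable names are arbitrary.

```r
# log() is the natural logarithm, so exp() undoes it
a <- exp(log(5))          # approximately 5 (up to floating-point error)

# Logarithms in other bases (not on the slide; base-R functions)
b <- log10(1000)          # 3
c2 <- log(8, base = 2)    # 3

# Operator precedence: ^ is applied before * and /, which are applied before +
d <- 2 + 3 * 2^3          # 2 + 3 * 8 = 26

print(c(a, b, c2, d))
```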
Basic commands in R:
2. Basic Vector Manipulation:
Create a vector with c(...), which stands for concatenation:
  y = c(0,1,3)  OR  y <- c(0,1,3)
  y = c("a","b","c")
t() treats y as a column vector and returns the corresponding row vector.
  For example: t(y)
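To see what t() actually returns, it can help to inspect dimensions. This is a small sketch using base-R functions only; the variable names ty and s are arbitrary choices for illustration.

```r
y <- c(0, 1, 3)        # a numeric vector

no_dim <- is.null(dim(y))   # TRUE: a plain vector carries no dimensions
ty <- t(y)                  # t() treats y as a column and returns a 1 x 3 matrix
dim(ty)                     # 1 3

s <- c("a", "b", "c")  # a character vector is built the same way
length(s)              # 3
```

In other words, "row" and "column" only become meaningful once the vector is turned into a matrix, for example by t().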
2. Basic Vector Manipulation:
Two fast ways to create a vector holding a sequence:
a) Use the colon operator : to create a vector of consecutive integers (step size = 1).
  x = 10:16
  Output: 10 11 12 13 14 15 16
How would you create a sequence of odd numbers, or the vector (2, 5, 8, 11)?
2. Basic Vector Manipulation:
b) Use seq(from, to, by=) to create a sequence with a chosen step size.
  x = seq(11,20, by=2)      (step size = 2)
  Output: 11 13 15 17 19
  x = seq(11,10, by=-0.2)   (step size = -0.2)
  Output: 11.0 10.8 10.6 10.4 10.2 10.0
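The question about odd numbers and the vector (2, 5, 8, 11) can be answered with seq(). The following is one possible sketch; the end point 15 for the odd numbers is an arbitrary choice, and the length.out argument is a base-R option that the slides do not mention.

```r
odd  <- seq(1, 15, by = 2)              # 1 3 5 7 9 11 13 15
v    <- seq(2, 11, by = 3)              # 2 5 8 11

# Same vector, fixing the number of elements instead of the end point
v2   <- seq(2, by = 3, length.out = 4)  # 2 5 8 11

# Odd numbers again, using the colon operator plus arithmetic
odd2 <- 2 * (1:8) - 1                   # 1 3 5 7 9 11 13 15
```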
2. Basic Vector Manipulation:
Additionally, rep(x, B) replicates the values in x B times.
For example:
  rep(1, 5)        Output: (1,1,1,1,1)
  rep(c(1,2), 3)   Output: (1,2,1,2,1,2)
  rep("Mike", 3)   Output: ("Mike", "Mike", "Mike")
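The second argument of rep() is called times, and rep() also accepts an each argument that repeats every element before moving on to the next. The each form is an addition to the slide, shown here only as a sketch of the difference.

```r
a <- rep(c(1, 2), times = 3)   # 1 2 1 2 1 2  (whole vector repeated 3 times)
b <- rep(c(1, 2), each = 3)    # 1 1 1 2 2 2  (each element repeated 3 times)
m <- rep("Mike", 3)            # "Mike" "Mike" "Mike"
```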
2. Basic Vector Manipulation:
Add extra elements to an existing vector:
  For example, with y = c(2,5,9), add 19 to y to create a new vector called z:
  z = c(y,19)
  Add 19 and 23 to y to create another vector w:
  w = c(y,19,23)
In general, stick several vectors together to make a new one.
  For example, with y = c(2,5,9) and x = c(1,3,10):
  u1 = c(y,x)   Output: 2 5 9 1 3 10
  u2 = c(x,y)   Output: 1 3 10 2 5 9
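c() always adds at the position where the new value is written. To place a value in the middle of a vector, base R offers append() with an after argument. This function is not mentioned on the slide, so treat the sketch below as a supplementary illustration.

```r
y <- c(2, 5, 9)

z      <- c(y, 19)                  # 2 5 9 19   (added at the end)
front  <- c(19, y)                  # 19 2 5 9   (added at the front)
middle <- append(y, 19, after = 1)  # 2 19 5 9   (inserted after the 1st element)

length(z)   # 4
y           # still 2 5 9: none of these calls modifies y itself
```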
2. Basic Vector Manipulation:
Delete elements in a vector:
  x = seq(11,20, by=2)
x[i] : the ith element of x.
  For example: x[2] gives 13; x[c(3,5)] gives (15,19)
x[-i] : x with the ith element dropped.
  For example, drop the 2nd element of x to create a new vector w:
  w = x[-2]     Output: w = (11,15,17,19)
  Drop the 2nd and 4th elements of x:
  x[-c(2,4)]    Output: (11,15,19)
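The indexing rules above can be run end to end as follows. The last line, which selects elements by a logical condition, goes beyond the slide and is included only as a hedged extra.

```r
x <- seq(11, 20, by = 2)   # 11 13 15 17 19

second <- x[2]             # 13
picked <- x[c(3, 5)]       # 15 19
w      <- x[-2]            # 11 15 17 19  (2nd element dropped)
w2     <- x[-c(2, 4)]      # 11 15 19     (2nd and 4th elements dropped)

# Negative indexing returns a new vector; x itself still has 5 elements
length(x)                  # 5

# Selecting by condition instead of by position (not on the slide)
big <- x[x > 14]           # 15 17 19
```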
3. Algebraic manipulation
a) Sum of a vector and a scalar
  x = c(11,8,3)
  x + 7 ?
  7 is added to each element of x to form a new vector.
  x + 7   Output: (18, 15, 10)
3. Algebraic manipulation
b) Multiplication of a vector by a scalar
  x = c(11,8,3)
  x*7 ?
  Each element of x is multiplied by 7 to form a new vector.
  x*7   Output: (77, 56, 21)
c) Division of a vector by a scalar
  x/7   Output: (11/7, 8/7, 3/7)
3. Algebraic manipulation
d) Vector sum (elementwise)
  x = c(11,8,3), and y = c(2,6,3)
  x + y   Output: (11+2, 8+6, 3+3)
e) Vector subtraction (elementwise)
  x - y   Output: (11-2, 8-6, 3-3)
3. Algebraic manipulation
f1) Vector multiplication (elementwise)
  x = c(11,8,3), and y = c(2,6,3)
  x * y   Output: (11*2, 8*6, 3*3)
f2) Vector multiplication (usual inner product), i.e. x^T y
  t(x)%*%y   Output: 11*2+8*6+3*3
  t(x)%*%y has the same value as sum(x*y)
3. Algebraic manipulation
g) Vector division (elementwise)
  x = c(11,8,3), and y = c(2,6,3)
  x/y   Output: (11/2, 8/6, 3/3)
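The two ways of computing the inner product give the same number but not the same kind of object: %*% returns a matrix, while sum() returns a plain number. A short sketch, with arbitrary variable names:

```r
x <- c(11, 8, 3)
y <- c(2, 6, 3)

elementwise <- x * y        # 22 48 9
ip_matrix   <- t(x) %*% y   # 1 x 1 matrix holding 22 + 48 + 9
ip_scalar   <- sum(x * y)   # 79

dim(ip_matrix)              # 1 1
drop(ip_matrix)             # 79, with the matrix dimensions removed
```

If a plain number is needed later (for example, to multiply another vector), sum(x*y) or drop() avoids carrying the 1 x 1 matrix around.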
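All of the operations in this section follow one rule: R works element by element and, when one operand is shorter, reuses ("recycles") it. A scalar is simply a vector of length 1. The sketch below illustrates this; the vectors u and v are made-up examples, not from the slides.

```r
x <- c(11, 8, 3)

# Adding a scalar is the same as adding a vector of three 7s
same <- x + c(7, 7, 7)      # 18 15 10, identical to x + 7

# A length-2 vector is recycled across a length-4 vector
u <- c(1, 2, 3, 4)
v <- u * c(10, 100)         # 10 200 30 400
```

When the longer length is not a multiple of the shorter one, R still recycles but issues a warning, which usually signals a mistake in the vector lengths.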
4. Summary statistics
a) Measure the location of data:
  x = c(0.18, -0.07, 0.09, -0.25, -0.76, -0.58)
  i) Sample mean: mean(x)
  ii) Sample median, i.e. the 0.5 sample quantile Q(0.5): median(x) or quantile(x, 0.5)
4. Summary statistics
b) Measure the dispersion of data:
  x = c(0.18, -0.07, 0.09, -0.25, -0.76, -0.58)
  i) Sample standard deviation and sample variance: sd(x) and var(x)
  ii) Sample IQR, i.e. Q(0.75)-Q(0.25): quantile(x, 0.75) - quantile(x, 0.25)
4. Summary statistics
c) Other measures:
  x = c(0.18, -0.07, 0.09, -0.25, -0.76, -0.58)
  max(x): the maximum of x
  min(x): the minimum of x
  range(x): the minimum and maximum of x
  summary(x): a summary of x (minimum, quartiles, median, mean, maximum)
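As a check on what mean() and median() compute, both can be reproduced by hand for this data set. This is only a sketch; the helper variable names are arbitrary.

```r
x <- c(0.18, -0.07, 0.09, -0.25, -0.76, -0.58)

# Sample mean: sum of the values divided by the number of values
mean_by_hand <- sum(x) / length(x)     # -1.39 / 6, about -0.2317

# Sample median: with 6 values, the average of the 3rd and 4th sorted values
sorted <- sort(x)                      # -0.76 -0.58 -0.25 -0.07 0.09 0.18
median_by_hand <- mean(sorted[3:4])    # (-0.25 + -0.07) / 2 = -0.16

mean(x)
median(x)
```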
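The dispersion measures are related to each other, which the following sketch verifies numerically. Note that var() divides by n - 1, and that base R also provides IQR(), a function the slide does not mention but which returns the same value as the quantile difference under R's default quantile method.

```r
x <- c(0.18, -0.07, 0.09, -0.25, -0.76, -0.58)
n <- length(x)

# Sample variance by hand: squared deviations divided by n - 1
var_by_hand <- sum((x - mean(x))^2) / (n - 1)

# The sample standard deviation is the square root of the sample variance
sd_by_hand <- sqrt(var_by_hand)

# IQR as on the slide (unname() drops the "75%" label) and via IQR()
iqr_slide <- unname(quantile(x, 0.75) - quantile(x, 0.25))
iqr_fun   <- IQR(x)

print(c(var_by_hand, sd_by_hand, iqr_slide, iqr_fun))
```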
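One point that is easy to miss: range() returns two numbers (the minimum and the maximum), not their difference. A brief sketch using the same data:

```r
x <- c(0.18, -0.07, 0.09, -0.25, -0.76, -0.58)

r <- range(x)          # -0.76 0.18, i.e. c(min(x), max(x))
width <- diff(r)       # max(x) - min(x) = 0.94

s <- summary(x)        # six numbers: Min., 1st Qu., Median, Mean, 3rd Qu., Max.
print(s)
```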
5. Graphics
  x = rnorm(20)   (generates 20 observations at random from the standard normal distribution)
a) Histogram: hist(x)
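Because rnorm() draws random numbers, every run produces a different histogram. The sketch below fixes the random seed so the plot is reproducible; the seed value 251 and the title and axis label are arbitrary choices, not part of the lab.

```r
set.seed(251)          # arbitrary seed, only to make the example reproducible
x <- rnorm(20)         # 20 draws from the standard normal distribution

# hist() draws the plot and also returns the bins it used
h <- hist(x, main = "Histogram of 20 standard normal draws", xlab = "x")

h$breaks               # bin boundaries chosen by R
h$counts               # number of observations in each bin
sum(h$counts)          # 20: every observation falls in exactly one bin
```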
5. Graphics
b) Boxplot: boxplot(x)
Interpretation of the example boxplot:
  The distribution is skewed to the left.
  The distribution has a heavy tail on the left.
  Some small-valued observations are far away from the majority of the data.
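A boxplot with these features can be reproduced with a small left-skewed data set. The numbers below are made up for illustration and are not the data behind the slide's figure.

```r
# Hypothetical left-skewed data: most values lie between 1 and 5, two are far below
x <- c(-9, -6, 1, 2, 2, 3, 3, 3, 4, 4, 5)

# boxplot() draws the plot and returns the statistics it used
b <- boxplot(x, main = "Left-skewed example (made-up data)")

b$out                  # points drawn individually as outliers: -9 and -6
mean(x) < median(x)    # TRUE: the long left tail pulls the mean below the median
```

The outliers are the points lying more than 1.5 times the box length beyond the box, which is R's default rule for boxplot().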
5. Graphics
c) Scatterplot: plot(x,y)
  x = rnorm(20)
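plot(x, y) needs a second vector y of the same length as x. The slide text does not show how y is defined, so the sketch below assumes a hypothetical y that depends linearly on x plus random noise; the seed, slope, noise level, and labels are all illustrative choices.

```r
set.seed(251)                       # arbitrary seed for reproducibility
x <- rnorm(20)

# Hypothetical y (an assumption, not from the lab): linear in x plus noise
y <- 2 * x + rnorm(20, sd = 0.5)

# Each point in the scatterplot is one (x[i], y[i]) pair
plot(x, y, xlab = "x", ylab = "y", main = "Scatterplot of y against x")
```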