Published by Loraine Stafford. Modified over 9 years ago.
1
R objects
All R entities exist as objects, and they can all be operated on as data.
We will cover: vectors, factors, lists, data frames, tables, indexing, and R packages and datasets.
2
Vectors
Think of a vector as equivalent to a single column of numbers in a spreadsheet.
You can create a vector using the c( ) function (concatenate), with the values placed between the brackets: x <- c( )
For example, x <- c(1, 2, 4, 8) creates a column of the numbers 1, 2, 4, 8.
3
Vectors
Other ways of creating columns of numbers (vectors):
The seq function:
    seq(1, 10, 1) gives 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
    seq(1, 4, 0.5) gives 1, 1.5, 2, 2.5, 3, 3.5, 4
The x:y operator:
    1:10 gives 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
    2 * 1:10 gives 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
The rep function:
    rep(2, 4) gives 2, 2, 2, 2
For help on these functions, type ?seq or ?rep
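The constructors listed above can be tried side by side. This is a minimal sketch; the variable names a, b, d, e and f are chosen here for illustration only.

```r
# Several ways to build a vector
a <- c(1, 2, 4, 8)        # explicit values
b <- seq(1, 4, by = 0.5)  # from 1 to 4 in steps of 0.5
d <- 1:10                 # the integers 1 to 10
e <- 2 * 1:10             # ':' is evaluated before '*', so this doubles 1:10
f <- rep(2, times = 4)    # the value 2 repeated 4 times

print(b)  # 1.0 1.5 2.0 2.5 3.0 3.5 4.0
print(e)  # 2 4 6 8 10 12 14 16 18 20
print(f)  # 2 2 2 2
```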
4
Indexing
Referencing (indexing) specific 'cells' in a column.
Example: if x is the vector 1, 2, 5, then x[1] is 1, x[2] is 2, x[3] is 5, and:
    x[1:2] gives 1, 2 (the first two listed items in x)
    x[2:3] gives 2, 5 (the 2nd and 3rd listed items in x)
    x[x > 2] gives 5 (selection using the '>' and '<' characters)
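The indexing forms above can be checked at the prompt. The sketch below also prints the logical vector that x > 2 produces, since that is what does the selecting in the last form.

```r
x <- c(1, 2, 5)

x[1]      # first element: 1
x[1:2]    # first two elements: 1 2
x[2:3]    # second and third elements: 2 5

x > 2     # a logical vector: FALSE FALSE TRUE
x[x > 2]  # keeps only the elements where the condition is TRUE: 5
```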
5
Performing simple operations on vectors
In R, when you carry out simple operations (+ - * /) on vectors that have the same number of entries, R performs the operation on the numbers in the vectors entry by entry.
If the vectors do not have the same number of entries, R recycles the shorter vector, cycling through its entries again until the longer vector is used up.
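A small sketch of both cases, using made-up vectors a, b and short (names chosen for illustration):

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30, 40)

a + b   # entry by entry: 11 22 33 44
a * b   # entry by entry: 10 40 90 160

short <- c(1, 2)
a + short   # short is recycled as 1 2 1 2, giving 2 4 4 6
```

If the length of the longer vector is not a multiple of the length of the shorter one, R still recycles but issues a warning.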
9
Performing simple operations on vectors
New vectors (columns of numbers) can be built by putting together other vectors with c( ).
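As an illustration of building a vector from other vectors, here is a short sketch with made-up values:

```r
x <- c(1, 2, 4, 8)
y <- c(10, 20)

# c() flattens its arguments into one vector, in the order given
z <- c(x, y, 100)

z          # 1 2 4 8 10 20 100
length(z)  # 7
```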
10
Functions
R functions take arguments (information that you put into the function, which goes between the brackets) and can perform a range of tasks.
In the case of the 'help' function, the task is to display information from the R documentation files.
A comprehensive list of R functions can be obtained from the R reference manual under the Help menu.
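Arguments can be passed by position or by name. The sketch below uses mean() and round(), both part of base R, on made-up data.

```r
x <- c(1.1, 4.3, 5, 2, 1, 4, 9.5)

m <- mean(x)                # one argument: the data

r1 <- round(m, 2)           # second argument given by position
r2 <- round(m, digits = 2)  # the same argument given by name

print(r1)                   # 3.84
identical(r1, r2)           # TRUE

# help(round) or ?round displays the documentation for round()
```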
11
Simple statistic functions
R comes with some useful functions:
    sqrt( )   square root
    mean( )   arithmetic mean
    hist( )   calculating and plotting histograms
R also comes with pre-loaded datasets, which we'll discuss later.
12
Basic statistic functions on vectors
    > X1 <- c(1.1, 4.3, 5, 2, 1, 4, 9.5)
    > sum(X1)       # sum = 26.9
    > mean(X1)      # mean = 3.842857
    > median(X1)    # median = 4
    > var(X1)       # variance = 8.762857
    > sd(X1)        # standard deviation = 2.960212
    > summary(X1)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1.000   1.550   4.000   3.843   4.650   9.500
    > quantile(X1)
       0%   25%   50%   75%  100%
     1.00  1.55  4.00  4.65  9.50
13
Mixing vectors and scalars
R has the very convenient feature of having operators that work with vectors. It is even possible to mix vectors and scalars. For example:
    > X1 <- c(1.1, 4.3, 5, 2, 1, 4, 9.5)
    > X1 + 1
    [1]  2.1  5.3  6.0  3.0  2.0  5.0 10.5
    > X1 * 2
    [1]  2.2  8.6 10.0  4.0  2.0  8.0 19.0
14
Vectors to record data
    > x = c(45,43,46,48,51,46,50,47,46,45)
    > length(x)
    [1] 10
    > x = c(x,48,49,51,50,49)       # append values to x
    > length(x)
    [1] 15
    > x[16] = 41                    # add to a specified index
    > length(x)
    [1] 16
    > mean(x)
    [1] 47.1875
    > x[17:20] = c(40,38,35,40)     # add to many specified indices
    > length(x)
    [1] 20
    > mean(x)
    [1] 45.4
15
Factors
A factor is a vector that encodes information about the group to which a particular observation belongs.
Categorical data is often used to classify observations into groups, known as the levels of the factor.
Making a factor is easy, using the factor function.
16
Factors – smoking survey example
A survey asks people if they smoke or not. The data is: Yes, No, No, Yes, Yes
    > x=c("Yes","No","No","Yes","Yes")
    > x                 # print out values in x
    [1] "Yes" "No"  "No"  "Yes" "Yes"
    > factor(x)         # print out value in factor(x)
    [1] Yes No  No  Yes Yes
    Levels: No Yes      # notice levels are printed.
Notice the difference in how R treats factors with this example.
17
Factors – student height example
Suppose the recorded heights of South African and British students are as follows:
    heights <- c(1.7,1.95,1.63,1.54,1.29)
You make a new vector, fac_heights, to record the nationality that each observation pertains to:
    fac_heights <- factor(c("GB", "SA", "GB", "GB", "SA"))
This is useful when testing for differences between groups.
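One way to see why the factor is useful for comparing groups is to compute a summary per level. The sketch below uses tapply() from base R; choosing the mean as the summary is an assumption made here for illustration, not something the slide specifies.

```r
heights <- c(1.7, 1.95, 1.63, 1.54, 1.29)
fac_heights <- factor(c("GB", "SA", "GB", "GB", "SA"))

levels(fac_heights)   # "GB" "SA"

# Apply mean() to the heights within each level of the factor
group_means <- tapply(heights, fac_heights, mean)
print(group_means)
```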
18
Factors – gender survey example
Consider a survey that has data on 691 females and 692 males:
    > gender <- c(rep("female",691), rep("male",692))   # create vector
    > gender <- factor(gender)                          # change vector to factor
Once stored as a factor, the space required for storage can be reduced.
The values "female" and "male" are the levels of the factor:
    > levels(gender)    # assumes gender is a factor
    [1] "female" "male"
Internally, the factor 'gender' is stored as 691 1's followed by 692 2's. It has stored with it a table that maps each code to its level:
    1  female
    2  male
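The internal integer codes can be inspected with as.integer(). This is a sketch for exploring the representation; the variable name codes is chosen here for illustration.

```r
gender <- factor(c(rep("female", 691), rep("male", 692)))

codes <- as.integer(gender)   # the integer codes behind the factor
head(codes)                   # 1 1 1 1 1 1
table(codes)                  # how many 1's and how many 2's

# Each code is a position in levels(gender)
levels(gender)[codes[c(1, 692)]]   # "female" "male"
```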
19
Lists
A set of objects (e.g. vectors) can be combined under a single name as a list (similar to a spreadsheet in Excel).
Example:
    x <- c(1, 7, 8, 9, 10)
    y <- c("red", "yellow", "blue", "green")
    example_list <- list(size = x, colour = y)
Note: a vector can consist of characters (i.e. letters/words) instead of numbers, but never numbers AND characters. If you mix them, R converts the numbers to characters.
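A brief sketch of how the elements of such a list can be retrieved, and of what happens when numbers and characters are mixed in one vector (the name mixed is chosen here for illustration):

```r
x <- c(1, 7, 8, 9, 10)
y <- c("red", "yellow", "blue", "green")
example_list <- list(size = x, colour = y)

example_list$size          # the numeric vector stored under 'size'
example_list[["colour"]]   # the same kind of access, by name in double brackets
example_list$colour[2]     # "yellow"
length(example_list)       # 2: the list has two elements of different lengths

# Mixing numbers and characters in c() coerces everything to character
mixed <- c(1, "red")
class(mixed)               # "character"
```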
20
Data frames
A data frame, created with the function data.frame( ), is a special kind of list in which the entries at a given position in each element of the list correspond to one another.
Each element of the list has the same length.
The result is a rectangular table, with rows and columns.
21
Data frames
Example 1: simple data frames can be created. Enter the following information at the prompt line:
    h <- c(150, 170, 168, 179, 130)
    w <- c(65, 70, 72, 80, 51)
    patient_data <- data.frame(weight=w, height=h)
Type in patient_data to see what's just been created.
22
Access of elements in data frames
Individual elements can be accessed using a pair of square brackets "[ ]" and by specifying their index, or name.
Here are some ways to access a cell, row or column:
    patient_data$height       accesses a column by name
    patient_data[, i]         accesses the i th column
    patient_data[i, ]         accesses the i th row
    patient_data$height[i]    accesses the i th cell in the height column
    patient_data[i, j]        accesses the cell in the i th row and j th column
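These access forms can be tried on the patient_data example. The sketch rebuilds the data frame so that it runs on its own, and picks row 3 and column 2 arbitrarily.

```r
h <- c(150, 170, 168, 179, 130)
w <- c(65, 70, 72, 80, 51)
patient_data <- data.frame(weight = w, height = h)

patient_data$height      # the height column, by name
patient_data[, 2]        # the 2nd column (height), by position
patient_data[3, ]        # the 3rd row, returned as a one-row data frame
patient_data$height[3]   # 3rd cell of the height column: 168
patient_data[3, 2]       # row 3, column 2: also 168
```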
23
Data frames
More complex tables can be created.
Data within each column must have the same type (e.g., number, text), but different columns may have different types, like a spreadsheet.
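A minimal sketch of a data frame whose columns have different types. The column names and values are hypothetical, invented here purely for illustration.

```r
# Hypothetical data: one text, one numeric and one logical column
df <- data.frame(
  name   = c("A", "B", "C"),
  age    = c(34, 51, 28),
  smoker = c(TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE   # keep 'name' as text in every R version
)

str(df)            # shows the type of each column
sapply(df, class)  # "character" "numeric" "logical"
```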
24
Data frames
Accessing specific cells, or data:
Note: "$" is a shortcut for selecting a column by name; a minus sign "-" in front of an index means "not", i.e. everything except that row or column.
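A short sketch of negative indices and the "$" shortcut, reusing the patient_data values from the earlier slide (rebuilt here so the block runs on its own). The 160 cut-off is an arbitrary value chosen for illustration.

```r
patient_data <- data.frame(
  weight = c(65, 70, 72, 80, 51),
  height = c(150, 170, 168, 179, 130)
)

patient_data[-1, ]   # every row except the first
patient_data[, -2]   # every column except the second; with only one column
                     # left, R returns it as a plain vector

# '$' combined with a logical condition selects matching rows
tall <- patient_data[patient_data$height > 160, ]
print(tall)
```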
25
Tables
We often view categorical data with tables.
The table function allows us to look at tables.
Its simplest usage is table(x), where x is a categorical variable.
26
Tables
Example: smoking survey
A survey asks people if they smoke or not. The data is: Yes, No, No, Yes, Yes
    > x=c("Yes","No","No","Yes","Yes")
    > table(x)
    x
     No Yes
      2   3
The table command simply adds up the frequency of each unique value of the data.
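The result of table() can be stored and indexed by name like a named vector. The sketch below also turns the counts into proportions, which is an addition made here for illustration.

```r
x <- c("Yes", "No", "No", "Yes", "Yes")

counts <- table(x)
counts["Yes"]         # the count for "Yes": 3
counts["No"]          # the count for "No": 2

counts / length(x)    # proportions: No 0.4, Yes 0.6
```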
27
R packages and datasets
View a list of R packages: library()
Access datasets with the data function:
    data( )                 provides a list of the available datasets
    data(Titanic)           loads the Titanic dataset
    summary(Titanic)        provides summary information about the Titanic dataset
    attributes(Titanic)     provides more information
    Titanic                 typing the dataset name will display the data
List all datasets in a package, e.g., data(package='datasets')
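A quick sketch of inspecting the Titanic dataset once it is loaded. It only queries the object's structure, using base R functions.

```r
data(Titanic)              # load the dataset into the workspace

class(Titanic)             # "table": a multi-way table of counts
dim(Titanic)               # the number of categories along each dimension
names(dimnames(Titanic))   # the names of the dimensions
```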
28
Working through some examples
List preloaded datasets in R: data( )
Display the "women" dataset: women
Now let's access specific data.
Access data from each column:
    women$height or women[,1]
    women$weight or women[,2]
Access data from individual rows:
    women[1, ] or women[10, ] etc.
Try it.
29
Working through some examples
Now that you can access sample data, let's work with it: get the mean weight and height of the women in our example.
Remember the help function: help(mean)
Also, R can show an example: example(mean)
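One possible way to do the exercise is sketched below. colMeans() is a base R function that is not mentioned on the slide; it is included here as an alternative that handles both columns at once.

```r
# Column by column, using the '$' shortcut
mean_height <- mean(women$height)
mean_weight <- mean(women$weight)
print(mean_height)
print(mean_weight)

# Both columns in one call
colMeans(women)
```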
30
Common useful functions
    print()     # prints a single R object
    cat()       # prints multiple objects, one after the other
    length()    # number of elements in a vector, or of a list
    mean()
    median()
    range()
    unique()    # gives the vector of distinct values
    sort()      # sort elements into order
    order()     # x[order(x)] orders elements of x
    rev()       # reverse the order of vector elements
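A few of these functions applied to a small made-up vector, as a quick sketch of what each returns:

```r
x <- c(3, 1, 2, 3, 1)

unique(x)     # distinct values in order of first appearance: 3 1 2
sort(x)       # 1 1 2 3 3
order(x)      # positions that would sort x: 2 5 3 1 4
x[order(x)]   # same result as sort(x)
rev(x)        # 1 3 2 1 3
range(x)      # smallest and largest value: 1 3

cat("length:", length(x), "\n")   # cat() prints several objects in a row
```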