Jefferson Davis Research Analytics

1 R Bootcamp Day 2
Jefferson Davis Research Analytics

2 Day 1 stuff From yesterday:
R values have types/classes such as numeric, character, and logical.
Much of R's functionality is in libraries.
For help on a function, run ?sin from the R console.
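
As a quick refresher on types, the sketch below checks the class of a few values with class(). The values and the name mixed are illustrative and not from the slides.

```r
# Inspect the class of a few simple values
class(3.14)      # numeric
class("robin")   # character
class(TRUE)      # logical

# A vector holds a single type, so mixing types coerces every entry
# to the most general one (here, character)
mixed <- c(1, "two", TRUE)
class(mixed)
```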

3 R: Creating variables We will spend the next few slides building up the ornithological data set below

4 R: Creating variables Start by creating each column as a vector.
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
Tarsus <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
Head <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
Wt <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)

5 R: Creating variables We can access values in the vector by indexing with brackets.
Wingcrd[1]
[1] 59
Wingcrd[3]
[1] 53.5
Wingcrd[1:3]
[1] 59.0 55.0 53.5
Wingcrd[-2]
[1] 59.0 53.5 55.0 52.5 57.5 53.0 55.0
sum(Wingcrd)
[1] 440.5
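
Brackets also accept a logical condition instead of positions. A minimal sketch, using the Wingcrd values from the slides; the cutoff of 55 and the names long_wings and long_positions are arbitrary choices for illustration.

```r
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)

# Keep only the entries strictly greater than 55
long_wings <- Wingcrd[Wingcrd > 55]
long_wings

# which() reports the positions where the condition holds
long_positions <- which(Wingcrd > 55)
long_positions
```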

6 R: Creating variables We have some variables now. Find the following values:
The 3rd entry in Wt
The sum of the entries in Tarsus
The sum of the entries in Head

7 R: Creating variables Note that the missing value in Head cascades through other calculations.
sum(Head)
[1] NA
We can work around this if we want to.
sum(Head, na.rm = TRUE)
[1] 216.1
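
The same na.rm argument is available in other summary functions such as mean(), and is.na() locates the missing entries. A short sketch using the Head vector from the slides; the names n_missing, head_mean, and head_complete are made up for this example.

```r
Head <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)

# is.na() flags missing entries; summing the flags counts them
n_missing <- sum(is.na(Head))

# mean() also returns NA unless missing values are dropped
mean(Head)
head_mean <- mean(Head, na.rm = TRUE)

# Alternatively, drop the missing entries explicitly
head_complete <- Head[!is.na(Head)]
```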

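8 R: Creating Dataframes
To assemble the data together we can bind our vectors column-wise with cbind().
BirdDF <- cbind(Wingcrd, Tarsus, Head, Wt)
BirdDF
     Wingcrd Tarsus Head   Wt
[1,]    59.0   22.3 31.2  9.5
[2,]    55.0   19.7 30.4 13.8
[3,]    53.5   20.8 30.6 14.8
[4,]    55.0   20.3 30.3 15.2
[5,]    52.5   20.8 30.3 15.5
[6,]    57.5   21.5 30.8 15.6
[7,]    53.0   20.6 32.5 15.6
[8,]    55.0   21.5   NA 15.7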

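9 R: Creating Dataframes
For typing reasons we use as.data.frame() to make sure we have a dataframe.
BirdDF <- as.data.frame(cbind(Wingcrd, Tarsus, Head, Wt))
BirdDF
  Wingcrd Tarsus Head   Wt
1    59.0   22.3 31.2  9.5
2    55.0   19.7 30.4 13.8
3    53.5   20.8 30.6 14.8
4    55.0   20.3 30.3 15.2
5    52.5   20.8 30.3 15.5
6    57.5   21.5 30.8 15.6
7    53.0   20.6 32.5 15.6
8    55.0   21.5   NA 15.7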
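
To see why the type matters, the sketch below compares the result of cbind() with a dataframe built directly by data.frame(). The names BirdMat and Heavy, and the cutoff of 15, are hypothetical and only serve the illustration.

```r
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
Tarsus <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
Head <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
Wt <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)

# cbind() on numeric vectors returns a matrix, not a dataframe
BirdMat <- cbind(Wingcrd, Tarsus, Head, Wt)
is.matrix(BirdMat)

# data.frame() builds the dataframe in one step
BirdDF <- data.frame(Wingcrd, Tarsus, Head, Wt)
is.data.frame(BirdDF)

# Unlike a matrix, a dataframe can hold columns of different types
BirdDF$Heavy <- BirdDF$Wt > 15
str(BirdDF)
```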

10 R: Creating Dataframes
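We can pick out values by specifying rows and columns.
BirdDF[1, 2]        # row 1, column 2: the first Tarsus value, 22.3
BirdDF[1, 2:4]      # row 1, columns 2 to 4: Tarsus 22.3, Head 31.2, Wt 9.5
BirdDF[, 3]         # all of column 3 (Head), including the NA
BirdDF[1, c(2, 3)]  # row 1, columns 2 and 3: Tarsus 22.3, Head 31.2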

11 R: Creating Dataframes
Now the column names give you access to the data.
BirdDF$Wingcrd
[1] 59.0 55.0 53.5 55.0 52.5 57.5 53.0 55.0

12 R: Creating Dataframes
Keep in mind that dataframes look like matrices, but they aren't.
BirdDF %*% t(BirdDF)
Error in BirdDF %*% t(BirdDF) : requires numeric/complex matrix/vector arguments
Casting from one datatype to the other is easy.
as.matrix(BirdDF)
as.data.frame(as.matrix(BirdDF))
The R code below works, although it's a bit pointless.
as.matrix(BirdDF) %*% t(as.matrix(BirdDF))

13 R: Importing Data Most data sets will be created outside of R and imported. We create a csv file to import by writing the data to the disk.
getwd()
[1] "/Users/majdavis"
write.table(BirdDF, "BirdDF.csv", sep=",")
BirdDF2 <- read.csv("BirdDF.csv", header = TRUE)
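
A round trip through a file is an easy way to check an import. The sketch below is one possible version, assuming a small stand-in dataframe and a temporary file from tempfile() rather than the working directory; write.csv() with row.names = FALSE is used here in place of write.table() so the file has no row-name column.

```r
# A small stand-in for the bird data, including a missing value
BirdDF <- data.frame(Wingcrd = c(59, 55, 53.5),
                     Head = c(31.2, 30.4, NA))

# Write to a temporary file so nothing in the working directory is touched
csv_path <- tempfile(fileext = ".csv")
write.csv(BirdDF, csv_path, row.names = FALSE)

# Read the file back and compare with the original
BirdDF2 <- read.csv(csv_path, header = TRUE)
all.equal(BirdDF, BirdDF2)
```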

14 R: Altering Data You will need to work with your data. This can be done at a low level.
BirdDF2 <- BirdDF
BirdDF2$Wt <- BirdDF2$Wt * 28 #ounces to grams
The package plyr and its relatives can be useful.
library(plyr)
BirdDF2 <- mutate(BirdDF, Wt = -Wt)
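
If plyr is not installed, base R's transform() covers the simple cases in a similar style. A minimal sketch on a two-row stand-in dataframe; the column name WtGrams is invented for the example, and the factor of 28 follows the slide.

```r
# A small stand-in for the bird data
BirdDF <- data.frame(Tarsus = c(22.3, 19.7), Wt = c(9.5, 13.8))

# transform() returns a modified copy; the original is left untouched
BirdDF2 <- transform(BirdDF, WtGrams = Wt * 28)
BirdDF2
```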

15 R: Altering Data This is handy with the forward pipe operator %>% from the magrittr library.
library(magrittr)
BirdDF3 <- BirdDF %>% mutate(Wt = Wt*28) %>% tail(2)
#Compare to
BirdDF3 <- tail(mutate(BirdDF, Wt = Wt*28), 2)

16 R: Plotting Data The most basic plotting command in R is plot()
As a high-level function it will create axes, tick marks, etc. Many user-written classes will have default plot() functions that act reasonably

17 R: Plotting Data We'll explore this with a dataset of car speeds and stopping distances from the 1920s, available in R as cars (Ezekiel, M. (1930) Methods of Correlation Analysis. Wiley.)
head(cars)
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
6     9   10

18 R: Plotting Data By default plot() produces a scatterplot plot(cars)
Axis labels are from the names in the data frame Axis scale is from the range of the data

19 R: Plotting Data
plot(cars$speed, cars$dist, main = "A Title", xlab = "The Speeds", ylab = "The Distances", col = "steelblue")

20 R: Plotting Data Each plot() call creates its own axes, labels, etc.
Combining plot() calls can be messy.
plot(cars)
par(new=TRUE)
plot(lowess(cars), type="l", col="red")
The graph's labels are clearly bad. While not obvious, the scales don't match either.

21 R: Plotting Data To add details it's better to use so-called low-level functions.
plot(cars)
lines(lowess(cars), col="red")
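
Other low-level functions work the same way: each one draws on top of the existing plot instead of starting a new one. The sketch below layers a few of them on the cars scatterplot; the horizontal reference line at the mean distance and the legend text are illustrative choices, not part of the original slides.

```r
# High-level call: sets up axes, labels, and points
plot(cars, main = "Stopping distance vs speed")

# Low-level calls: add to the plot that is already open
smooth <- lowess(cars)
lines(smooth, col = "red")                 # smoothed trend
abline(h = mean(cars$dist), lty = 2)       # dashed line at the mean distance
legend("topleft",
       legend = c("lowess", "mean dist"),
       col = c("red", "black"),
       lty = c(1, 2))
```

Because the axes come from the single plot() call, the added lines share its scale and labels.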

22 R: Plotting Data The default plot type for a dataframe is a scatterplot. With more than two columns, plot() draws a scatterplot for every pair of columns.
plot(BirdDF)

23 R: Plotting Data For dataframes with many columns the scatterplots can be a bit much.
plot(mtcars)
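
One way to tame a crowded plot is to select a few columns before plotting. A minimal sketch; the three columns chosen here (mpg, hp, wt) and the name mtcars_small are arbitrary picks for illustration.

```r
# Keep only three of the mtcars columns
mtcars_small <- mtcars[, c("mpg", "hp", "wt")]

# The scatterplot matrix now has 3 x 3 panels
plot(mtcars_small)
```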

24 R: Day 2 end There are a handful of exercises on Datacamp. Have fun and see you tomorrow!
