Published by Randell Sullivan. Modified over 9 years ago.
Working In R: An Introduction (with speaker notes)
By Danny K. Hunt & Eric D. Stolen
UCF, Methods in Ecology, Fall 2008

What We Will Learn
  – More About Dataframes
  – Slicing, Dicing and Sorting Data
  – Manipulating Dataframes & Aggregating Data
  – Simple Iterative Processing
  – Basic Data Visualization

More About Dataframes
Dataframes Reviewed
  – Are objects that consist of rows and columns
  – Closely related to MS Access or Excel tables
  – Specific requirements for dataframes
      Observations are in rows
      The response variable and explanatory variables are in columns
      Same variable results go into the same column

More About Dataframes
Dataframes Reviewed
  – Specific requirements for dataframes (continued)
      Observations are in rows
      The response variable and explanatory variables are in columns

      Observation   Response   Treatment
      Obs1          1.5        A
      Obs2          1.6        B
      Obs3          1.3        C
      …             …          …

More About Dataframes
Dataframes Reviewed
  – Specific requirements for dataframes (continued)
      Same variable results go into the same column

      One column per treatment (the response is spread across three columns):

      Control   X1    X2
      1.3       1.4   1.3
      1.1       1.3   1.6
      1.4       1.5   1.8

      Good dataframe conventions (one Response column, one Treatment column):

      ID   Response   Treatment
      1    1.3        Control
      2    1.1        Control
      3    1.4        Control
      4    1.4        X1
      5    1.3        X1
      6    1.5        X1
      7    1.3        X2
      8    1.6        X2
      9    1.8        X2
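Moving from the one-column-per-treatment layout to the one-row-per-observation layout can be sketched in base R. This is a minimal illustration that reuses the values from the slide; the object names `wide` and `long` are hypothetical, and stack() is only one of several ways to do the reshaping.

```r
# Wide layout: one column per treatment (values copied from the slide).
wide <- data.frame(Control = c(1.3, 1.1, 1.4),
                   X1      = c(1.4, 1.3, 1.5),
                   X2      = c(1.3, 1.6, 1.8))

# stack() puts every value in one column and the source column name in another.
long <- stack(wide)
names(long) <- c("Response", "Treatment")

# Add an ID column and reorder the columns to match the long table.
long$ID <- seq_len(nrow(long))
long <- long[, c("ID", "Response", "Treatment")]
long
```

Each row of `long` is now a single observation, which is the layout the filtering and aggregation functions later in the talk expect.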

More About Dataframes
Loading Dataframes with Data
  – Use the read.table() series of functions
      Particularly simple and efficient is the read.delim() function
  – Load Snake dataset and get acquainted
      snake <- read.delim("c:\\pract_i-t\\snake_data.txt", row.names="ID")
      names(snake)      # column names
      snake[1:5,]       # first five rows
      summary(snake)    # per-column summary
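For readers without the course files, the same loading steps can be tried on a small stand-in file. This is a sketch under stated assumptions: the column names are borrowed from later slides, but every value below is invented, and `demo` and `tmp` are hypothetical names.

```r
# Write a tiny tab-delimited file to a temporary location (invented values).
tmp <- tempfile(fileext = ".txt")
writeLines(c("ID\tName\tSEX\tlandc\tmcp\ttimes",
             "1\tmolly\t1\t1\t250.5\t45",
             "2\tsam\t2\t3\t120.0\t80",
             "3\tmax\t2\t2\t310.2\t30"), tmp)

# read.delim() assumes a header row and tab separators;
# row.names = "ID" uses the ID column as row labels instead of data.
demo <- read.delim(tmp, row.names = "ID")

names(demo)      # column names (ID is no longer a column)
demo[1:2, ]      # first two rows
summary(demo)    # per-column summary
str(demo)        # storage type of each column

unlink(tmp)      # remove the temporary file
```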

More About Dataframes
Loading Dataframes with Data (continued)
  – Identifying columns as categorical variables
      During read operations, alphanumeric fields were automatically encoded as factors in R versions before 4.0.0; from R 4.0.0 on they are read as character strings unless stringsAsFactors = TRUE is set
      Numeric fields are automatically assumed to be continuous values
      Identify numeric fields as factors using:
        snake$landc <- factor(snake$landc)
        snake$SEX <- factor(snake$SEX)
        summary(snake)
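The effect of factor() on a numerically coded column is easiest to see by comparing summary() before and after the conversion. A minimal sketch, assuming an invented data frame `demo` whose `landc` column mimics a land-cover code:

```r
# Invented data: landc is a category code stored as a number.
demo <- data.frame(Name  = c("molly", "sam", "max", "ivy"),
                   landc = c(1, 3, 1, 2),
                   stringsAsFactors = FALSE)

summary(demo$landc)               # numeric summary (min, quartiles, mean, max)

demo$landc <- factor(demo$landc)  # declare the codes to be categories

summary(demo$landc)               # now a count of rows per level
levels(demo$landc)                # the distinct categories, as text
```

After the conversion, modelling and plotting functions treat `landc` as a grouping variable and not as a measurement.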

Slicing, Dicing and Sorting Data
Subscripts
  – Performing data extraction by indexing:
      snake[1:5,]           # rows 1 to 5, all columns
      snake[40,]            # row 40, all columns
      snake[,1]             # first column, all rows
      snake[,c(7, 1, 2)]    # columns 7, 1 and 2, in that order
      snake[-c(10, 20, 30, 40), c("Name", "SEX", "ha")]    # drop rows 10, 20, 30 and 40; keep three named columns
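The indexing forms above can be tried on a few invented rows. This sketch assumes a hypothetical data frame `demo`; it also shows the `drop = FALSE` argument, which is not on the slide but matters when a single column is selected.

```r
# Invented data with three of the column names used on the slide.
demo <- data.frame(Name = c("molly", "sam", "max", "ivy", "moe"),
                   SEX  = c(1, 2, 2, 1, 2),
                   ha   = c(10.2, 4.5, 7.7, 3.1, 8.8))

demo[1:2, ]                      # rows 1 and 2, all columns
demo[, c(3, 1)]                  # columns 3 and 1, in that order
kept <- demo[-c(2, 4), c("Name", "ha")]   # drop rows 2 and 4, keep two columns

# Selecting one column returns a plain vector by default ...
first_col_vec <- demo[, 1]
# ... unless drop = FALSE is given, which keeps the data frame structure.
first_col_df <- demo[, 1, drop = FALSE]
```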

Slicing, Dicing and Sorting Data
Subscripts (continued)
  – Performing data extraction by using filtering:
      snake[snake$landc == 1,]                       # rows where landc equals 1
      snake[snake$landc %in% c(1, 3),]               # rows where landc is 1 or 3
      snake[snake$mcp > 200,]                        # rows where mcp exceeds 200
      snake[snake$mcp > 200 & snake$times < 60,]     # both conditions must hold
      snake[grep("^m", snake$Name),]                 # rows whose Name starts with "m"
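Each filter builds a logical vector (or, for grep(), a vector of row numbers) and uses it as the row subscript. A minimal sketch on invented data; `demo` and the result names are hypothetical, and grepl() is swapped in for grep() only to show the logical-vector form.

```r
# Invented data using column names from the slide.
demo <- data.frame(Name  = c("molly", "sam", "max", "ivy", "moe"),
                   landc = c(1, 3, 1, 2, 3),
                   mcp   = c(250, 120, 310, 90, 205),
                   times = c(45, 80, 30, 70, 65),
                   stringsAsFactors = FALSE)

# Membership test: keep rows whose landc is 1 or 3.
by_class <- demo[demo$landc %in% c(1, 3), ]

# Two conditions combined with & (element-wise AND).
big_and_short <- demo[demo$mcp > 200 & demo$times < 60, ]

# Pattern match: grepl() returns TRUE/FALSE per row, grep() returns row numbers.
starts_m <- demo[grepl("^m", demo$Name), ]
```

The trailing comma inside the brackets is what makes the condition select rows and keep all columns.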

Slicing, Dicing and Sorting Data
Sorting Data
      snake[order(snake$Name),]                   # ascending by Name
      snake[order(snake$landc, -snake$mcp),]      # by landc, then descending mcp within each landc
      snake[order(-snake$times),][1:10,]          # the ten rows with the largest times
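order() returns the row positions that would sort a column, and those positions are then used as the row subscript. A small sketch on invented data, with hypothetical object names:

```r
# Invented data using column names from the slide.
demo <- data.frame(Name  = c("sam", "molly", "ivy", "max"),
                   landc = c(3, 1, 1, 3),
                   mcp   = c(120, 250, 90, 310),
                   stringsAsFactors = FALSE)

# Ascending by Name.
by_name <- demo[order(demo$Name), ]

# Ascending landc, then descending mcp within each landc value.
by_class_then_mcp <- demo[order(demo$landc, -demo$mcp), ]

# The two rows with the largest mcp.
top2 <- demo[order(demo$mcp, decreasing = TRUE), ][1:2, ]
```

The minus sign only works on numeric columns; for text columns, `decreasing = TRUE` is the usual way to reverse the order.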

Manipulating Dataframes & Aggregating Data
Adding Columns to a Dataframe
  – Close out and restart R
  – Load the rapid fish dataset
      rapid <- read.delim("c:\\pract_i-t\\rapid_fish.txt", row.names="ID")
      rapid$impoundment <- factor(rapid$impoundment)
      rapid$season <- factor(rapid$season)
      rapid$open_veg <- factor(rapid$open_veg)
      rapid$sea_code <- factor(rapid$sea_code)
      rapid$imp_code <- factor(rapid$imp_code)
      rapid$cov_code <- factor(rapid$cov_code)
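When several columns need the same conversion, the repeated factor() calls can be written once with lapply(). This is an alternative sketch, not the approach shown on the slide; `demo` and `factor_cols` are hypothetical names and the values are invented.

```r
# Invented data with two coded columns and one measurement.
demo <- data.frame(impoundment = c(1, 1, 2, 2),
                   season      = c(1, 2, 1, 2),
                   count       = c(5, 0, 12, 3))

# Names of the columns that hold categories, not measurements.
factor_cols <- c("impoundment", "season")

# Apply factor() to each listed column and write the results back.
demo[factor_cols] <- lapply(demo[factor_cols], factor)

sapply(demo, class)   # check the resulting column classes
```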

Manipulating Dataframes & Aggregating Data
Adding Columns to a Dataframe (cont.)
  – Add "unique" column
      rapid$unique <- paste(rapid$Point, rapid$season, sep = "")
      rapid$unique <- factor(rapid$unique)
  – Add log transformation of count column
      rapid$lncount <- log(rapid$count + 1)
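Both derived columns can be checked on a few invented rows. In this sketch `demo` is a hypothetical data frame; the comparison with log1p() is an addition for illustration and is not on the slide.

```r
# Invented data using column names from the slide.
demo <- data.frame(Point  = c("A1", "A2", "A1"),
                   season = c(1, 1, 2),
                   count  = c(0, 9, 4),
                   stringsAsFactors = FALSE)

# Join Point and season into one label per point-season combination.
demo$unique <- factor(paste(demo$Point, demo$season, sep = ""))

# Adding 1 before taking the log keeps zero counts finite (log(0) is -Inf).
demo$lncount <- log(demo$count + 1)

# log1p(x) computes log(1 + x) directly and gives the same values.
all.equal(demo$lncount, log1p(demo$count))
```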

Manipulating Dataframes & Aggregating Data
Overview the Rapid Fish Dataset
      names(rapid)
      rapid[1:5,]
      summary(rapid)
Filtered Summaries
      summary(rapid[rapid$open_veg == "open",])
      summary(rapid[rapid$open_veg == "vegetated",])

Manipulating Dataframes & Aggregating Data
Cross Tabulation
  – Using the table() function
      table(rapid$impoundment)
      table(rapid[, c("open_veg", "impoundment")])
      table(rapid[, c("open_veg", "impoundment", "season")])
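table() counts how many rows fall into each category, or into each combination of categories when given more than one column. A minimal sketch on invented data; the impoundment labels "T10" and "T24" are hypothetical.

```r
# Invented data using column names from the slide.
demo <- data.frame(open_veg    = c("open", "open", "vegetated", "vegetated", "open"),
                   impoundment = c("T10", "T24", "T10", "T10", "T10"))

# One-way table: rows per impoundment.
one_way <- table(demo$impoundment)

# Two-way table: rows per open_veg x impoundment combination.
two_way <- table(demo[, c("open_veg", "impoundment")])
two_way

# The same counts in long form, one row per combination.
as.data.frame(two_way)
```

Combinations that never occur still appear in the table, with a count of zero.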

Manipulating Dataframes & Aggregating Data
Data Aggregation
  – Using the aggregate() function
      aggregate(rapid$count, list(impoundment=rapid$impoundment), mean)
      aggregate(rapid$count, list(open_veg=rapid$open_veg, impoundment=rapid$impoundment, season=rapid$season), mean)
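aggregate() splits a column by one or more grouping variables and applies a function to each group. The sketch below uses invented data and hypothetical labels; the formula form is an equivalent alternative that is not on the slide.

```r
# Invented data using column names from the slide.
demo <- data.frame(impoundment = c("T10", "T10", "T24", "T24"),
                   season      = c("wet", "dry", "wet", "wet"),
                   count       = c(4, 8, 10, 20))

# List form, as on the slide: the result column is named "x".
by_imp <- aggregate(demo$count, list(impoundment = demo$impoundment), mean)

# Formula form: the result column keeps the name "count".
by_imp_formula <- aggregate(count ~ impoundment, data = demo, FUN = mean)

by_imp
by_imp_formula
```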

Manipulating Dataframes & Aggregating Data
Merging Results
  – Using the merge() function
      # sample size per open_veg x impoundment x season combination
      n <- table(rapid[, c("open_veg", "impoundment", "season")])
      # mean count per combination
      m <- aggregate(rapid$count, list(open_veg=rapid$open_veg, impoundment=rapid$impoundment, season=rapid$season), mean)
      names(m)[4] <- "mean"
      # standard deviation of count per combination
      s <- aggregate(rapid$count, list(open_veg=rapid$open_veg, impoundment=rapid$impoundment, season=rapid$season), sd)
      names(s)[4] <- "sd"
      # merge() joins on the shared grouping columns; the table's Freq column is renamed to n
      nm <- merge(n, m)
      names(nm)[4] <- "n"
      habimpsea <- merge(nm, s)
      habimpsea
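The same count, mean and standard deviation summary can be followed on a single grouping variable. This is a reduced sketch on invented data, with hypothetical labels and object names; it uses the formula form of aggregate() for brevity.

```r
# Invented data using column names from the slide.
demo <- data.frame(impoundment = c("T10", "T10", "T24", "T24", "T24"),
                   count       = c(4, 8, 10, 20, 30))

# Sample size per group, converted from a table to a data frame.
n <- as.data.frame(table(impoundment = demo$impoundment))
names(n)[2] <- "n"

# Mean and standard deviation per group.
m <- aggregate(count ~ impoundment, data = demo, FUN = mean)
names(m)[2] <- "mean"
s <- aggregate(count ~ impoundment, data = demo, FUN = sd)
names(s)[2] <- "sd"

# merge() joins on the column the data frames share (impoundment).
result <- merge(merge(n, m), s)
result
```

Renaming the value columns before merging matters: if two inputs share a non-key column name, merge() would treat it as part of the join key.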

Simple Iterative Processing
Repeating a Process
  – Using the for() control-flow construct
      site <- sort(unique(rapid$impoundment))
      # seq_along(site) behaves like 1:length(site) but is safe when site is empty
      for (i in seq_along(site)) {
        print(summary(rapid[rapid$impoundment == site[i],]))
      }
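A loop can also collect one result per site instead of printing. A minimal sketch on invented data, with hypothetical labels and object names; the tapply() line is an addition showing a loop-free equivalent.

```r
# Invented data using column names from the slide.
demo <- data.frame(impoundment = c("T24", "T10", "T24", "T10"),
                   count       = c(10, 4, 20, 8),
                   stringsAsFactors = FALSE)

site <- sort(unique(demo$impoundment))

# Pre-allocate one slot per site, then fill it inside the loop.
site_means <- numeric(length(site))
names(site_means) <- as.character(site)

for (i in seq_along(site)) {
  rows <- demo$impoundment == site[i]        # TRUE for rows at this site
  site_means[i] <- mean(demo$count[rows])    # mean count at this site
}
site_means

# Loop-free equivalent for comparison.
tapply(demo$count, demo$impoundment, mean)
```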

Basic Data Visualization
Visualizing Data
  – Using the plot() function
      plot(as.numeric(rapid$sea_code), rapid$count)
  – Using the boxplot() function
      boxplot(rapid$count~rapid$impoundment)
  – Using the hist() function
      par(mfrow=c(1,2))
      hist(rapid$count)
      hist(rapid$lncount)
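The plotting calls can be tried without the course data by simulating counts. This sketch rests on several assumptions: the counts are random Poisson draws, the labels "T10" and "T24" are hypothetical, and the plots are written to a temporary PDF so the code also runs without a display.

```r
set.seed(1)   # make the simulated counts reproducible

# Simulated data: 20 counts for each of two hypothetical impoundments.
demo <- data.frame(impoundment = factor(rep(c("T10", "T24"), each = 20)),
                   count       = rpois(40, lambda = 5))
demo$lncount <- log(demo$count + 1)

out <- tempfile(fileext = ".pdf")
pdf(out)                                     # open a PDF graphics device

boxplot(count ~ impoundment, data = demo)    # one box per impoundment

par(mfrow = c(1, 2))                         # next page: two panels side by side
hist(demo$count)                             # raw counts
hist(demo$lncount)                           # log-transformed counts

dev.off()                                    # close the device and write the file
```

In an interactive session the pdf() and dev.off() lines can be dropped, and the plots appear in the graphics window.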

The End