Published by Teguh Gunawan. Modified over 5 years ago.
1
Data Management Module: Creating, Adding and Dropping Variables
Programming in R
2
Data Management Module
Importing and Exporting
Inputting data directly into R
Creating, Adding and Dropping Variables
Assigning objects
Subsetting and Formatting
Working with SAS Files
Using SQL in R
3
Managing Variables: Accessing Variables
There are two basic ways to access variables:
You can reference the column number (the variable's position in the data set).
You can reference the variable name.
PRIESTLEY/STAT4030
4
Managing Variables: Accessing Variables
Data frames in R can be indexed like matrices, by row and column (r x c). Strictly speaking, a data frame is a list of equal-length columns rather than a matrix, but the same bracket notation applies.
To access a cell: X[r,c]
To access a column: X[,c]
To access a row: X[r,]
5
Managing Variables: Accessing Variables
For example, this will access all rows of column 1:
fallsurvey[,1]
This will access row 1 and all columns:
fallsurvey[1,]
This will access just the first observation in the first column:
fallsurvey[1,1]
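As a minimal sketch of this bracket indexing, the following uses a small made-up data frame called survey. It is a hypothetical stand-in for fallsurvey, whose actual contents are not shown in these slides. Note that, by default, selecting a single column returns a plain vector, while selecting a single row returns a one-row data frame.

```r
# Hypothetical stand-in for fallsurvey; names and values are made up
survey <- data.frame(
  id     = c(1, 2, 3),
  gpa    = c(3.2, 3.8, 2.9),
  drinks = c(0, 2, 1)
)

col1 <- survey[, 1]    # all rows, column 1 (returned as a vector)
row1 <- survey[1, ]    # row 1, all columns (a one-row data frame)
cell <- survey[1, 1]   # first observation in the first column

# The same column accessed by name instead of by number
gpa_by_name <- survey[, "gpa"]
gpa_dollar  <- survey$gpa
```

Referencing by name is usually safer than referencing by number, because the code keeps working if the column order changes.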
6
Managing Variables - Keep
There are a few different ways to specify the variables to keep in the data set:
We can keep specific columns by column number.
We can also keep columns by specifying their names.
7
Managing Variables - Keep
# Keep based on column number
fallsurvey1 <- fallsurvey[,1:2]
head(fallsurvey1)
str(fallsurvey1)
# Keep based on variable name
fallsurvey2 <- fallsurvey[,c("Sem...Year","Adj.GPA")]
head(fallsurvey2)
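One detail worth sketching here: when only a single column is kept, bracket indexing returns a vector rather than a data frame unless drop = FALSE is supplied. The example below uses a made-up data frame called survey (hypothetical, not the fallsurvey data) to show both keep styles and the single-column case.

```r
# Hypothetical stand-in for fallsurvey; names and values are made up
survey <- data.frame(
  id     = c(1, 2, 3),
  gpa    = c(3.2, 3.8, 2.9),
  drinks = c(0, 2, 1)
)

keep_num  <- survey[, 1:2]              # keep columns 1 and 2 by number
keep_name <- survey[, c("id", "gpa")]   # keep the same columns by name

one_vec <- survey[, "gpa"]                # single column: plain vector
one_df  <- survey[, "gpa", drop = FALSE]  # single column: still a data frame
```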
8
Managing Variables - Drop
Similarly, there are a few different ways to specify the variables to drop from the data set. We can drop specific columns by column number:
# Drop column 2 by number
fallsurvey1 <- fallsurvey[,-2]
head(fallsurvey1)
# Drop columns 2 and 3 by removing their positions from the vector of names
fallsurvey2 <- fallsurvey[,names(fallsurvey)[c(-2,-3)]]
head(fallsurvey2)
While variables CAN be dropped by referencing the variable name, this is more difficult and will be covered later.
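The slides defer dropping by name to a later module. As a hedged preview only, and not necessarily the approach the course goes on to teach, here are three common base R idioms, shown on a made-up data frame called survey (hypothetical; drop_vars is also an illustrative name).

```r
# Hypothetical stand-in for fallsurvey; names and values are made up
survey <- data.frame(
  id     = c(1, 2, 3),
  gpa    = c(3.2, 3.8, 2.9),
  drinks = c(0, 2, 1)
)

drop_vars <- c("gpa")   # names of the columns to drop

# Idiom 1: keep every column whose name is NOT in drop_vars
dropped1 <- survey[, !(names(survey) %in% drop_vars)]

# Idiom 2: compute the names to keep with setdiff()
dropped2 <- survey[, setdiff(names(survey), drop_vars)]

# Idiom 3: assign NULL to the column on a copy of the data
dropped3 <- survey
dropped3$gpa <- NULL
```

A negative sign does not work with character names (survey[, -"gpa"] is an error), which is why dropping by name needs one of these indirect forms.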
9
Managing Variables - Renaming
To rename a variable, use the names() function. Here you call the function on the data set, select the element of the names vector that matches the old column name, and assign the new name:
names(fallsurvey)[names(fallsurvey)=="Sem...Year"] <- "Sem/Year"
str(fallsurvey)
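A small sketch of the renaming pattern on a made-up data frame (the values here are invented; only the two column names come from the slides). It also illustrates a consequence of choosing a name such as "Sem/Year": because it is not a syntactic R name, it must be wrapped in backticks when used with $.

```r
# Hypothetical data; only the column names are taken from the slides
survey <- data.frame(
  Sem...Year = c("F10", "S11"),
  Adj.GPA    = c(3.2, 3.8)
)

# Find the position of the old name, then overwrite it with the new name
names(survey)[names(survey) == "Sem...Year"] <- "Sem/Year"

# Non-syntactic names need backticks with $ ...
sem <- survey$`Sem/Year`

# ... but work as ordinary strings inside brackets
sem_again <- survey[, "Sem/Year"]
```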
10
Managing Variables - New variables
Variables can easily be created from other variables available in the data set.
Variables can also be created in the context of SQL statements.
Care must be taken to ensure the resulting variable appears in the expected location (for example, as a column of the data frame rather than as a separate object in the workspace).
11
Managing Variables - New variables
Think about the difference between these two sets of code…
# Set 1
total.drinks <- fallsurvey$Drinks.before.Noon+fallsurvey$Drinks.after.Noon
head(total.drinks)
ls()
# Set 2
fallsurvey$total.drinks <- fallsurvey$Drinks.before.Noon+fallsurvey$Drinks.after.Noon
head(fallsurvey$total.drinks)
head(fallsurvey)
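To make the difference concrete, here is a minimal sketch on a made-up data frame called survey, with hypothetical columns before and after standing in for the two drinks variables. The first assignment creates a free-standing vector in the workspace; the second stores the result as a new column of the data frame.

```r
# Hypothetical stand-in for fallsurvey; names and values are made up
survey <- data.frame(
  before = c(0, 1, 2),
  after  = c(1, 0, 3)
)

# Version 1: the result is a separate object, not part of survey
total <- survey$before + survey$after
in_df_v1 <- "total" %in% names(survey)   # FALSE

# Version 2: the result becomes a new column at the end of survey
survey$total <- survey$before + survey$after
in_df_v2 <- "total" %in% names(survey)   # TRUE
```

Only the second version travels with the data: subsetting, sorting, or exporting survey carries the total column along, whereas the free-standing vector can silently fall out of step with the rows.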