Data Management Module: Creating, Adding and Dropping Variables

1 Data Management Module: Creating, Adding and Dropping Variables
Programming in R

2 Data Management Module
- Importing and Exporting
- Inputting data directly into R
- Creating, Adding and Dropping Variables
- Assigning objects
- Subsetting and Formatting
- Working with SAS Files
- Using SQL in R

3 Managing Variables: Accessing Variables
There are two basic ways to access variables:
- You can reference the column number (the variable's position in the data set).
- You can reference the variable name.
PRIESTLEY/STAT4030
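A minimal sketch of the two access styles. The fallsurvey data itself is not reproduced in these slides, so `survey` below is a hypothetical stand-in with invented values; only the column names `Sem...Year` and `Adj.GPA` come from later slides, and `Age` is made up.

```r
# Hypothetical stand-in for fallsurvey; all values are invented.
survey <- data.frame(
  Sem...Year = c("Fall 2012", "Fall 2012", "Fall 2013"),
  Adj.GPA    = c(3.2, 2.8, 3.6),
  Age        = c(19, 22, 20)
)

by_number <- survey[, 2]          # second column, referenced by position
by_name   <- survey[, "Adj.GPA"]  # the same column, referenced by name
by_dollar <- survey$Adj.GPA       # the same column again, using the $ operator

identical(by_number, by_name)     # TRUE
identical(by_name, by_dollar)     # TRUE
```

Referencing by name tends to be more robust than referencing by number, because the code keeps working if columns are later added or reordered.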

4 Managing Variables: Accessing Variables
Data frames in R can be indexed like matrices (internally they are lists of equal-length columns). Matrices have the concept of rows and columns (r × c).
- To access a cell: X[r,c]
- To access a column: X[,c]
- To access a row: X[r,]

5 Managing Variables: Accessing Variables
For example, this will access all rows and column 1:
    fallsurvey[,1]
This will access row 1 and all columns:
    fallsurvey[1,]
This will access just the first obs in the first column:
    fallsurvey[1,1]
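The three forms above do not all return the same kind of object, which matters when the result is passed on to other functions. The sketch below uses a small hypothetical data frame (`survey`, with invented values) rather than fallsurvey.

```r
# Hypothetical stand-in for fallsurvey; all values are invented.
survey <- data.frame(Adj.GPA = c(3.2, 2.8, 3.6), Age = c(19, 22, 20))

col1 <- survey[, 1]   # one column: simplified to a plain vector
row1 <- survey[1, ]   # one row: still a data frame
cell <- survey[1, 1]  # one cell: a length-one vector

# drop = FALSE keeps the data frame structure for a single column
col1_df <- survey[, 1, drop = FALSE]

is.data.frame(col1)     # FALSE
is.data.frame(row1)     # TRUE
is.data.frame(col1_df)  # TRUE
```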

6 Managing Variables - Keep
There are a few different ways to specify the variables to keep in the data set:
- We can keep specific columns by the column number.
- We can also keep columns by specifying the name.

7 Managing Variables - Keep
    # Keep based on column number
    fallsurvey1 <- fallsurvey[,1:2]
    head(fallsurvey1)
    str(fallsurvey1)

    # Keep based on variable name
    fallsurvey2 <- fallsurvey[,c("Sem...Year","Adj.GPA")]
    head(fallsurvey2)

8 Managing Variables - Drop
Similarly, there are a few different ways to specify the variables to drop from the data set. We can drop specific columns by the column number, using a negative index:
    # Drop based on column number
    fallsurvey1 <- fallsurvey[,-2]
    head(fallsurvey1)

    # Drop columns 2 and 3 by removing their entries from the vector of names
    fallsurvey2 <- fallsurvey[,names(fallsurvey)[c(-2,-3)]]
    head(fallsurvey2)
While variables CAN be dropped by referencing the variable name, this is more difficult and will be covered later.
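Dropping by name is harder because a negative sign cannot be applied to a character string: an expression such as `-"Adj.GPA"` is an error in R. As a preview only, here is one common approach, which is not necessarily the one the course covers later. The data frame `survey` and its values are hypothetical.

```r
# Hypothetical stand-in for fallsurvey; all values are invented.
survey <- data.frame(
  Sem...Year = c("Fall 2012", "Fall 2013"),
  Adj.GPA    = c(3.2, 3.6),
  Age        = c(19, 20)
)

# Build a logical vector that is TRUE for the columns to keep
drop_vars <- c("Adj.GPA", "Age")
kept <- survey[, !(names(survey) %in% drop_vars), drop = FALSE]
names(kept)

# Alternative: assigning NULL removes a single named column
survey2 <- survey
survey2$Age <- NULL
names(survey2)
```

The `drop = FALSE` argument is there because only one column survives; without it the result would be simplified to a vector.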

9 Managing Variables - Renaming
To rename a variable, use the names function. You call the function on the data set, pick out the element of the names vector that matches the old name, and assign the new name to it:
    names(fallsurvey)[names(fallsurvey)=="Sem...Year"] <- "Sem/Year"
    str(fallsurvey)
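The renaming statement can be read in two steps: the comparison produces a logical vector marking the matching name, and the assignment replaces only that element. The sketch below uses a hypothetical data frame (`survey`, invented values) and also shows that a name containing "/" has to be wrapped in backticks when used with `$`.

```r
# Hypothetical stand-in for fallsurvey; all values are invented.
survey <- data.frame(
  Sem...Year = c("Fall 2012", "Fall 2013"),
  Adj.GPA    = c(3.2, 3.6)
)

# Step 1: which element of the names vector matches the old name?
names(survey) == "Sem...Year"

# Step 2: replace just that element
names(survey)[names(survey) == "Sem...Year"] <- "Sem/Year"
names(survey)

# "/" is not allowed in a plain R name, so quote it with backticks
sem  <- survey$`Sem/Year`
sem2 <- survey[["Sem/Year"]]  # equivalent, using the name as a string
```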

10 Managing Variables - New variables
- Variables can be easily created from other variables available in the data set.
- Variables can be created in the context of SQL statements.
- Care must be taken to ensure the resulting variable appears in the expected location.
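One reading of "expected location" is the column position inside the data set: a variable added with `$` is appended as the last column. The sketch below illustrates that reading on a hypothetical data frame (`survey`, invented values; `Age` is a made-up column) and shows how the columns can be reordered afterwards.

```r
# Hypothetical stand-in for fallsurvey; all values are invented.
survey <- data.frame(
  Drinks.before.Noon = c(0, 1, 2),
  Drinks.after.Noon  = c(3, 0, 4),
  Age                = c(19, 22, 20)
)

# The new variable is appended after the existing columns
survey$total.drinks <- survey$Drinks.before.Noon + survey$Drinks.after.Noon
names(survey)

# Reorder so the total sits next to the variables it was built from
survey <- survey[, c("Drinks.before.Noon", "Drinks.after.Noon",
                     "total.drinks", "Age")]
names(survey)
```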

11 Managing Variables - New variables
Think about the difference between these two sets of code…
    # Set 1
    total.drinks <- fallsurvey$Drinks.before.Noon+fallsurvey$Drinks.after.Noon
    head(total.drinks)
    ls()

    # Set 2
    fallsurvey$total.drinks <- fallsurvey$Drinks.before.Noon+fallsurvey$Drinks.after.Noon
    head(fallsurvey$total.drinks)
    head(fallsurvey)
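The difference can be checked directly. In the first set the result is a free-standing vector in the workspace (which is why `ls()` lists it); in the second it becomes a column of the data set. The sketch below assumes a hypothetical data frame `survey` with invented values in place of fallsurvey.

```r
# Hypothetical stand-in for fallsurvey; all values are invented.
survey <- data.frame(
  Drinks.before.Noon = c(0, 1, 2),
  Drinks.after.Noon  = c(3, 0, 4)
)

# Set 1: a separate vector; the data frame itself is unchanged
total.drinks <- survey$Drinks.before.Noon + survey$Drinks.after.Noon
in_df_before <- "total.drinks" %in% names(survey)   # FALSE
ncol(survey)                                         # still 2

# Set 2: the same values stored as a new column of the data frame
survey$total.drinks <- survey$Drinks.before.Noon + survey$Drinks.after.Noon
in_df_after <- "total.drinks" %in% names(survey)    # TRUE
ncol(survey)                                         # now 3
```

Keeping the variable inside the data frame means it stays aligned with its rows when the data set is later subset or sorted; the free-standing vector does not.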

