STAT 4030 – Programming in R
DATA MANAGEMENT MODULE: Getting Data Into and Out of R
Jennifer Lewis Priestley, Ph.D.
Kennesaw State University
DATA MANAGEMENT MODULE
- Importing and Exporting
- Inputting data directly into R
- Creating, Adding and Dropping Variables
- Assigning objects
- Subsetting and Formatting
- Working with SAS Files
- Using SQL in R
Importing and Exporting: Data types
- Scalar: a single value.
- Vector: a single column (or row) of data. Example: a numeric vector containing the test scores of students.
- Matrix: a collection of vectors, all of which must be of the same data type.
- Data frame: a matrix-like table whose columns can be of different types, for example both numeric and character columns.
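The four data types above can be sketched in a few lines of R. This is a minimal illustration only; the object names and the values (test scores for hypothetical students) are assumptions made for the example.

```r
# Scalar: in R this is simply a vector of length one
score <- 87

# Vector: test scores for five hypothetical students
scores <- c(87, 92, 75, 64, 98)

# Matrix: every element shares one data type (numeric here)
score_matrix <- matrix(c(87, 92, 75, 64, 98, 81), nrow = 3, ncol = 2)

# Data frame: columns may differ in type (character and numeric here)
students <- data.frame(
  student = c("A", "B", "C"),
  score = c(87, 92, 75),
  stringsAsFactors = FALSE
)

# str() shows the type of each column
str(students)
```

Calling str() on any of these objects is a quick way to confirm what type R has assigned.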
Importing and Exporting: Inputting Data in R
The easiest way to add data to R is to code it in…
    names <- c("Bob", "Gene", "Valerie")
    age <- c(30, 40, 19)
    hometown <- c("Dallas, TX", "Little Rock, AR", "Dayton, OH")
Note: How are the categorical values in the vector “names” coded? How are the numeric values in the vector “age” coded?
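One way to explore the questions in the note is to ask R directly how it stored each vector. The sketch below reuses the three vectors from the slide and inspects them with class() and length().

```r
names <- c("Bob", "Gene", "Valerie")
age <- c(30, 40, 19)
hometown <- c("Dallas, TX", "Little Rock, AR", "Dayton, OH")

# Quoted values are stored as character strings
class(names)     # "character"

# Unquoted numbers are stored as numeric values
class(age)       # "numeric"

# The comma inside each quoted hometown is part of the string,
# so the vector still has three elements
length(hometown) # 3
```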
Importing and Exporting: reading data into R
- c(): combines values into a vector (like creating a variable for analysis).
- read.csv: reads in a CSV file.
- read.table: a more generic function that can read files with any delimiter, such as tab-delimited files.
- read.fwf: reads in fixed-width (fixed record) formats.
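The three reader functions listed above can be compared on the same small data set. This is a sketch under stated assumptions: the records are hypothetical, the delimited data is supplied inline through the text argument, and the fixed-width data is written to a temporary file first because read.fwf reads from a file.

```r
# Comma-delimited text, supplied inline through the text argument
csv_data <- read.csv(text = "name,age\nBob,30\nGene,40\nValerie,19",
                     stringsAsFactors = FALSE)

# Tab-delimited text: read.table needs the delimiter and header stated
tab_data <- read.table(text = "name\tage\nBob\t30\nGene\t40\nValerie\t19",
                       sep = "\t", header = TRUE, stringsAsFactors = FALSE)

# Fixed-width records: name in columns 1-7, age in columns 8-9
fwf_path <- tempfile(fileext = ".txt")
writeLines(c("Bob    30", "Gene   40", "Valerie19"), fwf_path)
fwf_data <- read.fwf(fwf_path, widths = c(7, 2),
                     col.names = c("name", "age"),
                     strip.white = TRUE, stringsAsFactors = FALSE)

# All three calls should produce the same two columns
print(csv_data)
print(tab_data)
print(fwf_data)
```

Fixed-width files carry no delimiter, so the widths argument has to describe exactly how many characters each field occupies.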
Importing and Exporting: reading data into R
In order to import data into R, we need to know how the file was created:
- Excel
- SAS
- SPSS
- Delimited
- Fixed field or record
Importing and Exporting: Excel
The easiest approach is to save the Excel workbook as a CSV (comma-delimited) file:
1. Click File > Save As.
2. In the dialog that opens, select “CSV” from the “Save as type” option.
3. Use the read.csv function to get the data into R.
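Importing and Exporting: CSV Files
    # Read the CSV file into a data frame (backslashes in Windows paths must be doubled)
    widge <- read.csv("C:\\Path here\\WidgeOne.csv")
    # Display the first six rows to check the import
    head(widge)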
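Because the read.csv call above depends on a file that lives on a particular machine, here is a self-contained sketch of the same import-and-check pattern. The file name example_widgets.csv and its contents are hypothetical stand-ins, not the contents of WidgeOne.csv. On Windows, R also accepts forward slashes in paths, which avoids doubling the backslashes.

```r
# Write a small hypothetical CSV file to R's temporary directory
csv_path <- file.path(tempdir(), "example_widgets.csv")
writeLines(c("id,weight,color",
             "1,2.5,red",
             "2,3.1,blue",
             "3,1.8,red"),
           csv_path)

# Import the file, then check that it arrived as expected
widgets <- read.csv(csv_path, stringsAsFactors = FALSE)
head(widgets, 2)  # first two rows
str(widgets)      # column names and types
dim(widgets)      # number of rows and columns
```

Checking head(), str() and dim() right after an import is a quick way to catch a wrong delimiter or a missing header row.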
Importing and Exporting: Exporting data
Depending on the file format you want to export to, there are several options: delimited, fixed field or record, or proprietary formats such as .sas7bdat or .sav.
    names <- c("Bob", "Gene", "Valerie")
    age <- c(30, 40, 19)
    hometown <- c("Dallas, TX", "Little Rock, AR", "Dayton, OH")
    # Combine the three vectors into a data frame
    customers <- data.frame(names, age, hometown)
    # Write a comma-delimited file; col.names = NA together with row.names = TRUE
    # leaves a blank header cell above the row-name column
    write.table(customers, "pathhere", sep = ",", col.names = NA, row.names = TRUE)
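A round trip is a simple way to confirm that an export worked. The sketch below writes the customers data frame with the same write.table arguments as the slide, but to a temporary file (an assumption made so the example runs anywhere), and then reads it back with read.csv.

```r
names <- c("Bob", "Gene", "Valerie")
age <- c(30, 40, 19)
hometown <- c("Dallas, TX", "Little Rock, AR", "Dayton, OH")
customers <- data.frame(names, age, hometown, stringsAsFactors = FALSE)

# Export to a temporary comma-delimited file
out_path <- tempfile(fileext = ".csv")
write.table(customers, out_path, sep = ",", col.names = NA, row.names = TRUE)

# Re-import; row.names = 1 treats the first column as row names
customers_back <- read.csv(out_path, row.names = 1, stringsAsFactors = FALSE)
print(customers_back)
```

By default write.table puts character values in quotes, which is why the commas inside the hometown values do not split those fields when the file is read back.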