Reading a file R can read a wide variety of input formats Text,


1 Reading a file
R can read a wide variety of input formats: text, statistical package formats (e.g., SAS), and DBMS.

2 Reading a text file
Delimited text file, such as CSV. Creates a data frame. Specify as required: presence of header, separator, row names.
The paths below point to a local file, so R will not find it on your computer; substitute the path to a file of your own.
Mac:
t <- read.csv("~/Dropbox/Carolina/Paper2/Fixed Encoding Data/changeBrasil.txt", stringsAsFactors=FALSE)
PC:
t <- read.csv("C:\\Dropbox\\Carolina\\Paper2\\Fixed Encoding Data\\changeBrasil.txt", stringsAsFactors=FALSE)
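The three options named on this slide (header, separator, row names) can be tried without any external file. The sketch below writes a tiny made-up CSV to a temporary file and reads it back; the file contents and column names are assumptions for illustration, not the data used in the slides.

```r
# Write a tiny made-up CSV to a temporary file (illustrative data only)
path <- tempfile(fileext = ".csv")
writeLines(c("year,month,temperature",
             "1999,1,33.5",
             "1999,2,38.7",
             "2000,1,31.3"), path)

# header = TRUE: the first line holds the column names
# sep = ",":     fields are separated by commas
# stringsAsFactors = FALSE: keep text columns as character vectors
d <- read.table(path, header = TRUE, sep = ",", stringsAsFactors = FALSE)
str(d)

# row.names = 1 uses the first column as row names instead of as data
ids <- tempfile(fileext = ".csv")
writeLines(c("id,value", "a,1", "b,2"), ids)
r <- read.csv(ids, row.names = 1)
rownames(r)
```

read.csv is read.table with header = TRUE and sep = "," preset, which is why the two calls above behave alike.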

3 Reading a text file
Can read a file using a URL:
t <- read.table(url, header=TRUE, sep=',')

4 Learning about an object
Click on the name of the file in the top-right window to see its content. Click on the blue icon of the file in the top-right window to see its structure.
url <- "…"
t <- read.table(url, header=TRUE, sep=',')
head(t)  # first six rows
tail(t)  # last six rows
dim(t)   # dimensions (rows, columns)
str(t)   # structure of the data set
class(t) # type of object

5 Referencing data
datasetName$columnName (the data set name, then the column name)
# Referencing your data
# Qualify with the data set name to reference columns
mean(t$temperature)
sd(t$temperature)
max(t$year)
range(t$month)

6 Creating a new column
Formula to transform Fahrenheit to Celsius: C = (F - 32) * 5/9
# Creating a new column
t$Ctemp <- round((t$temperature-32)*5/9,1)
head(t)
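To check the conversion formula on known values, it can be wrapped in a small function. The name f_to_c is a hypothetical helper introduced here for illustration; it is not part of the slides.

```r
# Hypothetical helper wrapping the slide's formula: C = (F - 32) * 5 / 9
f_to_c <- function(f, digits = 1) {
  round((f - 32) * 5 / 9, digits)
}

# Freezing point of water, boiling point of water, and one ordinary reading
f_to_c(c(32, 212, 50))
# → 0 100 10
```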

7 Renaming a column and writing a file
# Renaming a column
colnames(t)[3] <- 'Ftemp' # rename third column to indicate Fahrenheit
head(t)
# Save a file
write.table(t,"centralparktempsCF.txt")
The file is written to R's current working directory, which getwd() reports (often Documents or the folder containing the script).
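A quick way to confirm what was written, and where, is to read the file back. The sketch below uses a made-up two-row data frame and a temporary file so that it leaves nothing behind; the values are illustrative assumptions.

```r
# Made-up stand-in for the slide's data frame (illustrative values only)
d <- data.frame(year = c(1999, 2000), month = c(1, 1),
                temperature = c(33.5, 31.3))

# Rename the third column by position, as on the slide
colnames(d)[3] <- "Ftemp"

# A relative file name is resolved against this directory
getwd()

# row.names = FALSE omits the 1, 2, ... row labels from the output file
out <- tempfile(fileext = ".txt")
write.table(d, out, sep = ",", row.names = FALSE)

# Read the file back to confirm the round trip
back <- read.table(out, header = TRUE, sep = ",")
back
```

Without row.names = FALSE, write.table adds a column of row labels, which is easy to mistake for data when the file is opened elsewhere.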

8 sqldf
An R package for using SQL with data frames. Returns a data frame. Supports MySQL.

9 Subset and Sort
Selecting rows, selecting columns, selecting rows and columns, and sorting on column name.
library(sqldf)
options(sqldf.driver = "SQLite") # to avoid a conflict with RMySQL
# Selecting rows
trowSQL <- sqldf("select * from t where year = 1999")
# Selecting columns
tcol <- t[,c(1:2,4)]
tcolSQL <- sqldf("select year, month, Ctemp from t")
# Selecting rows and columns
trowcolSQL <- sqldf("select year, month, Ctemp from t where year > 1989 and year < 2000")
# Sorting on column name
sSQL <- sqldf("select * from t order by year desc, month")
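For comparison, the same four operations can be written in base R without sqldf. This is a sketch on a made-up stand-in for the data frame; the values are illustrative assumptions, and only the column names follow the slides.

```r
# Made-up stand-in for the slide's data frame (illustrative values only)
d <- data.frame(
  year  = c(1988, 1995, 1999, 1999, 2001),
  month = c(7, 1, 2, 1, 12),
  Ftemp = c(77.1, 30.2, 38.7, 33.5, 44.1)
)
d$Ctemp <- round((d$Ftemp - 32) * 5 / 9, 1)

# Rows: select * from t where year = 1999
rows <- d[d$year == 1999, ]

# Columns: select year, month, Ctemp from t
cols <- d[, c("year", "month", "Ctemp")]

# Rows and columns: year > 1989 and year < 2000
rowcol <- subset(d, year > 1989 & year < 2000,
                 select = c(year, month, Ctemp))

# Sort: order by year desc, month
sorted <- d[order(-d$year, d$month), ]
sorted
```

Selecting columns by name rather than by position, as in cols, keeps working if columns are later added or reordered.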

10 Recoding
Some analyses might be facilitated by the recoding of data. Split a continuous measure into two categories:
t$Category <- 'Other'
head(t)
t$Category[t$Ftemp >= 30] <- 'Hot'
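The two-step recode can also be written as a single expression. The sketch below shows two base R alternatives, ifelse() and cut(), on made-up readings; the 30-degree threshold is the one from the slide, and the data are illustrative assumptions.

```r
# Made-up Fahrenheit readings (illustrative values only)
ftemp <- c(12.5, 30.0, 29.9, 71.3)

# ifelse(): one vectorized test with a value for each outcome
category <- ifelse(ftemp >= 30, "Hot", "Other")
category

# cut(): the same split as a factor; right = FALSE makes the
# intervals [-Inf, 30) and [30, Inf), so 30 itself counts as "Hot"
category2 <- cut(ftemp, breaks = c(-Inf, 30, Inf),
                 labels = c("Other", "Hot"), right = FALSE)
table(category2)
```

cut() generalizes more easily when the measure has to be split into more than two categories.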

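11 Deleting information on a column
Assign NA to clear a column's values while keeping the column:
t$Category <- NA
Assign NULL to remove the column entirely:
t$Category <- NULL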
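Assigning NA and assigning NULL to a column have different effects, which a short sketch on a made-up data frame makes visible. The data are illustrative assumptions.

```r
# Made-up data frame (illustrative values only)
d <- data.frame(year = c(1999, 2000), Category = c("Hot", "Other"))

# NA: the column stays, but every value becomes missing
d$Category <- NA
after_na <- names(d)
all_missing <- all(is.na(d$Category))

# NULL: the column itself is removed
d$Category <- NULL
after_null <- names(d)

after_na    # "year" "Category"
after_null  # "year"
```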

12 Aggregate data
Summarize data using a specified function. Compute the mean of the monthly temperatures for each year:
# Average F temperature for each year
a <- aggregate(t$Ftemp, by=list(t$year), FUN=mean)
# Name columns
colnames(a) <- c('year', 'mean')
a
sqldf("select year, avg(Ftemp) as mean from t group by year")
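aggregate() also has a formula interface that keeps the grouping column's name, so only the summary column needs renaming. A sketch on made-up data, with values chosen so the means are easy to check by hand:

```r
# Made-up data (illustrative values only): two readings per year
d <- data.frame(year  = c(1999, 1999, 2000, 2000),
                Ftemp = c(30, 40, 50, 70))

# Formula interface: summarize Ftemp within each value of year
a <- aggregate(Ftemp ~ year, data = d, FUN = mean)

# The grouping column is already called "year"; rename the summary
colnames(a)[2] <- "mean"
a
# 1999 → 35, 2000 → 60
```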

13 Exercise
Using sqldf, compute the maximum temperature for the year 2000.

14 Compile a notebook
A notebook is a report of an analysis that interweaves R code and output.
File > Compile Notebook …
Select HTML, PDF, or Word output.
Install knitr before use, and install any suggested packages.

15 HTML

16 Resources
R books, Reference card, Quick-R, DataCamp.
If you ever use R and get an error, DO NOT PANIC. Google your error and search for answers on Stack Overflow; they are usually very good!

17 Key points
R is a platform for a wide variety of data analytics: statistical analysis, data visualization, HDFS and MapReduce, text mining, and energy informatics.
R is a programming language.
Much to learn.

