
R: Packages & Data. Presented here are a number of ways to accomplish a task, some are redundant or may not represent the best way to accomplish a task.




1 R: Packages & Data

2 Presented here are a number of ways to accomplish a task; some are redundant or may not represent the best way to accomplish it. However, some “quick & dirty” commands are useful to know for when the “better” options aren’t working.

3 R Packages
What is an R package?
–A series of programs bundled together
Once installed, a copy of the package lives on the computer and doesn’t need to be reinstalled
Updating R
–Must reinstall packages
–May lose packages that aren’t kept updated

4 Packages-> Install Package

5 Choose a Mirror Site

6 Choose Package

7 Loading Package/Contents
To load a package
–library(packagename)
Contents of package
–library(help = packagename)
For additional documentation
–http://cran.r-project.org/ -> Packages -> Package Name -> Downloads: Reference Manual
Note: Some packages may overwrite (mask) the contents or functions of another package; when this happens, it is indicated in the console output.

8 Advanced: Loading Packages
To find out what packages are already installed on a computer
–installed.packages()
To check if a given package is installed
–is.installed <- function(mypkg) is.element(mypkg, installed.packages()[,1])
To install a package without clicking through windows
–install.packages("PackageName")
These last two commands are particularly helpful when writing functions for other users
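The check-then-install pattern on this slide could be wrapped in a small helper along the following lines. This is a minimal sketch: `ensure.package` is a hypothetical name that does not appear in the slides, and the install step assumes a CRAN mirror is reachable.

```r
# TRUE if the named package is installed (the definition given on the slide).
is.installed <- function(mypkg) is.element(mypkg, installed.packages()[, 1])

# Hypothetical helper: install the package only if it is missing, then load it.
ensure.package <- function(pkg) {
  if (!is.installed(pkg)) {
    install.packages(pkg)  # assumes a CRAN mirror is reachable
  }
  library(pkg, character.only = TRUE)  # load a package whose name is held in a string
}

is.installed("stats")    # stats ships with every R installation
ensure.package("stats")  # already installed, so this only loads it
```

Passing `character.only = TRUE` is what lets `library()` accept the package name as a variable rather than as literal text.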

9 Functions within a Package
To get help
–?FunctionName
–??"topic of interest"
To see the source code
–FunctionName (type the name without parentheses)
To see an example
–example(FunctionName)
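As a rough illustration of these commands, the sketch below uses `sd` and `mean` from base R as stand-ins for any function of interest. The help lookups are left as comments because they open a help viewer.

```r
# Typing a function's name without parentheses prints its source code.
sd

# deparse() returns that source as a character vector, handy for searching it.
sd.source <- deparse(sd)
head(sd.source)

# Run the examples from the function's help page.
example(mean)

# Help lookups (commented out because they open a help viewer):
# ?mean
# ??"standard deviation"
```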

10 Getting Started: Loading Files
help(topic) ?topic help.search("topic") ??topic str() ls() dir() history() library() library(help=) rm() rm(list=ls()) example() setwd() source() function

11 Data Manipulation: Data Entry
Types of Data
–Numerical, categorical, logical, factors
–mode(variable)
Formats of Data
–Scalar, vector/array, matrix, data frame, list
Ways to enter data
–Manually
–read.csv, read.table, scan
–library(foreign)
–library(Hmisc)
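The data formats and the `mode()` check can be tried out with a few manually entered objects. This is an illustrative sketch; all object names are made up for the example.

```r
x.scalar <- 3.14                    # a scalar is just a vector of length one
x.vector <- c(1, 2, 3)
x.matrix <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns
x.frame  <- data.frame(id = 1:3, group = factor(c("a", "b", "a")))
x.list   <- list(n = 1, label = "one", flag = TRUE)

mode(x.scalar)        # "numeric"
mode("text")          # "character"
mode(TRUE)            # "logical"
mode(x.frame$group)   # "numeric": factors are stored as integer codes
class(x.frame$group)  # "factor"
mode(x.frame)         # "list": a data frame is a list of columns
```

Note that `mode()` reports how a value is stored, so `class()` is usually the better check for factors.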

12 Importing from SAS
Option One:
–In SAS
proc export DATA=file DBMS=CSV OUTFILE="destination\name.csv"; run;
–In R
read.csv()

13 Syntax
–read.csv(file, header = TRUE, sep = ",", dec = ".", fill = TRUE, ...)
file: the name of the file from which the data are to be read. Each row of the table appears as one line of the file. If it does not contain an absolute path, the file name is relative to the current working directory, getwd(). file can also be a complete URL.
header: a logical value indicating whether the file contains the names of the variables as its first line. read.csv defaults to header = TRUE. In read.table, if header is missing, the value is determined from the file format: header is set to TRUE if and only if the first row contains one fewer field than the number of columns.
sep: the field separator character. Values on each line of the file are separated by this character. If sep = "" (the default for read.table) the separator is ‘white space’, that is, one or more spaces, tabs, newlines or carriage returns.
dec: the character used in the file for decimal points.
fill: logical. If TRUE, then in case the rows have unequal length, blank fields are implicitly added. See ‘Details’ in the documentation.
Additional options are available; see the documentation.
Note: If you’re desperate to read in an unusual data type, see scan().
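A quick way to see these arguments in action is to write a small CSV file and read it back. This sketch uses a temporary file, and the column names and values are invented for the example.

```r
# Write a small CSV file to a temporary location.
csv.path <- tempfile(fileext = ".csv")
writeLines(c("id,score,group",
             "1,90.5,a",
             "2,85,b",
             "3,,a"),          # the empty score field is read as NA
           csv.path)

# Read it back, spelling out the arguments described on the slide.
dat <- read.csv(csv.path, header = TRUE, sep = ",", dec = ".", fill = TRUE)

str(dat)           # column types chosen by read.csv
names(dat)         # taken from the header line
is.na(dat$score)   # FALSE FALSE TRUE
```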

14 .RData
The extension .RData is a way to store objects created in R.
Store using the command save(object1, object2, file = "Storage.RData")
Access later using load("Storage.RData")
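A save-and-load round trip might look like the sketch below. The objects are invented for the example, and the file is written to R's temporary directory rather than the working directory.

```r
object1 <- c(1, 2, 3)
object2 <- data.frame(id = 1:2, label = c("a", "b"))

# save() takes the objects by name, followed by the target file.
storage <- file.path(tempdir(), "Storage.RData")
save(object1, object2, file = storage)

# Remove the originals, then restore them from the file.
rm(object1, object2)
loaded <- load(storage)  # load() returns the names of the restored objects
loaded
object1
```

Objects come back under their original names, so `load()` can silently overwrite anything in the workspace that shares a name.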

15 Advanced: Reading Data directly from SAS or STATA
SAS Option Two:
–In SAS
libname library xport "destination\name.xpt"; data library.data; set data; run;
–In R (use forward slashes or doubled backslashes in file paths)
library(Hmisc)
data <- sasxport.get("destination/name.xpt")
STATA
–library(foreign)
Note: the package foreign can handle multiple file types, including SAS
–data.stata <- read.dta("file.dta")

16 Data Entry c(…) seq(from,to) rep(x,times) data.frame() list() matrix() read.dta() sasxport.get() read.csv() data() data(R DataSet) help(R DataSet) load()
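Several of these data entry functions can be combined in a few lines. This is an illustrative sketch: the object names are arbitrary, and `mtcars` is one of the data sets that ship with R.

```r
v <- c(2, 4, 6)                     # combine values into a vector
s <- seq(1, 5)                      # 1 2 3 4 5
r <- rep("a", times = 3)            # "a" "a" "a"
m <- matrix(1:6, nrow = 2)          # filled column by column
d <- data.frame(id = 1:3, value = v)
l <- list(numbers = v, frame = d)   # a list can hold mixed object types

data(mtcars)     # load a built-in data set
head(mtcars, 3)  # first three rows
# help(mtcars)   # documentation for the data set (opens a help viewer)
```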

17 Data Information mode() is.character() is.numeric() is.logical() is.factor() class() is.matrix() is.data.frame() names() head() tail() length() dim() nrow() ncol() is.na() dimnames() rownames() colnames() unique() describe() levels()
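These inspection functions can be tried on the built-in `iris` data set, as in the sketch below. The choice of data set is only for illustration, and `describe()` is omitted because it comes from an add-on package rather than base R.

```r
data(iris)

class(iris)                    # "data.frame"
dim(iris)                      # 150 5
nrow(iris); ncol(iris)         # 150 and 5
names(iris)                    # column names
head(iris, 3)                  # first three rows
tail(iris, 3)                  # last three rows

is.numeric(iris$Sepal.Length)  # TRUE
is.factor(iris$Species)        # TRUE
levels(iris$Species)           # "setosa" "versicolor" "virginica"
length(unique(iris$Species))   # 3
sum(is.na(iris))               # 0: no missing values
```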

18 Data Manipulation
It is possible to access subsets of a data item using bracketed commands (e.g. x[n]).
Options include the “everything but” command (x[-n]) and multiple selections (x[1:n] or x[c(1,2,3)]).
Logical arguments can also be used (x[x > 3 & x < 5]).
Lists use a double bracketing structure (x[[n]]).
Data frame items can be called using two formats
–x[["name"]]
–x$name
Anything with row and column data uses a two-part index (x[i, j]).
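Each of these indexing forms is shown below on small made-up objects. This is a sketch for experimentation, and all values are invented.

```r
x <- c(1, 4, 2, 8, 3)
x[2]               # 4: the second element
x[-2]              # 1 2 8 3: everything but the second element
x[1:3]             # 1 4 2
x[c(1, 2, 3)]      # 1 4 2: same selection, written out
x[x > 3 & x < 5]   # 4: logical selection

lst <- list(a = 1:3, b = "text")
lst[[2]]           # "text": double brackets extract the element itself

df <- data.frame(name = c("ann", "bob"), age = c(30, 25))
df[["age"]]        # 30 25
df$age             # 30 25: same column
df[2, "age"]       # 25: row 2 of the age column
df[1, ]            # first row, all columns
```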

19 Data Manipulation as.numeric() as.logical() as.character() as.array() as.data.frame() as.matrix() factor() ordered() t() reshape() cat() rbind() cbind() merge() sort() order() library(reshape) rownames()<-c() colnames()<-c() na.omit() cut()
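A few of these manipulation functions are sketched below on two small made-up data frames. The column names and values are invented for the example.

```r
a <- data.frame(id = c(1, 2, 3), x = c(5, 6, 7))
b <- data.frame(id = c(2, 3, 4), y = c(10, 20, 30))

merged  <- merge(a, b, by = "id")               # keeps ids present in both: 2 and 3
stacked <- rbind(a, data.frame(id = 4, x = 8))  # add a row
wide    <- cbind(a, z = c(TRUE, FALSE, TRUE))   # add a column
sorted  <- a[order(a$x, decreasing = TRUE), ]   # sort rows by x, largest first

clean <- na.omit(c(1, NA, 3))                   # drop missing values
bins  <- cut(c(1, 5, 9), breaks = c(0, 3, 6, 10),
             labels = c("low", "mid", "high"))  # numeric -> ordered bins

as.numeric("3.5")           # 3.5
as.character(42)            # "42"
t(matrix(1:6, nrow = 2))    # transpose: 2 x 3 becomes 3 x 2
colnames(a) <- c("id", "measure")  # rename the columns
```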

20 Character & Time Based Data nchar() substr() tolower() toupper() chartr() grep() match() %in% pmatch() charmatch() sub() strsplit() paste() Sys.time() Sys.Date() date() as.Date() as.POSIXct()
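The character functions on this slide can be tried on a few made-up strings, as in the sketch below.

```r
nchar("packages")                  # 8
substr("packages", 1, 4)           # "pack"
toupper("data"); tolower("DATA")   # "DATA" and "data"
chartr("abc", "xyz", "aabbcc")     # "xxyyzz": character-by-character translation

fruits <- c("apple", "berry", "banana")
grep("an", fruits)                 # 3: index of the element containing "an"
match("berry", fruits)             # 2: position of the exact match
"kiwi" %in% fruits                 # FALSE
sub("a", "A", "banana")            # "bAnana": replaces the first match only
strsplit("a,b,c", ",")[[1]]        # "a" "b" "c"
paste("R", "packages", sep = " ")  # "R packages"

Sys.Date()                         # today's date (Date class)
Sys.time()                         # current date and time (POSIXct class)
```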

21 Symbol  Meaning
%d  Day of the month as a number (01-31)
%a  Abbreviated weekday (Mon)
%H  Hours as decimal number (00-23)
%I  Hours as decimal number (01-12)
%w  Weekday as decimal number (0-6, Sunday is 0)
%W  Week of the year as decimal number (00-53), using Monday as the first day of the week (and typically with the first Monday of the year as day 1 of week 1). The UK convention.
%x  Date, locale-specific
%X  Time, locale-specific
%z  Time zone, as a signed offset in hours and minutes from UTC (e.g. -0800)
%j  Day of the year as decimal number (001-366)
%M  Minute as decimal number (00-59)
%p  AM/PM indicator in the locale (used in conjunction with %I and not with %H)
%S  Second as decimal number (00-61), allowing for up to two leap seconds
%U  Week of the year as decimal number (00-53), using Sunday as the first day of the week
%A  Unabbreviated weekday (Monday)
%c  Date and time, locale-specific
%m  Month as a number (02)
%b  Abbreviated month (Feb)
%B  Unabbreviated month (February)
%y  Two-digit year (11)
%Y  Four-digit year (2011)
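The symbols in this table are used both to print and to parse dates. The sketch below uses an arbitrary date (14 February 2011, a Monday). Expected results are shown only for the numeric codes, since name-based codes such as %a, %B and %p depend on the locale.

```r
d <- as.Date("2011-02-14")

format(d, "%d/%m/%Y")      # "14/02/2011"
format(d, "%y")            # "11"
format(d, "%j")            # "045": day 45 of the year
format(d, "%w")            # "1": Monday, with Sunday counted as 0
format(d, "%A, %d %B %Y")  # weekday and month names depend on the locale

# The same symbols describe the layout of text being parsed.
parsed <- as.Date("14/02/2011", format = "%d/%m/%Y")

# Date-times work the same way; the time zone is fixed here for reproducibility.
stamp <- as.POSIXct("2011-02-14 15:30:00", tz = "UTC")
format(stamp, "%H:%M:%S")  # "15:30:00"
format(stamp, "%I:%M")     # "03:30": 12-hour clock
```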

22 Data Export ftable() format() paste() xtable() write.table(data, "clipboard", sep="\t", col.names=NA) write.csv() write.foreign() write.dta() sink() save() print() save.image()
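A few of these export functions are sketched below, writing to temporary files so that nothing in the working directory is touched. The data frame is invented for the example, and the clipboard variant is left out because it depends on the platform.

```r
df <- data.frame(id = 1:3, score = c(90.5, 85, 77.25))

# Comma-separated file without the row-name column.
csv.path <- tempfile(fileext = ".csv")
write.csv(df, csv.path, row.names = FALSE)

# Tab-separated file via the more general write.table().
tab.path <- tempfile(fileext = ".txt")
write.table(df, tab.path, sep = "\t", row.names = FALSE, quote = FALSE)

# sink() diverts printed output to a file until sink() is called again.
log.path <- tempfile(fileext = ".txt")
sink(log.path)
print(summary(df$score))
sink()

# Check the CSV by reading it back.
back <- read.csv(csv.path)
back
```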

23 format()
Syntax
–format(x, trim = FALSE, digits = NULL, nsmall = 0L, justify = c("left", "right", "centre", "none"), width = NULL, na.encode = TRUE, scientific = NA, big.mark = "", big.interval = 3L, small.mark = "", small.interval = 5L, decimal.mark = ".", zero.print = NULL, drop0trailing = FALSE, ...)
–x: any R object
–trim: logical; if FALSE, numbers are right-justified to a common width; if TRUE, the leading blanks for justification are suppressed.
–digits: how many significant digits should be used.
–justify: for character vectors, whether they should be left-justified, right-justified, or centred.
–See also format.Date (methods for dates) and format.POSIXct (date-times)
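The effect of the main arguments is easiest to see on a few literal values, as in the sketch below. All inputs are invented for the example.

```r
format(3.14159, digits = 3)                # "3.14"
format(2, nsmall = 2)                      # "2.00": at least two decimals
format(1234567.891, big.mark = ",")        # "1,234,568"

format(c(1, 10, 100))                      # "  1" " 10" "100": padded to a common width
format(c(1, 10, 100), trim = TRUE)         # "1" "10" "100"

format(c("a", "bbb"), justify = "right")   # "  a" "bbb"
format("a", width = 5)                     # "a    "

format(as.Date("2011-02-14"), "%d/%m/%Y")  # "14/02/2011", via format.Date
```

The result of `format()` is always character, so it is best applied as the last step before printing or exporting.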

24 Extra Resources

25 Advanced
Packages to try
–gtools
–reshape
Journal of Statistical Computing
–http://stat-computing.org/
Journal of Statistical Software
–http://www.jstatsoft.org/

26 http://journal.r-project.org/

27 www.rseek.org

28 http://r-forge.r-project.org/

29 http://www.statmethods.net/index.html

