R packages/libraries
Data input/output
Rachel Carroll
Department of Public Health Sciences, MUSC
Computing for Research I, Spring 2014



What is an R package?
R package: a collection of R functions, data, and compiled code in a well-defined format
R library: the directory where packages are stored
R comes with a standard set of packages
Additional packages are free to download/install
  library()   # see all packages installed on your computer
  search()    # packages currently attached (loaded onto the search path)
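A minimal sketch of how the library directories, the installed packages, and the search path can be inspected. The variable names (lib_dirs, installed, attached) are illustrative only.

```r
# Directories (libraries) where installed packages are stored
lib_dirs <- .libPaths()
print(lib_dirs)

# Names of all packages installed in those libraries
installed <- rownames(installed.packages())
head(installed)

# Packages (and other environments) currently attached to the search path
attached <- search()
print(attached)

# Standard packages such as "stats" are installed with R and attached by default
"stats" %in% installed
"package:stats" %in% attached
```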

Install an R package
Complete list of available packages:
Follow the steps:
1. Open R, click on the Packages menu and then Install Packages
2. Choose a CRAN mirror (USA – TN)
3. Select the package you want to add (e.g., survival)
4. Load it using library(pack.name)
One step:
  install.packages(c("pack.name1", "pack.name2", …))
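The install-then-load steps can also be scripted. The sketch below is one possible pattern, not part of the original slides: install_and_load() is a hypothetical helper, and the CRAN mirror URL is an assumption (any mirror works).

```r
# Hypothetical helper (not part of R): install a package only if it is missing,
# then attach it. The default CRAN mirror URL is an assumption.
install_and_load <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  library(pkg, character.only = TRUE)
  invisible(pkg %in% loadedNamespaces())
}

# "splines" ships with R, so this call does not need network access
install_and_load("splines")
```

For a package that is not yet installed (e.g., survival), the same call would download it from the mirror before loading it.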

R package - help
  library(help = "pack.name")                  # information/description of the package
  detach("package:pack.name", unload = TRUE)   # unload the package
  ??function.name                              # fuzzy matching
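A short sketch of the help and unload calls, using the splines package (shipped with R) as a stand-in for pack.name. The variable names are illustrative.

```r
library(splines)                     # attach a package that ships with R
attached_before <- "package:splines" %in% search()

# Package-level information; printing pkg_info in an interactive session
# shows the package description and index
pkg_info <- library(help = "splines")

# Detach the package (detach() is the base R function for this)
detach("package:splines", unload = TRUE)
attached_after <- "package:splines" %in% search()

c(before = attached_before, after = attached_after)

# In an interactive session, ??spline (short for help.search("spline"))
# searches the help pages of installed packages for matching topics
```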

Some standard packages (shipped with R)
  base       # base R functions
  datasets   # base R datasets
  graphics   # some functions: plot(), par()
  splines    # regression spline functions and classes
  stats      # some functions: lm(), glm()
  tools      # tools for package development and administration
  utils      # some functions: data(), help()

Some popular R packages
  ggplot2          # graphics (grammar of graphics)
  zoo              # time series analysis
  Hmisc            # sample size computation, imputing missing data
  rms              # regression analyses and more (Frank Harrell)
  nlme and lme4    # mixed models
  ff               # working with large arrays
  fields           # spatial statistics

Data Types
Scalars and vectors (numerical, character, logical, complex)
Matrices/arrays
Lists/data frames
Identify the data structure:
  typeof()
  mode()
  class()
Other useful functions:
  str()        # see structure
  describe()   # descriptive statistics (not in base R; available in package Hmisc)
  summary()
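To see how typeof(), mode(), and class() differ, the sketch below applies them to one small example vector of each basic type. The object names (values, type_table) are illustrative.

```r
# Example vectors of the basic types named on the slide
values <- list(
  numeric   = c(1.5, 2, 3),
  integer   = 1:3,
  character = c("a", "b"),
  logical   = c(TRUE, FALSE),
  complex   = 1 + 2i
)

# Compare what typeof(), mode() and class() report for each vector
type_table <- data.frame(
  typeof = sapply(values, typeof),
  mode   = sapply(values, mode),
  class  = sapply(values, class),
  stringsAsFactors = FALSE
)
print(type_table)

# str() gives a compact view of the structure; summary() gives basic statistics
str(values$numeric)
summary(values$numeric)
```

Note that an integer vector has typeof "integer" but mode "numeric", which is why it helps to check more than one of these functions.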

Data Types (cont'd)
Vectors: must consist of values of the same data type
Factors: encode categorical data
Matrices/Arrays
  Matrix: 2-dimensional array
  Array: stored as a 1-dimensional vector with a dimension attribute; can have more than two dimensions (e.g., a stack of several matrices)

Data Types (cont'd)
Lists: ordered collections of objects of any type or dimension
Data frames: store data of various formats with related entries in each row, different attributes in each column
Let's check some of these types by entering data from the keyboard!
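A minimal sketch of entering each of these structures from the keyboard. All values and object names are made up for illustration.

```r
# Factor: categorical data stored with a fixed set of levels
sex <- factor(c("m", "f", "m"))
levels(sex)

# Matrix: a 2-dimensional array, filled column by column
m <- matrix(1:6, nrow = 2)
dim(m)

# Array: the same idea with more dimensions, here four stacked 2 x 3 matrices
a <- array(1:24, dim = c(2, 3, 4))
dim(a)

# List: components can have different types and lengths
lst <- list(id = 1L, name = "subject A", scores = c(90, 85))
str(lst)

# Data frame: one row per subject, one column per attribute
df <- data.frame(id = 1:3, sex = sex, weight = c(3.6, 5.1, 4.2))
str(df)
```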

Data Input
Entering data within R
  save(object, file = "path/mydata.RData")   # store R data
  load("path/mydata.RData")                  # restore it
Reading (importing) data into R
  read.table("path/file.dat")   # read table-format data
  read.csv("path/file.csv")     # read comma-separated values (spreadsheet)
  read.delim()                  # read tab-delimited data
  read.fwf()                    # read fixed-width format data (.txt)
  scan()                        # read data directly into a vector or a list
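The reading functions can be tried without any external files by first writing small temporary files. This is a sketch under that assumption; the file contents and column names are invented for illustration.

```r
# Comma-separated file with a header row and one missing value
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c("id,sex,weight", "1,m,3.6", "2,f,NA", "3,m,5.1"), tmp_csv)
csv_data <- read.csv(tmp_csv)
str(csv_data)

# Tab-delimited file: read.delim() is read.table() with tab-friendly defaults
tmp_tab <- tempfile(fileext = ".txt")
writeLines(c("id\tsex\tweight", "1\tm\t3.6", "2\tf\t5.1"), tmp_tab)
tab_data <- read.delim(tmp_tab)
tbl_data <- read.table(tmp_tab, header = TRUE, sep = "\t")

# Fixed-width file: columns are defined by character widths, not separators
tmp_fwf <- tempfile(fileext = ".txt")
writeLines(c("001m3.6", "002f5.1"), tmp_fwf)
fwf_data <- read.fwf(tmp_fwf, widths = c(3, 1, 3),
                     col.names = c("id", "sex", "weight"))

# scan() reads values straight into a vector
tmp_vec <- tempfile()
writeLines("1 2 3 4 5", tmp_vec)
vec <- scan(tmp_vec, quiet = TRUE)
```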

Data Input: using packages
From Excel (the first row of the file needs to contain the variable names)
  library(RODBC)
  excel.data <- odbcConnectExcel("path/myexcel.xls")   # Windows only
  mydata <- sqlFetch(excel.data, "mysheet")            # specify the worksheet
  odbcClose(excel.data)
From SAS (save the SAS dataset in .xpt format, e.g. "mydata.xpt")
  library(Hmisc)
  mydata <- sasxport.get("path/mydata.xpt")
From Stata
  library(foreign)   # can also read data from Minitab, SAS, SPSS
  mydata <- read.dta("path/mydata.dta")

Data Input: using packages
From SPSS (save the SPSS dataset in .por format)
  library(Hmisc)
  mydata <- spss.get("path/mydata.por", use.value.labels = TRUE)
Other useful packages
  library(gdata)   # read.xls() for importing Excel data

Data Output
Export data
Tab-delimited text file:
  write.table(mydata, "path/mydata.txt", sep = "\t", row.names = FALSE)
Comma-separated file:
  write.csv(mydata, "path/mydata.csv", row.names = FALSE, na = " ")
Text files will not preserve special attributes (e.g., whether a column is stored as character)
Save data in R binary format (more compact):
  save(mydata, file = "mydata.RData")
  load("mydata.RData")
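A round-trip sketch of the export calls, written to temporary files so it can be run anywhere. The data values are invented, and na = "" (an empty cell) is used here as one possible choice for missing values.

```r
mydata <- data.frame(id = 1:3,
                     sex = c("m", "f", "m"),
                     weight = c(3.6, NA, 5.1))

# Tab-delimited text file without the row-name column
txt_file <- tempfile(fileext = ".txt")
write.table(mydata, txt_file, sep = "\t", row.names = FALSE)

# Comma-separated file; missing values are written as empty cells
csv_file <- tempfile(fileext = ".csv")
write.csv(mydata, csv_file, row.names = FALSE, na = "")
csv_lines <- readLines(csv_file)
print(csv_lines)

# R binary format keeps the object exactly as it was
rdata_file <- tempfile(fileext = ".RData")
save(mydata, file = rdata_file)
original <- mydata
rm(mydata)
load(rdata_file)        # restores an object named "mydata"
identical(mydata, original)
```

Note that the argument is spelled row.names (with a dot); a misspelled argument name such as rownames is not accepted by write.table().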

Data Output (cont'd)
R functions for data output
  print()    # print an object; special characters are shown in escaped form
  paste()    # concatenate vectors after converting to character
  format()   # format an R object for nice printing
  cat()      # output objects, concatenating the representations
Sweave: create dynamic reports that mix R code with analysis output (tables, graphs, etc.) and compile to a PDF using LaTeX
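The difference between these output functions is easiest to see side by side. A small sketch with made-up values:

```r
s <- "a\tb"

# print() shows the string as R stores it, with the tab escaped
printed <- capture.output(print(s))
printed

# cat() writes the raw characters, so the tab is a real tab in the output
catted <- capture.output(cat(s))
catted

# paste() converts to character and concatenates element by element
labels <- paste("id", 1:3, sep = "_")
labels

# format() controls how numbers are displayed
two_decimals <- format(2, nsmall = 2)
three_digits <- format(3.14159, digits = 3)
c(two_decimals, three_digits)
```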

Data Output: using packages
Export to an Excel spreadsheet
  library(xlsReadWrite)
  write.xls(mydata, "path/mydata.xls")
Export to SAS (writes a data file plus a SAS code file that reads it)
  library(foreign)
  write.foreign(mydata, "path/mydata.txt", "path/mydata.sas", package = "SAS")
Export to Stata
  library(foreign)
  write.dta(mydata, "path/mydata.dta")
Package xtable exports tables to LaTeX or HTML
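As a sketch of the Stata route in both directions, the example below writes a small data frame with write.dta() and reads it back with read.dta(). It assumes the foreign package is installed (it normally ships with R) and skips the example otherwise; the data values are invented.

```r
if (requireNamespace("foreign", quietly = TRUE)) {
  mydata <- data.frame(id = 1:3, weight = c(3.6, 5.1, 4.2))

  # Export to a temporary Stata file
  dta_file <- tempfile(fileext = ".dta")
  foreign::write.dta(mydata, dta_file)

  # Import the same file again
  stata_data <- foreign::read.dta(dta_file)
  print(stata_data)
} else {
  message("Package 'foreign' is not installed; skipping the Stata example")
}
```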

Some guidelines
Use read.table() to read into a data frame (data needs to be rectangular)
Use write.table() to write a data frame or matrix. It can also append data to an existing file.
Use package RODBC as a general way of importing from many kinds of databases
For large amounts of Excel data, convert to CSV and then read with read.csv(). Export with write.csv().
Great resource:
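The append behaviour of write.table() mentioned in the guidelines can be sketched as follows. The two batches of data are invented; the key points are append = TRUE and col.names = FALSE on the second call so the header is not repeated.

```r
out_file <- tempfile(fileext = ".txt")

batch1 <- data.frame(id = 1:2, weight = c(3.6, 5.1))
batch2 <- data.frame(id = 3:4, weight = c(4.2, 2.9))

# First write creates the file with a header row
write.table(batch1, out_file, sep = "\t", row.names = FALSE)

# Second write appends rows only, without repeating the header
write.table(batch2, out_file, sep = "\t", row.names = FALSE,
            col.names = FALSE, append = TRUE)

# Read the combined rectangular file back into one data frame
combined <- read.table(out_file, header = TRUE, sep = "\t")
print(combined)
```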