Introduction to R Lecture 1: Getting Started Andrew Jaffe 8/30/10
Lecture 1 Course overview What is R? Installing R Installing a text editor Interfacing text editor with R Writing scripts Using R as a calculator
About the Course Series of 7 seminars Covers the usage of R –Platform for beginning analyses –NOT covering statistics –Good programming etiquette Bring your laptop – there will be breaks to allow you to practice the code
About the Course This seminar is 1 unit pass/fail To pass, attend 5 out of 7 seminars Very little outside work
About the Course Some learning objectives include: –Importing/exporting data –Data management –Performing calculations –Recoding variables –Producing graphics –Installing packages –Writing functions
About the Course Course communication via … Lectures and code will be hosted on my webpage – http://…
About the Instructor 3rd-year PhD student in the Genetic Epi program, concurrent MHS in Bioinformatics Learned R five years ago, have been using it regularly for the last two
Lecture 1 Course overview What is R? Installing R Installing a text editor Interfacing text editor with R Writing scripts Using R as a calculator Assignment
What is R? R is a language and environment for statistical computing and graphics R is the open source implementation of the S language, which was developed by Bell Laboratories R is both open source and open development
What is R? Pros: –Free –Tons of packages, very flexible –Multiple datasets at any given time Cons: –Much more “programming” oriented –Minimal interface These are my personal opinions
What is R? Often a good first step for data cleaning and manipulation Then, export data to STATA or SAS for Epi analyses
What is R? [Screenshot: the Console and Script windows]
Installing R
Installing R - Windows Windows: click “base” and download
Installing R - Windows Click the link to the latest build
Installing R - Mac Mac: click the latest package’s .pkg file
Installing R Double click the downloaded file Hit ‘next’ a few times Use default settings Finish installing
Installing a Text Editor Windows: R’s built-in text editor is terrible –It’s essentially Windows Notepad –We will download a much better one Mac: R’s built-in text editor is sufficient –Color coding, signals parenthesis closing, etc. –I suggest using this until you think you need a better one
Installing a Text Editor I prefer Notepad++: –Download the current version: npp.5.7.Installer.exe –Install on your computer using defaults
Installing a Text Editor
Interfacing with R Scripts: documents that contain reproducible R code and functions that you can send to the console (and save) –Files are designated with the “.R” extension –You can “source” scripts (more later) Console: Type commands directly into the console –Good for looking at your data, trying things, and plotting
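The difference between a script and the console can be sketched as follows. This is only an illustration: the temporary file and the name demo_value are hypothetical and not part of the course materials.

```r
# Write a tiny script to a temporary file with the ".R" extension
# (the file contents here are a made-up example)
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "# demo script: define one value",
  "demo_value <- 2 + 3"
), script_path)

# source() runs every line of the script, as if each were typed
# into the console in order
source(script_path)

# The object created by the script is now available in the console
demo_value
```

In practice you would save the script somewhere permanent and give source() that path instead of a temporary file.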
Interfacing with R - Mac Mac: File > New Script This opens the default text editor To send a line of code to the R console, press Apple+Enter when the cursor is anywhere on that line Highlight chunks of code and press Apple+Enter to send
Interfacing with R - Windows Using the default text editor, pressing Ctrl+R sends lines to the console However, we want to use Notepad++ We need to download one more thing…
Interfacing with R - Windows “NppToR”: Notepad++ to R It must be running when R and Notepad++ are open When properly configured, press F8 to send lines of code, or highlighted chunks, to the console I will help configure this after class today
Interfacing with R – Windows More detailed instructions for installing NppToR: http://sourceforge.net/apps/mediawiki/npptor/index.php?title=Installing
Writing Scripts The comment symbol is # (pound) in R Comment liberally - you should be able to understand a script after not seeing it for 6 months Lines of #’s are useful to separate sections Useful for designating headers
Writing Scripts
#################
# Title: Demo R Script
# Author: Andrew Jaffe
# Date: 7/30/10
# Purpose: Demonstrate comments in R
###################
# this is a comment, nothing to the right of it gets read
# this # is still a comment – you can use as many #’s as you want
# sometimes you have a really long comment, like explaining what you
# are doing for a step in analysis. Take it to a second line
Writing Scripts Some common etiquette: –You can use spaces (more generally “white space”) within functions and commands liberally as well –Try to keep a reasonable number of characters per column – many commands can be broken into multiple lines –More to come later…
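A short sketch of both etiquette points, using made-up object names (x, y, long_total) purely for illustration:

```r
# Extra spaces around operators and after commas do not change the result
x<-c(1,2,3)
y <- c(1, 2, 3)
identical(x, y)   # TRUE: both lines create the same vector

# A long command can continue on the next line as long as the current
# line is visibly incomplete (here, an open parenthesis)
long_total <- sum(1, 2, 3,
                  4, 5, 6)
long_total        # 21
```

The second form of each pair is usually easier to read six months later.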
R as a Calculator The R console functions as a full calculator Try to play around with it: +, -, *, / are add, subtract, multiply, and divide ^ or ** is power ( and ) work with order of operations
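A few lines to try in the console. The numbers are arbitrary; only the operators matter:

```r
2 + 3        # addition: 5
7 - 4        # subtraction: 3
6 * 5        # multiplication: 30
9 / 2        # division: 4.5
2 ^ 3        # power: 8
2 ** 3       # ** is treated the same as ^, so also 8
2 + 3 * 4    # multiplication happens first: 14
(2 + 3) * 4  # parentheses happen first: 20
```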
The assignment… operator: assigning a value to a name R accepts two operators: “<-” and “=” –E.g.: x=8 (remember whitespace!: x = 8, x <- 8) Variable names are case-sensitive –I.e.: X and x are different Set x = 8, and try using calculator functions on x
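A minimal sketch of the exercise, with 100 chosen arbitrarily as the value for the upper-case X:

```r
x <- 8      # assignment with the arrow operator
x = 8       # the equals sign does the same thing here
x * 2       # 16
x ^ 2       # 64

X <- 100    # upper-case X is a separate variable from x
x + X       # 108
```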
Assignment ‘Assignment’ literally puts whatever is on the right side of the operator into your left-hand side variable –Note that although you can name variables almost anything, you might run into issues if you give them the same names as default R functions Notepad++ turns functions red/pink so you know…
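One way the naming issue can show up, sketched with a deliberately bad (hypothetical) choice of name. Here the built-in sum() is hidden by an object we create ourselves:

```r
# Assigning to a name that R already uses for a function hides the original
sum <- function(...) "not the real sum"   # hypothetical bad name choice
sum(1, 2, 3)        # now returns the string above, not 6

# Removing your own object makes the built-in function visible again
rm(sum)
sum(1, 2, 3)        # 6
```

Nothing is permanently broken by this; the built-in is only masked until your own object is removed.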
Examples of assignment, introducing R data Enough to get R up and running if this is the only class you attend. We will see them in much more detail over the next three sessions
Assignment
status <- c("case", "case", "case", "control", "control", "control")
status
class(status)
table(status)
factor(status)
# alternatively: status <- c(rep("case", 3), rep("control", 3))
Assignment
web <- "… code.R"
–class(web)
–source(web)
You also don’t have to save tables/data you find online to your disk (note read.table works for most things, though the examples below aren’t tables)
–scan(web, what = character(0), sep = "\n")
–scan("…", what = character(0))
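Since the course URL is not reproduced here, the following sketch uses read.table() on an in-memory string instead of a web address. The table contents and the names txt and small are made up for illustration; with a real address you would pass the URL as the first argument in place of text =.

```r
# Hypothetical stand-in for a small table found online
txt <- "id sex status
1 male case
2 female control
3 female case"

# read.table() accepts a file path or a URL; the text = argument lets us
# try it out without either one
small <- read.table(text = txt, header = TRUE)
small
class(small)   # "data.frame"
```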
Assignment
mat <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE) # this is sourced in
class(mat)
mat
mat + mat
mat * mat
mat %*% mat
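The last two lines on this slide look similar but do different things. A small sketch with the same matrix, with the results written out by hand:

```r
mat <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)

mat * mat      # element-wise product: rows (1, 4) and (9, 16)
mat %*% mat    # matrix product:       rows (7, 10) and (15, 22)
```

The plain * multiplies matching entries, while %*% performs row-by-column matrix multiplication.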
Assignment
class(dat) # dat is also sourced in
head(dat)
table(dat$sex, dat$status)
…To be continued…
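If you do not have the sourced dat available, the same commands can be tried on a small made-up data frame. The values below are invented for illustration and are not the course data:

```r
# Hypothetical data frame standing in for the sourced 'dat'
dat <- data.frame(
  sex    = c("M", "F", "F", "M", "F", "M"),
  status = c(rep("case", 3), rep("control", 3))
)

class(dat)                   # "data.frame"
head(dat)                    # first rows (all six here)
table(dat$sex, dat$status)   # counts for each sex-by-status combination
```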
Questions?