Introduction to R Studio: Basic Features, Rudimentary Data Analysis, & Graphing Options
Organization
1. What is R Studio & Why should you use it?
2. Where to get R Studio
3. Layout of R Studio
4. Installing Packages
5. Loading Packages
6. Bringing Data into R Studio/Exporting Data
7. Data Types in R
8. Helpful Resources
1-A: What is R Studio?
- An alternative to the base R interface
- Benefits of R Studio
  - Friendlier user interface
  - Open source, with a paid option as well (not worth it for typical users)
  - Same install process
  - Easier transition from Stata to R
  - Multiple window views at once: Results, Code, Working Environment, and Graphs
1-B: What is R Studio?
- Benefits of R in General
  - Open source
    - Means free
  - Can make a working object out of anything
    - Lets you save:
      - Results
      - Graphs (this is really helpful: you can add elements to graphs after you create them!)
      - Data frames
      - Vectors
      - Etc.
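A minimal sketch of what "a working object out of anything" can look like in practice. The object names (`scores`, `students`, `fit`) and the values are made up for illustration and are not from the slides; `cars` is a small data set that ships with R.

```r
# Store a numeric vector as an object
scores <- c(88, 92, 79)

# Store a data frame built from that vector
students <- data.frame(
  name = c("Ann", "Bo", "Cy"),
  score = scores,
  stringsAsFactors = FALSE
)

# Store the results of a regression as an object
fit <- lm(dist ~ speed, data = datasets::cars)

# Pull pieces back out of the stored objects later
mean_score <- mean(students$score)
slope <- coef(fit)[["speed"]]
```

Because the regression is kept in `fit`, it can be inspected again later (for example with `summary(fit)`) without re-running the model.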
1-C: What is R Studio?
- An Important Note: Citing R Studio & Packages
  - If you use R or R Studio for analyses that will be published, always cite the version of R and the versions of the packages you used
  - Results can change across versions; this is not typical for versions of R, but fairly typical for package versions
  - Is this a reason not to use R/R Studio?
    - No, Stata has the same problem, but it is rarely acknowledged
    - See http://www.ssc.wisc.edu/sscc/pubs/stata_psmatch.htm
    - Past users of psmatch2 were obtaining systematically biased results (it has been fixed now, to the best of my knowledge)
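One way to collect the version information for a citation, sketched with base R functions only. The variable names are illustrative; `"stats"` is used as the example package only because it ships with R, so substitute the packages you actually used.

```r
# Version of R used for the analysis
r_version <- R.version.string

# Version of a specific package ("stats" ships with R)
stats_version <- as.character(packageVersion("stats"))

# Suggested citation for R itself;
# citation("pkgname") gives the citation for an installed package
r_citation <- citation()

# Full record: R version, platform, and attached packages with versions
session <- sessionInfo()

print(r_version)
print(stats_version)
```

Printing `session` at the end of an analysis script is a simple way to keep a record of every package version alongside the results.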
1-D: Why Should You Use R Studio?
- Versatility
  - Packages are created and updated frequently
  - Adapts to newer methods faster than Stata or SPSS
  - This is a double-edged sword, though
- Superior graphing capabilities
  - Yup, I said it...
  - Seriously: graphs may be stored and have plots added/changed after they have been made
  - Great package for excellent plots: ggplot2 (I'll be demonstrating plots with this later)
- Knowledge of code
  - Transferable to other programs (kind of)
  - E.g., Matlab, SAS, C++, etc.
- Have I mentioned that it is free?
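A short sketch of the "store a graph, then add to it" idea, assuming the ggplot2 package is already installed (the code skips the plot if it is not). The data set (`mtcars`, which ships with R), the chosen variables, and the labels are illustrative assumptions, not the plots demonstrated in the talk.

```r
# Only run the plotting code when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Create a scatterplot and store it as an object instead of drawing it
  p <- ggplot(datasets::mtcars, aes(x = wt, y = mpg)) +
    geom_point()

  # Later: add a fitted line to the stored plot
  p <- p + geom_smooth(method = "lm", se = FALSE)

  # Later still: add labels
  p <- p + labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

  # Run print(p) to draw the finished graph in the Plots pane
}
```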
2-A: Where to get R or R Studio
- Options for using R
  - Base R package
    - http://cran.r-project.org/bin/windows/base/
  - An alternative: R Studio
    - http://www.rstudio.com/
    - I find it to be much more user-friendly
    - Also, an easier transition from Stata
      - Look and feel of the program is similar
- Differences
  - As far as I can tell, none for the typical user beyond the different interface
- Important note
  - The base R package must be installed before R Studio will work
3-A: Layout of R Studio
- Four Primary Windows
  - Console
    - Equivalent of the Results window in Stata
  - Source
    - Equivalent of the Do-file Editor in Stata
    - Not open by default
  - Environment
    - Closest equivalent is the Properties window in Stata
  - Files/Plots/Packages/etc. Viewer
    - Similar to the Properties window in Stata but has much more functionality
3-B: Layout of R Studio
- Default Setup
3-C: Layout of R Studio
- With Source Code Window
4-A: Installing Packages into R Studio
- Works similarly to adding packages in Stata (e.g., psmatch2, gllamm, etc.)
- Two methods:
  - Point and click
  - Syntax
4-B: Installing Packages into R Studio
- Point and Click Method
  - Step 1: Tools > Install Packages
  - Step 2: Enter the package name and check the dependencies box
4-C: Installing Packages into R Studio
- Point and Click Method
  - Step 3: Install the package
  - Step 4: Verify the install in the Console
4-D: Installing Packages into R Studio
- Syntax Method
  - Step 1: Write, then run, the code
  - Step 2: Verify the install in the Console
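The syntax method comes down to a call to `install.packages()`. The sketch below wraps it in a small helper so that nothing is downloaded when the package is already present. The helper name `install_if_missing` and the choice of CRAN mirror are assumptions for illustration, not part of the original slides.

```r
# Hypothetical helper: install a package only when it is not already installed
install_if_missing <- function(pkg, repos = "https://cloud.r-project.org") {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed, nothing to do
  }
  # dependencies = TRUE mirrors the dependencies checkbox in the dialog
  install.packages(pkg, dependencies = TRUE, repos = repos)
  invisible(TRUE)
}

# "stats" ships with R, so this call does not download anything
installed_now <- install_if_missing("stats")
```

Calling `install_if_missing("ggplot2")` would download and install ggplot2 if it is missing. The plain one-line equivalent is `install.packages("ggplot2")`; progress messages in the Console show whether the install succeeded.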
5-A: Loading Packages in R Studio
- Unlike Stata, packages other than the base packages are not loaded by default
  - They need to be explicitly loaded by the user in each session
  - This can be adjusted through the Tools > Global Options menu
- Again, two methods:
  - Point and click
  - Syntax
5-B: Loading Packages in R Studio
- Point and Click Method
  - Step 1: Packages tab > check the box next to the package
  - Step 2: Verify the load in the Console
5-C: Loading Packages in R Studio
- Syntax Method
  - Step 1: Write, then run, the code
  - Step 2: Verify the load in the Console
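A minimal sketch of the syntax method using `library()`. The `splines` package is used here only because it ships with R but is not attached by default, so the example runs without installing anything; for an add-on package the call would look the same, e.g., `library(ggplot2)`.

```r
# Step 1: load (attach) the package for this session
library(splines)

# Step 2: verify the load by checking the search path
is_loaded <- "package:splines" %in% search()
print(is_loaded)
```

If a package has not been installed, `library()` stops with an error, which is the usual sign that the install step needs to be run first.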
6-A: Bringing Data into R Studio
- Native R data formats
  - .RData or .rda
- Can import multiple types of files, just like Stata
  - Text data (.txt, .TXT, etc.)
    - Can be imported using the read.table(...) function
  - Spreadsheet data (.csv, etc.)
    - Must use the sep option for comma-separated files, e.g., read.table("file name here", sep = ",")
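The import and export functions can be sketched as a round trip. The data set and file paths below are made up for illustration; temporary files are used so the example leaves nothing behind. `write.csv()`, `write.table()`, `read.csv()`, `save()`, and `load()` are base R functions that the slide does not name but that pair naturally with `read.table()`.

```r
# Small example data set (made up for illustration)
dat <- data.frame(id = 1:3, group = c("a", "b", "a"),
                  stringsAsFactors = FALSE)

# Temporary file paths
csv_path <- tempfile(fileext = ".csv")
txt_path <- tempfile(fileext = ".txt")
rda_path <- tempfile(fileext = ".RData")

# Export: comma-separated and tab-separated text files
write.csv(dat, csv_path, row.names = FALSE)
write.table(dat, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)

# Import: read.table() needs sep = "," for comma-separated files
from_csv <- read.table(csv_path, header = TRUE, sep = ",",
                       stringsAsFactors = FALSE)
from_txt <- read.table(txt_path, header = TRUE, sep = "\t",
                       stringsAsFactors = FALSE)

# read.csv() is read.table() with header = TRUE and sep = "," preset
from_csv2 <- read.csv(csv_path, stringsAsFactors = FALSE)

# Native R format: save() writes objects, load() restores them by name
save(dat, file = rda_path)
rm(dat)
load(rda_path)
```

After `load()`, the object comes back under its original name (`dat`), which differs from the text importers, where the result must be assigned to a name.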
7-A: Basic Features of R Studio
- Data Storage Types
  - Numeric
    - Closest equivalent in Stata is "double"
    - Meaningful numbers
  - Character
    - Equivalent in Stata is "string"
    - Contains text characters
  - Logical
    - TRUE or FALSE
  - Special
    - NA (Not Available)
      - Typical missing value indicator
      - Can have a variety of classes
    - Inf and -Inf
      - Basically, what is returned when a number is too large in magnitude to represent (e.g., 1/0 or -1/0)
    - NaN
      - Not a number (e.g., 0/0)
    - NULL
      - Distinct from NA
      - Cannot be stored as an element of a vector; has length 0
      - Best thought of as "undefined"
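The storage types and special values above can be checked directly in the Console. This is a small sketch using base R only; the variable names are illustrative.

```r
# The three main storage types
class(3.14)      # "numeric"
class("hello")   # "character"
class(TRUE)      # "logical"

# NA can take on different classes
class(NA)             # "logical"
class(NA_character_)  # "character"
class(NA_real_)       # "numeric"

# Inf, -Inf, and NaN come out of arithmetic
pos_inf <- 1 / 0      # Inf
neg_inf <- -1 / 0     # -Inf
not_num <- 0 / 0      # NaN

# NULL is distinct from NA: NA holds a place in a vector, NULL does not
with_na   <- c(1, NA, 3)
with_null <- c(1, NULL, 3)
length(with_na)    # 3
length(with_null)  # 2
is.null(NULL)      # TRUE
```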
7-B: Basic Features of R Studio
- Types of Data
  - Vectors
    - May be numeric, character, or logical
  - Matrices
    - Combination of vectors
    - Vectors must all be the same storage type (e.g., numeric) and length
  - Arrays
    - Generalization of matrices to any number of dimensions
    - Same restrictions apply
  - Data Frames
    - Most important for our purposes
    - Vectors comprising it may have multiple storage types
  - Lists
    - Serve a number of purposes
    - Can be made up of any combination of data storage types
  - Factors
    - Explicitly tell R that a variable is nominal or ordinal (similar to the "encode" command in Stata)
    - Integers are mapped to character values; the character values are kept as labels
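Each of the structures above can be built in a line or two. The sketch below uses made-up values and illustrative object names; only base R functions are involved.

```r
# Vector: one storage type
v <- c(1.5, 2.5, 3.5)

# Matrix: vectors of the same type and length bound together as columns
m <- cbind(a = 1:3, b = 4:6)

# Array: here with three dimensions (2 x 3 x 4)
arr <- array(1:24, dim = c(2, 3, 4))

# Data frame: columns may have different storage types
df <- data.frame(
  id = 1:3,
  name = c("Ann", "Bo", "Cy"),
  passed = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# List: any combination of objects, including other structures
lst <- list(numbers = v, label = "mixed", table = df)

# Factor: an ordinal variable with labeled levels
f <- factor(c("low", "high", "medium", "low"),
            levels = c("low", "medium", "high"),
            ordered = TRUE)

dim(m)             # 3 2
sapply(df, class)  # integer, character, logical
as.integer(f)      # 1 3 2 1 (the integer codes behind the labels)
levels(f)          # "low" "medium" "high"
```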
8-A: Helpful Resources
- Websites
  - Coursera: Johns Hopkins Data Science Series
    - https://www.coursera.org/learn/r-programming
  - R Website (PDF)
    - https://cran.r-project.org/doc/contrib/usingR.pdf
  - Stack Overflow
    - http://stackoverflow.com/
  - Swirl: Interactive Learning in R (I used this)
    - http://swirlstats.com/
8-B: Helpful Resources
- (Cheap!) Books
  - R Cookbook
    - http://smile.amazon.com/Cookbook-OReilly-Cookbooks-Paul-Teetor/dp/0596809158/ref=sr_1_1?ie=UTF8&qid=1460556795&sr=8-1&keywords=r+cookbook
  - R Graphics Cookbook (uses ggplot2)
    - http://smile.amazon.com/R-Graphics-Cookbook-Winston-Chang/dp/1449316956/ref=sr_1_1?ie=UTF8&qid=1460557744&sr=8-1&keywords=r+graphics+cookbook