Introduction to R Las Vegas 2015 James McCaffrey Microsoft Research, Advanced Development Tuesday, October 27, 2015 2:15 - 3:30 PM


Agenda
- What is R?
- Why consider learning R?
- Three R Development Environments
- Examples of R vs. C#
- Summary, Resources, Q&A

What is R?
R is a scripting language, plus an interactive shell environment, plus a large library of math functions. R is open source and has strong support across industry, research, government, and academia.

What is R - The Hello World of R

> setwd("C:\\IntroToR")
>
> t <- read.table("Income.txt", header=TRUE, sep=",")
>
> head(t, n=3)
  Occupation Age Tech Income
1 Developer
  Developer
  Developer
>
> m <- lm(t$Income ~ (t$Occupation + t$Age + t$Tech))
>
> summary(m)

Call:
lm(formula = t$Income ~ (t$Occupation + t$Age + t$Tech))

Residuals:

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)
t$OccupationManager
t$OccupationQuality                                       *
t$Age
t$Tech                                                    *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 3 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: 20.6 on 4 and 3 DF, p-value:
>
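The same kind of model can be tried without the Income.txt file. The following is a minimal sketch, assuming a small in-memory data frame whose column names mirror the slide; the values are invented for illustration and are not the data used in the talk. It also passes `data=` to `lm()` instead of prefixing each column with `t$`, which gives shorter coefficient names.

```r
# Hypothetical data: values are invented for illustration only
# and are NOT the contents of Income.txt from the slide.
df <- data.frame(
  Occupation = factor(c("Developer", "Developer", "Developer",
                        "Manager", "Manager", "Manager",
                        "Quality", "Quality")),
  Age    = c(25, 32, 40, 45, 38, 52, 29, 35),
  Tech   = c(6, 7, 8, 4, 5, 3, 6, 5),
  Income = c(62, 70, 85, 90, 78, 97, 55, 60)
)

# Linear regression: predict Income from Occupation, Age, and Tech.
# Passing data= avoids the t$ prefixes used on the slide.
model <- lm(Income ~ Occupation + Age + Tech, data = df)

print(coef(model))     # one estimate per model term
print(summary(model))  # same kind of report as shown on the slide
```

With eight rows and five estimated coefficients, this sketch leaves 3 residual degrees of freedom, the same count the slide output reports.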

Why R?
- The most common language for Data Science
- Microsoft acquisition of Revolution Analytics (Revolution R)
- Microsoft Azure ML and ML Studio
- Consulting aspect
- Big Data and little data (and IoT?)
- Relatively easy to learn*
- R Consortium

Installing the Base R Environment

Launching R – Start Menu (Rgui.exe)

Launching R – File Explorer (Rterm.exe)

The RStudio Environment

The Revolution R (Microsoft) Environment

R vs. C# (the t-Test)
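The screenshots for this comparison are not reproduced in the text. As a rough sketch of the R side only, base R's `t.test()` runs a two-sample t-test in a single call. The two score vectors below are invented for illustration and are not the data from the talk.

```r
# Hypothetical scores for two groups (invented for illustration)
groupA <- c(78, 87, 79, 82, 87, 90, 88, 79)
groupB <- c(81, 80, 72, 65, 74, 70, 76, 73)

# Two-sample t-test; t.test() defaults to the Welch variant,
# which does not assume equal variances in the two groups.
result <- t.test(groupA, groupB)

cat("t statistic =", result$statistic, "\n")
cat("p-value     =", result$p.value, "\n")
```

A C# version of the same test typically has to compute the statistic and the t-distribution probability in code or pull in a numerical library, which is presumably the contrast the slide title points at.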

R vs. C# (LDA Analysis)

R vs. C# (Graphing)

Programming using R and OOP

# file CarClass.R
require("R6")
Car <- R6Class("Car",
  public = list(
    make = NULL,
    price = NULL,
    initialize = function(ma, pr) {
      self$make <- ma
      self$price <- pr
    },
    setMake = function(ma) {
      self$make <- ma
    },
    # setPrice = function(pr) { self$price <- pr },
    display = function() {
      cat("Make = ", self$make, " Price = ", self$price, "\n")
    }
  )
)

> source("CarClass.R")
>
> myCar <- Car$new("Audi", 40000)
>
> myCar$display()
Make =  Audi  Price =  40000
>
> myCar$setMake("BMW")
> myCar$price =
>
> print(myCar)
Public:
  display: function
  initialize: function
  make: BMW
  price:
  setMake: function
>
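R6 is an add-on package that must be installed separately. For comparison, a similar class can be sketched with Reference Classes, which ship with base R in the `methods` package. This is an illustrative variant and not code from the talk: the class name `CarRC` and the price 35000 are hypothetical.

```r
# Sketch: a Car-like class using base R Reference Classes (no R6 needed).
library(methods)  # normally attached already; loaded explicitly to be safe

CarRC <- setRefClass("CarRC",
  fields = list(make = "character", price = "numeric"),
  methods = list(
    setMake = function(ma) {
      make <<- ma            # <<- assigns to the object's own field
      invisible(.self)
    },
    display = function() {
      cat("Make =", make, " Price =", price, "\n")
    }
  )
)

myCar <- CarRC$new(make = "Audi", price = 40000)
myCar$display()

myCar$setMake("BMW")     # change a field through a method
myCar$price <- 35000     # fields are public, so direct assignment also works
myCar$display()
```

Both styles give mutable objects with `object$method()` call syntax, which tends to feel familiar to C# developers.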

R vs. C# (Packages, Libraries, Scripts)
- An R package is a collection of one or more files that contain R functions
- An R library is 1.) R terminology for the location of a package, or 2.) a DLL (on Windows)
- The install.packages() command installs an R package
- The library() command loads an R package for use
- An R script is a set of R commands
- R has basic control structures (if-else, for, while, repeat) and four different OOP paradigms
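The points above can be sketched in a short script. This is a minimal illustration, not code from the talk: the function name `classify` and the loop bounds are made up, and the package lines are left commented out because installing requires network access.

```r
# Install a package once, then load it in each session that needs it.
# "R6" is used here only as an example package name.
# install.packages("R6")
# library(R6)

# if-else: in R it is an expression, so it can be a function's return value
classify <- function(x) {
  if (x %% 2 == 0) "even" else "odd"
}

# for: iterate over the elements of a vector
total <- 0
for (i in 1:5) {
  total <- total + i
}

# while: loop as long as the condition holds
n <- 0
while (n < 3) {
  n <- n + 1
}

# repeat: loop until an explicit break
k <- 0
repeat {
  k <- k + 1
  if (k >= 4) break
}

cat(classify(total), total, n, k, "\n")
```

Saving these lines to a file and running it with `source()` or `Rscript` is all that is meant by an R script.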

Alternatives to R
- MATLAB – very pricey
- Mathematica – pricey
- Scilab, Octave – open source alternatives to MATLAB
- SAS – very pricey
- SPSS (IBM) – very pricey
- Python – general purpose (with SciPy library)

Your Four Possible Roles with R
- Use R in interactive mode for ad hoc data analysis
- Act as a data expert to help an R consultant
- Write R scripts to automate recurring data analysis
- Write R code to create custom data analysis

Summary
- R is the deeply entrenched default language for “Data Science”
- RStudio is the most common optional environment
- Understanding statistics* is the key to R
- C# is general purpose, R is domain specific
- The best examples of R code are scattered across chaotic Web pages

Resources
- McCaffrey, J., “Introduction to R for C# Programmers”, Microsoft MSDN Magazine, July 2015 (vol. 30, no. 7)
- McCaffrey, J., “Introduction to R for .NET Developers”, Visual Studio Magazine, December 2015 (vol. 25, no. 12)

Thank You!