Introduction to R and Data Science Tools in the Microsoft Stack Jamey Johnston.

Presentation transcript:

Introduction to R and Data Science Tools in the Microsoft Stack Jamey Johnston

Agenda
- Intro to R
  - R and RStudio
  - Basics
  - Objects in R
  - Packages
  - Control Flows
  - RStudio Overview
- MS and R
  - Azure ML
  - MS R Server
  - SQL 2016
  - Power BI
- Resources
May 2016 | Intro to R & Data Science Tools in the MS Stack

Jamey Johnston
- Data Scientist for an O&G Company
- 20+ years DBA Experience
- TAMU MS in Analytics
- Semi-Pro Photographer
- Download Code Here!

R and RStudio
- R Project for Statistical Computing
- RStudio

Basics
- # starts a comment
    > # Basics
- Variable Creation
    > m <- 3 * 5
    > m
    [1] 15
- Help
    > help("lm") # lm is the function for fitting linear models
    > ?lm
    > lm(y ~ x)

Objects in R
- Variables, Values, Commands, Functions …
- Everything in R is an Object
- Typical Data in R is stored in:
  - Vectors (one dimension, same data type)
  - Matrices (rows and columns, same data type)
  - Data Frames (multiple rows, multiple data types)
    - It's like a Table!
  - Lists (collection of objects)
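A minimal sketch of the four containers listed on this slide, using made-up values. The variable names are illustrative and are not taken from the original demos.

```r
# One small example of each container type named on the slide.
v  <- c(2, 3, 1.5)                        # vector: one dimension, one type
m  <- matrix(1:6, nrow = 2)               # matrix: rows x columns, one type
df <- data.frame(id = 1:3,                # data frame: columns may differ in type
                 name = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
l  <- list(vec = v, mat = m, frame = df)  # list: any mix of objects

# class() reports what kind of object each one is.
object_classes <- sapply(list(v = v, m = m, df = df, l = l),
                         function(obj) class(obj)[1])
print(object_classes)

# Mixing types in a vector silently coerces everything to one type.
mixed <- c(1, "two", TRUE)
print(class(mixed))
```

The last two lines show why "same data type" matters: a vector that mixes numbers and text becomes a character vector.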

Vector
- Building blocks for data objects in R
- c (combine) function creates a vector
    v <- c(2, 3, 1.5, 3.1, 49)
- seq function generates numeric sequences
    s <- seq(from = 0, to = 100, by = 0.1)
- rep function replicates values
    r <- rep(c(1, 4), times = 4)
- : creates a number sequence incremented by 1 or -1
    colon <- 1:10
- length(var) returns the length of a vector
    length(colon)
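The vector functions on this slide can be tried together as below. This is a short sketch that reuses the slide's values; the indexing lines and the `countdown` name are additions for illustration.

```r
v <- c(2, 3, 1.5, 3.1, 49)

# Arithmetic applies to every element at once.
print(v * 2)

# Index by position or by a logical condition.
print(v[c(1, 5)])
print(v[v > 3])

# rep() repeats the whole pattern `times` times.
r <- rep(c(1, 4), times = 4)
print(r)

# seq() includes both end points when the step divides the range evenly.
s <- seq(from = 0, to = 100, by = 0.1)
print(length(s))

# The colon operator counts down when the left side is larger.
countdown <- 10:1
print(countdown)
```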

Matrix
- matrix function used to build a matrix
- rbind (row bind) and cbind (column bind)
  - Combine matrices by row or column
- Demos
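The original demos are not part of this transcript. As a stand-in, here is a small sketch of matrix, rbind and cbind with made-up values.

```r
# matrix() fills column by column unless byrow = TRUE is given.
m <- matrix(1:6, nrow = 2, ncol = 3)
print(m)

by_row <- matrix(1:6, nrow = 2, byrow = TRUE)
print(by_row)

# rbind() stacks rows; cbind() places columns side by side.
stacked <- rbind(m, by_row)    # 4 x 3
widened <- cbind(m, by_row)    # 2 x 6
print(dim(stacked))
print(dim(widened))

# Elements are addressed as [row, column].
print(m[2, 3])
```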

Data Frame
- It is like a table!
- rownames – extract row labels
- colnames – extract column labels
- read.table, read.csv, readxl, RODBC
  - Different ways to create data frames
- Demos
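A hedged sketch of two of the ways to create a data frame mentioned above. The `wells` data is hypothetical, and `read.csv` reads from an inline string here only so the example needs no file.

```r
# Build a data frame directly from vectors (columns may have different types).
wells <- data.frame(
  name   = c("A-1", "B-2", "C-3"),     # hypothetical well names
  depth  = c(1200.5, 980.0, 1510.2),   # hypothetical depths
  active = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)
rownames(wells) <- wells$name

print(colnames(wells))
print(rownames(wells))

# read.csv() normally takes a file path; text = keeps this sketch self-contained.
csv_text <- "name,depth\nA-1,1200.5\nB-2,980.0"
from_csv <- read.csv(text = csv_text, stringsAsFactors = FALSE)
str(from_csv)

# Filter rows with a logical condition, much like a WHERE clause.
deep <- wells[wells$depth > 1000, ]
print(deep)
```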

List
- Combines multiple object types into one object
  - vectors, matrices, data frames, lists, functions
- Typically used by functions to return model output
  - e.g. the output from the lm function
- Demo
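To illustrate the point about lm, the sketch below builds a small list by hand and then inspects a fitted model. It uses R's bundled mtcars data as an assumption; the original demo may have used something else.

```r
# A list can hold objects of different kinds under named elements.
bundle <- list(numbers = c(1, 2, 3),
               label   = "example",
               table   = head(mtcars, 3))
print(names(bundle))
print(bundle$label)         # access by name with $
print(bundle[["numbers"]])  # or with [[ ]]

# Model functions such as lm() return their results as a list.
fit <- lm(mpg ~ wt, data = mtcars)
print(is.list(fit))
print(names(fit))
print(fit$coefficients)
```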

Missing Data
- NA is used to represent Missing Data
- The is.na and which functions are used to manage NA
    > x <- c(1.3, 2.3, 3.4, NA)
    > print(x)
    [1] 1.3 2.3 3.4  NA
    >
    > # Returns integer location of values (not the values)
    > n <- which(is.na(x))
    > v <- which(!is.na(x))
    > print(n)
    [1] 4
    > print(v)
    [1] 1 2 3
    >
    > # y will be set to the values that are not NA
    > y <- x[!is.na(x)]
    > print(y)
    [1] 1.3 2.3 3.4
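Beyond is.na and which, a few other base R behaviours around NA are worth seeing. This sketch reuses the slide's vector; the data frame is added for illustration.

```r
x <- c(1.3, 2.3, 3.4, NA)

# Arithmetic on a vector containing NA yields NA unless told otherwise.
print(mean(x))
print(mean(x, na.rm = TRUE))

# Comparing with == never finds NA; is.na() is the reliable test.
print(x == NA)
print(sum(is.na(x)))

# complete.cases() flags the rows of a data frame that have no NA in any column.
df <- data.frame(id = 1:4, value = x)
print(df[complete.cases(df), ])
```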

Packages
- Add-ons for R
- library()
  - List packages already installed
- install.packages(c("dplyr", "ggplot2"))
  - Install new packages
- library(dplyr)
  - Load package to be used in R
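A common pattern is to install a package only when it is missing. The helper name `use_package` below is hypothetical, not a base R function; the sketch calls it with the built-in stats package so nothing is downloaded.

```r
# Install a package only when it is not already available, then load it.
# `use_package` is a hypothetical helper name, not part of base R.
use_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # needs a CRAN mirror and network access
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" ships with R, so this call never triggers a download.
use_package("stats")

# installed.packages() returns a matrix with one row per installed package.
installed <- rownames(installed.packages())
print(head(installed))
```

In a real script the call would be something like use_package("dplyr"), which does need network access the first time.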

Conditional Operators
- Comparisons return a logical vector
    > 1:10 == 2
    [1] FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    > 1:10 != 2
    [1] TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    > 1:10 > 2
    [1] FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    > 1:10 >= 2
    [1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    > 1:10 < 2
    [1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    > 1:10 <= 2
    [1] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    > x <- 2
    > x > 1
    [1] TRUE

Logical Operations
    > x <- 1:4
    > x
    [1] 1 2 3 4
    >
    > (x > 2) | (x <= 3)
    [1] TRUE TRUE TRUE TRUE
    >
    > (x > 2) & (x <= 3)
    [1] FALSE FALSE TRUE FALSE
    >
    > xor((x > 2), (x < 4))
    [1] TRUE TRUE FALSE TRUE
    >
    > 0:5 %in% x
    [1] FALSE TRUE TRUE TRUE TRUE FALSE

Control Flows
- IF … ELSE
    x <- 4
    if (x < 3) print("true") else print("false")
    ifelse(x < 3, "true", "false")
- FOR Loops
    for (i in 1:10) print(1:i)
    for (i in seq_len(nrow(df))) print(df[i, ])
- WHILE Loops
    i <- 1
    while (i <= 10) {
      print(i)
      i <- i + 1
    }
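The difference between if / else and ifelse is easiest to see on a vector. A brief sketch with made-up temperatures:

```r
temps <- c(-5, 12, 30)

# ifelse() is vectorised: it tests every element and returns a vector.
labels <- ifelse(temps < 0, "freezing", "above freezing")
print(labels)

# if / else expects a single TRUE or FALSE, so loop (or use ifelse) for vectors.
# seq_along() is safe even when the vector is empty, unlike 1:length(x).
hot_count <- 0
for (i in seq_along(temps)) {
  if (temps[i] > 25) {
    hot_count <- hot_count + 1
  }
}
print(hot_count)

# The same count without a loop: TRUE sums as 1.
print(sum(temps > 25))
```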

RStudio
- Run Options
  - Ctrl+Enter
  - Ctrl+Alt+R
- Built-In Docs
- Version Control
- Projects

RStudio Debugging
- Breakpoints (Shift+F9)
- R Functions
  - browser()
  - debugonce()
- Environment Pane
- Traceback (Callstack)
- Console
  - Step into function (Shift+F4)
  - Finish Function (Shift+F6)
  - Continue Running (Shift+F5)
  - Stop Debugging (Shift+F8)

Azure ML
- Azure Machine Learning
- R Integration

MS R Server
- Enterprise Class R
- Built on Revolution Analytics acquisition
- SQL Server 2016 R Support via R Server
Source: Microsoft website

SQL 2016 and R
- Leverages the MS R Server
- Setup and Installation
  - Install Advanced Analytics Extensions
  - Install R Packages and Providers for SQL Server R Services
  - Post-Installation Server Configuration

SQL 2016 and R
- SQL Server R Services Tutorials
- DEMO - iris-sepal-example.sql
- sp_execute_external_script (Transact-SQL)
    sp_execute_external_script
        @language = N'language',
        @script = N'script'
        [ , @input_data_1 = N'input_data_1' ]
        [ , @input_data_1_name = N'input_data_1_name' ]
        [ , @output_data_1_name = N'output_data_1_name' ]
        [ WITH <execute_option> [ ,...n ] ]
    [;]
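The contents of iris-sepal-example.sql are not included in this transcript. The sketch below only shows the kind of R that could go in the @script parameter, under the assumption that the default data frame names InputDataSet and OutputDataSet are used. The aggregation itself is an illustrative choice, not the original demo.

```r
# Stand-in for the rows SQL Server would pass in from the @input_data_1 query.
# Inside sp_execute_external_script this data frame is supplied for you
# (default name InputDataSet); here it is built from R's bundled iris data.
InputDataSet <- iris[, c("Sepal.Length", "Sepal.Width", "Species")]

# The body below is the kind of text that would go in the @script parameter:
# compute mean sepal measurements per species.
OutputDataSet <- aggregate(cbind(Sepal.Length, Sepal.Width) ~ Species,
                           data = InputDataSet, FUN = mean)

# SQL Server returns OutputDataSet to the caller as a result set.
print(OutputDataSet)
```

Writing and testing the script in plain R first, then pasting it into @script, keeps the T-SQL wrapper small.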

Power BI
- Running R Scripts in Power BI Desktop
- Demo – mtcars.pbix
[Slide image: Options Needed]
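The mtcars.pbix demo file is not reproduced here. As a rough sketch of what an R visual script can look like: Power BI passes the selected fields to the script as a data frame named `dataset`, which is simulated below from R's bundled mtcars data. The choice of plot is an assumption.

```r
# In a Power BI R visual, the selected fields arrive as a data frame named
# `dataset`. Here it is filled from R's bundled mtcars data so the script
# can be run outside Power BI as well.
dataset <- mtcars[, c("wt", "mpg", "cyl")]

# Fit a simple linear trend to overlay on the scatter plot.
trend <- lm(mpg ~ wt, data = dataset)

# Power BI displays whatever the script draws on the default graphics device.
plot(dataset$wt, dataset$mpg,
     col  = as.integer(factor(dataset$cyl)),
     pch  = 19,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "mtcars: mpg vs. weight")
abline(trend, lty = 2)
```

When pasting into Power BI, drop the first assignment, since `dataset` is created by the visual itself.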

Resources
- UCLA IDRE
- R-Bloggers (sign up for the daily email)
- Quick-R
  - R in Action (book to go with website)
- Hadley Wickham

Thank You Sponsors!
- Visit the Sponsor tables to enter their end of day raffles.
- Turn in your completed Event Evaluation form at the end of the day in the Registration area to be entered in additional drawings.

Questions?
Thank you for attending!
- Download Demos and PPT