Introduction to R and Data Science Tools in the Microsoft Stack Jamey Johnston.

Slides:



Advertisements
Similar presentations
R for Macroecology Aarhus University, Spring 2011.
Advertisements

Refresh- Caitlin Collins, Thibaut Jombart MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis using
Introduction to Matlab Workshop Matthew Johnson, Economics October 17, /13/20151.
Lab7: Introduction to Arduino
Example of multivariate data What is R? R is available as Free Software under the terms of the Free Software Foundation'sFree Software Foundation GNU General.
Precedence Parentheses Arithemetic ^ * / + - (exception logical not ~ ) Relational > =
Precedence Parentheses Arithemetic ^ * / + - (exception logical not ~ ) Relational > =
Lecture 2 LISAM. Statistical software.. LISAM What is LISAM? Social network for Creating personal pages Creating courses  Storing course materials (lectures,
Introduction to R: The Basics Rosales de Veliz L., David S.L., McElhiney D., Price E., & Brooks G. Contributions from Ragan. M., Terzi. F., & Smith. E.
SQL Server Reporting Services
LISA Short Course Series Basics of R Lin Zhang Feb. 16, 2015 LISA: Basics of RFeb. 16, 2015.
SQL Server Integration Services (SSIS) Presented by Tarek Ghazali IT Technical Specialist Microsoft SQL Server (MVP) Microsoft Certified Technology Specialist.
LISA Short Course Series R Basics Ana Maria Ortega Villa Fall 2013 LISA: R BasicsFall 2013.
732A44 Programming in R.  Self-studies of the course book  2 Lectures (1 in the beginning, 1 in the end)  Labs (computer). Compulsory submission of.
Data, graphics, and programming in R 28.1, 30.1, Daily:10:00-12:45 & 13:45-16:30 EXCEPT WED 4 th 9:00-11:45 & 12:45-15:30 Teacher: Anna Kuparinen.
Arko Barman with modification by C.F. Eick COSC 4335 Data Mining Spring 2015.
1 SQL Server 2000 Administration Kashef Mughal MSB.
We will start soon. Feel free to ask (chat window) anything you want before we start.
Data Objects in R Vector1 dimensionAll elements have the same data types Data types: numeric, character logic, factor Matrix2 dimensions Array2 or more.
DTS Conversion to SSIS Conversion Best Practices Mike Davis
Vectors and Matrices In MATLAB a vector can be defined as row vector or as a column vector. A vector of length n can be visualized as matrix of size 1xn.
Hands-on Introduction to R. We live in oceans of data. Computers are essential to record and help analyse it. Competent scientists speak C/C++, Java,
R Programming Yang, Yufei. Normal distribution.
Introduction to Programming in R Department of Statistical Sciences and Operations Research Computation Seminar Series Speaker: Edward Boone
XP Tutorial 10New Perspectives on HTML and XHTML, Comprehensive 1 Working with JavaScript Creating a Programmable Web Page for North Pole Novelties Tutorial.
Introduction to R Introductions What is R? RStudio Layout Summary Statistics Your First R Graph 17 September 2014 Sherubtse Training.
STAT 534: Statistical Computing Hari Narayanan
Introduction to Exploratory Descriptive Data Analysis in S-Plus Jagdish S. Gangolly State University of New York at Albany.
8 th Semester, Batch 2009 Department Of Computer Science SSUET.
Blog: R YOU READY FOR.
PHP Tutorial. What is PHP PHP is a server scripting language, and a powerful tool for making dynamic and interactive Web pages.
Scripting Just Enough SSIS to be Dangerous. 6/13/2015 Visit the Sponsor tables to enter their end of day raffles. Turn in your completed Event Evaluation.
Critical Path SQL Thane Schaffer: Database Administrator/Architect, Transcore, Albuquerque NM
Overview of Security Investments in SQL Server 2016 and Azure SQL Database Jamey Johnston 1/15/2016Security Investments in SQL Server 2016 and Azure SQL.
Blog: R YOU READY FOR.
MIS2502: Data Analytics Introduction to Advanced Analytics and R.
Power BI is Awesome! Steve Wake DW/BI Engineer, National CineMedia (NCM) Chapter Leader, Mile Hi Power BI User Group.
Pinellas County Schools
16BIT IITR Data Collection Module If you have not already done so, download and install R from download.
Blog: R YOU READY FOR.
Working with data in R 2 Fish 552: Lecture 3. Recommended Reading An Introduction to R (R Development Core Team) –
Introduction to R and Data Science Tools in the Microsoft Stack Jamey Johnston.
Machine Learning with SQL Server 2016 & R Dinesh Asanka Senior Architect – Technology VirtusaPolaris.
Introduction to R and Data Science Tools in the Microsoft Stack
Introduction to R and Data Science Tools in the Microsoft Stack
SQL 2016 R Services a.k.a. leveraging your local data lake
Programming in R Intro, data and programming structures
Introduction to R Samal Dharmarathna.
LISAM. Statistical software.
Introduction to R Carolina Salge March 29, 2017.
Introduction to R.
Intro to R & MS Data Science Tools
R in Power BI.
Introduction to R Programming with AzureML
Welcome! Power BI User Group (PUG)
Introduction to R Studio
R Integration in Microsoft Solutions
Dane Stubben QuintilesIMS Database Manager
Overview of Security Investments
Statistics 540 Computing in Statistics
Communication and Coding Theory Lab(CS491)
Installing Packages Introduction to R, Part II
Enterprise RLS in SQL Server in Power BI
Predictive Models with SQL Server Machine Learning Services
R Course 1st Lecture.
Introduction to Matlab
Data analysis with R and the tidyverse
What is New in SQL Server 2016 BI Stack
Python for Data Analysis
Ch 1 .Installing and configuring SQL Server 2005
Presentation transcript:

Introduction to R and Data Science Tools in the Microsoft Stack Jamey Johnston

Agenda  Intro to R  R and RStudio  Basics  Objects in R  Packages  Control Flows  RStudio Overview  MS and R  Azure ML  MS R Server  SQL 2016  Power BI  Resources Feb 2016Intro to R & Data Science Tools in the MS Stack2 Source:

Jamey Johnston  Data Scientist for an O&G Company  20+ years DBA Experience  TAMU MS in Analytics (2016)   Professional Photographer   Feb 2016Intro to R & Data Science Tools in the MS Stack3

R and RStudio  R Project for Statistical Computing   RStudio  Feb 2016Intro to R & Data Science Tools in the MS Stack4

Basics  # - comment > # Basics  Variable Creation > m <- 3 * 5 > m [1] 15  Help > help(“lm”) # lm is function for Fitting Linear Models > ?lm > lm(y ~ x) Feb 2016Intro to R & Data Science Tools in the MS Stack5

Objects in R  Variables, Values, Commands, Functions …  Everything in R is an Object  Typical Data in R is stored in:  Vectors (one row, same data type)  Matrices (multiple rows, same data type)  Data Frames (multiple rows, multiple data types)  It’s like a Table!  List (collection of objects) Feb 2016Intro to R & Data Science Tools in the MS Stack6

Vector  Building Blocks for data objects in R  c (combine) function to create a Vector  v <- c(2, 3, 1.5, 3.1, 49)  seq function generates numeric sequences  s <- seq(from = 0, to = 100, by =.1)  rep function replicates values  r <- rep(c(1,4), times = 4)  : creates a number seq incremented by 1 or -1  colon <- 1:10  length(var) returns length of vector  length(colon) Feb 2016Intro to R & Data Science Tools in the MS Stack7

Matrix  matrix function used to build matrix  rbind (row bind) and cbind (column bind)  Combine matrices by row or column   Demos Feb 2016Intro to R & Data Science Tools in the MS Stack8

Data Frame  It is like a table!  rownames – extract row labels  colnames – extract column labels  read.table, read.csv, readxl, RODBCreadxl  Different ways to create data frames  Demos Feb 2016Intro to R & Data Science Tools in the MS Stack9

List  Combine multiple objects types into one object  vectors, matrices, data frames, list, functions  Typically used by functions to output the model output  e.g. the output from the lm function  Demo Feb 2016Intro to R & Data Science Tools in the MS Stack10

Missing Data  NA is used to represent Missing Data  The is.na and which functions are used to manage NA > x <- c(1.3,2.3,3.4,NA) > print(x) [1] NA > > # Returns integer location of values (not the values) > n <- which (is.na(x)) > v <- which (!is.na(x)) > print(n) [1] 4 > print(v) [1] > > # y will be set to the values not = NA > y <- x[!is.na(x)] > print(y) [1] Feb 2016Intro to R & Data Science Tools in the MS Stack11

Packages  Add-ons for R  library()  List packages already installed  install.package(“dplyr2”, “ggplot2”)  Install new packages  library(dplyr2)  Load package to be used in R Feb 2016Intro to R & Data Science Tools in the MS Stack12

Conditional Operators  Comparisons return logical vector > 1:10 == 2 [1] FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE > 1:10 != 2 [1] TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE > 1:10 > 2 [1] FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE > 1:10 >= 2 [1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE > 1:10 < 2 [1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE > 1:10 <= 2 [1] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE > x <- 2 > x > 1 [1] TRUE Feb 2016Intro to R & Data Science Tools in the MS Stack13

Logical Operations > x <- 1:4 > x [1] > > (x > 2) | (x <= 3) [1] TRUE TRUE TRUE TRUE > > (x > 2) & (x <= 3) [1] FALSE FALSE TRUE FALSE > > xor((x > 2), (x < 4)) [1] TRUE TRUE FALSE TRUE > > 0:5 %in% x [1] FALSE TRUE TRUE TRUE TRUE FALSE Feb 2016Intro to R & Data Science Tools in the MS Stack14

Control Flows  IF … ELSE x <- 4 if (x < 3) print("true") else print("false") ifelse ((x < 3), print("true"), print("false"))  FOR Loops for(i in 1:10) print(1:i) for (i in 1:nrow(df)) print(df[i,])  WHILE Loops i <- 1 while (i <= 10) { print(i) i <- i + 1 } Feb 2016Intro to R & Data Science Tools in the MS Stack15

RStudio Feb 2016Intro to R & Data Science Tools in the MS Stack16  Run Options  CTL+Enter  Ctl+Alt+R  Built-In Docs  Version Control  Projects

RStudio Debugging  Breakpoints (Shift+F9)  R Functions  browser()  debugonce()  Environment Pane  Traceback(Callstack)  Console  Step into function (Shift+F4)  Finish Function (Shift+F6)  Continue Running (Shift+F5)  Stop Debugging (Shift+F8) Feb 2016Intro to R & Data Science Tools in the MS Stack17

Azure ML  Azure Machine Learning  R Integration Feb 2016Intro to R & Data Science Tools in the MS Stack18

MS R Server  Enterprise Class R  Built on Revolution Analytics acquistion  SQL Server 2016 R Support via R Server  Feb 2016Intro to R & Data Science Tools in the MS Stack19 Source: Microsoft Website (URL above)

SQL 2016 and R  Leverages the MS R Server  Setup and Installation  Install Advanced Analytics Extensions   Install R Packages and Providers for SQL Server R Services   Post-Installation Server Configuration  Feb 2016Intro to R & Data Science Tools in the MS Stack20

SQL 2016 and R  SQL Server R Services Tutorials   DEMO - iris-sepal-example.sql  sp_execute_external_script (Transact-SQL)  = = = ] 'input_data_1' = ] N'input_data_1_name' ] = 'output_data_1_name' ] [ WITH [,...n ] ] [;] Feb 2016Intro to R & Data Science Tools in the MS Stack21

Power BI  Running R Scripts in Power BI Desktop    Demo – mtcars.pbix Feb 2016Intro to R & Data Science Tools in the MS Stack22 Options Needed

Resources  UCLA idre   R-Bloggers (sign up for daily )   Quick-R   R in Action (book to go with website) Feb 2016Intro to R & Data Science Tools in the MS Stack23

Thanks to our Sponsors! Feb 2016Intro to R & Data Science Tools in the MS Stack24

Join us at our Networking happy hour immediately following the closing remarks today Fox & Hound 4301 The Lane at 25 NE, Albuquerque, NM Drinks, Light appetizers, Raffle drawings, Pool tables, dart boards, and more networking with speakers, organizers, and attendees. See you there! Feb 2016Intro to R & Data Science Tools in the MS Stack25

Questions? Thank you for attending!   Download Demos and PPT Feb 2016Intro to R & Data Science Tools in the MS Stack26