R PROGRAMMING FOR SQL DEVELOPERS Kiran Math Developer : Proterra in Greenville SC


1 R PROGRAMMING FOR SQL DEVELOPERS Kiran Math Developer Work @ : Proterra in Greenville SC kiranmath@outlook.com

2 MOTIVATION

3 GOAL Raw Sensor Data → Tidy Data

4 ZILLOW

5 INSTALLATION Comprehensive R Archive Network (CRAN): https://www.cran.r-project.org/ RStudio: https://www.rstudio.com/

6 R <- CORE && R <- PACKAGES Base packages, plus add-on packages: sqldf, RODBC, dplyr, stringr, ggplot2, reshape2, tidyr, lubridate

7 BASICS 1 - VECTOR
# Define a variable
a <- 25
# Call a variable
a
## [1] 25
# Do something to it
a + 10
## [1] 35
# Create a vector - numeric
x <- c(0.5, 0.6, 0.7)
# Call it
x
## [1] 0.5 0.6 0.7
# Do something to the vector
mean(x)
## [1] 0.6

8 BASICS 2 - FUNCTIONS
Functions are blocks of code that make R programs modular and facilitate code reuse.
funct_name <- function(arg1, arg2, ...) {
  ### do something
}
## Compute the mean of a vector of numbers
meanX <- function(a_vector) {
  s <- sum(a_vector)
  l <- length(a_vector)
  m <- s / l
  return(m)
}
### Create a vector
v <- c(1, 2, 3, 4, 5)
### Find the mean
meanX(v)
## [1] 3

9 HOME SALE Question: I have a 3,000 sq ft house. How much will it sell for?

10 Get & Tidy → Transform → Visualize → Model (workflow diagram, @hadleywickham)

11 GET DATA – FROM SQL SERVER
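The code on this slide did not survive in the transcript. A minimal sketch of one way to pull rows from SQL Server, assuming the RODBC package listed on slide 6; the function name, connection string, and table name dbo.HomeSales are hypothetical placeholders, not values from the talk:

```r
# Sketch only: read_home_sales(), the connection string, and dbo.HomeSales
# are hypothetical names used for illustration.
read_home_sales <- function(conn_string,
                            query = "SELECT SaleDate, Area, SalePrice FROM dbo.HomeSales") {
  if (!requireNamespace("RODBC", quietly = TRUE)) {
    stop("Package 'RODBC' is required")
  }
  # Open an ODBC channel from a full connection string
  ch <- RODBC::odbcDriverConnect(conn_string)
  if (!inherits(ch, "RODBC")) {
    stop("Could not connect to the database")
  }
  # Always close the channel, even if the query fails
  on.exit(RODBC::odbcClose(ch))
  # Run the query and return the result as a data frame
  RODBC::sqlQuery(ch, query, stringsAsFactors = FALSE)
}

# Example call (needs a reachable server, so it is left commented out):
# dat <- read_home_sales("driver={SQL Server};server=MYSERVER;database=MYDB;trusted_connection=true")
```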

12 GET DATA – FROM CSV FILE
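The code on this slide is missing from the transcript. A minimal sketch using base R's read.csv(); the rows are invented sample data, written to a temporary file so the example runs anywhere:

```r
# Hypothetical sample rows (not the data used in the talk)
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "SaleDate,Area,SalePrice",
  "2015-01-15,1200,95000",
  "2015-02-20,1800,130000",
  "2015-03-05,2400,165000"
), csv_path)

# Read the file into a data frame; keep text columns as character
dat <- read.csv(csv_path, stringsAsFactors = FALSE)
dat
```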

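13 DATA FRAME In the data frame dat, columns are variables and rows are observations. dat[5,3] selects the value in row 5, column 3. To preview the data frame: head(dat), tail(dat). nrow(dat) gives the number of rows.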
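The operations named on this slide can be tried on a small data frame. The values below are invented for illustration; only the column name SaleDate appears in the talk, and Area and SalePrice are assumed names:

```r
# Hypothetical home-sale data
dat <- data.frame(
  SaleDate  = c("2015-01-15", "2015-02-20", "2015-03-05", "2015-04-11", "2015-05-30"),
  Area      = c(1200, 1800, 2400, 3100, 3600),
  SalePrice = c(95000, 130000, 165000, 210000, 240000),
  stringsAsFactors = FALSE
)

head(dat, 2)   # first two observations
tail(dat, 2)   # last two observations
dat[5, 3]      # row 5, column 3
nrow(dat)      # number of observations (rows)
ncol(dat)      # number of variables (columns)
```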

14 R – STR() str(object, ...) compactly displays the internal structure of an R object; it is a diagnostic function. To change the class of column SaleDate: dat$SaleDate <- as.Date(dat$SaleDate). Data frame shown: tDat
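A short sketch of inspecting a data frame with str() and converting SaleDate from character to Date, using invented sample rows:

```r
# Hypothetical sample rows; SaleDate starts out as plain text
dat <- data.frame(
  SaleDate = c("2015-01-15", "2015-02-20"),
  Area     = c(1200, 1800),
  stringsAsFactors = FALSE
)

str(dat)                     # SaleDate is listed as chr
class_before <- class(dat$SaleDate)

# Convert the column; as.Date() parses ISO dates (YYYY-MM-DD) by default
dat$SaleDate <- as.Date(dat$SaleDate)

str(dat)                     # SaleDate is now listed as Date
class_after <- class(dat$SaleDate)
```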

15 R – SUMMARY() summary(object) shows the distribution of each variable in the data set. Data frame shown: tDat
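For numeric columns, summary() reports the minimum, quartiles, median, mean, and maximum. A small sketch on invented values:

```r
# Hypothetical sample data
dat <- data.frame(
  Area      = c(1200, 1800, 2400, 3100, 3600),
  SalePrice = c(95000, 130000, 165000, 210000, 240000)
)

summary(dat)                 # one block of statistics per column

# The same statistics for a single column, as a named vector
area_stats <- summary(dat$Area)
area_stats[["Median"]]
```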

16 RESHAPING DATA - DPLYR select() subsets variables (columns). Data frames shown: tDat, Dat
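A minimal sketch of dplyr's select(), assuming the dplyr package is installed; the data and the choice of columns are invented for illustration:

```r
library(dplyr)

# Hypothetical sample data
dat <- data.frame(
  SaleDate  = c("2015-01-15", "2015-02-20", "2015-03-05"),
  Area      = c(1200, 1800, 2400),
  SalePrice = c(95000, 130000, 165000),
  stringsAsFactors = FALSE
)

# Keep only the named columns, in the order given
kept <- select(dat, SaleDate, SalePrice)

# A minus sign drops a column instead
dropped <- select(dat, -SaleDate)
```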

17 FILTER DATA - DPLYR filter() allows you to select a subset of rows in a data frame.
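filter() plays the role of a SQL WHERE clause. A small sketch on invented data, assuming dplyr is installed:

```r
library(dplyr)

# Hypothetical sample data
dat <- data.frame(
  Area      = c(1200, 1800, 2400, 3100, 3600),
  SalePrice = c(95000, 130000, 165000, 210000, 240000)
)

# Roughly: SELECT * FROM dat WHERE Area > 2000
large <- filter(dat, Area > 2000)

# Conditions separated by commas are combined with AND
mid_range <- filter(dat, Area > 2000, SalePrice < 220000)
```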

18 PIPING - DPLYR %>% passes the object on the left-hand side as the first argument to the function on the right-hand side.
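The pipe lets a chain of steps read top to bottom instead of inside out. A sketch comparing the nested and piped forms on invented data:

```r
library(dplyr)

# Hypothetical sample data
dat <- data.frame(
  Area      = c(1200, 1800, 2400, 3100, 3600),
  SalePrice = c(95000, 130000, 165000, 210000, 240000)
)

# Nested calls: read from the inside out
nested <- select(filter(dat, Area > 2000), SalePrice)

# Piped calls: x %>% f(y) is evaluated as f(x, y)
piped <- dat %>%
  filter(Area > 2000) %>%
  select(SalePrice)

identical(nested, piped)
```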

19 RESHAPING DATA - TIDYR gather() gathers columns into rows; spread() does the opposite. Data frames shown: gDat, tDat
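A sketch of a gather()/spread() round trip, assuming the tidyr package is installed. The column names HouseId, Price2014, and Price2015 are invented for illustration:

```r
library(tidyr)

# Hypothetical wide data: one price column per year
wide <- data.frame(
  HouseId   = c(1, 2),
  Price2014 = c(100000, 150000),
  Price2015 = c(105000, 158000)
)

# Gather the two price columns into key/value rows (wide -> long)
long <- gather(wide, key = "Year", value = "Price", Price2014, Price2015)

# Spread the key/value pairs back into columns (long -> wide)
back <- spread(long, key = Year, value = Price)
```

Current tidyr releases also offer pivot_longer() and pivot_wider() for the same reshaping; gather() and spread() are kept here because they are the functions the slide names.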

20 MAKE NEW VARIABLE (COLUMN) mutate() computes and appends one or more new columns. Data frame shown: gDat
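A minimal sketch of mutate() on invented data; the derived column PricePerSqFt is a hypothetical example, not a column from the talk:

```r
library(dplyr)

# Hypothetical sample data
dat <- data.frame(
  Area      = c(1200, 1800, 2400),
  SalePrice = c(95000, 130000, 165000)
)

# Append a computed column; existing columns are left unchanged
with_rate <- mutate(dat, PricePerSqFt = SalePrice / Area)
with_rate
```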

21 RESHAPING DATA - TIDYR separate() separates one column into several; unite() does the opposite. Data frames shown: gDat, tDat
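A sketch of separate() splitting a date string into parts, assuming tidyr is installed; the data and the new column names Year, Month, and Day are invented:

```r
library(tidyr)

# Hypothetical sample data with dates stored as text
dat <- data.frame(
  SaleDate = c("2015-01-15", "2015-02-20"),
  Area     = c(1200, 1800),
  stringsAsFactors = FALSE
)

# Split SaleDate on "-" into three new character columns;
# the original SaleDate column is removed by default
parts <- separate(dat, SaleDate, into = c("Year", "Month", "Day"), sep = "-")
parts
```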

22 Get & Tidy → Transform → Visualize → Model (workflow diagram, @hadleywickham)

23 DATA VISUALIZATION – GGPLOT2 ggplot2 is based on the Grammar of Graphics. Every graph can be built from the same few components: a data set, a set of geoms (visual marks that represent the data), and a coordinate system.

24 DATA VISUALIZATION – GGPLOT2 To display data values, map variables in the data set to aesthetic properties of the geom, such as color, size, and x and y locations.

25 DATA VISUALIZATION – GGPLOT2 qplot()

26 DATA VISUALIZATION – GGPLOT2 ggplot() Add layer elements with +

27 DATA VISUALIZATION – GGPLOT2 ggplot() Add layer elements with +
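The plots from these slides are not in the transcript. A minimal sketch of building a scatter plot layer by layer, assuming ggplot2 is installed and using invented data:

```r
library(ggplot2)

# Hypothetical sample data
dat <- data.frame(
  Area      = c(1200, 1800, 2400, 3100, 3600),
  SalePrice = c(95000, 130000, 165000, 210000, 240000)
)

# Start with the data and aesthetic mappings, then add elements with +
p <- ggplot(dat, aes(x = Area, y = SalePrice)) +
  geom_point() +                                        # one point per sale
  labs(x = "Area (sq ft)", y = "Sale price (USD)")      # axis labels

print(p)  # draws the plot on the active graphics device
```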

28 LINEAR REGRESSION MODEL

29 LEAST SQUARES METHOD R function: lm()
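A sketch of fitting a least-squares line with lm(), modelling sale price as a function of area. The data is invented, so the coefficients will not match the talk's:

```r
# Hypothetical sample data
dat <- data.frame(
  Area      = c(1200, 1800, 2400, 3100, 3600),
  SalePrice = c(95000, 130000, 165000, 210000, 240000)
)

# Fit SalePrice = intercept + slope * Area by ordinary least squares
fit <- lm(SalePrice ~ Area, data = dat)

coef(fit)      # intercept and slope
summary(fit)   # standard errors, R-squared, and related diagnostics
```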

30 MODEL - CORRELATION cor() Is Area correlated with Sale Price? The output value is between -1 and 1.
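cor() computes the Pearson correlation coefficient by default. A quick sketch on invented data, including a negative case to show the lower end of the range:

```r
# Hypothetical sample data
dat <- data.frame(
  Area      = c(1200, 1800, 2400, 3100, 3600),
  SalePrice = c(95000, 130000, 165000, 210000, 240000)
)

# Close to 1: price rises almost linearly with area in this toy data
r <- cor(dat$Area, dat$SalePrice)

# Reversing one variable flips the sign of the correlation
r_neg <- cor(dat$Area, -dat$SalePrice)
```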

31 MODEL - PREDICTION
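The prediction code is missing from the transcript. A sketch using predict() on a fitted lm model; because the data here is invented, the result will differ from the answer given in the talk:

```r
# Hypothetical sample data
dat <- data.frame(
  Area      = c(1200, 1800, 2400, 3100, 3600),
  SalePrice = c(95000, 130000, 165000, 210000, 240000)
)

fit <- lm(SalePrice ~ Area, data = dat)

# newdata must use the same column name as the predictor in the model
new_house <- data.frame(Area = 3000)
predicted_price <- predict(fit, newdata = new_house)
predicted_price
```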

32 DATA VISUALIZATION – GGPLOT2 lm()
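One way to overlay the fitted regression line on the scatter plot is geom_smooth() with method = "lm". A sketch on invented data, assuming ggplot2 is installed:

```r
library(ggplot2)

# Hypothetical sample data
dat <- data.frame(
  Area      = c(1200, 1800, 2400, 3100, 3600),
  SalePrice = c(95000, 130000, 165000, 210000, 240000)
)

p <- ggplot(dat, aes(x = Area, y = SalePrice)) +
  geom_point() +
  # Least-squares line; se = FALSE hides the confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

print(p)
```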

33 HOME SALE Question: I have a 3,000 sq ft house. How much will it sell for? Answer: $198,000

34 THANK YOU

