Common Linear & Classification for Machine Learning using Microsoft R


1 Common Linear & Classification for Machine Learning using Microsoft R
Venus Lin (Xiuqing Lin) @azssugsqlpass Assistant: Fred Benardella

2 Content
Opening talks
Machine learning
RevoScaleR Package: linear and classification
iris Example: linear model, prediction
Visualization: what can we do in SSRS
Decision Tree and Clustering quick demo
Q & A

3 Yearn for the sea

4 Machine Learning
What do you do with unlimited data?
What is the right question to ask about this data?
Accuracy of prediction
Human-understandable formats
Structured and unstructured data (text, image, & media)
The Netflix Prize awarded $1M in 2009 to the algorithm that beat Cinematch by 10%

5 RevoScaleR Package for Microsoft R
Scalable, distributed, and parallel computation. Available with Microsoft R Server and in-database R Services.
Function names carry the prefix rx:
Name        Description
rxLinMod    Linear regression
rxDTree     Decision tree
rxKmeans    K-means clustering

6 More about RevoScaleR

7 Linear Regression
Name in RevoScaleR Package    Name in Base R    Description
rxLinMod                      lm                Linear regression
rxDTree                       rpart             Decision tree
rxKmeans                      kmeans            Cluster

8 Work Flow
Develop in R IDE – RStudio
Create a stored procedure in SQL Server 2016
Visualize in SSRS

9 Iris – Data Frame Study

10 A Normal Analytics Project Process
Problem definition: understand iris species by petal and sepal
Data exploration: scatterplot & summary statistics
Data preparation: N/A
Modelling: linear regression
Validation of data
Prediction
Implementation and tracking
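The data exploration step can be sketched with base R alone. This is a minimal illustration rather than the presenter's own code; the variable name species_means is an assumption introduced here.

```r
# Built-in iris data: 150 rows, four numeric measurements plus Species
data(iris)

# Structure and summary statistics for every column
str(iris)
summary(iris)

# Mean of each measurement per species (species_means is a hypothetical name)
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(species_means)
```

The per-species means give a first hint of which measurements separate the species before any model is fitted.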

11 Iris – Data Frame Study

12 plot(iris)

13 iris[,3] >>> Petal.Length iris[,4] >>> Petal.Width
plot(iris[,4],iris[,3],type="p",pch=16,cex=1,xlab="Petal.Width",ylab="Petal.Length",main="Iris",col=iris$Species)

14 Linear Regression - Some basic statistical elements
Dependent variable: iris[,3] >>> Petal.Length
Independent variable (predictor): iris[,4] >>> Petal.Width
Formula: Petal.Length = a + b*Petal.Width
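For readers without RevoScaleR, the coefficients a and b in this formula can be estimated with lm, which the deck lists as the base R counterpart of rxLinMod. A minimal sketch; the names fit, a, and b are illustrative only.

```r
# Fit Petal.Length = a + b * Petal.Width by ordinary least squares
fit <- lm(Petal.Length ~ Petal.Width, data = iris)

# Extract the intercept (a) and slope (b) by name
a <- coef(fit)[["(Intercept)"]]
b <- coef(fit)[["Petal.Width"]]

# Print the fitted equation
cat(sprintf("Petal.Length = %.3f + %.3f * Petal.Width\n", a, b))
```

For a single predictor, the slope equals cov(x, y) / var(x), which is a quick way to sanity-check the fit by hand.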

15 How well does the model fit the data?
PPModel1<-rxLinMod(Petal.Length~Petal.Width,data=iris,covCoef = TRUE)
PPModel1
summary(PPModel1)
names(PPModel1)
R squared: proportion of the variance explained by the model
Significance of the model (p-value)
AIC / VIF / ANOVA / residual plot
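The fit diagnostics listed on this slide can be read off a base R lm fit as follows. This is a sketch using lm rather than rxLinMod, so the accessor names differ from the RevoScaleR object; r_squared, slope_p, and aic are names chosen for illustration. VIF is omitted because it is only meaningful with more than one predictor.

```r
# Same model as on the slide, fitted with base R
fit <- lm(Petal.Length ~ Petal.Width, data = iris)
s <- summary(fit)

# R squared: proportion of the variance in Petal.Length explained by the model
r_squared <- s$r.squared

# p-value for the slope (significance of the predictor)
slope_p <- coef(s)["Petal.Width", "Pr(>|t|)"]

# Akaike information criterion, useful when comparing candidate models
aic <- AIC(fit)

# ANOVA table for the fit
print(anova(fit))

# Residual plot: residuals should scatter around zero without a clear pattern
plot(fitted(fit), resid(fit), xlab = "Fitted Petal.Length", ylab = "Residual")
abline(h = 0, lty = 2)

cat("R squared:", round(r_squared, 4), " slope p-value:", signif(slope_p, 3), " AIC:", round(aic, 2), "\n")
```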

16 Linear Regression - rxLinMod

17 Petal.Length = a + b*Petal.Width
Petal.Length = *Petal.Width
plot(iris[,4],iris[,3],type="p",pch=16,cex=1,xlab="Petal.Width",ylab="Petal.Length",main="Iris",col=iris$Species)
abline(PPModel1,lwd=5,col="red")

18 Prediction
> new1<-data.frame(iris[,3:4])
> ppModel1_Predict<-rxPredict(PPModel1,data=new1,outData = new1,computeResiduals = TRUE,interval = "prediction",writeModelVars = TRUE,computeStdErrors = TRUE)
> names(ppModel1_Predict)
[1] "Petal.Length"        "Petal.Width"         "Petal.Length_Pred"
[4] "Petal.Length_StdErr" "Petal.Length_Lower"  "Petal.Length_Upper"
[7] "Petal.Length_Resid"
> plot(iris[,4],iris[,3],type="p",pch=16,cex=1,xlab="Petal.Width",ylab="Petal.Length",main="Iris",col=iris$Species)
> abline(PPModel1,lwd=5,col="red")
> points(ppModel1_Predict[,2],ppModel1_Predict[,5],col="green",type="l",lwd=4)
> points(ppModel1_Predict[,2],ppModel1_Predict[,6],col="green",type="l",lwd=4)
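A comparable prediction interval can be obtained in base R with predict on an lm fit. This sketch assumes three made-up petal widths (0.5, 1.5, 2.5) purely for illustration; the output columns are named fit, lwr, and upr rather than the _Pred, _Lower, and _Upper names that rxPredict produces.

```r
# Fit the same model with base R
fit <- lm(Petal.Length ~ Petal.Width, data = iris)

# Hypothetical new observations to predict for
new_widths <- data.frame(Petal.Width = c(0.5, 1.5, 2.5))

# 95% prediction interval: the range expected to contain a single new observation
pred <- predict(fit, newdata = new_widths, interval = "prediction", level = 0.95)

# Combine inputs and predictions into one data frame
result <- cbind(new_widths, pred)
print(result)
```

Using interval = "confidence" instead would give the narrower band for the mean response rather than for an individual flower.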

19 Create a stored procedure in SQL Server Management Studio

20 SSRS Report

21
Name in RevoScaleR Package    Name in Base R    Description
rxLinMod                      lm                Linear regression
rxDTree                       rpart             Decision tree
rxKmeans                      kmeans            Cluster

22 Common Classification Model
rxDTree {from RevoScaleR package}
Fit classification and regression trees
Demo
library(RevoScaleR)
library(RevoTreeView)
IrisTreeexample<-rxDTree(Species~Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,data = iris)
plot(createTreeView(IrisTreeexample))
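The deck names rpart as the open-source counterpart of rxDTree. The sketch below fits the same formula with rpart and checks how well the tree reproduces the species labels on the training data. The names tree, confusion, and accuracy are illustrative, and training accuracy is an optimistic estimate since no hold-out set is used.

```r
library(rpart)

# Classification tree on all four measurements
tree <- rpart(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
              data = iris, method = "class")

# Predicted class for each training row
pred <- predict(tree, newdata = iris, type = "class")

# Confusion matrix: rows are true species, columns are predictions
confusion <- table(actual = iris$Species, predicted = pred)
print(confusion)

# Share of rows classified correctly (training data only)
accuracy <- sum(diag(confusion)) / sum(confusion)
cat("Training accuracy:", round(accuracy, 3), "\n")
```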

23
Name in RevoScaleR Package    Name in Base R    Description
rxLinMod                      lm                Linear regression
rxDTree                       rpart             Decision tree
rxKmeans                      kmeans            Cluster

24 K means clustering
iris_rxKmeans_3<-rxKmeans(formula = ~Petal.Length+Petal.Width,data=iris,numClusters = 3,outFile = "iriscluster.xdf",outColName = "Cluster")
iris_rxKmeans_3$cluster<-as.factor(iris_rxKmeans_3$cluster)
library(ggplot2)
ggplot(iris, aes(Petal.Width, Petal.Length, color = iris_rxKmeans_3$cluster)) + geom_point(size=5)+ggtitle("after K Means 3 cluster")
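The same clustering can be approximated with base R's kmeans, which the deck pairs with rxKmeans. This is a sketch, not the presenter's code: the seed value and nstart = 25 are assumptions added here so the result is reproducible and less sensitive to the random starting centers.

```r
# K-means picks random starting centers, so fix the seed for reproducibility
set.seed(42)

# Cluster on the two petal measurements, as on the slide
km <- kmeans(iris[, c("Petal.Length", "Petal.Width")], centers = 3, nstart = 25)

# Cluster labels as a factor, mirroring the slide's as.factor() step
cluster <- as.factor(km$cluster)

# Cross-tabulate clusters against the known species
print(table(species = iris$Species, cluster = cluster))

# Cluster centers in the original units
print(km$centers)
```

Cluster numbers are arbitrary labels, so the cross-tabulation is read by looking for one dominant cluster per species rather than matching numbers.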

25

26 plot(iris) – Sepal.Length / Sepal.Width

27
iris_rxKmeans<-rxKmeans(formula = ~Sepal.Length+Sepal.Width,data=iris,numClusters = 3,outFile = "iriscluster1.xdf",outColName = "Cluster")
iris_rxKmeans$cluster<-as.factor(iris_rxKmeans$cluster)
library(ggplot2)
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = iris_rxKmeans$cluster)) + geom_point(size=5)+ggtitle("after K Means 3 cluster")

28

29 numClusters = 5
Github: tomaztk/Compare_kmeans_rxKmeans
Is it possible to use RevoScaleR package in Power BI? -From
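Regarding the choice of numClusters on this slide: one common heuristic, not taken from the deck, is to compare the total within-cluster sum of squares across several values of k and look for the point where the improvement levels off. The sketch below uses base R's kmeans on the sepal measurements; the range 1 to 6, the seed, and nstart = 25 are assumptions made for illustration.

```r
# Fix the seed because k-means uses random starting centers
set.seed(42)

x <- iris[, c("Sepal.Length", "Sepal.Width")]
ks <- 1:6

# Total within-cluster sum of squares for each candidate number of clusters
wss <- sapply(ks, function(k) kmeans(x, centers = k, nstart = 25)$tot.withinss)
names(wss) <- ks
print(round(wss, 2))

# Elbow plot: look for the k after which the curve flattens
plot(ks, wss, type = "b", xlab = "Number of clusters", ylab = "Total within-cluster SS")
```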

30 Thank you Sponsors!

31 Thank you!

