Published by Pamela Little. Modified over 7 years ago.
1
Common Linear & Classification for Machine Learning using Microsoft R
Venus Lin (Xiuqing Lin) @azssugsqlpass Assistant: Fred Benardella
2
Content
Opening talks
Machine learning
RevoScaleR Package: linear and classification
iris Example
  Linear model
  Prediction
  Visualization: what can we do in SSRS
Decision Tree and Clustering quick demo
Q & A
3
Yearn for the sea
4
Machine Learning What do you do with unlimited data?
What is the right question to ask about this data?
Accuracy of prediction
Human-understandable formats
Structured and unstructured data (text, image, & media)
The Netflix recommendation system contest awarded $1M in 2009 to the algorithm that was 10% better than Cinematch
5
RevoScaleR Package for Microsoft R
Scalable, distributed and parallel computation.
Available with Microsoft R Server and in-database R Services.
Function names carry the prefix rx:
Name       Description
rxLinMod   Linear regression
rxDTree    Decision tree
rxKmeans   Clustering
6
More about RevoScaleR
7
Linear Regression
In RevoScaleR Package   In Basic R   Description
rxLinMod                lm           Linear regression
rxDTree                 rpart        Decision tree
rxKmeans                kmeans       Clustering
8
Workflow
Develop in an R IDE (RStudio)
Create a stored procedure in SQL Server 2016
Visualize in SSRS
9
Iris – Data Frame Study
10
A Normal Analytics Project Process
Problem definition: understand Iris species by petal and sepal
Data exploration: scatterplot & summary statistics
Data preparation: (N/A)
Modelling: linear regression
Validation of data
Prediction
Implementation and tracking
11
Iris – Data Frame Study
12
plot(iris)
13
iris[,3] >>> Petal.Length iris[,4] >>> Petal.Width
plot(iris[,4],iris[,3],type="p",pch=16,cex=1,xlab="Petal.Width",ylab="Petal.Length",main="Iris",col=iris$Species)
14
Linear Regression - Some basic statistical elements
Dependent variable: iris[,3] >>> Petal.Length
Independent variable (predictor): iris[,4] >>> Petal.Width
Formula: Petal.Length = a + b*Petal.Width
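The formula above can be tried without RevoScaleR installed. The following is a minimal sketch that assumes base R's lm() as a stand-in for rxLinMod (the deck's own mapping table pairs the two); the variable names fit, a and b are illustrative and not from the slides.

```r
# Assumption: base R lm() stands in for rxLinMod so the sketch runs
# on a plain R installation. iris ships with base R.
fit <- lm(Petal.Length ~ Petal.Width, data = iris)

# Extract the intercept a and slope b of Petal.Length = a + b * Petal.Width
a <- coef(fit)[["(Intercept)"]]
b <- coef(fit)[["Petal.Width"]]

# Print the fitted line using the estimated coefficients
cat(sprintf("Petal.Length = %.3f + %.3f * Petal.Width\n", a, b))
```

The slope b is the estimated change in Petal.Length for a one-unit increase in Petal.Width.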
15
- How well does the model fit the data?
PPModel1 <- rxLinMod(Petal.Length ~ Petal.Width, data = iris, covCoef = TRUE)
PPModel1
summary(PPModel1)
names(PPModel1)
R Squared: proportion of the variance in Petal.Length explained by the model
Significance of the model (P Value)
AIC / VIF / ANOVA / Residual Plot
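To make the fit measures concrete, here is a hedged sketch that computes R squared by hand and reads the slope's p-value and the AIC from a base R lm() fit. Using lm() instead of rxLinMod is an assumption made so the example runs anywhere; the variable names are illustrative.

```r
# Assumption: lm() is used in place of rxLinMod.
fit <- lm(Petal.Length ~ Petal.Width, data = iris)

# R squared = 1 - (residual sum of squares / total sum of squares)
rss <- sum(residuals(fit)^2)
tss <- sum((iris$Petal.Length - mean(iris$Petal.Length))^2)
r_squared <- 1 - rss / tss

# p-value of the slope, taken from the coefficient table of summary()
p_value <- summary(fit)$coefficients["Petal.Width", "Pr(>|t|)"]

# Akaike information criterion of the fitted model
aic_value <- AIC(fit)

print(c(r_squared = r_squared, p_value = p_value, aic = aic_value))
```

The hand-computed r_squared should agree with summary(fit)$r.squared, which is a quick check that the definition on the slide matches what R reports.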
16
Linear Regression - rxLinMod
17
Petal.Length = a + b*Petal.Width
plot(iris[,4],iris[,3],type="p",pch=16,cex=1,xlab="Petal.Width",ylab="Petal.Length",main="Iris",col=iris$Species)
abline(PPModel1,lwd=5,col="red")
18
Prediction
> new1<-data.frame(iris[,3:4])
> ppModel1_Predict<-rxPredict(PPModel1,data=new1,outData = new1,computeResiduals = T,interval = "prediction",writeModelVars = T,computeStdErrors = T)
> names(ppModel1_Predict)
[1] "Petal.Length"        "Petal.Width"         "Petal.Length_Pred"
[4] "Petal.Length_StdErr" "Petal.Length_Lower"  "Petal.Length_Upper"
[7] "Petal.Length_Resid"
> plot(iris[,4],iris[,3],type="p",pch=16,cex=1,xlab="Petal.Width",ylab="Petal.Length",main="Iris",col=iris$Species)
> abline(PPModel1,lwd=5,col="red")
> points(ppModel1_Predict[,2],ppModel1_Predict[,5],col="green",type="l",lwd=4)
> points(ppModel1_Predict[,2],ppModel1_Predict[,6],col="green",type="l",lwd=4)
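The same idea, predicted values with lower and upper prediction bounds, can be sketched with base R's predict(). This assumes lm() in place of rxLinMod and predict() in place of rxPredict; the three petal widths in new_widths are made-up inputs for illustration only.

```r
# Assumption: lm()/predict() stand in for rxLinMod/rxPredict.
fit <- lm(Petal.Length ~ Petal.Width, data = iris)

# Hypothetical new observations to score (illustrative values)
new_widths <- data.frame(Petal.Width = c(0.5, 1.5, 2.5))

# interval = "prediction" returns a matrix with columns fit, lwr and upr
pred <- predict(fit, newdata = new_widths,
                interval = "prediction", level = 0.95)

# Put the inputs and the predictions side by side
result <- cbind(new_widths, pred)
print(result)
```

Here the columns fit, lwr and upr play the role of Petal.Length_Pred, Petal.Length_Lower and Petal.Length_Upper in the rxPredict output above.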
19
Create a stored procedure in SQL Server Management Studio
20
SSRS Report
22
Common Classification Model
rxDTree {from RevoScaleR Package}
Fit classification and regression trees
Demo
library(RevoScaleR)
library(RevoTreeView)
IrisTreeexample <- rxDTree(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = iris)
plot(createTreeView(IrisTreeexample))
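For readers without RevoScaleR, a rough equivalent of the demo can be sketched with rpart, which the deck lists as the base R counterpart of rxDTree. The confusion table and accuracy calculation are additions for illustration and are not part of the original demo.

```r
# Assumption: rpart (a recommended package bundled with R) stands in
# for rxDTree.
library(rpart)

# Fit a classification tree on all four iris measurements
tree <- rpart(Species ~ Sepal.Length + Sepal.Width +
                Petal.Length + Petal.Width,
              data = iris, method = "class")

# Predict the class of each training row and compare with the truth
predicted <- predict(tree, iris, type = "class")
confusion <- table(actual = iris$Species, predicted = predicted)
print(confusion)

# Share of training rows classified correctly (optimistic, since the
# same data was used for fitting)
accuracy <- sum(diag(confusion)) / nrow(iris)
print(accuracy)
```

Calling plot(tree) followed by text(tree) draws a static version of the tree, as a plain substitute for the interactive createTreeView display.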
24
K means clustering
iris_rxKmeans_3 <- rxKmeans(formula = ~Petal.Length+Petal.Width, data = iris, numClusters = 3, outFile = "iriscluster.xdf", outColName = "Cluster")
iris_rxKmeans_3$cluster <- as.factor(iris_rxKmeans_3$cluster)
library(ggplot2)
ggplot(iris, aes(Petal.Width, Petal.Length, color = iris_rxKmeans_3$cluster)) + geom_point(size=5) + ggtitle("after K Means 3 cluster")
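The clustering step can be sketched with base R's kmeans(), which the deck lists as the counterpart of rxKmeans. The seed, the nstart value and the cross-tabulation against Species are assumptions added for illustration; k-means starts from random centers, so the cluster numbering can differ between runs.

```r
# Assumption: base R kmeans() stands in for rxKmeans.
set.seed(42)  # illustrative seed so the run is repeatable

# Cluster on the two petal measurements, as on the slide
km <- kmeans(iris[, c("Petal.Length", "Petal.Width")],
             centers = 3, nstart = 20)

# Cluster assignment for each row, stored as a factor for plotting
cluster <- as.factor(km$cluster)

# Cross-tabulate clusters against the known species to see how well
# the unsupervised grouping lines up with the labels
print(table(species = iris$Species, cluster = cluster))
```

Clustering never sees the Species column, so the table is only a sanity check on how the groups relate to the labels.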
26
plot(iris): Sepal.Length / Sepal.Width
27
iris_rxKmeans <- rxKmeans(formula = ~Sepal.Length+Sepal.Width, data = iris, numClusters = 3, outFile = "iriscluster1.xdf", outColName = "Cluster")
iris_rxKmeans$cluster <- as.factor(iris_rxKmeans$cluster)
library(ggplot2)
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = iris_rxKmeans$cluster)) + geom_point(size=5) + ggtitle("after K Means 3 cluster")
29
numClusters = 5
GitHub: tomaztk/Compare_kmeans_rxKmeans
Is it possible to use RevoScaleR package in Power BI? -From
30
Thank you Sponsors!
31
Thank you!