Labs: Trees, Dimension Reduction, Multi-dimensional Scaling, SVM
Peter Fox
Data Analytics ITWS-4600/ITWS-6600/MATP-4450/CSCI-4960
Group 3 Lab 1, October 26, 2018
Weighted kNN
group3/lab1_kknn1.R
Make sure you look carefully at the results, and apply the method to other datasets (a sketch follows below). We will discuss and interpret the results in class.
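Not the lab script itself, but a minimal weighted-kNN sketch with the kknn package on the built-in iris data; the 70/30 split, k = 7, and the triangular kernel are illustrative choices:

# Weighted kNN: neighbours vote with distance-based weights
library(kknn)
data(iris)
set.seed(1)
idx <- sample(nrow(iris), round(0.7 * nrow(iris)))
train <- iris[idx, ]
test <- iris[-idx, ]
# kernel = "triangular" down-weights more distant neighbours
fit <- kknn(Species ~ ., train, test, k = 7, kernel = "triangular")
# confusion matrix of predicted vs. true species
table(predicted = fitted(fit), true = test$Species)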
Rpart – recursive partitioning and Conditional Inference
group3/lab1_rpart1.R
group3/lab1_rpart2.R
group3/lab1_rpart3.R
group3/lab1_rpart4.R
Try rpart for “Rings” on the Abalone dataset (sketched after this list).
group3/lab1_ctree1.R
group3/lab1_ctree2.R
group3/lab1_ctree3.R
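One way to try the Abalone exercise, assuming the UCI copy of the data; the URL and column names below come from the UCI dataset description:

library(rpart)
# the UCI file has no header row, so supply the column names
url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data"
abalone <- read.csv(url, header = FALSE,
  col.names = c("Sex", "Length", "Diameter", "Height", "WholeWeight",
                "ShuckedWeight", "VisceraWeight", "ShellWeight", "Rings"))
# regression tree: Rings (a proxy for age) on all other measurements
fit <- rpart(Rings ~ ., data = abalone, method = "anova")
printcp(fit)    # complexity-parameter table
plot(fit, uniform = TRUE)
text(fit, use.n = TRUE, cex = 0.8)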
randomForest
group3/lab1_randomforest1.R
Do your own random forest, i.e. try different implementations, e.g. cforest {party}, on the other datasets (a comparison is sketched below).
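A minimal sketch comparing the two implementations on the built-in iris data; ntree = 500 is just an illustrative setting:

library(randomForest)
library(party)
set.seed(1)
# Breiman-Cutler random forest
rf <- randomForest(Species ~ ., data = iris, ntree = 500, importance = TRUE)
print(rf)         # OOB error estimate and confusion matrix
importance(rf)    # variable importance measures
# conditional-inference forest from party
cf <- cforest(Species ~ ., data = iris, controls = cforest_unbiased(ntree = 500))
table(predicted = predict(cf), true = iris$Species)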
Trees for the Titanic
data(Titanic)
rpart, ctree, hclust, randomForest for: Survived ~ .
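Note that data(Titanic) is a four-way contingency table, not case-level data, so expand it to one row per passenger first. A sketch for rpart; ctree and randomForest take the same formula:

library(rpart)
data(Titanic)
tab <- as.data.frame(Titanic)   # columns: Class, Sex, Age, Survived, Freq
# replicate each cell Freq times to get one row per passenger
passengers <- tab[rep(seq_len(nrow(tab)), tab$Freq), 1:4]
fit <- rpart(Survived ~ ., data = passengers, method = "class")
plot(fit, uniform = TRUE)
text(fit, use.n = TRUE)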
Run through these demos
library(EDR) # effective dimension reduction
library(dr)
library(clustrd)
install.packages("edrGraphicalTools")
library(edrGraphicalTools)
demo(edr_ex1)
demo(edr_ex2)
demo(edr_ex3)
demo(edr_ex4)
Some examples – group3/
lab1_dr1.R
lab1_dr2.R
lab1_dr3.R
lab1_dr4.R
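For orientation, a small sliced-inverse-regression example with the dr package and its bundled ais (Australian athletes) data; the predictors chosen here are illustrative:

library(dr)
data(ais)
# estimate the effective dimension-reduction directions by SIR
m <- dr(LBM ~ Ht + Wt + RCC + WCC, data = ais, method = "sir")
summary(m)   # tests for the number of significant directions
plot(m)      # scatterplots of the estimated directions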
MDS – group3/
lab1_mds1.R
lab1_mds2.R
lab1_mds3.R
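The base-R version, for reference: classical (metric) MDS with cmdscale on the built-in eurodist road distances:

loc <- cmdscale(eurodist, k = 2)   # embed the cities in 2 dimensions
# flip the y-axis so north is up, as in the ?cmdscale example
plot(loc[, 1], -loc[, 2], type = "n", xlab = "", ylab = "", main = "cmdscale(eurodist)")
text(loc[, 1], -loc[, 2], rownames(loc), cex = 0.7)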
R – many ways (of course)
library(igraph)
# dist.au: a symmetric distance matrix between cities, with city.names as
# the labels; the lab loads these earlier. As a self-contained stand-in you
# could use, e.g.: dist.au <- as.matrix(eurodist); city.names <- labels(eurodist)
g <- graph.full(nrow(dist.au))
V(g)$label <- city.names
layout <- layout.mds(g, dist = as.matrix(dist.au))
plot(g, layout = layout, vertex.size = 3)
Work through these… lecture next week
lab1_svm1.R through lab1_svm11.R
lab1_svm_rpart1.R
Exercise the various parts of SVM: parameters, kernels, etc. (e.g. via the grid search sketched below); see Karatzoglou et al.
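One way to exercise the parameters is a small grid search with e1071's tune.svm; the dataset and grid values here are illustrative:

library(e1071)
set.seed(1)
# 10-fold cross-validation over a (gamma, cost) grid for the RBF kernel
tuned <- tune.svm(Species ~ ., data = iris,
                  gamma = 2^(-2:2), cost = 2^(0:4), kernel = "radial")
summary(tuned)           # error for each parameter combination
tuned$best.parameters    # best (gamma, cost) found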
Ozone
library(e1071)
library(rpart)
data(Ozone, package = "mlbench")
# see ?mlbench::Ozone for the field codes
## split data into a train and test set
index <- 1:nrow(Ozone)
testindex <- sample(index, trunc(length(index)/3))
testset <- na.omit(Ozone[testindex, -3])
trainset <- na.omit(Ozone[-testindex, -3])
# V4 (daily maximum ozone) is continuous, so fit a regression SVM;
# gamma = 1e-4 is an illustrative value (the original omits it)
svm.model <- svm(V4 ~ ., data = trainset, cost = 1000, gamma = 1e-4)
svm.pred <- predict(svm.model, testset[, -3])
# approximate mean squared error on the test set
crossprod(svm.pred - testset[, 3]) / length(testindex)
See the svm() example in the e1071 package vignette.
Glass
library(e1071)
library(rpart)
data(Glass, package = "mlbench")
## split data into a train and test set
index <- 1:nrow(Glass)
testindex <- sample(index, trunc(length(index)/3))
testset <- Glass[testindex, ]
trainset <- Glass[-testindex, ]
# cost = the "C" constant: the penalty for constraint violations
# (it bounds the Lagrange multipliers); gamma = RBF kernel width
svm.model <- svm(Type ~ ., data = trainset, cost = 100, gamma = 1)
svm.pred <- predict(svm.model, testset[, -10])
table(pred = svm.pred, true = testset[, 10])
(output: confusion matrix of predicted vs. true glass Type)
Example lab1_svm1.R
n <- 150            # number of data points
p <- 2              # dimension
sigma <- 1          # standard deviation of each Gaussian
meanpos <- 0        # centre of the distribution of positive examples
meanneg <- 3        # centre of the distribution of negative examples
npos <- round(n/2)  # number of positive examples
nneg <- n - npos    # number of negative examples
# Generate the positive and negative examples
xpos <- matrix(rnorm(npos*p, mean = meanpos, sd = sigma), npos, p)
xneg <- matrix(rnorm(nneg*p, mean = meanneg, sd = sigma), nneg, p)
x <- rbind(xpos, xneg)
# Generate the labels
y <- matrix(c(rep(1, npos), rep(-1, nneg)))
# Visualize the data
plot(x, col = ifelse(y > 0, 1, 2))
legend("topleft", c('Positive', 'Negative'), col = seq(2), pch = 1, text.col = seq(2))
Example 1 (plot of the simulated positive and negative examples)
Train/test
ntrain <- round(n*0.8)       # number of training examples
tindex <- sample(n, ntrain)  # indices of training samples
xtrain <- x[tindex, ]
xtest <- x[-tindex, ]
ytrain <- y[tindex]
ytest <- y[-tindex]
istrain <- rep(0, n)
istrain[tindex] <- 1
# Visualize: circles = train, triangles = test; black = positive, red = negative
plot(x, col = ifelse(y > 0, 1, 2), pch = ifelse(istrain == 1, 1, 2))
legend("topleft", c('Positive Train', 'Positive Test', 'Negative Train', 'Negative Test'),
       col = c(1, 1, 2, 2), pch = c(1, 2, 1, 2), text.col = c(1, 1, 2, 2))
Comparison of the classifier on the test set
Example ctd
svp <- ksvm(xtrain, ytrain, type = "C-svc", kernel = 'vanilladot', C = 100, scaled = c())
# General summary
svp
# Attributes that you can access
attributes(svp)   # did you look?
# For example, the support vectors
alpha(svp)
alphaindex(svp)
b(svp)   # remember b?
# Use the built-in function to pretty-plot the classifier
plot(svp, data = xtrain)
(console output of alpha(svp), alphaindex(svp) and b(svp) omitted)
Do SVM for iris
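A minimal sketch, using kernlab as on the preceding slides; the split, kernel, and C are illustrative choices:

library(kernlab)
set.seed(1)
idx <- sample(nrow(iris), round(0.7 * nrow(iris)))
iristrain <- iris[idx, ]
iristest <- iris[-idx, ]
# C-classification with an RBF kernel and 4-fold cross-validation
svp <- ksvm(Species ~ ., data = iristrain, type = "C-svc",
            kernel = "rbfdot", C = 10, cross = 4)
svp   # training and cross-validation error
table(predicted = predict(svp, iristest), true = iristest$Species)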
SVM for Swiss
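swiss has no categorical response, so one plausible reading of this exercise is support-vector regression of Fertility on the other indicators (a sketch; type, kernel, and C are illustrative):

library(kernlab)
data(swiss)
set.seed(1)
idx <- sample(nrow(swiss), round(0.7 * nrow(swiss)))
# eps-regression on the socio-economic indicators
svr <- ksvm(Fertility ~ ., data = swiss[idx, ], type = "eps-svr",
            kernel = "rbfdot", C = 10)
pred <- predict(svr, swiss[-idx, ])
mean((pred - swiss[-idx, "Fertility"])^2)   # test mean squared error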
e.g. Probabilities…
library(kernlab)
data(promotergene)
## create test and training set
ind <- sample(1:dim(promotergene)[1], 20)
genetrain <- promotergene[-ind, ]
genetest <- promotergene[ind, ]
## train a support vector machine
gene <- ksvm(Class ~ ., data = genetrain, kernel = "rbfdot",
             kpar = list(sigma = 0.015), C = 70, cross = 4, prob.model = TRUE)
## predict gene type probabilities on the test set
genetype <- predict(gene, genetest, type = "probabilities")
Result
genetype
(a 20 x 2 matrix of class-membership probabilities, one column per class "+" and "-", one row per test sequence)
R-SVM
http://www.stanford.edu/group/wonglab/RSVMpage/r-svm.tar.gz
Read/skim the paper. Explore this method on a dataset of your choice, e.g. one of the R built-in datasets.
kernlab
http://aquarius.tw.rpi.edu/html/DA/svmbasic_notes.pdf
Some scripts: lab1_svm12.R, lab1_svm13.R
kernlab, svmpath and klaR
Start at page 9 (bottom) of the Karatzoglou et al. paper.