Peter Fox Data Analytics – ITWS-4600/ITWS-6600 Week 9b, April 1, 2016


Lab: MDS, DR and SVM

On your own…
library(EDR) # effective dimension reduction
library(dr)
library(clustrd)
install.packages("edrGraphicalTools")
library(edrGraphicalTools)
demo(edr_ex1)
demo(edr_ex2)
demo(edr_ex3)
demo(edr_ex4)
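The dr package above does sliced inverse regression; a minimal sketch, assuming the ais (Australian Institute of Sport) example data that ships with dr:
library(dr)
data(ais)
fit <- dr(LBM ~ log(SSF) + log(Wt) + log(Hg) + log(Ht), data = ais, method = "sir") # sliced inverse regression
summary(fit) # estimated dimension-reduction directions and tests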

Some examples
Lab8b_dr1_2016.R
Lab8b_dr2_2016.R
Lab8b_dr3_2016.R

MDS
Lab8b_mds1_2016.R
Lab8b_mds2_2016.R
Lab8b_mds3_2016.R
http://www.statmethods.net/advstats/mds.html
http://gastonsanchez.com/blog/how-to/2013/01/23/MDS-in-R.html
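Beyond the lab scripts, here is a minimal self-contained sketch of classical (metric) MDS in base R, using the built-in eurodist road distances between European cities:
fit <- cmdscale(eurodist, k = 2) # project the distance matrix into 2 dimensions
plot(fit[,1], -fit[,2], type = "n", xlab = "", ylab = "") # flip axis 2 so north points up
text(fit[,1], -fit[,2], labels = labels(eurodist), cex = 0.7) # label each city at its MDS coordinates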

Work through these…
Lab9b_svm1_2015.R -> Lab9b_svm11_2015.R
Lab9b_svm_rpart1_2016.R
Exercise various parts of SVM: parameters, kernels, etc. (a tuning sketch follows below)
Karatzoglou et al. 2006 - http://aquarius.tw.rpi.edu/html/DA/v15i09.pdf
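One way to exercise the parameters is e1071's built-in grid search; a hedged sketch (the gamma/cost ranges here are illustrative choices, not taken from the lab scripts):
library(e1071)
data(Glass, package="mlbench")
tuned <- tune.svm(Type ~ ., data = Glass, gamma = 10^(-3:1), cost = 10^(0:3)) # 10-fold CV over the grid
summary(tuned) # cross-validated error for each gamma/cost pair
tuned$best.parameters # the pair with the lowest CV error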

Ozone
> library(e1071)
> library(rpart)
> data(Ozone, package="mlbench")
> # http://math.furman.edu/~dcs/courses/math47/R/library/mlbench/html/Ozone.html # for field codes
> ## split data into a train and test set
> index <- 1:nrow(Ozone)
> testindex <- sample(index, trunc(length(index)/3))
> testset <- na.omit(Ozone[testindex,-3])
> trainset <- na.omit(Ozone[-testindex,-3])
> svm.model <- svm(V4 ~ ., data = trainset, type="C-classification", cost = 1000, gamma = 0.0001)
> svm.pred <- predict(svm.model, testset[,-3])
> ## C-classification returns a factor, so convert back to numeric ozone levels
> ## before computing the mean squared error (the e1071 vignette fits the same
> ## data as regression, i.e. without the type argument)
> crossprod(as.numeric(as.character(svm.pred)) - testset[,3]) / length(testindex)
See: http://cran.r-project.org/web/packages/e1071/vignettes/svmdoc.pdf

Glass
library(e1071)
library(rpart)
data(Glass, package="mlbench")
## split data into a train and test set
index <- 1:nrow(Glass)
testindex <- sample(index, trunc(length(index)/3))
testset <- Glass[testindex,]
trainset <- Glass[-testindex,]
## Type is a factor, so svm() defaults to C-classification
svm.model <- svm(Type ~ ., data = trainset, cost = 100, gamma = 1)
svm.pred <- predict(svm.model, testset[,-10]) # column 10 is Type

> table(pred = svm.pred, true = testset[,10])
      true
pred    1  2  3  5  6  7
    1  12  9  1  0  0  0
    2   6 19  6  5  2  2
    3   1  0  2  0  0  0
    5   0  0  0  0  0  0
    6   0  0  0  0  1  0
    7   0  1  0  0  0  4
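For comparison, the e1071 vignette cited above fits an rpart tree on the same split; a sketch reusing trainset/testset from the Glass slide:
library(rpart)
rpart.model <- rpart(Type ~ ., data = trainset) # decision tree on the same training data
rpart.pred <- predict(rpart.model, testset[,-10], type = "class")
table(pred = rpart.pred, true = testset[,10]) # confusion matrix, comparable to the SVM table above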

Example Lab9b_svm1_2016.R
n <- 150 # number of data points
p <- 2 # dimension
sigma <- 1 # variance of the distribution
meanpos <- 0 # centre of the distribution of positive examples
meanneg <- 3 # centre of the distribution of negative examples
npos <- round(n/2) # number of positive examples
nneg <- n-npos # number of negative examples
# Generate the positive and negative examples
xpos <- matrix(rnorm(npos*p,mean=meanpos,sd=sigma),npos,p)
xneg <- matrix(rnorm(nneg*p,mean=meanneg,sd=sigma),nneg,p) # nneg rows (the original used npos, which only works because npos == nneg here)
x <- rbind(xpos,xneg)
# Generate the labels
y <- matrix(c(rep(1,npos),rep(-1,nneg)))
# Visualize the data
plot(x,col=ifelse(y>0,1,2))
legend("topleft",c('Positive','Negative'),col=seq(2),pch=1,text.col=seq(2))

Example 1 (figure: the scatter plot produced by the previous slide's code)

Train/test
ntrain <- round(n*0.8) # number of training examples
tindex <- sample(n,ntrain) # indices of training samples
xtrain <- x[tindex,]
xtest <- x[-tindex,]
ytrain <- y[tindex]
ytest <- y[-tindex]
istrain <- rep(0,n)
istrain[tindex] <- 1
# Visualize
plot(x,col=ifelse(y>0,1,2),pch=ifelse(istrain==1,1,2))
legend("topleft",c('Positive Train','Positive Test','Negative Train','Negative Test'),col=c(1,1,2,2),pch=c(1,2,1,2),text.col=c(1,1,2,2))

Comparison of the classifier on the test set (figure slide)

Example ctd
library(kernlab) # ksvm lives in kernlab
svp <- ksvm(xtrain,ytrain,type="C-svc", kernel='vanilladot', C=100, scaled=c())
# General summary
svp
# Attributes that you can access
attributes(svp) # did you look?
# For example, the support vectors
alpha(svp)
alphaindex(svp)
b(svp) # remember b?
# Use the built-in function to pretty-plot the classifier
plot(svp,data=xtrain)
> # For example, the support vectors
> alpha(svp)
[[1]]
[1] 71.05875 28.94125 100.00000
> alphaindex(svp)
[1] 10 74 93
> b(svp)
[1] -17.3651
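A possible next step, reusing xtest/ytest from the train/test slide, is to score the held-out points:
ypred <- predict(svp, xtest) # predicted labels for the test points
table(pred = ypred, true = ytest) # confusion matrix
mean(ypred == ytest) # test-set accuracy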

Do SVM for iris
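A minimal sketch for this exercise (the one-third test split and the default radial kernel are illustrative choices):
library(e1071)
data(iris)
index <- 1:nrow(iris)
testindex <- sample(index, trunc(length(index)/3))
testset <- iris[testindex,]
trainset <- iris[-testindex,]
iris.svm <- svm(Species ~ ., data = trainset) # factor response, so C-classification by default
iris.pred <- predict(iris.svm, testset[,-5]) # column 5 is Species
table(pred = iris.pred, true = testset[,5]) # confusion matrix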

SVM for Swiss
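swiss has no factor column, so a natural reading of this exercise is regression; a hedged sketch with Fertility as the response (my assumption, the slide does not fix one):
library(e1071)
data(swiss)
index <- 1:nrow(swiss)
testindex <- sample(index, trunc(length(index)/3))
testset <- swiss[testindex,]
trainset <- swiss[-testindex,]
swiss.svm <- svm(Fertility ~ ., data = trainset) # numeric response, so eps-regression by default
swiss.pred <- predict(swiss.svm, testset[,-1]) # column 1 is Fertility
crossprod(swiss.pred - testset[,1]) / length(testindex) # mean squared error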

e.g. Probabilities…
library(kernlab)
data(promotergene)
## create test and training set
ind <- sample(1:dim(promotergene)[1],20)
genetrain <- promotergene[-ind, ]
genetest <- promotergene[ind, ]
## train a support vector machine (R has no backslash line continuation;
## the call simply spans two lines)
gene <- ksvm(Class~.,data=genetrain,kernel="rbfdot",
             kpar=list(sigma=0.015),C=70,cross=4,prob.model=TRUE)
## predict gene type probabilities on the test set
genetype <- predict(gene,genetest,type="probabilities")

Result
> genetype
                 +           -
 [1,] 0.205576217 0.794423783
 [2,] 0.150094660 0.849905340
 [3,] 0.262062226 0.737937774
 [4,] 0.939660586 0.060339414
 [5,] 0.003164823 0.996835177
 [6,] 0.502406898 0.497593102
 [7,] 0.812503448 0.187496552
 [8,] 0.996382257 0.003617743
 [9,] 0.265187582 0.734812418
[10,] 0.998832291 0.001167709
[11,] 0.576491204 0.423508796
[12,] 0.973798660 0.026201340
[13,] 0.098598411 0.901401589
[14,] 0.900670101 0.099329899
[15,] 0.012571774 0.987428226
[16,] 0.977704079 0.022295921
[17,] 0.137304637 0.862695363
[18,] 0.972861575 0.027138425
[19,] 0.224470227 0.775529773
[20,] 0.004691973 0.995308027
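A possible follow-up, reusing genetype and genetest from the previous slides, is to turn the probabilities into hard labels and check them against the truth:
predlab <- colnames(genetype)[apply(genetype, 1, which.max)] # pick the more probable class per row
table(pred = predlab, true = genetest$Class)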

R-SVM
http://www.stanford.edu/group/wonglab/RSVMpage/r-svm.tar.gz
http://www.stanford.edu/group/wonglab/RSVMpage/R-SVM.html
Read/skim the paper
Explore this method on a dataset of your choice, e.g. one of the R built-in datasets

kernlab
http://aquarius.tw.rpi.edu/html/DA/svmbasic_notes.pdf
Some scripts: Lab9b_svm12_2016.R, Lab9b_svm13_2016.R

kernlab, svmpath and klaR
http://aquarius.tw.rpi.edu/html/DA/v15i09.pdf
Start at page 9 (bottom)