Multivariate statistics

- PCA (principal component analysis)
- Correspondence analysis
- Canonical correlation
- Discriminant function analysis
  - LDA (linear discriminant analysis)
  - QDA (quadratic discriminant analysis)
- Cluster analysis
- MANOVA

Xuhua Xia
lda in the MASS package

library(MASS)
fit <- lda(Species ~ ., data = iris, CV = TRUE)
ct <- table(iris$Species, fit$class)
prop.table(ct, 1)            # each row sums to 1
prop.table(ct, 2)            # each column sums to 1
diag(prop.table(ct, 1))      # proportion correct within each species
sum(diag(prop.table(ct)))    # overall proportion correct
fit <- lda(Species ~ ., data = iris)
pred <- predict(fit, iris)
nd <- data.frame(LD1 = pred$x[, 1], LD2 = pred$x[, 2], Sp = iris$Species)
library(ggplot2)
ggplot(data = nd) + geom_point(aes(x = LD1, y = LD2, color = Sp))
install.packages("klaR")
library(klaR)
partimat(Species ~ ., data = iris, method = "lda")

CV = TRUE performs leave-one-out cross-validation; the predicted group ID is in fit$class.
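Once the model is fitted without CV, predict() can also classify new observations. A minimal sketch; the flower measurements below are made-up values for illustration only:

```r
library(MASS)
fit <- lda(Species ~ ., data = iris)

# Hypothetical new flower (column names must match the training data)
new_flower <- data.frame(Sepal.Length = 5.9, Sepal.Width = 3.0,
                         Petal.Length = 4.2, Petal.Width = 1.5)
pred <- predict(fit, new_flower)
pred$class              # predicted species
round(pred$posterior, 3)  # posterior probability for each species
pred$x                  # scores on the discriminant axes LD1, LD2
```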
[Figure: scatterplot of the discriminant scores, LD1 (x axis) vs LD2 (y axis), with points colored by Sp: setosa, versicolor, virginica]
qda in the MASS package

library(MASS)
fit <- qda(Species ~ ., data = iris, CV = TRUE)
fit2 <- qda(Species ~ ., data = iris, CV = TRUE, method = "mle")
ct <- table(iris$Species, fit2$class)
diag(prop.table(ct, 1))      # proportion correct within each species
sum(diag(prop.table(ct)))    # overall proportion correct
library(klaR)
partimat(Species ~ ., data = iris, method = "qda")
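Since both lda and qda return leave-one-out predictions in fit$class when CV = TRUE, their error rates can be compared directly. A minimal sketch:

```r
library(MASS)
fit.l <- lda(Species ~ ., data = iris, CV = TRUE)
fit.q <- qda(Species ~ ., data = iris, CV = TRUE)

# Leave-one-out misclassification rate for each classifier
mean(fit.l$class != iris$Species)  # LDA error rate
mean(fit.q$class != iris$Species)  # QDA error rate
```

On iris the two rates are very close; QDA's extra flexibility (a separate covariance matrix per class) pays off mainly when group covariances differ substantially.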
Cross-validation

Holdout method with jackknifing:

- K-fold cross-validation: divide the data in each category into K sets. Each set is used once as the test set, with the remaining sets pooled as training data. Every data point appears in a test set exactly once and in a training set K-1 times.
- Leave-one-out cross-validation: K-fold cross-validation taken to its extreme, with K = N, the number of data points in the set.
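The K-fold procedure above can be sketched manually (here with K = 5; the seed and fold assignment are arbitrary choices for illustration):

```r
# Manual K-fold cross-validation of lda on iris
library(MASS)
set.seed(1)                  # arbitrary seed for reproducible folds
K <- 5
folds <- sample(rep(1:K, length.out = nrow(iris)))  # random fold labels

acc <- numeric(K)
for (k in 1:K) {
  test  <- iris[folds == k, ]   # fold k is the test set
  train <- iris[folds != k, ]   # remaining K-1 folds are pooled for training
  fit   <- lda(Species ~ ., data = train)
  pred  <- predict(fit, test)$class
  acc[k] <- mean(pred == test$Species)
}
mean(acc)   # average accuracy over the K folds
```

Setting K to nrow(iris) turns this loop into leave-one-out cross-validation, which is what CV = TRUE computes internally.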