Proximity matrices and scaling: Purpose of scaling; Similarities and dissimilarities; Classical Euclidean scaling; Non-Euclidean scaling; Horseshoe effect; Non-Metric Scaling; Example



Proximity matrices

There are many situations in which dissimilarities (or similarities) between individuals are measured. The purpose of multidimensional scaling is to find the relative configuration of the individuals. If the dissimilarities are ordinary Euclidean distances, then the purpose is to recover relative coordinates for the individuals. Scaling is also used for seriation, e.g. in archaeology and history; in that case only a one-dimensional configuration is sought. In general a dissimilarity matrix is called a proximity matrix: it records how close individuals are to each other. Euclidean distances are the simplest of all and are relatively easy to work with. Distances are metric if they are non-negative, symmetric and satisfy the triangle inequality for all triples (i, j, k):

d_ij <= d_ik + d_kj,   d_ij = d_ji >= 0,   d_ii = 0

The matrix with these distances as elements is denoted D. This distance matrix is said to be Euclidean if n points corresponding to it can be embedded in a Euclidean space so that the distances between the points equal the corresponding elements of the matrix (the definition can be modified to add weights). If the distances are Euclidean, there is an elegant solution to the problem of finding a configuration from the distance matrix.

Similarity and dissimilarity

There are situations when, instead of distances between objects, similarities are given. A similarity matrix has elements s_ij satisfying:

s_ij = s_ji,   s_ii >= s_ij

For example, a correlation matrix can be considered a similarity matrix. We can calculate dissimilarities (distances) from the similarities using:

d_ij = (s_ii - 2 s_ij + s_jj)^(1/2)

The resulting matrix obeys the conditions required of metric distances.
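The conversion d_ij = (s_ii - 2 s_ij + s_jj)^(1/2) is easy to sketch numerically. The snippet below is an illustrative Python/numpy version (the lecture's own snippets use R); the example matrix is an arbitrary correlation matrix chosen for demonstration:

```python
import numpy as np

def sim_to_dist(S):
    """Convert a similarity matrix S to dissimilarities via
    d_ij = sqrt(s_ii - 2*s_ij + s_jj)."""
    diag = np.diag(S)
    D2 = diag[:, None] - 2.0 * S + diag[None, :]
    # Clip tiny negative values caused by floating-point error
    return np.sqrt(np.clip(D2, 0.0, None))

# Example: a correlation matrix used as similarities
S = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.3],
              [0.1, 0.3, 1.0]])
D = sim_to_dist(S)
```

The result is symmetric with a zero diagonal, as a metric distance matrix must be.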

Metric scaling

Suppose we have an nxn matrix of squared pairwise distances that are Euclidean. Denote this matrix by D, with elements d_rs^2. We want to find n points in a k-dimensional Euclidean space so that the distances between these points equal the elements of D. Denote the nxk matrix of points by X and define Q = XX^T, with elements q_rs = x_r^T x_s. Then the squared distances and the elements of Q are related by:

d_rs^2 = (x_r - x_s)^T (x_r - x_s) = q_rr + q_ss - 2 q_rs

We can fix the positions of the points by assuming that the centroid of the rows of X is 0, i.e. sum_r x_r = 0, so that every row and column of Q sums to zero. Summing the relation above over r, and over both r and s, and using this fact, we can write:

sum_r d_rs^2 = trace(Q) + n q_ss,   sum_r sum_s d_rs^2 = 2 n trace(Q)

Using these identities we can express the diagonal elements of Q using the elements of D:

q_ss = (1/n) sum_r d_rs^2 - (1/(2n^2)) sum_r sum_s d_rs^2
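The identity d_rs^2 = q_rr + q_ss - 2 q_rs linking squared distances to the inner-product matrix Q = XX^T can be verified numerically. A small Python/numpy check on made-up data (hypothetical points, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))   # 5 hypothetical points in 2-D
Q = X @ X.T                       # inner-product (Gram) matrix

# Squared Euclidean distances two ways: directly, and via Q
D2_direct = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
q = np.diag(Q)
D2_from_Q = q[:, None] + q[None, :] - 2.0 * Q
```

The two matrices agree to machine precision, confirming the identity used in the derivation.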

Metric scaling: Cont.

Let e = (q_11, q_22, ..., q_nn)^T be the vector of diagonal elements of Q and 1 the vector of ones. The relation between the elements of D and Q can then be written in matrix form:

D = e 1^T + 1 e^T - 2Q

If we use the relation between the diagonal elements of Q and the elements of D, and write H = I - (1/n) 1 1^T for the centring matrix, we obtain the relation between Q and D:

Q = -(1/2) H D H

Thus, given the matrix of squared dissimilarities D with elements d_ij^2, we can find the matrix Q using this relation. Since D is Euclidean, Q is symmetric and positive semi-definite, so we can use the spectral decomposition of Q (decomposition using eigenvalues and eigenvectors):

Q = V L V^T, with L = diag(l_1, ..., l_n)

If Q is positive semi-definite then all eigenvalues are non-negative, and we can write (recall the form Q = XX^T):

X = V L^(1/2)

This gives the matrix of n coordinates (principal coordinates). Of course this configuration is not unique: it can be rotated, reflected or translated without changing the distances.

Metric scaling: Cont.

The algorithm for metric scaling (finding principal coordinates) can be written as follows:

1) Given the matrix of dissimilarities, form the matrix A with elements a_ij = -(1/2) d_ij^2.
2) Subtract from each element of this matrix the averages of the row and column in which it is located, and add back the overall average; denote the result by B (equivalently, B = HAH).
3) Find the eigenvalues of B and the corresponding eigenvectors. The dimensionality of the representation corresponds to the number of non-zero eigenvalues of this matrix.
4) Normalise the eigenvectors v_i so that v_i^T v_i = l_i; these scaled eigenvectors are the principal coordinates.
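The four steps above can be sketched directly. This is an illustrative Python/numpy version (the lecture itself uses R's cmdscale for this); the test data are hypothetical planar points whose distance matrix the algorithm should reproduce exactly:

```python
import numpy as np

def classical_mds(D2, k):
    """Classical (principal-coordinate) scaling.
    D2 : (n, n) matrix of *squared* dissimilarities.
    k  : number of dimensions to return."""
    n = D2.shape[0]
    A = -0.5 * D2                          # step 1: a_ij = -d_ij^2 / 2
    H = np.eye(n) - np.ones((n, n)) / n    # centring matrix
    B = H @ A @ H                          # step 2: double-centre
    evals, evecs = np.linalg.eigh(B)       # step 3: eigendecomposition
    order = np.argsort(evals)[::-1]        # largest eigenvalues first
    evals, evecs = evals[order], evecs[:, order]
    # step 4: scale eigenvectors so that v_i^T v_i = lambda_i
    return evecs[:, :k] * np.sqrt(np.maximum(evals[:k], 0.0))

# Recover a planar configuration from its own distance matrix
rng = np.random.default_rng(1)
X = rng.standard_normal((6, 2))
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
Y = classical_mds(D2, k=2)
D2_rec = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
```

Since the input distances are Euclidean and two-dimensional, the recovered configuration Y reproduces D2 exactly (up to rotation, reflection and translation, which leave distances unchanged).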

Dimensionality

The "goodness of fit" of a k-dimensional configuration to the original dissimilarity matrix is measured using the "per cent trace", defined as:

100 (l_1 + ... + l_k) / (l_1 + ... + l_n)

Note that principal coordinate analysis and principal component analysis are related: if X is an nxp data matrix and the nxn dissimilarity matrix contains the Euclidean distances between its rows, then the principal component scores coincide with the coordinates calculated using scaling. It may happen that the dissimilarity matrix is not Euclidean. Then some of the calculated eigenvalues become negative and the coordinate representation would include imaginary coordinates. If these eigenvalues are small they can be ignored and set to 0. Another way of avoiding this problem is to find the best possible approximation of the dissimilarity matrix by a Euclidean one. If some eigenvalues are negative, the goodness of fit can be modified in two different ways:

100 (l_1 + ... + l_k) / sum_i |l_i|   or   100 (l_1^2 + ... + l_k^2) / sum_i l_i^2
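The per-cent-trace measure and its two common modifications for negative eigenvalues (dividing by the sum of absolute values, or working with squared eigenvalues) can be computed in a few lines. A Python sketch with an arbitrary illustrative eigenvalue sequence:

```python
import numpy as np

def percent_trace(evals, k):
    """'Per cent trace' fit of a k-dimensional solution, plus the two
    variants used when some eigenvalues are negative."""
    evals = np.sort(np.asarray(evals, dtype=float))[::-1]
    fit_plain = evals[:k].sum() / evals.sum() * 100
    fit_abs = evals[:k].sum() / np.abs(evals).sum() * 100
    fit_sq = (evals[:k] ** 2).sum() / (evals ** 2).sum() * 100
    return fit_plain, fit_abs, fit_sq

p, a, s = percent_trace([5.0, 3.0, 1.0, 1.0], k=2)
```

With all eigenvalues positive the first two variants coincide; they differ only when negative eigenvalues appear.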

Horseshoe effect

In many situations small distances between objects are measured more accurately than large ones. As a result, scaling techniques pull objects that are far apart closer to each other: the relative positions of objects close to each other are determined better than those of objects far from each other. This is called the horseshoe effect, and it is present in many other dimension-reduction techniques as well. As an illustration, define a 51x51 banded similarity matrix in which similarity decreases in steps with |i - j| and is zero beyond a cut-off, and derive the distance matrix from it using the relation given above. Metric scaling of this matrix produces a configuration bent into a horseshoe rather than the straight line one might expect for serially ordered objects.
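The horseshoe example can be reproduced numerically. The banded similarity below is an illustrative assumption in the spirit of the classic seriation example (the exact band constants of the original slide were not preserved); Python/numpy is used for the sketch:

```python
import numpy as np

# Banded similarity: objects close in index are most similar,
# similarity drops in steps of 1 every 3 indices, reaching 0 at a
# cut-off (constants are illustrative assumptions).
n = 51
idx = np.arange(n)
band = np.abs(idx[:, None] - idx[None, :])
S = np.maximum(9.0 - np.ceil(band / 3.0), 0.0)

# Dissimilarities from similarities: d_ij^2 = s_ii - 2 s_ij + s_jj
D2 = 2.0 * (9.0 - S)

# Classical scaling: double-centre, then eigendecompose
H = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * H @ D2 @ H
evals, evecs = np.linalg.eigh(B)
order = np.argsort(evals)[::-1]
Y = evecs[:, order[:2]] * np.sqrt(np.maximum(evals[order[:2]], 0.0))
```

Plotting Y shows the horseshoe: the first coordinate recovers the serial ordering of the objects, while the second bends the configuration into a curve instead of a straight line.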

Other types of scaling

Classical scaling is only one of the techniques used to find a configuration from a dissimilarity matrix; there are several others. Although they do not have a direct algebraic solution, with modern computers they can be implemented and used. One of these techniques minimises the function:

sum_{i<j} (delta_ij - d_ij)^2

where the delta_ij are the distances calculated from the fitted configuration. No algebraic solution exists: an initial dimension is chosen, a starting configuration is postulated, and this function is minimised numerically. If the dimension is changed, the whole procedure is repeated. The values of the function at the minimum for different dimensions can then be used in a scree plot to decide the dimensionality. Another family of techniques works with standardised versions of the calculated distances. The first of these is the standardised residual sum of squares:

STRESS = ( sum_{i<j} (d_ij - delta_ij)^2 / sum_{i<j} d_ij^2 )^(1/2)

and a modified version of it replaces the distances by squared distances. Both of these functions must be minimised iteratively using one of the optimisation techniques. For these techniques the distances do not have to be Euclidean.
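The standardised residual sum of squares is simple to evaluate for a given configuration. An illustrative Python sketch (the 3x3 distance matrix and the 10% perturbation are arbitrary examples):

```python
import numpy as np

def stress(delta, d):
    """Standardised residual sum of squares,
    sqrt( sum (d_ij - delta_ij)^2 / sum d_ij^2 ) over pairs i < j,
    where d are observed and delta are fitted distances."""
    iu = np.triu_indices_from(d, k=1)
    num = ((d[iu] - delta[iu]) ** 2).sum()
    den = (d[iu] ** 2).sum()
    return np.sqrt(num / den)

d = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])
delta = 1.1 * d          # fitted distances uniformly 10% too long
s = stress(delta, d)
```

A uniform 10% error in every fitted distance gives a STRESS of exactly 0.1, which is one reason this standardised form is convenient for comparing solutions across dimensions.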

Non-metric scaling

Although Euclidean and metric scaling can be applied in a wide range of cases, there are cases where the requirements of these techniques are not satisfied. For example:

1) The dissimilarities between objects may not be true distances but only an ordering of similarities, i.e. for the M = n(n-1)/2 pairs we can only rank the dissimilarities.
2) The measurements may have such large errors that we can be confident only about the order of the distances.
3) When we use metric scaling we assume that a true configuration exists and that what we measure approximates the interpoint distances of this configuration; it may happen that we instead measure some unknown monotone function of the interpoint distances.

These cases are usually handled using non-metric scaling. One technique is to minimise the STRESS function subject to monotonicity constraints: the fitted distances must preserve the observed ordering of the dissimilarities. This technique is called non-metric to distinguish it from the previously described techniques. If some observed dissimilarities are equal (ties), the usual approach is to exclude them from the constraints. Non-metric scaling techniques can handle missing distances simply by ignoring them.
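The monotonicity constraint is enforced in practice by isotonic regression: at each iteration, the configuration distances (sorted by observed dissimilarity order) are replaced by the closest monotone sequence, computed with the pool-adjacent-violators algorithm. A self-contained Python sketch of that inner step:

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y.
    This is the inner step of non-metric scaling: configuration
    distances, listed in observed-dissimilarity order, are replaced
    by the best monotone sequence."""
    blocks = []  # each block is [mean value, number of points pooled]
    for v in np.asarray(y, dtype=float):
        blocks.append([v, 1])
        # merge adjacent blocks while the monotone constraint is violated
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2 = blocks.pop()
            v1, w1 = blocks.pop()
            blocks.append([(v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2])
    out = []
    for v, w in blocks:
        out.extend([v] * w)
    return np.array(out)

# Distances in observed-dissimilarity order, with one violation (3 > 2)
fitted = pava([1.0, 3.0, 2.0, 4.0])
```

The violating pair is pooled to its mean, so the fitted sequence is monotone while staying as close as possible (in least squares) to the input.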

Example: U.K. distances

Here is a table of some of the intercity distances (the matrix is symmetric with a zero diagonal; remaining entries not reproduced here):

         Bristol Cardiff Dover Exeter Hull Leeds London
Bristol     0
Cardiff    47       0
Dover
Exeter
Hull
Leeds
London

Example: Result of the metric scaling

Metric scaling returns two-dimensional coordinates for the seven cities, one row per city in columns [,1] and [,2] (numeric values not reproduced here).

Example: Plot

Plotting the two-dimensional coordinates shows that, in this case, the configuration reproduces a map of the UK.

R commands for scaling

Classical metric scaling is in the library mva:

library(mva)
cc = cmdscale(d, k = 2, eig = FALSE, add = FALSE, x.ret = FALSE)

Here d is a distance matrix and k is the number of dimensions required. You can plot using plot(cc), or using the following set of commands:

x = cc[,1]   # or x = cc$points[,1]
y = cc[,2]   # or y = cc$points[,2]
plot(x, y, main = "Metric scaling results")
text(x, y, labels = rownames(cc))

It is a good idea to check whether the number of dimensions requested is sufficient. This can be done by requesting the eigenvalues (eig = TRUE) and comparing them.

R commands for scaling

Non-metric scaling can be done using isoMDS from the library MASS:

library(MASS)
?isoMDS

Then you can use:

cc1 = isoMDS(a, cmdscale(a, k), k = 2)

The second argument is the initial configuration. Then we can plot using:

x = cc1$points[,1]
y = cc1$points[,2]
plot(x, y, main = "isoMDS scaling")
text(x, y, labels = attr(a, "Labels"))

Another non-metric scaling command is sammon(a, cmdscale(a, k), k = 2). If you have a data matrix X, you can calculate distances using the command dist:

dist(X, method = "euclidean")

The result of this command can then be used for analysis with cmdscale, isoMDS or sammon.

References

1) Krzanowski, W.J. and Marriott, F.H.C. (1994) Multivariate Analysis. Kendall's Library of Statistics.
2) Mardia, K.V., Kent, J.T. and Bibby, J.M. (2003) Multivariate Analysis.