Curvilinear Component Analysis and Bregman Divergences. Jigang Sun, Colin Fyfe, Malcolm Crowe. University of the West of Scotland, 28 April 2010.
Multidimensional Scaling (MDS): a group of information visualisation methods that project data from a high-dimensional space to a low-dimensional space, often of two or three dimensions, keeping the inter-point dissimilarities (e.g. distances) in the low-dimensional space as close as possible to the original dissimilarities in the high-dimensional space. When Euclidean distances are used, the method is metric MDS.
Visualising 18-dimensional data
Basic MDS: the stress function to be minimised. Sammon mapping (1969). Improving the Sammon mapping with a Bregman divergence.
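The stress formulas on this slide did not survive extraction. As a minimal sketch, assuming the commonly cited forms (a sum of squared differences between latent and data distances for basic MDS, and the same squared differences divided by the data distance and normalised by the total data distance for the Sammon mapping), the two stresses could be computed as follows. The function names and the toy data are illustrative and not taken from the slides.

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distance matrix between all rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def basic_mds_stress(data_dist, latent_dist):
    """Sum over pairs i < j of (L_ij - D_ij)^2 (assumed textbook form)."""
    iu = np.triu_indices_from(data_dist, k=1)
    return float(((latent_dist[iu] - data_dist[iu]) ** 2).sum())

def sammon_stress(data_dist, latent_dist):
    """Squared errors weighted by 1 / D_ij, normalised by the sum of D_ij
    (assumed textbook form of Sammon's 1969 stress)."""
    iu = np.triu_indices_from(data_dist, k=1)
    d = data_dist[iu]
    l = latent_dist[iu]
    return float((((l - d) ** 2) / d).sum() / d.sum())

# Toy example: 5-dimensional data, "projected" by keeping two coordinates.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
Y = X[:, :2]
D = pairwise_distances(X)   # distances in data space
L = pairwise_distances(Y)   # distances in latent space
print(basic_mds_stress(D, L), sammon_stress(D, L))
```

Dividing by the data distance makes the Sammon stress give relatively more weight to errors on small distances, which is the property the Bregman extension builds on.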
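Bregman divergence: for a strictly convex, differentiable function F, it is intuitively the difference between the value of F at point p and the value of the first-order Taylor expansion of F around point q, evaluated at point p. In one variable, d_F(p, q) = F(p) - F(q) - (p - q) F'(q).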
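Two representations: when F is a function of one variable, the Bregman divergence is a truncated Taylor series, namely the expansion of F(p) around q with the zeroth- and first-order terms removed. Two useful properties for MDS: 1. Non-negativity. 2. Non-symmetry, except in special cases such as F(x) = x^2.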
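The two properties can be checked numerically. The sketch below implements the one-variable divergence as defined in words above and evaluates it for F(x) = x^2, the symmetric special case named on the slide, and for F(x) = exp(x), which is an illustrative convex function chosen here and not taken from the slides.

```python
import math

def bregman(F, dF, p, q):
    """One-variable Bregman divergence: F(p) minus the first-order
    Taylor expansion of F around q, evaluated at p."""
    return F(p) - F(q) - (p - q) * dF(q)

def F_sq(x):
    return x * x

def dF_sq(x):
    return 2.0 * x

# Illustrative non-quadratic convex function (an assumption of this sketch).
F_exp = math.exp
dF_exp = math.exp

# F(x) = x^2 gives the squared difference, which is symmetric in p and q.
sq_pq = bregman(F_sq, dF_sq, 3.0, 1.0)
sq_qp = bregman(F_sq, dF_sq, 1.0, 3.0)

# F(x) = exp(x) gives a non-negative but asymmetric divergence.
exp_pq = bregman(F_exp, dF_exp, 2.0, 1.0)
exp_qp = bregman(F_exp, dF_exp, 1.0, 2.0)

print(sq_pq, sq_qp, exp_pq, exp_qp)
```

For MDS the asymmetry matters because it decides whether overestimating or underestimating a distance is penalised more heavily.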
Improving the Sammon mapping with Bregman divergences. Recall the classical Sammon mapping (1969). Choose a base convex function. Important: the first term in the last line is the Sammon mapping. Common term: ignoring constant coefficients, the first term of the Extended Sammon stress is the Sammon stress.
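The base convex function on this slide did not survive extraction. As a sketch under the assumption that it is F(x) = x log x, whose second derivative is 1/x, the divergence with the latent distance L in the left position and the data distance D in the right position has (L - D)^2 / (2D) as its leading Taylor term, which is the Sammon term up to a constant. The code checks this numerically; the function names are hypothetical.

```python
import numpy as np

def left_bregman_xlogx(latent, data):
    """d_F(L, D) for the assumed base function F(x) = x log x.
    F(L) - F(D) - (L - D) F'(D) simplifies to L log(L / D) - L + D."""
    return latent * np.log(latent / data) - latent + data

def sammon_term(latent, data):
    """Per-pair Sammon term without the normalising constant."""
    return (latent - data) ** 2 / data

D = 2.0
ratios = []
for eps in (0.1, 0.01, 0.001):
    L = D + eps
    # As L approaches D the ratio tends to the constant coefficient 1/2.
    ratios.append(left_bregman_xlogx(L, D) / sammon_term(L, D))
print(ratios)
```

The higher-order terms of the expansion are what distinguish the extended stress from the classical one when L is far from D.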
An experiment on the Swiss roll data set
Two groups of convex functions. No. 1 is for the Extended Sammon mapping.
OpenBox: Sammon and FirstGroup. Important: spend time explaining that this is a smooth deformation of the box.
SecondGroup on OpenBox. Take time to show this slide.
Curvilinear Component Analysis (CCA) and Bregman divergences. The weight function W(·) takes the inter-point distance in latent space as its argument. CCA is good at unfolding strongly nonlinear structures. Stochastic gradient descent updating rule.
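The updating rule itself did not survive extraction. A minimal sketch, assuming the commonly cited form of the CCA rule of Demartines and Hérault (pick one point i, then move every other point j along the line joining it to i, in proportion to the distance error times the weight of the latent distance), could look like this. The function name, the step-function weight and the two-point example are illustrative assumptions.

```python
import numpy as np

def cca_update(Y, D, i, alpha, weight):
    """One stochastic CCA step with point i held fixed (assumed form):
    y_j += alpha * (D_ij - L_ij) * weight(L_ij) * (y_j - y_i) / L_ij."""
    Y = Y.copy()
    others = np.arange(len(Y)) != i
    diff = Y[others] - Y[i]                 # y_j - y_i for all j != i
    L = np.linalg.norm(diff, axis=1)        # latent distances L_ij
    coeff = alpha * (D[i, others] - L) * weight(L) / L
    Y[others] += coeff[:, None] * diff
    return Y

def step_weight(lam):
    """Weight that is 1 for latent distances up to lam and 0 beyond."""
    return lambda L: (L <= lam).astype(float)

# Two points whose latent distance (1) is smaller than the data distance (2).
Y0 = np.array([[0.0, 0.0], [1.0, 0.0]])
D = np.array([[0.0, 2.0], [2.0, 0.0]])
Y1 = cca_update(Y0, D, i=0, alpha=0.5, weight=step_weight(5.0))
print(Y1)
```

Because the weight depends on the latent distance, pairs that are already far apart in the projection contribute little, which lets the map tear and unfold the manifold.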
A version of CCA. One possible weight function. Updating rule.
Rewriting the stress function for CCA using right Bregman divergences. Given a convex function. Emphasise that the latent distances are in the right position, i.e. the second argument of the divergence. The updating rule is the same.
The common term between Basic CCA and Real CCA: the first term is common to both.
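The convex function and the expansion behind this slide were lost in extraction. As a sketch only, assume the base function F(x) = exp(-x / lam): its second derivative is exp(-x / lam) / lam^2, so a right Bregman divergence d_F(D, L), with the latent distance L in the second position, has (D - L)^2 exp(-L / lam) / (2 lam^2) as its leading Taylor term. That is a squared distance error weighted by an exponentially decaying function of the latent distance, the shape of a CCA stress. The function names and the choice of F are assumptions of this sketch, not confirmed by the slides.

```python
import numpy as np

def right_bregman_exp(data, latent, lam):
    """d_F(D, L) for the assumed F(x) = exp(-x / lam), where
    F'(x) = -exp(-x / lam) / lam, so
    d_F(D, L) = exp(-D/lam) - exp(-L/lam) + (D - L) exp(-L/lam) / lam."""
    return (np.exp(-data / lam) - np.exp(-latent / lam)
            + (data - latent) * np.exp(-latent / lam) / lam)

def weighted_squared_error(data, latent, lam):
    """Leading Taylor term: (D - L)^2 * exp(-L / lam) / (2 lam^2)."""
    return (data - latent) ** 2 * np.exp(-latent / lam) / (2.0 * lam ** 2)

lam, L = 1.0, 1.0
cca_ratios = []
for eps in (0.1, 0.01):
    D = L + eps
    # The ratio tends to 1 as the data distance approaches the latent one.
    cca_ratios.append(right_bregman_exp(D, L, lam)
                      / weighted_squared_error(D, L, lam))
print(cca_ratios)
```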
Real CCA vs Basic CCA. Note that Basic CCA moves latent points further apart when their latent distance lies to the right of the critical point.
Conclusions. We introduced the Extended Sammon mapping and compared it with the Sammon mapping. We created two groups of left Bregman divergences and experimented with them on artificial data sets. A right Bregman divergence redefines the stress function for Curvilinear Component Analysis. Any questions?