Image Manifolds: Learning-based Methods in Vision. Alexei Efros, CMU, Spring 2007. © A.A. Efros. With slides by Dave Thompson.
Images as Vectors: an m×n image can be unrolled into a single vector of length n·m. (figure: image grid flattened into a long vector)
Importance of Alignment: comparing two images as n·m vectors only makes sense if they are aligned. (figure: unrolled vectors compared element-wise)
Text Synthesis. [Shannon '48] proposed a way to generate English-looking text using N-grams: Assume a generalized Markov model. Use a large text to compute probability distributions of each letter given the N−1 previous letters. Starting from a seed, repeatedly sample this Markov chain to generate new letters. Also works for whole words. Example output: WE NEED TO EAT CAKE
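The letter-level N-gram procedure above can be sketched as follows (a minimal illustration; the toy corpus and the function names are mine, not from the slides):

```python
import random
from collections import defaultdict, Counter

def build_model(text, n=3):
    """Count which letter follows each (n-1)-letter context."""
    model = defaultdict(Counter)
    for i in range(len(text) - n + 1):
        context, nxt = text[i:i + n - 1], text[i + n - 1]
        model[context][nxt] += 1
    return model

def generate(model, seed, length=60):
    """Starting from a seed, repeatedly sample the Markov chain."""
    out = list(seed)
    n1 = len(seed)
    for _ in range(length):
        counts = model.get("".join(out[-n1:]))
        if not counts:          # unseen context: stop
            break
        letters, weights = zip(*counts.items())
        out.append(random.choices(letters, weights=weights)[0])
    return "".join(out)

# Toy corpus standing in for "a large text"
corpus = "we need to eat cake and we need to eat bread " * 20
model = build_model(corpus, n=3)
print(generate(model, "we"))
```

The same code works for whole words by tokenizing the corpus into words instead of characters.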
Mark V. Shaney (Bell Labs) Results (using alt.singles corpus): “As I've commented before, really relating to someone involves standing next to impossible.” “One morning I shot an elephant in my arms and kissed him.” “I spent an interesting evening recently with a grain of salt”
Video Textures. Arno Schödl, Richard Szeliski, David Salesin, Irfan Essa. Microsoft Research, Georgia Tech.
Video textures
Our approach How do we find good transitions?
Finding good transitions: Compute the L2 distance D_ij between all pairs of frames. Similar frames make good transitions. (figure: frame i vs. frame j)
Markov chain representation Similar frames make good transitions
Transition costs. Transition from i to j if the successor of i is similar to j. Cost function: C_ij = D_{i+1, j}
Transition probabilities. Probability of transition P_ij is inversely related to cost: P_ij ∝ exp(−C_ij / σ²). (low cost → high probability)
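The distance/cost/probability pipeline above can be sketched in a few lines of NumPy (a minimal illustration; the function name and the σ value are mine — the slides leave σ as a free parameter):

```python
import numpy as np

def transition_probabilities(frames, sigma=1.0):
    """frames: (T, H, W) array of grayscale frames.
    Returns P[i, j] = probability of playing frame j after frame i."""
    T = frames.shape[0]
    flat = frames.reshape(T, -1).astype(float)
    # D[i, j] = L2 distance between frame i and frame j
    D = np.linalg.norm(flat[:, None, :] - flat[None, :, :], axis=2)
    # Transition i -> j is good if the *successor* of i resembles j:
    # C[i, j] = D[i+1, j]   (so rows correspond to i = 0 .. T-2)
    C = D[1:, :]
    P = np.exp(-C / sigma)             # low cost -> high probability
    P /= P.sum(axis=1, keepdims=True)  # normalize each row
    return P
```

To synthesize video, repeatedly sample the next frame index with `np.random.choice(T, p=P[i])`.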
Preserving dynamics
Cost for transition i→j: C_ij = Σ_k w_k · D_{i+k+1, j+k}
Preserving dynamics – effect. Cost for transition i→j: C_ij = Σ_k w_k · D_{i+k+1, j+k}
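The windowed cost above can be sketched as follows (an illustrative implementation: the binomial weights w_k = (1, 2, 2, 1), the window size, and the function name are my assumptions, not values from the slides):

```python
import numpy as np

def dynamics_cost(D, weights=(1, 2, 2, 1)):
    """Filter the raw distance matrix D over a small temporal window:
        C[i, j] = sum_k  w_k * D[i+k+1, j+k],   k = -N .. N-1
    so that a transition is cheap only if whole subsequences match,
    preserving dynamics.  len(weights) = 2N (here N = 2)."""
    N = len(weights) // 2
    T = D.shape[0]
    C = np.full((T, T), np.inf)       # frames near the ends stay infinite
    for i in range(N - 1, T - N):
        for j in range(N, T - N + 1):
            C[i, j] = sum(w * D[i + k + 1, j + k]
                          for w, k in zip(weights, range(-N, N)))
    return C
```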
Video sprite extraction
Video sprite control Augmented transition cost:
Interactive fish
Advanced Perception. David R. Thompson. Manifold learning with applications to object recognition.
plenoptic function manifolds in vision
appearance variation manifolds in vision images from hormel corp.
deformation manifolds in vision images from
Manifold learning: Find a low-D basis for describing high-D data. X ≈ X′ s.t. dim(X′) ≪ dim(X). Uncovers the intrinsic dimensionality.
If we knew all pairwise distances…

         Chicago  Raleigh  Boston  Seattle  S.F.  Austin  Orlando
Chicago      0
Raleigh    641        0
Boston
Seattle
S.F.
Austin
Orlando

(only the first entries are filled in on the slide)
Distances calculated with geobytes.com/CityDistanceTool
Multidimensional Scaling (MDS). For n data points and a distance matrix D, where D_ij is the distance between points i and j, we can construct an m-dimensional embedding that preserves inter-point distances by using the top eigenvectors of the (double-centered) distance matrix, scaled by the square roots of their eigenvalues.
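One standard way to realize this (classical MDS) is sketched below; the function name is mine, and the toy distance matrix (three points on a line at 0, 1, 3) is only for illustration:

```python
import numpy as np

def classical_mds(D, m=2):
    """Embed n points in m dimensions from a pairwise distance
    matrix D, preserving inter-point distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)        # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:m]      # keep the top m eigenpairs
    L = np.sqrt(np.maximum(vals[idx], 0)) # scale by sqrt of eigenvalues
    return vecs[:, idx] * L               # one row of coordinates per point

# Toy check: three collinear points at positions 0, 1, 3
D = np.array([[0., 1., 3.],
              [1., 0., 2.],
              [3., 2., 0.]])
Y = classical_mds(D, m=1)
```

Because the toy distances are exactly Euclidean, the 1-D embedding reproduces them exactly; with real data (like city distances on a sphere) the reconstruction is only approximate.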
MDS result in 2D
Actual plot of cities
Don’t know distances
Why do manifold learning? 1. data compression 2. “curse of dimensionality” 3. de-noising 4. visualization 5. reasonable distance metrics
reasonable distance metrics ?
? linear interpolation
reasonable distance metrics ? manifold interpolation
Isomap for images. Build a data graph G. Vertices: images. (u, v) is an edge iff SSD(u, v) is small. For any two images, we approximate the distance between them with the shortest path on G.
Isomap, step 1: Build a sparse graph with K-nearest neighbors. (the graph distance matrix D_g is sparse)
Isomap, step 2: Infer the other inter-point distances by finding shortest paths on the graph (Dijkstra's algorithm), filling in D_g.
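The two steps can be sketched as below (an illustrative implementation with Euclidean distances standing in for SSD; the function name and k value are mine, and a real system would use a sparse-graph library):

```python
import heapq
import numpy as np

def isomap_distances(X, k=3):
    """Estimate geodesic distances: build a K-nearest-neighbor graph,
    then run Dijkstra's algorithm from every vertex."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    # Step 1: sparse graph keeping each point's k nearest neighbors
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in np.argsort(D[i])[1:k + 1]:   # skip self at position 0
            adj[i].append((j, D[i, j]))
            adj[j].append((i, D[i, j]))       # symmetrize
    # Step 2: all-pairs shortest paths on the graph
    Dg = np.full((n, n), np.inf)
    for src in range(n):
        Dg[src, src] = 0.0
        pq = [(0.0, src)]
        while pq:
            d, u = heapq.heappop(pq)
            if d > Dg[src, u]:
                continue                      # stale queue entry
            for v, w in adj[u]:
                if d + w < Dg[src, v]:
                    Dg[src, v] = d + w
                    heapq.heappush(pq, (d + w, v))
    return Dg
```

Feeding the resulting D_g to MDS yields the Isomap embedding.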
Isomap shortest-distance on a graph is easy to compute
Isomap results: hands
Isomap: pros and cons. + preserves global structure. + few free parameters. − sensitive to noise and noisy edges. − computationally expensive (dense matrix eigen-reduction).
Leakage problem
Find a mapping to preserve local linear relationships between neighbors Locally Linear Embedding
LLE: Two key steps. 1. Find the weight matrix W of linear coefficients that best reconstruct each point from its neighbors: E(W) = Σ_i | x_i − Σ_j W_ij x_j |². Enforce the sum-to-one constraint Σ_j W_ij = 1.
LLE: Two key steps. 2. Find projected vectors Y that minimize the same reconstruction error under the fixed weights: Φ(Y) = Σ_i | y_i − Σ_j W_ij y_j |². Must solve for the whole dataset simultaneously.
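The two steps above can be sketched in NumPy (an illustrative implementation; the function name, neighbor count k, and the regularization term added to the local Gram matrix are my assumptions):

```python
import numpy as np

def lle(X, k=5, m=2, reg=1e-3):
    """Step 1: per-point reconstruction weights from k nearest neighbors,
    with a sum-to-one constraint.
    Step 2: low-D coordinates Y from the bottom eigenvectors of
    M = (I - W)^T (I - W), solved for the whole dataset at once."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]        # k nearest neighbors
        Z = X[nbrs] - X[i]                      # center neighbors on x_i
        G = Z @ Z.T                             # local Gram matrix
        G += reg * np.trace(G) * np.eye(k)      # regularize (may be singular)
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs] = w / w.sum()                # enforce sum-to-one
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    vals, vecs = np.linalg.eigh(M)              # ascending eigenvalues
    return vecs[:, 1:m + 1]                     # skip the constant eigenvector
```

Note that step 2 reduces to a single sparse eigenproblem, which is why LLE needs only simple linear algebra and has no local minima.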
LLE: Result — preserves local topology. (figure: PCA vs. LLE embeddings of the same data)
LLE: pros and cons. + no local minima, one free parameter. + incremental & fast. + simple linear algebra operations. − can distort global structure.