Multimedia Databases SVD II
SVD - Detailed outline Motivation Definition - properties Interpretation Complexity Case studies SVD properties More case studies Conclusions
SVD - detailed outline... Case studies SVD properties more case studies google/Kleinberg algorithms Conclusions
Optimality of SVD Def: The Frobenius norm of a n x m matrix M is (reminder) The rank of a matrix M is the number of independent rows (or columns) of M Let A=U V T and A k = U k k V k T (SVD approximation of A) Theorem: [Eckart and Young] Among all n x m matrices C of rank at most k, we have that:
SVD - Other properties - summary can produce orthogonal basis can solve over- and under-determined linear problems (see C(1) property) can compute ‘fixed points’ (= ‘steady state prob. in Markov chains’) (see C(4) property)
SVD -outline of properties (A): obvious (B): less obvious (C): least obvious (and most powerful!)
Properties - by defn.: A(0): A [n x m] = U [ n x r ] [ r x r ] V T [ r x m] A(1): U T [r x n] U [n x r ] = I [r x r ] (identity matrix) A(2): V T [r x n] V [n x r ] = I [r x r ] A(3): k = diag( 1 k, 2 k,... r k ) (k: ANY real number) A(4): A T = V U T
Less obvious properties A(0): A [n x m] = U [ n x r ] [ r x r ] V T [ r x m] B(1): A [n x m] (A T ) [m x n] = ??
Less obvious properties A(0): A [n x m] = U [ n x r ] [ r x r ] V T [ r x m] B(1): A [n x m] (A T ) [m x n] = U 2 U T symmetric; Intuition?
Less obvious properties A(0): A [n x m] = U [ n x r ] [ r x r ] V T [ r x m] B(1): A [n x m] (A T ) [m x n] = U 2 U T symmetric; Intuition? ‘document-to-document’ similarity matrix B(2): symmetrically, for ‘V’ (A T ) [m x n] A [n x m] = V 2 V T Intuition?
Less obvious properties B(3): ( (A T ) [m x n] A [n x m] ) k = V 2k V T and B(4): (A T A ) k ~ v 1 1 2k v 1 T for k>>1 where v 1 : [m x 1] first column (eigenvector) of V 1 : strongest eigenvalue
Less obvious properties B(4): (A T A ) k ~ v 1 1 2k v 1 T for k>>1 B(5): (A T A ) k v’ ~ (constant) v 1 ie., for (almost) any v’, it converges to a vector parallel to v 1 Thus, useful to compute first eigenvector/value (as well as the next ones…)
Less obvious properties - repeated: A(0): A [n x m] = U [ n x r ] [ r x r ] V T [ r x m] B(1): A [n x m] (A T ) [m x n] = U 2 U T B(2):(A T ) [m x n] A [n x m] = V 2 V T B(3): ( (A T ) [m x n] A [n x m] ) k = V 2k V T B(4): (A T A ) k ~ v 1 1 2k v 1 T B(5): (A T A ) k v’ ~ (constant) v 1
Least obvious properties A(0): A [n x m] = U [ n x r ] [ r x r ] V T [ r x m] C(1): A [n x m] x [m x 1] = b [n x 1] let x 0 = V (-1) U T b if under-specified, x 0 gives ‘shortest’ solution if over-specified, it gives the ‘solution’ with the smallest least squares error
Least obvious properties Illustration: under-specified, eg [1 2] [w z] T = 4 (ie, 1 w + 2 z = 4) all possible solutions x0x0 w z shortest-length solution
Least obvious properties - cont’d A(0): A [n x m] = U [ n x r ] [ r x r ] V T [ r x m] C(2): A [n x m] v 1 [m x 1] = 1 u 1 [n x 1] where v 1, u 1 the first (column) vectors of V, U. (v 1 == right-eigenvector) C(3): symmetrically: u 1 T A = 1 v 1 T u 1 == left-eigenvector Therefore:
Least obvious properties - cont’d A(0): A [n x m] = U [ n x r ] [ r x r ] V T [ r x m] C(4): A T A v 1 = 1 2 v 1 (fixed point - the usual dfn of eigenvector for a symmetric matrix)
Least obvious properties - altogether A(0): A [n x m] = U [ n x r ] [ r x r ] V T [ r x m] C(1): A [n x m] x [m x 1] = b [n x 1] then, x 0 = V (-1) U T b: shortest, actual or least-squares solution C(2): A [n x m] v 1 [m x 1] = 1 u 1 [n x 1] C(3): u 1 T A = 1 v 1 T C(4): A T A v 1 = 1 2 v 1
Properties - conclusions A(0): A [n x m] = U [ n x r ] [ r x r ] V T [ r x m] B(5): (A T A ) k v’ ~ (constant) v 1 C(1): A [n x m] x [m x 1] = b [n x 1] then, x 0 = V (-1) U T b: shortest, actual or least-squares solution C(4): A T A v 1 = 1 2 v 1
SVD - detailed outline... Case studies SVD properties more case studies Kleinberg/google algorithms Conclusions
Kleinberg’s Algorithm Main idea: In many cases, when you search the web using some terms, the most relevant pages may not contain this term (or contain the term only a few times) Harvard : Search Engines: yahoo, google, altavista Authorities and hubs
Kleinberg’s algorithm Problem dfn: given the web and a query find the most ‘authoritative’ web pages for this query Step 0: find all pages containing the query terms (root set) Step 1: expand by one move forward and backward (base set)
Kleinberg’s algorithm Step 1: expand by one move forward and backward
Kleinberg’s algorithm on the resulting graph, give high score (= ‘authorities’) to nodes that many important nodes point to give high importance score (‘hubs’) to nodes that point to good ‘authorities’) hubsauthorities
Kleinberg’s algorithm observations recursive definition! each node (say, ‘i’-th node) has both an authoritativeness score a i and a hubness score h i
Kleinberg’s algorithm Let E be the set of edges and A be the adjacency matrix: the (i,j) is 1 if the edge from i to j exists Let h and a be [n x 1] vectors with the ‘hubness’ and ‘authoritativiness’ scores. Then:
Kleinberg’s algorithm Then: a i = h k + h l + h m that is a i = Sum (h j ) over all j that (j,i) edge exists or a = A T h k l m i
Kleinberg’s algorithm symmetrically, for the ‘hubness’: h i = a n + a p + a q that is h i = Sum (q j ) over all j that (i,j) edge exists or h = A a p n q i
Kleinberg’s algorithm In conclusion, we want vectors h and a such that: h = A a a = A T h Recall properties: C(2): A [n x m] v 1 [m x 1] = 1 u 1 [n x 1] C(3): u 1 T A = 1 v 1 T
Kleinberg’s algorithm In short, the solutions to h = A a a = A T h are the left- and right- eigenvectors of the adjacency matrix A. Starting from random a’ and iterating, we’ll eventually converge (Q: to which of all the eigenvectors? why?)
Kleinberg’s algorithm (Q: to which of all the eigenvectors? why?) A: to the ones of the strongest eigenvalue, because of property B(5): B(5): (A T A ) k v’ ~ (constant) v 1
Kleinberg’s algorithm - results Eg., for the query ‘java’: java.sun.com (“the java developer”)
Kleinberg’s algorithm - discussion ‘authority’ score can be used to find ‘similar pages’ to page p (how?) closely related to ‘citation analysis’, social networs / ‘small world’ phenomena
google/page-rank algorithm closely related: The Web is a directed graph of connected nodes imagine a particle randomly moving along the edges (*) compute its steady-state probabilities. That gives the PageRank of each pages (the importance of this page (*) with occasional random jumps
PageRank Definition Assume a page A and pages P1, P2, …, Pm that point to A. Let d is a damping factor. PR(A) the pagerank of A. C(A) the out-degree of A. Then:
google/page-rank algorithm Compute the PR of each page~identical problem: given a Markov Chain, compute the steady state probabilities p1... p
Computing PageRank Iterative procedure Also, … navigate the web by randomly follow links or with prob p jump to a random page. Let A the adjacency matrix (n x n), d i out-degree of page i Prob(A i ->A j ) = pn -1 +(1-p)d i –1 A ij A’[i,j] = Prob(A i ->A j )
google/page-rank algorithm Let A’ be the transition matrix (= adjacency matrix, row-normalized : sum of each row = 1) =
google/page-rank algorithm A p = p =
google/page-rank algorithm A p = p thus, p is the eigenvector that corresponds to the highest eigenvalue (=1, since the matrix is row- normalized)
Kleinberg/google - conclusions SVD helps in graph analysis: hub/authority scores: strongest left- and right- eigenvectors of the adjacency matrix random walk on a graph: steady state probabilities are given by the strongest eigenvector of the transition matrix
Conclusions SVD: a valuable tool given a document-term matrix, it finds ‘concepts’ (LSI)... and can reduce dimensionality (KL)... and can find rules (PCA; RatioRules)
Conclusions cont’d... and can find fixed-points or steady- state probabilities (google/ Kleinberg/ Markov Chains)... and can solve optimally over- and under-constraint linear systems (least squares)
References Brin, S. and L. Page (1998). Anatomy of a Large-Scale Hypertextual Web Search Engine. 7th Intl World Wide Web Conf. Chen, C. M. and N. Roussopoulos (May 1994). Adaptive Selectivity Estimation Using Query Feedback. Proc. of the ACM-SIGMOD, Minneapolis, MN.
References cont’d Kleinberg, J. (1998). Authoritative sources in a hyperlinked environment. Proc. 9th ACM- SIAM Symposium on Discrete Algorithms. Press, W. H., S. A. Teukolsky, et al. (1992). Numerical Recipes in C, Cambridge University Press.