
1 Multimedia Databases SVD II

2 SVD - Detailed outline
Motivation
Definition - properties
Interpretation
Complexity
Case studies
SVD properties
More case studies
Conclusions

3 SVD - detailed outline
...
Case studies
SVD properties
more case studies
google/Kleinberg algorithms
Conclusions

4 Optimality of SVD
Def: The Frobenius norm of an n x m matrix M is (reminder) ||M||_F = sqrt( sum over i,j of M_ij^2 )
The rank of a matrix M is the number of independent rows (or columns) of M
Let A = U Σ V^T and A_k = U_k Σ_k V_k^T (the rank-k SVD approximation of A)
Theorem: [Eckart and Young] Among all n x m matrices C of rank at most k, A_k minimizes the error:
||A - A_k||_F <= ||A - C||_F
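
A minimal numpy sketch (not part of the original slides; the document-term matrix is made up for illustration) of the theorem: the rank-k truncation of the SVD is the best rank-k approximation in Frobenius norm, and its error equals the root of the sum of the squared discarded singular values.

    import numpy as np

    # toy document-term matrix (made up; rows = documents, columns = terms)
    A = np.array([[2., 0., 8., 6., 0.],
                  [1., 6., 0., 1., 7.],
                  [5., 0., 7., 4., 0.],
                  [7., 0., 8., 5., 0.],
                  [0., 10., 0., 0., 6.]])

    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    k = 2
    # rank-k approximation A_k = U_k Sigma_k V_k^T
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

    err = np.linalg.norm(A - A_k, 'fro')
    # the best rank-k error equals sqrt(sigma_{k+1}^2 + ... + sigma_r^2)
    print(err, np.sqrt(np.sum(s[k:] ** 2)))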

5 SVD - Other properties - summary
can produce orthogonal basis
can solve over- and under-determined linear problems (see property C(1))
can compute 'fixed points' (= 'steady-state probabilities in Markov chains') (see property C(4))

6 SVD - outline of properties
(A): obvious
(B): less obvious
(C): least obvious (and most powerful!)

7 Properties - by definition:
A(0): A [n x m] = U [n x r] Σ [r x r] V^T [r x m]
A(1): U^T [r x n] U [n x r] = I [r x r] (identity matrix)
A(2): V^T [r x m] V [m x r] = I [r x r]
A(3): Σ^k = diag( λ_1^k, λ_2^k, ..., λ_r^k ) (k: ANY real number)
A(4): A^T = V Σ U^T
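
A quick numpy check of these definitional properties (a sketch of my own, not from the slides; the matrix is random and only for illustration):

    import numpy as np

    A = np.random.rand(6, 4)                        # small random n x m matrix (illustration only)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Sigma = np.diag(s)

    # A(1), A(2): the columns of U and V are orthonormal
    print(np.allclose(U.T @ U, np.eye(len(s))))     # True
    print(np.allclose(Vt @ Vt.T, np.eye(len(s))))   # True (Vt = V^T, so Vt Vt^T = V^T V)
    # A(0) and A(4): A = U Sigma V^T and A^T = V Sigma U^T
    print(np.allclose(A, U @ Sigma @ Vt))           # True
    print(np.allclose(A.T, Vt.T @ Sigma @ U.T))     # True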

8 Less obvious properties
A(0): A [n x m] = U [n x r] Σ [r x r] V^T [r x m]
B(1): A [n x m] (A^T) [m x n] = ??

9 Less obvious properties
A(0): A [n x m] = U [n x r] Σ [r x r] V^T [r x m]
B(1): A [n x m] (A^T) [m x n] = U Σ^2 U^T
symmetric; Intuition?

10 Less obvious properties
A(0): A [n x m] = U [n x r] Σ [r x r] V^T [r x m]
B(1): A [n x m] (A^T) [m x n] = U Σ^2 U^T
symmetric; Intuition? 'document-to-document' similarity matrix
B(2): symmetrically, for 'V':
(A^T) [m x n] A [n x m] = V Σ^2 V^T
Intuition?

11 Less obvious properties
B(3): ( (A^T) [m x n] A [n x m] )^k = V Σ^(2k) V^T and
B(4): (A^T A)^k ~ v_1 λ_1^(2k) v_1^T for k >> 1
where v_1: [m x 1] first column (eigenvector) of V
λ_1: strongest eigenvalue

12 Less obvious properties
B(4): (A^T A)^k ~ v_1 λ_1^(2k) v_1^T for k >> 1
B(5): (A^T A)^k v' ~ (constant) v_1
i.e., for (almost) any v', it converges to a vector parallel to v_1
Thus, useful to compute the first eigenvector/value (as well as the next ones...)
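
Property B(5) is exactly what power iteration exploits. A minimal sketch (my own, not from the slides; the test matrix and iteration count are assumptions): repeatedly multiply an (almost) arbitrary vector by A^T A and re-normalize, and it converges to v_1, the first column of V.

    import numpy as np

    def power_iteration(A, n_iter=100):
        """Return (approximately) v_1, the strongest right singular vector of A."""
        m = A.shape[1]
        v = np.random.rand(m)             # (almost) any starting vector v'
        for _ in range(n_iter):
            v = A.T @ (A @ v)             # multiply by A^T A
            v = v / np.linalg.norm(v)     # re-normalize to avoid overflow
        return v

    A = np.random.rand(8, 5)              # random test matrix (illustration only)
    v1 = power_iteration(A)
    # compare against the first right singular vector from a full SVD (sign may differ)
    _, _, Vt = np.linalg.svd(A)
    print(abs(v1 @ Vt[0]))                # close to 1.0 once converged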

13 Less obvious properties - repeated:
A(0): A [n x m] = U [n x r] Σ [r x r] V^T [r x m]
B(1): A [n x m] (A^T) [m x n] = U Σ^2 U^T
B(2): (A^T) [m x n] A [n x m] = V Σ^2 V^T
B(3): ( (A^T) [m x n] A [n x m] )^k = V Σ^(2k) V^T
B(4): (A^T A)^k ~ v_1 λ_1^(2k) v_1^T
B(5): (A^T A)^k v' ~ (constant) v_1

14 Least obvious properties
A(0): A [n x m] = U [n x r] Σ [r x r] V^T [r x m]
C(1): A [n x m] x [m x 1] = b [n x 1]
let x_0 = V Σ^(-1) U^T b
if under-specified, x_0 gives the 'shortest' solution
if over-specified, it gives the 'solution' with the smallest least-squares error

15 Least obvious properties
Illustration: under-specified, e.g. [1 2] [w z]^T = 4 (i.e., 1·w + 2·z = 4)
(figure: in the w-z plane, the line of all possible solutions; x_0 is the shortest-length solution, the point of that line closest to the origin)
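
A small numpy sketch of this example (my own working of property C(1), not from the slides): the pseudoinverse solution x_0 = V Σ^(-1) U^T b is the shortest of all vectors satisfying 1·w + 2·z = 4.

    import numpy as np

    A = np.array([[1., 2.]])          # under-specified: one equation, two unknowns
    b = np.array([4.])

    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    x0 = Vt.T @ np.diag(1.0 / s) @ U.T @ b      # x0 = V Sigma^(-1) U^T b
    print(x0)                                    # [0.8, 1.6]: the shortest solution

    # any other solution, e.g. w = 4, z = 0, is longer:
    print(np.linalg.norm(x0), np.linalg.norm([4., 0.]))   # about 1.79 < 4.0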

16 Least obvious properties - cont’d
A(0): A [n x m] = U [n x r] Σ [r x r] V^T [r x m]
C(2): A [n x m] v_1 [m x 1] = λ_1 u_1 [n x 1]
where v_1, u_1 are the first (column) vectors of V, U (v_1 == right-eigenvector)
C(3): symmetrically: u_1^T A = λ_1 v_1^T
u_1 == left-eigenvector
Therefore:

17 Least obvious properties - cont’d
A(0): A [n x m] = U [n x r] Σ [r x r] V^T [r x m]
C(4): A^T A v_1 = λ_1^2 v_1
(fixed point - the usual definition of eigenvector for a symmetric matrix)

18 Least obvious properties - altogether
A(0): A [n x m] = U [n x r] Σ [r x r] V^T [r x m]
C(1): A [n x m] x [m x 1] = b [n x 1]
then, x_0 = V Σ^(-1) U^T b: shortest, actual or least-squares solution
C(2): A [n x m] v_1 [m x 1] = λ_1 u_1 [n x 1]
C(3): u_1^T A = λ_1 v_1^T
C(4): A^T A v_1 = λ_1^2 v_1
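
A short numpy check of C(2)-C(4) (a sketch of my own, not from the slides; the matrix is random and only illustrative):

    import numpy as np

    A = np.random.rand(6, 4)                         # random matrix (illustration)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    u1, v1, l1 = U[:, 0], Vt[0], s[0]                # first left/right vectors, strongest value

    print(np.allclose(A @ v1, l1 * u1))              # C(2): A v_1 = lambda_1 u_1
    print(np.allclose(u1 @ A, l1 * v1))              # C(3): u_1^T A = lambda_1 v_1^T
    print(np.allclose(A.T @ A @ v1, l1**2 * v1))     # C(4): A^T A v_1 = lambda_1^2 v_1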

19 Properties - conclusions
A(0): A [n x m] = U [n x r] Σ [r x r] V^T [r x m]
B(5): (A^T A)^k v' ~ (constant) v_1
C(1): A [n x m] x [m x 1] = b [n x 1]
then, x_0 = V Σ^(-1) U^T b: shortest, actual or least-squares solution
C(4): A^T A v_1 = λ_1^2 v_1

20 SVD - detailed outline
...
Case studies
SVD properties
more case studies
Kleinberg/google algorithms
Conclusions

21 Kleinberg’s Algorithm
Main idea: In many cases, when you search the web using some terms, the most relevant pages may not contain these terms (or contain them only a few times)
Harvard: www.harvard.edu
Search Engines: yahoo, google, altavista
Authorities and hubs

22 Kleinberg’s algorithm
Problem definition: given the web and a query, find the most ‘authoritative’ web pages for this query
Step 0: find all pages containing the query terms (root set)
Step 1: expand by one move forward and backward (base set)

23 Kleinberg’s algorithm Step 1: expand by one move forward and backward

24 Kleinberg’s algorithm
On the resulting graph, give a high score (= ‘authorities’) to nodes that many important nodes point to
give a high importance score (‘hubs’) to nodes that point to good ‘authorities’
(figure: hubs on one side pointing to authorities on the other)

25 Kleinberg’s algorithm - observations
recursive definition!
each node (say, the i-th node) has both an authoritativeness score a_i and a hubness score h_i

26 Kleinberg’s algorithm
Let E be the set of edges and A be the adjacency matrix: the (i,j) entry is 1 if the edge from i to j exists
Let h and a be [n x 1] vectors with the ‘hubness’ and ‘authoritativeness’ scores. Then:

27 Kleinberg’s algorithm
Then: a_i = h_k + h_l + h_m
that is, a_i = Sum (h_j) over all j such that the edge (j,i) exists
or a = A^T h
(figure: nodes k, l, m pointing to node i)

28 Kleinberg’s algorithm
symmetrically, for the ‘hubness’: h_i = a_n + a_p + a_q
that is, h_i = Sum (a_j) over all j such that the edge (i,j) exists
or h = A a
(figure: node i pointing to nodes n, p, q)

29 Kleinberg’s algorithm
In conclusion, we want vectors h and a such that:
h = A a
a = A^T h
Recall the properties:
C(2): A [n x m] v_1 [m x 1] = λ_1 u_1 [n x 1]
C(3): u_1^T A = λ_1 v_1^T

30 Kleinberg’s algorithm
In short, the solutions to
h = A a
a = A^T h
are the left- and right-eigenvectors of the adjacency matrix A.
Starting from a random a’ and iterating, we’ll eventually converge
(Q: to which of all the eigenvectors? why?)

31 Kleinberg’s algorithm
(Q: to which of all the eigenvectors? why?)
A: to the ones of the strongest eigenvalue, because of property B(5):
B(5): (A^T A)^k v' ~ (constant) v_1
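
A sketch of the iteration described above (the toy adjacency matrix and iteration count are my own, not from the slides): alternately apply a = A^T h and h = A a, normalizing each time, and the scores converge to the strongest right and left eigenvectors of A.

    import numpy as np

    # toy adjacency matrix (made up): A[i, j] = 1 if the edge i -> j exists
    A = np.array([[0, 1, 1, 0],
                  [0, 0, 1, 0],
                  [1, 0, 0, 0],
                  [0, 1, 1, 0]], dtype=float)

    n = A.shape[0]
    h = np.ones(n)                      # hubness scores
    a = np.ones(n)                      # authoritativeness scores

    for _ in range(50):
        a = A.T @ h                     # a_i = sum of h_j over edges (j, i)
        h = A @ a                       # h_i = sum of a_j over edges (i, j)
        a /= np.linalg.norm(a)          # normalize so the scores don't blow up
        h /= np.linalg.norm(h)

    print("authorities:", np.round(a, 3))
    print("hubs:       ", np.round(h, 3))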

32 Kleinberg’s algorithm - results
E.g., for the query ‘java’:
0.328 www.gamelan.com
0.251 java.sun.com
0.190 www.digitalfocus.com (“the java developer”)

33 Kleinberg’s algorithm - discussion
the ‘authority’ score can be used to find ‘similar pages’ to page p (how?)
closely related to ‘citation analysis’ and social networks / ‘small world’ phenomena

34 google/page-rank algorithm
closely related: the Web is a directed graph of connected nodes
imagine a particle randomly moving along the edges (*)
compute its steady-state probabilities: that gives the PageRank of each page (the importance of this page)
(*) with occasional random jumps

35 PageRank Definition
Assume a page A and pages P_1, P_2, ..., P_m that point to A. Let d be a damping factor, PR(A) the pagerank of A, and C(A) the out-degree of A. Then:
PR(A) = (1-d) + d ( PR(P_1)/C(P_1) + ... + PR(P_m)/C(P_m) )

36 google/page-rank algorithm
Computing the PR of each page is an (almost) identical problem: given a Markov Chain, compute the steady-state probabilities p1 ... p5
(figure: example Markov chain with 5 states)

37 Computing PageRank
Iterative procedure
Also, ... navigate the web by randomly following links, or, with probability p, jump to a random page.
Let A be the adjacency matrix (n x n), d_i the out-degree of page i
Prob(A_i -> A_j) = p · n^(-1) + (1-p) · d_i^(-1) · A_ij
A'[i,j] = Prob(A_i -> A_j)
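
A minimal random-surfer sketch following the transition probabilities above (the toy graph, the value of p, and the iteration count are assumptions of mine, not from the slides): build A'[i,j] = p/n + (1-p)·A[i,j]/d_i and iterate until the steady-state vector stops changing.

    import numpy as np

    # toy adjacency matrix (made up): A[i, j] = 1 if page i links to page j
    A = np.array([[0, 1, 1, 0, 0],
                  [0, 0, 1, 0, 0],
                  [1, 0, 0, 1, 0],
                  [0, 0, 0, 0, 1],
                  [0, 0, 1, 1, 0]], dtype=float)

    n = A.shape[0]
    p = 0.15                                   # assumed probability of a random jump
    d = A.sum(axis=1)                          # d_i: out-degree of page i

    # A'[i, j] = Prob(A_i -> A_j) = p/n + (1-p) * A[i, j] / d_i
    A_prime = p / n + (1 - p) * A / d[:, None]

    # power iteration for the steady-state distribution: pr^T A' = pr^T
    pr = np.ones(n) / n
    for _ in range(100):
        pr = A_prime.T @ pr

    print(np.round(pr, 3), pr.sum())           # PageRank scores; they sum to 1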

38 google/page-rank algorithm
Let A' be the transition matrix (= adjacency matrix, row-normalized: the sum of each row = 1)
(figure: the 5-node example graph and its row-normalized transition matrix)

39 google/page-rank algorithm
The steady-state vector p satisfies p^T A' = p^T, i.e., (A')^T p = p
(figure: the matrix equation for the 5-node example)

40 google/page-rank algorithm
(A')^T p = p
thus, p is the eigenvector that corresponds to the highest eigenvalue (= 1, since the transition matrix is row-normalized)

41 Kleinberg/google - conclusions
SVD helps in graph analysis:
hub/authority scores: strongest left- and right-eigenvectors of the adjacency matrix
random walk on a graph: steady-state probabilities are given by the strongest eigenvector of the transition matrix

42 Conclusions
SVD: a valuable tool
given a document-term matrix, it finds ‘concepts’ (LSI)
... and can reduce dimensionality (KL)
... and can find rules (PCA; RatioRules)

43 Conclusions cont’d
... and can find fixed points or steady-state probabilities (google / Kleinberg / Markov Chains)
... and can solve optimally over- and under-constrained linear systems (least squares)

44 References
Brin, S. and L. Page (1998). The Anatomy of a Large-Scale Hypertextual Web Search Engine. 7th Intl. World Wide Web Conf.
Chen, C. M. and N. Roussopoulos (May 1994). Adaptive Selectivity Estimation Using Query Feedback. Proc. of ACM-SIGMOD, Minneapolis, MN.

45 References cont’d
Kleinberg, J. (1998). Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms.
Press, W. H., S. A. Teukolsky, et al. (1992). Numerical Recipes in C. Cambridge University Press.

