Slide 1: Powerful Tools for Data Mining - Fractals, power laws, SVD. C. Faloutsos, Carnegie Mellon University. (MDIC-04, copyright C. Faloutsos, 2004.)
Slide 2: Part I: Singular Value Decomposition (SVD) = eigenvalue decomposition. www.cs.cmu.edu/~christos/
Slide 3: SVD - outline: Motivation; Definition - properties; Interpretation; more properties; Complexity; Applications - success stories; Conclusions.
Slide 4: SVD - Motivation. Problem #1: text - LSI: find 'concepts'. Problem #2: compression / dim. reduction.
Slide 5: SVD - Motivation. Problem #1: text - LSI: find 'concepts' (figure).
Slide 6: SVD - Motivation. Problem #2: compress / reduce dimensionality (figure).
Slide 7: Problem - specs: ~10^6 rows; ~10^3 columns; no updates. Compress / extract features.
Slides 8-9: SVD - Motivation (figures only).
Slide 10: SVD - outline: Motivation; Definition - properties; Interpretation; more properties; Complexity; Applications - success stories; Conclusions.
Slides 11-15: SVD - Definition (reminder: matrix multiplication, built up step by step): a [3 x 2] matrix times a [2 x 1] matrix gives a [3 x 1] matrix.
Slide 16: SVD - notation. Conventions: bold capitals -> matrices (e.g., A, U, Λ, V); bold lower-case -> column vectors (e.g., x, v1, u3); regular lower-case -> scalars (e.g., λ1, r).
Slide 17: SVD - Definition: A [n x m] = U [n x r] Λ [r x r] (V [m x r])^T
- A: n x m matrix (e.g., n documents, m terms)
- U: n x r matrix (n documents, r concepts)
- Λ: r x r diagonal matrix (strength of each 'concept') (r: rank of the matrix)
- V: m x r matrix (m terms, r concepts)
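To make the shapes concrete, here is a minimal numpy sketch (not from the slides; the toy document-term counts are made up) that computes the factors and checks that A = U Λ V^T holds:

```python
import numpy as np

# toy document-term matrix: 5 documents x 4 terms (made-up counts)
A = np.array([[1, 1, 0, 0],
              [2, 2, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 2, 2],
              [0, 0, 3, 3]], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)  # s holds the diagonal of Lambda
r = np.sum(s > 1e-10)                             # numerical rank
U, s, Vt = U[:, :r], s[:r], Vt[:r, :]             # keep the r non-zero components

print(U.shape, s.shape, Vt.shape)                 # (5, r) (r,) (r, 4)
print(np.allclose(A, U @ np.diag(s) @ Vt))        # True: A = U Lambda V^T
```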
Slide 18: SVD - Definition. A = U Λ V^T - example (figure).
Slide 19: SVD - Properties. THEOREM [Press+92]: it is always possible to decompose a matrix A into A = U Λ V^T, where:
- U, V: unique (*)
- U, V: column-orthonormal (i.e., the columns are unit vectors, orthogonal to each other): U^T U = I; V^T V = I (I: identity matrix)
- Λ: eigenvalues are positive, and sorted in decreasing order
Slides 20-25: SVD - Example. A = U Λ V^T on a document-term matrix whose terms are (data, inf. retrieval, brain, lung) and whose documents fall into two groups, CS and MD. The build-up annotates the factors:
- the columns of U correspond to a 'CS-concept' and an 'MD-concept'
- U is the doc-to-concept similarity matrix
- the diagonal of Λ holds the 'strength' of each concept (e.g., of the CS-concept)
- V is the term-to-concept similarity matrix
Slide 26: SVD - outline: Motivation; Definition - properties; Interpretation; more properties; Complexity; Applications - success stories; Conclusions.
Slide 27: SVD - Interpretation #1. 'Documents', 'terms' and 'concepts' [Foltz+92]: U: document-to-concept similarity matrix; V: term-to-concept similarity matrix; Λ: its diagonal elements give the 'strength' of each concept.
Slides 28-29: SVD - Interpretation #1. 'Documents', 'terms' and 'concepts'. Q: if A is the document-to-term matrix, what is A^T A? A: the term-to-term ([m x m]) similarity matrix. Q: and A A^T? A: the document-to-document ([n x n]) similarity matrix.
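A quick sketch verifying both answers on a toy matrix (all values made up):

```python
import numpy as np

A = np.array([[1, 1, 0, 0],
              [2, 2, 0, 0],
              [0, 0, 3, 3]], dtype=float)  # 3 docs x 4 terms, made-up counts

term_sim = A.T @ A   # (4 x 4): entry (i, j) = weighted co-occurrence of terms i and j
doc_sim  = A @ A.T   # (3 x 3): entry (i, j) = term overlap between documents i and j

print(term_sim.shape, doc_sim.shape)  # (4, 4) (3, 3)
```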
Slide 30: SVD - Interpretation #2: the best axis to project on ('best' = minimum sum of squares of projection errors).
Slides 31-33: SVD - Interpretation #2 (figures): SVD gives the best axis v1 to project on, with minimum RMS error.
Slides 34-36: SVD - Interpretation #2. A = U Λ V^T on the example:
- v1 is the first projection axis
- the first diagonal entry of Λ gives the variance ('spread') along the v1 axis
- U Λ gives the coordinates of the points along the projection axes
Slides 37-42: SVD - Interpretation #2, more details. Q: how exactly is dim. reduction done? A: set the smallest eigenvalues to zero; the corresponding columns of U and V can then be dropped, and A ~ U Λ V^T becomes an approximation (the build-up shows the factors shrinking step by step).
Slides 43-46: SVD - Interpretation #2. Exactly equivalent: the 'spectral decomposition' of the [n x m] matrix:
A = λ1 u1 v1^T + λ2 u2 v2^T + ... (r terms, each an [n x 1] column times a [1 x m] row)
Slide 47: SVD - Interpretation #2. Approximation / dim. reduction: keep only the first few terms (Q: how many?), assuming λ1 >= λ2 >= ...
Slide 48: SVD - Interpretation #2. A (heuristic - [Fukunaga]): keep 80-90% of the 'energy' (= the sum of squares of the λi's), assuming λ1 >= λ2 >= ...
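A minimal sketch of this truncation, with the energy heuristic picking k (the 0.9 level and the synthetic data are my choices):

```python
import numpy as np

def truncated_svd(A, energy=0.9):
    """Keep the first k spectral terms holding `energy` of sum(lambda_i^2)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    cum = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(cum, energy)) + 1   # smallest k reaching the energy level
    return U[:, :k], s[:k], Vt[:k, :]

A = np.random.default_rng(0).normal(size=(100, 20)) @ np.diag(np.linspace(3, 0.1, 20))
Uk, sk, Vtk = truncated_svd(A, energy=0.9)
A_approx = Uk @ np.diag(sk) @ Vtk               # sum of the first k rank-1 terms
print(Uk.shape[1], np.linalg.norm(A - A_approx) / np.linalg.norm(A))
```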
Slide 49: SVD - outline: Motivation; Definition - properties; Interpretation: #1 documents/terms/concepts, #2 dim. reduction, #3 picking non-zero, rectangular 'blobs', #4 fixed points - graph interpretation; Complexity; ...
Slides 50-53: SVD - Interpretation #3: SVD finds the non-zero, rectangular 'blobs' in a data matrix (figures).
Slide 54: SVD - Interpretation #4: A as a vector transformation: x' = A x.
Slide 55: SVD - Interpretation #4. If A is symmetric then, by definition, its eigenvectors remain parallel to themselves ('fixed points'): A v1 = 3.62 * v1.
Slide 56: SVD - Interpretation #4. Fixed-point operation (A^T A is symmetric):
- A^T A vi = λi^2 vi (i = 1, ..., r) [Property 1]
- e.g., if A is the transition matrix of a Markov chain, then v1 is related to the steady-state probability distribution (= a particle doing a random walk on the graph)
- convergence: for (almost) any v': (A^T A)^k v' ~ (constant) v1 [Property 2], where v1 is the strongest 'right' eigenvector
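Property 2 is the power method in disguise; a small sketch (random matrix and variable names are mine) showing the iterate lining up with v1:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 4))
v1 = np.linalg.svd(A)[2][0]          # strongest right eigenvector, for reference

v = rng.normal(size=4)               # (almost) any starting vector v'
for _ in range(50):                  # repeatedly apply A^T A and renormalize
    v = A.T @ (A @ v)
    v /= np.linalg.norm(v)

print(abs(v @ v1))                   # ~1.0: v has converged to +/- v1
```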
Slides 57-58: SVD - Interpretation #4: convergence - illustration (figures).
Slide 59: SVD - Interpretation #4. v1 ... vr: 'right' eigenvectors; symmetrically, u1 ... ur: 'left' eigenvectors. For a matrix A (= U Λ V^T): A vi = λi ui and ui^T A = λi vi^T. (Right eigenvectors -> 'authority' scores in Kleinberg's algorithm; left ones -> 'hub' scores.)
Slide 60: SVD - outline: Motivation; Definition - properties; Interpretation; Complexity; Applications - success stories; Conclusions.
Slide 61: SVD - Complexity: O(n * m * m) or O(n * n * m) (whichever is less). Less work if we just want the eigenvalues, or only the first k eigenvectors, or if the matrix is sparse [Berry]. Implemented in any linear algebra package (LINPACK, Matlab, S-Plus, Mathematica, ...).
Slide 62: SVD - conclusions so far. SVD: A = U Λ V^T: unique (*); U: document-to-concept similarities; V: term-to-concept similarities; Λ: strength of each concept.
Slide 63: Dim. reduction: keep the first few strongest eigenvalues (80-90% of the 'energy'). SVD picks up linear correlations. SVD picks up the non-zero 'blobs'. v1: fixed point (-> steady-state probabilities).
Slide 64: SVD - outline: Motivation; Definition - properties; Interpretation; Complexity; Applications - success stories; Conclusions.
Slide 65: SVD - Case studies. Applications - success stories: multi-lingual IR; LSI queries; compression; PCA - 'ratio rules'; Karhunen-Loève transform; Kleinberg / Google algorithm.
Slide 66: Case study - LSI. Q1: How to do queries with LSI? Q2: multi-lingual IR (an English query, on Spanish text?).
Slide 67: Case study - LSI. Q1: How to do queries with LSI? Problem: e.g., find the documents containing 'data' (figure: the document-term matrix over data, inf. retrieval, brain, lung and its factors).
Slides 68-71: Case study - LSI. Q1: How to do queries with LSI? A: map the query vector into 'concept space'. How? A: take the inner product (cosine similarity) of q^T with each 'concept' vector vi (figures: the query q over the terms data, inf. retrieval, brain, lung, plotted against the term axes with v1 and v2, and its projection q · v1).
Slide 72: Case study - LSI. Compactly, we have: q_concept^T = q^T V. E.g., the query 'data' (a q^T with a single 1 over the terms data, inf. retrieval, brain, lung), times the term-to-concept similarity matrix V, gives the query's weight on the CS-concept.
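A sketch of that mapping with illustrative numbers (the V below is made up to mimic the CS/MD example):

```python
import numpy as np

terms = ["data", "inf. retrieval", "brain", "lung"]
# toy term-to-concept matrix V (4 terms x 2 concepts: CS, MD), made-up values
V = np.array([[0.71, 0.0],
              [0.71, 0.0],
              [0.0, 0.71],
              [0.0, 0.71]])

q = np.array([1.0, 0.0, 0.0, 0.0])   # query: just the term 'data'
q_concept = q @ V                    # q^T V: the query in concept space
print(q_concept)                     # [0.71, 0.0] -> strongly CS-concept
```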
Slides 73-74: Case study - LSI. Drill: how would the document ('information', 'retrieval') be handled by LSI? A: the SAME way: d_concept^T = d^T V (its term vector d^T times the term-to-concept similarity matrix V gives its weight on the CS-concept).
Slide 75: Observation: the document ('information', 'retrieval') will be retrieved by the query ('data'), although it does not contain 'data'!
Slide 76: Case study - LSI. Q1: How to do queries with LSI? Q2: multi-lingual IR (an English query, on Spanish text?).
Slide 77: Case study - LSI. Problem: given many documents, translated into both languages (e.g., English and Spanish), answer queries across languages.
Slides 78-79: Solution: ~ LSI, applied to the combined document-term matrix whose columns hold both the English terms (data, inf. retrieval, brain, lung) and the Spanish ones (datos, informacion), for the CS and MD documents (figures).
Slide 80: SVD - outline - Case studies: multi-lingual IR; LSI queries; compression; PCA - 'ratio rules'; Karhunen-Loève transform; Kleinberg / Google algorithm.
Slide 81: Case study: compression [Korn+97]. Problem: given a matrix, compress it, but maintain 'random access'.
Slide 82: Problem - specs: ~10^6 rows; ~10^3 columns; no updates; random access to any cell(s); a small error is OK.
Slide 83: Idea (figure).
Slide 84: SVD - reminder: space savings 2:1, with minimum RMS error (figure).
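The reason the SVD factors support 'random access' is that any single cell can be rebuilt from one row of U and one column of V^T; a sketch (the rank k = 10 is an arbitrary choice):

```python
import numpy as np

A = np.random.default_rng(2).normal(size=(1000, 50))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 10
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]   # the compressed representation

def cell(i, j):
    """Approximate A[i, j] without ever materializing the full matrix."""
    return float(Uk[i, :] @ (sk * Vtk[:, j]))

print(A[123, 7], cell(123, 7))             # close, with a small error
```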
Slide 85: Performance - scaleup (figures: space savings vs. reconstruction error).
Slide 86: SVD - outline - Case studies: multi-lingual IR; LSI queries; compression; PCA - 'ratio rules'; Karhunen-Loève transform; Kleinberg / Google algorithm.
Slides 87-89: PCA - 'Ratio Rules' [Korn+00]. Typically: 'Association Rules' (e.g., {bread, milk} -> {butter}). But: which set of rules is 'better'? And how do we reconstruct missing/corrupted values? Idea: try to find 'concepts': the eigenvectors dictate rules about ratios, e.g., bread:milk:butter = 2:4:3 (figure: axes '$ on bread', '$ on milk', '$ on butter', with the data along that direction).
Slide 90: PCA - 'Ratio Rules'. Moreover: visual data mining, for free.
Slide 91: PCA - Ratio Rules (figures: phonecalls and stocks datasets): no Gaussian clusters; Zipf-like distributions.
Slide 92: PCA - Ratio Rules. NBA dataset: ~500 players; ~30 attributes.
Slide 93: PCA: get the eigenvectors v1, v2, ...; ignore the entries with small absolute value; try to interpret the rest.
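A sketch of that recipe (the centering step and the 0.1 cutoff are my choices, not from the slides):

```python
import numpy as np

def ratio_rule_candidates(X, n_rules=2, thresh=0.1):
    """PCA on the data, then zero out small-magnitude entries of each eigenvector."""
    Xc = X - X.mean(axis=0)                    # center the data (cf. the K-L slides)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    rules = Vt[:n_rules].copy()
    rules[np.abs(rules) < thresh] = 0.0        # ignore entries with small abs. value
    return rules                               # interpret what is left as ratios

X = np.random.default_rng(3).normal(size=(500, 5))   # stand-in for ~500 players
print(ratio_rule_candidates(X))
```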
Slide 94: PCA - Ratio Rules. NBA dataset - the V matrix (term-to-'concept' similarities), first column v1 (figure).
Slides 95-96: Ratio Rules - example. RR1: minutes:points = 2:1. The corresponding concept? A: 'goodness' of a player.
Slides 97-99: Ratio Rules - example. RR2: points and rebounds are negatively correlated(!). The concept (from v2)? A: position: offensive/defensive.
Slide 100: SVD - Case studies: multi-lingual IR; LSI queries; compression; PCA - 'ratio rules'; Karhunen-Loève transform; Kleinberg / Google algorithm.
Slides 101-104: K-L transform [Duda & Hart], [Fukunaga]. A subtle point: SVD gives vectors that go through the origin, so the fitted v1 misses an off-origin cloud. Q: how to find the better axis v1'? A: 'centered' PCA, i.e., move the origin to the center of gravity, and THEN do SVD (figures).
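A minimal sketch of the centering step (synthetic data, deliberately shifted away from the origin):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 2)) @ np.array([[3, 1], [0, 0.3]]) + np.array([10, 5])

v1_raw = np.linalg.svd(X, full_matrices=False)[2][0]        # pulled toward the origin
Xc = X - X.mean(axis=0)                                      # move origin to the center of gravity
v1_centered = np.linalg.svd(Xc, full_matrices=False)[2][0]   # the K-L / PCA axis v1'

print(v1_raw, v1_centered)
```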
Slide 105: SVD - outline: Motivation; Definition - properties; Interpretation; Complexity; Applications - success stories; Conclusions.
Slide 106: SVD - Case studies: multi-lingual IR; LSI queries; compression; PCA - 'ratio rules'; Karhunen-Loève transform; Kleinberg / Google algorithm.
Slides 107-108: Kleinberg's algorithm. Problem dfn: given the web and a query, find the most 'authoritative' web pages for this query. Step 0: find all pages containing the query terms. Step 1: expand by one move forward and backward (figure).
Slide 109: Kleinberg's algorithm. On the resulting graph, give a high score (= 'authority') to nodes that many important nodes point to, and give a high importance score ('hub') to nodes that point to good 'authorities' (figure: hubs and authorities).
Slide 110: Kleinberg's algorithm - observations: a recursive definition! Each node (say, the i-th node) has both an authoritativeness score ai and a hubness score hi.
Slide 111: Kleinberg's algorithm. Let A be the adjacency matrix: the (i, j) entry is 1 if the edge from i to j exists. Let h and a be the [n x 1] vectors of 'hubness' and 'authoritativeness' scores. Then:
Slide 112: Then: ai = hk + hl + hm, that is, ai = Σ hj over all j such that the edge (j, i) exists; or: a = A^T h (figure: nodes k, l, m pointing to node i).
Slide 113: Symmetrically, for the 'hubness': hi = an + ap + aq, that is, hi = Σ aj over all j such that the edge (i, j) exists; or: h = A a (figure: node i pointing to nodes n, p, q).
Slide 114: Kleinberg's algorithm. In conclusion, we want vectors h and a such that: h = A a and a = A^T h. That is: a = A^T A a.
Slide 115: Kleinberg's algorithm. So a is a right eigenvector of the adjacency matrix A (by dfn!). Starting from a random a' and iterating, we'll eventually converge. (Q: to which of all the eigenvectors? why?)
Slide 116: A: to the one with the strongest eigenvalue, because of [Property 2]: (A^T A)^k v' ~ (constant) v1.
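The full loop in a short sketch (a tiny hand-made 4-page graph; hub scores are updated alongside, as the slides describe):

```python
import numpy as np

# adjacency matrix of a toy 4-page graph: A[i, j] = 1 iff edge i -> j
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 0]], dtype=float)

rng = np.random.default_rng(5)
a = rng.random(4)                       # random starting authority vector a'
for _ in range(100):
    h = A @ a                           # h = A a
    a = A.T @ h                         # a = A^T h, i.e. a <- A^T A a
    a /= np.linalg.norm(a)              # renormalize so the iteration converges
h /= np.linalg.norm(h)

print(np.round(a, 3), np.round(h, 3))   # authority and hub scores
```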
Slide 117: Kleinberg's algorithm - results. E.g., for the query 'java': 0.328 www.gamelan.com; 0.251 java.sun.com; 0.190 www.digitalfocus.com ('the java developer').
Slide 118: Kleinberg's algorithm - discussion. The 'authority' score can be used to find 'similar pages' (how?); closely related to 'citation analysis' and to social-network / 'small world' phenomena.
Slide 119: The Google / PageRank algorithm. Closely related: imagine a particle randomly moving along the edges (*), and compute its steady-state probabilities. ((*) with occasional random jumps.)
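A sketch of that walk as a power iteration (the 0.85 follow-an-edge probability is the customary choice, an assumption here; the graph is the one from the HITS sketch):

```python
import numpy as np

A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 0]], dtype=float)

out_deg = A.sum(axis=1, keepdims=True)
P = np.divide(A, out_deg, out=np.full_like(A, 1.0 / len(A)), where=out_deg > 0)
# P[i, j]: probability the particle moves i -> j (dangling nodes jump uniformly)

d, n = 0.85, len(A)
p = np.full(n, 1.0 / n)                  # start uniform
for _ in range(100):
    p = d * (P.T @ p) + (1 - d) / n      # follow an edge, or jump at random
print(np.round(p, 3))                    # steady-state probabilities
```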
Slide 120: SVD - outline: Motivation; Definition - properties; Interpretation; Complexity; Applications - success stories; Conclusions.
Slide 121: SVD - conclusions. SVD: a valuable tool, with two major properties: (A) it gives the optimal low-rank approximation (= the least-squares best hyperplane): it finds 'concepts' = principal components = features, and (linear) correlations; it supports dim. reduction and visualization; (and it solves over- and under-constrained problems).
Slide 122: SVD - conclusions, cont'd. (B) eigenvectors are fixed points: A^T A vi = λi^2 vi: steady-state probabilities in Markov chains / graphs; hubs / authorities.
Slide 123: Conclusions - in detail: given a (document-term) matrix, SVD finds 'concepts' (LSI), ... can reduce dimensionality (K-L), ... and can find rules (PCA; RatioRules).
Slide 124: Conclusions, cont'd: ... and can find fixed points or steady-state probabilities (Kleinberg / Google / Markov chains); (... and can optimally solve over- and under-constrained linear systems - see Appendix, Eq. C1, the Moore-Penrose pseudo-inverse).
Slides 125-129: References
- Berry, Michael: http://www.cs.utk.edu/~lsi/
- Brin, S. and L. Page (1998). The Anatomy of a Large-Scale Hypertextual Web Search Engine. 7th Intl. World Wide Web Conf.
- Chen, C. M. and N. Roussopoulos (May 1994). Adaptive Selectivity Estimation Using Query Feedback. Proc. ACM SIGMOD, Minneapolis, MN.
- Duda, R. O. and P. E. Hart (1973). Pattern Classification and Scene Analysis. New York, Wiley.
- Faloutsos, C. (1996). Searching Multimedia Databases by Content. Kluwer Academic Inc.
- Foltz, P. W. and S. T. Dumais (Dec. 1992). "Personalized Information Delivery: An Analysis of Information Filtering Methods." Comm. of the ACM (CACM) 35(12): 51-60.
- Fukunaga, K. (1990). Introduction to Statistical Pattern Recognition. Academic Press.
- Jolliffe, I. T. (1986). Principal Component Analysis. Springer Verlag.
- Kleinberg, J. (1998). Authoritative Sources in a Hyperlinked Environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms.
- Korn, F., H. V. Jagadish, et al. (May 1997). Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences. ACM SIGMOD, Tucson, AZ.
- Korn, F., A. Labrinidis, et al. (1998). Ratio Rules: A New Paradigm for Fast, Quantifiable Data Mining. VLDB, New York, NY.
- Korn, F., A. Labrinidis, et al. (2000). "Quantifiable Data Mining Using Ratio Rules." VLDB Journal 8(3-4): 254-266.
- Press, W. H., S. A. Teukolsky, et al. (1992). Numerical Recipes in C. Cambridge University Press.
- Strang, G. (1980). Linear Algebra and Its Applications. Academic Press.
Slide 130: Appendix: SVD properties. (A): obvious; (B): less obvious; (C): least obvious (and most powerful!).
Slide 131: Properties - by definition:
A(0): A [n x m] = U [n x r] Λ [r x r] V^T [r x m]
A(1): U^T [r x n] U [n x r] = I [r x r] (identity matrix)
A(2): V^T [r x m] V [m x r] = I [r x r]
A(3): Λ^k = diag(λ1^k, λ2^k, ..., λr^k) (k: ANY real number)
A(4): A^T = V Λ U^T
Slides 132-134: Less obvious properties.
A(0): A [n x m] = U [n x r] Λ [r x r] V^T [r x m]
B(1): A [n x m] (A^T) [m x n] = U Λ^2 U^T: symmetric; intuition? the 'document-to-document' similarity matrix
B(2): symmetrically, for V: (A^T) [m x n] A [n x m] = V Λ^2 V^T: intuition?
Slide 135: A: the term-to-term similarity matrix. Further:
B(3): ((A^T) [m x n] A [n x m])^k = V Λ^2k V^T
B(4): (A^T A)^k ~ v1 λ1^2k v1^T for k >> 1, where v1: the [m x 1] first column (eigenvector) of V, and λ1: the strongest eigenvalue
Slide 136: Less obvious properties.
B(4): (A^T A)^k ~ v1 λ1^2k v1^T for k >> 1
B(5): (A^T A)^k v' ~ (constant) v1, i.e., for (almost) any v', it converges to a vector parallel to v1. Thus, useful for computing the first eigenvector/value (as well as the next ones, too ...).
Slide 137: Less obvious properties - repeated:
A(0): A [n x m] = U [n x r] Λ [r x r] V^T [r x m]
B(1): A [n x m] (A^T) [m x n] = U Λ^2 U^T
B(2): (A^T) [m x n] A [n x m] = V Λ^2 V^T
B(3): ((A^T) [m x n] A [n x m])^k = V Λ^2k V^T
B(4): (A^T A)^k ~ v1 λ1^2k v1^T
B(5): (A^T A)^k v' ~ (constant) v1
Slide 138: Least obvious properties.
A(0): A [n x m] = U [n x r] Λ [r x r] V^T [r x m]
C(1): for A [n x m] x [m x 1] = b [n x 1], let x0 = V Λ^(-1) U^T b:
- if the system is under-specified, x0 gives the 'shortest' solution
- if it is over-specified, x0 gives the 'solution' with the smallest least-squares error
(see Num. Recipes, p. 62)
Slide 139: Illustration: under-specified, e.g., [1 2] [w z]^T = 4 (i.e., 1w + 2z = 4): every point on that line in the (w, z) plane is a solution; x0 is the shortest-length one (figure).
Slide 140: Illustration: over-specified, e.g., [3 2]^T [w] = [1 2]^T (i.e., 3w = 1; 2w = 2): only the points (3w, 2w) are reachable; x0 comes closest to the desirable point b (figure).
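Both cases in a short numpy sketch; np.linalg.pinv(A) computes exactly V Λ^(-1) U^T:

```python
import numpy as np

# under-specified: 1*w + 2*z = 4 has infinitely many solutions
A_under = np.array([[1.0, 2.0]])
x0 = np.linalg.pinv(A_under) @ np.array([4.0])
print(x0)                                   # [0.8, 1.6]: the shortest solution

# over-specified: 3*w = 1 and 2*w = 2 cannot both hold
A_over = np.array([[3.0], [2.0]])
x0 = np.linalg.pinv(A_over) @ np.array([1.0, 2.0])
print(x0)                                   # [0.538...]: the least-squares w
```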
Slide 141: Least obvious properties - cont'd.
C(2): A [n x m] v1 [m x 1] = λ1 u1 [n x 1], where v1, u1 are the first (column) vectors of V, U (v1 == right eigenvector)
C(3): symmetrically: u1^T A = λ1 v1^T (u1 == left eigenvector). Therefore:
Slide 142: C(4): A^T A v1 = λ1^2 v1 (a fixed point - the usual dfn of an eigenvector for a symmetric matrix).
Slide 143: Least obvious properties - altogether:
A(0): A [n x m] = U [n x r] Λ [r x r] V^T [r x m]
C(1): if A [n x m] x [m x 1] = b [n x 1], then x0 = V Λ^(-1) U^T b: the shortest, actual, or least-squares solution
C(2): A [n x m] v1 [m x 1] = λ1 u1 [n x 1]
C(3): u1^T A = λ1 v1^T
C(4): A^T A v1 = λ1^2 v1
Slide 144: Properties - conclusions:
A(0): A [n x m] = U [n x r] Λ [r x r] V^T [r x m]
B(5): (A^T A)^k v' ~ (constant) v1
C(1): if A [n x m] x [m x 1] = b [n x 1], then x0 = V Λ^(-1) U^T b: the shortest, actual, or least-squares solution
C(4): A^T A v1 = λ1^2 v1