Published by Barnaby Smith. Modified over 9 years ago.
1
CS246 Topic-Based Models
2
Motivation
Q: For the query “car”, will a document with the word “automobile” be returned as a result under the TF-IDF vector model?
Q: Is it desirable?
Q: What can we do?
3
Topic-Based Models
Index documents by “topics”, not by individual terms.
Return a document if it shares a topic with the query.
  We can return a document containing “automobile” for the query “car”.
There are far fewer “topics” than “terms”.
  A topic-based index can be more compact than a term-based index.
4
Example (1)
Two topics: “Car”, “Movie”
Four terms: car, automobile, movie, theater

Topic-term matrix
Topic     car   automobile   movie   theater
“Car”     1     0.9          0       0
“Movie”   0     0            1       0.8

Document-topic matrix
        “Car”   “Movie”
doc1    0       1
doc2    1       0
doc3    0.8     0.2
5
Example (2)
But what we actually have is the document-term matrix! How are the three matrices related?

Document-term matrix
        car   automobile   movie   theater
doc1    0     0            1       0.8
doc2    1     0.9          0       0
doc3    0.8   0.72         0.2     0.16
6
Linearity Assumption
A document is generated as a topic-weighted linear combination of the topic-term vectors.
This is a simplifying assumption about document generation.
doc1 = 0 (1, 0.9, 0, 0) + 1 (0, 0, 1, 0.8) = (0, 0, 1, 0.8)
doc3 = 0.8 (1, 0.9, 0, 0) + 0.2 (0, 0, 1, 0.8) = (0.8, 0.72, 0.2, 0.16)
In matrix form: document-term matrix = document-topic matrix X topic-term matrix, using the three matrices from Example (1) and Example (2).
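The linearity assumption can be checked numerically. The following is a minimal sketch using NumPy; the variable names `topic_term`, `doc_topic`, and `doc_term` are illustrative and not from the slides, while the numbers are the ones in the example.

```python
import numpy as np

# Topic-term matrix from the example
# (rows: "Car", "Movie"; columns: car, automobile, movie, theater).
topic_term = np.array([
    [1.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.8],
])

# Document-topic matrix (rows: doc1, doc2, doc3; columns: "Car", "Movie").
doc_topic = np.array([
    [0.0, 1.0],
    [1.0, 0.0],
    [0.8, 0.2],
])

# Under the linearity assumption, each document row is a topic-weighted
# combination of the topic-term rows, i.e. a matrix product.
doc_term = doc_topic @ topic_term
print(doc_term)
```

Each row of `doc_term` should agree with the document-term matrix of Example (2), up to floating-point rounding.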
7
Topic-Based Index as Matrix Decomposition
8
# topics << # terms, # topics << # docs
Decompose the (doc-term) matrix into two matrices of rank K (K: # topics).
Of course, the decomposition will only be approximate for real data.
(doc x term) = (doc x topic) X (topic x term)
9
Topic-Based Index as Rank-K Approximation
Q: How do we choose the two decomposed matrices? What is the “best” decomposition?
Latent Semantic Indexing (LSI)
  Find the decomposition that is “closest” to the original matrix.
Singular-Value Decomposition (SVD)
  A decomposition method that leads to the best rank-K approximation.
We will spend the next few hours learning about SVD and its meaning.
  A basic understanding of linear algebra is very useful for both IR and data mining.
10
A Brief Review of Linear Algebra
Vector as a list of numbers
  Addition
  Scalar multiplication
  Dot product
  Dot product as a projection
  Q: (1, 0) vs (0, 1). Are they the same vectors?
  A: The choice of basis determines the “meaning” of the numbers.
Matrix
  Matrix multiplication
  Four ways to look at matrix multiplication
  Matrix as vector transformation
11
Change of Coordinates (1)
Two coordinate systems
Q: What are the coordinates of (2,0) under the second coordinate system?
Q: What about (1,1)?
12
Change of Coordinates (2)
In general, we get the new coordinates of a vector under the new basis vectors by multiplying the original coordinates by the following matrix: [matrix not preserved in this transcript]
  Verify with the previous example.
Q: What does the above matrix look like? How can we identify a coordinate-change matrix?
13
Matrix and Change of Coordinates
The new basis vectors are orthonormal to each other.
Orthonormal matrix: a matrix Q whose rows are orthonormal, so that Q Q^T = I.
  An orthonormal matrix can be interpreted as a change-of-coordinates transformation.
  The rows of the matrix Q are the new basis vectors.
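A small sketch of a change of coordinates with an orthonormal matrix follows. The second coordinate system drawn on the slides is not preserved in this transcript, so the basis used here (the standard axes rotated by 45 degrees) is purely an assumption for illustration, as are the names `Q`, `new_a`, and `new_b`.

```python
import numpy as np

s = 1 / np.sqrt(2)

# Hypothetical new basis (an assumption, not taken from the slides):
# the standard axes rotated by 45 degrees. Each ROW is a new basis vector.
Q = np.array([
    [ s, s],   # first new basis vector
    [-s, s],   # second new basis vector
])

# The rows are orthonormal, so Q @ Q.T is the identity matrix.
assert np.allclose(Q @ Q.T, np.eye(2))

# New coordinates = Q times the original coordinates
# (each entry is the dot product with one new basis vector).
new_a = Q @ np.array([2.0, 0.0])
new_b = Q @ np.array([1.0, 1.0])
print(new_a, new_b)
```

Under this assumed basis, (1,1) lies entirely along the first new basis vector, so its second new coordinate is zero.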
14
Linear Transformation
Every linear transformation can be represented as a matrix, by selecting appropriate basis vectors.
The matrix form of a linear transformation can be obtained simply by learning how the basis vectors transform.
  Verify with a 45-degree rotation.
Q: What transformations are possible for a linear transformation?
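The 45-degree rotation check might look like the sketch below. It assumes a counterclockwise rotation and the convention that the images of the standard basis vectors become the columns of the matrix; the names `e1_image`, `e2_image`, and `R` are illustrative.

```python
import numpy as np

theta = np.pi / 4  # 45 degrees, counterclockwise (an assumed direction)

# Where the standard basis vectors (1,0) and (0,1) land under the rotation.
e1_image = np.array([np.cos(theta), np.sin(theta)])
e2_image = np.array([-np.sin(theta), np.cos(theta)])

# The matrix of the transformation has those images as its columns.
R = np.column_stack([e1_image, e2_image])

# Rotating (1,1) by 45 degrees should put it on the y axis with length sqrt(2).
rotated = R @ np.array([1.0, 1.0])
print(rotated)
```

Note that this column convention describes how vectors move, whereas a change-of-coordinates matrix lists the new basis vectors as rows; the two matrices are transposes of each other for a rotation.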
15
Linear Transformations that We Know
Rotation
Stretching
Anything else?
Claim: Any linear transformation is a stretching followed by a rotation.
  This is the “meaning” of singular value decomposition.
  An important result of linear algebra.
  Let us learn why this is the case.
16
Rotation
Q: Matrix form of rotation? What property will it have?
Remember: rotation matrix R ⇔ orthonormal matrix
  The rotated basis vectors are unit basis vectors as well.
Orthonormal matrix ⇔ change of coordinates ⇔ rotation
17
Stretching (1)
Q: Matrix form of stretching by 3 along the x, y, z axes in 3D?
Q: Matrix form of stretching by 3 along the x axis and by 2 along the y axis in 3D?
Q: Stretching matrix ⇔ diagonal matrix?
18
Stretching (2)
Q: Matrix form of stretching by 3 along [...] and by 2 along [...]?
  Verify by transforming (1,1) and (-1, 1).
The decomposition T = Q T’ Q^T shows the transformation in a different coordinate system.
Under the matrix form, the simplicity of the stretching transformation may not be obvious.
Q: What if we chose [...] as the basis?
19
Stretching (3)
Under a good choice of basis vectors, an orthogonal-stretching transformation can always be represented as a diagonal matrix.
Q: How can we tell whether a matrix corresponds to an orthogonal-stretching transformation?
20
Stretching – Orthogonal Stretching (1)
Remember that this is orthogonal stretching along [...]
If a transformation is an orthogonal stretching, we should always be able to represent it as Q D Q^T for some Q, where Q shows the stretching axes.
Q: What is the matrix form of the transformation that stretches by 5 along (4/5, 3/5) and by 4 along (-3/5, 4/5)?
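One way to work out the last question numerically is sketched below. It assumes the stretching axes are placed in the columns of Q, so that the matrix is Q D Q^T; the variable names are illustrative.

```python
import numpy as np

# Stretching axes as the COLUMNS of Q (an assumed convention).
Q = np.array([
    [4/5, -3/5],
    [3/5,  4/5],
])

# Stretching factors: 5 along (4/5, 3/5) and 4 along (-3/5, 4/5).
D = np.diag([5.0, 4.0])

# Change to the stretching axes, stretch, and change back.
T = Q @ D @ Q.T
print(T)

# A vector on the first axis should simply be scaled by 5.
stretched = T @ np.array([4/5, 3/5])
print(stretched)
```

The resulting matrix is symmetric but not diagonal, which illustrates why the simplicity of the stretching is hidden in the standard coordinate system.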
21
Stretching – Orthogonal Stretching (2)
Q: Given a matrix, how do we know whether it is an orthogonal stretching?
A: When it can be decomposed as T = Q D Q^T.
A: Spectral Theorem
  Any symmetric matrix T can always be decomposed into T = Q D Q^T.
  Symmetric matrix ⇔ orthogonal stretching
Q: How can we decompose T into Q D Q^T?
A: If T stretches along X, then TX = λX for some λ.
  X: eigenvector of T
  λ: eigenvalue of T
  Solve the equation for λ and X.
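In practice the equation TX = λX is usually solved with a library routine. The sketch below uses NumPy's `np.linalg.eigh`, which is intended for symmetric matrices; the example matrix is the one that stretches by 5 along (4/5, 3/5) and by 4 along (-3/5, 4/5), and the variable names are illustrative.

```python
import numpy as np

# A symmetric matrix: stretch by 5 along (4/5, 3/5) and by 4 along (-3/5, 4/5).
T = np.array([
    [4.64, 0.48],
    [0.48, 4.36],
])

# eigh returns eigenvalues in ascending order and the matching
# unit eigenvectors as the columns of Q.
eigenvalues, Q = np.linalg.eigh(T)
D = np.diag(eigenvalues)

# Spectral theorem: T = Q D Q^T with orthonormal Q.
reconstructed = Q @ D @ Q.T
print(eigenvalues)
print(Q)
```

The eigenvalues recover the stretching factors and the columns of `Q` recover the stretching axes, possibly with flipped signs, since an eigenvector is only determined up to sign.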
22
Eigenvalues, Eigenvectors and Orthogonal Stretching
Eigenvector: stretching axis
Eigenvalue: stretching factor
All eigenvectors are orthogonal ⇔ orthogonal stretching ⇔ symmetric matrix (spectral theorem)
Example
Q: What transformation is this?
23
Singular Value Decomposition (SVD)
Any linear transformation T can be decomposed as T = R S (R: rotation, S: orthogonal stretching).
  One of the basic results of linear algebra.
In matrix form, any matrix T can be decomposed into a product of two orthonormal matrices and a diagonal matrix D.
  Diagonal entries in D: singular values
Example
Q: What transformation is this?
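The claim T = R S can be illustrated from NumPy's SVD output. This is a sketch under assumptions: the example matrix is arbitrary, the names `U`, `s`, `Vt` follow NumPy's convention (T = U diag(s) Vt) rather than the slides' notation, and in general R is an orthonormal matrix that may include a reflection as well as a rotation.

```python
import numpy as np

# An arbitrary square matrix chosen for illustration (not from the slides).
T = np.array([
    [2.0, 1.0],
    [0.0, 3.0],
])

# NumPy convention: T = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(T)

# Regroup the factors: T = (U Vt) (Vt^T diag(s) Vt) = R S.
R = U @ Vt                      # orthonormal: the rotation part
S = Vt.T @ np.diag(s) @ Vt      # symmetric: the orthogonal-stretching part

print(R)
print(S)
```

Here S stretches by the singular values along the rows of `Vt`, and R then rotates the result, matching "a stretching followed by a rotation".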
24
Singular Value Decomposition (2)
Q: For an (n x m) matrix T, what will be the dimensions of the three matrices after SVD?
Q: What is the meaning of a non-square diagonal matrix?
  The diagonal matrix is also responsible for projection (or dimension padding).
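The dimension question can be explored with a quick experiment. The sketch below assumes NumPy's full SVD (`full_matrices=True`) and an arbitrary 5 x 3 matrix; the names are NumPy's, not the slides'.

```python
import numpy as np

n, m = 5, 3
# Arbitrary (n x m) matrix for illustration.
T = np.arange(n * m, dtype=float).reshape(n, m)

# Full SVD: U is (n x n), Vt is (m x m), s holds min(n, m) singular values.
U, s, Vt = np.linalg.svd(T, full_matrices=True)

# The "diagonal" middle matrix is (n x m): singular values on the diagonal,
# padded with zero rows (or columns) to bridge the two dimensions.
D = np.zeros((n, m))
D[:len(s), :len(s)] = np.diag(s)

print(U.shape, D.shape, Vt.shape)
```

The zero padding in `D` is what changes the dimension of a vector as it passes through the transformation.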
25
Singular Values vs Eigenvalues
Q: What is this transformation?
A: Q1: eigenvectors of T^T T; D: square roots of the eigenvalues of T^T T.
Similarly, Q2: eigenvectors of T T^T; D: square roots of the eigenvalues of T T^T.
SVD can be done by computing the eigenvalues and eigenvectors of T^T T and T T^T.
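The relationship between singular values and eigenvalues can be verified on a small matrix. This is a sketch with an arbitrary 2 x 2 example; the variable names are illustrative.

```python
import numpy as np

# Arbitrary example matrix (not from the slides).
T = np.array([
    [3.0, 0.0],
    [4.0, 5.0],
])

# Singular values, returned in descending order.
singular_values = np.linalg.svd(T, compute_uv=False)

# Eigenvalues of the symmetric matrices T^T T and T T^T (ascending order).
eig_TtT = np.linalg.eigvalsh(T.T @ T)
eig_TTt = np.linalg.eigvalsh(T @ T.T)

# Square roots of the eigenvalues, reordered to descending.
sqrt_TtT = np.sqrt(eig_TtT[::-1])
sqrt_TTt = np.sqrt(eig_TTt[::-1])

print(singular_values, sqrt_TtT, sqrt_TTt)
```

For this matrix T^T T has eigenvalues 45 and 5, so the singular values are sqrt(45) and sqrt(5).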
26
SVD as Matrix Approximation
Q: If we want to reduce the rank of T to 2, what will be a good choice?
The best rank-k approximation of any matrix T is obtained by keeping only the k largest singular values of its SVD (and the corresponding singular vectors).
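A rank-k truncation might be sketched as follows. The helper name `rank_k_approximation` and the random test matrix are assumptions for illustration; "best" here is measured in the Frobenius norm, where the error equals the root of the sum of the squared discarded singular values.

```python
import numpy as np

def rank_k_approximation(T, k):
    """Keep the k largest singular values and their singular vectors."""
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Arbitrary 6 x 5 matrix for illustration (seeded for reproducibility).
rng = np.random.default_rng(0)
T = rng.random((6, 5))

T2 = rank_k_approximation(T, 2)

# Frobenius-norm error of the approximation.
error = np.linalg.norm(T - T2)

# The discarded singular values determine that error.
s = np.linalg.svd(T, compute_uv=False)
expected_error = np.sqrt(np.sum(s[2:] ** 2))

print(np.linalg.matrix_rank(T2), error, expected_error)
```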
27
SVD Approximation Example: a 1000 x 1000 matrix with entries in (0…255)
[Slide shows a sample of the matrix entries, with values between 53 and 62.]
28
Image of original matrix 1000x1000
29
SVD. Rank 1 approximation
30
SVD. Rank 10 approximation
31
SVD. Rank 100 approximation
32
Original vs Rank 100 approximation Q: How many numbers do we keep for each?
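The count can be worked out with a few lines of arithmetic. The sketch below assumes we store the k kept columns of the first matrix, the k singular values, and the k kept rows of the third matrix; the function name `svd_storage` is hypothetical.

```python
def svd_storage(n_rows, n_cols, k):
    """Numbers stored by a rank-k SVD: k left vectors, k right vectors,
    and k singular values."""
    return k * (n_rows + n_cols + 1)

# The original 1000 x 1000 matrix stores every entry.
original = 1000 * 1000

# The rank-100 approximation stores far fewer numbers.
rank_100 = svd_storage(1000, 1000, 100)

print(original, rank_100, rank_100 / original)
```

Under this accounting the rank-100 approximation keeps roughly a fifth of the numbers in the original matrix.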
33
Back to LSI
LSI: decompose the (doc-term) matrix into two matrices of rank K.
Our goal is to find the “best” rank-K approximation.
Apply SVD and keep the top-K singular values, meaning that we keep the first K columns of the first matrix and the first K rows of the third matrix after SVD.
(doc x term) = (doc x topic) X (topic x term)
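The whole pipeline can be tried on the toy document-term matrix from the earlier example. This is a sketch under assumptions: the "automobile"-only document is hypothetical (it is not one of the three example documents), cosine similarity is assumed as the matching score, and folding a vector into topic space by multiplying with the first K rows of the third SVD matrix is one common choice among several.

```python
import numpy as np

# Document-term matrix from the example
# (columns: car, automobile, movie, theater).
doc_term = np.array([
    [0.0, 0.0,  1.0, 0.8],
    [1.0, 0.9,  0.0, 0.0],
    [0.8, 0.72, 0.2, 0.16],
])

K = 2  # number of topics
U, s, Vt = np.linalg.svd(doc_term, full_matrices=False)
doc_topic = U[:, :K] * s[:K]   # first K columns of the first matrix, scaled
topic_term = Vt[:K, :]         # first K rows of the third matrix

def to_topic_space(term_vector):
    """Map a term-space vector (document or query) into the K-topic space."""
    return topic_term @ term_vector

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query_car = np.array([1.0, 0.0, 0.0, 0.0])
automobile_doc = np.array([0.0, 1.0, 0.0, 0.0])  # hypothetical document

term_space_score = cosine(query_car, automobile_doc)
topic_space_score = cosine(to_topic_space(query_car),
                           to_topic_space(automobile_doc))
movie_score = cosine(to_topic_space(query_car), to_topic_space(doc_term[0]))

print(term_space_score, topic_space_score, movie_score)
```

In term space the query "car" and the "automobile" document share no terms, while in topic space they point in the same direction because the two terms always co-occur in the example. The topic axes found by SVD need not line up with "Car" and "Movie" individually, since SVD is free to return any orthonormal basis of the same space.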
34
LSI and SVD
LSI: (doc x term) = (doc x topic) X (topic x term)
SVD: (doc x term) = product of the three SVD matrices
35
LSI and SVD
LSI summary
  Formulate the topic-based indexing problem as a rank-K matrix approximation problem.
  Use SVD to find the best rank-K approximation.
When applied to real data, a 10-20% improvement has been reported.
  Using LSI was the road to fame for Excite in its early days.
36
Limitations of LSI
Q: Any problems with LSI?
Problems with LSI
  Scalability
    SVD is known to be difficult to perform on large data.
  Interpretability
    The extracted document-topic matrix is impossible to interpret.
    It is difficult to understand why we get good/bad results from LSI for some queries.
Q: Is there any way to develop more interpretable topic-based indexing?
  Topic for the next lecture.
37
Summary
Topic-based indexing
  Synonym and polysemy problem
  Index documents by topic, not by terms
Latent Semantic Indexing (LSI)
  A document is a linear combination of its topic vector and the topic-term vectors
  Formulate the problem as a rank-K matrix approximation problem
  Use SVD to find the best approximation
Basic linear algebra
  Linear transformation, matrix, stretching and rotation
  Orthogonal stretching, diagonal matrix, symmetric matrix, eigenvalues and eigenvectors
  Rotation, change of coordinates, and orthonormal matrix
  SVD and its implication as a linear transformation