CS246 Topic-Based Models

Motivation
- Q: For the query “car”, will a document containing the word “automobile” be returned as a result under the TF-IDF vector model?
- Q: Is that desirable?
- Q: What can we do?

Topic-Based Models
- Index documents by “topics”, not by individual terms
- Return a document if it shares a topic with the query
  - We can then return a document containing “automobile” for the query “car”
- There are far fewer “topics” than “terms”
  - A topic-based index can be more compact than a term-based index

Example (1)
- Two topics: “Car”, “Movie”
- Four terms: car, automobile, movie, theater
- Topic-term matrix:

  Topic      car    automobile   movie   theater
  “Car”      1      0.9          0       0
  “Movie”    0      0            1       0.8

- Document-topic matrix:

             “Car”   “Movie”
  doc1       0       1
  doc2       1       0
  doc3       0.8     0.2

Example (2)
- But what we actually have is the document-term matrix!
- How are the three matrices related?
- Document-term matrix:

             car    automobile   movie   theater
  doc1       0      0            1       0.8
  doc2       1      0.9          0       0
  doc3       0.8    0.72         0.2     0.16

Linearity Assumption
- A document is generated as a topic-weighted linear combination of the topic-term vectors
  - A simplifying assumption about document generation
- doc1 = 0 (1, 0.9, 0, 0) + 1 (0, 0, 1, 0.8) = (0, 0, 1, 0.8)
- doc3 = 0.8 (1, 0.9, 0, 0) + 0.2 (0, 0, 1, 0.8) = (0.8, 0.72, 0.2, 0.16)
- In matrix form, using the three matrices of Examples (1) and (2): (document-term matrix) = (document-topic matrix) X (topic-term matrix)
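The linearity assumption can be checked numerically. The following is a minimal sketch using NumPy (the variable names are illustrative, not from the slides) that multiplies the document-topic matrix by the topic-term matrix from the example and compares the result with the document-term matrix.

```python
import numpy as np

# Topic-term matrix from the example: rows are the topics "Car" and "Movie",
# columns are the terms car, automobile, movie, theater.
topic_term = np.array([
    [1.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.8],
])

# Document-topic matrix from the example: rows are doc1..doc3,
# columns are the topics "Car" and "Movie".
doc_topic = np.array([
    [0.0, 1.0],
    [1.0, 0.0],
    [0.8, 0.2],
])

# Each document row is a topic-weighted sum of the topic-term rows.
doc_term = doc_topic @ topic_term
print(doc_term)
```

The third row of the product is 0.8 times the "Car" row plus 0.2 times the "Movie" row, which is exactly the doc3 computation shown on the slide.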

Topic-Based Index as Matrix Decomposition

- # topics << # terms, # topics << # docs
- Decompose the (doc-term) matrix into two matrices of rank K (K: # topics)
- Of course, the decomposition will only be approximate for real data
- (doc-term matrix) = (doc-topic matrix) X (topic-term matrix)

Topic-Based Index as Rank-K Approximation
- Q: How do we choose the two decomposed matrices? What is the “best” decomposition?
- Latent Semantic Indexing (LSI)
  - Find the decomposition that is “closest” to the original matrix
- Singular-Value Decomposition (SVD)
  - A decomposition method that yields the best rank-K approximation
- We will spend the next few hours learning about SVD and its meaning
  - A basic understanding of linear algebra is very useful for both IR and data mining

A Brief Review of Linear Algebra
- Vector as a list of numbers
  - Addition
  - Scalar multiplication
  - Dot product
  - Dot product as a projection
  - Q: (1, 0) vs (0, 1). Are they the same vector?
  - A: The choice of basis determines the “meaning” of the numbers
- Matrix
  - Matrix multiplication
  - Four ways to look at matrix multiplication
  - Matrix as vector transformation

Change of Coordinates (1)
- Two coordinate systems
- Q: What are the coordinates of (2, 0) under the second coordinate system?
- Q: What about (1, 1)?

Change of Coordinates (2)
- In general, we get the new coordinates of a vector under the new basis vectors by multiplying the original coordinates by the matrix whose rows are the new basis vectors
- Verify with the previous example
- Q: What does this matrix look like? How can we identify a coordinate-change matrix?

Matrix and Change of Coordinates
- The new basis vectors are orthonormal to each other
- Orthonormal matrix: Q Q^T = I
- An orthonormal matrix can be interpreted as a change-of-coordinates transformation
- The rows of the matrix Q are the new basis vectors
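As a minimal sketch of the coordinate change, assume (hypothetically, since the slide's figure is not reproduced here) that the second coordinate system is the first one rotated by 45 degrees. The rows of `Q` hold the new basis vectors, and `Q @ v` gives the new coordinates of `v`.

```python
import numpy as np

# Assumed new basis: the standard axes rotated by 45 degrees.
# This choice is illustrative only; it is not taken from the slides.
r = 1 / np.sqrt(2)
Q = np.array([
    [ r, r],   # first new basis vector
    [-r, r],   # second new basis vector
])

# The rows are orthonormal, so Q Q^T is the identity matrix.
assert np.allclose(Q @ Q.T, np.eye(2))

def new_coordinates(v):
    """Coordinates of v with respect to the basis stored in the rows of Q."""
    return Q @ np.asarray(v, dtype=float)

print(new_coordinates([2, 0]))
print(new_coordinates([1, 1]))
```

Under this assumed basis, (2, 0) becomes (√2, -√2) and (1, 1) becomes (√2, 0): the vector (1, 1) lies entirely along the first new axis.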

Linear Transformation
- Linear transformation
- Every linear transformation can be represented as a matrix
  - By selecting appropriate basis vectors
- The matrix form of a linear transformation can be obtained simply by learning how the basis vectors transform
  - Verify with a 45-degree rotation
- Q: What transformations are possible as linear transformations?
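The 45-degree rotation check can be sketched as follows. The helper names `rotate_45` and `matrix_of` are hypothetical; the idea is to apply the transformation to each standard basis vector and use the images as the columns of the matrix.

```python
import numpy as np

def rotate_45(v):
    """Rotate a 2-D vector counterclockwise by 45 degrees."""
    theta = np.pi / 4
    x, y = v
    return np.array([
        x * np.cos(theta) - y * np.sin(theta),
        x * np.sin(theta) + y * np.cos(theta),
    ])

def matrix_of(transform, dim):
    """Build the matrix whose columns are the images of the standard basis vectors."""
    return np.column_stack([transform(e) for e in np.eye(dim)])

R = matrix_of(rotate_45, 2)
v = np.array([3.0, 1.0])

print(R)
# Multiplying by R gives the same result as applying the transformation directly.
print(R @ v, rotate_45(v))
```

Because the transformation is linear, knowing where the two basis vectors go is enough to determine where every other vector goes.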

Linear Transformations that We Know
- Rotation
- Stretching
- Anything else?
- Claim: Any linear transformation is a stretching followed by a rotation
  - The “meaning” of singular value decomposition
  - An important result of linear algebra
  - Let us learn why this is the case

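Rotation
- Q: What is the matrix form of a rotation? What property will it have?
  - Remember: the matrix form is obtained from how the basis vectors transform
- Rotation matrix R <=> orthonormal matrix
  - The rotated basis vectors are unit basis vectors as well
- Orthonormal matrix <=> change of coordinates <=> rotation (strictly, an orthonormal matrix may also include a reflection)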

Stretching (1)
- Q: Matrix form of stretching by 3 along the x, y, and z axes in 3D?
- Q: Matrix form of stretching by 3 along the x axis and by 2 along the y axis in 3D?
- Q: Stretching matrix <=> diagonal matrix?

Stretching (2)
- Q: Matrix form of stretching by 3 along one axis and by 2 along a second axis orthogonal to it, when these axes are not the x and y axes?
  - Verify by transforming (1, 1) and (-1, 1)
- The decomposition T = Q T’ Q^T shows the transformation in a different coordinate system
- Under the matrix form, the simplicity of the stretching transformation may not be obvious
- Q: What if we chose the stretching axes as the basis?
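A minimal sketch of this exercise, assuming the stretching axes are the unit vectors along (1, 1) and (-1, 1). This is a guess suggested by the two verification vectors; the axes themselves did not survive in the transcript. The columns of `Q` hold the assumed axes.

```python
import numpy as np

r = 1 / np.sqrt(2)

# Assumed stretching axes (unit vectors), stored as the columns of Q:
# first column (r, r), second column (-r, r).
Q = np.array([
    [r, -r],
    [r,  r],
])

# Stretch by 3 along the first axis and by 2 along the second.
D = np.diag([3.0, 2.0])

# Change to the axis coordinates (Q^T), stretch (D), change back (Q).
T = Q @ D @ Q.T

print(T)
print(T @ np.array([1.0, 1.0]))    # (1, 1) lies on the first axis
print(T @ np.array([-1.0, 1.0]))   # (-1, 1) lies on the second axis
```

With these assumed axes, T has no zero entries, so the stretching is not obvious from the matrix itself, yet (1, 1) is mapped to (3, 3) and (-1, 1) to (-2, 2). In the basis formed by the axes, the same transformation is simply the diagonal matrix `D`.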

Stretching (3)
- Under a good choice of basis vectors, an orthogonal-stretching transformation can always be represented as a diagonal matrix
- Q: How can we tell whether a matrix corresponds to an orthogonal-stretching transformation?

Stretching – Orthogonal Stretching (1)
- Remember that the previous example is an orthogonal stretching, that is, a stretching along a pair of orthogonal axes
- If a transformation is an orthogonal stretching, we should always be able to represent it as Q D Q^T for some Q, where Q shows the stretching axes
- Q: What is the matrix form of the transformation that stretches by 5 along (4/5, 3/5) and by 4 along (-3/5, 4/5)?
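The last question can be worked out numerically. The sketch below assumes the convention that the columns of Q hold the stretching axes, so that T = Q D Q^T.

```python
import numpy as np

# Stretching axes from the question, one per row.
axes = np.array([
    [ 4 / 5, 3 / 5],
    [-3 / 5, 4 / 5],
])

Q = axes.T                 # assumed convention: columns of Q are the axes
D = np.diag([5.0, 4.0])    # stretch factors, in the same order as the axes

T = Q @ D @ Q.T
print(T)

# Each axis is only scaled by its own stretch factor.
print(T @ axes[0], 5 * axes[0])
print(T @ axes[1], 4 * axes[1])
```

The result is the symmetric matrix with rows (4.64, 0.48) and (0.48, 4.36).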

Stretching – Orthogonal Stretching (2)
- Q: Given a matrix, how do we know whether it is an orthogonal stretching?
  - A: When it can be decomposed as T = Q D Q^T
  - A: Spectral theorem
    - Any symmetric matrix T can always be decomposed as T = Q D Q^T
    - Symmetric matrix <=> orthogonal stretching
- Q: How can we decompose T into Q D Q^T?
  - A: If T stretches along X, then TX = λX for some λ
    - X: eigenvector of T
    - λ: eigenvalue of T
  - Solve the equation for λ and X
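In practice the equation TX = λX is solved with a library routine. The sketch below uses `numpy.linalg.eigh`, which is intended for symmetric matrices. The matrix is an illustrative choice: it is the one that stretches by 5 along (4/5, 3/5) and by 4 along (-3/5, 4/5).

```python
import numpy as np

# Illustrative symmetric matrix.
T = np.array([
    [4.64, 0.48],
    [0.48, 4.36],
])

# eigh returns the eigenvalues in ascending order and the matching
# unit eigenvectors as the columns of Q.
eigenvalues, Q = np.linalg.eigh(T)
D = np.diag(eigenvalues)

print(eigenvalues)   # stretching factors
print(Q)             # stretching axes (columns), determined only up to sign
print(Q @ D @ Q.T)   # reconstructs T
```

The eigenvalues come out as 4 and 5, and the columns of `Q` are the two stretching axes, possibly with flipped signs.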

Eigenvalues, Eigenvectors, and Orthogonal Stretching
- Eigenvector: stretching axis
- Eigenvalue: stretching factor
- All eigenvectors are orthogonal <=> orthogonal stretching <=> symmetric matrix (spectral theorem)
- Example
  - Q: What transformation is this?

Singular Value Decomposition (SVD)
- Any linear transformation T can be decomposed into T = R S (R: rotation, S: orthogonal stretching)
  - One of the basic results of linear algebra
- In matrix form, any matrix T can be decomposed as T = Q2 D Q1^T (Q1, Q2: orthonormal matrices, D: diagonal matrix)
  - Diagonal entries in D: singular values
- Example
  - Q: What transformation is this?

Singular Value Decomposition (2)
- Q: For an (n x m) matrix T, what will be the dimensions of the three matrices after SVD?
- Q: What is the meaning of a non-square diagonal matrix?
  - The diagonal matrix is also responsible for projection (or dimension padding)
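The dimension question can be explored with `numpy.linalg.svd`. This is a sketch on an arbitrary 2 x 3 matrix chosen for illustration. NumPy returns the singular values as a vector, so the non-square diagonal matrix is rebuilt by hand.

```python
import numpy as np

# Arbitrary illustrative matrix with n = 2 rows and m = 3 columns.
T = np.array([
    [ 3.0, 1.0, 1.0],
    [-1.0, 3.0, 1.0],
])
n, m = T.shape

# full_matrices=True returns square orthonormal factors: U is n x n, Vt is m x m.
U, s, Vt = np.linalg.svd(T, full_matrices=True)

# Non-square "diagonal" matrix: the singular values on the diagonal, zeros elsewhere.
D = np.zeros((n, m))
D[:len(s), :len(s)] = np.diag(s)

print(U.shape, D.shape, Vt.shape)
print(np.allclose(U @ D @ Vt, T))
```

For an (n x m) matrix the three factors are (n x n), (n x m), and (m x m). Here the 2 x 3 diagonal matrix drops the third coordinate, which is the projection role mentioned above.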

Singular Values vs Eigenvalues
- Q: What is this transformation?
- A: Q1: eigenvectors of T^T T; D: square roots of the eigenvalues of T^T T
- Similarly, Q2: eigenvectors of T T^T; D: square roots of the eigenvalues of T T^T
- SVD can be done by computing the eigenvalues and eigenvectors of T^T T and T T^T
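The relation between singular values and eigenvalues can be checked numerically. The sketch below uses an arbitrary illustrative matrix and compares the singular values of T with the square roots of the eigenvalues of T T^T and T^T T.

```python
import numpy as np

# Arbitrary illustrative 2 x 3 matrix.
T = np.array([
    [ 3.0, 1.0, 1.0],
    [-1.0, 3.0, 1.0],
])

# Singular values, in descending order.
s = np.linalg.svd(T, compute_uv=False)

# eigvalsh returns eigenvalues of a symmetric matrix in ascending order;
# reverse them to match the order of the singular values.
eig_TTt = np.linalg.eigvalsh(T @ T.T)[::-1]   # 2 eigenvalues
eig_TtT = np.linalg.eigvalsh(T.T @ T)[::-1]   # 3 eigenvalues, the last one is ~0

# Clip tiny negative round-off before taking square roots.
roots_TTt = np.sqrt(np.clip(eig_TTt, 0.0, None))
roots_TtT = np.sqrt(np.clip(eig_TtT, 0.0, None))

print(s)
print(roots_TTt)
print(roots_TtT)
```

Both products give the same nonzero eigenvalues. The larger product, T^T T, has an extra eigenvalue of zero because T has rank 2.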

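SVD as Matrix Approximation
- Q: If we want to reduce the rank of T to 2, what will be a good choice?
- The best rank-k approximation of any matrix T (in the least-squares, i.e. Frobenius-norm, sense) is obtained by keeping the k largest singular values of its SVD and setting the rest to zero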
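A minimal sketch of the truncation, using a hypothetical helper `rank_k_approximation` and a random matrix as stand-in data. It also checks the known property that the Frobenius-norm error equals the square root of the sum of the squared discarded singular values.

```python
import numpy as np

def rank_k_approximation(T, k):
    """Keep the k largest singular values of T and drop the rest."""
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    # Scale the first k columns of U by the singular values, then
    # multiply by the first k rows of Vt.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Random stand-in data; the seed is fixed only for repeatability.
rng = np.random.default_rng(0)
T = rng.random((6, 5))

T2 = rank_k_approximation(T, 2)
s = np.linalg.svd(T, compute_uv=False)

error = np.linalg.norm(T - T2)                # Frobenius norm of the residual
discarded = np.sqrt(np.sum(s[2:] ** 2))       # norm of the dropped singular values

print(np.linalg.matrix_rank(T2), error, discarded)
```

The approximation has rank 2, and the two error figures agree, so the size of the discarded singular values tells us directly how much is lost.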

SVD Approximation Example: 1000 x 1000 matrix with entries in (0…255)
[Figure: excerpt of the matrix entries, with neighboring values between roughly 53 and 62]

Image of original matrix 1000x1000

SVD. Rank 1 approximation

Original vs Rank 100 approximation
- Q: How many numbers do we keep for each?
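One way to answer the counting question is sketched below. It assumes the rank-k approximation is stored as k columns of the first SVD matrix, k rows of the third, and the k singular values; `numbers_kept` is a hypothetical helper name.

```python
def numbers_kept(n_rows, n_cols, k):
    """Numbers stored by a rank-k SVD approximation of an n_rows x n_cols matrix:
    k columns of length n_rows, k rows of length n_cols, and k singular values."""
    return k * (n_rows + n_cols + 1)

original = 1000 * 1000
rank_100 = numbers_kept(1000, 1000, 100)

print(original, rank_100, rank_100 / original)
```

Under this storage assumption the original needs 1,000,000 numbers and the rank-100 approximation needs 200,100, about one fifth as many.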

Back to LSI
- LSI: decompose the (doc-term) matrix into two matrices of rank K
  - (doc-term matrix) = (doc-topic matrix) X (topic-term matrix)
- Our goal is to find the “best” rank-K approximation
- Apply SVD and keep the top K singular values, meaning that we keep the first K columns of the first matrix and the first K rows of the third matrix after SVD
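The procedure can be tried on the small car/movie example. This is a sketch under stated assumptions: the singular values are folded into the document-topic factor, and a query is mapped into topic space by multiplying it with the topic-term factor. Both are common conventions, not something the slides specify.

```python
import numpy as np

# Document-term matrix from the example; columns: car, automobile, movie, theater.
A = np.array([
    [0.0, 0.0,  1.0, 0.8],
    [1.0, 0.9,  0.0, 0.0],
    [0.8, 0.72, 0.2, 0.16],
])
K = 2  # number of topics

U, s, Vt = np.linalg.svd(A, full_matrices=False)
doc_topic = U[:, :K] * s[:K]    # first K columns, scaled by the singular values
topic_term = Vt[:K, :]          # first K rows
A_k = doc_topic @ topic_term    # rank-K approximation of A

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.array([1.0, 0.0, 0.0, 0.0])   # the query "car"
other = np.array([0.0, 1.0, 0.0, 0.0])   # hypothetical document with only "automobile"

cos_terms = cosine(query, other)                             # term space
cos_topics = cosine(topic_term @ query, topic_term @ other)  # topic space

print(np.allclose(A_k, A))
print(cos_terms, cos_topics)
```

Because this toy matrix has exactly rank 2, the rank-2 approximation reproduces it. In term space "car" and "automobile" have zero similarity, while in topic space they point in the same direction. The factors found by SVD need not match the hand-written matrices of the example and may contain negative entries.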

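LSI and SVD
- LSI: (doc-term matrix) = (doc-topic matrix) X (topic-term matrix)
- SVD: (doc-term matrix) = product of the three SVD matrices, truncated to the top K singular values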

LSI and SVD
- LSI summary
  - Formulate the topic-based indexing problem as a rank-K matrix approximation problem
  - Use SVD to find the best rank-K approximation
- When applied to real data, a 10-20% improvement has been reported
  - Using LSI was the road to fame for Excite in its early days

Limitations of LSI
- Q: Any problems with LSI?
- Problems with LSI
  - Scalability
    - SVD is known to be difficult to perform on large data
  - Interpretability
    - The extracted document-topic matrix is impossible to interpret
    - It is difficult to understand why we get good or bad results from LSI for some queries
- Q: Is there any way to develop a more interpretable topic-based index?
  - Topic for the next lecture

Summary
- Topic-based indexing
  - Synonym and polysemy problem
  - Index documents by topic, not by terms
- Latent Semantic Indexing (LSI)
  - A document is a linear combination of its topic vector and the topic-term vectors
  - Formulate the problem as a rank-K matrix approximation problem
  - Use SVD to find the best approximation
- Basic linear algebra
  - Linear transformation, matrix, stretching and rotation
  - Orthogonal stretching, diagonal matrix, symmetric matrix, eigenvalues and eigenvectors
  - Rotation, change of coordinates, and orthonormal matrix
  - SVD and its implication as a linear transformation
