Implementation of Vector Space Model March 27, 2006
How Can TA Be Used in the Vector Space Model?
Consider a query with the keywords microsoft and corporation: q = (microsoft, corporation)
Create a table for each keyword. These tables are called “Inverted Lists”, e.g.,
List microsoft: docid | tf_microsoft * idf_microsoft
List corporation: docid | tf_corporation * idf_corporation
Space occupied = O(# of non-zero entries in the matrix), so it is not cheap in terms of space
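The per-keyword tables described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy corpus, the function name `build_weighted_lists`, and the idf definition log(N / df) are not from the slides, which do not say how tf and idf are computed.

```python
import math
from collections import Counter, defaultdict

def build_weighted_lists(docs):
    """Map each term to a list of (docid, tf * idf) pairs, ordered by docid.

    docs: dict of docid -> list of tokens (hypothetical toy input).
    """
    n_docs = len(docs)
    # Term frequency per document.
    tf = {docid: Counter(tokens) for docid, tokens in docs.items()}
    # Document frequency: number of documents containing each term.
    df = Counter()
    for counts in tf.values():
        df.update(counts.keys())
    lists = defaultdict(list)
    for docid in sorted(tf):
        for term, count in tf[docid].items():
            idf = math.log(n_docs / df[term])  # assumed idf definition
            lists[term].append((docid, count * idf))
    return dict(lists)

docs = {
    1: ["microsoft", "corporation", "microsoft"],
    2: ["corporation", "tax"],
    3: ["apple", "orchard"],
}
lists = build_weighted_lists(docs)

# One entry is stored per (term, document) pair with a non-zero count,
# which is the space cost stated on the slide.
total_entries = sum(len(postings) for postings in lists.values())
```

Answering q = (microsoft, corporation) then only requires reading `lists["microsoft"]` and `lists["corporation"]`, not every document.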
How Can TA Be Used in the Vector Space Model?
Inverted List
In the original database, words are listed for a given document
In an inverted list, documents are listed for a given word; that is why it is called an Inverted List
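A minimal sketch of the inversion itself, assuming the original database is available as a mapping from document id to its words. The names `forward_index` and `invert` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict

def invert(forward_index):
    """Turn docid -> words into word -> docids (docids in ascending order)."""
    inverted = defaultdict(list)
    for docid in sorted(forward_index):
        # set() so a word repeated within a document is recorded once.
        for word in sorted(set(forward_index[docid])):
            inverted[word].append(docid)
    return dict(inverted)

# Original orientation: words listed for each document.
forward_index = {
    1: ["microsoft", "corporation"],
    2: ["corporation", "tax"],
}

# Inverted orientation: documents listed for each word.
inverted_index = invert(forward_index)
```

Here `inverted_index["corporation"]` gives the documents containing that word directly, without scanning the documents.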
How Can TA Be Used in the Vector Space Model?
Inverted List
Union of List microsoft and List corporation
  Keep the lists sorted by document id
Intersection of List microsoft and List corporation
  Arrange keywords from the most specific to the least specific
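The two list operations can be sketched as linear merges over lists sorted by document id. This is an illustration, not the lecture's implementation: the function names and sample lists are hypothetical, and treating the shortest list as the most specific keyword is an assumption made for this example.

```python
def union_sorted(a, b):
    """Merge two ascending docid lists, keeping each docid once."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

def intersect_sorted(a, b):
    """Return the docids present in both ascending lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def intersect_all(lists):
    """Intersect several lists, starting from the shortest one.

    Assumption: a shorter list means a more specific keyword, so the
    intermediate result stays small from the first step.
    """
    if not lists:
        return []
    ordered = sorted(lists, key=len)
    result = ordered[0]
    for postings in ordered[1:]:
        result = intersect_sorted(result, postings)
        if not result:
            break  # nothing can match any more
    return result

microsoft = [1, 4, 7]
corporation = [1, 2, 4, 5, 9]

either = union_sorted(microsoft, corporation)
both = intersect_all([corporation, microsoft])
```

Because both inputs are sorted by document id, each merge reads every list once from front to back, which is why the ordering matters.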