A Vector Space Model for Automatic Indexing

1 A Vector Space Model for Automatic Indexing
G. Salton, A. Wong and C. S. Yang
Enhanced Vector Space Models for Content-based Recommender Systems
Cataldo Musto
Presenter: Sawood Alam

2 A Vector Space Model for Automatic Indexing
G. Salton, A. Wong and C. S. Yang
Cornell University

3 Introduction
In document retrieval, the best indexing space is one where each entity lies far away from the others
The density of the object space becomes a measure of the quality of the indexing system
Retrieval performance correlates inversely with space density
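One way to make "space density" concrete is the average pairwise similarity between document vectors: the higher the average, the more tightly packed the space. The sketch below assumes cosine similarity as the similarity function; the helper names `cosine` and `space_density` and the toy collections are illustrative and not taken from the slides.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an empty vector as similar to nothing
    return dot / (norm_a * norm_b)


def space_density(docs):
    """Average similarity over all distinct document pairs (assumed density measure)."""
    pairs = list(combinations(docs, 2))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Toy collections: one where documents share no terms, one where they nearly coincide.
spread = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
clumped = [[1, 1, 0], [1, 1, 0.1], [1, 0.9, 0]]

print(space_density(spread))
print(space_density(clumped))
```

Under the claim on this slide, the `spread` collection (low density) is the kind of space in which retrieval should work better than in the `clumped` one.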

4 Document Space
Di = (di1, di2, di3, …, dit), where dij is the weight of the j-th term in document Di and t is the number of index terms
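As a small illustration of this representation, the sketch below turns raw text into document vectors with one component per index term. It assumes whitespace tokenization and raw term frequency as the weight, which is only one possible weighting; the function names and sample texts are hypothetical.

```python
from collections import Counter


def build_vocabulary(texts):
    """Sorted list of distinct terms; position j in this list is term j."""
    return sorted({term for text in texts for term in text.lower().split()})


def to_vector(text, vocabulary):
    """Document vector whose j-th component is the frequency of term j (assumed weighting)."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


texts = ["vector space model", "vector space retrieval retrieval"]
vocabulary = build_vocabulary(texts)
vectors = [to_vector(text, vocabulary) for text in texts]

print(vocabulary)  # ['model', 'retrieval', 'space', 'vector']
print(vectors)     # [[1, 0, 1, 1], [0, 2, 1, 1]]
```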

5 Document Space (cont.)

6 Document Space (cont.)

7 Indexing Performance vs. Space Density

8 Cluster Density vs. Indexing Performance

9 Discrimination Value Model

10 Discrimination Value Model (cont.)

11 Discrimination Value Model Summary
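In the discrimination value model, a term is rated by how much the space density changes when that term is removed from every document: a good discriminator spreads documents apart, so removing it makes the space denser. A minimal sketch follows, assuming cosine similarity and a density defined as the average similarity between each document and the collection centroid; the function names and the toy collection are illustrative only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(docs):
    """Component-wise mean of all document vectors."""
    return [sum(column) / len(docs) for column in zip(*docs)]


def density(docs):
    """Average similarity of each document to the centroid (assumed density measure)."""
    center = centroid(docs)
    return sum(cosine(center, doc) for doc in docs) / len(docs)


def discrimination_value(docs, k):
    """Density without term k minus density with it; positive means a good discriminator."""
    without_k = [[w for j, w in enumerate(doc) if j != k] for doc in docs]
    return density(without_k) - density(docs)


# Term 0 occurs in every document; terms 1 and 2 each occur in only one.
docs = [[1, 3, 0], [1, 0, 3]]
for k in range(3):
    print(k, round(discrimination_value(docs, k), 4))
```

In this toy collection the term shared by both documents gets a negative value, while the two terms that separate the documents get positive values.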

12 Average Recall vs. Precision

13 Summary Recall vs. Precision

14 Enhanced Vector Space Models for Content-based Recommender Systems
Cataldo Musto
Dept. of Computer Science
University of Bari, Italy

15 Introduction
The use of Vector Space Models (VSM) in Information Retrieval is an established practice
Investigate the impact of vector space models in Information Filtering
Recommender systems

16 Problems of VSM
High dimensionality
Becoming more serious due to emerging social apps and micro-blogging, which generate lots of web content and new vocabulary
Inability to manage document semantics
Order of term occurrence in the document is not taken into account

17 Components
Context vector for each term
Values in {-1, 0, 1}
Vector Space representation of a term (t)
Vector Space representation of a document (d)
Vector Space representation of a user profile (pu)
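The four components above can be sketched roughly as follows. The slide only names the components, so the construction rules here are assumptions borrowed from the usual Random Indexing recipe: a term vector is the sum of the context vectors of the terms it co-occurs with in a document, a document vector is the sum of its term vectors, and a user profile is the sum of the vectors of documents the user liked. The dimension, the number of non-zero entries, and all function names are hypothetical.

```python
import random


def context_vector(dim, nonzero, rng):
    """Sparse random vector with values in {-1, 0, 1}."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1, 1))
    return vec


def add(a, b):
    """Component-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]


def build_term_vectors(docs, dim=50, nonzero=4, seed=0):
    """Term vector = sum of context vectors of co-occurring terms (assumed rule)."""
    rng = random.Random(seed)
    vocab = sorted({term for doc in docs for term in doc})
    ctx = {term: context_vector(dim, nonzero, rng) for term in vocab}
    term_vecs = {term: [0] * dim for term in vocab}
    for doc in docs:
        for term in doc:
            for other in doc:
                if other != term:
                    term_vecs[term] = add(term_vecs[term], ctx[other])
    return ctx, term_vecs


def document_vector(doc, term_vecs):
    """Document vector d = sum of the vectors of its known terms."""
    dim = len(next(iter(term_vecs.values())))
    vec = [0] * dim
    for term in doc:
        if term in term_vecs:
            vec = add(vec, term_vecs[term])
    return vec


def profile_vector(liked_docs, term_vecs):
    """User profile pu = sum of the vectors of liked documents (assumed rule)."""
    dim = len(next(iter(term_vecs.values())))
    vec = [0] * dim
    for doc in liked_docs:
        vec = add(vec, document_vector(doc, term_vecs))
    return vec


docs = [["space", "war", "alien"], ["alien", "ship", "crew"], ["love", "paris"]]
ctx, term_vecs = build_term_vectors(docs)
profile = profile_vector(docs[:2], term_vecs)
print(len(profile))
```

Because every vector has the same fixed dimension regardless of vocabulary size, this kind of construction addresses the high-dimensionality problem listed on the previous slide.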

18 Indexing Technique
Random Indexing-based model
Weighted Random Indexing-based model
Semantic Vector-based model
Weighted Semantic Vector-based model

19 Experimental Evaluation

20 Conclusions
The first prototype, with a naive weighting scheme, is comparable to other content-based filtering techniques such as a Bayesian classifier
More complex weighting schemes should perform better
User profiles based on Linked Data, rather than keyword-based user profiles, may be studied

