Intelligent Database Systems Lab. Improvement of HITS-based Algorithms on Web Documents. Authors: Longzhuang Li, Yi Shang, Wei Zhang (ACM, 2002). Presenter: CHANG, SHIH-JIE
Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments
Motivation: Content analysis usually takes a long time, and it is almost impossible to get users' feedback or visiting times for most Web documents.
Objectives: Present two ways to improve the precision of HITS-based algorithms on Web documents.
Methodology – Limitations of the HITS algorithm: authority and hub scores; the new weighted HITS-based algorithm.
Methodology – Limitations of the HITS algorithm (continued)
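For reference, the authority and hub scores mentioned on these slides can be illustrated with a minimal sketch of the classic, unweighted HITS iteration. This is not the weighted variant the paper proposes; the function name `hits`, the edge-list input, and the fixed iteration count are assumptions made for illustration.

```python
import math


def hits(edges, iterations=50):
    """Classic (unweighted) HITS over a list of (source, target) links.

    Illustrative sketch only: the weighted HITS-based algorithm from the
    paper is NOT implemented here.
    """
    nodes = sorted({node for edge in edges for node in edge})
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]

        # Hub update: sum the new authority scores of the pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]

        # Normalize both vectors to unit Euclidean length.
        a_norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {n: v / a_norm for n, v in new_auth.items()}
        hub = {n: v / h_norm for n, v in new_hub.items()}

    return auth, hub


# Hypothetical toy graph: pages "a" and "b" link to "c"; "a" also links to "d".
authority, hub_score = hits([("a", "c"), ("b", "c"), ("a", "d")])
```

In this toy graph, "c" receives the most in-links from hubs and ends up with the highest authority score, while "a" is the strongest hub.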
Methodology – Vector Space Model (VSM): a query q and a document Xi are each represented as a vector of term weights; their similarity is the inner product of the two vectors.
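The inner-product similarity can be sketched as follows. The term weights here are raw term counts, which is an assumption made to keep the example small; the weighting formula on the original slide is not recoverable and may differ. The helper names `term_vector`, `inner_product`, and `rank` are hypothetical.

```python
from collections import Counter


def term_vector(text):
    """Represent text as a sparse vector of raw term counts (assumed weighting)."""
    return Counter(text.lower().split())


def inner_product(query_vec, doc_vec):
    """Sum of products of weights for terms shared by the query and the document."""
    return sum(weight * doc_vec.get(term, 0) for term, weight in query_vec.items())


def rank(query, docs):
    """Return (score, doc_index) pairs, highest score first."""
    q = term_vector(query)
    scored = [(inner_product(q, term_vector(d)), i) for i, d in enumerate(docs)]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical documents, for illustration only.
ranking = rank("web search", ["web search ranking", "cooking recipes", "web web search"])
```

With raw counts, longer documents that repeat query terms score higher, which is one reason practical systems normalize or reweight the vectors.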
Methodology – Vector Space Model (VSM): coverage of Google
Methodology – Okapi Similarity Measurement (Okapi)
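As a rough illustration of Okapi-style scoring, the sketch below uses the widely cited BM25 form with a smoothed, non-negative IDF. The exact formula on the original slide is not recoverable, so the parameter values (`k1`, `b`), the IDF variant, and the function name `bm25_scores` are all assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with a common BM25 form.

    Illustrative sketch: k1, b, and the smoothed IDF are assumed defaults,
    not values taken from the presentation.
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        # Longer-than-average documents are penalized through this factor.
        length_norm = k1 * (1 - b + b * len(doc) / avg_len)
        score = 0.0
        for term in set(query.lower().split()):
            if tf[term] == 0:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Hypothetical documents, for illustration only.
scores = bm25_scores(
    "web pages",
    ["hits ranks web pages", "cooking pasta at home", "web pages link to web pages"],
)
```

The term-frequency factor saturates as a term repeats, so a document cannot dominate the ranking simply by repeating a query term many times.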
Methodology – Cover Density Ranking (CDR): In CDR, the results of phrase queries are ranked in two steps. The ranking relies on the score of the cover set.
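The score of a cover set can be sketched as below, assuming the commonly cited cover density form in which each cover (p, q) contributes 1 if its length is at most a threshold and threshold/length otherwise. The threshold value of 16, the two-level sort (number of distinct query terms matched, then cover score), and the names `cover_score` and `cdr_rank` are assumptions; the slide's own formula is not recoverable.

```python
def cover_score(covers, lam=16):
    """Sum the contributions of covers given as inclusive (p, q) token positions.

    Assumed form: a cover no longer than lam tokens contributes 1.0;
    a longer cover contributes lam / length.
    """
    total = 0.0
    for p, q in covers:
        length = q - p + 1
        total += lam / length if length > lam else 1.0
    return total


def cdr_rank(docs, lam=16):
    """Rank (doc_id, distinct_terms_matched, covers) tuples in two steps.

    Step 1 (assumed): more distinct query terms matched ranks higher.
    Step 2 (assumed): ties are broken by the cover set score.
    """
    return sorted(docs, key=lambda d: (-d[1], -cover_score(d[2], lam)))


# Hypothetical documents: ids, matched-term counts, and cover positions.
ranked = cdr_rank([
    ("doc1", 2, [(0, 31)]),          # one long cover, score 0.5
    ("doc2", 2, [(4, 5), (20, 22)]), # two short covers, score 2.0
    ("doc3", 1, [(7, 7)]),           # matches fewer query terms
])
```

Short covers indicate that the query terms appear close together, so documents with many tight covers score higher than documents where the terms are spread far apart.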
Methodology – Three-Level Scoring Method (TLS): The relevance of a Web page to a query is computed in two steps.
Experiments
Conclusions: The weighted HITS-based method performs better than Bharat's improved HITS algorithm.
Comments: Advantages: effective. Applications: information retrieval, Web page ranking.