CSE 450 – Web Mining Seminar, Professor Brian D. Davison, Fall 2005
  A Presentation on "When Experts Agree: Using Non-Affiliated Experts to Rank Popular Topics"
  K. Bharat & G. A. Mihaila, WWW10 Conference, May 2001, Hong Kong
  by Osama Ahmed Khan, 10/06/2005
Problem: a query on a popular topic matches many pages, and content analysis alone cannot rank them reliably
Solution: return the most authoritative pages on the topic
Technical Terms: Expert, Recommendation, Non-affiliation
Hilltop Algorithm
  1. Expert Lookup: Detecting Host Affiliation, Expert Selection, Expert Indexing
  2. Target Ranking: Computing Expert Score, Computing Target Score
Detecting Host Affiliation
  Conditions (either one makes two hosts affiliated):
    Same first 3 octets of IP address
    Same rightmost non-generic token of hostname
  Affiliated hosts are grouped with the Union-Find algorithm
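The two affiliation conditions and the Union-Find grouping can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `GENERIC_TOKENS` set, the function names, and the input shape (a dict of hostname to IPv4 string) are assumptions made for the example.

```python
# Hypothetical list of "generic" hostname tokens; the real list is not given in the slides.
GENERIC_TOKENS = {"com", "org", "net", "edu", "gov", "co", "uk", "mx"}

def ip_prefix(ip):
    """Return the first 3 octets of a dotted IPv4 address."""
    return ".".join(ip.split(".")[:3])

def rightmost_non_generic_token(hostname):
    """Scan hostname tokens right to left and return the first non-generic one."""
    for token in reversed(hostname.lower().split(".")):
        if token not in GENERIC_TOKENS:
            return token
    return hostname.lower()

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps the trees shallow.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def affiliation_groups(hosts):
    """hosts: dict hostname -> IPv4 string. Returns a UnionFind over the hostnames."""
    uf = UnionFind()
    first_seen = {}  # (kind, value) -> first host seen with that IP prefix or token
    for host, ip in hosts.items():
        uf.find(host)
        keys = (("ip", ip_prefix(ip)), ("tok", rightmost_non_generic_token(host)))
        for key in keys:
            if key in first_seen:
                uf.union(first_seen[key], host)
            else:
                first_seen[key] = host
    return uf
```

Because unions are transitive, two hosts can end up in the same group through a third host even if they share neither an IP prefix nor a hostname token directly.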
Expert Selection
  Retrieve all webpages with: out-degree > threshold k (e.g. k = 5)
  An expert will have: URLs pointing to k distinct non-affiliated hosts
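The selection test can be sketched as a single predicate. The function name `is_expert` and the `group_of` callback (which maps a hostname to its affiliation group) are hypothetical; counting distinct URLs as the out-degree is also an assumption of this sketch.

```python
from urllib.parse import urlparse

def is_expert(out_urls, group_of, k=5):
    """Return True if a page qualifies as an expert.

    out_urls: absolute URLs linked from the page.
    group_of: hypothetical callable mapping a hostname to its affiliation group.
    """
    distinct_urls = set(out_urls)
    if len(distinct_urls) <= k:  # out-degree must exceed the threshold
        return False
    groups = set()
    for url in distinct_urls:
        host = urlparse(url).hostname
        if host is not None:  # skip URLs without a parseable host
            groups.add(group_of(host))
    return len(groups) >= k  # at least k distinct non-affiliated hosts
```

In practice `group_of` would return the Union-Find representative of the host's affiliation group.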
Expert Indexing
  Inverted index mapping keywords to experts
  Only key phrases are indexed
  Entries are stored as match positions
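A rough sketch of such an index is shown below. The tuple layout of a match position (expert id, phrase kind, phrase number, word offset) and the phrase kinds "title", "heading", and "anchor" are assumptions chosen for illustration, not the paper's storage format.

```python
from collections import defaultdict

def build_expert_index(experts):
    """Build an inverted index over key phrases.

    experts: dict expert_id -> list of (phrase_kind, phrase_text).
    Returns dict keyword -> list of match positions
    (expert_id, phrase_kind, phrase_number, word_offset).
    """
    index = defaultdict(list)
    for expert_id, phrases in experts.items():
        for phrase_no, (kind, text) in enumerate(phrases):
            for offset, word in enumerate(text.lower().split()):
                index[word].append((expert_id, kind, phrase_no, offset))
    return dict(index)
```

Looking up each query keyword in the returned dict yields the candidate experts and the key phrases in which the keyword occurs.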
Computing Expert Score
  Condition: at least 1 URL with all query keywords
  Expert Score: (S_0, S_1, S_2), where k = number of query terms
  S_i = SUM {key phrases p with k-i query terms} LevelScore(p) * FullnessFactor(p,q)
  Expert_Score = 2^32 * S_0 + 2^16 * S_1 + S_2
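The expert score can be sketched as below. Several details are assumptions not stated on the slide: the `LEVEL_SCORE` values (16 for title, 6 for heading, 1 for anchor text) and the `fullness_factor` rule (no penalty for up to 2 surplus terms) are as recalled from the original paper, and the sketch scores all key phrases of a page together rather than per qualified URL.

```python
# Assumed level scores per key-phrase kind (recalled from the paper, not given on the slide).
LEVEL_SCORE = {"title": 16, "heading": 6, "anchor": 1}

def fullness_factor(phrase_terms, query_terms):
    """Penalize phrases with more than 2 terms that are not in the query (assumed rule)."""
    plen = len(phrase_terms)
    surplus = sum(1 for t in phrase_terms if t not in query_terms)
    return 1.0 if surplus <= 2 else 1.0 - (surplus - 2) / plen

def expert_score(key_phrases, query):
    """key_phrases: list of (kind, text); query: whitespace-separated keywords."""
    q = set(query.lower().split())
    k = len(q)
    s = [0.0, 0.0, 0.0]  # S_0, S_1, S_2
    for kind, text in key_phrases:
        terms = text.lower().split()
        matched = len(q & set(terms))
        i = k - matched  # number of query terms the phrase is missing
        if matched > 0 and i <= 2:
            s[i] += LEVEL_SCORE[kind] * fullness_factor(terms, q)
    # Weighted sum so that S_0 dominates S_1, which dominates S_2.
    return 2**32 * s[0] + 2**16 * s[1] + s[2]
```

The large weights mean that an expert matching all query keywords in some key phrase outranks any expert that matches only a subset, as long as the lower components stay below 2^16.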
Computing Target Score
  Condition: at least 2 non-affiliated experts point to the target
  Target Score:
  Edge_Score(E,T) = Expert_Score(E) * SUM {query keywords w} occ(w,T)
  Target_Score(T) = SUM {experts E} Edge_Score(E,T)
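A simplified sketch of the target scoring step follows. It assumes that occ(w,T) counts the key phrases qualifying the edge that contain w, that an edge scores 0 when any query keyword is missing, and that only the best edge per affiliation group is kept for a given target. These details and all function names are assumptions of this example, not taken from the slide.

```python
from collections import defaultdict

def edge_score(expert_score, qualifying_phrases, query_terms):
    """Score one expert-to-target edge from the key phrases that qualify it."""
    occ = [
        sum(1 for p in qualifying_phrases if w in p.lower().split())
        for w in query_terms
    ]
    if any(count == 0 for count in occ):
        return 0.0  # assumed: every query keyword must occur at least once
    return float(expert_score * sum(occ))

def target_scores(edges, min_experts=2):
    """edges: iterable of (expert_group, target, score).

    Keeps the best edge per affiliation group and sums across groups for
    targets reached by at least min_experts non-affiliated experts.
    """
    best = defaultdict(dict)  # target -> {expert_group: best edge score}
    for group, target, score in edges:
        if score > 0:
            best[target][group] = max(score, best[target].get(group, 0.0))
    return {
        target: sum(groups.values())
        for target, groups in best.items()
        if len(groups) >= min_experts
    }
```

Targets are then ranked by descending score; a target endorsed by a single affiliation group never appears in the result.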
Evaluation
  1. Locating Specific Popular Targets
Evaluation (Contd.)
  2. Gathering Relevant Pages
Conclusion
  Characteristics: Popular Queries, Expert Subset
  Hilltop vs. PageRank
  Topic Distillation
Thank You