Published by August Hill. Modified over 9 years ago.
1 The EigenRumor Algorithm for Ranking Blogs
  Advisor: Hsin-Hsi Chen
  Speaker: Sheng-Chung Yen (嚴聖筌)
2 Outline
  Motivation
  Assumed Blog Structure
  Classification of Blog Ranking
  The EigenRumor Algorithm
    Community model
    Scores
    Algorithm
  Mapping to Blog community
  Experiments
  Related Works
  Conclusion
  Future Work
  References
3 Motivation
  Approaches to page ranking
    PageRank [2]
    HITS (Hyperlink-Induced Topic Search) [3]
  Issues
    The number of links to a blog entry is generally very small.
    A new entry needs time to accumulate in-links before it can earn a high PageRank score.
4 Assumed Blog Structure
  A blog consists of a top page and a set of blog entries.
  A blog is generally updated and maintained by a single blogger.
  The top page of the blog links to each blog entry, and each blog entry has a permanent URI.
  Blog entries are added frequently, and update notifications are optionally sent to a ping server.
  A mechanism to construct a trackback [3] is provided.
5 Classification of Blog Ranking
  Subject of ranking
  Space of ranking
  Temporal space of ranking
  Semantics of ranking
  Source of evaluations collected
6 The EigenRumor Algorithm – Community model (1/2)
7 The EigenRumor Algorithm – Community model (2/2)
  When agent i provides (posts) object j, a provisioning link is established from i to j.
  When agent i evaluates the usefulness of an existing object j with the scoring value e_ij, an evaluation link is established from i to j.
  The provisioning matrix P = [p_ij] represents all provisioning links in the universe.
  The evaluation matrix E = [e_ij] represents all evaluation links in the universe.
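As a minimal sketch of the community model, the two matrices can be built from lists of links. The function name `build_matrices`, the integer indexing of agents and objects, and the choice p_ij = 1 for every provisioning link are assumptions made for illustration; the slide itself only defines P = [p_ij] and E = [e_ij].

```python
def build_matrices(num_agents, num_objects, provisions, evaluations):
    """Return (P, E) as dense lists of lists, one row per agent.

    provisions:  iterable of (agent, object) pairs.
    evaluations: iterable of (agent, object, score) triples.
    """
    P = [[0.0] * num_objects for _ in range(num_agents)]
    E = [[0.0] * num_objects for _ in range(num_agents)]
    for i, j in provisions:
        P[i][j] = 1.0          # assumption: a provisioning link has weight 1
    for i, j, e_ij in evaluations:
        E[i][j] = e_ij         # scoring value given by agent i to object j
    return P, E


# Hypothetical toy community: 2 agents, 3 objects.
P, E = build_matrices(
    2, 3,
    provisions=[(0, 0), (0, 1), (1, 2)],
    evaluations=[(1, 0, 1.0), (0, 2, 0.5)],
)
```

Here agent 0 posts objects 0 and 1, agent 1 posts object 2, and each agent evaluates one object posted by the other.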
8 The EigenRumor Algorithm – Scores
  Authority score (agent property)
    Indicates to what level agent i provided objects in the past that followed the community direction.
  Hub score (agent property)
    Indicates to what level agent i submitted comments (evaluations) on other past objects that followed the community direction.
  Reputation score (object property)
    Indicates the level of support object j received from the agents.
9 The EigenRumor Algorithm – Algorithm (1/4)
  Assumptions
    Objects that are provided by a “good” authority will follow the direction of the community.
    Objects that are supported by a “good” hub will follow the direction of the community.
    Agents that provide objects that follow the community direction are “good” authorities of the community.
    Agents that evaluate objects that follow the community direction are “good” hubs of the community.
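The four assumptions suggest a mutually reinforcing computation in the style of HITS: reputation flows from authorities through P and from hubs through E, and agents' scores flow back from the reputation of the objects they provided or evaluated. The sketch below is one way to iterate that idea. It is not taken from the slides: the mixing weight `alpha`, the unit-length normalization at each step, the fixed iteration count, and all function names are assumptions.

```python
import math


def _normalize(v):
    """Scale v to unit Euclidean length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def _mat_vec(M, v):
    """Compute M v (one value per agent)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def _mat_t_vec(M, v):
    """Compute M^T v (one value per object)."""
    return [sum(M[i][j] * v[i] for i in range(len(M))) for j in range(len(M[0]))]


def eigenrumor_sketch(P, E, alpha=0.5, iterations=50):
    """Return (authority, hub, reputation) after a fixed number of iterations."""
    num_agents, num_objects = len(P), len(P[0])
    a = _normalize([1.0] * num_agents)
    h = _normalize([1.0] * num_agents)
    r = [0.0] * num_objects
    for _ in range(iterations):
        from_authority = _mat_t_vec(P, a)   # objects inherit from their providers
        from_hub = _mat_t_vec(E, h)         # objects inherit from their evaluators
        r = _normalize([alpha * x + (1 - alpha) * y
                        for x, y in zip(from_authority, from_hub)])
        a = _normalize(_mat_vec(P, r))      # authorities: provided reputable objects
        h = _normalize(_mat_vec(E, r))      # hubs: evaluated reputable objects
    return a, h, r


# Hypothetical toy data: agent 0 posts objects 0 and 1, agent 1 posts object 2,
# and agent 1 evaluates object 0.
P = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
E = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
a, h, r = eigenrumor_sketch(P, E)
```

In this toy run, object 1 receives no evaluation link, yet it ends up with a higher reputation than object 2, because its provider (agent 0) earns authority from the evaluation of object 0.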
10 The EigenRumor Algorithm – Algorithm (2/4) Notations
11 The EigenRumor Algorithm – Algorithm (3/4)
12 The EigenRumor Algorithm – Algorithm (4/4)
13 Mapping to Blog community (1/3)
  Links from the top page of a blog site to its blog entries => information provisioning links.
  Links to blog entries in other blogs => information evaluation links.
  (Forward) trackback => indicates the interest of the blogger.
  (Backward) trackback => ignored, because it is often generated by spamming.
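The mapping above can be sketched as a small classifier over crawled links. Everything named here is hypothetical: the function `classify_link`, the `kind` labels, and the two lookup tables (`top_pages` maps a top-page URL to its blogger, `entry_owner` maps an entry URL to its blogger). Link types that the slide does not mention are treated as ignored, which is an assumption of this sketch.

```python
def classify_link(source, target, top_pages, entry_owner, kind="hyperlink"):
    """Classify one link as 'provisioning', 'evaluation', or 'ignored'."""
    if kind == "backward_trackback":
        return "ignored"                  # often generated by spamming
    if target not in entry_owner:
        return "ignored"                  # target is not a known blog entry
    if source in top_pages:
        # Top page of a blog linking to one of that blogger's own entries.
        if top_pages[source] == entry_owner[target]:
            return "provisioning"
        return "ignored"
    if source in entry_owner and entry_owner[source] != entry_owner[target]:
        return "evaluation"               # entry linking to an entry in another blog
    return "ignored"


# Hypothetical URLs and bloggers.
top_pages = {"a.example/": "alice", "b.example/": "bob"}
entry_owner = {"a.example/1": "alice", "b.example/1": "bob"}

kinds = [
    classify_link("a.example/", "a.example/1", top_pages, entry_owner),
    classify_link("b.example/1", "a.example/1", top_pages, entry_owner),
    classify_link("a.example/1", "b.example/1", top_pages, entry_owner,
                  kind="backward_trackback"),
]
```

Each "provisioning" result would set an entry of P and each "evaluation" result an entry of E, with the blogger as agent and the target entry as object.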
14 Mapping to Blog community (2/3)
  The basic algorithm does not normalize the information provisioning matrix P or the information evaluation matrix E.
  Problem: a user who creates many blog accounts and interlinks them can inflate his or her scores.
15 Mapping to Blog community (3/3)
  Solutions:
    Normalization function 1:
    Normalization function 2 (longevity factor):
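The two normalization functions are not reproduced in this text, so the following is not the slide's formula. As a generic illustration of why normalization helps, one simple option is to scale each agent's row of P or E to unit Euclidean length, so that an agent cannot increase its total influence just by creating more links. The function name `normalize_rows` and the choice of the Euclidean norm are assumptions.

```python
import math


def normalize_rows(M):
    """Scale each agent's row to unit Euclidean length; all-zero rows stay zero."""
    out = []
    for row in M:
        norm = math.sqrt(sum(x * x for x in row))
        out.append([x / norm for x in row] if norm > 0 else list(row))
    return out


# Hypothetical evaluation matrix: agent 0 links to four objects, agent 1 to one,
# agent 2 to none.
E = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
E_normalized = normalize_rows(E)
```

After scaling, each of agent 0's four links carries weight 0.5, while agent 1's single link keeps weight 1.0.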
16 Experiments (1/3)
  The system's database holds 9,280,000 entries from 305,000 blog sites (2004/10/16 ~ 2005/02/03).
  Original:
    1,520,000 (16.3%) entries have one or more hyperlinks.
    116,000 (1.25%) entries are linked to other blogs.
    107,000 (1.15%) entries are referred to by other blogs.
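The percentages on this slide follow directly from the entry counts. A quick check, using only the numbers quoted above (the dictionary labels are paraphrases for illustration):

```python
total_entries = 9_280_000
counts = {
    "with one or more hyperlinks": 1_520_000,
    "linked to other blogs": 116_000,
    "referred to by other blogs": 107_000,
}

# Share of all entries, in percent.
shares = {label: 100 * n / total_entries for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {pct:.2f}%")
```

Only about one entry in a hundred is referred to by another blog, which is the link sparsity that motivates scoring entries through their bloggers rather than through in-links alone.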
17 Experiments (2/3)
  Applying the EigenRumor algorithm:
    36,200 bloggers have at least one blog entry linked from other blogs.
    28,300 (9.28%) bloggers have nonzero authority scores => 862,000 (9.28%) entries have nonzero reputation scores.
18 Experiments (3/3)
  Face-to-face user survey (40 guests, Feb. 2005)
  Best result   EigenRumor   In-link   TFIDF      Not determined
  Queries       18 (45%)     2 (5%)    1 (2.5%)   19 (48%)
19 Related Works
  iRank
  Technorati provided a commercial blog search.
  EigenRumor algorithm:
    Agent-to-object links, instead of page-to-page or agent-to-agent links.
    Normalization of links.
    Dynamic structure of links.
20 Conclusion
  The key feature of the algorithm is that it widens the coverage of blog entries that are assigned a score, compared with static link analysis alone.
21 Future Work
  The problem of spamming.
  How to choose a better ranking algorithm for a specific keyword?
22 References
  [1] K. Fujimura, T. Inoue, and M. Sugisaki, “The EigenRumor Algorithm for Ranking Blogs,” Nippon Telegraph and Telephone, 10 May 2005.
  [2] S. Brin and L. Page, “The Anatomy of a Large-scale Hypertextual Web Search Engine,” in Proceedings of the 7th International World Wide Web Conference, 1998.
  [3] Wikipedia, http://en.wikipedia.org/