1 The EigenRumor Algorithm for Ranking Blogs Advisor: Hsin-Hsi Chen Speaker: Sheng-Chung Yen ( 嚴聖筌 )


2 Outline
- Motivation
- Assumed Blog Structure
- Classification of Blog Ranking
- The EigenRumor Algorithm
  - Community model
  - Scores
  - Algorithm
- Mapping to Blog community
- Experiments
- Related Works
- Conclusion
- Future Work
- References

3 Motivation
- Approaches to page ranking
  - PageRank [2]
  - HITS (Hyperlink-Induced Topic Search) [3]
- Issues
  - The number of links to a blog entry is generally very small.
  - A new entry needs time to accumulate in-links before it can earn a high PageRank score.

4 Assumed Blog Structure
- A blog consists of a top page and a set of blog entries. A blog is generally updated and maintained by a single blogger.
- There are links from the top page of the blog to each blog entry, and each blog entry has a permanent URI.
- Blog entries are added frequently, and a notification of each update is optionally sent to a ping server.
- A mechanism to construct a trackback [3] is provided.

5 Classification of Blog Ranking
- Subject of ranking
- Space of ranking
- Temporal space of ranking
- Semantics of ranking
- Source of evaluations collected

6 The EigenRumor Algorithm – Community model (1/2)

7 The EigenRumor Algorithm – Community model (2/2)
- When agent i provides (posts) object j, a provisioning link is established from i to j.
- When agent i evaluates the usefulness of an existing object j with the scoring value e_ij, an evaluation link is established from i to j.
- The provisioning matrix P = [p_ij] represents all provisioning links in the universe.
- The evaluation matrix E = [e_ij] represents all evaluation links in the universe.
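
The two matrices can be sketched in plain Python. This is only an illustration under stated assumptions: the toy community, the names `build_matrices`, `provisioning_links` and `evaluation_links`, and the choice of p_ij = 1 for a provisioning link are not taken from the slides.

```python
# Hypothetical toy community: 3 agents (rows) and 4 objects (columns).
NUM_AGENTS = 3
NUM_OBJECTS = 4

# (agent i, object j): agent i posted object j.
provisioning_links = [(0, 0), (0, 1), (1, 2), (2, 3)]

# (agent i, object j, score e_ij): agent i evaluated object j.
evaluation_links = [(1, 0, 1.0), (2, 0, 1.0), (0, 2, 0.5)]


def build_matrices(num_agents, num_objects, provisioning, evaluation):
    """Return (P, E) as lists of rows, one row per agent."""
    P = [[0.0] * num_objects for _ in range(num_agents)]
    E = [[0.0] * num_objects for _ in range(num_agents)]
    for i, j in provisioning:
        P[i][j] = 1.0      # assumption: p_ij = 1 marks a provisioning link
    for i, j, score in evaluation:
        E[i][j] = score    # e_ij holds the evaluation score
    return P, E


P, E = build_matrices(NUM_AGENTS, NUM_OBJECTS,
                      provisioning_links, evaluation_links)
```

Rows index agents and columns index objects, so row i of P lists what agent i posted and row i of E lists what agent i evaluated.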

8 The EigenRumor Algorithm – Scores
- Authority score (agent property): the level to which agent i provided objects in the past that followed the community direction.
- Hub score (agent property): the level to which agent i submitted comments (evaluations) on other past objects that followed the community direction.
- Reputation score (object property): the level of support that object j received from the agents.

9 The EigenRumor Algorithm – Algorithm (1/4)
- Assumptions
  - Objects that are provided by a “good” authority will follow the direction of the community.
  - Objects that are supported by a “good” hub will follow the direction of the community.
  - Agents that provide objects that follow the community direction are “good” authorities of the community.
  - Agents that evaluate objects that follow the community direction are “good” hubs of the community.

10 The EigenRumor Algorithm – Algorithm (2/4)
- Notations

11 The EigenRumor Algorithm – Algorithm (3/4)

12 The EigenRumor Algorithm – Algorithm (4/4)
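
The four assumptions on the Algorithm (1/4) slide suggest a mutually reinforcing iteration. The following is a minimal sketch, assuming update rules of the form a = P r, h = E r and r = alpha * P^T a + (1 - alpha) * E^T h, where r, a and h are the reputation, authority and hub vectors. The weight `alpha`, the L2 rescaling, the fixed iteration count and the toy matrices are all assumptions made for this example.

```python
import math


def eigenrumor(P, E, alpha=0.5, iterations=50):
    """Iterate reputation (r), authority (a) and hub (h) scores.

    P and E are lists of rows (agents x objects). The update rules and
    the weight alpha are assumptions made for this sketch.
    """
    num_agents = len(P)
    num_objects = len(P[0])

    def normalize(v):
        # Rescale to unit L2 length so the scores stay bounded.
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v] if norm > 0 else v

    r = [1.0] * num_objects   # start with equal reputation everywhere
    a = [0.0] * num_agents
    h = [0.0] * num_agents
    for _ in range(iterations):
        # Agents inherit scores from the objects they provided or evaluated.
        a = normalize([sum(P[i][j] * r[j] for j in range(num_objects))
                       for i in range(num_agents)])
        h = normalize([sum(E[i][j] * r[j] for j in range(num_objects))
                       for i in range(num_agents)])
        # Objects inherit scores from their provider and their evaluators.
        r = normalize([
            alpha * sum(P[i][j] * a[i] for i in range(num_agents))
            + (1 - alpha) * sum(E[i][j] * h[i] for i in range(num_agents))
            for j in range(num_objects)
        ])
    return r, a, h


# Toy data: agent 0 posted objects 0 and 1, agents 1 and 2 posted
# objects 2 and 3, and agents 1 and 2 both evaluated object 0.
P = [[1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
E = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
r, a, h = eigenrumor(P, E)
```

In this toy run object 1 has no evaluation links at all, yet it ends up with a higher reputation than objects 2 and 3 because its provider (agent 0) earned authority through object 0. This is the behaviour the Motivation slide asks for: new entries by a well-regarded blogger do not have to wait for in-links.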

13 Mapping to Blog community (1/3)
- Links from the top page of a blog site to its blog entries => information provisioning links.
- Links to blog entries in other blogs => information evaluation links.
- (Forward) trackbacks => taken as a sign of the blogger's interest.
- (Backward) trackbacks => ignored, because they are often generated by spamming.
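
The first two mapping rules can be sketched as a small link classifier. The crawl data, the `example` host names, and the use of the host name as the blog (and blogger) identifier are hypothetical choices for this illustration, and trackbacks are left out of the sketch.

```python
from urllib.parse import urlparse

# Hypothetical crawl result: each entry URL maps to the URLs it links to.
entries = {
    "http://alice.example/entry1": ["http://bob.example/entry7"],
    "http://alice.example/entry2": ["http://alice.example/entry1"],
    "http://bob.example/entry7": [],
}


def blog_of(url):
    """Treat the host name as the blog (and blogger) identifier."""
    return urlparse(url).netloc


def map_links(entries):
    """Split crawled links into provisioning and evaluation links."""
    provisioning = set()   # (blogger, entry): top page -> own entry
    evaluation = set()     # (blogger, entry): link to another blog's entry
    for entry_url, outlinks in entries.items():
        blogger = blog_of(entry_url)
        provisioning.add((blogger, entry_url))
        for target in outlinks:
            # Only links to known entries in *other* blogs are evaluations.
            if target in entries and blog_of(target) != blogger:
                evaluation.add((blogger, target))
    return provisioning, evaluation


provisioning, evaluation = map_links(entries)
```

Here the link from `entry2` to `entry1` stays inside one blog, so it produces no evaluation link, while the link from Alice's entry to Bob's entry does.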

14 Mapping to Blog community (2/3)
- The basic algorithm does not normalize the information provisioning matrix P or the information evaluation matrix E.
- Problem: a user who creates many blog accounts and interlinks them can inflate the scores.

15 Mapping to Blog community (3/3)
- Solutions:
  - Normalization function 1:
  - Normalization function 2 (longevity factor):
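
To illustrate what normalizing a link matrix means in general, the sketch below scales each agent's row by the number of links that agent has made. This is a hypothetical stand-in and not the normalization functions named on the slide: the function `normalize_rows`, the exponent `beta` and its default value are assumptions, and no longevity factor is modelled.

```python
def normalize_rows(M, beta=0.5):
    """Scale each agent's row by 1 / (number of links ** beta).

    Hypothetical normalization: beta = 0 leaves rows unchanged, and
    beta = 1 makes the weights of a 0/1 row sum to 1.
    """
    normalized = []
    for row in M:
        n = sum(1 for x in row if x != 0)      # links made by this agent
        scale = 1.0 / (n ** beta) if n else 1.0
        normalized.append([x * scale for x in row])
    return normalized


# An agent with four links and an agent with a single link.
E = [[1, 1, 1, 1], [1, 0, 0, 0]]
E_half = normalize_rows(E, beta=0.5)   # each of the four links weighs 0.5
E_full = normalize_rows(E, beta=1.0)   # each of the four links weighs 0.25
```

With such a scaling, an account that links to many entries contributes less weight per link, which limits how much a group of interlinked accounts can gain simply by adding more links.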

16 Experiments (1/3)
- The system's database holds 9,280,000 entries from 305,000 blog sites (October 16, 2004 to February 3, 2005).
- Original:
  - 1,520,000 (16.3%) entries have one or more hyperlinks.
  - 116,000 (1.25%) entries link to other blogs.
  - 107,000 (1.15%) entries are referred to by other blogs.

17 Experiments (2/3)
- Applying the EigenRumor algorithm:
  - 36,200 bloggers have at least one blog entry linked from other blogs.
  - 28,300 (9.28%) bloggers have nonzero authority scores => 862,000 (9.28%) entries have nonzero reputation scores.

18 Experiments (3/3)
- Face-to-face user survey (40 guests, Feb. 2005)
  Best result   EigenRumor   In-link   TFIDF      Not determined
  Queries       18 (45%)     2 (5%)    1 (2.5%)   19 (48%)

19 Related Works
- iRank
- Technorati provides a commercial blog search service.
- The EigenRumor algorithm differs in:
  - Agent-to-object links, instead of page-to-page or agent-to-agent links.
  - Normalization of links.
  - Dynamic structure of links.

20 Conclusion
- The key feature of the algorithm is that it widens the coverage of blog entries that are assigned a score, compared with what static link analysis alone can score.

21 Future Work
- The problem of spamming.
- How to choose a better ranking algorithm for a specific keyword.

22 References
[1] K. Fujimura, T. Inoue, and M. Sugisaki, “The EigenRumor Algorithm for Ranking Blogs,” Nippon Telegraph and Telephone, 10 May 2005.
[2] S. Brin and L. Page, “The Anatomy of a Large-scale Hypertextual Web Search Engine,” in Proceedings of the 7th International World Wide Web Conference, 1998.
[3] Wikipedia, http://en.wikipedia.org/

