Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1:


1 Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1: Information Retrieval and Web Search
Part I: Web Structure Mining
Chapter 2: Hyperlink Based Ranking
– Social Network Analysis
– PageRank
– Authorities and Hubs
– Link Based Similarity Search
– Enhanced Techniques for Page Ranking

2 Social Networks
– A social network is modeled as a directed graph with weights assigned to its edges.
– Nodes represent documents, and edges represent citations from one document to another.
– Prestige can be associated with the number of input edges to a node (its in-degree).
– Prestige is recursive: the prestige of a document depends on the authority (that is, again the prestige) of the documents that cite it.

3 Social Networks
Adjacency matrix $A$:
– $A(u,v) = 1$ if document $u$ cites document $v$
– $A(u,v) = 0$ otherwise
Prestige score: $p(v) = \sum_{u} A(u,v)\, p(u)$
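A minimal sketch of the adjacency matrix and one step of the recursive prestige computation. The three-document citation list and the variable names are hypothetical, chosen only for illustration; the update shown (each document sums the current prestige of the documents citing it) is the usual reading of the recursive definition, not code from the slides.

```python
import numpy as np

# Hypothetical citation graph: (citing document, cited document).
citations = [(0, 1), (0, 2), (1, 2), (2, 0)]
n = 3

# A[u, v] = 1 if document u cites document v, 0 otherwise.
A = np.zeros((n, n))
for u, v in citations:
    A[u, v] = 1.0

# In-degree prestige: the number of citations each document receives.
in_degree = A.sum(axis=0)

# One recursive step: each document collects the current prestige
# of the documents that cite it.
p = np.ones(n)
p_next = A.T @ p

print(in_degree)
print(p_next)
```

Starting from equal prestige for every document, the first step simply reproduces the in-degree counts; later steps start to reward citations from highly cited documents.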

4 Social Networks
Computing prestige: $P' = A^{T} P$
Eigen decomposition: $\lambda P = A^{T} P$
– Eigenvector $P$
– Eigenvalue $\lambda$
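As a rough illustration of the eigen decomposition view, the sketch below treats the prestige vector as the dominant eigenvector of the transposed adjacency matrix. The small matrix is a hypothetical example, and using `numpy.linalg.eig` is an assumption made for brevity rather than the method used in the slides.

```python
import numpy as np

# Hypothetical adjacency matrix: A[u, v] = 1 if document u cites document v.
A = np.array([[0, 1, 1],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)

# Eigen decomposition of the transposed adjacency matrix.
eigenvalues, eigenvectors = np.linalg.eig(A.T)

# Pick the eigenvalue with the largest real part and its eigenvector.
k = np.argmax(eigenvalues.real)
lam = eigenvalues[k].real
P = eigenvectors[:, k].real

# Rescale so the prestige scores sum to 1 (this also fixes the sign).
P = P / P.sum()

print(lam)
print(P)
```

For this strongly connected example the dominant eigenvalue is real and its eigenvector has only positive entries, so the scores can be read directly as relative prestige.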

5 Social Networks

6 Social Networks
Power iteration
– Loop: (update steps not preserved in this transcript)
– While (stopping condition not preserved in this transcript)
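One common way to write power iteration for prestige is sketched below. The function name `power_iteration`, the L1 normalization, the tolerance `eps`, and the iteration cap `max_iter` are all assumptions for illustration; the exact loop body and stopping test on the slide did not survive in this transcript.

```python
import numpy as np

def power_iteration(A, eps=1e-10, max_iter=1000):
    """Approximate the dominant eigenvector of A.T by repeated multiplication."""
    n = A.shape[0]
    P = np.ones(n) / n                      # start from equal prestige
    for _ in range(max_iter):
        Q = P                               # remember the previous estimate
        P = A.T @ Q                         # propagate prestige along citations
        P = P / np.linalg.norm(P, 1)        # normalize so the scores sum to 1
        if np.linalg.norm(P - Q, 1) <= eps: # stop when the estimate settles
            break
    return P

# Hypothetical adjacency matrix: A[u, v] = 1 if document u cites document v.
A = np.array([[0, 1, 1],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)

P = power_iteration(A)
print(P)
```

Normalizing on every pass keeps the vector from growing or shrinking without bound, so only the relative sizes of the scores change between iterations.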

7 PageRank
– A "random web surfer" keeps clicking on hyperlinks at random with uniform probability.
– This implements a random walk on the web graph.
– Suppose page $u$ links to $N_u$ web pages.
– The probability of visiting a linked page $v$ from $u$ is then $1/N_u$.
– The amount of prestige that page $v$ receives from page $u$ is $1/N_u$ of the prestige of $u$.
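The random surfer model can be sketched as a repeated redistribution of rank, where each page passes an equal share of its rank to every page it links to. The link structure, the variable names, and the fixed count of 100 iterations are hypothetical; this simplified version assumes every page has at least one outgoing link and omits any damping or other refinements.

```python
import numpy as np

# Hypothetical web graph: page -> list of pages it links to.
links = {0: [1, 2], 1: [2], 2: [0]}
n = len(links)

# M[u, v] = 1 / N_u if page u links to page v, where N_u is the
# number of outgoing links of u; each row of M sums to 1.
M = np.zeros((n, n))
for u, targets in links.items():
    for v in targets:
        M[u, v] = 1.0 / len(targets)

# Start from a uniform distribution and let the surfer click repeatedly.
R = np.ones(n) / n
for _ in range(100):
    R = M.T @ R   # each page collects 1/N_u of the rank of every page u linking to it

print(R)
```

In this example page 0 splits its rank between pages 1 and 2, while pages 1 and 2 each pass their whole rank to a single page, so pages 0 and 2 end up with twice the rank of page 1.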

8 PageRank
Propagation of page rank

9 PageRank
Calculation of page rank (table not preserved; surviving column labels: Norm, Integers)

