Published by Lorraine Johnston. Modified over 9 years ago.
1
Link Analysis on the Web
An Example: Broad-topic Queries
Xin
2
Problem
Specific queries: “Does Netscape support the JDK 1.1 code-signing API?”
Broad-topic queries: “Find information about the Java programming language.”
Authority is important in broad-topic queries
Web Query: “java”
1. http://java.sun.com
2. http://sunsite.unc.edu/javafaq/javafaq.html
3. …
3
Why use link analysis instead of content information?
Query: Harvard
Harvard homepage: “Harvard” occurs 4 times
Other page introducing Harvard: “Harvard” occurs 8 times
Query: Search engines
Yahoo! homepage: “Search engines” occurs 0 times
Other page introducing search engines: “Search engines” occurs 4 times
4
Graph Representation
G = (V, E)
V: pages
E: links between pages (in-links and out-links)
Adjacency matrix: rows and columns indexed by pages p1, p2, p3, p4, with a 1 in each entry where a link exists
Given a query, how can we find the most authoritative page from this link information?
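A minimal Python sketch of this representation. The edge list is hypothetical (it is not read off the slide's matrix), and the helper names `adjacency_matrix` and `degrees` are introduced here for illustration. Entry A[i][j] = 1 means page i links to page j, so row sums give out-degrees and column sums give in-degrees.

```python
def adjacency_matrix(edges, n):
    """Return an n x n matrix A with A[i][j] = 1 if page i links to page j."""
    A = [[0] * n for _ in range(n)]
    for src, dst in edges:
        A[src][dst] = 1
    return A


def degrees(A):
    """Return (out_degree, in_degree) lists computed from adjacency matrix A."""
    n = len(A)
    out_degree = [sum(row) for row in A]                            # links leaving each page
    in_degree = [sum(A[i][j] for i in range(n)) for j in range(n)]  # links arriving at each page
    return out_degree, in_degree


# Hypothetical links among four pages, indexed 0..3 for p1..p4.
edges = [(0, 1), (1, 0), (1, 3), (2, 0), (3, 0)]
A = adjacency_matrix(edges, 4)
out_degree, in_degree = degrees(A)
```

In this made-up graph p1 has the most in-links, which is the simplest link-based signal of authority.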
5
Overview
Web Query: “java”
1. http://java.sun.com
2. http://sunsite.unc.edu/javafaq/javafaq.html
3. …
Steps:
1. Sub-graph construction
2. Hubs and authorities computation
6
Step 1: Sub-graph Construction
Challenge:
– Small in size
– Rich in relevant pages
– Contains most of the strongest authorities
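One well-known way to aim for these three properties is the construction used in Kleinberg's HITS algorithm: start from a small root set of top text-search results, then expand it with pages that the root pages link to and a bounded number of pages that link to them. The sketch below assumes this approach. The function names, the `out_links` and `in_links` dictionaries, and the cap `d` are hypothetical names chosen for illustration.

```python
def build_base_set(root_set, out_links, in_links, d=50):
    """Expand a root set into a base set (a sketch under stated assumptions).

    root_set  : iterable of page ids returned by a text search (assumed input)
    out_links : dict page -> list of pages it links to (assumed input)
    in_links  : dict page -> list of pages linking to it (assumed input)
    d         : max number of in-linking pages added per root page (assumed parameter)
    """
    root = list(root_set)
    base = set(root)
    for page in root:
        # Add every page the root page points to.
        base.update(out_links.get(page, ()))
        # Add at most d pages that point to the root page, so that one
        # very popular page cannot blow up the size of the sub-graph.
        base.update(list(in_links.get(page, ()))[:d])
    return base


def induced_edges(base, out_links):
    """Keep only the links whose two endpoints both lie in the base set."""
    return {(p, q) for p in base for q in out_links.get(p, ()) if q in base}
```

The cap `d` keeps the sub-graph small, while following links in both directions is what pulls in strong authorities that may not contain the query text themselves.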
7
Step 2: Hubs and Authorities
Basic Idea: in-degree
Problem:
8
Step 2: Hubs and Authorities
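The mutually reinforcing relationship between hubs and authorities is usually written as the pair of update rules below, following Kleinberg's HITS formulation. The notation is an assumption introduced here: $a(p)$ and $h(p)$ are the authority and hub scores of page $p$, $q \to p$ means that page $q$ links to page $p$, and $A$ is the adjacency matrix with $A_{qp} = 1$ when $q \to p$.

```latex
a(p) = \sum_{q \,:\, q \to p} h(q), \qquad
h(p) = \sum_{q \,:\, p \to q} a(q)
\qquad\Longleftrightarrow\qquad
\mathbf{a} = A^{\mathsf{T}} \mathbf{h}, \quad \mathbf{h} = A\,\mathbf{a}
```

A good authority is pointed to by good hubs, and a good hub points to good authorities.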
9
An Iterative Algorithm:
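A minimal Python sketch of such an iteration, assuming the standard hubs-and-authorities (HITS) updates. The function name `hits`, the fixed iteration count, and the choice to normalize each score vector to sum to 1 are assumptions made for illustration; other formulations normalize by the Euclidean norm or stop on convergence.

```python
def hits(edges, n, iterations=20):
    """Iteratively compute hub and authority scores for pages 0..n-1.

    edges      : list of (src, dst) pairs, meaning page src links to page dst
    n          : number of pages
    iterations : fixed number of rounds (an assumed stopping rule)
    """
    hub = [1.0 / n] * n
    auth = [1.0 / n] * n
    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages linking to each page.
        new_auth = [0.0] * n
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub update: sum the new authority scores of the pages each page links to.
        new_hub = [0.0] * n
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        # Normalize so each score vector sums to 1 (assumed normalization).
        auth_total = sum(new_auth) or 1.0
        hub_total = sum(new_hub) or 1.0
        auth = [x / auth_total for x in new_auth]
        hub = [x / hub_total for x in new_hub]
    return hub, auth
```

For example, `hits([(0, 1), (1, 0), (1, 3), (2, 0), (3, 0)], 4)` returns two lists of four scores each; the page with no in-links keeps an authority score of zero.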
10
Simple Example 1
(x, y): x = hub score, y = authority score
Initial scores: (1/4, 1/4)
11
Simple Example 2
Initial scores: (1/4, 1/4)
Hub:
1: 1/4
2: 1/4 + 1/4
3: 1/4
4: 1/4
Authority:
1: 1/4 + 1/4 + 1/4
2: 1/4
3: 0
4: 1/4
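The sums in this example can be reproduced with a short Python check. The edge list is an assumption: each hub sum has one term per out-link and each authority sum one term per in-link, and the links 1→2, 2→1, 2→4, 3→1, 4→1 are one structure consistent with those counts if no page links to itself. The original figure may differ. The helper `one_round` is a hypothetical name.

```python
from fractions import Fraction


def one_round(edges, hub, auth):
    """One update round; both new score sets come from the old scores, with no normalization."""
    new_hub = {p: Fraction(0) for p in hub}
    new_auth = {p: Fraction(0) for p in auth}
    for src, dst in edges:
        new_hub[src] += auth[dst]   # a hub gains the authority of each page it links to
        new_auth[dst] += hub[src]   # an authority gains the hub score of each page linking to it
    return new_hub, new_auth


# Assumed links among pages 1..4 (see the note above).
edges = [(1, 2), (2, 1), (2, 4), (3, 1), (4, 1)]
start = {p: Fraction(1, 4) for p in (1, 2, 3, 4)}  # every page starts at 1/4
hub1, auth1 = one_round(edges, start, start)
```

Under this assumed graph, `hub1` gives 1/4, 1/2, 1/4, 1/4 and `auth1` gives 3/4, 1/4, 0, 1/4 for pages 1 to 4, matching the sums listed.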
12
PageRank