
1 Bring Order to The Web. Ruey-Lung Hsiao, May 4, 2000

2 Coverage of the Web (est. 1 billion total pages)
[Bar chart: estimated share of the Web indexed by FAST, AltaVista, Excite, Northern Light, Google, Inktomi, Go, and Lycos, ranging from roughly 38% at the top down to about 6%. Report date: Feb. 3, 2000.]

3 Focused Crawling
Toward topic-specific web resource discovery: a focused crawler analyzes its crawl boundary to find the links most likely to be relevant to the topic, avoiding irrelevant regions of the web (a minimal crawler sketch follows this slide).
Ranking: the core of an information retrieval/discovery system.
Classification: another view of ranking.
Important metrics for target pages:
- similarity to the driving query
- backlink count
- PageRank, HITS
- location metrics
(the above metrics are adopted from ref. 1)
- estimated Q-value (adopted from ref. 2)
- ...
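To make the crawl-boundary idea concrete, here is a minimal best-first crawler sketch. The callables `fetch`, `extract_links`, and `score` are hypothetical stand-ins for downloading, parsing, and whichever of the ranking metrics above is in use; none of them come from the slides.

```python
import heapq

def focused_crawl(seed_urls, fetch, extract_links, score, budget=1000):
    """Best-first focused crawler: the frontier is a priority queue
    ordered by an estimated relevance score for each unseen link."""
    frontier = [(-score(u), u) for u in seed_urls]  # negate: heapq is a min-heap
    heapq.heapify(frontier)
    seen, pages = set(seed_urls), []
    while frontier and len(pages) < budget:
        _, url = heapq.heappop(frontier)      # most promising link first
        pages.append(fetch(url))
        for link in extract_links(pages[-1]):
            if link not in seen:              # expand the crawl boundary
                seen.add(link)
                heapq.heappush(frontier, (-score(link), link))
    return pages
```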

4 PageRank (1/3)
Consider a random web surfer who:
- jumps to a random page with probability ε
- with probability (1 − ε), follows a random hyperlink on the current page
Transition probability matrix: ε·U + (1 − ε)·A, where U is the uniform distribution and A is the (row-normalized) adjacency matrix.
Query-independent ranking = the stationary probability of this Markov chain (sketched below). (adopted from ref. 6)
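A minimal power-iteration sketch of this surfer chain. The toy graph and function name are illustrative; it assumes every page has at least one out-link, with rank sinks handled on the following slides via the E vector.

```python
import numpy as np

def pagerank(adjacency, eps=0.15, iters=50):
    """Stationary distribution of the surfer chain eps*U + (1-eps)*A.

    adjacency: page index -> list of out-link indices.
    """
    n = len(adjacency)
    A = np.zeros((n, n))
    for u, links in adjacency.items():
        for v in links:
            A[u, v] = 1.0 / len(links)       # random hyperlink from u
    M = eps / n + (1 - eps) * A              # eps*U + (1-eps)*A, row-stochastic
    rank = np.full(n, 1.0 / n)               # start from the uniform distribution
    for _ in range(iters):
        rank = rank @ M                      # one step of the Markov chain
    return rank

# Toy web: pages 0 and 1 link to each other and to 2; 2 links back to 0.
print(pagerank({0: [1, 2], 1: [0, 2], 2: [0]}))
```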

5 PageRank (2/3)
Simplified definition of PageRank:
R(u) = c · Σ_{v ∈ B_u} R(v) / N_v
- F_u : the set of pages u points to
- B_u : the set of pages that point to u
- N_u = |F_u|
[Figure: simplified PageRank calculation. A page with rank 100 and two out-links passes 50 along each link; a page with rank 9 and three out-links passes 3 along each; a page receiving one 50-link and one 3-link gets rank 53.]

6 PageRank (3/3)
[Figure: a "rank sink": a loop of pages with in-links but no links back out to the rest of the web, which traps rank under the simplified definition.]
Definition of PageRank:
R'(u) = c · Σ_{v ∈ B_u} R'(v) / N_v + c · E(u)
- E(u) is a vector over the web pages that corresponds to a source of rank
- c is a decay factor
Personalized PageRank: besides solving rank sinks, E turns out to be a powerful parameter for adjusting page ranks. Changing E from the uniform distribution to a biased distribution favors a specific topic (see the sketch below).
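A sketch of the E-vector iteration, under the same toy-graph assumptions as the earlier sketch: a uniform e recovers plain PageRank, while concentrating its mass on topic pages gives a topic-biased ranking.

```python
import numpy as np

def personalized_pagerank(adjacency, e, c=0.85, iters=50):
    """Iterate R' = c*(R' @ A) + c*E, then renormalize.

    e is the rank-source vector E from the slide.
    """
    n = len(adjacency)
    A = np.zeros((n, n))
    for u, links in adjacency.items():
        for v in links:
            A[u, v] = 1.0 / len(links)       # A[u, v] = 1/N_u for u -> v
    e = np.asarray(e, dtype=float)
    rank = e / e.sum()
    for _ in range(iters):
        rank = c * (rank @ A) + c * e
        rank /= rank.sum()                   # keep it a probability vector
    return rank

# All rank-source mass on page 0, treated as the topic page.
print(personalized_pagerank({0: [1, 2], 1: [0, 2], 2: [0]}, e=[1.0, 0.0, 0.0]))
```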

7 Reinforcement Learning (1/4)
Goal: autonomous agents learn to choose optimal actions to achieve their goals, i.e., learn a control strategy, or policy, for choosing actions.
Method: use reward (reinforcement) to learn a successful agent function.
Model:
[Diagram: the agent observes states s0, s1, s2, ... and rewards r0, r1, r2, ... from the environment, and responds with actions a0, a1, a2, ...]
Goal: learn to choose actions that maximize the discounted cumulative reward r0 + γ·r1 + γ²·r2 + ..., where 0 ≤ γ < 1 (computed below). (adopted from ref. 3)
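The discounted return is simple enough to state in a few lines of Python (the function name is illustrative):

```python
def discounted_return(rewards, gamma=0.9):
    """Discounted cumulative reward r0 + g*r1 + g^2*r2 + ... for 0 <= g < 1."""
    return sum(r * gamma ** i for i, r in enumerate(rewards))

# A single reward of 100 received three steps in the future is worth
# 0.9**3 * 100 = 72.9 now.
print(discounted_return([0, 0, 0, 100]))  # 72.9
```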

8 Reinforcement Learning (2/4)
Interaction between agent and environment:
- the agent can perceive a set S of distinct states of its environment
- the agent has a set A of distinct actions that it can perform
- the environment responds with a reward r_t = r(s_t, a_t)
- the environment produces the succeeding state s_{t+1} = δ(s_t, a_t)
- r and δ are part of the environment and not necessarily known to the agent
Markov decision process (MDP): the functions r(s_t, a_t) and δ(s_t, a_t) depend only on the current state and action.
Formulating the policy:
- the agent learns π : S → A, selecting the next action a_t based on the state s_t
- such a policy should maximize the cumulative value (a direct unrolling of V^π is sketched below):
  V^π(s_t) = r_t + γ·r_{t+1} + γ²·r_{t+2} + ... = Σ_{i=0}^{∞} γ^i · r_{t+i}
- π* = argmax_π V^π(s) for all s
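If r and δ were available (they are in the off-line training setting of slide 12, though not to a live agent), V^π could be unrolled directly. A truncated sketch, with `policy`, `r`, and `delta` as hypothetical callables:

```python
def policy_value(s, policy, r, delta, gamma=0.9, horizon=100):
    """Approximate V_pi(s) by unrolling the deterministic MDP from s."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):          # truncate the infinite sum
        a = policy(s)
        total += discount * r(s, a)   # gamma^i * r_{t+i}
        discount *= gamma
        s = delta(s, a)
    return total
```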

9 Reinforcement Learning (3/4)
Example (suppose γ = 0.9):
[Figure: a six-state grid world with absorbing goal state G. Panels show r(s,a), the immediate reward values (100 for actions entering G, 0 elsewhere); the Q(s,a) values (100, 90, 81, 72 depending on distance to G); the V*(s) values (100, 90, 81); and one optimal policy.]
Worked values along a path to the goal (verified below):
V = 100 + 0.9·0 + ... = 100
V = 0 + 0.9·100 + 0.9²·0 + ... = 90
V = 0 + 0.9·0 + 0.9²·100 + ... = 81
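The three worked values can be checked directly against the discounted-return definition:

```python
gamma = 0.9
print(100 + gamma * 0)                    # next to G:        V* = 100.0
print(0 + gamma * 100)                    # two steps away:   V* = 90.0
print(0 + gamma * 0 + gamma**2 * 100)     # three steps away: V* = 81.0
```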

10 Reinforcement Learning (4/4)
Q-Learning:
- It is difficult to learn π* : S → A directly, because the training data does not provide examples of the form <s, a>.
- The agent prefers state s1 over s2 whenever V*(s1) > V*(s2).
- The optimal action in state s is the action a that maximizes the sum of the immediate reward r(s,a) plus the value V* of the immediate successor state, discounted by γ:
  π*(s) = argmax_a [ r(s,a) + γ·V*(δ(s,a)) ]
A related measure, Q:
- Q(s,a) = r(s,a) + γ·V*(δ(s,a))  =>  π*(s) = argmax_a Q(s,a)
Relation between Q and V*: V*(s) = max_{a'} Q(s,a')
Estimate the Q-value iteratively: Q'(s,a) ← r + γ·max_{a'} Q'(s',a') (a tabular sketch follows this slide)
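A tabular sketch of this iterative update. `step(s, a) -> (reward, next_state)` is a hypothetical environment interface; for a deterministic world such as slide 9's grid, the update converges to the true Q-values.

```python
import random
from collections import defaultdict

def q_learning(states, actions, step, episodes=500, gamma=0.9):
    """Tabular Q-learning: Q'(s,a) <- r + gamma * max_a' Q'(s',a')."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s = random.choice(states)            # start each episode somewhere
        for _ in range(50):                  # bounded episode length
            a = random.choice(actions)       # explore with random actions
            r, s2 = step(s, a)
            Q[(s, a)] = r + gamma * max(Q[(s2, a2)] for a2 in actions)
            s = s2
    return Q
```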

11 Efficient Web Spidering (1/3)
Paper: "Using Reinforcement Learning to Spider the Web Efficiently" (Rennie and McCallum, ICML '99; ref. 2), part of the project for building Cora, a domain-specific search engine containing computer science research papers. A series of related papers appeared at the AAAI-99 symposium on Intelligent Agents and at IJCAI '99.
Demo site:
Why use reinforcement learning?
- The performance of a topic-specific spider is measured in terms of reward over time.
- The environment presents situations with delayed reward.
How is it done?
- Learn a mapping from the text in the neighborhood of a hyperlink to the expected (discounted) number of relevant pages that can be found as a result of following that hyperlink.
- Use naïve Bayes to classify the text into one of a corresponding finite number of classes.

12 Efficient Web Spidering (2/3)
Obtaining training data:
- Off-line training using 4 CS department web sites, including documents and hyperlinks.
- The state-transition function T and reward function R are therefore fully known.
Crawling: learn the Q function.
Calculating the Q function (sketched below):
[Figure: a crawl graph in which the target page gives reward 1 and all other pages give reward 0; each hyperlink's Q-value is the discounted sum of rewards obtainable by following it.]
Neighborhood of a hyperlink: the anchor text of the link, nearby headers, and the page title of the linked document.
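A simplified sketch of how training Q-values can be computed when T and R are fully known. It treats the crawl graph as a tree and is a sketch of the idea, not the paper's exact estimator; all names are illustrative.

```python
def hyperlink_q_values(graph, rewards, gamma=0.5):
    """Q-values for hyperlinks in fully known training data.

    graph: page -> list of linked pages (assumed to form a tree here);
    rewards: page -> 1 for target papers, 0 otherwise. The Q-value of a
    hyperlink into v is the discounted number of relevant pages found by
    following it: r(v) + gamma * sum over v's own out-links.
    """
    memo = {}

    def q(v):  # discounted relevant-page count obtainable via v
        if v not in memo:
            memo[v] = rewards[v] + gamma * sum(q(w) for w in graph.get(v, []))
        return memo[v]

    return {(u, v): q(v) for u in graph for v in graph[u]}

# Tiny crawl tree: the department page links to a group page and a paper.
print(hyperlink_q_values({"dept": ["group", "paper1"], "group": ["paper2"]},
                         {"dept": 0, "group": 0, "paper1": 1, "paper2": 1}))
```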

13 Efficient Web Spidering (3/3)
Mapping text to Q-values (given that we have calculated Q-values for the hyperlinks in the training data):
- Discretize the discounted sums of reward values into bins, and place the text in the neighborhood of each hyperlink into the bin corresponding to its Q-value.
- Train a naïve Bayes text classifier on those texts.
- For each new hyperlink, calculate the probabilistic class membership of each bin; the estimated Q-value of that hyperlink is the average of the bins' values, weighted by those probabilities (see the sketch below).
Evaluation:
- Measurement: number of hyperlinks followed before 75% of the target pages are found.
- Reinforcement learning: 16% of the hyperlinks.
- Breadth-first: 48% of the hyperlinks.
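A minimal sketch of the bin-and-classify pipeline, using scikit-learn as a modern stand-in for the paper's own naïve Bayes implementation; the texts, Q-values, and bin edges are all illustrative.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hyperlink-neighborhood texts with Q-values from the training crawl.
texts = ["publications papers postscript", "people home page", "sports links"]
q_values = np.array([0.9, 0.3, 0.05])

# 1. Discretize Q-values into bins; remember each bin's average value.
bins = np.digitize(q_values, [0.2, 0.6])        # 0 = low, 1 = mid, 2 = high
bin_value = {b: q_values[bins == b].mean() for b in set(bins)}

# 2. Train a naive Bayes text classifier: neighborhood text -> bin.
vec = CountVectorizer()
clf = MultinomialNB().fit(vec.fit_transform(texts), bins)

# 3. Estimated Q of a new hyperlink: probability-weighted bin average.
probs = clf.predict_proba(vec.transform(["papers and publications"]))[0]
print(sum(p * bin_value[b] for p, b in zip(probs, clf.classes_)))
```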

14 References
1. Efficient Crawling Through URL Ordering. Junghoo Cho, Hector Garcia-Molina, Lawrence Page. 7th WWW Conference, 1998.
2. Using Reinforcement Learning to Spider the Web Efficiently. Jason Rennie, Andrew McCallum. ICML '99.
3. Machine Learning. Tom M. Mitchell. McGraw-Hill, 1997.
4. The PageRank Citation Ranking: Bringing Order to the Web. Lawrence Page, Sergey Brin, Rajeev Motwani, Terry Winograd. 1998.
5. Mining the Link Structure of the World Wide Web. Soumen Chakrabarti et al. IEEE Computer, 1999.
6. Information Retrieval on the Web. Tutorial at SIGIR '98.

