Web Information retrieval (Web IR)

Handout #12: Combinational Ranking
Ali Mohammad Zareh Bidoki
ECE Department, Yazd University
Autumn 2011

Ranking Algorithm Problems
Rich-get-richer (connectivity based)
Low precision (at most 0.30)
Each ranking algorithm operates well in some situations

Combinational Ranking
Content + connectivity + ???
How can we combine these features?
R = f(query, content, connectivity)

Relevance propagation Model (by Shakery)
A hyper score (h) is computed for each document.
WI and WO are weighting functions for in-link and out-link pages, respectively.
S(p) is the similarity between query q and page p (self relevance).
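One general form consistent with these definitions is sketched below. The mixing weights \alpha, \beta, \gamma and the iteration index k are assumptions introduced for illustration; they are not given on the slide.

```latex
h^{k+1}(p) = \alpha\, S(p)
  + \beta \sum_{p_i \to p} h^{k}(p_i)\, W_I(p_i, p)
  + \gamma \sum_{p \to p_j} h^{k}(p_j)\, W_O(p, p_j)
```

Here p_i \to p ranges over the pages that link to p, and p \to p_j over the pages that p links to. Setting \gamma = 0 leaves only in-link propagation, and setting \beta = 0 leaves only out-link propagation.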

Three Iterative Models
Weighted In-Link
Weighted Out-Link
Uniform Out-Link

Weighted In-Link
This model of user behavior is quite similar to the random surfer model, except that it is not query-independent. The probability that the random surfer visits a page is its hyper-relevance score.

Weighted Out-Link
In this model, we assume that a user who is given a page reads its content with probability alpha and traverses its outgoing edges with probability (1-alpha). The pages linked from a page do not all have the same impact on its weight: pages whose contents are more similar to the query are assumed to have more impact on the page's score than those that are less similar.

Uniform Out-Link
In this special case, the authors assume that at each page the user reads the content of the page and, with probability (1-alpha), reads all the pages that are linked from it.
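The two out-link models can be compared with a small sketch. It assumes that the weighted variant gives each linked page a weight proportional to its query similarity S, that the uniform variant gives every linked page the same weight, and that weights on a page's out-links are normalized to sum to 1. The function name, the normalization, and the fixed iteration count are assumptions for illustration, not details from the slides.

```python
def out_link_scores(S, out_links, alpha=0.7, weighted=True, iters=20):
    """Iterate h(p) = alpha*S(p) + (1-alpha) * sum_j w(p, p_j) * h(p_j).

    S         -- dict: page -> similarity to the query (self relevance)
    out_links -- dict: page -> list of pages it links to (all present in S)
    weighted  -- True: w proportional to S of the linked page (assumption)
                 False: same weight for every linked page (assumption)
    """
    h = dict(S)  # start from the self-relevance scores
    for _ in range(iters):
        new_h = {}
        for p in S:
            targets = out_links.get(p, [])
            total = sum(S[t] for t in targets)
            acc = 0.0
            for t in targets:
                if weighted:
                    w = S[t] / total if total > 0 else 0.0
                else:
                    w = 1.0 / len(targets)
                acc += w * h[t]
            new_h[p] = alpha * S[p] + (1 - alpha) * acc
        h = new_h
    return h


# Hypothetical three-page example: a hub linking to one relevant
# and one irrelevant page.
S = {"hub": 0.2, "good": 0.9, "bad": 0.1}
out_links = {"hub": ["good", "bad"]}
weighted_scores = out_link_scores(S, out_links, weighted=True)
uniform_scores = out_link_scores(S, out_links, weighted=False)
```

In this example the hub scores higher under the weighted variant, because the relevant page it links to contributes more than the irrelevant one.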

Algorithm Implementation
The algorithm is run on a working set.
Working set construction:
They first find the top pages that have the highest content similarity to the query.
From these pages, a small number (about 200) of the most similar pages are selected as the core set.
They then expand the core set to the working set by adding pages that are among the top pages and that point to pages in the core set or are pointed to by pages in the core set.
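The construction steps above can be sketched as follows. The core size of 200 comes from the slide; the size of the initial top set (top_n) is not stated, so the default used here is a placeholder assumption, as are the function and parameter names.

```python
def build_working_set(similarity, links, top_n=1000, core_size=200):
    """Return (core, working) sets of pages for one query.

    similarity -- dict: page -> content similarity to the query
    links      -- iterable of (source, target) hyperlink pairs
    top_n      -- size of the initial top set (assumed value)
    core_size  -- size of the core set (about 200 on the slide)
    """
    ranked = sorted(similarity, key=similarity.get, reverse=True)
    top = set(ranked[:top_n])
    core = set(ranked[:core_size])

    working = set(core)
    for src, dst in links:
        # A top page that points to a core page joins the working set.
        if src in top and dst in core:
            working.add(src)
        # A top page that is pointed to by a core page joins as well.
        if dst in top and src in core:
            working.add(dst)
    return core, working
```

Pages outside the initial top set are never added, even when they link to the core set, which keeps the working set small enough to iterate over at query time.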

Algorithm Properties
It is:
Online??
Recursive
Query dependent
Experiments on TREC data show that Weighted In-Link outperforms the others.

Frequency Propagation (By Song)
Instead of propagating scores, the frequencies of query terms are propagated.
We can use it online.
Propagation follows the site structure.

Propagation Formula
ft(p) is the frequency of term t in page p.
f’t(p) is the frequency of term t in page p after propagation.
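A minimal sketch of term-frequency propagation over a site tree follows. It assumes one plausible form of the update, f't(p) = (1 + alpha) * ft(p) + (1 - alpha) * (average of ft over the children of p), in which a page boosts its own count and adds a damped average of the counts on its child pages. The exact form, the parameter alpha, and the function name are assumptions, not taken from the slide.

```python
def propagate_term_frequency(freq, children, alpha=0.5):
    """Propagate the frequency of one term t from child pages to parents.

    freq     -- dict: page -> frequency of term t in that page, ft(p)
    children -- dict: page -> list of its child pages in the site tree
    alpha    -- assumed mixing parameter between own and child counts
    Returns a dict: page -> propagated frequency, f't(p).
    """
    new_freq = {}
    for page, f in freq.items():
        kids = children.get(page, [])
        if kids:
            child_avg = sum(freq.get(k, 0) for k in kids) / len(kids)
        else:
            child_avg = 0.0  # leaf pages receive nothing from below
        new_freq[page] = (1 + alpha) * f + (1 - alpha) * child_avg
    return new_freq


# Hypothetical site: a root page with two children.
freq = {"/": 1, "/a": 4, "/b": 2}
children = {"/": ["/a", "/b"]}
propagated = propagate_term_frequency(freq, children)
```

The propagated frequencies can then be fed to any content-based ranking function in place of the raw counts.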

Overall Framework for propagation
SS is the best.
ST and HT-WI are similar.

Combinational Ranking Algorithms Based on learning (Learning to Rank)

Combination Framework
Training Set (labels are relevance judgments or click orders):
q1: {(x11, 4), (x12, 3), …, (x1m, 0)}
q2: {(x21, 3), (x22, 2), …, (x2m, 1)}
…
qn: {(xn1, 4), (xn2, 3), …, (xnm, 2)}
Training Set -> Learning System -> Ranking Model g(x, w)
Test Set: (x1, ?), (x2, ?), …
Test Set + Ranking Model -> Ranking System -> (x1, g(x1, w)), (x2, g(x2, w)), (x3, g(x3, w)), …
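The framework can be illustrated with a small sketch. It assumes a linear ranking model g(x, w) = w · x trained by stochastic gradient descent on squared error, which is only one possible choice of learning system; the function names, learning rate, and toy data are assumptions for illustration.

```python
def g(x, w):
    """Ranking model: linear score of feature vector x under weights w."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train(training_set, n_features, lr=0.01, epochs=500):
    """Learning system: fit w so that g(x, w) approximates the labels.

    training_set -- dict: query id -> list of (feature vector, label)
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for docs in training_set.values():
            for x, label in docs:
                err = g(x, w) - label
                # Gradient step on the squared error of this document.
                w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def rank(test_docs, w):
    """Ranking system: order unlabeled documents by descending score."""
    return sorted(test_docs, key=lambda x: g(x, w), reverse=True)


# Toy data: two features (say, a content score and a connectivity score).
training_set = {
    "q1": [([1, 0], 2), ([0, 1], 1), ([1, 1], 3)],
    "q2": [([2, 0], 4), ([0, 2], 2)],
}
w = train(training_set, n_features=2)
ranking = rank([[0, 1], [1, 1], [1, 0]], w)
```

The learned weights play the role of the combination function f: they decide how much each feature contributes to the final rank.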

Three learning categories
Pointwise
Pairwise
Listwise
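The three categories differ in what the loss function looks at: single documents, document pairs, or the whole result list. The sketch below uses one common loss per category (squared error, a RankNet-style logistic pair loss, and a ListNet-style top-one cross entropy). These are chosen as illustrations and are not necessarily the methods covered in this handout.

```python
import math


def pointwise_loss(scores, labels):
    """Squared error between each score and its own label."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels))


def pairwise_loss(scores, labels):
    """Logistic loss over every pair where doc i is labeled above doc j."""
    loss = 0.0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                loss += math.log(1 + math.exp(-(scores[i] - scores[j])))
    return loss


def softmax(values):
    """Numerically stable softmax over a list of numbers."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(scores, labels):
    """Cross entropy between top-one distributions of labels and scores."""
    p_true = softmax(labels)
    p_pred = softmax(scores)
    return -sum(pt * math.log(pp) for pt, pp in zip(p_true, p_pred))


# One query with three documents, labeled from most to least relevant.
labels = [2, 1, 0]
good_scores = [3.0, 2.0, 1.0]  # same order as the labels
bad_scores = [1.0, 2.0, 3.0]   # reversed order
```

Under the pairwise and listwise losses, good_scores is penalized less than bad_scores even though neither matches the label values exactly, because only the relative order or distribution matters.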
