Enhancing Web Search by Promoting Multiple Search Engine Use
Ryen W. W., Matthew R., Mikhail B. (Microsoft Research)
Allison P. H. (Rice University)
SIGIR 2008
Presented by Jae-won Lee

Copyright 2008 by CEBT
IDS Lab. Seminar, Center for E-Business Technology

Introduction
- Users are generally loyal to one search engine, even though it may not satisfy their needs
- A given search engine performs well for some queries and poorly for others
- Excessive loyalty can hinder search effectiveness
Our Goal
- Support engine switching by recommending the most effective search engine for a given query

Related Work
- Meta search engines such as Clusty and Dogpile merge search results from several engines
- Switching search engines is more attractive than meta search:
  - Strong brand loyalty may discourage users from migrating to a meta search engine
  - A meta search engine eliminates the benefits of each engine's interface features
  - Meta search hurts source engine brand awareness
- The proposed approach lets users keep their default engine
- It suggests an alternative engine only if that engine performs better for the current query

Does Switching Help Users?
Potential Benefits of Switching
- We quantify the benefits of multiple engine use with two measures:
  - Normalized Discounted Cumulative Gain (NDCG): measures the topical relevance of results to a given query
  - Click-through rate for search results: a reasonable estimate of search result utility
- NDCG is a measure for evaluating Web search engine performance, where
  - $N_i$: a constant for normalization
  - $r(i)$: relevance score of the $i$-th result, from 0 (bad) to 5 (perfect)
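
The slide text defines the NDCG symbols but not the formula itself. A commonly used form that is consistent with those symbols is NDCG = N * sum over i of (2^r(i) - 1) / log(1 + i), with N chosen so that a perfect ranking scores 1. The sketch below assumes that form; the function names `dcg` and `ndcg` are illustrative, and the ideal ranking is taken from the same list of scores rather than from a separate pool of judged pages, which is a simplification.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain with gain 2^r - 1 and a log2(1 + rank) discount."""
    if k is not None:
        relevances = relevances[:k]
    return sum(
        (2 ** r - 1) / math.log2(1 + rank)
        for rank, r in enumerate(relevances, start=1)
    )


def ndcg(relevances, k=None):
    """Normalize DCG by the DCG of the best possible ordering of the same scores."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    # A result list with no relevant pages gets a score of 0 by convention here.
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance scores on the 0 (bad) to 5 (perfect) scale, in ranked order.
print(ndcg([5, 3, 0]))  # already ideally ordered
print(ndcg([0, 3, 5]))  # best result ranked last, so the score drops
```

Because the normalization divides one DCG by another, the base of the logarithm does not affect the final value.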

Does Switching Help Users?
Potential Benefits of Switching (cont'd)
- X, Y, and Z are anonymized labels for Google, Yahoo, and Live Search
- Number of queries for which each engine performs best:

  Search engine    Relevance (NDCG)    Result click-through rate
  X                952 (19.3%)         2,777 (56.4%)
  Y                1,136 (23.1%)       1,226 (24.9%)
  Z                789 (16.1%)         892 (18.1%)
  No difference    2,044 (41.5%)       26 (0.6%)

- The choice of engine for a particular query is important

Switching as Classification - Query Processing
[Diagram components: Query, Search Engines, Result sets, Feature Extractor, Features, Classifier (offline training), Recommend a Search Engine]

Classifier Features
- The classifier must recommend an engine in real time
- Features are derived from the result pages, the query, and the query-result match
Features
- Features from result pages
- Features from the query
- Features from the query-result page match
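
To make the three feature groups concrete, here is a minimal sketch of a feature extractor. The specific features (`query_chars`, `num_results`, `term_coverage`, and so on), the function name `extract_features`, and the result format (dicts with `title` and `snippet` keys) are all hypothetical choices for illustration; they are not the feature list used in the paper.

```python
def extract_features(query, results):
    """Compute illustrative features for one query and one engine's result page.

    `results` is assumed to be a list of dicts with 'title' and 'snippet' keys.
    """
    terms = query.lower().split()
    texts = [(r["title"] + " " + r["snippet"]).lower().split() for r in results]

    features = {
        # Features from the query
        "query_chars": len(query),
        "query_words": len(terms),
        # Features from the result page
        "num_results": len(results),
        "avg_snippet_chars": (
            sum(len(r["snippet"]) for r in results) / len(results) if results else 0.0
        ),
    }

    # Feature from the query-result match: fraction of query terms that
    # appear as a whole word in at least one title or snippet.
    if terms and texts:
        covered = sum(any(t in words for words in texts) for t in terms)
        features["term_coverage"] = covered / len(terms)
    else:
        features["term_coverage"] = 0.0
    return features


page = [{"title": "Rice University", "snippet": "Houston, Texas"}]
print(extract_features("rice university", page))
```

All of these can be computed from text that is already on the result page, which matters because the recommendation has to be made in real time.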

Classifier Features

Classifier
Notation
- $q$: query
- $R$: result page of the original engine
- $R'$: result page of the target engine
- $R^* = \{(d_1, s_1), \ldots, (d_k, s_k)\}$: human-judged result set, where $d_k$ is a result page and $s_k$ is its score
- Utility of each engine: $U(R) = \mathrm{NDCG}_{R^*}(R)$, $U(R') = \mathrm{NDCG}_{R^*}(R')$
Training
- Training set $D = \{(x, y)\}$, where each instance has:
  - $x = f(q, R, R')$: features derived from the query and the result pages
  - $y = 1$ iff $\mathrm{NDCG}_{R^*}(R') \ge \mathrm{NDCG}_{R^*}(R) + \text{margin}$
- Switching is recommended only if the target engine's utility is higher by at least the margin
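
The labeling rule for training instances can be sketched as follows. The names `switch_label` and `build_training_set`, the tuple layout of `instances`, and the `feature_fn` argument are assumptions made for this example; the utilities are taken as already-computed NDCG values.

```python
def switch_label(utility_current, utility_target, margin=0.0):
    """Return 1 if the target engine beats the current one by at least `margin`."""
    return 1 if utility_target >= utility_current + margin else 0


def build_training_set(instances, feature_fn, margin=0.0):
    """Turn judged queries into (x, y) pairs.

    Each instance is assumed to be a tuple
    (query, current_results, target_results, utility_current, utility_target).
    `feature_fn` is a hypothetical stand-in for f(q, R, R').
    """
    dataset = []
    for query, current, target, u_current, u_target in instances:
        x = feature_fn(query, current, target)
        y = switch_label(u_current, u_target, margin)
        dataset.append((x, y))
    return dataset


# Toy example: the feature function only records the two result-list lengths.
toy = [
    ("q1", ["a", "b"], ["c"], 0.50, 0.60),
    ("q2", ["a"], ["b", "c"], 0.50, 0.52),
]
print(build_training_set(toy, lambda q, r, r2: [len(r), len(r2)], margin=0.05))
```

A larger margin yields fewer positive labels, so the classifier learns to recommend a switch only when the expected gain is substantial.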

Experiments
- Evaluate the accuracy of switching support to determine its viability
Data Set
- From Google, Yahoo, and Live Search logs

  Total number of queries                                  17,111
  Total number of judged pages                             4,254,730
  Total number of judged pages labeled Fair or higher      1,378,011

Precision - Recall Results
- The proposed method can achieve high accuracy
- Therefore, the method can be used to provide useful search engine suggestions to users
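
For readers unfamiliar with how a precision-recall trade-off arises from a classifier, the sketch below computes both values at a given confidence threshold. The scores and labels are made-up toy values, not results from the paper, and the function name `precision_recall` is illustrative.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when a switch is recommended for scores >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    # Convention chosen here: precision is 1.0 when nothing is recommended.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy classifier confidences and true labels (1 = switching helps).
scores = [0.9, 0.8, 0.4, 0.2]
labels = [1, 0, 1, 0]
for threshold in (0.85, 0.3):
    print(threshold, precision_recall(scores, labels, threshold))
```

Raising the threshold makes recommendations rarer but more reliable, which is why precision is usually reported at a specific recall level.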

Avoiding Querying the Alternative Engine
- Querying the target engine to evaluate its utility is undesirable to some users because of the extra network traffic
- So, use only the features from the current engine's result pages

Contribution of Features
- All sets of features contribute to accuracy
- Features obtained from result pages seem to provide the most benefit

Conclusion
- Demonstrated the potential benefit of switching
- Described a method for automatically determining when to switch engines for a given query
- Evaluated the method and illustrated good performance, especially at usable recall

Pros. & Cons.
Pros.
- Proposes switching support as a new research area in IR
- Explains user behavior well
Cons.
- Equations are poorly explained
- No analysis of the experimental results