1 Date: 2012/9/13 Source: Yang Song, Dengyong Zhou, Li-wei He (WSDM'12) Advisor: Jia-ling Koh Speaker: Jiun-Jia Chiou Query Suggestion by Constructing Term-Transition Graphs
2 Outline Introduction User preference from log Query suggestion: Topic-based PageRank model (topic level), Learning to Rank model (term level) Experiment Conclusion
3 It is difficult for search engines to fully understand a user's search intent in many scenarios. Query suggestion techniques aim at recommending a list of queries relevant to the user's input. User query reformulation activities, mined from search engine logs: 1) Adding terms after the query, e.g., stanford university → stanford university location 2) Deleting terms within the query, e.g., stanford university address → stanford university 3) Modifying terms to new terms, e.g., stanford university map → stanford university location
4 Four user activities in a session: {q1, URL1, q2, URL2}. Example: q1 = stanford university → click URL1 → reformulate → q2 = stanford university location → click URL2 (satisfied).
5 How to extract high quality user clicks in the user session and use the clicks properly? It is critical to accurately infer user intent by examining the entire user search session. Given a specific query, which query suggestion method should be used? Can a query reduction method be applied to short queries? Is query expansion always better than query reformulation?
5 Step 1: Derive high quality user clicks by extracting specific user activity patterns which convey implicit user preferences, e.g., the tuple {q1, q2, url}. Step 2: Construct a term-transition graph model from the data (node: a term in the query; edge: a preference). Step 3: Find user preferences within each topic and choose the best suggestion method according to the models.
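A minimal sketch of Step 1, assuming a simple event-list session format (the event encoding and helper names below are illustrative, not from the paper): it emits a preference tuple {q1, q2, url} whenever a one-term reformulation is followed by a click.

```python
def one_term_change(q1, q2):
    """True if q2 adds, deletes, or modifies exactly one term of q1."""
    t1, t2 = q1.split(), q2.split()
    if len(t1) == len(t2):                        # modification
        return sum(a != b for a, b in zip(t1, t2)) == 1
    if abs(len(t1) - len(t2)) == 1:               # addition / deletion
        shorter, longer = sorted((t1, t2), key=len)
        return any(longer[:i] + longer[i + 1:] == shorter
                   for i in range(len(longer)))
    return False

def extract_preferences(session):
    """session: chronological list of ('Q', query) / ('C', url) events.
    Yields (q1, q2, clicked_url) when a one-term reformulation q1 -> q2
    is followed by a click, taken as an implicit preference signal."""
    prev_q = cur_q = None
    for kind, value in session:
        if kind == 'Q':
            prev_q, cur_q = cur_q, value
        elif kind == 'C' and prev_q and one_term_change(prev_q, cur_q):
            yield prev_q, cur_q, value

events = [('Q', 'stanford university'), ('C', 'URL1'),
          ('Q', 'stanford university location'), ('C', 'URL2')]
print(list(extract_preferences(events)))
```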
6 Outline Introduction User preference from log Query suggestion: Topic-based PageRank model (topic level), Learning to Rank model (term level) Experiment Conclusion
7 A typical log entry contains: user identification number, URL that the user visited, timestamp of the visit, and total dwell time on that URL. The log is organized into sessions, each containing a series of URL visits from a particular user, ordered by timestamp. Statistics (this paper focuses on mining user preferences): in ≥76% of refinements, users modify only one of the terms in the query; in ≥80%, users revise the last term in the query.
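A rough sketch of grouping such log entries into sessions. The entry fields follow the slide; the 30-minute inactivity gap is a common heuristic and an assumption here, not a value from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class LogEntry:
    user_id: str       # user identification number
    url: str           # URL that the user visited
    timestamp: float   # timestamp of the visit (seconds)
    dwell_time: float  # total dwell time on that URL (seconds)

def sessions(entries, gap=1800):
    """Group each user's visits into sessions ordered by timestamp,
    splitting when two consecutive visits are more than `gap` seconds apart."""
    by_user = defaultdict(list)
    for e in entries:
        by_user[e.user_id].append(e)
    for visits in by_user.values():
        visits.sort(key=lambda e: e.timestamp)
        session = [visits[0]]
        for prev, cur in zip(visits, visits[1:]):
            if cur.timestamp - prev.timestamp > gap:
                yield session
                session = []
            session.append(cur)
        yield session
```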
8 Three cases of user query refinements. Data: 3 months of Toolbar logs between May 2010 and August 2010. In total, over 4 million user refinement activities were discovered in the log, yielding a total of 350,000 pairs of refinements.
9 Outline Introduction User preference from log Query suggestion: Topic-based PageRank model (topic level), Learning to Rank model (term level) Experiment Conclusion
10 Term-transition graph. M matrix: Mij = 1/N(j) if there is a transition from node j to node i, with N(j) being the total number of outlinks of node j. Rank(t) vector: the rank values at iteration t, starting from an initial rank value.
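A minimal sketch of building the column-stochastic M from observed term transitions, directly implementing the Mij = 1/N(j) rule above (plain lists, no external libraries; variable names are illustrative):

```python
from collections import defaultdict

def transition_matrix(edges, nodes):
    """M[i][j] = 1/N(j) if there is a transition j -> i,
    with N(j) the total number of outlinks of node j (columns sum to 1)."""
    outlinks = defaultdict(set)
    for j, i in edges:                    # observed transition j -> i
        outlinks[j].add(i)
    idx = {t: k for k, t in enumerate(nodes)}
    M = [[0.0] * len(nodes) for _ in nodes]
    for j, targets in outlinks.items():
        for i in targets:
            M[idx[i]][idx[j]] = 1.0 / len(targets)
    return M
```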
11 Example graph over the 16 ODP top-level topics: Arts, Business, Computers, Games, Health, Home, Kids and Teens, News, Recreation, Reference, Regional, Science, Shopping, Society, Sports, World. The node computers has 6 outlinks (com, , home, house, login, page), so its p matrix entries are: p_computers,com = 1/6, p_computers, = 1/6, p_computers,home = 1/6, p_computers,house = 1/6, p_computers,login = 1/6, p_computers,page = 1/6.
12 For the node reference (same 16 ODP topics), which has 4 outlinks (com, , login, page), the p matrix entries are: p_reference,com = 1/4, p_reference, = 1/4, p_reference,home = 0, p_reference,house = 0, p_reference,login = 1/4, p_reference,page = 1/4.
13 Rank iteration with α = 0.5: Rank(t+1) = (1 − α) · M · Rank(t) + α · v (the slide shows the numeric form (1 − 0.5)x + 0.5x). The rank values of the terms com, home, house, login, page are traced from Rank(t) through Rank(t+1) to Rank(t+2).
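A sketch of the power iteration with α = 0.5; the uniform teleport vector v is an assumption, since the slide does not spell v out.

```python
def pagerank(M, alpha=0.5, iters=50):
    """Rank(t+1) = (1 - alpha) * M @ Rank(t) + alpha * v,
    starting from a uniform initial rank value (v assumed uniform)."""
    n = len(M)
    rank = [1.0 / n] * n
    for _ in range(iters):
        rank = [(1 - alpha) * sum(M[i][j] * rank[j] for j in range(n))
                + alpha / n
                for i in range(n)]
    return rank
```

Used with the transition_matrix sketch above, this yields the per-term rank values shown on the slide for {com, home, house, login, page}.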
14 Query Suggestion. Given a query qi with m terms {wi1, ..., wij, ..., wim}. Example: Query (q) = Stanford university map, New Query (q') = Stanford university location; wij = map, w'ij = location. Term refinement probability: P(wij → w'ij | q') = ∑k P(wij → w'ij | Tk) · P(Tk | q'), where P(Tk | q') is obtained by classifying only the top-10 URLs returned by a search engine into the ODP categories (T1: Regional, T2: Arts, T3: Computers, ..., T16). Example: P(wij → w'ij | T1) = 0.3 [PageRank value] with P(T1 | q') = 6/10, and P(wij → w'ij | T2) = 0.5 [PageRank value] with P(T2 | q') = 4/10, giving P(wij → w'ij | q') = 0.3 × 0.6 + 0.5 × 0.4 = 0.38.
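The example works out as a topic-weighted sum; a one-liner makes the arithmetic explicit (topic names follow the slide):

```python
def refinement_prob(p_by_topic, topic_weights):
    """P(w -> w' | q') = sum_k P(w -> w' | T_k) * P(T_k | q')."""
    return sum(p_by_topic[t] * topic_weights[t] for t in p_by_topic)

# Slide example: T1 = Regional, T2 = Arts
print(round(refinement_prob({'T1': 0.3, 'T2': 0.5},
                            {'T1': 0.6, 'T2': 0.4}), 2))  # 0.38
```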
16 The same reference node shown with 0/1 entries (1 for each existing outlink com, , login, page; cf. the normalized 1/4 values on slide 12): p_reference,com = 1, p_reference, = 1, p_reference,home = 0, p_reference,house = 0, p_reference,login = 1, p_reference,page = 1.
15 Example: Query (q) = Stanford university map. Candidate new queries (q'): Stanford university location, Stanford university address, Stanford college location, Harvard university location, Stanford school location.
P(q' | q) = P(map) × P(map → location | q) = 4/16 × 0.04 = 0.01
P(q' | q) = P(map) × P(map → address | q) = 4/16 × 0.02 = 0.005
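A tiny sketch of ranking the candidates by P(q' | q) = P(w) · P(w → w' | q), using the slide's numbers:

```python
def suggestion_score(p_term, p_refine):
    """P(q' | q) = P(w) * P(w -> w' | q) for a one-term change w -> w'."""
    return p_term * p_refine

# Slide example: P(map) = 4/16
candidates = {
    'stanford university location': suggestion_score(4 / 16, 0.04),  # 0.01
    'stanford university address':  suggestion_score(4 / 16, 0.02),  # 0.005
}
print(sorted(candidates.items(), key=lambda kv: -kv[1]))  # location ranks first
```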
16 §: Suggestions produced by using expansion get a very low score.
17 Outline Introduction User preference from log Query suggestion: Topic-based PageRank model (topic level), Learning to Rank model (term level) Experiment Conclusion
18 Given a set of queries with labeled relevance scores (usually on a 5-point scale) for each query pair, learning-to-rank tries to optimize the ranking loss over all queries in the training set. Three sets of features:
Query features: Is the query a Wikipedia title? ∈ {0, 1}; Is the query a Wikipedia category? ∈ {0, 1}; # of times the query is contained in a Wikipedia title ∈ R; # of times the query is contained in a Wikipedia body ∈ R.
Term features: PageRank score of the term; # of inlinks & outlinks; entropy of the term's PageRank scores over the 16 ODP topics: −∑i P(ti) log P(ti) (a sketch follows below); # of times the term is derived from the EMPTY node.
Query-term features: N-gram conditional probabilities p(wn | wn−m+1, ..., wn−1); Inverted Query Frequency: log N(ti)/N(ti|qj).
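As one concrete example from the term features, a sketch of the topic-entropy feature; normalizing the 16 per-topic PageRank scores into a distribution first is an assumption.

```python
import math

def topic_entropy(pagerank_scores):
    """-sum_i P(t_i) log P(t_i) over a term's PageRank scores in the
    16 ODP topics, after normalizing them to a probability distribution."""
    total = sum(pagerank_scores)
    probs = [s / total for s in pagerank_scores if s > 0]
    return -sum(p * math.log(p) for p in probs)
```

Intuitively, a term concentrated in one topic gets low entropy, while a generic term spread across many topics gets high entropy.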
19 Query-term features: N-gram conditional probabilities P(term | N-gram). Example: query = Stanford university map, term = map.
User log 1: {map, Stanford map, click URL1}
User log 2: {Harvard university map, Stanford university map, click URL2}
User log 3: {Stanford university map, Harvard university map, click URL3}
User log 4: {Stanford university, Harvard university, click URL4}
User log 5: {university map, Harvard university map, click URL5}
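A counting-based sketch of estimating P(term | n-gram) from logged queries (the query strings below are illustrative):

```python
def ngram_term_prob(queries, context, term):
    """Estimate P(term | context) as the fraction of occurrences of the
    n-gram `context` (a tuple of words) that are followed by `term`."""
    seen = followed = 0
    n = len(context)
    for q in queries:
        words = q.split()
        for k in range(len(words) - n):
            if tuple(words[k:k + n]) == context:
                seen += 1
                if words[k + n] == term:
                    followed += 1
    return followed / seen if seen else 0.0

logs = ['map', 'stanford map', 'harvard university map',
        'stanford university map', 'stanford university', 'university map']
print(ngram_term_prob(logs, ('stanford', 'university'), 'map'))  # 1.0 here
```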
20 Generate ranking labels for each group of query refinements; training labels are derived automatically from implicit user feedback in the log.
─ Training tuple: {q1, q2, click url}
─ Clicks and skips, with parameters (α, β)
─ Given a refined query q' for the original query q, estimate the probability of a click using a function of its clicks and skips (a sketch follows below)
─ Compare two query refinements q1 and q2, e.g., P(stanford admission > stanford map) = 0.61, P(stanford map > stanford history) = 0.08; threshold: 0.5
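The slides name the signals (clicks, skips, parameters α and β) but not the exact function. One plausible sketch is a Beta-smoothed click probability; both the smoothing form and the pairwise normalization below are assumptions, not the paper's formula.

```python
def click_prob(clicks, skips, alpha=1.0, beta=1.0):
    # Beta(alpha, beta)-smoothed chance that the refinement's results
    # are clicked rather than skipped (smoothing form is assumed).
    return (clicks + alpha) / (clicks + skips + alpha + beta)

def prefer(stats1, stats2, threshold=0.5):
    """Assumed pairwise label: P(q1 > q2) from normalizing the two click
    probabilities; q1 is labeled better when it exceeds the threshold."""
    p1, p2 = click_prob(*stats1), click_prob(*stats2)
    p = p1 / (p1 + p2)
    return p > threshold, p
```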
21 Training data: each query's terms carry a feature vector (Feature 1, F2, F3, ..., F30) and a ranking label, e.g., query1: term1 → 2, term2 → 1, term3 → 3; query2: ...; the model is built from this data. Test data: the same feature vectors without labels (query1: term1, term2, term3; query2: ...); the trained model predicts a ranking score for each term.
22 Given a new query q that contains k terms {t1, ..., tk}, create a candidate set of queries (q1, q2, q3, q4). Example: for query {t1, t2, t3}, the candidate set contains {t1, t2}, {t1, t3}, {t2, t3} and {t1, t2, t3}. Each candidate's terms get feature vectors (Feature 1, F2, F3, ..., F30) and a predicted ranking score; the highest scored terms are suggested for the query, e.g., {t1, t2} + term3 → {t1, t2, term3}.
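A sketch of the candidate generation step, reproducing the slide's example (subsets of at least two terms, including the full query):

```python
from itertools import combinations

def candidate_queries(terms):
    """All term subsets of size >= 2, e.g. for {t1, t2, t3}:
    {t1, t2}, {t1, t3}, {t2, t3} and {t1, t2, t3}."""
    return [c for r in range(2, len(terms) + 1)
            for c in combinations(terms, r)]

print(candidate_queries(['t1', 't2', 't3']))
```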
23 Outline Introduction User preference from log Query suggestion: Topic-based PageRank model (topic level), Learning to Rank model (term level) Experiment Conclusion
24 Experiments evaluate: 1) the performance of the two proposed models, and 2) calibration of the methods' parameters. A user study asks judges to evaluate the query suggestion performance of the methods. Metrics: 1) Normalized discounted cumulative gain (NDCG), computed over the top-5 suggested queries, with relevance scores (rel): Perfect (5), Excellent (4), Good (3), Fair (2) and Poor (1); 2) Zero/one-error.
25 Example for the query Stanford university:
Top i | Query suggestion | rel
1 | Query + Term1 | 5
2 | Query + Term2 | 3
3 | Query + Term3 | 4
4 | Query + Term4 | 3
5 | Query + Term5 | 2
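A sketch of NDCG on the example above; the (2^rel − 1)/log2(rank + 1) gain is the common convention and an assumption here, since the slide does not give the formula.

```python
import math

def ndcg_at_k(rels, k=5):
    """DCG with gain (2^rel - 1) / log2(rank + 1), normalized by the
    DCG of the ideal (descending-relevance) ordering."""
    def dcg(rs):
        return sum((2 ** r - 1) / math.log2(i + 2)
                   for i, r in enumerate(rs[:k]))
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal else 0.0

print(round(ndcg_at_k([5, 3, 4, 3, 2]), 3))  # top-5 rel scores from the table
```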
26 Zero/one-error compares the predicted ranking against the reference ranking of the top-5 query suggestions. Example: Train: Term1, Term2, Term3, Term4, Term5; Test: Term1, Term4, Term3, Term5, Term2; zero/one-error = 2/5 = 0.4.
28 Each suggestion is labeled by 3 different judges, and the majority vote is used as the final label: (1) Relevant, (2) Irrelevant, (3) No opinion (hard to judge). P(N) = (# of relevant suggestions in the top N)/N, counting 1 when the query at rank j is relevant and 0 otherwise.
Rank (N) | Query suggestion | Judges' vote | P(N)
1 | q1 | Rel | 1/1
2 | q2 | Irrel | 1/2
3 | q3 | Irrel | 1/3
4 | q4 | Rel | 2/4
5 | q5 | Rel | 3/5
29 Mean average precision (MAP) over K queries, where AverageP(q) averages P(N) over the relevant ranks of query q.
Query 1: rank 1 q11 Rel P = 1/1; rank 2 q12 Irrel P = 1/2; rank 3 q13 Irrel P = 1/3; rank 4 q14 Rel P = 2/4; rank 5 q15 Rel P = 3/5 → AverageP(q1) = (1/1 + 2/4 + 3/5)/3 = 0.7.
Query 2: rank 1 q21; rank 2 q22 P = 1/2; rank 3 q23 P = 2/3; rank 4 q24 P = 3/4; rank 5 q25 P = 4/5 → AverageP(q2) = 0.5.
... Query K. If K = 2, MAP = (0.7 + 0.5)/2 = 0.6.
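A sketch reproducing the Query 1 numbers; the labels are read off the P(N) column, and AverageP(q2) = 0.5 is taken from the slide as given.

```python
def average_precision(labels):
    """Mean of P(N) = (# relevant in top N) / N over the relevant ranks."""
    hits, precisions = 0, []
    for n, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            precisions.append(hits / n)
    return sum(precisions) / len(precisions) if precisions else 0.0

ap1 = average_precision([1, 0, 0, 1, 1])  # Rel, Irrel, Irrel, Rel, Rel -> 0.7
print(round((ap1 + 0.5) / 2, 3))          # MAP over K = 2 queries -> 0.6
```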
30 Query length buckets (in number of terms): short: ≤ 2, medium: 3 to 5, long: ≥ 6.
31 Conclusion
+ A novel query suggestion framework that extracts user preference data from user sessions in search engine logs.
+ The user patterns are used to build two suggestion models:
- topic-based PageRank model
- Learning to Rank model
+ A user study was conducted in which all queries were triple-judged by human judges; experimental results indicated significant improvement for both models.
+ Future work: the model only considers changing one term of a query; it remains open how performance would increase or decrease with a more sophisticated user preference extraction model that considers multi-term alterations.