1
A Study of Immediate Requery Behavior in Search
Haotian Zhang, Mustafa Abualsaud, and Mark D. Smucker
Presented by: Haotian Zhang
2
Introduction
Users begin processing a search engine result page (SERP) with the goal of making one of three decisions:
- Click a search result and view the page content.
- Without clicking on any results, abandon the query and reformulate it to get a new SERP. We define this choice as an immediate requery.
- Abandon the query and quit the search process.
Speaker notes: The focus of our study is understanding people's behavior in web search. When a user enters a query and is shown the search result page, they face one of three decisions. First, they find a document they are interested in reading further, and click the search result to view its content. Second, they look at the query results, click nothing, and abandon the query results but reformulate their query to get a new SERP that hopefully returns the documents they are looking for; one reason for this decision is the quality of the search page. We call this decision an immediate requery, meaning that their second action after getting the results is a reformulation. Third, they abandon the query results and finish the search process. Here, abandonment can be good abandonment (they found what they were looking for without any clicks), or it can be giving up.
3
Questions
- What is the probability of an immediate requery at different levels of search result quality?
- How much time does it take for users to decide to make an immediate requery at different levels of search result quality?
Speaker notes: The focus of our study is on immediate requeries. In particular, how long does it take users to make an immediate requery at different levels of search result quality? We want to know whether search result quality has an effect on the time to immediate requery, and what the probability of an immediate requery is at different search result qualities; that is, whether quality has an effect on issuing an immediate requery.
4
Summary of Findings
- The probability of an immediate requery increases as the search result quality decreases.
- Users decide quickly to make an immediate requery (median = 7.7 seconds), and the time appears to be independent of search result quality.
Speaker notes: Here is a quick summary of what we found. Users make the decision to immediately requery very quickly (median 7.7 seconds), and the time until making that decision appears, from our data, to be independent of SERP quality. The probability of an immediate requery, however, increases as SERP quality decreases: as the SERP becomes worse, users are more likely to issue an immediate requery. We also found there to be two classes of users: the majority focus on the top-ranked documents when deciding whether to immediately requery or not, while the other users seem to be exhaustive and spend more time looking at the results before making a decision.
5
Overview of Study Design
We asked study participants to search for answers to simple questions. We manipulated the search result quality by controlling the rank at which an answer to the question could be found. We measured the probability of an immediate requery and the time to requery.
Speaker notes: Here is how we designed our user study to investigate our questions. We created 12 tasks, each containing a single factoid question. Questions are meant to be simple yet unknown to most people. They should also have a single standard answer, so as not to confuse people, and it should be easy to find plausible non-relevant results. Since we manipulate the results, non-relevant documents should contain at least some relevant query terms. Example questions are: … On each task, we ask participants to use our search engine to find an answer to the question.
6
Study Design – Search Tasks
12 search tasks, with 1 factoid question in each task. We designed each question: to be easy for users to find a relevant document containing the answer, and to have an answer that is unknown to most people.
7
Study Design – Search Tasks
Example questions:
- How long is the Las Vegas monorail in miles? Answer: 3.9 miles.
- Which year was the first Earth Day held? Answer: 1970.
- What is the scientific name of Mad Cow Disease? Answer: Bovine Spongiform Encephalopathy.
8
Study Design – Search Interface
Similar to common commercial search engines. Displays 10 results per query, with no pagination. We log client-side timestamps of all behavioral actions (e.g., clicks, keystrokes).
Speaker notes: We designed our interface to look similar to that of commercial search engines. We restricted it to display only 10 results per query, with no pagination. We log user behavior actions such as keystrokes, clicks, and page loads; we do this to determine the time between any two actions.
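The timestamped action log described above can be reduced to times between consecutive actions. The sketch below is illustrative only; the function name and log format are assumptions, not the study's actual instrumentation.

```python
def time_between(events):
    """Given (timestamp_ms, action) pairs, return the elapsed milliseconds
    between each pair of consecutive actions, sorted by time."""
    events = sorted(events)
    return [(b_t - a_t, a_act, b_act)
            for (a_t, a_act), (b_t, b_act) in zip(events, events[1:])]

# Hypothetical client-side log: query submitted, then a click, then a page load.
log = [(0, "query"), (3100, "click_rank_1"), (9500, "page_load")]
deltas = time_between(log)
```

For example, the first delta above recovers the time from the query to the first click, which is the kind of interval the study measures.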
9
Study Design – SERP Manipulation
We crafted SERPs of different qualities prior to the study. Users need to enter any of the trigger query words to trigger the manipulated SERP; all further queries return Bing API results.
Question: How long is the Las Vegas Monorail in miles? Trigger query words: Las, Vegas, monorail.
Question: Which year was the first Earth Day held? Trigger query words: Earth, Day.
Speaker notes: All of the manipulated SERPs were created before the study. We show the manipulated SERP only once, after a user submits a triggering query.
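The trigger mechanism described above amounts to a simple dispatch: serve the manipulated SERP on the first query containing a trigger word, and live Bing results otherwise. This is a minimal sketch under that reading; the function name and return values are hypothetical.

```python
def serp_source(query, trigger_words, already_shown):
    """Decide which result list to serve. The manipulated SERP is served only
    on the first query containing any trigger word; all later queries fall
    through to the live (Bing API) results."""
    terms = set(query.lower().split())
    triggers = {w.lower() for w in trigger_words}
    if not already_shown and terms & triggers:
        return "manipulated", True  # mark the manipulated SERP as used
    return "bing", already_shown

triggers = ["Las", "Vegas", "monorail"]
src1, shown = serp_source("las vegas monorail length", triggers, False)
src2, shown = serp_source("monorail length in miles", triggers, shown)
```

Here the first query hits the manipulated SERP and every later query, even one that repeats a trigger word, gets Bing results, matching the "only show the manipulated SERP once" rule.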
10
Study Design - Quality of Search Results
One Good: 1 relevant result containing the correct answer and 9 non-relevant results; the rank of the relevant result varied from 1 to 10. All Bad: all 10 results are non-relevant. Control: unmodified search results from the Microsoft Bing Search API.
Speaker notes: With 10 results per page and simple binary relevance, there are many possible ways to construct SERPs of different qualities. To make things simpler, we focus on only three types. In the first, we put 1 relevant document containing the answer in the list along with 9 non-relevant documents; the non-relevant documents are plausible non-relevant results. By doing this we assume that the lower the relevant document is ranked, the lower the quality. The second type has all 10 results non-relevant. The last type is our control, which is simply unmodified results from a commercial search engine API. The picture shows an example.
11
Example Manipulated Search Results
(Image: an example of a manipulated search result page.)
12
Key Study Details
Balanced design. 60 participants. 12 questions, each given a different SERP quality: 10 One Good treatments at ranks 1 to 10, 1 All Bad treatment, and 1 Control with Bing results.
Measures: probability of immediate requery; time from query to immediate requery.
Speaker notes: Users complete the 12 tasks in a balanced order; across participants, each question appears in each condition to eliminate biases. There are 12 types of search result quality: 10 with the relevant document at ranks 1-10, 1 with all 10 documents non-relevant, and 1 control. Users complete a consent form and a demographics questionnaire, and they do a practice task before starting the study.
13
Results – Probability of Immediate Requery
As search quality decreases, the probability of an immediate requery increases.
Speaker notes: As the rank of the only correct document in the SERP goes from 1 (top of the page) to 10 (bottom of the page), the probability of a requery increases.
14
Results – Probability of Immediate Requery
The probability of an immediate requery differs significantly between the condition where the correct result is at rank 1 and the condition where all results are non-relevant.
15
Results – Probability of Immediate Requery
The Bing SERP is effectively the same as placing a correct result at rank 1, i.e., Bing's rank-1 result is likely correct.
16
Results – Time to Immediate Requery
The median time to decide to make an immediate requery is 7.7 seconds. The time appears to be independent of search result quality.
Speaker notes: Time to requery means the user did not click. Thus, users who rarely requery do not have much impact on this data. For the users who do requery, we think they are examining ranks 1-4 and then requerying; the time to requery is therefore independent of rank, because those who requery do the same thing each time (examine the top ranks).
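The two measures reported in the results, probability of an immediate requery per condition and median time to requery, can be computed from per-session records like so. This is a sketch of the measurement, not the study's code; the record layout is an assumption.

```python
from statistics import median

def requery_stats(sessions):
    """sessions: (condition, requeried: bool, seconds_to_first_action) tuples.
    Returns condition -> (probability of immediate requery,
    median seconds to requery among requerying sessions, or None)."""
    by_cond = {}
    for cond, requeried, secs in sessions:
        by_cond.setdefault(cond, []).append((requeried, secs))
    out = {}
    for cond, rows in by_cond.items():
        times = [s for r, s in rows if r]
        out[cond] = (len(times) / len(rows), median(times) if times else None)
    return out

# Hypothetical sessions: at rank 1 users mostly click; with all-bad SERPs they requery.
sessions = [("rank1", False, 3.1), ("rank1", True, 7.0),
            ("all_bad", True, 7.5), ("all_bad", True, 8.0)]
stats = requery_stats(sessions)
```

In this toy input the all-bad condition has requery probability 1.0 while rank 1 has 0.5, mirroring the reported direction of the effect.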
17
Other Results and Discussion
For queries that did not result in an immediate requery, how long does it take from the query to the first click on a search result? Do all users have the same propensity to immediately requery?
18
Results – Time to the First Search Result Click
19
Results – Time to the First Search Result Click
There is a linear increase in time to click from rank 1 to rank 4. The median time to click at rank 1 is 3.1 seconds.
20
Results – Time to the First Search Result Click
The times to click documents at ranks 5-7 follow a different pattern.
Speaker notes: Time to click means people clicked. Since a number of users requery rather than click at ranks 5-7, we have different sets of people behind each of these ranks, which makes comparison tricky. We could make a plot using only users who clicked at ranks 5-7 and see what the data looks like then, but we haven't done that yet. I'm not sure why the error bars are not larger for ranks 5-7, given that there are fewer people behind the data.
21
Results – Time to the First Search Result Click
Participants appear to scan up from rank 10 to rank 8.
22
Existing SERP Behavior Research
Several studies have found two types of user behavior when examining SERPs. In the language of Aula et al. (2005): economic users scan at most the first three results before acting (cf. the depth-first users of Klöckner et al., 2004), while exhaustive users examine more than half of the visible summaries, sometimes even scrolling to see the remaining summaries, before acting. Eye-tracking (Lorigo et al., 2008) and mouse-tracking (Huang et al., 2011) studies find that users focus on the top 3-4 results before deciding to requery. Cutrell and Guan (2007) found users view the first 8 results before requerying.
Speaker notes: With this work in mind, we look further into our results.
23
Results – Two Classes of Users
There appear to be two groups of users:
- Low rate of immediate requery (≤ 3 immediate requeries in total): 12 users.
- High rate of immediate requery (≥ 4 immediate requeries in total): 48 users.
Speaker notes: The figure shows the median time to answer a question for the low and high groups of users across the 12 search conditions. While the data is noisy because of the limited size of the low group, we see that for the control condition, and when the relevant document is at ranks 1-4 and 8-10, the low participants take longer than the high group. We also see that for the mid-ranks of 5-7, the low users have slightly faster times to answer than the high group. For comparison, Table 4 reports the median time to answer for all participants. What seems to be happening is that the low group wastes time looking at more results at ranks 1-4 than is necessary to select the relevant document. When the relevant document is at ranks 5-7, the group of participants with a high probability of immediately requerying has apparently stopped scanning at rank 3 or 4 and immediately requeried. Meanwhile, the low group, which exhaustively scans the results, finds the relevant document at ranks 5-7 without needing to incur the cost of an immediate requery.
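The split into low and high groups above is a simple threshold on each user's total number of immediate requeries. A minimal sketch, assuming a mapping from user id to requery count (the names are illustrative):

```python
def split_users(requery_counts, threshold=3):
    """Partition users by total immediate requeries: 'low' if the count is
    <= threshold (the paper's cutoff of 3), otherwise 'high'."""
    low = sorted(u for u, c in requery_counts.items() if c <= threshold)
    high = sorted(u for u, c in requery_counts.items() if c > threshold)
    return low, high

# Hypothetical per-user totals over the 12 tasks.
counts = {"u1": 0, "u2": 3, "u3": 4, "u4": 9}
low, high = split_users(counts)
```

With the study's 60 participants, this yields groups of 12 (low) and 48 (high).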
24
Results – Two Classes of Users
(Figure: median time to answer a question for the low and high groups across the 12 search conditions.)
25
Results – Time From Query to Answer
High users took 86 seconds to answer and low users took 112 seconds.
Speaker notes: While the data is noisy, we see that the low group seems to find the relevant document at ranks 5-7 without needing to incur the cost of an immediate requery, and that high users are able to keep the time to answer nearly uniform for ranks 5-10 and "All Bad". If users try to optimize their behavior to find an answer quickly, high users seem to be economic and low users seem to be exhaustive.
26
Results – Time From Query to Answer
High users are able to keep the time to answer nearly uniform for ranks 5-10 and "All Bad".
27
Results – Time From Query to Answer
Possible explanation: high users scan ranks 1-4 and then requery, which gets them the good Bing results.
28
Conclusion
- As search result quality decreases, the probability of immediately requerying increases.
- Users quickly decide to immediately reformulate.
- There appear to be two types of users: those with a high probability of immediately reformulating, and those unlikely to immediately reformulate unless no relevant documents can be found.
- While requerying takes time, it is the group of users who are more likely to immediately requery who are able to find answers to questions the fastest.