Query Expansion Using Gaze-Based Feedback on the Subdocument Level
Georg Buscher, Andreas Dengel, Ludger van Elst
German Research Center for AI (DFKI), Knowledge Management Department, Kaiserslautern, Germany
SIGIR '08

Slide 2: Outline
1. Motivation
2. Reading detection and document annotation technique
3. Implicit feedback methods
4. Study design
5. Results

Slide 4: Background and Motivation
- Relevance feedback à la Rocchio is well understood.
- Feedback is mostly applied to entire documents.
- Precision presumably improves when feedback is acquired on the subdocument level.
- Drawbacks of such fine-grained feedback:
  - Too much cognitive load for explicit feedback
  - Too little implicit feedback data from explicit interactions (e.g. highlighting)
- Use eye gaze as a source of implicit feedback on the subdocument level.
[Figure: relevance feedback on the document level vs. relevance feedback on the subdocument level]

Slide 6: Eye Tracking
- Unobtrusive
- Relatively precise (accuracy: 1° of visual angle)
- Expensive
- Mostly used as a passive tool for behavior analysis, e.g. visualized by heatmaps
- We use eye tracking for immediate implicit feedback, taking temporal fixation patterns into account

Slide 7: Reading Detection
1. Starting point: noisy gaze data from the eye tracker
2. Fixation detection and saccade classification
3. Reading (red) and skimming (yellow) detection, line by line
See G. Buscher, A. Dengel, L. van Elst: Eye Movements as Implicit Relevance Feedback, in CHI '08

Slide 8: Gaze-Based Document Meta Data
4. Line matching by applying optical character recognition
5. Store reading information as document annotations in a semantic wiki
See G. Buscher, A. Dengel, L. van Elst, F. Mittag: Generating and Using Gaze-Based Document Annotations, in CHI '08

Slide 10: Implicit Relevance Feedback for Query Expansion
- Input: documents viewed with one specific task in mind
- Find the terms that best describe the user's current interest
- Use these terms for query expansion
[Figure labels: task / information need; context; terms describing the user's current interest / context]

Slide 11: Three Implicit Feedback Methods to Evaluate
Input: viewed documents
- Gaze-Filter: TF x IDF, based on read or skimmed passages
- Gaze-Length-Filter: Interest(t) x TF x IDF, based on length of coherently read text
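
The Gaze-Filter idea can be illustrated with a minimal sketch: term frequency is counted only over passages that were read or skimmed, then weighted by IDF. The tokenizer, the function name `gaze_filter_scores`, the logarithmic IDF form, and the sample document frequencies are all assumptions made for illustration; the slides do not specify them.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real term extraction."""
    tokens = (w.strip(".,;:!?()").lower() for w in text.split())
    return [t for t in tokens if t]

def gaze_filter_scores(viewed_passages, doc_freq, n_docs):
    """Score terms by TF x IDF, with TF counted only over read or skimmed text.

    viewed_passages: strings the reader read or skimmed (hypothetical input).
    doc_freq: term -> number of collection documents containing the term.
    n_docs: size of the collection used for IDF.
    """
    tf = Counter(tok for p in viewed_passages for tok in tokenize(p))
    scores = {}
    for term, freq in tf.items():
        df = doc_freq.get(term, 1)  # assumption: unknown terms get df = 1
        scores[term] = freq * math.log(n_docs / df)
    return scores

# Hypothetical data: two viewed passages and a few document frequencies.
passages = ["Cats hear high frequencies.", "Cats see well in dim light."]
doc_freq = {"cats": 10, "hear": 50, "light": 200}
scores = gaze_filter_scores(passages, doc_freq, n_docs=1000)
top_terms = sorted(scores, key=scores.get, reverse=True)[:3]
```

The highest-scoring terms would then be appended to the user's query. The baseline differs only in its input: it would count TF over the entire opened documents instead of the viewed passages.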

Slide 12: Gaze-Length-Filter
\[ \mathrm{Interest}(t) = \frac{\#\ \text{long read or skimmed passages containing } t}{\#\ \text{all read or skimmed passages containing } t} \]
- Long passages are passages containing at least 230 characters (i.e. longer than the next two lines of text on the original slide).
- The heuristic assumes that shorter text parts only rarely convey sophisticated concepts to the reader.
- It further assumes that readers are generally not very interested in the contents of short read or skimmed text parts.
- Therefore all terms contained in short read or skimmed text parts get a lower interest value.
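
As a small sketch of the Interest(t) ratio, the function below counts the read or skimmed passages that contain a term and returns the fraction of them that reach the 230-character threshold. The function name, the whitespace-based term matching, and the choice to return 0.0 for terms that occur in no passage are assumptions for illustration.

```python
LONG_PASSAGE_MIN_CHARS = 230  # threshold stated on the slide

def interest(term, passages, min_chars=LONG_PASSAGE_MIN_CHARS):
    """Fraction of read or skimmed passages containing `term` that are long.

    passages: list of strings, each one a coherently read or skimmed passage.
    """
    containing = [p for p in passages if term in p.lower().split()]
    if not containing:
        return 0.0  # assumption: a term that was never viewed gets no interest
    long_ones = [p for p in containing if len(p) >= min_chars]
    return len(long_ones) / len(containing)

# Hypothetical passages: one long (280 characters) and two short ones.
example_passages = ["vision " * 40, "vision in cats", "whiskers sense touch"]
interest_vision = interest("vision", example_passages)      # 1 long of 2 -> 0.5
interest_whiskers = interest("whiskers", example_passages)  # 0 long of 1 -> 0.0
```

Multiplying this value with TF x IDF gives the Gaze-Length-Filter weight, so terms seen only in short passages drop out of the expansion.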

Slide 13: Three Implicit Feedback Methods to Evaluate
Input: viewed documents
- Gaze-Filter: TF x IDF, based on read or skimmed passages
- Gaze-Length-Filter: Interest(t) x TF x IDF, based on length of coherently read text
- Reading Speed: ReadingScore(t) x TF x IDF, based on read vs. skimmed passages containing term t

Slide 14: Reading Speed
\[ \mathrm{ReadingScore}(t) = \frac{1}{|P_t|} \sum_{p \in P_t} r(p) \]
- P_t is the set of all read or skimmed passages containing term t.
- The heuristic assumes that more thoroughly read text parts (and therefore their terms) are more likely to be of interest to the user than cursorily viewed parts.
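
The averaging in ReadingScore(t) can be sketched as follows. The slide does not define r(p), so this example assumes that each passage carries a value in [0, 1] that is higher for thoroughly read passages and lower for skimmed ones; that representation, the function name, and the sample data are hypothetical.

```python
def reading_score(term, passages):
    """Mean of r(p) over all read or skimmed passages p that contain `term`.

    passages: list of (text, r) pairs, where r in [0, 1] is an assumed
    per-passage reading value (1.0 = thoroughly read, 0.0 = only skimmed).
    """
    values = [r for text, r in passages if term in text.lower().split()]
    if not values:
        return 0.0  # assumption: terms outside all viewed passages score 0
    return sum(values) / len(values)

# Hypothetical viewed passages with assumed r(p) values.
viewed = [
    ("cats see in dim light", 1.0),
    ("cats and dogs", 0.0),
    ("dogs smell well", 0.5),
]
score_cats = reading_score("cats", viewed)  # (1.0 + 0.0) / 2
score_dogs = reading_score("dogs", viewed)  # (0.0 + 0.5) / 2
```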

Slide 15: Three Implicit Feedback Methods to Evaluate
Input: viewed documents
- Baseline: TF x IDF, based on opened entire documents
- Gaze-Filter: TF x IDF, based on read or skimmed passages
- Gaze-Length-Filter: Interest(t) x TF x IDF, based on length of coherently read text
- Reading Speed: ReadingScore(t) x TF x IDF, based on read vs. skimmed passages containing term t

Slide 17: Study Design
1. Informational task given
  - 2 different tasks
  - Task description in simulated [...]
  - Participants had to imagine being journalists
2. Read pre-selected documents
  - 4 attachments
  - Document structure carefully chosen
3. Search for more information on Wikipedia
  - 3 different queries: main topic, sub-topic, related topic
4. Give relevance feedback for the first 20 result entries per query
[Figure: read about topic in [...]; look through 4 attachments to get started with the topic; find more information by querying search engine; give explicit relevance feedback; repetition markers 3x and 2x]

Slide 18: Task Example
- Topic: perceptual organs of animals
- Pre-selected documents: 4 Wikipedia articles about cats, sharks, dogs, bats
  - The articles described all facets of the species.
  - Each article contained several paragraphs dealing with perception-related issues.
- 3 different queries
  - Main topic query: more material about perception
  - Sub-topic query: more material about visual perception
  - Related-topic query: perceptual organs for the Earth's magnetic field

Slide 19: Result List Generation
1. Create basic result list
2. Create expanded queries (+ top 50 terms)
3. Re-rank that list for every query expansion variant
4. Merge the re-ranked result lists in a balanced, ordered way
5. Present merged list to the participant
[Figure labels: user; user query; viewed documents; result list; variations Baseline, Gaze-Filter, Gaze-Length-Filter, Reading-Speed; expanded queries 1 to 4; re-ranked lists 1 to 4; merged result list]
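
One plausible reading of "merge in a balanced, ordered way" is a round-robin interleaving that takes one entry per variant at each rank and skips duplicates. The sketch below assumes that reading; the slides do not say which merging scheme was actually used, and the function name and document identifiers are hypothetical.

```python
def merge_balanced(ranked_lists, limit=20):
    """Interleave several rankings rank by rank, dropping duplicates.

    ranked_lists: one ranking (list of document ids) per expansion variant.
    limit: number of merged entries to present (20 were rated in the study).
    """
    merged, seen = [], set()
    depth = max((len(r) for r in ranked_lists), default=0)
    for rank in range(depth):
        for ranking in ranked_lists:
            if rank < len(ranking) and ranking[rank] not in seen:
                seen.add(ranking[rank])
                merged.append(ranking[rank])
                if len(merged) == limit:
                    return merged
    return merged

# Hypothetical re-ranked lists from three variants.
variant_lists = [["a", "b", "c"], ["b", "a", "d"], ["c", "e", "a"]]
merged_list = merge_balanced(variant_lists, limit=5)
```

A fixed visiting order slightly favors the first variant at every rank; rotating or shuffling the order per rank would be one way to reduce that bias.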

Slide 21: Overview
- 21 participants
- [...] minutes per participant
- 111 issued user queries
- 2220 explicit relevance ratings
[Figure: distribution of the relevance ratings]

Slide 22: Precision and Discounted Cumulative Gain (DCG)
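
For readers unfamiliar with the two measures, here is a minimal sketch of precision at rank k and DCG at rank k. Several DCG variants exist and the slide does not say which one was used; this example assumes numeric graded ratings and the common log2(rank + 1) discount. The function names and the sample ratings are illustrative only.

```python
import math

def precision_at_k(gains, k):
    """Share of the top-k results with a rating above zero."""
    return sum(1 for g in gains[:k] if g > 0) / k

def dcg_at_k(gains, k):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains[:k], start=1))

# Hypothetical graded ratings for the first four results of one query.
ratings = [3, 2, 0, 1]
p_at_4 = precision_at_k(ratings, 4)
dcg_4 = dcg_at_k(ratings, 4)
```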

Slide 23: Mean Average Precision
- Strong improvement of all gaze-based variants over the baseline
- Reading-Speed variant is less effective than GF and GLF
- GLF might be slightly better than GF?
Significance levels: ** p < 0.01; * p < 0.05; (*) p < 0.1 (two-tailed paired t-test)
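
The following sketch shows how mean average precision and the statistic of a paired t-test could be computed from per-query results. It assumes binary relevance flags per ranked result and normalizes average precision by the number of relevant results in the rated list, since the total number of relevant pages is not given in the slides. All names and numbers are hypothetical and do not reproduce the study's data.

```python
import math
from statistics import mean, stdev

def average_precision(relevant_flags):
    """Average of precision values at the ranks of the relevant results."""
    hits, total = 0, 0.0
    for rank, is_relevant in enumerate(relevant_flags, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(runs):
    """Mean of average precision over all queries."""
    return mean(average_precision(flags) for flags in runs)

def paired_t_statistic(a, b):
    """t statistic for paired samples; requires non-identical differences."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical per-query relevance flags and per-query AP values of two variants.
map_value = mean_average_precision([[True, False, True], [False, True]])
t_value = paired_t_statistic([0.6, 0.7, 0.8], [0.5, 0.5, 0.5])
```

Turning the t statistic into a two-tailed p-value needs the t distribution with n - 1 degrees of freedom; a library routine such as `scipy.stats.ttest_rel` returns both values directly.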

Slide 24: Query Type Differentiation
- Generally similar trend within each query type
- MAP consistently decreases from main-topic to sub-topic to related-topic queries
  - Narrow information needs, especially for related-topic queries
  - Wikipedia did not contain many relevant pages
- MAP of the Baseline decreases much more (-0.25) than that of GF (-0.17) and GLF (-0.18)
Legend: B: Baseline; GF: Gaze-Filter; GLF: Gaze-Length-Filter; RS: Reading-Speed. Asterisks mark significance of improvement over the baseline.

Slide 25: Inappropriate Context
- The baseline method extracts terms that might be far away from the user's current topic of interest.
- Expanding the query with these terms can lead in a wrong direction that the user cannot predict.
- The more distant the topic of the user's next query is (i.e. related-topic query), the more negative the effect of unsuitable expansion terms.
[Figure labels: pages about animal species; animal perception; parts of animal perception (e.g. only visual and auditory perception); Gaze-based methods; animal species; Baseline method]

Slide 26: Conclusion
- Gaze data can effectively be analyzed and used as a source of implicit feedback
- Reading behavior detection on its own provides useful information for query expansion and re-ranking
- Precision can be improved just by adding terms to a query that have been read before
Future Work
- More realistic web search scenarios (e.g. not only on Wikipedia)
- More sophisticated heuristics for interpreting gaze-based feedback
- Gaze also for long-term implicit feedback (e.g. desktop search)

Slide 27: Interested?
- Interested in implicit feedback for personalization?
  - E.g. scrolling behavior, click-through, mouse movements, eye tracking, EEG, bio sensors, emotions, magic, ...
- Please let me know!
  - Workshop?

Slide 28: Thank you for your attention!
Special thanks for the travel grants by
- ACM SIGIR
- Amit Singhal, made in honor of Donald B. Crouch
- Microsoft Research, made in honor of Karen Spärck Jones