Slides
Please download the slides from ...
Interactions
LBSC 796/CMSC 838o
Daqing He, Douglas W. Oard
Session 5, March 8, 2004
Agenda
- Interactions in retrieval systems
- Query formulation
- Selection
- Examination
- Document delivery
System-Oriented Retrieval Model
[Diagram: Query -> Search -> Ranked List; Collection -> Acquisition -> Indexing -> Index]
Whose Process Is It?
- Who initiates a search process?
- Who controls the progress?
- Who ends a search process?
User-Oriented Retrieval Model
[Diagram: Source Selection -> Query Formulation -> IR System (Query -> Search -> Ranked List; Collection -> Acquisition -> Indexing -> Index) -> Document Selection -> Document Examination -> Document Delivery]
Taylor’s Conceptual Framework
Four levels of “information needs”:
- Visceral: what you really want to know
- Conscious: what you recognize that you want to know
- Formalized (e.g., TREC topics): how you articulate what you want to know
- Compromised (e.g., TREC queries): how you express what you want to know to a system
[Taylor 68]
Belkin’s ASK Model
- Users are concerned with a problem, but do not clearly understand:
  - the problem itself
  - the information need to solve the problem
- This is an Anomalous State of Knowledge
- A clarification process is needed to form a query
[Belkin 80; Belkin, Oddy & Brooks 82]
What are humans good at?
- Sense low-level stimuli
- Recognize patterns
- Reason inductively
- Communicate over multiple channels
- Apply multiple strategies
- Adapt to changes or unexpected events
“Fuzzy and hard things” (from Ben Shneiderman’s Designing the User Interface)
What are computers good at?
- Sense stimuli outside the human range
- Calculate quickly and mechanically
- Store large quantities of data and recall it accurately
- Respond rapidly and consistently
- Perform repetitive actions reliably
- Maintain performance under heavy load and over extended periods
“Simple and sharply defined things” (again paraphrasing George Miller; from Ben Shneiderman’s Designing the User Interface)
What should Interaction be?
Synergistic:
- Humans do the things that humans are good at
- Computers do the things that computers are good at
- The strength of one covers the weakness of the other
Source Selection
- People have their own preferences
- Different tasks require different sources
- Possible choices:
  - ask for help from people or from machines
  - browsing, search, or a combination
  - general-purpose vs. domain-specific IR systems
  - different collections
Query Formulation
[Diagram: User -> Query Formulation -> Query -> Search; Collection -> Indexing]
User’s Goals
- Identify the right query for the current need (conscious/formalized need => compromised need)
- How can the user achieve this goal?
  - Infer the right query terms
  - Infer the right composition of terms
System’s Goals
- Help the user build links between needs and queries
- Help the user know more about the system and the collection
How does the System Achieve Its Goals?
- Ask more from the user:
  - Encourage long/complex queries
  - Provide a large text entry area
  - Use form filling or direct manipulation
- Initiate interactions:
  - Ask questions related to the needs
  - Engage in a dialogue with the user
- Infer from relevant items:
  - Infer from previous queries
  - Infer from previously retrieved documents
Query Formulation Interaction Styles
[Shneiderman 97]:
- Command language
- Form fill-in
- Menu selection
- Direct manipulation
- Natural language
Credit: Marti Hearst
Form-Based Query Specification (Melvyl)
Credit: Marti Hearst
Form-Based Query Specification (Infoseek)
Credit: Marti Hearst
Direct Manipulation Query Specification: VQUERY (Jones 98)
Credit: Marti Hearst
High-Accuracy Retrieval of Documents (HARD)
- New track in TREC, studying the interaction between a user and a system
  - Only one chance to interact with the user
  - Query formulation is still the system’s task
- An extended batch IR model:
  - Acknowledges that queries are not equal to needs
  - Allows asking the user a set of clarification questions
- Designed for controlled evaluation:
  - Clarification questions are generated in batch mode, and only generated once
  - A ranked list is the outcome of the search
- Reasons for participation:
  - The human factor in the IR process
  - Controlled evaluation is hard in a fully interactive IR experiment
[Diagram: Topic Statement -> Search Engine -> Baseline Results; Clarification Questions -> Answers to Clarification Questions -> HARD Results]
UMD HARD 2003 Retrieval Model
Clarification questions feed the HARD retrieval process:
- Preference among subtopic areas: query expansion
- Recently viewed relevant documents: document reranking, producing a refined ranked list
- Preference for sub-collections or genres
- Desired result formats: passage retrieval and ranked-list merging
[He & Demner, 2003]
Dialogues in Need Negotiation
Consider the retrieval process in a library setting:
1. A person with an information need formulates a query
2. They negotiate the need with an experienced human intermediary who knows a lot about the library’s document collection
3. The intermediary looks through the collection and finds the documents that match the query
[Diagram: Information Need -> 1. Formulate a Query -> 2. Need Negotiation -> Search Engine -> 3. Find Documents Matching the Query -> Search Results; Document Collection]
Personalization through the User’s Search Contexts
One way to achieve personalization is to include the user’s search contexts in the retrieval process. For example, an incremental learner can maintain contexts such as “Casablanca,” “African Queen,” and “Romantic Films,” and supply the relevant context (here, Romantic Films) to the information retrieval system.
[Goker & He, 2000]
Things That Hurt
- Obscure ranking methods
  - Unpredictable effects of adding or deleting terms
  - Only single-term queries avoid this problem
- Counterintuitive statistics
  - “clis”: AltaVista says 3,882 docs match the query
  - “clis library”: ...,025 docs match the query!
  - Every document with either term was counted, as the sketch below illustrates
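The counterintuitive count arises when a multi-word query is matched with OR semantics: any document containing either term counts as a hit, so adding a term can only grow the total. A minimal sketch (the toy collection is invented for illustration):

```python
# Toy illustration of OR-matching: adding a query term can only
# increase the number of "matching" documents.
docs = [
    "the clis program at maryland",
    "a public library in college park",
    "library science coursework",
    "weather report for tuesday",
]

def matches_any(doc, terms):
    """OR semantics: a document matches if it contains ANY query term."""
    words = set(doc.split())
    return any(t in words for t in terms)

print(sum(matches_any(d, ["clis"]) for d in docs))             # 1
print(sum(matches_any(d, ["clis", "library"]) for d in docs))  # 3: a superset of the hits above
```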
Browsing Retrieved Set
[Diagram: User -> Query Formulation -> Query -> Search -> Ranked List -> Document Selection -> Document -> Document Examination, with Query Reformulation and Document Reselection feedback loops]
Indicative vs. Informative
- Terms often applied to document abstracts:
  - Indicative abstracts support selection: they describe the contents of a document
  - Informative abstracts support understanding: they summarize the contents of a document
- The distinction applies to any information presentation, which may be offered for indicative or informative purposes
User’s Browsing Goals
- Identify documents for some form of delivery (an indicative purpose)
- Query enrichment:
  - Relevance feedback (indicative): the user designates “more like this” documents, and the system adds terms from those documents to the query (see the sketch below)
  - Manual reformulation (informative): a better approximation of the visceral information need
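A common way to implement the “more like this” step is Rocchio-style expansion: add weighted term frequencies from the designated documents to the query vector. A minimal sketch, assuming whitespace tokenization; the weights and example documents are invented:

```python
from collections import Counter

def rocchio_expand(query_terms, relevant_docs, alpha=1.0, beta=0.75, top_k=5):
    """Add the top_k most frequent terms from user-designated
    'more like this' documents to the original query vector."""
    q = Counter({t: alpha for t in query_terms})
    centroid = Counter()
    for doc in relevant_docs:
        centroid.update(doc.lower().split())
    n = max(len(relevant_docs), 1)
    for term, freq in centroid.most_common(top_k):
        q[term] += beta * freq / n   # average weight across feedback docs
    return q

fed_back = ["star field display of movie ratings", "movie ratings by year"]
print(rocchio_expand(["movie", "visualization"], fed_back))
```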
System’s Goals
Assist the user to:
- Identify relevant documents
- Identify potentially useful terms, both for clarifying the right information need and for generating better queries
Browsing Retrieved Set
[Diagram: User -> Query Formulation -> Query -> Search -> Ranked List -> Document Selection -> Document -> Document Examination, with Query Reformulation and Document Reselection feedback loops]
A Selection Interface Taxonomy
- One-dimensional lists
  - Content: title, source, date, summary, ratings, ...
  - Order: retrieval status value, date, alphabetic, ...
  - Size: scrolling, specified number, RSV threshold
- Two-dimensional displays
  - Construction: clustering, starfields, projection
  - Navigation: jump, pan, zoom
- Three-dimensional displays
  - Contour maps, fishtank VR, immersive VR
Extraction-Based Summarization
- A robust technique, though it yields disfluent summaries
- Four broad types:
  - Single-document vs. multi-document
  - Term-oriented vs. sentence-oriented
- Combines evidence for selection (see the sketch below):
  - Salience: similarity to the query
  - Selectivity: IDF or chi-squared
  - Emphasis: title, first sentence
- For multi-document summaries, suppress duplication
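A minimal sentence-oriented sketch of the evidence combination listed above: salience as word overlap with the query, selectivity via a crude per-sentence IDF, and an emphasis bonus for the first sentence. The weights and example text are invented:

```python
import math
import re

def summarize(text, query, n=2):
    """Score each sentence by query overlap (salience), rare-word content
    (a crude IDF proxy for selectivity), and position (emphasis), then
    extract the top-n sentences in document order."""
    sents = re.split(r"(?<=[.!?])\s+", text.strip())
    q = set(query.lower().split())
    tokenized = [set(s.lower().split()) for s in sents]
    # Document frequency of each word across sentences, for the IDF proxy.
    df = {}
    for words in tokenized:
        for w in words:
            df[w] = df.get(w, 0) + 1

    def score(i):
        words = tokenized[i]
        salience = len(words & q)
        selectivity = sum(math.log(len(sents) / df[w]) for w in words) / max(len(words), 1)
        emphasis = 1.0 if i == 0 else 0.0   # first-sentence bonus
        return 2.0 * salience + selectivity + emphasis

    best = sorted(range(len(sents)), key=score, reverse=True)[:n]
    return " ".join(sents[i] for i in sorted(best))

doc = ("Interfaces matter in retrieval. Extraction picks sentences verbatim. "
       "It is robust but the result can be disfluent. Generation is fluent but fragile.")
print(summarize(doc, "extraction summaries", n=2))
```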
Generated Summaries
- Fluent summaries for a specific domain
- Define a knowledge structure for the domain (frames are commonly used)
- Analysis: process documents to fill the structure (studied separately as “information extraction”)
- Compression: select which facts to retain
- Generation: create fluent summaries (a minimal sketch follows)
  - Templates provide initial candidates
  - A language model selects among the alternatives
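A minimal sketch of the frame-and-template pipeline just described: a frame filled during analysis, several template candidates, and a stand-in “language model” (here just a length heuristic, purely illustrative) to pick among them. The frame values and templates are invented:

```python
# Frame filled during the analysis step (values invented for illustration).
frame = {"event": "acquisition", "buyer": "Acme Corp",
         "target": "Widget Inc", "price": "$2M"}

# Template candidates for the generation step.
templates = [
    "{buyer} acquired {target} for {price}.",
    "In an {event} valued at {price}, {buyer} bought {target}.",
]

def lm_score(sentence):
    """Stand-in for a language model: prefer shorter candidates.
    A real system would score fluency with an actual LM."""
    return -len(sentence.split())

candidates = [t.format(**frame) for t in templates]
print(max(candidates, key=lm_score))  # -> "Acme Corp acquired Widget Inc for $2M."
```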
Google’s KWIC Summary for the Query “University of Maryland College Park”
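A keyword-in-context (KWIC) snippet like Google’s shows each query-term hit with a window of surrounding words. A minimal sketch, with an invented window size and example text:

```python
def kwic_snippet(text, query_terms, window=4):
    """Return fragments showing each query-term hit with `window`
    words of context on each side, joined by ellipses."""
    words = text.split()
    terms = {t.lower() for t in query_terms}
    pieces = []
    for i, w in enumerate(words):
        if w.lower().strip(".,") in terms:
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            pieces.append(" ".join(words[lo:hi]))
    return " ... ".join(pieces)

text = ("The University of Maryland is a public research university "
        "located in College Park, Maryland.")
print(kwic_snippet(text, ["Maryland"], window=3))
```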
Teoma’s Query Refinement Suggestions
url: teoma.com
Vivisimo’s Clustering Results
url: vivisimo.com
Kartoo’s Cluster Visualization
url: kartoo.com
Cluster Formation
- Based on inter-document similarity, computed using the cosine measure, for example
- Heuristic methods can be fairly efficient (see the sketch below):
  - Pick any document as the first cluster “seed”
  - Add the most similar document to each cluster
  - Adding the same document to two clusters joins them
  - Check whether each cluster should be split: does it contain two or more fairly coherent groups?
- Many variations on this scheme have been tried
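A minimal single-pass variant of the heuristic above (without the split check): cosine similarity over term-frequency vectors, with an invented similarity threshold deciding whether a document joins the most similar existing cluster or seeds a new one:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values())) *
            math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def single_pass_cluster(docs, threshold=0.3):
    """Each document joins the most similar existing cluster (compared
    against the cluster's seed vector), or becomes a new seed if nothing
    is similar enough."""
    clusters = []  # list of (seed_vector, [doc indices])
    for i, doc in enumerate(docs):
        vec = Counter(doc.lower().split())
        best, best_sim = None, threshold
        for seed, members in clusters:
            sim = cosine(vec, seed)
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is not None:
            best.append(i)
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]

docs = ["star field movie display", "movie ratings display",
        "query formulation help", "help with query terms"]
print(single_pass_cluster(docs))  # -> [[0, 1], [2, 3]]
```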
Starfield
Dynamic Queries: IVEE/Spotfire/Filmfinder (Ahlberg & Shneiderman 93)
Constructing Starfield Displays
- Two attributes determine each point’s position
  - They can be dynamically selected from a list
  - Numeric position attributes work best: date, length, rating, ...
- Other attributes can affect the display, shown as color, size, shape, orientation, ...
- Each point can represent a cluster
- The displayed set is interactively specified using “dynamic queries” (see the sketch below)
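A minimal starfield sketch, assuming matplotlib is available and using invented document attributes: two numeric attributes give each point’s position, a third is shown as size, and a categorical one as color:

```python
import matplotlib.pyplot as plt

# Invented document attributes: (year, length_in_pages, rating, genre).
docs = [(1990, 12, 3.5, "news"), (1995, 4, 4.8, "review"),
        (2001, 30, 2.1, "news"), (2003, 8, 4.0, "review")]
colors = {"news": "tab:blue", "review": "tab:orange"}

years = [d[0] for d in docs]
lengths = [d[1] for d in docs]
sizes = [d[2] * 40 for d in docs]   # rating shown as point size
cs = [colors[d[3]] for d in docs]   # genre shown as color

plt.scatter(years, lengths, s=sizes, c=cs)
plt.xlabel("Date")
plt.ylabel("Length (pages)")
plt.title("Starfield: position from two attributes; size = rating, color = genre")
plt.show()
```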
Projection
- Depicts many numeric attributes in two dimensions while preserving important spatial relationships
- Typically based on the vector space model, which has about 100,000 numeric attributes!
- Approximates multidimensional scaling; heuristic approaches are reasonably fast (a sketch follows)
- Often visualized as a starfield, but the dimensions lack any particular meaning
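A minimal projection sketch: build term-frequency vectors, then use a truncated SVD (one fast heuristic stand-in for multidimensional scaling) to place each document at two coordinates that roughly preserve similarity structure. Uses numpy; the toy documents are invented:

```python
import numpy as np

docs = ["star field movie display", "movie ratings display",
        "query formulation help", "help with query terms"]

# Build a document-by-term frequency matrix.
vocab = sorted({w for d in docs for w in d.split()})
X = np.array([[d.split().count(w) for w in vocab] for d in docs], dtype=float)

# Truncated SVD: keep the top two singular directions as plot coordinates.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
coords = U[:, :2] * S[:2]   # one (x, y) point per document

for doc, (x, y) in zip(docs, coords):
    print(f"({x:+.2f}, {y:+.2f})  {doc}")
```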
Contour Map Displays
- Display cluster density as terrain elevation
  - Fit a smooth opaque surface to the data
- Visualize in three dimensions:
  - Project to 2-D and allow manipulation
  - Use stereo glasses to create a virtual “fishtank”
  - Create an immersive virtual reality experience
    - Head-mounted stereo monitors and head tracking
    - A “cave” with wall projection and body tracking
ThemeView
Credit: Pacific Northwest National Laboratory
Browsing Retrieved Set
[Diagram: User -> Query Formulation -> Query -> Search -> Ranked List -> Document Selection -> Document -> Document Examination, with Query Reformulation and Document Reselection feedback loops]
Full-Text Examination Interfaces
- Most use scroll and/or jump navigation; some experiments with zooming
- Long documents need special features:
  - A “best passage” function helps users get started; overlapping 300-word passages work well (see the sketch below)
  - A “next search term” function facilitates browsing
- Integrated functions for relevance feedback: passage selection, query term weighting, ...
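A minimal “best passage” sketch following the recipe above: slide a 300-word window over the document with 50% overlap, score each window by query-term occurrences, and return the best-scoring passage as the starting point. The window size and stride follow the slide; the scoring function and example document are invented:

```python
def best_passage(text, query, size=300, stride=150):
    """Score overlapping `size`-word passages (50% overlap via `stride`)
    by the number of query-term occurrences; return the best one."""
    words = text.split()
    terms = {t.lower() for t in query.split()}
    best_score, best = -1, ""
    for start in range(0, max(len(words) - size, 0) + 1, stride):
        window = words[start:start + size]
        score = sum(1 for w in window if w.lower().strip(".,;") in terms)
        if score > best_score:
            best_score, best = score, " ".join(window)
    return best

# Usage (toy document, tiny window so the effect is visible):
doc = "alpha beta gamma " * 50 + "retrieval interface design " * 5
print(best_passage(doc, "retrieval interface", size=12, stride=6)[:60])
```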
A Long Document
Document Lens (Robertson & Mackinlay, UIST ’93, Atlanta, 1993)
TileBars [Hearst et al. 95]
SeeSoft [Eick 94]
Things That Help
- Show the query in the selection interface: it provides context for the display
- Explain what the system has done: it is hard to control a tool you don’t understand
  - Highlight search terms, for example
- Complement what the system has done: users add value by doing things the system can’t
  - Expose the information users need to judge utility
Document Delivery
[Diagram: User -> Document Examination -> Document -> Document Delivery]
Delivery Modalities
- On-screen viewing: good for hypertext, multimedia, cut-and-paste, ...
- Printing: better resolution, portability, annotations, ...
- Fax-on-demand: really just another way to get to a printer
- Synthesized speech: useful for telephone and hands-free applications
Take-Away Messages
- The IR process belongs to users; matching documents to a query is only part of the whole IR process
- But IR systems can help users, and they need to support:
  - Query formulation and reformulation
  - Document selection and examination
Two-Minute Paper
- When examining documents in the selection and examination interfaces, which type of information need (visceral, conscious, formalized, or compromised) guides the user’s decisions? Please justify your answer.
- What was the muddiest point in today’s lecture?
Alternate Query Modalities
- Spoken queries
  - Used for telephone and hands-free applications
  - Reasonable performance with limited vocabularies, but some error-correction method must be included
- Handwritten queries
  - Palm Pilot Graffiti, touch-screens, ...
  - Fairly effective if some form of shorthand is used; ordinary handwriting often has too much ambiguity