Search Strategies Session 7 INST 301 Introduction to Information Science.

Presentation transcript:

Search Strategies Session 7 INST 301 Introduction to Information Science

The Search Engine — (process diagram: on the searcher's side, Source Selection, Query Formulation, Query, Search, Ranked List, Selection, Document, Examination, and Delivery; on the system's side, Collection, Acquisition, Indexing, Index, and IR System)

“The Search” — (the same process diagram, focusing on the search step itself: the query is matched against the index by the IR system to produce a ranked list)

Search Strategies

Marchionini’s Factors Affecting Information Seeking: information seeker, task, search system, domain, setting, outcomes

Belkin’s ASK: Anomalous State of Knowledge Searchers do not clearly understand –The problem itself –What information is needed to solve the problem The query results from a clarification process

Bates’ “Berry Picking” Model — (diagram of an evolving query sequence, Q0 → Q1 → Q2 → Q3 → Q4 → Q5) A sketch of a searcher… “moving through many actions towards a general goal of satisfactory completion of research related to an information need.”

Dervin’s Sensemaking: Situation – Gap – Action

Four Levels of Information Needs — (diagram of a real information need giving rise to perceived needs, requests, and queries) Real information needs (RIN) = visceral need; Perceived information needs (PIN) = conscious need; Request = formalized need; Query = compromised need. Stefano Mizzaro (1999). How Many Relevances in Information Retrieval? Interacting With Computers, 10(3).

Broder’s Web Query Taxonomy Informational (~50%) –Acquire static information (“topical”) Navigational (~20%) –Reach a particular site (“known item”) Transactional (~30%) –Perform a Web-mediated activity (“service”) Andrei Broder, SIGIR Forum, Fall 2002

Supporting the Search Process — (the process diagram again, annotated with points of system support: Source Reselection, Query Reformulation and Relevance Feedback, and the system roles Nominate, Choose, and Predict at successive stages)

Two Ways of Searching — Content-based query-document matching: the author writes the document using terms that convey meaning; the free-text searcher constructs a query from terms that may appear in documents; document terms are matched against query terms to produce a retrieval status value. Metadata-based query-document matching: the indexer chooses appropriate concept descriptors; the controlled-vocabulary searcher constructs a query from the available concept descriptors; document descriptors are matched against query descriptors.

Boolean Operators — Example queries over the terms spacewalk, Apollo, and Gemini (shown as Venn diagrams on the slide): spacewalk AND (Apollo OR Gemini); spacewalk AND Apollo AND (NOT Gemini)
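
The Boolean combinations above can be read as set operations over the documents that contain each term. A minimal sketch in Python (the toy documents are invented for illustration; a real engine would use an inverted index):

```python
# Toy Boolean retrieval: each term maps to the set of documents containing it.
docs = {
    1: "Gemini astronauts practiced the first American spacewalk",
    2: "Apollo crews trained for a lunar spacewalk",
    3: "The Apollo program followed Gemini",
}

def matching(term):
    """Return the ids of documents whose text contains the term."""
    return {d for d, text in docs.items() if term.lower() in text.lower()}

# spacewalk AND (Apollo OR Gemini)
print(matching("spacewalk") & (matching("Apollo") | matching("Gemini")))   # {1, 2}

# spacewalk AND Apollo AND (NOT Gemini)
print((matching("spacewalk") & matching("Apollo")) - matching("Gemini"))   # {2}
```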

The Perfect Query Paradox Every information need has a perfect document set –Finding that set is the goal of search Every document set has a perfect query –AND every word to get a query for document 1 –Repeat for each document in the set –OR every document query to get the set query The problem isn’t the system … it’s the query!

Pearl Growing Start with a set of relevant documents Use them to learn new vocabulary –Related terms to help broaden the search –Terms that help to remove unrelated senses Repeat until you converge on a useful set

Pearl Growing Example What is the Moon made of? Query: moon Initial search reveals: –Adding “Apollo” might help focus the search –Rejecting “Greek” might avoid unrelated pages Revised query: +moon -Greek Apollo
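
A rough sketch of how pearl growing can be partly mechanized: mine the documents already judged relevant for frequent terms not yet in the query, and offer them as candidates to add (or exclude). The documents and stopword list below are invented for illustration:

```python
from collections import Counter

relevant_docs = [
    "The Apollo missions returned lunar rock samples from the Moon",
    "Apollo astronauts collected basalt and breccia on the lunar surface",
]
query_terms = {"moon"}
stopwords = {"the", "and", "from", "on", "of"}

# Count terms that appear in the relevant documents but are not yet in the query.
counts = Counter(
    word.lower()
    for doc in relevant_docs
    for word in doc.split()
    if word.lower() not in stopwords and word.lower() not in query_terms
)

print(counts.most_common(3))   # e.g. [('apollo', 2), ('lunar', 2), ('missions', 1)]
```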

28% of Web Queries are Reformulations Pass, et al., “A Picture of Search,” 2007

Concept Analysis Identify facets of your question –What entities are involved? –What attributes of those entities are important? –What attribute values do you seek? Choose the appropriate search terms –What terms might an author have used? Perhaps by using a thesaurus –Use initial searches to refine your term choices

Building Blocks — (diagram: start with the most specific facet, AND it with another facet to form a partial query, and continue adding facets until the final query is built)

Building Blocks Example What is the history of Apartheid? Facets? –Entity: Racial segregation –Location: South Africa –Time: Before 1990 Query construction: –(Apartheid OR segregation) AND (“South Africa” OR Pretoria) AND (history OR review)
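
Constructing a building-blocks query is mechanical once the facets are chosen: OR the synonyms within each facet, then AND the facets together. A small sketch using the facets from the Apartheid example (the helper function is just for illustration):

```python
facets = [
    ["Apartheid", "segregation"],        # entity
    ['"South Africa"', "Pretoria"],      # location
    ["history", "review"],               # aspect
]

def build_query(facets):
    """OR the terms within each facet, then AND the facets together."""
    return " AND ".join("(" + " OR ".join(terms) + ")" for terms in facets)

print(build_query(facets))
# (Apartheid OR segregation) AND ("South Africa" OR Pretoria) AND (history OR review)
```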

Web-specific Strategies Using outward links –Find good hubs, follow their links Using inward links –Find good authorities, see who links to them URL pruning (e.g., +url:) –Discover related work, authors, organizations, … –Some servers provide raw directory listings

Query Suggestion Predict what the user might type next –Learned from behavior of many users –Can be customized to a user or group Helps w/typos, limited vocabulary, … Provides a basis for auto-completion –Particularly useful in difficult input settings
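
One simple way to realize query suggestion, sketched under the assumption that suggestions come from a log of past queries ranked by frequency (the log below is invented):

```python
from collections import Counter

query_log = ["moon landing", "moon phases", "moon landing hoax",
             "moonlight sonata", "moon phases", "moon landing"]
log_counts = Counter(query_log)

def suggest(prefix, k=3):
    """Return up to k past queries starting with the prefix, most frequent first."""
    matches = [(q, n) for q, n in log_counts.items() if q.startswith(prefix)]
    return [q for q, _ in sorted(matches, key=lambda x: -x[1])[:k]]

print(suggest("moon l"))   # ['moon landing', 'moon landing hoax', 'moonlight sonata']
```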

(Figure: Pass, et al., “A Picture of Search,” 2007)

Burstiness — Example query: earthquake (chart of query frequency over time)
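
A small illustration of what burstiness looks like in a query log: daily volume for a query such as “earthquake” spikes after an event. The timestamped entries and the threshold are invented for this sketch:

```python
from collections import Counter

log = [("2024-03-01", "earthquake"), ("2024-03-01", "weather"),
       ("2024-03-02", "earthquake"), ("2024-03-03", "earthquake"),
       ("2024-03-03", "earthquake"), ("2024-03-03", "earthquake"),
       ("2024-03-03", "earthquake"), ("2024-03-04", "earthquake")]

per_day = Counter(day for day, q in log if q == "earthquake")
baseline = 1                       # assumed typical daily volume for this query
bursts = {day: n for day, n in per_day.items() if n > 3 * baseline}
print(bursts)                      # {'2024-03-03': 4} -- the burst day
```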

Diversity Ranking Query ambiguity –UPS: United Parcel Service –UPS: Uninterruptible power supply –UPS: University of Puget Sound Query aspects –United Parcel Service: store locations –United Parcel Service: delivery tracking –United Parcel Service: stock price
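
A rough sketch of one way to diversify results for an ambiguous or multi-aspect query: walk down the relevance-ordered candidates and prefer results whose aspect has not been covered yet (a simplification of approaches such as maximal marginal relevance; the candidates and aspect labels are invented):

```python
candidates = [                                  # already in relevance order
    ("ups.com tracking", "delivery tracking"),
    ("ups.com shipping rates", "delivery tracking"),
    ("ups.com store locator", "store locations"),
    ("UPS quarterly earnings", "stock price"),
]

def diversify(candidates, k=3):
    chosen, covered, deferred = [], set(), []
    for title, aspect in candidates:
        if aspect not in covered:
            chosen.append(title)                # new aspect: promote it
            covered.add(aspect)
        else:
            deferred.append(title)              # redundant aspect: use only if room remains
    return (chosen + deferred)[:k]

print(diversify(candidates))
# ['ups.com tracking', 'ups.com store locator', 'UPS quarterly earnings']
```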

Try Some Searches Using building blocks: –Which cities in the former country of Czechoslovakia have pollution problems? Using pearl growing: –What event in the early 1900’s is Noel Davis famous for?

Some Good Advice (credit: Ben Shneiderman)
Human-Computer Interaction: User in control –Anticipatable outcomes –Explainable results –Browsable content –Informative feedback –Easy reversal; Limit working memory load –Show query context; Support for learning –Novice and expert alternatives –Scaffolding
Interactive IR: Support human reasoning –Show actual content –Depict uncertainty –Be fast; Use familiar metaphors –Timelines, ranked lists, maps, …; Some system initiative –Loosely guide the process –Expose structure of knowledge; Co-design with search strategies

Evaluation What can be measured that reflects the searcher’s ability to use a system? (Cleverdon, 1966) –Coverage of Information –Form of Presentation –Effort required/Ease of Use –Time and Space Efficiency –Recall –Precision (recall and precision together characterize effectiveness)

Evaluating IR Systems User-centered strategy –Given several users, and at least 2 retrieval systems –Have each user try the same task on both systems –Measure which system works the “best” System-centered strategy –Given documents, queries, and relevance judgments –Try several variations on the retrieval system –Measure which ranks more good docs near the top

Which is the Best Rank Order? — (six candidate rank orders, A through F, with the relevant documents marked in each)
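
One common way to answer this kind of question numerically (a sketch, not necessarily the criterion the slide has in mind) is average precision, which rewards rankings that place relevant documents near the top. The 0/1 relevance vectors below are invented stand-ins for rankings like A through F:

```python
def average_precision(ranking):
    """Mean of the precision values measured at each relevant document."""
    hits, precisions = 0, []
    for i, rel in enumerate(ranking, start=1):
        if rel:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / hits if hits else 0.0

rank_a = [1, 1, 0, 0]    # relevant documents first
rank_b = [0, 0, 1, 1]    # relevant documents last
print(average_precision(rank_a), average_precision(rank_b))   # 1.0 vs. ~0.42
```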

Precision and Recall Precision –How much of what was found is relevant? –Often of interest, particularly for interactive searching Recall –How much of what is relevant was found? –Particularly important for law, patents, and medicine
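
Both measures are simple set ratios once we know which documents are relevant and which were retrieved. A minimal sketch (the document ids are invented):

```python
relevant  = {1, 2, 5, 9}          # judged relevant to the information need
retrieved = {1, 2, 3, 4}          # returned by the system

found = relevant & retrieved
precision = len(found) / len(retrieved)   # how much of what was found is relevant
recall    = len(found) / len(relevant)    # how much of what is relevant was found

print(precision, recall)   # 0.5 0.5
```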

Measures of Effectiveness — (Venn diagram of the Relevant and Retrieved document sets: precision = |Relevant ∩ Retrieved| / |Retrieved|; recall = |Relevant ∩ Retrieved| / |Relevant|)

Affective Evaluation Measure stickiness through frequency of use –Non-comparative, long-term Key factors (from cognitive psychology): –Worst experience –Best experience –Most recent experience Highly variable effectiveness is undesirable –Bad experiences are particularly memorable

Before You Go On a sheet of paper, answer the following (ungraded) question (no names, please): What was the muddiest point in today’s class?