© Tefko Saracevic
Search strategy & tactics
Governed by effectiveness & feedback

Some definitions
Search statement (query):
– set of search terms with logical connectors and attributes; file and system dependent
Search strategy (big picture):
– overall approach to searching of a question
    selection of systems, files, search statements & tactics, sequence, output formats; cost, time aspects

Some definitions (cont.)
Search tactics (action choices):
– choices & variations in search statements
    terms, connectors, attributes
Move:
– modifications of search strategies or tactics that are aimed at improving the results
Cycle (particularly applicable to systems such as DIALOG):
– set of commands from start (begin) to viewing (type) results, or from one viewing command to the next

Some definitions (cont.)
Effectiveness:
– performance as to objectives
    to what degree did a search accomplish what was desired?
    how well was it done in terms of relevance?
Efficiency:
– performance as to costs
    at what cost and/or effort, time?
Both are KEY concepts & criteria for selection of strategy, tactics & evaluation

Effectiveness criteria
Search tactics chosen & changed following some criteria of accomplishment, such as:
– none - no thought given
– relevance (very often)
– magnitude (also very often)
– output attributes
– topic/strategy
Tactics altered interactively
– role & types of feedback
Knowing what tactics may produce what results is key to a professional searcher

Relevance: key concept in IR
Attribute/criterion reflecting effectiveness of exchange of information between people (users) & IR systems in communication contacts, based on valuation by people
Some attributes:
– in IR - user dependent
– multidimensional or faceted
– dynamic
– measurable - somewhat
– intuitively well understood

Types of relevance
Several types considered:
– Systems or algorithmic relevance: relation between a query as entered and objects in the file of a system as retrieved or failed to be retrieved by a given procedure or algorithm. Comparative effectiveness.
– Topical or subject relevance: relation between topic in the query & topic covered by the retrieved objects, or objects in the file(s) of the system, or even in existence. Aboutness.

Types of relevance (cont.)
– Cognitive relevance or pertinence: relation between state of knowledge & cognitive information need of a user and the objects provided or in the file(s). Informativeness, novelty ...
– Motivational or affective relevance: relation between intents, goals & motivations of a user & objects retrieved by a system or in the file, or even in existence. Satisfaction ...
– Situational relevance or utility: relation between the task or problem at hand and the objects retrieved (or in the files). Relates to usefulness in decision-making, reduction of uncertainty ...

Effectiveness measures
Precision:
– probability that, given that an object is retrieved, it is relevant, or the ratio of relevant items retrieved to all items retrieved
Recall:
– probability that, given that an object is relevant, it is retrieved, or the ratio of relevant items retrieved to all relevant items in a file
Precision is easy to establish, recall is not
– union of retrievals as a “trick” to establish recall
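
The union "trick" can be sketched in a few lines. True recall needs the number of relevant items in the whole file, which is usually unknown, so one common workaround is to pool the output of several searches, judge the pool, and measure each search against the relevant items in that pool. The document IDs, the three searches and the relevance judgments below are all invented for illustration; this is a sketch of the idea, not a procedure taken from the slides.

```python
def relative_recall(retrieved, relevant_pool):
    """Share of the pooled relevant items that one search retrieved."""
    if not relevant_pool:
        return 0.0
    return len(retrieved & relevant_pool) / len(relevant_pool)


# Hypothetical output of three different searches on the same question.
search_1 = {"d1", "d2", "d3", "d4"}
search_2 = {"d3", "d5", "d6"}
search_3 = {"d2", "d7"}

# Hypothetical relevance judgments made by a person over the pooled output.
judged_relevant = {"d1", "d3", "d5", "d7"}

# The union of retrievals stands in for "all relevant items in the file".
union = search_1 | search_2 | search_3
relevant_pool = union & judged_relevant

for name, result in [("search 1", search_1), ("search 2", search_2), ("search 3", search_3)]:
    print(name, relative_recall(result, relevant_pool))
```

Relevant items that none of the searches retrieved never enter the pool, so a figure obtained this way can only overstate true recall, never understate it.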
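
Calculation
a = relevant items retrieved, b = non-relevant items retrieved, c = relevant items not retrieved
$$\text{Precision} = \frac{a}{a + b} \qquad \text{Recall} = \frac{a}{a + c}$$
High precision = maximize a, minimize b
High recall = maximize a, minimize c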
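
As a worked instance of the two ratios, the sketch below computes precision and recall from the counts a, b and c. The function names and the sample counts are assumptions chosen for illustration.

```python
def precision(a, b):
    """Relevant items retrieved (a) over all items retrieved (a + b)."""
    retrieved = a + b
    return a / retrieved if retrieved else 0.0


def recall(a, c):
    """Relevant items retrieved (a) over all relevant items in the file (a + c)."""
    relevant = a + c
    return a / relevant if relevant else 0.0


# Hypothetical search: 30 relevant items retrieved, 20 junk items retrieved,
# 70 relevant items left behind in the file.
a, b, c = 30, 20, 70
print(precision(a, b))  # 0.6
print(recall(a, c))     # 0.3
```

In this made-up case 60 % of the answer is relevant, but only 30 % of the relevant material in the file was found.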

Interpretation: PRECISION
Precision = percent of relevant stuff you have in your answer
– or conversely percent of junk
– high precision = most stuff relevant
– low precision = a lot of junk
Some users demand high precision
– do not want to wade through much stuff
– but it comes at a price: relevant stuff may be missed
    tradeoff

Interpretation: RECALL
A file may have a lot of relevant stuff
Recall = percent of that relevant stuff in the file that you retrieved
– conversely percent of stuff you missed
– high recall = you missed little
– low recall = you missed a lot
Some users demand high recall (e.g. PhD students doing a dissertation)
– want to make sure that important stuff is not missed
– but will have to pay a price of wading through a lot of junk
    tradeoff

Precision-recall trade-off
USUALLY: precision & recall are inversely related
– higher recall usually means lower precision & vice versa
[Figure: precision vs. recall plot, scale 0 to 100 %, with labels "Ideal", "Usual" and "Improvements"]
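
One way to see the usual inverse relation is to take a ranked output and look at more and more of it. The sketch below assumes a hypothetical ranked list of ten items (1 = relevant, 0 = not relevant) and a file holding five relevant items in total; both are invented, and real searches need not behave this neatly.

```python
def precision_recall_at(ranked_relevance, total_relevant, k):
    """Precision and recall when only the top k ranked items are examined."""
    found = sum(ranked_relevance[:k])
    return found / k, found / total_relevant


# Hypothetical ranked output: 1 marks a relevant item, 0 marks junk.
ranked = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
TOTAL_RELEVANT = 5  # assumed; one relevant item is never retrieved

for k in (2, 4, 7, 10):
    p, r = precision_recall_at(ranked, TOTAL_RELEVANT, k)
    print(f"top {k:2d}: precision={p:.2f} recall={r:.2f}")
```

Reading further down the list raises recall from 0.40 to 0.80 in this example while precision falls from 1.00 to 0.40.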

Interpretation: TRADE-OFF
It is like in life, usually:
– you get some, lose some
Usually, but not always; keep in mind these are probabilities
– when you have high precision, most stuff you got is relevant or on target, but you missed stuff that is also relevant - it was left behind
– when you have high recall, you did not miss much, but you also got a lot of junk - wading through it
You use different tactics for high recall than for high precision

Search tactics
What variations are possible?
– several ‘things’ in a query can be selected or changed that affect effectiveness
– each variation has a consequence in output
    if I do X then Y will happen
1. LOGIC
– choice of connectors among terms (AND, OR, NOT, W …)
2. SCOPE
– no. of terms linked - ANDs (A AND B vs A AND B AND C)

Search tactics (cont.)
3. EXHAUSTIVITY
– for each concept, no. of related terms - OR connections (A OR B vs. A OR B OR C)
4. TERM SPECIFICITY
– for each concept, level in hierarchy (broader vs narrower terms)
5. SEARCHABLE FIELDS
– choice for text terms & non-text attributes
    e.g. titles only, limit as to years
6. FILE OR SYSTEM SPECIFIC CAPABILITIES
– e.g. ranking, sorting

Effectiveness “laws”
SCOPE - adding more ANDs
– Output size: down; Recall: down; Precision: up
EXHAUSTIVITY - adding more ORs
– Output size: up; Recall: up; Precision: down
USE OF NOTs - adding more NOTs
– Output size: down; Recall: down; Precision: up
BROAD TERM USE - low specificity
– Output size: up; Recall: up; Precision: down
PHRASE USE - high specificity
– Output size: down; Recall: down; Precision: up
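
The first three "laws" can be tried out on a toy Boolean search. The nine short documents, the relevance judgments and the helper names below are all hypothetical, and the corpus was built so that the usual pattern shows up; on real files the effects are tendencies, not guarantees.

```python
# Hypothetical mini file: document ID -> text.
DOCS = {
    "d1": "information retrieval evaluation",
    "d2": "information retrieval feedback",
    "d3": "music retrieval systems",
    "d4": "music retrieval hardware",
    "d5": "dog retrieval training",
    "d6": "music retrieval licensing",
    "d7": "information searching tactics",
    "d8": "job searching tips",
    "d9": "apartment searching tips",
}
# Hypothetical relevance judgments for the question at hand.
RELEVANT = {"d1", "d2", "d3", "d7"}


def build_index(docs):
    """Map each term to the set of documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.split():
            index.setdefault(term, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)


def hits(term):
    """Documents containing the term (empty set if the term is unknown)."""
    return INDEX.get(term, set())


def evaluate(retrieved):
    """Return (output size, recall, precision) for a retrieved set."""
    found = len(retrieved & RELEVANT)
    precision = found / len(retrieved) if retrieved else 0.0
    recall = found / len(RELEVANT)
    return len(retrieved), round(recall, 2), round(precision, 2)


# Set operations stand in for the Boolean connectors.
queries = {
    "retrieval": hits("retrieval"),
    "retrieval AND information": hits("retrieval") & hits("information"),
    "retrieval OR searching": hits("retrieval") | hits("searching"),
    "retrieval NOT music": hits("retrieval") - hits("music"),
}
for label, result in queries.items():
    size, recall, precision = evaluate(result)
    print(f"{label}: size={size} recall={recall} precision={precision}")
```

Against the one-term query (size 6, recall 0.75, precision 0.5), the AND and NOT versions shrink the output and raise precision at the cost of recall, while the OR version does the opposite.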

Tactics: What to do?
To increase precision:
– use precision devices
To increase recall:
– use recall devices
Each will also affect magnitude of output
With experience, use of these devices will become second nature

Recall, precision devices
BROADENING - higher recall:    NARROWING - higher precision:
Fewer ANDs                     More ANDs
More ORs                       Fewer ORs
Fewer NOTs                     More NOTs
More free text                 Less free text
Fewer controlled               More controlled
More synonyms                  Fewer synonyms
Broader terms                  Narrower terms
Less specific                  More specific
More truncation                Less truncation
Fewer qualifiers               More qualifiers
Fewer limits                   More limits
Citation growing               Building blocks

Other tactics
Citation growing:
– find a relevant document
– look for documents cited in it
– look for documents citing it
– repeat on newly found relevant documents
Building blocks
– find documents with term A
– review
– add term B & so on
Using different feedbacks
– a most important tool
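
Citation growing as listed above amounts to walking a citation network in both directions from a seed document. The sketch below assumes a small invented citation table and a stand-in `is_relevant` function in place of a person's judgment; in practice a citation index would supply the links and the searcher would supply the judgments.

```python
from collections import deque

# Hypothetical citation data: each document maps to the documents it cites.
CITES = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": [],
    "D": [],
    "E": ["A"],
    "F": ["E", "X"],
    "X": [],
}


def cited_by(doc):
    """Documents whose reference lists include doc."""
    return [source for source, refs in CITES.items() if doc in refs]


def citation_growing(seed, is_relevant, max_rounds=2):
    """Grow a set of relevant documents outward from one relevant seed."""
    found = {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        doc, depth = frontier.popleft()
        if depth == max_rounds:
            continue  # stop expanding after the chosen number of rounds
        # Look both ways: documents cited in it and documents citing it.
        neighbours = set(CITES.get(doc, [])) | set(cited_by(doc))
        for other in sorted(neighbours):
            if other not in found and is_relevant(other):
                found.add(other)
                frontier.append((other, depth + 1))
    return found


# Stand-in for a human relevance judgment: everything except "X" is relevant.
print(sorted(citation_growing("A", lambda doc: doc != "X")))
```

The `max_rounds` limit is a design choice of this sketch, added so the walk stops after a fixed number of repeats; a searcher would normally stop when new rounds stop turning up relevant documents.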

Feedback in searching
Any feedback implies loops
– a completion of a process provides information for modification, if any, for the next process
– information from output is used to change previous or create new input
In searching:
– some information taken from output of a search is used to do something with next query (search statement)
    examine what you got to decide what to do next in searching
– a basic tactic in searching
Several feedback types used in searching
– each used for different decisions

Feedback types
Content relevance feedback
– judge relevance of items retrieved
– make decision what to do next
    switch files, change exhaustivity …
Term relevance feedback
– find relevant documents
– examine what other terms are used in those documents
– search using additional terms
    also called query modification & in some systems done automatically
Magnitude feedback
– on the basis of size of output make tactical decisions
    often the size is so big that documents are not examined but the next search is done to limit size
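
Term relevance feedback can be sketched as counting which terms turn up in the documents judged relevant and offering the most common ones as additions to the query. The stopword list, the sample titles and the function name below are assumptions for illustration; systems that automate query modification typically use more refined term weighting than a plain count.

```python
from collections import Counter

# Small hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"of", "in", "the", "and", "for", "a"}


def suggest_terms(query_terms, relevant_docs, top_n=3):
    """Suggest new search terms that occur in many of the relevant documents."""
    counts = Counter()
    for text in relevant_docs:
        # Count each term once per document, skipping stopwords and
        # terms that are already in the query.
        terms = set(text.lower().split())
        counts.update(terms - STOPWORDS - set(query_terms))
    # Most frequent first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


# Hypothetical titles of documents the searcher judged relevant.
relevant = [
    "relevance feedback in information retrieval",
    "query expansion for information retrieval",
    "evaluation of relevance feedback",
]
print(suggest_terms({"retrieval"}, relevant))
# ['feedback', 'information', 'relevance']
```

The suggested terms would then be added with OR to raise recall or with AND to raise precision, depending on what the searcher is after.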

Feedback types (cont.)
Tactical review feedback
– after a number of queries (search statements) in the same search, review tactics as to getting desired outputs
    review terms, logic, limits …
– change tactics accordingly
Strategic review feedback
– after a while (or after consultation with user) review the “big” picture on what was searched and how
    sources, terms, relevant documents, need satisfaction, changes in question, query …
– do next searches accordingly
– used in reiterative searching
There is a difference between reviewing strategy & tactics
– but they can be combined

Bates Berry-picking model of searching
“…moving through many actions towards a general goal of satisfactory completion of research related to information need.”
– query is shifting (continually) as search progresses
    queries are changing
    different tactics are used
– searcher (user) may move through a variety of sources
    new files, resources may be used
    strategy may change

Berry-picking …
– new information may provide new ideas, new directions
    feedback is used in various ways
– question is not satisfied by a single set of answers, but by a series of selections & bits of information found along the way
    results may vary & may have to be provided in appropriate ways & means