
Information Retrieval Techniques Israr Hanif M.Phil QAU Islamabad Ph D (In progress) COMSATS.


1 Information Retrieval Techniques Israr Hanif M.Phil QAU Islamabad Ph D (In progress) COMSATS

2 Information Retrieval Techniques MS(CS) Lecture 1 AIR UNIVERSITY MULTAN CAMPUS

3 Information Retrieval Systems Information – What is “information”? Retrieval – What do we mean by “retrieval”? – What are the different types of information needs? Systems – How do computer systems fit into the human information seeking process?

4 Dictionary says… Oxford English Dictionary – information: informing, telling; thing told, knowledge, items of knowledge, news – knowledge: knowing familiarity gained by experience; person’s range of information; a theoretical or practical understanding of; the sum of what is known Random House Dictionary – information: knowledge communicated or received concerning a particular fact or circumstance; news

5 Intuitive Notions Information must – Be something, although the exact nature (substance, energy, or abstract concept) is not clear; – Be “new”: repetition of previously received messages is not informative – Be “true”: false or counterfactual information is “mis-information” – Be “about” something Robert M. Losee. (1997) A Discipline Independent Definition of Information. Journal of the American Society for Information Science, 48(3), 254-269.

6 Information Hierarchy Data, Information, Knowledge, Wisdom: each level more refined and abstract than the one before

7 Information Hierarchy Data – The raw material of information Information – Data organized and presented in a particular manner Knowledge – “Justified true belief” – Information that can be acted upon Wisdom – Distilled and integrated knowledge – Demonstrative of high-level “understanding”

8 A (Facetious) Example Data – 98.6º F, 99.5º F, 100.3º F, 101º F, … Information – Hourly body temperature: 98.6º F, 99.5º F, 100.3º F, 101º F, … Knowledge – If you have a temperature above 100º F, you most likely have a fever Wisdom – If you don’t feel well, go see a doctor

9 What types of information? Text (Documents and portions thereof) XML and structured documents Images Audio (sound effects, songs, etc.) Video Source code Applications/Web services

10 “Retrieval?” “Fetch something” that’s been stored Recover a stored state of knowledge Search through stored messages to find some messages relevant to the task at hand [Diagram labels: Sender, Recipient, Encoding, Decoding, storage, message, noise, indexing/writing, retrieval/reading]

11 What is IR? Information retrieval is a problem-oriented discipline, concerned with the problem of the effective and efficient transfer of desired information between human generator and human user. Information retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers). Anomalous States of Knowledge as a Basis for Information Retrieval. (1980) Nicholas J. Belkin. Canadian Journal of Information Science, 5, 133-143.

12 What is Information Retrieval ? The process of actively seeking out information relevant to a topic of interest (van Rijsbergen) – Typically it refers to the automatic (rather than manual) retrieval of documents Information Retrieval System (IRS) – “Document” is the generic term for an information holder (book, chapter, article, webpage, etc)

13 Hopkins IR Workshop 2005, Copyright © Victor Lavrenko. What is Information Retrieval? Most people equate IR with web-search – highly visible, commercially successful endeavors – leverage 3+ decades of academic research IR: finding any kind of relevant information – web-pages, news events, answers, images, … – “relevance” is a key notion (details in Part II)


16 The formalized IR process Collection of documents Real world Document representations Query Information need Anomalous state of knowledge Matching Results

17 What do we want from an IRS ? Systemic approach – Goal (for a known information need): Return as many relevant documents as possible and as few non-relevant documents as possible Cognitive approach – Goal (in an interactive information-seeking environment, with a given IRS): Support the user’s exploration of the problem domain and the task completion.
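The systemic goal above is usually quantified with two standard measures: precision (the share of returned documents that are relevant) and recall (the share of relevant documents that were returned). The slide does not name them, so the sketch below is an illustration under that assumption; the function name `precision_recall` and the document ids are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: ids the system returned
    relevant:  ids judged relevant to the information need
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were returned
    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document ids, for illustration only.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

"As many relevant documents as possible" pushes recall up, while "as few non-relevant documents as possible" pushes precision up; in practice the two tend to trade off against each other.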

18 The role of an IR system – a modern view – Support the user in – exploring a problem domain, understanding its terminology, concepts and structure – clarifying, refining and formulating an information need – finding documents that match the info need description As many relevant docs as possible As few non-relevant documents as possible

19 How does it do this ? User interfaces and visualization tools for – exploring a collection of documents – exploring search results Query expansion based on – Thesauri – Lexical/statistic analysis of text / context and concept formation – Relevance feedback Indexing and matching model
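Thesaurus-based query expansion, one of the techniques listed above, can be sketched in a few lines. This is a minimal illustration, not a description of any particular system: the `THESAURUS` dictionary and the `expand_query` function are hypothetical, and a real IRS would typically weight the added terms rather than treat them as equal to the original ones.

```python
# Hypothetical hand-built thesaurus: term -> related terms.
THESAURUS = {
    "car": ["automobile", "vehicle"],
    "fever": ["pyrexia"],
}


def expand_query(query, thesaurus=THESAURUS):
    """Add thesaurus entries for each query term.

    Keeps the original term order and drops duplicates.
    """
    expanded = []
    for term in query.lower().split():
        for candidate in [term] + thesaurus.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


print(expand_query("car repair"))
```

The expanded term list is then passed to the matching step in place of the original query, so documents that say "automobile" can still match a query that said "car".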

20 How well does it do this ? Evaluation – Of the components Indexing / matching algorithms – Of the exploratory process overall Usability issues Usefulness to task User satisfaction

21 Role of the user interface in IR Problem definition Source selection Problem articulation Examination of results Extraction of information Integration with overall task INPUT OUTPUT Engine

22 The Big Picture The four components of the information retrieval environment: – User – Process – System – Collection What computer geeks care about! What we care about!

23 The Information Retrieval Cycle [Diagram. Stages: Source Selection, Query Formulation, Search, Selection, Examination, Delivery. Intermediate results: Resource, Query, Ranked List, Documents. Feedback loops: query reformulation, vocabulary learning, relevance feedback; source reselection]

24 Supporting the Search Process [Diagram. Stages: Source Selection, Query Formulation, Search, Selection, Examination, Delivery. Intermediate results: Resource, Query, Ranked List, Documents. System side: Acquisition, Collection, Indexing, Index]

25 Simplification? [Diagram. Stages: Source Selection, Query Formulation, Search, Selection, Examination, Delivery. Intermediate results: Resource, Query, Ranked List, Documents. Feedback loops: query reformulation, vocabulary learning, relevance feedback; source reselection] Is this itself a vast simplification?

26 The IR Black Box Documents Query Hits

27 Inside The IR Black Box [Diagram labels: Documents, Query, Representation Function (one for documents, one for queries), Document Representation, Query Representation, Index, Comparison Function, Hits]
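The components named on this slide can be wired together in a toy example. The sketch below assumes the simplest possible choices, which the slides do not specify: a bag-of-words representation function, an inverted index from terms to document ids, and a comparison function that counts shared terms. All function names and the sample documents are hypothetical.

```python
from collections import Counter, defaultdict


def represent(text):
    """Representation function: lowercase bag of words (term -> count)."""
    return Counter(text.lower().split())


def build_index(docs):
    """Index: map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in represent(text):
            index[term].add(doc_id)
    return index


def compare(query_rep, doc_rep):
    """Comparison function: number of distinct query terms in the document."""
    return sum(1 for term in query_rep if term in doc_rep)


def search(query, docs, index):
    """Return hits: ids of matching documents, best score first."""
    query_rep = represent(query)
    # Use the index to find candidates that share at least one query term.
    candidates = set()
    for term in query_rep:
        candidates |= index.get(term, set())
    scored = [(compare(query_rep, represent(docs[d])), d) for d in candidates]
    # Sort by descending score, then by id so the order is deterministic.
    return [d for score, d in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Hypothetical three-document collection.
docs = {
    "d1": "information retrieval systems",
    "d2": "retrieval of stored messages",
    "d3": "body temperature and fever",
}
index = build_index(docs)
print(search("information retrieval", docs, index))
```

The same representation function is applied to both the query and the documents, which is what makes the two representations comparable; real systems replace the term-overlap score with a weighted ranking model.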

