Published by Buck Martin. Modified over 8 years ago.
1
Introduction to Information Retrieval
2
What is IR?
"Sit down before fact as a little child, be prepared to give up every preconceived notion, follow humbly wherever and to whatever abysses nature leads, or you shall learn nothing." -- Thomas Huxley
Queries posed to each search engine: "What is IR?" and "What is information retrieval?"
- Google
- Ask.com
- Yahoo!
- Google Korea
- Naver
- Daum
Search Engines
3
IR: Key Questions
"A prudent question is one-half of wisdom" -- Francis Bacon
- What are we looking for?
- How do we find it?
- Why is it difficult?
4
IR: What are we looking for?
- We are looking for X.
  - Q&A: population of China
  - Known-item search: "Catcher in the Rye"
- Looking for something like/about X.
  - General/background info: Taliban
  - Collection development: IR literature
  - Similar to (known) X: like "Catcher in the Rye"
  - WhatyoumacallX: "the rye-boy story"
- Looking for something
  - Problem resolution: how can we fight terrorism?
  - Knowledge development: what is IR?
- Looking
  - Need something, but don't know what: what's it all about?
  - Serendipity: Web surfing
5
IR: How do we find it?
- Brute force search
  - Easy to build, maintain, and use
  - Searcher does all the work; hard to get satisfaction
- Organize/structure the data
  - Intuitive to use
  - Hard to build and maintain
  - Knowledge of builder's language & organization structure is crucial
- Use a search tool
  - Easier to build and maintain: less manipulation of data
  - Sometimes works, sometimes not (helps to know the language of the data)
- Ask the experts
  - Easy and satisfying to use (by definition)
  - "Expert" knowledge is transitory, hard to encapsulate
- Go with the crowd
  - Relatively easy to build and maintain
  - Limited utility: doesn't work with "unpopular" X
- Zen-Fusion search
6
Information Seeking Process: Dynamic, Interactive, Iterative
User
- What am I looking for? - Identification of info. need
- What question do I ask? - Query formulation
Intermediary
- What is the searcher looking for? - Discovery of user's info. need
- How should the question be posed? - Query representation
- Where is the relevant information? - Query-document matching
Information
- What data to collect? - Collection development
- What information to index? - Indexing/Representation
- How to represent it? - Data structure
7
Information Seeking Models
Berry-picking Model
- Interesting information is scattered like berries among bushes.
- Information seeking is a dynamic, non-linear process, where the information need and queries continually shift.
- Information needs are not satisfied by a single, final retrieved set of documents, but rather by a series of selections and bits of information found along the way.
Traditional Model
- Linear process:
  1. Problem identification
  2. Identification of information need
  3. Query formulation
  4. Result evaluation
- Static information need
- The goal is to retrieve a perfect match of the information need
Bates, 1989; Broder, 2002
8
IR Research: Overview
- Information Organization: add structure & annotation
- Information Retrieval: create a searchable index
- Information Access: retrieve information
- Data Mining: discover knowledge
9
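IR Research: Information Retrieval
- Raw Data
- Representation: indexing, term weighting
- Searchable Index
- Query Formulation: "What is information retrieval?"
- Search Results: (ranked) document list

Raw data (document text):
D1: information retrieval seminars
D2: retrieval models and information retrieval
D3: information model

Raw data (schematic word sequences):
D1: wd1 wd2 wd3
D2: wd2 wd4 wd1 wd2
D3: wd1 wd4
Note: the word labels in this schematic follow order of appearance (wd2 = retrieval, wd3 = seminar, wd4 = model), which differs from the alphabetical numbering used in the index below.

Index (term frequency per document):
Term                 D1   D2   D3
wd1 (information)     1    1    1
wd2 (model)           0    1    1
wd3 (retrieval)       1    2    0
wd4 (seminar)         1    0    0

Search results:
Rank   docID   score
1      D2      3
2      D1      2
3      D3      1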
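The index and ranking on this slide can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the slide does not say how terms are normalized or scored, so the tiny stop list, the crude plural stripping, and the scoring rule (sum of the query terms' frequencies in each document) are guesses chosen because they match the numbers shown. The names `tokenize`, `build_index`, and `rank` are hypothetical.

```python
from collections import Counter

# Assumption: a tiny illustrative stop list; the slide does not specify one.
STOPWORDS = {"and", "what", "is"}


def tokenize(text):
    """Lowercase, strip punctuation, drop stop words, strip a plural 's'."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip("?.,!\"'")
        if not word or word in STOPWORDS:
            continue
        if word.endswith("s"):
            word = word[:-1]  # crude stand-in for real stemming
        tokens.append(word)
    return tokens


def build_index(docs):
    """Map each term to {doc_id: term frequency}."""
    index = {}
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index.setdefault(term, {})[doc_id] = tf
    return index


def rank(query, index):
    """Score each document by the summed frequencies of the query terms."""
    scores = Counter()
    for term in set(tokenize(query)):
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    # Highest score first; ties broken by document ID.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


DOCS = {
    "D1": "information retrieval seminars",
    "D2": "retrieval models and information retrieval",
    "D3": "information model",
}
INDEX = build_index(DOCS)
RANKING = rank("What is information retrieval?", INDEX)

if __name__ == "__main__":
    for position, (doc_id, score) in enumerate(RANKING, start=1):
        print(position, doc_id, score)
```

Under these assumptions the script lists D2 (score 3), D1 (score 2), then D3 (score 1), the same order as the slide's result table. A real system would normally use a proper stemmer and a weighting scheme such as TF-IDF instead of raw counts.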
10
IR Research: Information Organization
- Raw Data
- Representation: NLP & Machine Learning
- Organized Data
- Query Formulation: "What is IR?"
- Search Results: document groups
11
IR Research: Natural Language Processing
- Goal
  - Understanding/effective processing of natural language
  - Not just pattern matching
- Research area, technique, tool for Knowledge Discovery, Data Mining
- Lexical Analysis using Part-of-Speech (POS) tagging
- Sentence Parsing
12
IR Research: Machine Learning
- Research area, technique, tool for Information Organization, Knowledge Discovery, Data Mining
- Information Organization via
  - Supervised Learning (Automatic Classification)
  - Unsupervised Learning (Clustering)
[Figure: two panels, Classification and Clustering, each showing Class 1 and Class 2]
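The contrast between the two kinds of learning can be made concrete with a small Python sketch. Everything specific here is an assumption for illustration: the slide names no algorithm, so a nearest-centroid classifier stands in for supervised learning and a k-means style loop stands in for clustering, and documents are represented as made-up 2-D feature points. The function names are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def train_classifier(labeled):
    """Supervised: class labels are given; learn one centroid per class."""
    by_class = {}
    for point, label in labeled:
        by_class.setdefault(label, []).append(point)
    return {label: centroid(pts) for label, pts in by_class.items()}


def classify(point, model):
    """Assign the label whose class centroid is nearest to the point."""
    labels = list(model)
    return labels[nearest(point, [model[label] for label in labels])]


def cluster(points, centers, rounds=10):
    """Unsupervised: no labels; alternate assignment and centroid update."""
    assignment = []
    for _ in range(rounds):
        assignment = [nearest(p, centers) for p in points]
        new_centers = []
        for i in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == i]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(centroid(members) if members else centers[i])
        centers = new_centers
    return assignment


# Made-up feature points standing in for documents.
POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
LABELED = list(zip(POINTS, ["Class 1"] * 3 + ["Class 2"] * 3))

MODEL = train_classifier(LABELED)
PREDICTED = classify((2.0, 1.0), MODEL)

# Clustering sees only the points; the seeds are two arbitrary points.
CLUSTERS = cluster(POINTS, centers=[POINTS[0], POINTS[3]])

if __name__ == "__main__":
    print("classification:", PREDICTED)
    print("clustering:", CLUSTERS)
```

The classifier needs the labels in `LABELED` to learn its classes, while `cluster` receives only `POINTS` and returns anonymous group indices. That difference in what each function is given is the distinction the slide draws.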
13
What is Text REtrieval Conference (TREC)?
- Annual Information Retrieval conference
- Sponsored by
  - National Institute of Standards & Technology (NIST)
  - Defense Advanced Research Projects Agency (DARPA)
  - Other U.S. agencies (e.g. DOD)
- Attended by international researchers from academic, commercial, and government institutions
- Goals
  - Advance IR research based on large-scale data
  - Refine IR evaluation methodologies
  - Create test collections for various aspects of IR
  - Stimulate exchange of ideas & communication among academia, industry, and government
Voorhees, 2014
14
TREC Tasks: Tracks
Voorhees, 2014