Introduction to Information Retrieval. What is IR? Sit down before fact as a little child, be prepared to give up every conceived notion, follow humbly.


1 Introduction to Information Retrieval

2 What is IR?
"Sit down before fact as a little child, be prepared to give up every preconceived notion, follow humbly wherever and to whatever abysses nature leads, or you shall learn nothing." -- Thomas Huxley
Search Engines
- Google: Query = What is IR? | Query = What is information retrieval?
- Ask.com: Query = What is IR? | Query = What is information retrieval?
- Yahoo!: Query = What is IR? | Query = What is information retrieval?
- Google Korea: Query = What is IR? | Query = What is information retrieval?
- Naver: Query = What is IR? | Query = What is information retrieval?
- Daum: Query = What is IR? | Query = What is information retrieval?

3 IR: Key Questions
- What are we looking for?
- How do we find it?
- Why is it difficult?
“A prudent question is one-half of wisdom” -- Francis Bacon

4 IR: What are we looking for?
We are
- Looking for X.
  - Q&A: population of China
  - Known-item Search: “The Catcher in the Rye”
- Looking for something like/about X.
  - General/background info: Taliban
  - Collection Development: IR Literature
  - Similar to (known) X: like “The Catcher in the Rye”
  - WhatyoumacallX: “the rye-boy story”
- Looking for something
  - Problem Resolution: how can we fight terrorism?
  - Knowledge Development: what is IR?
- Looking
  - Need something, but don’t know what (what’s it all about?)
  - Serendipity: Web surfing

5 IR: How do we find it?
- Brute force search
  - Easy to build, maintain, and use
  - Searcher does all the work; hard to get satisfaction
- Organize/structure the data
  - Intuitive to use
  - Hard to build and maintain
  - Knowledge of builder’s language & organization structure is crucial
- Use a search tool
  - Easier to build and maintain: less manipulation of data
  - Sometimes works, sometimes not (helps to know the language of the data)
- Ask the experts
  - Easy and satisfying to use (by definition)
  - “Expert” knowledge is transitory, hard to encapsulate
- Go with the crowd
  - Relatively easy to build and maintain
  - Limited utility: doesn’t work with “unpopular” X
- Zen-Fusion search
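As a rough illustration of two of the options above, the sketch below contrasts a brute-force scan with a lookup in a prebuilt index. It is only a toy under stated assumptions: the three documents are the short examples that appear later in this deck, and the function names (`brute_force_search`, `build_inverted_index`, `index_search`) are hypothetical, not part of any library.

```python
from collections import defaultdict

# Toy collection (the three example documents used later in the deck).
DOCS = {
    "D1": "information retrieval seminars",
    "D2": "retrieval models and information retrieval",
    "D3": "information model",
}

def brute_force_search(term, docs):
    """Scan every document at query time; nothing to build or maintain."""
    return sorted(doc_id for doc_id, text in docs.items() if term in text.split())

def build_inverted_index(docs):
    """Pay the cost once up front: map each word to the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[word].add(doc_id)
    return index

def index_search(term, index):
    """Answer a query with a single dictionary lookup."""
    return sorted(index.get(term, set()))

index = build_inverted_index(DOCS)
print(brute_force_search("retrieval", DOCS))  # ['D1', 'D2']
print(index_search("retrieval", index))       # ['D1', 'D2']
print(index_search("model", index))           # ['D3']
```

The last query finds D3 but not D2, because D2 says "models" and this sketch matches exact words only. That is one small instance of "helps to know the language of the data".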

6 Information Seeking Process: Dynamic, Interactive, Iterative
- User
  - What am I looking for? - Identification of info. need
  - What question do I ask? - Query formulation
- Intermediary
  - What is the searcher looking for? - Discovery of user’s info. need
  - How should the question be posed? - Query representation
  - Where is the relevant information? - Query-document matching
- Information
  - What data to collect? - Collection development
  - What information to index? - Indexing/Representation
  - How to represent it? - Data structure

7 Information Seeking Models
- Berry-picking Model
  - Interesting information is scattered like berries among bushes.
  - Information seeking is a dynamic, non-linear process, where information needs/queries continually shift.
  - Information needs are not satisfied by a single, final retrieved set of documents, but rather by a series of selections and bits of information found along the way.
- Traditional Model
  - Linear process:
    1. Problem identification
    2. Identification of information need
    3. Query formulation
    4. Result evaluation
  - Static information need
  - The goal is to retrieve a perfect match of the information need
Bates, 1989; Broder, 2002

8 IR Research: Overview
- Information Organization: Add structure & annotation
- Information Retrieval: Create a searchable index
- Information Access: Retrieve information
- Data Mining: Discover knowledge

9 IR Research: Information Retrieval
- Raw Data
    D1: wd1 wd2 wd3
    D2: wd2 wd4 wd1 wd2
    D3: wd1 wd4
- Representation: indexing, term weighting
- Searchable Index
    Index Term          D1  D2  D3
    wd1 (information)    1   1   1
    wd2 (model)          0   1   1
    wd3 (retrieval)      1   2   0
    wd4 (seminar)        1   0   0
- Query Formulation: “What is information retrieval?”
- Search Results: (ranked) document list
    Rank  docID  score
    1     D2     3
    2     D1     2
    3     D3     1
- Example documents
    D1: information retrieval seminars
    D2: retrieval models and information retrieval
    D3: information model
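The index and ranking on this slide can be reproduced with a short sketch. The slide does not state its scoring rule; summing the raw term frequencies of the query terms is an assumption that happens to reproduce the scores shown (3, 2, 1). The `NORMALIZE` table is a hand-written stand-in for real stemming, and all function names are hypothetical.

```python
from collections import Counter

DOCS = {
    "D1": "information retrieval seminars",
    "D2": "retrieval models and information retrieval",
    "D3": "information model",
}

# Assumption: a tiny hand-written normalizer instead of a real stemmer, so that
# "seminars" and "models" map onto the index terms listed on the slide.
NORMALIZE = {"seminars": "seminar", "models": "model"}
INDEX_TERMS = ["information", "model", "retrieval", "seminar"]

def tokenize(text):
    """Lowercase, drop the question mark, split on whitespace, normalize plurals."""
    words = text.lower().replace("?", "").split()
    return [NORMALIZE.get(w, w) for w in words]

def build_index(docs):
    """Build term -> {doc_id: term frequency}, restricted to the index terms."""
    index = {term: {} for term in INDEX_TERMS}
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        for term in INDEX_TERMS:
            index[term][doc_id] = counts[term]
    return index

def rank(query, index, docs):
    """Score each document by summing the frequencies of the query's index terms."""
    scores = {doc_id: 0 for doc_id in docs}
    for term in tokenize(query):
        # Words that are not index terms ("what", "is") contribute nothing.
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

index = build_index(DOCS)
print(index["retrieval"])  # {'D1': 1, 'D2': 2, 'D3': 0}
print(rank("What is information retrieval?", index, DOCS))
# [('D2', 3), ('D1', 2), ('D3', 1)]
```

Raw frequency sums are the simplest possible weighting; the "term weighting" step named on the slide would normally replace them with a more refined scheme.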

10 IR Research: Information Organization
- Raw Data
- Representation: NLP & Machine Learning
- Organized Data
- Query Formulation: “What is IR?”
- Search Results: document groups

11 IR Research: Natural Language Processing
- Goal
  - Understanding/effective processing of natural language
  - Not just pattern matching
- Research area, technique, tool for
  - Knowledge Discovery, Data Mining
- Lexical Analysis using
  - Part-of-Speech (POS) tagging
  - Sentence Parsing

12 IR Research: Machine Learning
- Research area, technique, tool for
  - Information Organization, Knowledge Discovery, Data Mining
- Information Organization via
  - Supervised Learning (Automatic Classification)
  - Unsupervised Learning (Clustering)
[Figure: Classification and Clustering panels, with points labeled Class 1 and Class 2]
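To make the supervised/unsupervised distinction concrete, here is a minimal sketch on made-up 2-D points. The slide names no algorithm: nearest-centroid classification and k-means clustering are swapped in purely as illustrations, the data is invented, and the function names are hypothetical.

```python
def mean(points):
    """Component-wise average of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_centroids(labeled):
    """Supervised: class labels are given; learn one centroid per class."""
    by_class = {}
    for point, label in labeled:
        by_class.setdefault(label, []).append(point)
    return {label: mean(pts) for label, pts in by_class.items()}

def classify(point, centroids):
    """Assign a new point to the class with the nearest centroid."""
    return min(centroids, key=lambda label: dist2(point, centroids[label]))

def kmeans(points, k, iterations=10):
    """Unsupervised: no labels; group points around k centroids."""
    centroids = list(points[:k])  # assumption: first k points seed the centroids
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return clusters

# Invented data: two well-separated groups.
labeled = [((1, 1), "Class 1"), ((1, 2), "Class 1"),
           ((8, 8), "Class 2"), ((9, 8), "Class 2")]
points = [p for p, _ in labeled]

centroids = train_centroids(labeled)
print(classify((2, 1), centroids))  # Class 1
print(kmeans(points, k=2))          # [[(1, 1), (1, 2)], [(8, 8), (9, 8)]]
```

Classification needs the labels to learn from and returns a class name; clustering sees only the points and returns unnamed groups.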

13 What is the Text REtrieval Conference (TREC)?
- Annual Information Retrieval conference
- Sponsored by
  - National Institute of Standards & Technology (NIST)
  - Defense Advanced Research Projects Agency (DARPA)
  - Other U.S. agencies (e.g. DOD)
- Attended by
  - International researchers from academic, commercial, and government institutions
- Goals
  - Advance IR research based on large-scale data
  - Refine IR evaluation methodologies
  - Create test collections for various aspects of IR
  - Stimulate exchange of ideas & communication among academia, industry, and government
Voorhees, 2014

14 TREC Tasks: Tracks
Voorhees, 2014

