CSM06 Information Retrieval Lecture 1a – Introduction Dr Andrew Salway

Presentation transcript:


Lecture 1a: INTRODUCTION What is information retrieval?

Why?, Who?, What?
- Why do we need information retrieval?
- Who are the users of information retrieval systems?
- What kinds of information do they want to retrieve?
- Why study information retrieval? For example, why is it important to understand how search engines work?

Applications of Information Retrieval
- For the World Wide Web
- For organisations’ intranets
- For our personal media collections

INSERT GOOGLE screenshot

INSERT AltaVista screenshot

INSERT Yahoo screenshot - Query

INSERT Autonomy screenshot

INSERT IBM Webfountain screenshot

INSERT my screenshot

INSERT my photos screenshot

A very brief history…
- Libraries: for thousands of years
- 1950s: computer-based IR
- Early 1990s: web search
- Late 1990s: multimedia search

Some traditional ways of organizing information
- Table of contents of a book
- Index of a book
- Library classification schemes:
- Hierarchies (e.g. Dewey Decimal)
- Controlled vocabularies
- Collections of abstracts

From the dictionary… Library. 1 A large organised collection of books for reading or reference. b A mass of learning or knowledge; a source providing knowledge and learning. c A collection of films, gramophone records, etc. when organised or sorted for some specific purpose… The New Shorter Oxford English Dictionary, 1993

Information Retrieval: “the representation, storage, organisation of, and access to information items” (Baeza-Yates and Ribeiro-Neto 1999, page 1)

How is computer-based IR different to traditional libraries?
- Remote, multiple access
- May have multiple indexes
- Interactivity
- Scale
- Automatic indexing and ranking
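To make "automatic indexing and ranking" concrete, here is a minimal sketch in Python. It is not taken from the lecture: the function names `build_index` and `search`, the three toy documents, and the scoring rule (summing raw term frequencies) are all assumptions chosen for illustration. Real systems use more refined weighting.

```python
from collections import Counter, defaultdict


def build_index(docs):
    """Build a toy inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Whitespace tokenisation and lowercasing only (an assumption;
        # no stemming or stop-word removal).
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index


def search(index, query):
    """Rank documents by the summed frequency of the query terms."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Highest score first; ties are broken by document id.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ranked]


# Hypothetical mini-collection, for illustration only.
docs = {
    "d1": "information retrieval and web search",
    "d2": "library classification schemes",
    "d3": "web search engines index the web",
}
index = build_index(docs)
results = search(index, "web search")
print(results)
```

Unlike a book index compiled by hand, the index here is built automatically from the text itself, and the same structure is then used to order the matching documents.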

What are the particular challenges for IR on the Web?
- Volume of text data: Google claims to index more than 8,000,000,000 webpages, and that’s not everything
- Multimedia information: traditional IR focussed on texts
- More and more multilingual information
- Cannot access original text when processing a query
- Distributed data: different platforms, bandwidths
- Large amount of volatile data and redundant data
- Diverse users (hence diverse information needs) and many inexperienced users
Some good news though! The links between webpages can be useful for web search engines (more on this in Lecture 4)
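As a rough illustration of why links can help, the sketch below counts how many distinct pages link to each page and treats that count as a crude signal of importance. This is an assumption made for illustration, not the method covered in Lecture 4. The function `inlink_counts` and the tiny link graph are hypothetical.

```python
from collections import Counter


def inlink_counts(links):
    """Count, for each page, how many other pages link to it.

    `links` maps a page to the list of pages it links to.
    """
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        # Count each linking page once and ignore self-links.
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


# Hypothetical link graph, for illustration only.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
counts = inlink_counts(links)
print(counts.most_common())
```

In this toy graph, page "c" is linked to by three other pages and page "d" by none, so a search engine using this signal would favour "c" when both match a query equally well.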

Who are they?

“Analysts estimate that Google is worth between $15 billion and $20 billion” The Times, 29/01/2004