IST 497E Information Retrieval and Organization

Presentation transcript:

IST 497E Information Retrieval and Organization Meta Search IST 497 E Meta-Search, Pradeep Teregowda

Overview: What is a Meta Search Engine?, Features, Differences, Architecture, Some Meta Search Engines, Conclusions and Future.

Overview: What is a Meta Search Engine?, Traveling Further, Why Meta Search?

What Is a Meta Search Engine? Dictionary meaning for Meta: “more comprehensive : transcending” (webster.com). Simple explanation: “A Meta Search engine allows you to search multiple search engines at once, returning more comprehensive and relevant results, fast.” (MetaCrawler {modified})

Traveling Further A little bit of history: Meta search started with Harvest (1995) [Ref: Information Discovery and Access System, C. Mic Bowman et al.]. Harvest was developed for gathering information from repositories, building topic-specific content indexes, web caching, and flexible searching. In many ways, current Meta Search engines have similar aims.

Why Are Meta Search Engines Useful? Meta Search improves search quality in several ways: it is comprehensive, it is efficient, and one query queries all engines {one-click paradigm}.

Why Meta Search? Individual search engines don’t cover all of the web by themselves. Individual search engines are prone to spamming {people trying to raise their ranking profile in a non-legitimate manner or to promote commerce}. It is difficult to decide on, and obtain results from, combined searches on different search engines.

Why Meta Search? Data fusion {multiple formats supported}. In the case of niche search engines, it provides the ‘big picture’. It takes less effort.

Overview Features

Features. Unifies the Search Interface and provides a consistent user interface, Standardizes the query structure, May make use of an independent ranking method for the results {rank-merging}, May have an independent ranking system for each search engine/database it searches, Meta Search is not a search for Meta Data.
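As a rough illustration of what "standardizes the query structure" could mean in practice, the sketch below keeps one engine-neutral query object and renders it into two made-up engine syntaxes. The class `UnifiedQuery` and both renderers are hypothetical names chosen for this example; they do not describe the syntax of any real search engine.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UnifiedQuery:
    """Engine-neutral query: required terms, an optional exact phrase, excluded terms."""
    terms: List[str]
    phrase: str = ""
    exclude: List[str] = field(default_factory=list)


def to_plus_minus_syntax(q: UnifiedQuery) -> str:
    """Render for a hypothetical engine that uses +term / -term and quoted phrases."""
    parts = [f"+{t}" for t in q.terms]
    if q.phrase:
        parts.append(f'"{q.phrase}"')
    parts += [f"-{t}" for t in q.exclude]
    return " ".join(parts)


def to_boolean_syntax(q: UnifiedQuery) -> str:
    """Render for a hypothetical engine that uses AND / NOT keywords."""
    required = list(q.terms)
    if q.phrase:
        required.append(f'"{q.phrase}"')
    text = " AND ".join(required)
    for t in q.exclude:
        text += f" NOT {t}"
    return text.strip()
```

The user fills in one form; the meta search engine holds one renderer per participating engine and picks the right one at dispatch time.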

Overview Differences Differences {Search vs. Meta Search}.

Differences {Search vs. Meta-Search} A Meta-Search engine doesn’t generally have a database of its own and does not search {crawl} the web. In search engine terms, a Meta-Search engine is essentially a hub of search engines/databases accessible through a common interface, providing the user with results that may or may not be ranked independently of the original search engine/source ranking.

Overview Architecture/Internals A block representation, What do those blocks do ?, Queries, Ranking.

Architecture [Block diagram. Blocks: User, User Interface, Query, Dispatcher, search engines SE 1, SE 2, SE 3 (Web), Display, Personalize, Knowledge, Feedback.]

What Do Those Blocks Do? User Interface: Normally resembles search engine interfaces, with options for the type of search [media] and the search engines to use. Dispatcher: Generates the actual queries to the search engines from the user query. May involve choosing/expanding the set of search engines to use.
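The dispatcher's fan-out step might look roughly like the following sketch. It assumes each search engine is wrapped in an adapter function that takes a query string and returns a list of result URLs; the `Engine` type and the `dispatch` function are illustrative names, not part of any system described in the slides.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Assumed adapter shape: query string in, result URLs out (best first).
Engine = Callable[[str], List[str]]


def dispatch(query: str, engines: Dict[str, Engine]) -> Dict[str, List[str]]:
    """Send the same query to every engine in parallel; skip engines that fail."""
    replies: Dict[str, List[str]] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in engines.items()}
        for name, future in futures.items():
            try:
                replies[name] = future.result()
            except Exception:
                # A broken engine should not sink the whole search.
                continue
    return replies
```

Running the adapters in parallel means the user waits roughly as long as the slowest engine rather than the sum of all of them; a real dispatcher would probably also enforce a per-engine time limit.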

What Do Those Blocks Do? Display: Generates the results page from the replies received. May involve ranking, parsing, and clustering of the search results, or just plain stitching. Personalization/Knowledge: A system may contain either or both. Personalization may involve weighting of search results/query/engine for each user.
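"Plain stitching" in the display block can be sketched as concatenating the engine replies and folding duplicate URLs into one entry. The normalization rule below (ignore scheme, a leading "www.", and a trailing slash) is a simplifying assumption made for this example, and `normalize` and `stitch` are hypothetical helper names.

```python
from typing import Dict, List
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Crude duplicate key: lowercase host, no scheme, no 'www.', no trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def stitch(replies: Dict[str, List[str]]) -> List[dict]:
    """Concatenate engine replies in order, merging duplicates into one entry."""
    merged: Dict[str, dict] = {}
    for engine, urls in replies.items():
        for url in urls:
            entry = merged.setdefault(normalize(url), {"url": url, "sources": []})
            if engine not in entry["sources"]:
                entry["sources"].append(engine)
    return list(merged.values())
```

Each entry keeps the list of engines that returned it, which a results page could show next to the link.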

Queries. STARTS {protocol}: A simple protocol that text search engines should follow to facilitate searching and indexing multiple collections of text documents. It covers choosing the best sources for a query, evaluating the query at those sources, and merging the query results from them. Inquirus: Expands queries. Ex: (What does Satellite stand for = Satellite stands for).
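The Satellite example can be reproduced with a small rewrite rule. This is only a sketch of the idea of turning a question into the phrase an answer would likely contain; the rule table, the quoting of the rewritten phrase, and the `expand` function are assumptions for illustration and are not taken from Inquirus itself.

```python
import re
from typing import List

# (pattern, template) pairs; illustrative only, more rules could be added here.
RULES = [
    (re.compile(r"^what does (.+?) stand for\??$", re.IGNORECASE), "{0} stands for"),
]


def expand(query: str) -> List[str]:
    """Return the original query plus any answer-phrase rewrites that match."""
    cleaned = query.strip()
    variants = [cleaned]
    for pattern, template in RULES:
        match = pattern.match(cleaned)
        if match:
            # Quote the rewrite so an engine would treat it as an exact phrase.
            variants.append('"' + template.format(match.group(1)) + '"')
    return variants
```

For the query "What does Satellite stand for" this returns the original text plus the quoted phrase "Satellite stands for"; queries that match no rule pass through unchanged.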

Independent Ranking Approaches: stitch together the results [Dogpile]; select particular search engines based on the query, using a meta index [SavvySearch, resource balancing]; rank merging based on search engine rating [MetaCrawler]; context analysis of the search results with respect to the query [Inquirus].
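To make rank merging based on search engine rating more concrete, here is one possible scoring rule: each URL earns the engine's rating divided by its rank in that engine's list, summed over engines. This formula is an assumption chosen for simplicity; it is not claimed to be the method MetaCrawler or any other engine actually used.

```python
from collections import defaultdict
from typing import Dict, List


def merge_ranked(replies: Dict[str, List[str]], engine_rating: Dict[str, float]) -> List[str]:
    """Score each URL by the sum of rating / rank over the engines that returned it."""
    scores: Dict[str, float] = defaultdict(float)
    for engine, urls in replies.items():
        rating = engine_rating.get(engine, 1.0)  # unrated engines count as 1.0
        for rank, url in enumerate(urls, start=1):
            scores[url] += rating / rank
    # Highest score first; ties broken alphabetically so output is deterministic.
    return sorted(scores, key=lambda u: (-scores[u], u))
```

With replies `{"A": ["x", "y", "z"], "B": ["y", "w"]}` and ratings A = 1.0, B = 0.8, the merged order is y, x, w, z: y wins because both engines returned it. The same arithmetic also shows the weakness of this kind of scheme, since whatever a highly rated engine puts first lands near the top of the merged list whether or not it is relevant.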

Fusion of Other Media. Inquirus: Motivation for a meta-search {images}: Queries from the user can be modified, and since individual search engines are good at different types of queries, the combined results are very good. How? See Reference {skipped because of topic overlap}. Can it work for other media? Example: FTP, music. Probably yes [P2P searches].

Other Work {Media Fusion}. MetaSEEk: Visual search engine [for images]; makes use of query by example. Ixquick: MP3, images, news [all come from different search interfaces, so this may not exactly be fusion]. Dogpile: Multimedia, images, news, files.

Overview Some Meta Search Engines On the web, Why are they not so popular ?, Some Reviews.

Meta Search Engines on the Web: MetaCrawler, Ixquick, Inquirus, SavvySearch {now cnet search ?}, Dogpile, Sherlockhound, Vivisimo.

Why Are They Not So Popular? Ads! Some Meta Search engines pick up ads as part of the search results from the participating search engines (example: Dogpile). Others suffer for the same reason as general search engines: ads clutter the search results (examples: MetaCrawler, Ixquick). Paid-placement search engines are included among the sources. Relation with the search engines they depend on {load, interaction}.

Why Are They Not So Popular? Results are only as good as the worst search engine in the group. Combining results using a meta index and rankings can place incorrectly ranked results from one search engine above correct results, reducing the relevance of the merged list. If your highly ranked search engine returns a badly ranked result, then your results are also badly ranked.

Some Reviews: Search Engine Watch [May 2001]
Meta Search Engine    % Results From a Major Search Engine
Vivisimo              100% [No Ads]
Ixquick               70%
MetaCrawler           48%
Dogpile               14%

Overview Conclusions and Future

Conclusions and Future: Vivisimo reported a 43% increase in traffic [wired.com, Aug. 14, 2001]. Apple ‘Sherlock’ has been popular. Multiple media search. P2P searches? Growth of niche search engines.

References: The MetaCrawler Architecture for Resource Aggregation on the Web (1997), Erik Selberg & Oren Etzioni. Context and Page Analysis for Improved Web Search (1998), Steve Lawrence & C. Lee Giles. Experiences with Selecting Search Engines Using Metasearch (1997), Daniel Dreilinger & Adele E. Howe.

Time to Wake Up…

Web References Search Engine Watch Article [Meta Search Or Meta Ads?], Wired Article [Searching For Google’s Successor], MetaSEEk [Writeup], Text and Image Meta Search on the Web [Citeseer Paper Reference].