The Players: The Majors, Dead Search Engines, International Search Engines, Metasearch Engines

Google
- Developed as BackRub by Stanford University students Larry Page and Sergey Brin
- Became a private company and changed its name to Google in 1998
- One of the largest databases: more than 8 billion pages (the count includes pages its robots have visited, even if the indexing program has not fully indexed them)
- Indexes 3 billion pages every 28 days; 3 million every day
- Makes money by powering over 130 portals and corporate Web sites, and through AdWords

Google: Spidering
- Uses its own bots to spider the Web
- Generally ignores meta keywords and description tags

Google: Indexing
- Descriptions (snippets) are formed automatically by extracting the most relevant portions of pages
- Finds the first instance of the search term on a page, then includes the words that appear around that term
- Only indexes the first 100K or so of a page
- Some pages have no description: Google will include a "botted" page even if it has not been "indexed"
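The snippet rule described above (find the first instance of the search term, then include the words around it) can be sketched as follows. This is only an illustration of the idea, not Google's implementation; the function name `make_snippet` and the `window` parameter are assumptions introduced here.

```python
def make_snippet(text: str, term: str, window: int = 5) -> str:
    """Return the words surrounding the first occurrence of `term` in `text`.

    `window` (an assumed parameter) is the number of words kept on each side.
    """
    words = text.split()
    # Compare case-insensitively and ignore punctuation attached to a word.
    lowered = [w.strip('.,;:!?"()').lower() for w in words]
    try:
        hit = lowered.index(term.lower())
    except ValueError:
        return ""  # the term does not appear on the page: no snippet
    start = max(0, hit - window)
    end = min(len(words), hit + window + 1)
    prefix = "... " if start > 0 else ""
    suffix = " ..." if end < len(words) else ""
    return prefix + " ".join(words[start:end]) + suffix


page = ("Search engines send robots to crawl the web and build "
        "an index of pages for later retrieval.")
print(make_snippet(page, "index", window=2))
```

A page on which the term never appears yields an empty snippet, which loosely mirrors the slide's remark that some listed pages have no description.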

Google: Indexes
- Web: indexed Web pages and other file types
- Ads: paid advertisements appear on the right side or above search results under a "Sponsored Links" heading
- Images: million+ images searched
- Groups: million+ Usenet messages searched
- News
- Directory: a ranked version of the Open Directory using Google's PageRank
- Froogle: shopping and product search
- Catalog Search: scanned, searchable retail catalogs

Google: Web index subsets
- Government sites
- Military sites
- University sites
- Linux sites
- Apple/Macintosh sites
- Microsoft sites

Google: New!
“Google teams with the libraries of Harvard, Stanford, the University of Michigan, the University of Oxford, and The New York Public Library to digitally scan books from their collections so that users worldwide can search them in Google…Users searching with Google will see links in their search results page when there are books relevant to their query. Clicking on a title delivers a Google Print page where users can browse the full text of public domain works and brief excerpts and/or bibliographic data of copyrighted material. Library content will be displayed in keeping with copyright law.”

Yahoo! Search
- Originally just a subject directory
- Search engine launched Feb
- Indexes the first 500 KB of a Web page
- Includes some pay-for-inclusion sites

Teoma
- Founded in 2000 by a team of scientists from Rutgers University
- Teoma means "expert" in Gaelic
- Acquired by Ask Jeeves, Inc. in September 2001

Teoma
- More than 2 billion English-only Web documents
- Spam, duplicates and pornographic results removed from the index
- Indexes the whole page; no stop words
- Considers meta-tag descriptions
- Aims to re-index every month (freshness)
- Sponsored links from Google AdWords

Teoma: Establishing authority and relevancy
- Refine: organizes sites into naturally occurring communities that are about the subject of each search query
- Results: analyzes the relationship of sites within a community, ranking a site based on the number of same-subject pages that reference it (Subject-Specific Popularity)
- Resources: identifies expert resources about a particular subject
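As a rough illustration of Subject-Specific Popularity, the sketch below counts only the links that come from pages inside the same subject community. Teoma's actual method is not described in these slides; the function name, the `(source, target)` link format, and the tie-breaking rule are all assumptions.

```python
from collections import Counter


def subject_specific_popularity(links, community):
    """Rank community pages by how many other community pages link to them.

    links:     iterable of (source, target) pairs (assumed input format)
    community: pages judged to be about the query's subject
    """
    members = set(community)
    scores = Counter({page: 0 for page in members})
    for source, target in set(links):  # count each distinct link once
        # Links from outside the community do not count as votes.
        if source in members and target in members and source != target:
            scores[target] += 1
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


links = [("a", "c"), ("b", "c"), ("x", "a"), ("x", "a"), ("c", "b")]
print(subject_specific_popularity(links, community=["a", "b", "c"]))
```

Here page "x" links to "a" twice, but because "x" is outside the community those links add nothing to the score of "a".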

Gigablast
- Founded in 2000
- Built and operated by sole proprietor Matt Wells
- Created to index up to 200 billion pages with the least amount of hardware possible
- Currently indexes 650 million
- Provides "Gigabits" to help searchers refine their search based upon related topics from search results
- Makes money by selling search services to private companies

Wisenut
- Newer database
- ~ million pages indexed
- 1.5 billion identified but not crawled/indexed
- Few advanced search features
- Spider capable of fetching more than 100 million a day
- Often months out of date
- Smart/Relevant: all words on page, text or referring links and words around them, significance and content of pages with the links
- Generates automatic semantic searches called WiseGuide categories

MSN Search
- New, improved
- ~4.2 billion pages searched/indexed?
- Formerly used Inktomi; now has proprietary robots, indexer, and retrieval engine

Dead Search Engines: Whatever happened to...?
- Direct Hit: defunct, redirecting to Teoma
- Infoseek: defunct, redirecting to Go
- Magellan: dead, redirects to WebCrawler
- Northern Light: defunct
- Openfind: under "reconstruction" as of 2003
- WebTop: dead

Dead Search Engines: The search engine formerly known as...
- AlltheWeb: uses Yahoo! database
- AltaVista: uses Yahoo! database
- Excite: uses an InfoSpace metasearch
- Go: took over Infoseek, but now just uses Overture
- iWon: now uses Google "sponsored" ads, Web, and image databases
- LookSmart: uses Wisenut search engine
- Lycos: uses Yahoo!/Inktomi database and LookSmart directory
- NBCi (formerly Snap): uses metasearch engine Dogpile
- WebCrawler: uses an InfoSpace metasearch

International Search Engines
There are hundreds of search engines all over the world. We will not be investigating any of these very closely, but you can use the resources below to locate and master international search engines:
- All Search Engines: foreign search engines
- Search Engines Worldwide
- Search Engine Colossus
- Country-specific Search Engines

Metasearch Engines
- A metasearch engine queries other search engines and then combines the results received from all of them
- The user is not relying on just one search engine but on a combination of many at once, to optimize Web searching

Metasearch Engines: How they differ
- Engines covered (many are pay-for-placement)
- Number of engines that can be searched at once
- Sophistication of the search query
- Number of records retrieved from each search engine
- Length of time spent searching each search engine
- Whether duplicates are deleted (de-duping)
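Combining result lists and de-duping them can be sketched as below. This is a minimal illustration, assuming each engine has already returned an ordered list of URLs; the engine names, the `per_engine` cap, and the URL normalization rule (ignore scheme, a leading "www.", and a trailing slash) are assumptions, and real metasearch engines may de-dupe quite differently.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so trivially different forms compare equal."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Simplification: the query string and fragment are ignored.
    return host + parts.path.rstrip("/")


def merge_results(results_by_engine, per_engine=10):
    """Merge per-engine result lists, keeping one entry per distinct page."""
    merged = {}  # normalized URL -> entry; dicts preserve insertion order
    for engine, urls in results_by_engine.items():
        for url in urls[:per_engine]:  # number of records taken per engine
            entry = merged.setdefault(normalize(url), {"url": url, "engines": []})
            if engine not in entry["engines"]:
                entry["engines"].append(engine)
    return list(merged.values())


results = {
    "engine_a": ["http://www.Example.com/page/", "http://a.org/x"],
    "engine_b": ["https://example.com/page", "http://b.org/y"],
}
for entry in merge_results(results):
    print(entry["url"], entry["engines"])
```

Recording which engines returned each page keeps the information that later ranking schemes, such as vote counting, would need.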

Metasearch Engines
- Dogpile
- Metacrawler
- Mamma
- KartOO
- Clusty
- Surfwax
- Ixquick
- Fazzle
- InfoGrid
- Gimenei

Metasearch Engines
Good for getting a lay of the land:
- What is out there?
- Is there anything out there?
- Who covers a topic best?
- Learning the names of new or emerging search engines

Metasearch Engines
Otherwise, you are usually better off searching multiple search engines individually:
- Syntax varies among search engines, and metasearch engines may not allow you to make use of all search engines
- A metasearch engine may not translate your query well into the syntax of different search engines

Metasearch Engines
Check out some cool, value-adding features emerging in metasearch engines

Clusty (using the Vivisimo clustering engine)
- Clustering: uses an algorithm to group search results based on textual and linguistic similarity
- Groups are further refined using heuristics (i.e., human knowledge) designed to show what users wish to see when they examine clustered documents
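Grouping results by textual similarity can be illustrated with a very small sketch. It is not the Vivisimo algorithm, which the slides do not describe; it swaps in a simple greedy grouping based on Jaccard word overlap, and the stop-word list, the `threshold` value, and the function names are assumptions.

```python
STOP_WORDS = {"a", "an", "and", "in", "of", "the"}  # assumed, deliberately tiny


def tokens(text):
    """Lower-cased set of words in a result snippet, minus stop words."""
    return {w.strip('.,;:!?"()').lower() for w in text.split()} - STOP_WORDS


def jaccard(a, b):
    """Share of words two snippets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(snippets, threshold=0.25):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of snippet indices; first is the seed
    for i, snippet in enumerate(snippets):
        words = tokens(snippet)
        for members in clusters:
            if jaccard(words, tokens(snippets[members[0]])) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])  # nothing similar found: start a new cluster
    return clusters


snippets = [
    "jaguar car dealer prices",
    "jaguar car reviews and prices",
    "jaguar big cat habitat",
    "big cat habitat in the rainforest",
]
print(cluster(snippets))
```

For an ambiguous query such as "jaguar", the car-related and animal-related results end up in separate groups, which is the kind of quick overview the slide describes.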

Clusty
“Vivísimo's Clustering Engine lets you see deeper and farther--with less effort--into a large number of search results to:
- Get a quick overview of the main themes that relate to the query.
- See similar results grouped together for faster access.
- Find results that are buried in the ranked list and would otherwise be missed.
- Discover unexpected results and relationships between items.”

Mamma: rSort
- Considers each listing duplicated in more than one search engine as a "vote" for that page
- Uses the votes to rank pages per the "Condorcet Method"
- One of the big advantages of this ranking method is the elimination of search engine spam
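The voting idea can be sketched with a generic Condorcet-style ranking. This is not Mamma's rSort code, which is not public in these slides; it is a Copeland variant (rank pages by the number of pairwise contests they win), and the treatment of unlisted pages and the tie-breaking rule are assumptions.

```python
from itertools import combinations


def condorcet_rank(rankings):
    """Rank pages by pairwise majority wins across several engines' rankings.

    rankings: one ordered list of pages per search engine (best first).
    A page an engine did not list is treated as ranked below all it did list.
    """
    pages = sorted({page for ranking in rankings for page in ranking})
    positions = [{page: i for i, page in enumerate(r)} for r in rankings]

    def votes_for(a, b):
        # Number of engines that place page a above page b.
        return sum(
            pos.get(a, len(pos)) < pos.get(b, len(pos)) for pos in positions
        )

    wins = {page: 0 for page in pages}
    for a, b in combinations(pages, 2):
        a_votes, b_votes = votes_for(a, b), votes_for(b, a)
        if a_votes > b_votes:
            wins[a] += 1
        elif b_votes > a_votes:
            wins[b] += 1
    # Most pairwise wins first; ties broken alphabetically.
    return sorted(pages, key=lambda page: (-wins[page], page))


rankings = [
    ["spam", "a", "b"],  # one engine has been gamed into ranking "spam" first
    ["a", "b"],
    ["a", "b"],
]
print(condorcet_rank(rankings))
```

A page pushed to the top of a single engine loses its pairwise contests against pages that most engines agree on, which is one way to read the slide's claim about eliminating search engine spam.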

KartOO
- Interactive mapping display for results
- Uses a proprietary algorithm to sort pages
- Relevance of results is displayed as different-sized pages
- When you move the pointer over these pages, the relevant keywords are illuminated and a brief description of the site appears on the left side of the screen
- Click keywords to refine the search
- Refined or further results are also displayed on a map

Surfwax
- Targeted multi-source searching
- Searches only sources from specific domains or topics determined to be relevant
- SurfWax can spider deeper in any public site, including pages or parts that are invisible to traditional search engines
- Uses a site's existing search syntax to uncover "deeper" content

Ixquick
- Understands and translates, when possible, complex syntax
- Complete Boolean searching
- Truncation/wildcard searching
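What Boolean and truncation searching mean in practice can be shown with a small matcher. It is only an illustration, not how Ixquick evaluates queries; the `must` / `should` / `must_not` parameters stand in for AND, OR, and NOT, and the use of `*` as the truncation symbol is an assumption.

```python
from fnmatch import fnmatchcase


def term_matches(pattern, words):
    """True if any word matches the pattern; '*' truncates (e.g. librar*)."""
    return any(fnmatchcase(word, pattern.lower()) for word in words)


def boolean_match(text, must=(), should=(), must_not=()):
    """Evaluate a simple Boolean query against one document's text."""
    words = {w.strip('.,;:!?"()').lower() for w in text.split()}
    if any(term_matches(p, words) for p in must_not):   # NOT terms
        return False
    if not all(term_matches(p, words) for p in must):   # AND terms
        return False
    # OR terms: at least one must match, if any were given.
    return not should or any(term_matches(p, words) for p in should)


doc = "Librarians index the invisible web"
print(boolean_match(doc, must=["librar*"], must_not=["spam"]))
print(boolean_match(doc, must=["librar*", "catalog*"]))
```

The truncated term `librar*` matches "librarians", "library", and "libraries" alike, which is why truncation is useful when the exact word form is unknown.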

Fazzle
- Metasearches search engines, plus unique searches in news and other invisible Web resources
- Ranks everything together
- Delivers timely resources from news sources
- Delivers dynamic content missing from other metasearch engines