Choosing a Search Engine Taly Sharon Thanks to Ariel Frank, Bar-Ilan University
82% Loyal to SE iProspect
Search Engines Diverging Looking at the organic or natural listings for more than 485,000 first page search results, the study found that: Dogpile
Experienced Searchers use More Search Engines HarvestDigital
General rules for choosing SEs Use "major" SEs that are both well-known and well-used (and that hopefully won’t be downgraded or disappear soon). Prefer SEs that employ both a huge index and a comprehensive directory (this gives better results, and you can switch between the two). Stick to SEs of established companies that treat search as their main business/expertise.
Google Trends ask&ctab=0&geo=all&date=all&sort=0
Criteria for Choosing SEs
1. Database (different)
2. Ranking algorithm
3. Query options (site, intitle, inurl…)
4. Added values/features (clustering, define, NLP, …)
5. User Interface (UI)
Who Powers Whom? Major distinct databases: –Google –Yahoo –MSN –Ask –Wisenut, Exalead, etc. The rest of the search engines use the same databases as the above search engines – different retrieval algorithms, see:
SE Database Facts Summary Google is fed from DMOZ. Google feeds Excite, HotBot, iWon, Netscape and AOL Search. Yahoo! is fed from Inktomi and feeds Excite. Ask is fed from Google and DMOZ. Directories –Yahoo! is not fed from DMOZ –But almost everyone else is!
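The feeding relationships summarized above can be written down as a small directed graph, which makes "who powers whom" questions easy to answer. This is only a sketch: the pairs are transcribed from this slide (they reflect the situation at the time of the talk), and the names `FEEDS`, `providers_of` and `consumers_of` are hypothetical helpers, not part of any search engine's API.

```python
# (provider, consumer) pairs transcribed from the slide above.
# The data is an assumption taken from the slide, not a live lookup.
FEEDS = [
    ("DMOZ", "Google"),
    ("Google", "Excite"),
    ("Google", "HotBot"),
    ("Google", "iWon"),
    ("Google", "Netscape"),
    ("Google", "AOL Search"),
    ("Inktomi", "Yahoo!"),
    ("Yahoo!", "Excite"),
    ("Google", "Ask"),
    ("DMOZ", "Ask"),
]


def providers_of(engine, feeds=FEEDS):
    """Return the sorted list of sources that feed the given engine."""
    return sorted(provider for provider, consumer in feeds if consumer == engine)


def consumers_of(source, feeds=FEEDS):
    """Return the sorted list of engines fed by the given source."""
    return sorted(consumer for provider, consumer in feeds if provider == source)


if __name__ == "__main__":
    print(providers_of("Excite"))   # ['Google', 'Yahoo!']
    print(consumers_of("DMOZ"))     # ['Ask', 'Google']
```

Engines that share a provider draw on the same database, so querying both adds little coverage; only the retrieval and ranking differ.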
Yahoo!
Ask
Directories
Why use Google? (1) Biggest, most comprehensive coverage: ~8 billion Web pages (but ~1 billion of them aren’t full-text searchable!), or ~11 billion documents if you count images and newsgroup postings. Fastest around. Most relevant results (voted most outstanding SE three times by Search Engine Watch readers). Provides good directory results (ranks DMOZ Open Directory results by PageRank).
Why use Google? (2) Has the thinnest/cleanest interface around, but provides a rich set of advanced search features/tools(/hacks). Finds similar/related pages. Supports Web page translation. Keeps a cached (HTML) copy of pages (great for a quick view of DOCs/PDFs and for 404s). Google Alerts – use of push technology.
Share of Searches Share Of Searches: July 2006
Why use Yahoo! search? (1) Has brand new Yahoo! search – gives highly relevant Web results (at Google level). Still supports an expert’s humanly-compiled directory (dir.yahoo.com). Has (also) a thin interface (search.yahoo.com) while providing a rich set of advanced search features/shortcuts.
Why use Yahoo! search? (2) For legacy reasons (oldest of all directories). Puts particular emphasis on personalization and customization (my.yahoo.com). Had enough of Googlism. It devoured/uses (know-how from) Overture (Inktomi, AltaVista and AllTheWeb, etc…). Has many specialty SEs – better than Google.
Hidden Gem Yahoo! Search Subscriptions
Google in 1998 – looking up at Yahoo!? Source: Internet archive’s Wayback machine
Search Relevancy
6 Reasons to use Yahoo!
1. Long queries (>32 terms, >256 chars)
–Especially useful when using OR
2. Search for XML/RSS
3. Better link: search
–More extensive results
–More options (linkdomain:, linksite:)
4. Mix syntax
–link: site:gov
5. Google is the most exposed to spam.
6. Some special services.
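The first reason above suggests that other engines may cut off queries longer than roughly 32 terms or 256 characters. A minimal sketch for checking a query against those thresholds before deciding where to run it follows. The function names are hypothetical, and counting every whitespace-separated token (including OR) as a term is an assumption, since the slide does not say how engines count.

```python
def query_size(query):
    """Return (number of whitespace-separated terms, number of characters)."""
    return len(query.split()), len(query)


def is_long_query(query, max_terms=32, max_chars=256):
    """True if the query exceeds either threshold quoted on the slide.

    The defaults (32 terms, 256 characters) come from the slide; how a real
    engine counts terms is an assumption here.
    """
    n_terms, n_chars = query_size(query)
    return n_terms > max_terms or n_chars > max_chars


if __name__ == "__main__":
    short = "link:example.org site:gov"
    # 20 words joined by OR give 39 tokens, which is over the 32-term threshold.
    long_or = " OR ".join("word%d" % i for i in range(20))
    print(is_long_query(short))    # False
    print(is_long_query(long_or))  # True
```

OR-heavy queries hit the term threshold quickly because each OR adds a token, which is why the slide singles them out.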
Why use MSN? Relatively new -- re-written in One of the 3 Major DBs. Direct answers -- from Microsoft Encarta®, encyclopedia. Direct actions -- to MSN channels. 1.When you need more results 2.When you need some unique query options: –prefer: –ip: –contains: (music contains:wma) –Feed:, hasfeed: 3.When you need UI options (especially sorting): Date Popularity Exact/approximate match
Why use Ask? Small Index but interesting results Provides ExpertRank -“subject specific” ranking of pages. Provides a Natural Language interface (uses NLP). Refine: Suggests related searches. Comments: Name AskJeeves changed to Ask Teoma gone with the Resources (results, refine, resources)
Why Use Ask? Query suggestions/fill Q&A engine Smart Answers Query refinements Different results
Ask
Why Use Exalead New Search engine Another stand-alone database Advanced search features: –Words starting with –Words at proximity –Search method: exact search/automatic word stemming/phonetic search/approximate spelling –Document sorting: relevance/oldest/newest –Modification date: simply write date!!!!
Why use Exalead Preferences – instant page translation Filters/refinements: –Related terms –Related categories (DMOZ) –Web site location –Document type (PDF/TXT/DOC/PPT) –Result presentation (documents/ documents+thumbnails/thumbnails) –Preview
A9 Great UI Searches also books Visual Yellow pages and street photos Leader of innovative services –Search history –split view Good for obscure topics (because it searches books)
Some Notes A9 – customize, special features AOL – good for beginners Looksmart – Findarticle Lycos – what people are talking about (people, forums) MSN – generally fewer results but growing Yahoo – MM, local/people searches and more Gigablast – site: search (but small index) up to 500 sites!
GigaBlast
Practical recommendations
Two major SEs (usually use both):
1. Google (GG)
2. Yahoo! search (YH) or MSN
One Meta-SE (as a backup):
3. Dogpile or Clusty
Don’t forget the invisible web!
Note: Choices are not Hebrew oriented.
Hebrew Search Engines? MSN, Google, Yahoo Clusty (MSE) Netex (directory), Walla, Nana Morfix Start, a Many more (try Heb query of your favorite SE)
Bibliography/Credits searchenginewatch.com searchengineshowdown.com/ ngine.html infopeople.org/search (Hebrew)
Exercises
1. Find the page this was quoted from: "EcoOcean cooperates with the Heschel Center in educating"
2. Find a page that has a Flash communication demo.
3. Who provides search feed to Netscape?
4. Search for pages, books, and pictures about the invisible web.
5. Find a picture of ABC Pizza House in Cambridge MA.
6. Find information about “Meryl Stripp”. You are not sure of the correct spelling (try with the given spelling). Which search engine is useful here?
7. You get a list of 10 websites you want to run a query on. Which search engine can run them together?
–Example: "taly sharon" (site:acm.org OR site:dblp.com OR site:googleguide.co.il OR site:googleguide.com OR site:sharon-it.com OR site:ifla.org OR site:media.mit.edu OR site:technion.ac.il OR site:netanya.ac.il OR site:biu.ac.il)
8. What if you had 400 websites?
9. What is the west wing? Suggest options to narrow this search.
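For exercises 7 and 8, typing a multi-site query by hand is error-prone (it is easy to drop a `site:` prefix). A minimal sketch that builds such a query from a list of domains, and splits a long list into several smaller queries, is shown here. The helper names and the batch size are hypothetical; batching is just one possible approach to exercise 8, not necessarily the intended answer.

```python
def build_site_query(phrase, sites):
    """Build '"phrase" (site:a OR site:b ...)' from a phrase and a list of domains."""
    site_clause = " OR ".join("site:%s" % s for s in sites)
    return '"%s" (%s)' % (phrase, site_clause)


def batched_site_queries(phrase, sites, batch_size=10):
    """Split a long site list into batches and build one query per batch.

    batch_size is an illustrative assumption; pick it to fit whatever
    query-length limit the chosen engine enforces.
    """
    return [
        build_site_query(phrase, sites[i:i + batch_size])
        for i in range(0, len(sites), batch_size)
    ]


if __name__ == "__main__":
    print(build_site_query("taly sharon", ["acm.org", "biu.ac.il"]))
    # "taly sharon" (site:acm.org OR site:biu.ac.il)

    many_sites = ["site%d.example" % i for i in range(400)]  # placeholder domains
    print(len(batched_site_queries("taly sharon", many_sites)))  # 40
```

Running one query per batch means merging and de-duplicating results by hand, so an engine that accepts the whole site list in a single query remains the more convenient option when one is available.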