Published by Philip Robertson. Modified over 9 years ago.
1
HotBot: A Search Engine Case Study
2
Introduction
Owned by Terra/Lycos.
One of the largest web search engines.
Uses the Inktomi database combined with Direct Hit and the DMOZ Open Directory.
The basic search screen is simple, but the advanced search allows for a full range of search features.
3
Databases
Open Directory
Direct Hit
Inktomi
Direct Hit results display if the option for 10 results at a time is selected and there are 10 results available from Direct Hit. If an option for more than 10 results at a time is selected, the Direct Hit results are available via a link.
Other content comes from various advertisers, the Lycos Network, and GoTo. The GoTo and other advertiser results may show up above and/or below the other results but are under a separate heading such as "feature listings."
4
Strengths
Advanced searching capabilities
Page depth limit
Advanced search help
Truncation
5
Weaknesses
Link searches must be exact
Database size shrank for a while
Advanced features have not always worked correctly
6
Features
Default operation: processed as an AND
Full Boolean searching: AND, OR, and NOT
Proximity searching
Truncation with the * symbol
Case sensitive
Extensive, dynamic stop word list
Word stemming: searches for grammatical word variants, including plural, singular, and tense
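As a rough illustration of two of the features above, the default AND and truncation with `*`, here is a minimal Python sketch. The function names are hypothetical, and this is not HotBot's implementation: it ignores stemming, stop words, proximity, and the OR/NOT operators, and it matches terms exactly as typed.

```python
import re

def term_to_regex(term):
    """Build a whole-word pattern; a trailing * matches any word ending."""
    if term.endswith("*"):
        # Truncation: the stem must start a word, any word characters may follow.
        return re.compile(r"\b" + re.escape(term[:-1]) + r"\w*")
    return re.compile(r"\b" + re.escape(term) + r"\b")

def matches_all(query, text):
    """Default AND: every whitespace-separated term must occur in the text."""
    return all(term_to_regex(term).search(text) for term in query.split())
```

For instance, `matches_all("castle bavar*", "a castle in bavaria")` is true, while the untruncated query `cast` does not match the word "castle".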
7
Field Searches
Field searching: searching title words and links to a specific URL
acrobat/applet/activex/audio/embed/flash/form/frame/image/script/shockwave/table/video/vrml
8
Limits
linkdomain: limits to pages containing links to the specified domain
outgoingurlext: limits to pages containing embedded files with the specified extension
scriptlanguage: limits to pages containing only javascript or vbscript
after: [day]/[month]/[year]
before: [day]/[month]/[year]
within:[number/unit]
Language limit
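The after: and before: limits take dates in [day]/[month]/[year] order. A small Python sketch of how such a limit could be applied to a page's modification date follows. The function names are hypothetical, a four-digit year is assumed, and treating both limits as exclusive is an assumption, since the slides do not say how HotBot handles the boundary dates.

```python
from datetime import date

def parse_dmy(text):
    """Parse a [day]/[month]/[year] string such as 15/3/2001."""
    day, month, year = (int(part) for part in text.split("/"))
    return date(year, month, day)

def passes_date_limits(modified, after=None, before=None):
    """True if a page's modification date satisfies the given limits."""
    # Assumption: both limits exclude the boundary date itself.
    if after is not None and modified <= parse_dmy(after):
        return False
    if before is not None and modified >= parse_dmy(before):
        return False
    return True
```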
9
Unique for HotBot
Page Type
  Default is Any (any pages)
  Top Page (the root page of a URL, e.g. www.unca.edu)
Page Depth: limits how far down a subdirectory hierarchy HotBot searches
These are useful for finding the primary sites for organizations or information
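To make the page depth idea concrete, the sketch below counts how many path segments a URL sits below the site root, so a top page has depth 0. This is an illustrative assumption about how depth could be measured, not a description of HotBot's own rule, and `page_depth` and `within_depth` are hypothetical names.

```python
from urllib.parse import urlsplit

def page_depth(url):
    """Count non-empty path segments below the site root (top page = 0)."""
    if "//" not in url:
        url = "//" + url  # lets urlsplit treat a bare host as the host part
    path = urlsplit(url).path
    return len([segment for segment in path.split("/") if segment])

def within_depth(urls, max_depth):
    """Keep only the URLs at or above the given depth."""
    return [url for url in urls if page_depth(url) <= max_depth]
```

Under this counting, www.unca.edu has depth 0, and a page such as www.gorp.com/gorp/location/ny/kyk_hudv.htm has depth 4.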
10
Sorting
Results are sorted by relevance, with groupings by site available at the end of each brief record.
The display includes the relevance score, title, URL, a brief extract, and date. HotBot displays 10 records at a time by default.
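Sorting by relevance with a per-site grouping can be sketched in Python as below. The result tuples, the example hosts, and the function names are all hypothetical; the sketch only shows the general idea of listing the best hit for each site first and attaching that site's remaining hits to it.

```python
from urllib.parse import urlsplit

def site_of(url):
    """Return the host part of a URL, with or without a scheme."""
    if "//" not in url:
        url = "//" + url
    return urlsplit(url).netloc.lower()

def sort_and_group(results):
    """results: list of (score, url) pairs.

    Returns (best_hit, other_hits_from_same_site) pairs, ordered by the
    relevance score of each site's best hit.
    """
    ranked = sorted(results, key=lambda result: result[0], reverse=True)
    groups = {}  # dicts keep insertion order, so sites stay in rank order
    for score, url in ranked:
        groups.setdefault(site_of(url), []).append((score, url))
    return [(hits[0], hits[1:]) for hits in groups.values()]
```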
11
Architecture
Direct Hit:
  Provides the breadth of a conventional search engine with the relevancy of an index that is edited by humans
  References the searching activity of millions of users
  Adjusts rankings based on the popularity of the retrieved documents
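The slides do not give Direct Hit's ranking formula, so the following Python sketch is only an assumed illustration of popularity-adjusted ranking: it blends a text relevance score with a normalized click count. The linear blend, the `weight` parameter, and the function name are all assumptions.

```python
def rerank_by_popularity(results, clicks, weight=0.5):
    """results: list of (url, relevance in 0..1); clicks: dict of url -> count.

    Assumed scoring: (1 - weight) * relevance + weight * normalized clicks.
    """
    top = max(clicks.values(), default=0)

    def blended(item):
        url, relevance = item
        popularity = clicks.get(url, 0) / top if top else 0.0
        return (1 - weight) * relevance + weight * popularity

    return sorted(results, key=blended, reverse=True)
```

With `weight=0` the order falls back to pure relevance, which makes it easy to see how much the click data changes a ranking.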
12
Architecture
Inktomi:
  Hosts Web searches for its clients on a coupled cluster of multiple parallel-computing workstations
  On receiving a search query from a user, the client's Web interface translates the query from HTTP into Inktomi Data Protocol (IDP) and sends it to the Inktomi Master Cluster
  The cluster sends the results in IDP to the client Web server, which translates the information into HTTP and sends it to the user
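The request path described above (HTTP in, internal protocol to the cluster, internal reply back, HTTP out) can be sketched as follows. IDP's actual format is not described in the slides, so plain Python dicts stand in for it here; the shard contents, the `q` parameter, the hosts, and every function name are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical index shards, one per workstation in the cluster.
SHARDS = [
    {"castle": ["a.example/castles"], "bavaria": ["a.example/bavaria"]},
    {"castle": ["b.example/neuschwanstein"]},
]

def http_to_internal(request_url):
    """Front end: turn an HTTP request URL into an internal query message."""
    params = parse_qs(urlsplit(request_url).query)
    return {"terms": params.get("q", [""])[0].split()}

def cluster_search(message):
    """Cluster: ask every shard for every term and merge the hits."""
    hits = []
    for shard in SHARDS:
        for term in message["terms"]:
            hits.extend(shard.get(term, []))
    return {"hits": sorted(set(hits))}

def internal_to_http(reply):
    """Front end: render the internal reply as a plain-text response body."""
    return "\n".join(reply["hits"])

def handle(request_url):
    """Full round trip for one user query."""
    return internal_to_http(cluster_search(http_to_internal(request_url)))
```

The point of the sketch is the separation of roles: only the front end speaks HTTP, and the cluster sees nothing but the internal message.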
13
Results
Query 1: Information on Home of the Rockefellers Kykuit - To test the engines on a very specific bit of Americana - Kykuit, the baronial home of the Rockefellers on the Hudson River in New York.
Query 2: Information on Neuschwanstein Castle - To test the engines on a fairly well-known tourist attraction in Germany - Neuschwanstein Castle.
Query 3: Information on Francis Pilkington Madrigals - To test the engines on retrieval of an obscure musical reference - the Elizabethan madrigals of Francis Pilkington.
14
Query 1: Information on Home of the Rockefellers Kykuit
Hotbot - 72 Matches
  FPL: www.gorp.com/gorp/location/ny/kyk_hudv.htm
  Relevance rating: Page 14: County Histories
Google - 91 Matches
  FPL: www.abbeville.com/booktemplate.asp?stockno=2220
  Relevance: Page 30: A Book Where Kykuit is mentioned
UNCA Library - 5 Matches
  FPL: wncln.appstate.edu/search/...information+on+how+to+use+the+dietary+guidelines&1,1
  Relevance: Page 1: Information on how to use dietary guidelines
15
Query 2: Information on Neuschwanstein Castle
Hotbot - 2,700 Matches
  FPL: www.castlesoftheworld.com/Brochure/
  Relevance: Page 10: Castles of the US
Google - 4,060 Matches
  FPL: www.neuschwanstein-castle.com/
  Relevance: Page 33: A Page on King Ludwig II - No Mention of Neuschwanstein Castle
UNCA Library - 5 Matches
  FPL: wncln.appstate.edu/search/…6,0,0,B/frameset&FF=tinformation+on+self+employment+tax&1,1
  Relevance: Page 1: Information On Self Employment Tax
16
Query 3: Information on Francis Pilkington Madrigals
Hotbot - 53 Matches
  FPL: www.medieval.org/emfaq/cds/van624.htm
  Relevance: Page 5: A Page about the Lute - No Mention of Madrigals
Google - 33 Matches
  FPL: www.netstrider.com/search/methods.html
  Relevance: Page 3: No Mention of Pilkington Madrigals
UNCA Library - 5 Matches
  FPL: wncln.appstate.edu/search/…6,0,0,B/frameset&FF=tinformation+on+the+red+notice+system&1,1
  Relevance: Page 1: Information On The Red Notice System
17
Conclusion
HotBot is an interface to advanced web searches, and it presents a dynamically changing backend. Both the Inktomi and Direct Hit technologies serve, in different ways, to provide a relevant list of results through advanced queries, and both seek to minimize commercial influence over search results. All of these technologies are subject to changes in technology and in the business environment.
Its weaknesses are that it still does not seem to offer the depth and breadth of some other engines, and that its advanced features have not always worked correctly. As the engine's index and search features continue to grow, these weaknesses should be overcome.