Published by Samuel Clarke. Modified over 9 years ago.
1
Hyper-Searching the Web
2
Search Engines
- Basic search (index)
- Cluster search (themes)
- Meta-search (outsource)
- “Smarter” meta-search (themes + outsource)
3
Basic search engine
- Examples: AltaVista, InfoSeek, HotBot, Lycos, Excite, Google, etc.
- Maintains an index of every word found
- Works by crawling, indexing, and returning results
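To make the index step concrete, here is a minimal sketch of a word index in Python. It assumes the crawl has already produced a small dictionary of page texts; the page ids, the sample texts, and the function names `build_index` and `search` are hypothetical and not taken from any of the engines listed.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the ids of pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical pages standing in for the output of a crawler.
pages = {
    "a": "jaguar is a big cat",
    "b": "the jaguar car is fast",
    "c": "a fast car",
}
index = build_index(pages)
print(search(index, "jaguar"))
print(search(index, "fast car"))
```

A real engine would also store word positions and apply a ranking step before returning results; this sketch only shows the lookup.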
4
Basic search engine
- Different ranking systems are used
- Most use heuristics (the easiest solution), such as counting the number of query keywords that appear on a page
- Google uses PageRank
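The keyword-counting heuristic can be sketched in a few lines of Python. This is only an illustration under simple assumptions (whitespace tokenization, no stemming); the names `keyword_score` and `rank` and the sample pages are hypothetical.

```python
def keyword_score(text, query):
    """Count how many times the query's words occur in the text."""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())

def rank(pages, query):
    """Order pages by keyword count, dropping pages with no match."""
    scored = [(keyword_score(text, query), page_id)
              for page_id, text in pages.items()]
    # Highest score first; ties are broken by page id for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page_id for score, page_id in scored if score > 0]

# Hypothetical page texts.
pages = {
    "a": "car car car dealer",
    "b": "car rental",
    "c": "bicycle shop",
}
print(rank(pages, "car"))
```

The weakness is visible in the sample data: page "a" ranks first only because it repeats the keyword, which is why link-based methods such as PageRank were introduced.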
5
Basic search engine
- The engine has no idea of the searcher’s intent, so the “best” result is hard to achieve
- Problems with synonymy and polysemy
- Synonymy: e.g. car and automobile
- Polysemy: e.g. jaguar
- One solution: store semantic relations, but this only helps with synonymy
- Cannot identify concepts or author intent, e.g. the IBM site does not say “computer”
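A small Python sketch shows why stored semantic relations help with synonymy but not with polysemy or concepts. The `SYNONYMS` table and the functions `expand_query` and `matches` are hypothetical, chosen only to mirror the examples on the slide.

```python
# Hypothetical table of stored semantic relations (synonyms only).
SYNONYMS = {
    "car": {"automobile"},
    "automobile": {"car"},
}

def expand_query(query):
    """Return the query words plus any stored synonyms."""
    terms = set()
    for word in query.lower().split():
        terms.add(word)
        terms |= SYNONYMS.get(word, set())
    return terms

def matches(text, query):
    """True if the text contains any word of the expanded query."""
    return bool(expand_query(query) & set(text.lower().split()))

# Synonymy: "car" now finds a page that only says "automobile".
print(matches("used automobile sales", "car"))
# Polysemy: both senses of "jaguar" still match, so nothing is resolved.
print(matches("jaguar habitat", "jaguar"), matches("jaguar dealership", "jaguar"))
# Concepts: a page that never says "computer" is still missed.
print(matches("IBM mainframes", "computer"))
```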
6
Cluster search engine
- Example: Clusty
- Clusters results into categories/themes
- Can show results that would be ranked lower in another search engine: because words have different meanings, it can surface the less searched-for ones
7
Meta-search engine
- Examples: Dogpile, Surfwax, Copernic, etc.
- Sends the searcher’s query to a database of search engines
- Claimed to be no better than the engines in its database; the referenced search engines are often small, free, or commercial
- Users can create their own on Google, with up to 5,000 URLs as the “database”
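The outsourcing idea can be sketched as follows. The slides do not say how a meta-search engine merges the lists it gets back, so the merge rule here (summing reciprocal ranks) is an assumption, and the three engine functions and the `.example` URLs are hypothetical stand-ins for calls to real search engines.

```python
from collections import defaultdict

def meta_search(query, engines):
    """Send the query to every engine and merge the ranked lists.

    Assumed merge rule: each engine gives a URL 1/rank points,
    and URLs are ordered by their total.
    """
    scores = defaultdict(float)
    for engine in engines:
        for rank, url in enumerate(engine(query), start=1):
            scores[url] += 1.0 / rank
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical engines that return fixed ranked lists.
def engine_a(query):
    return ["x.example", "y.example", "z.example"]

def engine_b(query):
    return ["y.example", "x.example"]

def engine_c(query):
    return ["y.example", "w.example"]

merged = meta_search("jaguar", [engine_a, engine_b, engine_c])
print(merged)
```

Because the merged list can only reorder what the underlying engines return, the sketch also illustrates the claim that a meta-search engine is no better than its database.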
8
“Smarter” meta-search engine
- Example: Clever project (not yet available online)
- Includes clustering and linguistic analysis
- Example query “cat”:
- Cat – feline
- Cat – power
- Cat – equipment
- Cat – scans
- etc.
9
The Clever Project
- Uses hyperlinks to locate hubs and authorities
- “A respected authority is a page that is referred to by many good hubs; a useful hub is a location that points to many valuable authorities”
10
The Clever Project
- Obtains a list of webpages from a standard index and follows hyperlinks to enlarge its own database
- The resulting collection is the “root set”
- Each page gets a numerical hub score and authority score
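The collection step can be sketched as a one-hop expansion over a link graph. This is an illustration only: the slide says hyperlinks are followed, and the sketch assumes that means both links out of the seed pages and links into them. The function name `build_root_set`, the `links` dictionary, and the page names are hypothetical.

```python
def build_root_set(seed_pages, links):
    """Expand seed pages by one hop along hyperlinks.

    links maps each page to the set of pages it links to.
    Assumption: both outbound and inbound links are followed.
    """
    seeds = set(seed_pages)
    root = set(seeds)
    for source, targets in links.items():
        if source in seeds:
            root |= targets        # pages that a seed points to
        elif targets & seeds:
            root.add(source)       # pages that point to a seed
    return root

# Hypothetical link graph; "s1" is the page returned by the standard index.
links = {
    "s1": {"p1", "p2"},
    "h1": {"s1", "p3"},
    "q": {"r"},
}
root_set = build_root_set(["s1"], links)
print(sorted(root_set))
```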
11
The Clever Project
- Similar to PageRank in how scores are determined: starts from guesses and recalculates repeatedly
- Useful by-product: clusters sites
- Still relates competing sites, because competitors do not have to acknowledge each other through hyperlinks; hubs that point to both connect them
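The guess-and-recalculate loop can be sketched in Python as a mutual update of hub and authority scores. This is a minimal sketch, not the Clever project's actual code: the starting guess of 1.0, the fixed iteration count, the length normalization, and the page names are all assumptions.

```python
import math

def normalize(scores):
    """Scale scores so that their squares sum to 1."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}

def hub_authority_scores(links, iterations=50):
    """Repeatedly recalculate hub and authority scores from a link graph.

    links maps each page to the set of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {page: 1.0 for page in pages}      # initial guess
    for _ in range(iterations):
        # Authority: sum of the hub scores of the pages that link to it.
        auth = {page: 0.0 for page in pages}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub: sum of the authority scores of the pages it links to.
        hub = {page: sum(auth[t] for t in links.get(page, ()))
               for page in pages}
        auth = normalize(auth)
        hub = normalize(hub)
    return hub, auth

# Hypothetical graph: two rival sites never link to each other,
# but two hubs point to both of them.
links = {
    "hub1": {"maker_a", "maker_b"},
    "hub2": {"maker_a", "maker_b"},
    "blog": {"maker_a"},
}
hub, auth = hub_authority_scores(links)
print(max(auth, key=auth.get), max(hub, key=hub.get))
```

In the sample graph both rivals end up with high authority scores even though neither links to the other, which is the competition point made on the slide.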
12
Clever vs. Google
GOOGLE
- Gives initial rankings
- Keeps pages independent of queries
- Faster
- Looks forward only, “link to link”
CLEVER
- Builds a root set per keyword
- Sets page priority through query context
- Looks forwards and backwards, “hub and authority”
- Sometimes too broad, e.g. Fallingwater