Published by Wilfred McBride. Modified over 9 years ago.
GROUPER: A DYNAMIC CLUSTERING INTERFACE TO WEB SEARCH RESULTS
Erdem Sarıgil - 21000089
Oğuz Yılmaz - 21000082
Grouper
An interface to the results of HuskySearch
Dynamically groups the search results into clusters using the Suffix Tree Clustering (STC) algorithm
The goal: make search engine results easy to browse by clustering them
Grouper receives hits from different search engines and looks only at the top hits from each one
Post-retrieval Clustering
Based on the returned document set
Produces better results than pre-retrieval clustering
Some key requirements:
  Coherent Clusters
  Efficiently Browsable
  Speed
    Algorithmic Speed
    Snippet-Tolerance
Suffix Tree Clustering (STC)
Linear-time clustering algorithm
STC has three logical steps:
  Document cleaning
  Identifying base clusters using a suffix tree
  Merging these base clusters into clusters
STC has several novel characteristics:
  Overlapping clusters
  Phrase-based rather than bag-of-words
  Well suited for Web document clustering
  Robust in “noisy” situations
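The three STC steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: a dictionary of word n-grams stands in for the suffix tree (so it is not linear time), the cleaning step only lowercases and tokenizes, and the names `clean`, `base_clusters`, `merge_clusters`, `stc`, `max_len`, `min_docs` and the 0.5 overlap threshold are assumptions chosen for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations


def clean(text):
    """Step 1: lowercase and split into word tokens (a fuller cleaner would also stem)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def base_clusters(docs, max_len=4, min_docs=2):
    """Step 2: map each phrase (word n-gram) to the set of documents containing it.

    Stand-in for the suffix tree: enumerating n-grams up to max_len gives the
    same kind of phrase -> documents mapping, without the linear-time bound.
    """
    phrase_docs = defaultdict(set)
    for doc_id, text in enumerate(docs):
        words = clean(text)
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                phrase_docs[tuple(words[i:i + n])].add(doc_id)
    # A base cluster is a phrase shared by at least min_docs documents.
    return {p: d for p, d in phrase_docs.items() if len(d) >= min_docs}


def merge_clusters(base, threshold=0.5):
    """Step 3: link base clusters with high mutual overlap; return connected components."""
    phrases = list(base)
    parent = {p: p for p in phrases}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in combinations(phrases, 2):
        shared = len(base[a] & base[b])
        # Assumed rule: both clusters must share more than `threshold` of their documents.
        if shared / len(base[a]) > threshold and shared / len(base[b]) > threshold:
            parent[find(a)] = find(b)

    merged = defaultdict(lambda: {"phrases": [], "docs": set()})
    for p in phrases:
        root = find(p)
        merged[root]["phrases"].append(" ".join(p))
        merged[root]["docs"] |= base[p]
    return list(merged.values())


def stc(docs):
    """Run the three steps on a list of snippet strings."""
    return merge_clusters(base_clusters(docs))


# Hypothetical snippets, not taken from the slides.
snippets = [
    "suffix tree clustering",
    "suffix tree construction",
    "web search engine",
    "meta search engine",
]
clusters = stc(snippets)
```

With these four snippets the sketch produces two clusters: one holding documents 0 and 1 (phrases such as "suffix tree") and one holding documents 2 and 3 (phrases such as "search engine"). Because a document can appear in several base clusters, the merged clusters may overlap on other inputs.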
User Interface
User Interface (cont’d)
Making the Clusters Easy to Browse
Three heuristics to identify redundant phrases:
1. Word Overlap
2. Sub- and Super-Strings
3. Most General Phrase with Low Coverage
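The first two heuristics can be sketched in Python as follows. This is one plausible reading of the heuristic names, not the Grouper code: the function names, the `coverage` mapping (phrase to the fraction of the cluster's documents that contain it), and the 0.6 overlap threshold are assumptions for illustration. The third heuristic is left out because the slide gives no detail on how low coverage is measured.

```python
def drop_word_overlap(coverage, max_overlap=0.6):
    """Heuristic 1 (assumed form): hide a phrase when most of its words
    already appear in another phrase that has higher coverage."""
    kept = []
    for phrase, cov in coverage.items():
        words = set(phrase.split())
        if not words:
            continue  # skip empty phrases
        redundant = any(
            other != phrase
            and other_cov > cov
            and len(words & set(other.split())) / len(words) > max_overlap
            for other, other_cov in coverage.items()
        )
        if not redundant:
            kept.append(phrase)
    return kept


def is_subphrase(short, long):
    """True if the words of `short` occur contiguously, in order, inside `long`."""
    s, l = short.split(), long.split()
    return len(s) < len(l) and any(
        l[i:i + len(s)] == s for i in range(len(l) - len(s) + 1)
    )


def keep_extremes(phrases):
    """Heuristic 2 (assumed form): keep only phrases that are most general
    (no sub-phrase in the list) or most specific (no super-phrase in the list)."""
    kept = []
    for p in phrases:
        has_sub = any(is_subphrase(q, p) for q in phrases)
        has_super = any(is_subphrase(p, q) for q in phrases)
        if not (has_sub and has_super):
            kept.append(p)
    return kept


# Hypothetical coverage values for three phrases of one cluster.
shown = drop_word_overlap(
    {"search engine": 0.8, "engine search results": 0.5, "suffix tree": 0.4}
)
extremes = keep_extremes(
    ["vice president", "the vice president of", "vice president of"]
)
```

In this example `shown` keeps "search engine" and "suffix tree", and `extremes` keeps "vice president" (most general) and "the vice president of" (most specific) while dropping the in-between "vice president of".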
Speed
[Figure: plot labels “Quality”, “Search Time”, “Time”, “OR”; example phrases “the vice president of” and “vice president”]
Coherent Clusters
Comparison
Number of documents followed
Time spent
Click distance
Comparison (cont’d)