Discussion Class 11: Cluster Analysis
Question 1: Use of Cluster Analysis
(a) Give an example of when clustering of terms might be useful in information retrieval.
(b) Give an example of when clustering of documents might be useful in information retrieval.
(c) Suppose that you have clustered a set of documents using a hierarchical method. Explain the use of:
    (i) top-down search
    (ii) bottom-up search
Question 2: SLINK
(a) What is the difference between a clustering method and a clustering algorithm?
(b) What method is implemented by the SLINK algorithm?
(c) What makes this a desirable algorithm?
(d) Can you describe the basic concepts of the algorithm?
Question 3: Similarity matrix
(a) What is a similarity measure?
(b) What is a similarity matrix?
(c) Suppose that you are clustering documents based on co-occurrence of citations. Suggest a similarity measure that you might use.
(d) Explain the ideas behind the inverted file algorithm for calculating a similarity matrix.
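As a starting point for parts (c) and (d), here is a minimal sketch of the inverted file idea, under stated assumptions: documents are compared by the cited works they share, and the Dice coefficient is used as the similarity measure. Both choices are illustrative only, and the function name `similarity_via_inverted_file` and the sample data are hypothetical. The point is that only document pairs appearing together in some posting list are ever touched, so pairs with nothing in common cost nothing.

```python
from collections import defaultdict
from itertools import combinations


def similarity_via_inverted_file(doc_citations):
    """Return {(doc_a, doc_b): Dice similarity} for pairs sharing a citation.

    doc_citations maps a document id to the set of works it cites
    (an assumed input format, chosen for illustration).
    """
    # Step 1: build the inverted file, cited work -> documents citing it.
    inverted = defaultdict(list)
    for doc, cites in doc_citations.items():
        for cited in cites:
            inverted[cited].append(doc)

    # Step 2: walk each posting list and count shared citations per pair.
    shared = defaultdict(int)
    for docs in inverted.values():
        for a, b in combinations(sorted(docs), 2):
            shared[(a, b)] += 1

    # Step 3: normalise the counts (Dice coefficient, an assumed choice).
    return {
        (a, b): 2 * n / (len(doc_citations[a]) + len(doc_citations[b]))
        for (a, b), n in shared.items()
    }


docs = {"d1": {"A", "B", "C"}, "d2": {"B", "C"}, "d3": {"D"}}
sims = similarity_via_inverted_file(docs)
print(sims)
```

Pairs that never co-occur in a posting list, such as d1 and d3 here, are simply absent from the result and can be treated as having similarity zero.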
Question 4: Hierarchical methods
(a) What do the single link and complete link methods have in common?
(b) Explain the concept of intercluster similarity.
(c) Explain the concept behind the single link method of cluster analysis. What are its advantages? What are its disadvantages?
(d) Explain the concept behind the complete link method of cluster analysis. What are its advantages? What are its disadvantages?
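To make parts (b) through (d) concrete, the sketch below computes the intercluster similarity of two clusters under each method, assuming a precomputed symmetric similarity matrix: single link takes the most similar pair across the two clusters, complete link the least similar pair. The matrix values and the function names are made up for illustration.

```python
def single_link_similarity(sim, cluster_a, cluster_b):
    """Similarity of the closest (most similar) cross-cluster pair."""
    return max(sim[i][j] for i in cluster_a for j in cluster_b)


def complete_link_similarity(sim, cluster_a, cluster_b):
    """Similarity of the farthest (least similar) cross-cluster pair."""
    return min(sim[i][j] for i in cluster_a for j in cluster_b)


# Hypothetical similarity matrix for four documents (symmetric, 1.0 on diagonal).
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.6, 0.3],
    [0.2, 0.6, 1.0, 0.8],
    [0.1, 0.3, 0.8, 1.0],
]

left, right = [0, 1], [2, 3]
single = single_link_similarity(sim, left, right)
complete = complete_link_similarity(sim, left, right)
print(single, complete)
```

With these values a single strong pair (documents 1 and 2) is enough to make the clusters look similar under single link, while complete link is governed by the weakest pair (documents 0 and 3).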
Question 5: Non-hierarchical methods
(a) Under what circumstances might non-hierarchical cluster analysis be valuable?
(b) Define single pass methods. What problems do you see with them?
(c) Define reallocation methods. What problems do you see with them?
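For part (b), the following is a rough sketch of one possible single pass method, not a definitive one. It assumes that documents are sets of terms, that similarity is the Jaccard coefficient, and that each cluster is represented by its first member; the threshold value is likewise arbitrary. Each document is seen exactly once and either joins the most similar existing cluster or starts a new one.

```python
def jaccard(a, b):
    """Jaccard coefficient of two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def single_pass(items, threshold):
    """Cluster items in one pass; returns lists of item indices.

    Each cluster is represented by its first member (an assumed,
    deliberately simple choice of representative).
    """
    clusters = []
    for i, item in enumerate(items):
        best, best_sim = None, threshold
        for cluster in clusters:
            s = jaccard(item, items[cluster[0]])
            if s >= best_sim:
                best, best_sim = cluster, s
        if best is None:
            clusters.append([i])   # nothing similar enough: new cluster
        else:
            best.append(i)         # join the most similar cluster
    return clusters


docs = [{"a", "b"}, {"a", "b", "c"}, {"x", "y"}, {"b", "c"}]
forward = single_pass(docs, 0.5)
backward = single_pass(docs[::-1], 0.5)
print(forward, backward)
```

Running the same documents in reverse order groups {"a", "b", "c"} with {"b", "c"} instead of with {"a", "b"}, which illustrates one commonly cited concern: the result depends on the order in which documents are processed.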
Question 6: Evaluation and Validation
(a) What is the objective of cluster evaluation?
(b) What is the objective of cluster validation?
(c) Discuss clustering tendency, overlap test, nearest neighbor test, and density test.
Question 7: Final question
Dubes and Jain ask the question: How does one determine whether the results of a clustering method truly characterize the data? What is your opinion?