Chapter 4 - Case Study Clustering
Introduction to Data Mining with Case Studies
Author: G. K. Gupta
Prentice Hall India, 2006
Case Study Clustering Documents
This case study comes from a book titled Data Mining for Scientific and Engineering Applications, published in 2001. I am particularly grateful to Springer-Verlag for permission to include this very good article in the book. The case study deals with efficiently clustering a very large collection of more than 100,000 documents. A preprocessing algorithm is presented to extract the required information, followed by the use of the K-Means algorithm to cluster 113,736 documents.
12/7/2018 ©GKGupta
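The two stages named above, preprocessing followed by K-Means, can be illustrated with a minimal sketch. It is not the case study's own method: the preprocessing algorithm is not described here, so the tokenizer, the unit-length term-frequency vectors, the four toy documents, and the hand-picked initial centroids are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(docs):
    """Assumed preprocessing step: unit-length term-frequency vectors."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    vectors = []
    for c in counts:
        v = [float(c[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vocab, vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, initial_centroids, max_iter=100):
    """Plain K-Means: assign each vector to its nearest centroid, then
    recompute each centroid as the mean of its members, until the
    assignments stop changing."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # converged: no document changed cluster
        labels = new_labels
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Toy corpus (hypothetical, not taken from the case study).
docs = [
    "stars and galaxies in the night sky",
    "galaxies contain billions of stars",
    "protein folding in the living cell",
    "the cell membrane and protein transport",
]
vocab, vectors = vectorize(docs)
# Initial centroids are chosen by hand here to keep the run deterministic.
labels, centroids = kmeans(vectors, [vectors[0], vectors[2]])
print(labels)
```

On this toy corpus the two astronomy documents end up in one cluster and the two biology documents in the other. A collection of more than 100,000 documents would call for sparse vectors and a more careful choice of initial centroids, which this sketch does not attempt.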