Clustering vs. Classification

Traditional Clustering
- Goal is to identify similar groups of objects
- Groups (clusters, new classes) are discovered
- Dataset consists of attributes
- Unsupervised (class label has to be learned)
- Important: similarity assessment, which derives a "distance function", is critical, because clusters are discovered based on distances/density

Classification
- Pre-defined classes
- Datasets consist of attributes and class labels
- Supervised (class label is known)
- Goal is to predict classes from the object properties/attribute values
- Classifiers are learnt from sets of classified examples
- Important: classifiers need to have a high accuracy
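
The contrast above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the points, the labels "small" and "large", the Euclidean distance, and the helper names `classify_1nn` and `cluster_kmeans` are all made up for this example and do not come from the slides. The classifier uses the known labels; the clustering routine sees only the attributes and a distance function.

```python
import math

def distance(a, b):
    """Euclidean distance (an assumed choice of distance function)."""
    return math.dist(a, b)

def classify_1nn(labeled_examples, point):
    """Supervised: predict the label of the closest classified example."""
    nearest = min(labeled_examples, key=lambda ex: distance(ex[0], point))
    return nearest[1]

def cluster_kmeans(points, k, iterations=10):
    """Unsupervised: discover k groups from distances alone (no labels)."""
    centroids = list(points[:k])  # naive initialisation: first k points
    assignment = []
    for _ in range(iterations):
        # Assign every point to its closest centroid.
        assignment = [
            min(range(k), key=lambda c: distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment

# Hypothetical data: two attributes per object.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.2, 0.8), (8.5, 9.0)]
labels = ["small", "large", "small", "large", "small", "large"]

# Classification: labels are known and used for prediction.
predicted = classify_1nn(list(zip(points, labels)), (2.0, 1.5))

# Clustering: labels are never used; group ids are discovered.
groups = cluster_kmeans(points, k=2)
```

Note that the cluster ids returned by `cluster_kmeans` are arbitrary numbers; unlike the classifier's output, they carry no predefined meaning until someone interprets the discovered groups.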
An Association Rule Question

… have a question about the last class, about the example of {Milk} --> {Diaper}. How are these associations obtained from the data set that we currently have? I mean, how do we come to know that "If a person buys Milk (one thing), he is bound to buy Diaper (another thing)"? The dataset just consists of groups of things that a person buys. It never gives us the information whether "Buying of Milk depended on Buying of Diaper" or "Buying of Diaper depended on Buying of Milk". There could be other ways to obtain these associations. So basically, what I meant to ask is: how is it inferred? From the dataset, or are other methods used for it?

Thanks,
Shraddha Khaire
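
One common way to approach this question, assuming the standard support and confidence measures for association rules, is to count co-occurrences directly in the transactions. The sketch below uses a made-up set of five baskets (not data from the class) and hypothetical helper names `support` and `confidence`. A rule {Milk} --> {Diaper} then only says how often baskets containing Milk also contain Diaper; it does not claim that one purchase caused the other.

```python
# Hypothetical transactions: each set is one customer's basket.
transactions = [
    {"Milk", "Diaper"},
    {"Milk", "Diaper", "Beer"},
    {"Diaper", "Beer"},
    {"Diaper", "Bread"},
    {"Milk", "Diaper", "Bread"},
]

def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for basket in transactions if itemset <= basket)

def support(itemset):
    """Fraction of all transactions that contain itemset."""
    return count(itemset) / len(transactions)

def confidence(lhs, rhs):
    """Among transactions containing lhs, the fraction also containing rhs."""
    return count(lhs | rhs) / count(lhs)

milk_to_diaper = confidence({"Milk"}, {"Diaper"})   # 3 of 3 Milk baskets
diaper_to_milk = confidence({"Diaper"}, {"Milk"})   # 3 of 5 Diaper baskets
both = support({"Milk", "Diaper"})                  # 3 of 5 baskets
```

In this toy data the two directions give different confidence values, which is where the arrow in a rule comes from: it reflects a conditional frequency measured in the dataset, not a dependency recorded in it.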