Huge Raw Data Cleaning Data Condensation Dimensionality Reduction Data Wrapping/ Description Machine Learning Classification Clustering Rule Generation Knowledge Interpretation Knowledge Extraction Knowledge Evaluation Useful Knowledge Preprocessed Data Mathe- matical Model of Data (Patterns) Data Mining (DM) Knowledge Discovery in Database (KDD) Pattern Recognition, World Scientific, 2001
Data Mining Algorithm Components Model : Function of the model (e.g., classification, clustering, rule generation) and its representational form (e.g., linear discriminates, neural networks, fuzzy logic, GAs, rough sets). Preference criterion : Basis for preference of one model or set of parameters over another. Search algorithm : Specification of an algorithm for finding particular patterns of interest (or models and parameters), given the data, family of models, and preference criterion.
Why Growth of Interest ? Falling cost of large storage devices and increasing ease of collecting data over networks Availability of Robust/Efficient machine learning algorithms to process data. Falling cost of computational power enabling use of computationally intensive methods for data analysis.
Soft Computing Computational Theory of Perception For human like decision making From: uncertainty, approximate reasoning and partial truth To: tractability, robustness, low solution cost, and close resemblance To find an approximate solution to an imprecisely /precisely formulated problem. ‘soft computing rather than hard computing’ as the foundation for Artificial Intelligence.
Computational Theory of Perceptions Provides capability to compute and reason with perception based information Humans have remarkable capability to perform a wide variety of physical and mental tasks without any measurement and computations They use perceptions of time, direction, speed, shape, possibility, likelihood, truth, and other attributes of physical and mental objects
Soft Computing A collection of methodologies Fuzzy Logic : the algorithms for dealing with imprecision and uncertainty Neural Networks: the machinery for learning and curve fitting Genetic Algorithms : the algorithms for search and optimization Rough Sets : handling uncertainty arising from the granularity in the domain of discourse They are Complementary rather than Competitive
Perceptions are fuzzy (F) – granular Boundaries of perceived classes are unsharp Values of attributes are granulated – a clump of indistinguishable points/objects Example: Granules in age: very young, young, not so old,… Granules in direction: slightly left, sharp right F-granularity of perceptions puts them well beyond the reach of traditional methods of analysis (based on predicate logic and probability theory) Is location A in the forest? Defined by membership function u – Certainly yes: u (A) = 1 – Certainly not: u (A) = 0 – It dependence on a subjective (vague) opinion: u (A) = 0.6
Role of Fuzzy Sets Modeling of imprecise/qualitative knowledge Transmission and handling uncertainties at various stages Supporting, to an extent, human type reasoning in natural form
Role of Neural Networks Machinery for learning and curve fitting (Learns from examples) Resistance to Noise Tolerance to Distorted Patterns /Images (Ability to Generalize Superior Ability to Recognize Overlapping Pattern Classes or Classes with Highly Nonlinear Boundaries or Partially Occluded or Degraded Images
Role of Genetic Algorithms Many tasks involved in analyzing/identifying a pattern need Appropriate Parameter Selection and Efficient Search in complex spaces to obtain Optimal Solutions Used more in Prediction (P) than Description(D) – D : Finding human interpretable patterns describing the data – P : Using some variables or attributes in the database to predict unknown/ future values of other variables of interest
Integrated approaches Fuzzy Logic + NN NN + GA Fuzzy Logic + NN + GA Fuzzy Logic + NN + GA + Rough Set Neuro-fuzzy hybridization is the most visible integration realized so far. – Fuzzy Set theoretic models try to mimic human reasoning and the capability of handling uncertainty – Neural Network models attempt to emulate architecture and information representation scheme of human brain
Rough Sets Offer mathematical tools to discover hidden patterns in data Fundamental principle of a rough set-based learning system is to discover redundancies and dependencies between the given features of a data to be classified Approximate a given concept both from below and from above, using lower and upper approximations Rough set learning algorithms can be used to obtain rules in IF-THEN form from a decision table Extract Knowledge from data base – decision table (objects and attributes) remove undesirable attributes (knowledge discovery) analyze data dependency minimum subset of attributes (reducts)
Approximations of the set B-lower: BX = B-upper: BX = If BX = BX, X is B-exact or B-definable Otherwise it is Roughly definable Granules definitely belonging to X w.r.t feature subset B Granules definitely and possibly belonging to X Accuracy of rough set
Rough Sets Uncertainty Handling – Using lower & upper approximations Granular Computing – Using information granules – Computation is performed using information granules and not the data points (objects) low mediumhigh low medium high F1F1 F2F2 Rule Rule provides crude description of the class using granule
Issues in the Decision Table The same or indiscernible objects may be represented se veral times. (redundant) That is, their removal cannot worsen the classification. Keep only those attributes that preserve the indiscernibili ty relation and, consequently, set approximation There are usually several such subsets of attributes and t hose which are minimal are called reducts
Rough Set Rule Generation Decision Table: Object F 1 F 2 F 3 F 4 F 5 Decision x Class 1 x Class 1 x Class 1 x Class 2 x Class 2 Discernibility Matrix (c) for Class 1:Objects x1 x1 x1 x1 x2x2x2x2 x3x3x3x3 x1x1x1x1 F 1, F 3 F 2, F 4 x2x2x2x2 F 1,F 2,F 3,F 4 x3x3x3x3
Discernibility function: Discernibility function considering the object x 1 belonging to Class 1 = Discernibility of x 1 w.r.t x 2 (and) Discernibility of x 1 w.r.t x 3 = Similarly, Discernibility function considering object Dependency Rules (AND-OR form): DNF of discernibility functions
Summary Fuzzy sets provide efficient granulation of feature space (F -granulation) Neural networks are suitable in data-rich environments and are typically used for extracting embedded knowledge in the form of rules. Genetic algorithms provide efficient search algorithms to select a model based on preference criterion function. Rough sets used for generating information granules. They are Complementary for KDD