Unsupervised Problem Decomposition Using Genetic Programming
Ahmed Kattan, Alexandros Agapitos, and Riccardo Poli

A.I. Esparcia-Alcazar et al. (Eds.): EuroGP 2010, LNCS 6021, pp. 122–133. © Springer-Verlag Berlin Heidelberg 2010

Introduction
Problem decomposition aims to simplify complex real-world problems in order to cope with them better; humans regularly use this strategy when solving problems. One way to formalise the decomposition process is to assume that the problem space contains different patterns, each with particular characteristics that call for a specialised solution.

Why is it useful?
Problem decomposition is important for two reasons: it reduces the complexity of a problem, and automated problem decomposition may help researchers to better understand a problem domain by discovering regularities in the problem space.

The Problem
It is not difficult to split a problem into several sub-problems to be solved in cooperation by different methods. However, using the wrong decomposition may actually increase the problem's complexity.

The ideal problem decomposition system
An ideal problem decomposition system would take the data from the user and identify different groups in the data; each of these groups should be simpler to solve than the original problem.

Some of the previous works
Rosca, J., Johnson, M.P., Maes, P.: Evolutionary Speciation for Problem Decomposition (1996). Proposed a system called Evolutionary Speciation Genetic Programming (ESGP) to automatically discover natural decompositions of problems. Each individual consisted of two parts: a condition tree and an output tree.

Some of the previous works
Some of the fitness cases may be claimed by more than one individual, while others are never chosen. A fitness function was proposed that encourages individuals to fully cover the problem space and minimise the overlap among the claimed fitness cases. Experimentation: symbolic regression problems, solved with standard GP and with GP(IF). Results: GP(IF) evolved conditions in such a way as to effectively assign different fitness cases to different pieces of code and, moreover, GP(IF) outperformed ESGP.

Other works
Iba, H.: Bagging, boosting, and bloating in genetic programming. In: Proceedings of the Genetic and Evolutionary Computation Conference, Orlando, Florida, USA, July 13-17, vol. 2, pp. 1053–1060. Morgan Kaufmann, San Francisco (1999). Proposed to extend GP using two well-known re-sampling techniques, Bagging and Boosting, and presented two systems referred to as BagGP and BoostGP.

Other works
Jackson, D.: The performance of a selection architecture for genetic programming. In: O'Neill, M., Vanneschi, L., Gustafson, S., Esparcia-Alcazar, A.I., De Falco, I., Della Cioppa, A., Tarantino, E. (eds.) EuroGP 2008. LNCS, vol. 4971, pp. 170–181. Springer, Heidelberg (2008). Proposed a hierarchical GP architecture for problem decomposition based on partitioning the input test cases into subsets.

As one can see, none of the previous methods for problem decomposition via test-case subdivision is fully automated. Overcoming this limitation is one of the aims of the work presented in this paper.

In this Research
An intelligent decomposition of problems requires understanding the problem domain and usually can only be carried out by experts. In this paper, we propose a GP system that evolves programs which automatically decompose a problem into a collection of simpler and smaller sub-problems while simultaneously solving those sub-problems. This is an area of GP that has not been thoroughly explored thus far.

The Approach
Four stages: Training, Resampling, Solving, Testing.

Training
The system starts by randomly initialising a population of individuals using the ramped half-and-half method. Each individual is composed of two trees: projector x and projector y.

Resampling
The two projector trees map each training case onto an (X, Y) position in a two-dimensional space. [Figure: projector x and projector y applied to the training cases.]
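The resampling step can be sketched as follows. This is a minimal illustration, not the paper's implementation: `project_cases` and the toy lambda projectors are hypothetical names standing in for the two evolved projector trees.

```python
# Hypothetical sketch of the projection step: each top-level GP individual
# carries two evolved expressions, projector_x and projector_y, that map a
# training case to a point in a 2D Euclidean space.

def project_cases(cases, projector_x, projector_y):
    """Map each training case to an (X, Y) point via the two projector trees."""
    return [(projector_x(c), projector_y(c)) for c in cases]

# Example with toy projectors standing in for evolved trees:
points = project_cases([1.0, 2.0, 3.0],
                       projector_x=lambda c: c * 2,
                       projector_y=lambda c: c - 1)
# points == [(2.0, 0.0), (4.0, 1.0), (6.0, 2.0)]
```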

Function set
[Table of the GP function set not reproduced in the transcript.]

[Figure: independent GP runs (GP run 1, GP run 2, GP run 3).]

Clustering the Training Examples
We used standard K-means (though any clustering method could be used). Why K-means? Simplicity of implementation and execution speed.

Clustering the Training Examples
K-means is a partitioning algorithm that normally requires the user to fix the number of clusters to be formed. In our case, the optimal number of subdivisions of the problem into sub-problems is unknown.

Find the number of clusters
To select the right number of clusters, the system analyses the density of the data. Since K-means is a very fast algorithm, the system repeatedly instructs it to divide the data set into k clusters, for k = 2, 3, ..., Kmax (Kmax = 10 in our implementation). After each call the system computes the clusters' quality. The k value that provided the best cluster quality is then used to split the training set and invoke GP runs on the corresponding clusters.
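The selection loop described above can be sketched as follows. Assumptions: `quality_for_k` is a hypothetical callable standing in for a K-means call plus quality scoring, and lower scores are taken to be better.

```python
def best_cluster_count(quality_for_k, k_max=10):
    """Return the k in 2..k_max whose clustering scores best.

    quality_for_k(k) is assumed to run K-means with k clusters on the
    projected points and return a quality score where lower is better
    (e.g. the DBI plus the representativeness penalty). The function
    name and signature are illustrative, not from the paper.
    """
    return min(range(2, k_max + 1), key=quality_for_k)

# Toy stand-in for the real quality function, minimised at k = 4:
chosen_k = best_cluster_count(lambda k: abs(k - 4))
# chosen_k == 4
```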

Clusters' Quality
The clusters' quality is measured by: 1. Separation. 2. Representativeness.

Separation
Measured with a modified Davies-Bouldin Index (DBI), based on the standard deviation of the distances between each cluster's centroid and its data members, and on the distances between all cluster centroids.
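A minimal sketch of a DBI-style score following that description; the paper's exact modification may differ, and all names here are illustrative.

```python
import math
from statistics import pstdev

def modified_dbi(clusters, centroids):
    """DBI-style separation score: s_i is the standard deviation of the
    distances from cluster i's members to its centroid, d_ij is the
    distance between centroids i and j. Lower values indicate tighter,
    better-separated clusters."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # Dispersion of each cluster around its centroid.
    s = [pstdev([dist(p, c) for p in cl])
         for cl, c in zip(clusters, centroids)]
    k = len(clusters)
    # For each cluster, take the worst (largest) similarity ratio, then average.
    return sum(
        max((s[i] + s[j]) / dist(centroids[i], centroids[j])
            for j in range(k) if j != i)
        for i in range(k)
    ) / k
```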

Representativeness
Checks whether the formed clusters are representative enough to classify unseen data. The system penalises values of k that lead K-means to form clusters with fewer than a minimum number of members. In this work, the minimum allowed number of members for each cluster was simply set to 10 samples (an admittedly arbitrary choice).

Clusters' Quality
For each K-means call with k = 2, ..., Kmax: quality = DBI index + representativeness penalty value.
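Combining the two measures gives a per-k score, sketched below. The penalty constant is an illustrative assumption (the slides do not give one); lower scores are taken to be better.

```python
def cluster_quality(clusters, dbi_value, min_members=10, penalty=1e6):
    """Quality score for one K-means call: the DBI separation index plus
    a penalty whenever some cluster has fewer than min_members training
    cases (10 in the paper). The penalty constant is illustrative."""
    undersized = any(len(c) < min_members for c in clusters)
    return dbi_value + (penalty if undersized else 0.0)
```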

Training Outputs
Projector tree x. Projector tree y. Cluster centroids. Best GP solution from each cluster (or sub-problem).

Testing
1. Unseen data are projected via the projector trees. 2. Then they are classified based on the closest centroid. 3. The input data are passed to the evolved program associated with the corresponding cluster.
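The three testing steps can be sketched in one function. This is a hedged illustration: `predict` and its parameters are hypothetical names, with simple callables standing in for the evolved trees and programs.

```python
import math

def predict(x, projector_x, projector_y, centroids, programs):
    """Testing-phase sketch: (1) project the unseen case into 2D via the
    two projector trees, (2) classify it by the closest cluster centroid,
    (3) apply the GP program evolved for that cluster."""
    p = (projector_x(x), projector_y(x))
    nearest = min(range(len(centroids)),
                  key=lambda i: math.hypot(p[0] - centroids[i][0],
                                           p[1] - centroids[i][1]))
    return programs[nearest](x)
```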

Fitness Evaluation
There are two levels of GP runs: a top-level GP run (whose individuals are the two projection trees) and inner GP runs (one for each cluster). We evaluate the top-level individuals by measuring how well the whole problem is solved: the fitness value aggregates the inner-GP fitness values over the number of clusters. This fitness function encourages the individuals to project the fitness cases in such a way that a solution for the fitness cases in each group can easily be found by the inner GP runs.
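One plausible reading of that aggregation is sketched below; the paper's exact formula may differ, so this is an assumption, not the authors' definition.

```python
def top_level_fitness(inner_best_fitnesses):
    """Sketch of the top-level fitness: aggregate the best fitness achieved
    by each inner GP run over the number of clusters, here taken as their
    average (a hedged reading of the slide). Lower is better when the
    inner fitness is an error measure."""
    return sum(inner_best_fitnesses) / len(inner_best_fitnesses)
```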

Experimental Setup
Experiments were conducted to evaluate the proposed framework. We chose a variety of symbolic regression problems that we felt were difficult enough to demonstrate the characteristics and benefits of the method. We used discontinuous functions as symbolic regression targets; these allow us to study the ability of the system to decompose a complex problem into simpler tasks.

Experimental Setup
Target functions taken from Rosca's experiments, with training intervals [-5, 5] and [-7, 7]. [Table of target functions not reproduced in the transcript.]

Experimental Setup
Independent runs for each system (20 runs for each test function). 100 samples were uniformly selected from the training interval, plus 400 test samples. Each evolved solution was evaluated with two different test sets: interpolation and extrapolation.

Results
Better than Rosca's results. [Results table not reproduced in the transcript.]

Number of Clusters
For functions 1, 2 and 5, the system decided to split the training samples into 4 to 10 clusters. For function 3, the system identified 6 to 10 clusters in the problem space. For function 4, the most complex in our test set, it identified 8 to 10 clusters.

Experimental results summary
[Summary table not reproduced in the transcript.]

Conclusions
In this paper we presented a new framework to automatically decompose difficult symbolic regression tasks into smaller and simpler ones. The proposed approach is based on the idea of first projecting the training cases onto a two-dimensional Euclidean space, and then clustering them via the K-means algorithm to better expose their similarities and differences. The number of clusters is selected through an iterative search over candidate values of k.

Conclusions
Note that while the projection and clustering steps may seem excessive for scalar domains, they make our problem decomposition technique applicable to much more complex domains. The major disadvantage of the system is the slow training process (measured in hours).

Future work
Extend the experimentation by testing the technique on multivariate problems and on problems other than symbolic regression. The system should be able to change the settings of each inner GP run according to the difficulty of the given sub-problem. We also intend to investigate how the identified number of clusters affects the solutions' accuracy.

Thank you for your attention!