Towards a Mapping of Modern AIS and Learning Classifier Systems Larry Bull Department of Computer Science & Creative Technologies University of the West of England, U.K.

Background For 25 years, correlations between aspects of Artificial Immune Systems (AIS) and Learning Classifier Systems (LCS) have been highlighted, yet neither field appears to have benefitted. More recently, an LCS has been presented for unsupervised learning which, with hindsight, may be viewed as a form of AIS. The purpose here is to bring this LCS to the attention of the AIS community, with the aim of serving as a catalyst for the sharing of ideas and mechanisms.

LCS in a Nutshell Invented by John Holland circa 1978. Consist of an "ecology" of rules of the form IF <condition> THEN <action>, with an associated reward estimate. Traditionally use reinforcement learning techniques to approximate rule utility. Use evolutionary computing techniques to discover new rules. Often incorporate other heuristics.

[Schematic: a reinforcement-learning LCS loop. The environment supplies a state; matching rules in the population [P] (e.g., 10#0:11) form the match set [M]; a prediction array (e.g., 0, 10, 2, 9) and action selection yield the action set [A]; environmental reward (e.g., -1) drives Q-learning-style updates, and an EA creates new rules.]

[Figure: Learning Classifier Systems Family Tree. Animat (Wilson '85); CS-1 (Holland & Reitman '78); LCS (Holland '80); Gofer (Booker '82); Boole (Wilson '87); New Boole (Bonelli et al. '90); CFCS2 (Riolo '90); ZCS (Wilson '94); XCS (Wilson '95); ACS (Stolzmann '98); XCSF (Wilson '00); ACS2 (Butz et al. '02); UCS (Bernadó-Mansilla & Garrell '03); XCSC (Tammee et al. '08). Branches span reinforcement, regression (& reinforcement), supervised, unsupervised, and model learning.]

From LCS to AIS A novel variant of XCS for data clustering has recently been presented. The approach exploits the mechanisms inherent to XCS, but for unsupervised learning. The aim is to learn rules which accurately describe clusters, without prior assumptions as to their number within a given dataset. With hindsight, the approach is a form of clonal selection AIS.

[YCSC schematic: data are presented to the population [P]; matching cluster-descriptor rules form the match set [M]; error updates adjust rule estimates and an EA searches for new descriptors.]

Rule Representation: Bounded Affinity A condition consists of d interval predicates: {(c_1, s_1), ..., (c_d, s_d)}, where c_i is the interval's range centre, drawn from [0.0, 1.0], s_i is the "spread" from that centre (truncated), and d is the number of dimensions. Each interval predicate's upper and lower bounds are calculated as [c_i - s_i, c_i + s_i].
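For concreteness, a minimal Python sketch of this representation (the class layout and names are illustrative assumptions, not the author's implementation; the numerosity and time-stamp fields anticipate later slides):

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Rule:
    condition: List[Tuple[float, float]]  # [(c_1, s_1), ..., (c_d, s_d)]
    error: float = 0.0       # running matching-error estimate (epsilon)
    niche: float = 1.0       # running match-set-size estimate (sigma)
    numerosity: int = 1      # copies of this rule in the population
    timestamp: int = 0       # cycle of the last GA in an [M] containing it

def matches(rule: Rule, x: List[float]) -> bool:
    """A rule matches x if every x_i lies within [c_i - s_i, c_i + s_i]."""
    return all(c - s <= xi <= c + s
               for (c, s), xi in zip(rule.condition, x))
```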

Fitness Each rule maintains a running estimate of matching error and niche size. Error ε is derived from the Euclidean distance between the input x and the centre vector c in the condition of each member of [M]: ε_j ← ε_j + β(||x − c_j|| − ε_j)

Niches Niche size estimates (σ) are based on match sets, i.e., the number of concurrently active rules: σ_j ← σ_j + β(|[M]| − σ_j) A time-triggered Genetic Algorithm is run in the match sets.
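Both running estimates are simple recency-weighted averages; a sketch continuing the Rule class above (the learning rate β and the update forms follow the slides, everything else is assumed):

```python
import math

def update_estimates(match_set, x, beta=0.2):
    """Per-rule Widrow-Hoff updates over the current match set [M]."""
    m = len(match_set)                        # |[M]|
    for rule in match_set:
        centres = [c for c, _ in rule.condition]
        dist = math.dist(x, centres)          # Euclidean ||x - c||
        rule.error += beta * (dist - rule.error)  # error update
        rule.niche += beta * (m - rule.niche)     # niche-size update
```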

Selection All rules maintain a time-stamp of the cycle when they were last in an [M] in which the GA was run. If θ_GA cycles or more have passed on average for all rules in the current [M], the GA is triggered. The GA uses roulette-wheel selection with a scalable function: Fitness = 1 / (ε^v + 1). Time-stamps are then reset for all members of [M].
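A sketch of the triggering test and roulette-wheel parent selection (illustrative only; assumes the Rule fields sketched earlier):

```python
import random

def fitness(rule, v=5):
    """Scalable fitness function: 1 / (error^v + 1)."""
    return 1.0 / (rule.error ** v + 1.0)

def maybe_run_ga(match_set, t, theta_ga=12):
    """Trigger the GA if, on average, theta_GA or more cycles have passed
    since the rules in [M] last saw a GA; reset time-stamps when it fires."""
    avg_last = sum(r.timestamp for r in match_set) / len(match_set)
    if t - avg_last >= theta_ga:
        parent = random.choices(match_set,
                                weights=[fitness(r) for r in match_set])[0]
        for r in match_set:
            r.timestamp = t
        return parent
    return None
```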

Search Offspring are produced via mutation (probability μ), where an allele is mutated by adding an amount + or - rand(m_0). Crossover (probability χ, two-point) can occur between any two alleles, i.e., within an interval predicate as well as between predicates. If no rules match on a given time step, a covering operator creates a rule with its condition centred on the input value and with spreads of range rand(s_0); the new rule replaces an existing member of the rulebase.
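A sketch of the mutation and covering operators. Reading rand(·) as a uniform draw is an assumption, and m0's default below is a placeholder, since the slide's value for m_0 was lost:

```python
import random

def mutate(condition, mu=0.04, m0=0.001):  # m0 value is a placeholder
    """Mutate each allele with probability mu by adding +/- rand(m0)."""
    return [(c + random.uniform(-m0, m0) if random.random() < mu else c,
             s + random.uniform(-m0, m0) if random.random() < mu else s)
            for c, s in condition]

def cover(x, s0=0.03):
    """Create a rule centred on the input, with spreads drawn from rand(s0)."""
    return Rule(condition=[(xi, random.uniform(0.0, s0)) for xi in x])
```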

Replacement Rule replacement is population-wide and proportional to niche occupancy: each rule maintains an estimate of the size of the match sets [M] in which it occurs, and roulette-wheel selection over these estimates chooses rules for deletion. This encourages all niches to contain the same number of rules; the rule resource is balanced.
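Deletion can then be sketched as a roulette wheel over the niche-size estimates, so rules in over-populated niches are the most likely to be removed:

```python
import random

def select_for_deletion(population):
    """Population-wide roulette-wheel deletion, weighted by each rule's
    match-set-size estimate; balances the rule resource across niches."""
    return random.choices(population,
                          weights=[r.niche for r in population])[0]
```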

Learning Process [Figure: fitness (0 to 1) plotted against degree of generalization (up to the maximum), showing the niche-size and 1/error components.]

Experiments Clustering is an important unsupervised classification technique in which a set of data are grouped into clusters, such that data in the same cluster are similar in some sense and data in different clusters are dissimilar in the same sense.

Some Data Randomly generated synthetic datasets were used. The first dataset is well-separated and has k = 25 true clusters arranged in a 5x5 grid in d = 2 dimensions. Each cluster is generated from 400 data points using a Gaussian distribution with a standard deviation of 0.02, for a total of n = 10,000 data points. The second dataset is not well-separated; it was generated in the same way as the first, except that the clusters are not centred on the centres of their given cells in the grid.
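The well-separated dataset is easy to reproduce; a sketch with NumPy (the displacement scheme for the less-separated variant is an assumption, as the slide does not give its magnitude):

```python
import numpy as np

def make_grid_data(k_side=5, per_cluster=400, std=0.02, offset=0.0, seed=0):
    """k_side x k_side Gaussian clusters on a grid in the unit square.
    offset > 0 displaces centres from their cells (less-separated case)."""
    rng = np.random.default_rng(seed)
    blocks = []
    for i in range(k_side):
        for j in range(k_side):
            centre = np.array([(i + 0.5) / k_side, (j + 0.5) / k_side])
            centre += rng.uniform(-offset, offset, size=2)
            blocks.append(rng.normal(centre, std, size=(per_cluster, 2)))
    return np.vstack(blocks)  # 10,000 x 2 points for the defaults
```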

Examples

Experimental Detail The parameters used were: N = 800, β = 0.2, v = 5, χ = 0.8, μ = 0.04, θ_GA = 12, s_0 = 0.03, m_0 = … All results presented are the average of ten runs. Learning trials consisted of 200,000 presentations of a randomly sampled data point.

Example Initial Results

Compaction Many overlapping rules are seen around each true cluster. A four-step rule compaction algorithm was developed to remove overlaps (see the sketch below): ◦ Delete useless rules (very low coverage) ◦ Sort on numerosity ◦ Sort on error ◦ Extract the largest [M] rules
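A sketch of one plausible reading of the four steps; the slide gives only the step names, so the coverage threshold and the greedy extraction used here are assumptions:

```python
def compact(population, data, min_coverage=5):  # threshold is assumed
    """1) drop rules matching very few data points; 2)-3) order by
    numerosity (desc) then error (asc); 4) greedily keep rules whose
    match sets add uncovered data ('extract largest [M] rules')."""
    cov = {id(r): sum(matches(r, x) for x in data) for r in population}
    rules = sorted((r for r in population if cov[id(r)] >= min_coverage),
                   key=lambda r: (-r.numerosity, r.error))
    kept, covered = [], set()
    for r in rules:
        new = {i for i, x in enumerate(data) if matches(r, x)} - covered
        if new:
            kept.append(r)
            covered |= new
    return kept
```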

Example Result after Compaction

Comparative Performance The total of the k-means objective function is used as the measure of quality for each clustering solution. On the well-separated dataset, the quality of the LCS was … +/- … and the number of clusters found was … +/- 0. The average quality on the not-well-separated dataset was … +/- … and the number of clusters found was … +/- 0. The k-means algorithm (k = 25), averaged over 10 runs, gives qualities of … +/- … and … +/- … on the well-separated and less-separated datasets respectively.

Comparative Performance II To estimate the number of clusters with k-means, each value of k from 2 to 30 was run 10 times with different random initializations. The Davies-Bouldin validity index was used to select the best clustering among the different numbers of clusters (lower values are better). The index reaches its minimum at 23 clusters on the well-separated dataset and at 14 clusters on the less-separated dataset; the LCS thus did better on the separated data, finding the true 25 clusters.
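This model-selection loop is straightforward to reproduce with scikit-learn (a sketch; the original experiments predate the library):

```python
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score

def pick_k(data, ks=range(2, 31), restarts=10):
    """For each k, keep the best of `restarts` k-means runs, then choose
    the k that minimises the Davies-Bouldin index (lower is better)."""
    scores = {}
    for k in ks:
        labels = KMeans(n_clusters=k, n_init=restarts).fit(data).labels_
        scores[k] = davies_bouldin_score(data, labels)
    return min(scores, key=scores.get)
```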

A Network-like Extension One of the missing parts of XCS is a niche fitness sharing mechanism. Here rules adjust their fitnesses based on the fitnesses of the other co-active rules. Termed relative accuracy (f'): f' = f / Σ_[M] f
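As a small extension of the earlier fitness sketch:

```python
def relative_accuracy(match_set, v=5):
    """Share fitness among co-active rules: f' = f / (sum of f over [M])."""
    fs = [fitness(r, v) for r in match_set]
    total = sum(fs)
    return [f / total for f in fs]
```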

Gives Improved Performance

Conclusions Similarities (and differences) between AIS and LCS have long been noted, with views taken from many different perspectives: dynamical systems, networks, complex adaptive systems, etc. An LCS recently presented as a clustering technique is essentially a clonal selection AIS. Can mechanisms from the two fields now be consolidated to mutual benefit?

Some Possibilities Theory and mechanisms for generalization. Adaptive rates of search. Theory from ensembles/mixtures-of-experts. Representation schemes. Memory. N.B. A new theory of neuronal replicators implies innate and adaptive components in learning.