
1 A Clustered Particle Swarm Algorithm for Retrieving all the Local Minima of a function
C. Voglis & I. E. Lagaris
Computer Science Department, University of Ioannina, GREECE

2 Presentation Outline
Global Optimization Problem
Particle Swarm Optimization
Modifying Particle Swarm to form clusters
Clustering Approach
Modifying the affinity matrix
Putting the pieces together
Determining the number of minima
Identification of the clusters
Preliminary results – Future research

3 Global Optimization
The goal is to find the global minimum inside a bounded domain:
$$x^* = \arg\min_{x \in S} f(x), \qquad S \subset \mathbb{R}^n$$
One way to do that is to find all the local minima and choose among them the global one (or ones).
Popular methods of that kind are Multistart, MLSL, TMLSL*, etc.
* M. Ali

4 Particle Swarm Optimization
 Developed in 1995 by James Kennedy and Russ Eberhart.
 It was inspired by the social behavior of bird flocking and fish schooling.
 PSO applies the concept of social interaction to problem solving.
 Finds a global optimum.

5 PSO - Description
The method allows the motion of particles to explore the space of interest.
Each particle updates its position in discrete unit time steps.
The velocity is updated by a linear combination of two terms:
the first along the direction pointing to the best position discovered by the particle,
the second towards the overall best position.

6 PSO - Relations
$$v_{k+1}^{(i)} = \chi\left[\, v_k^{(i)} + c_1 r_1 \big(p^{(i)} - x_k^{(i)}\big) + c_2 r_2 \big(p^{(g)} - x_k^{(i)}\big) \right], \qquad x_{k+1}^{(i)} = x_k^{(i)} + v_{k+1}^{(i)}$$
The first velocity term pulls towards the particle's best position, the second towards the swarm's best position, where:
$x_k^{(i)}$ is the position of the i-th particle at step k,
$v_k^{(i)}$ is its velocity,
$p^{(i)}$ is the best position visited by the i-th particle,
$p^{(g)}$ is the overall best position ever visited,
$r_1, r_2$ are uniform random numbers in $[0, 1]$, and
$\chi$ is the constriction factor.
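To make the update concrete, here is a minimal NumPy sketch of one step of the relations above, assuming the standard Clerc-Kennedy constriction form; the values of chi, c1, c2 are common illustrative defaults, not values taken from the slides.

```python
import numpy as np

def pso_step(x, v, p_best, g_best, chi=0.729, c1=2.05, c2=2.05, rng=None):
    """One constriction-factor PSO step for the whole swarm.

    x, v   : (n_particles, dim) positions and velocities at step k
    p_best : (n_particles, dim) best position visited by each particle
    g_best : (dim,) overall best position ever visited by the swarm
    """
    rng = rng or np.random.default_rng()
    r1 = rng.random(x.shape)   # independent uniform factors in [0, 1)
    r2 = rng.random(x.shape)
    # Cognitive term pulls towards the particle's best position,
    # social term pulls towards the swarm's best position.
    v_new = chi * (v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x))
    x_new = x + v_new          # discrete unit time step
    return x_new, v_new
```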

7 PS + Clustering Optimization
If the global component is weakened, the swarm is expected to form clusters around the minima.
If a bias is added towards the steepest-descent direction, this will be accelerated.
Locating the minima may then be tackled, to a large extent, as a Clustering Problem (CP).
However, it is not a regular CP, since it can benefit from information supplied by the objective function.

8 Modified PSO
The global component is set to zero.
A component pointing towards the steepest-descent direction* is added to accelerate the process.
So the swarm motion is described by:
$$v_{k+1}^{(i)} = \chi\left[\, v_k^{(i)} + c_1 r_1 \big(p^{(i)} - x_k^{(i)}\big) - c_3 r_3 \,\nabla f\big(x_k^{(i)}\big) \right], \qquad x_{k+1}^{(i)} = x_k^{(i)} + v_{k+1}^{(i)}$$
* A. Ismael F. Vaz, M.G.P. Fernandes
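A sketch of the modified step under the assumptions above: the social (g_best) term is dropped and a steepest-descent term is added. The weight c3 and the random factor r3 on the gradient term are illustrative assumptions, not values from the slides.

```python
import numpy as np

def modified_pso_step(x, v, p_best, grad_f, chi=0.729, c1=2.05, c3=0.5, rng=None):
    """Modified PSO step: no global component, plus a steepest-descent bias,
    so particles drift towards nearby minima and the swarm forms clusters.

    grad_f : callable mapping (n_particles, dim) positions to gradients.
    """
    rng = rng or np.random.default_rng()
    r1 = rng.random(x.shape)
    r3 = rng.random(x.shape)
    g = grad_f(x)              # objective gradients at the current positions
    # Cognitive term as before; -g points along the steepest-descent direction.
    v_new = chi * (v + c1 * r1 * (p_best - x) - c3 * r3 * g)
    return x + v_new, v_new
```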

9 Modified PSO movie

10 Clustering
Clustering problem: "Partition a data set into M disjoint subsets containing points with one or more properties in common."
A commonly used property refers to topographical grouping based on distances.
Plethora of algorithms: k-means, hierarchical, single linkage, quantum and Newtonian clustering.

11 Global k-means
Minimize the clustering error:
$$E(m_1, \dots, m_k) = \sum_{i=1}^{N} \sum_{j=1}^{k} I(x_i \in C_j)\, \| x_i - m_j \|^2$$
It is an incremental procedure using the k-means algorithm repeatedly.
Independent of the initialization choice.
Has been successfully applied to many problems.
A. Likas
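A compact sketch of the incremental procedure (after Likas et al.): the new center for each k is seeded at every data point in turn and the lowest-error run is kept, which removes the dependence on random initialization. SciPy's kmeans2 is used for the inner k-means runs; the exhaustive seeding shown here is the exact (slow) variant of the method.

```python
import numpy as np
from scipy.cluster.vq import kmeans2  # inner k-means solver

def global_kmeans(X, k_max):
    """Incremental global k-means sketch: builds solutions for k = 1..k_max."""
    solutions = {1: X.mean(axis=0, keepdims=True)}   # optimal 1-means center
    for k in range(2, k_max + 1):
        best_err, best_centers = np.inf, None
        for x in X:                                  # try every point as the new seed
            seed = np.vstack([solutions[k - 1], x])
            centers, labels = kmeans2(X, seed, minit='matrix')
            err = ((X - centers[labels]) ** 2).sum() # clustering error E
            if err < best_err:
                best_err, best_centers = err, centers
        solutions[k] = best_centers
    return solutions
```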

12 Global K-Means movie

13 Spectral Clustering
Algorithms that cluster points using eigenvectors of matrices derived from the data.
Obtain a data representation in a low-dimensional space that can be easily clustered.
Variety of methods that use the eigenvectors differently.
Useful information can be extracted from the eigenvalues.

14 The Affinity Matrix
This symmetric matrix is of key importance. Each off-diagonal element is given by:
$$A_{ij} = \exp\left( -\frac{\| x_i - x_j \|^2}{2\sigma^2} \right), \quad i \ne j, \qquad A_{ii} = 0$$

15 The Affinity Matrix
Let $D$ be the diagonal matrix with $D_{ii} = \sum_{j} A_{ij}$, and let $M = D^{-1/2} A D^{-1/2}$.
The matrix $M$ is diagonalized; let $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_N$ be its eigenvalues, sorted in descending order.
The gap $\lambda_k - \lambda_{k+1}$ which is biggest identifies the number of clusters, k.

16 Simple example
Subset of the CISI/Medline dataset.
Two clusters: IR abstracts, medical abstracts.
650 documents, 3366 terms after pre-processing.
Spectral embedded space constructed from the two largest eigenvectors:

17 Largest eigengap
How to select k?
Eigengap: the difference between two consecutive eigenvalues, $\delta_k = \lambda_k - \lambda_{k+1}$.
The most stable clustering is generally given by the value of k that maximises this eigengap.
[Figure: plot of the sorted eigenvalues λ1, λ2, ...]
 Choose k = 2
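The whole estimation step fits in a few lines of NumPy. A sketch, assuming the Gaussian affinity and normalization defined on slides 14-15; sigma and k_max are illustrative parameters.

```python
import numpy as np

def estimate_k(X, sigma=1.0, k_max=50):
    """Estimate the number of clusters via the largest eigengap of
    M = D^{-1/2} A D^{-1/2}."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
    A = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)                             # A_ii = 0
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    M = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]    # D^-1/2 A D^-1/2
    lam = np.sort(np.linalg.eigvalsh(M))[::-1]           # eigenvalues, descending
    k_max = min(k_max, lam.size - 1)
    gaps = lam[:k_max] - lam[1:k_max + 1]                # lambda_k - lambda_{k+1}
    return int(np.argmax(gaps)) + 1                      # k with the largest gap
```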

18 Putting the pieces together
1. Apply the modified particle swarm to form clusters around the minima.
2. Construct the affinity matrix A and compute the eigenvalues of M:
   A. use only distance information, or
   B. add gradient information.
3. Find the largest eigengap and identify k.
4. Perform global k-means using the determined k:
   A. use pairwise distances and centroids, or
   B. use the affinity matrix and medoids (with gradient info).
A sketch of an end-to-end driver for these four steps follows below.
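As a rough illustration only, a hypothetical driver chaining the sketches from the earlier slides along path 2A/4A (distance information, centroids); modified_pso_step, estimate_k, and global_kmeans are the functions sketched above, f and grad_f are assumed vectorized over rows, and all parameter values are placeholders.

```python
import numpy as np

def find_local_minima(f, grad_f, bounds, n_particles=200, n_steps=100):
    """Hypothetical end-to-end driver for the four steps above (path 2A/4A)."""
    rng = np.random.default_rng(0)
    lo, hi = np.asarray(bounds, dtype=float).T
    x = lo + rng.random((n_particles, lo.size)) * (hi - lo)  # uniform start
    v = np.zeros_like(x)
    p_best = x.copy()
    for _ in range(n_steps):                    # step 1: swarm forms clusters
        x, v = modified_pso_step(x, v, p_best, grad_f, rng=rng)
        x = np.clip(x, lo, hi)                  # keep particles in the box
        better = f(x) < f(p_best)               # update personal bests
        p_best[better] = x[better]
    k = estimate_k(x)                           # steps 2-3: eigengap gives k
    return global_kmeans(x, k)[k]               # step 4: k cluster centers
```

For the Rastrigin example on slide 22 one would call something like find_local_minima(rastrigin, rastrigin_grad, [(-5.12, 5.12)] * 2), with both callables (hypothetical names) vectorized over particle rows.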

19 Adding information to the Affinity matrix
Use the gradient vectors to zero out pairwise affinities.
New formula: the Gaussian affinity is kept, but $A_{ij}$ is set to zero whenever
$$\big(x_i - x_j\big)^{T} \big( \nabla f(x_i) - \nabla f(x_j) \big) < 0,$$
i.e. whenever the squared distance between the two particles would grow under the gradient flow $\dot{x} = -\nabla f(x)$.
Do not associate particles that would become more distant if they followed the negative gradient.
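A sketch of the filtered affinity under the reading above; the separation test is the sign of the derivative of the pairwise squared distance under gradient flow, which is one plausible formalization of the slide's rule rather than necessarily the authors' exact formula.

```python
import numpy as np

def gradient_filtered_affinity(X, G, sigma=1.0):
    """Gaussian affinity with gradient filtering: A_ij is zeroed when
    particles i and j would move apart along their negative gradients.

    X : (n, dim) particle positions;  G : (n, dim) gradients at X.
    """
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    A = np.exp(-d2 / (2.0 * sigma ** 2))
    dx = X[:, None, :] - X[None, :, :]      # x_i - x_j
    dg = G[:, None, :] - G[None, :, :]      # grad f(x_i) - grad f(x_j)
    moving_apart = (dx * dg).sum(-1) < 0.0  # squared distance grows under -grad
    A[moving_apart] = 0.0
    np.fill_diagonal(A, 0.0)
    return A
```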

20 Adding information to the Affinity matrix
Black arrow: gradient of particle i.
Green arrows: gradients of particles j with non-zero affinity to i.
Red arrows: gradients of particles j with zero affinity to i.

21 From global k-means to global k-medoids
Original global k-means (shown as the starting point for the k-medoids variant).
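The slide's formulas did not survive the transcript, but the medoid update that replaces the centroid step can be sketched. This version is driven directly by the (gradient-filtered) affinity matrix, so cluster representatives are actual particles and zeroed affinities are never bridged; the incremental "global" seeding would wrap around it exactly as in global k-means, and the random initialization here is for brevity only.

```python
import numpy as np

def k_medoids_from_affinity(A, k, n_iter=50, rng=None):
    """Minimal k-medoids sketch on an affinity matrix (higher = more similar)."""
    A = A.copy()
    np.fill_diagonal(A, 1.0)                 # self-affinity is maximal
    rng = rng or np.random.default_rng(0)
    medoids = rng.choice(A.shape[0], size=k, replace=False)
    for _ in range(n_iter):
        labels = A[:, medoids].argmax(axis=1)        # most-affine medoid wins
        new = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if members.size:                          # new medoid: max total affinity
                within = A[np.ix_(members, members)].sum(axis=1)
                new[j] = members[within.argmax()]
        if np.array_equal(new, medoids):
            break
        medoids = new
    return medoids, labels
```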

22 Rastrigin function (49 minima)
[Figure panels: swarm after the modified particle swarm; gradient information]

23 Rastrigin function
[Figure panels: estimation of k using distance; estimation of k using gradient info]

24 Rastrigin function
[Figure: global k-means result]

25 Rastrigin function
[Figure: global k-medoids result]

26 Shubert function (100 minima)
[Figure panels: swarm after the modified particle swarm; gradient information]

27 Shubert function
[Figure panels: estimation of k using distance; estimation of k using gradient info]

28 Shubert function
[Figure: global k-means result]

29 Shubert function
[Figure: global k-medoids result]

30 Ackley function (25 minima)
[Figure panels: swarm after the modified particle swarm; gradient information]

31 Ackley function
[Figure panels: estimation of k using distance; estimation of k using gradient info]

32 Ackley function
[Figure: global k-means result]

33 Ackley function
[Figure: global k-medoids result]

