Project Presentation Arpan Maheshwari Y7082,CSE Supervisor: Prof. Amitav Mukerjee Madan M Dabbeeru.


1 Project Presentation. Arpan Maheshwari, Y7082, CSE, arpanm@iitk.ac.in. Supervisor: Prof. Amitav Mukerjee, Madan M Dabbeeru

2

3 Clustering: Organising a collection of k-dimensional vectors into groups whose members share similar features in some way. The aim is to reduce a large amount of data by grouping it into a smaller set of similar items. Clustering is different from classification.

4 Elements of Clustering: Cluster: an ordered list of objects sharing some similarities. Distance between two clusters: implementation dependent, e.g. the Minkowski metric. Similarity: a function SIMILAR(D_i, D_j) with values from 0 (no agreement) to 1 (perfect agreement). Threshold: the lowest similarity value required to join two objects in one cluster.
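To make the distance, similarity and threshold elements concrete, here is a minimal sketch in Python. The Minkowski distance is standard; the mapping from distance to similarity (1 / (1 + d)) and the function names `similar` and `can_join` are assumptions for illustration only, since the slides do not define SIMILAR. With this mapping, 1 means identical vectors and 0 is only approached as the distance grows.

```python
def minkowski(x, y, p=2.0):
    """Minkowski distance of order p between two equal-length vectors.

    p=1 gives the Manhattan distance, p=2 the Euclidean distance.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def similar(x, y, p=2.0):
    """Hypothetical SIMILAR function: 1.0 for identical vectors, near 0 for distant ones."""
    return 1.0 / (1.0 + minkowski(x, y, p))


def can_join(x, y, threshold, p=2.0):
    """True if the similarity of x and y reaches the threshold."""
    return similar(x, y, p) >= threshold
```

For example, `can_join((0, 0), (3, 4), threshold=0.2)` is false because the Euclidean distance is 5 and the similarity is 1/6.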

5 Clustering Algorithms: Hierarchical: Agglomerative, Divisive. Non-Hierarchical: Partitioning (e.g. GNG, DBSCAN, K-means), Clumping (e.g. Fuzzy C-means), Probabilistic (e.g. Mixture of Gaussians).

6 Possible Applications: Marketing, Biology & Medical Sciences, Libraries, Insurance, City Planning, WWW.

7 Growing Neural Gas: Proposed by Bernd Fritzke. Parameters are constant in time. Incremental. Adaptive. Based on Competitive Hebbian Learning.

8 Parameters in GNG: e_b: learning rate of the winner node. e_n: learning rate of the winner's neighbours. lambda: number of input signals between insertions of a new node. alpha: error decrease factor applied to the highest-error node and its highest-error neighbour when a new node is inserted. beta: error decrease factor applied to all nodes.

9 Algorithm:
1) Initialise a set A to contain two nodes at positions chosen randomly according to the probability distribution p(ξ).
2) Generate an input signal ξ according to p(ξ).
3) Determine the winner node s1 and the second-nearest node s2, with s1 and s2 in A.
4) Create an edge between s1 and s2 if none exists. Set its age to 0.
5) Increase the error of s1 by the distance between ξ and s1.
6) Move s1 and its neighbours towards the input signal by fractions e_b and e_n, respectively, of the difference between the coordinates.
7) Increment the age of all edges emanating from s1.
8) Delete all edges with age >= max_age. Delete nodes with no edges.
9) If the number of input signals generated so far is a multiple of λ, insert a new node r:
   a) Find the node q with the largest error and the neighbour f of q with the largest error.
   b) Assign r the mean position of q and f, and set error_r = (error_q + error_f) / 2.
   c) error_q -= α * error_q and error_f -= α * error_f.
   d) Add r to A.
10) Decrease the error of every node i by β * error_i.
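The ten steps above can be sketched in Python roughly as follows. This is an illustrative sketch, not the presenter's implementation: the class name `GNG`, the `sample` callable, the helper `uniform_square` and all default parameter values are assumptions. One detail is taken from Fritzke's paper rather than from the slide: on insertion, the new node r is connected to q and f and the edge between q and f is removed, since otherwise r would be deleted at the next step 8.

```python
import math
import random


def uniform_square():
    """Hypothetical input source: one point drawn uniformly from the unit square."""
    return (random.random(), random.random())


class GNG:
    """Minimal Growing Neural Gas sketch following the ten steps listed above."""

    def __init__(self, sample, e_b=0.05, e_n=0.006, lam=100,
                 alpha=0.5, beta=0.0005, max_age=50):
        # sample: callable drawing one input vector from p(xi) (an assumption).
        # Default values are illustrative only; keep max_age > 1.
        self.sample = sample
        self.e_b, self.e_n, self.lam = e_b, e_n, lam
        self.alpha, self.beta, self.max_age = alpha, beta, max_age
        self.pos = {0: list(sample()), 1: list(sample())}     # step 1
        self.err = {0: 0.0, 1: 0.0}
        self.age = {}        # frozenset({a, b}) -> age of edge a-b
        self.next_id = 2
        self.seen = 0        # number of input signals generated so far

    def neighbors(self, n):
        """Nodes that share an edge with n."""
        return [m for edge in self.age if n in edge for m in edge if m != n]

    def step(self):
        xi = self.sample()                                    # step 2
        self.seen += 1
        ranked = sorted(self.pos, key=lambda n: math.dist(self.pos[n], xi))
        s1, s2 = ranked[0], ranked[1]                         # step 3
        self.age[frozenset((s1, s2))] = 0                     # step 4
        self.err[s1] += math.dist(self.pos[s1], xi)           # step 5
        movers = [(s1, self.e_b)] + [(m, self.e_n) for m in self.neighbors(s1)]
        for n, rate in movers:                                # step 6
            self.pos[n] = [w + rate * (x - w) for w, x in zip(self.pos[n], xi)]
        for edge in self.age:                                 # step 7
            if s1 in edge:
                self.age[edge] += 1
        for edge in [e for e, a in self.age.items() if a >= self.max_age]:
            del self.age[edge]                                # step 8: old edges
        linked = {n for edge in self.age for n in edge}
        for n in [n for n in self.pos if n not in linked]:
            del self.pos[n], self.err[n]                      # step 8: lone nodes
        if self.seen % self.lam == 0:                         # step 9
            q = max(self.err, key=self.err.get)
            f = max(self.neighbors(q), key=self.err.get)
            r, self.next_id = self.next_id, self.next_id + 1
            self.pos[r] = [(a + b) / 2 for a, b in zip(self.pos[q], self.pos[f])]
            self.err[r] = (self.err[q] + self.err[f]) / 2
            self.err[q] -= self.alpha * self.err[q]
            self.err[f] -= self.alpha * self.err[f]
            # From Fritzke's paper, not the slide: rewire q-f through r.
            self.age.pop(frozenset((q, f)), None)
            self.age[frozenset((q, r))] = 0
            self.age[frozenset((f, r))] = 0
        for n in self.err:                                    # step 10
            self.err[n] -= self.beta * self.err[n]
```

A run would look like `g = GNG(uniform_square)` followed by repeated calls to `g.step()`; afterwards `g.pos` holds the node positions and the keys of `g.age` the edges of the learned graph.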

10 Demo of GNG. Reference: http://homepages.feis.herts.ac.uk/~nngroup/software.php

11 DBSCAN: Density-Based Spatial Clustering of Applications with Noise. Proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996. Finds clusters starting from the estimated density of points. Two parameters: epsilon (eps) and minimum points (minPts). eps can be estimated.
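One common way to estimate eps, offered here as an assumption about what the slide has in mind, is the sorted k-distance heuristic: compute each point's distance to its k-th nearest neighbour, sort these values in descending order, and pick eps near the "knee" of the resulting curve. The function name `k_distances` is hypothetical, and the brute-force search is only suitable for small data sets.

```python
import math


def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbour, sorted descending."""
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result, reverse=True)
```

Points in dense regions produce small values at the tail of the list, while isolated points produce large values at the head; a candidate eps lies where the curve flattens.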

12 Algorithm: Reference: slides by Francesco Satini, PhD student, IMT
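As a rough illustration of the density-based idea, the following is a minimal sketch of DBSCAN in its commonly described form. It is not taken from the referenced slides; the names `region_query` and `dbscan`, the label conventions and the brute-force neighbourhood search are assumptions made for this example.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)      # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = region_query(points, i, eps)
        if len(seeds) < min_pts:       # not a core point; may become border later
            labels[i] = NOISE
            continue
        cluster += 1                   # start a new cluster from core point i
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:     # previously noise: now a border point
                labels[j] = cluster
            if labels[j] is not None:  # already assigned, do not expand again
                continue
            labels[j] = cluster
            nbrs = region_query(points, j, eps)
            if len(nbrs) >= min_pts:   # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels
```

With two well-separated groups of four points and one distant point, `dbscan(points, eps=1.5, min_pts=3)` assigns the groups labels 0 and 1 and marks the distant point as noise.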

13 Comparing GNG & DBSCAN: Time complexity. Capability of tackling high-dimensional data. Performance. Number of initial parameters. Performance with moving data.

14 Data to be used: Mainly design data.

15 References:
Jim Holmström: Master Thesis, Growing Neural Gas: Experiments with GNG, GNG with Utility and Supervised GNG.
M. Ester, H.-P. Kriegel, J. Sander, X. Xu: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining, 1996.
Competitive learning: http://homepages.feis.herts.ac.uk/~nngroup/software.php
www.utdallas.edu/~lkhan/Spring2008G/DBSCAN.ppt
B. Fritzke: A growing neural gas network learns topologies.
Jose Alfredo F. Costa and Ricardo S. Oliveira: Cluster Analysis using Growing Neural Gas and Graph Partitioning.

