Unsupervised Learning. Business School, Institute of Business Informatics. Uwe Lämmel, www.wi.hs-wismar.de/~laemmel, U.laemmel@wi.hs-wismar.de
Neural Networks: Idea; Artificial Neuron & Network; Supervised Learning; Unsupervised Learning; Data Mining – other Techniques
Unsupervised Learning: Self-Organizing Map (SOM); Learning; Clustering – Example; Visualisation; Application: TSP
Self-Organizing Maps (SOM): A natural brain can organize itself. Now we look at the position of a neuron and its neighbourhood. Kohonen Feature Map: a two-layer pattern associator. The input layer is fully connected with the map layer. The neurons of the map layer are (virtually) fully connected to each other.
Clustering: A mapping f takes each element a_i of the input set A to an output in B. Objective: all inputs of a class are mapped onto one and the same neuron. Problem: the classification in the input space is unknown. The network performs a clustering.
Winner Neuron (figure labels: input layer, Kohonen layer, winner neuron)
Learning in an SOM: 1. Choose an input k randomly. 2. Detect the neuron z which has the maximal activity: the winner neuron. 3. Adapt the weights in the neighbourhood of z, i.e. of every neuron i within a radius r of z. 4. Stop if a certain number of learning steps is finished; otherwise decrease the learning rate and the radius and go on with step 1.
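The four learning steps can be sketched in plain Python as follows. This is a minimal illustration, not the Pascal program used in the lecture. The function names `train_som` and `winner`, the linear decay schedule, the Gaussian neighbourhood and all default parameter values are assumptions. The winner is taken to be the neuron with the smallest Euclidean distance to the input.

```python
import math
import random


def train_som(patterns, rows, cols, steps=2000, eta0=0.5, radius0=None, seed=0):
    """Train a rows x cols Kohonen map on a list of equal-length input vectors."""
    rng = random.Random(seed)
    dim = len(patterns[0])
    # One weight vector per map neuron, initialised with random values in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    if radius0 is None:
        radius0 = max(rows, cols) / 2.0

    for t in range(steps):
        # Decrease learning rate and radius linearly (one possible schedule).
        frac = 1.0 - t / steps
        eta = eta0 * frac
        radius = max(radius0 * frac, 0.5)

        # Step 1: choose an input pattern at random.
        x = rng.choice(patterns)

        # Step 2: the winner z is the neuron whose weight vector is closest to x.
        z = min(weights, key=lambda j: math.dist(x, weights[j]))

        # Step 3: move every neuron within the radius of z towards x.
        for j, w in weights.items():
            grid_dist = math.dist(j, z)
            if grid_dist <= radius:
                # Gaussian neighbourhood: neurons nearer to z move more.
                h = math.exp(-(grid_dist ** 2) / (2 * radius ** 2))
                for k in range(dim):
                    w[k] += eta * h * (x[k] - w[k])
    # Step 4: the loop stops after the fixed number of learning steps.
    return weights


def winner(weights, x):
    """Return the map position of the neuron that responds to x."""
    return min(weights, key=lambda j: math.dist(x, weights[j]))


# Illustrative data: two small groups of 2-dimensional inputs.
patterns = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
som = train_som(patterns, rows=3, cols=3)
for p in patterns:
    print(p, "->", winner(som, p))
```

After training, inputs that lie close together should be answered by the same or by neighbouring map neurons, which is the clustering effect described on the slides.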
Centre of Activation. Idea: highly activated neurons push down the activation of neurons in the neighbourhood. Problem: finding the centre of activation. Either the neuron j with a maximal net input, or the neuron z whose weight vector $w_z$ is most similar to the input vector $x$ (Euclidean distance): $\|x - w_z\| = \min_j \|x - w_j\|$
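The two criteria for the centre of activation do not always select the same neuron. The sketch below compares them on made-up values. The function names and the numbers are assumptions chosen for illustration. It relies on the identity $\|x - w\|^2 = \|x\|^2 + \|w\|^2 - 2\,w \cdot x$: if all weight vectors have the same length, the maximal net input and the minimal distance coincide.

```python
import math


def winner_by_net_input(x, weights):
    """Index of the neuron with the maximal net input (dot product w_j . x)."""
    return max(range(len(weights)),
               key=lambda j: sum(wi * xi for wi, xi in zip(weights[j], x)))


def winner_by_distance(x, weights):
    """Index z with ||x - w_z|| = min_j ||x - w_j|| (Euclidean distance)."""
    return min(range(len(weights)), key=lambda j: math.dist(x, weights[j]))


def normalise(v):
    """Scale a 2-dimensional vector to unit length."""
    n = math.hypot(v[0], v[1])
    return [vi / n for vi in v]


x = [0.6, 0.8]
weights = [[3.0, 0.5], [0.5, 0.9], [0.0, 1.0]]  # illustrative values

# Unnormalised: the long vector [3.0, 0.5] has the largest dot product,
# although [0.5, 0.9] is the closest weight vector.
print(winner_by_net_input(x, weights), winner_by_distance(x, weights))  # 0 1

# With unit-length weight vectors both criteria pick the same neuron.
unit = [normalise(w) for w in weights]
print(winner_by_net_input(x, unit), winner_by_distance(x, unit))  # 1 1
```

Using the Euclidean distance directly avoids the need to normalise the weight vectors.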
SOM Training (figure labels: input pattern $m_p$, weights $W_j$, Kohonen layer): Find the winner neuron z for an input pattern p (minimal Euclidean distance). Adapt the weights of the connections from the input neurons to the winner neuron and from the input neurons to its neighbours.
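The slide names which weights are adapted but does not state the update rule. A commonly used form of the Kohonen rule is given here as an assumption, not as the formula from the lecture. The symbols $\eta(t)$ (learning rate), $r(t)$ (radius), $d(z, j)$ (distance between neurons z and j on the map) and $h_{zj}(t)$ (neighbourhood function) do not appear on the slide.

```latex
w_j \leftarrow w_j + \eta(t)\, h_{zj}(t)\, (m_p - w_j),
\qquad
h_{zj}(t) = \exp\!\left( -\frac{d(z, j)^2}{2\, r(t)^2} \right)
```

For the winner itself $d(z, z) = 0$, so $h_{zz} = 1$ and the winner moves towards $m_p$ by the full learning rate. Neighbours move by a smaller amount the farther they are from z.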
Example: Credit Scoring. Attributes: A1: Credit History, A2: Debts, A3: Collateral, A4: Income. We do not look at the classification; the SOM performs a clustering.
Credit Scoring: good = {5, 6, 9, 10, 12}; average = {3, 8, 13}; bad = {1, 2, 4, 7, 11, 14}
Credit Scoring: Pascal tool box (1991), 10x10 neurons, 32,000 training steps
Visualisation of a SOM: Colour reflects Euclidean distance to input. Weights used as coordinates of a neuron. Colour reflects cluster. Demos: NetDemo, ColorDemo, TSPDemo.
Example: TSP (Travelling Salesman Problem). Experiment: Pascal program, 1998. A salesman has to visit certain cities and then return home. Find an optimal route! Trying all routes is infeasible: for n cities there are (n-1)! routes, which grows even faster than exponentially. 31/32 states in Mexico?
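To get a feeling for the (n-1)! figure, the number of routes can be computed directly. A small sketch follows; the function name `route_count` is an assumption. It counts each direction of travel separately. If the distance between two cities is the same in both directions, the number of distinct tours is half as large.

```python
import math


def route_count(n):
    """Number of round trips through n cities when the home city is fixed."""
    return math.factorial(n - 1)


for n in (5, 10, 20, 32):
    print(n, route_count(n))
```

For 5 cities there are 24 routes, for 10 cities already 362880, which is why heuristic methods are used for larger problems.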
Nearest Neighbour: Example. Some cities in Northern Germany: Kiel, Rostock, Berlin, Hamburg, Hannover, Frankfurt, Essen, Schwerin. The initial city is Hamburg. Exercise: Put in the coordinates of 20 important places. Find a solution for the TSP using a SOM!
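The nearest-neighbour heuristic named in the slide title can be sketched as follows. The coordinates are approximate latitude and longitude values added for illustration only; they are not taken from the slides. Plain Euclidean distance on degrees is used as a rough stand-in for road distance. The function names are assumptions.

```python
import math

# Approximate (latitude, longitude) in degrees; illustrative, not from the slides.
CITIES = {
    "Hamburg": (53.55, 9.99),
    "Kiel": (54.32, 10.14),
    "Rostock": (54.09, 12.14),
    "Berlin": (52.52, 13.40),
    "Hannover": (52.37, 9.73),
    "Frankfurt": (50.11, 8.68),
    "Essen": (51.46, 7.01),
    "Schwerin": (53.63, 11.41),
}


def nearest_neighbour_tour(cities, start):
    """Greedy tour: always travel to the closest city not yet visited."""
    tour = [start]
    unvisited = set(cities) - {start}
    while unvisited:
        here = cities[tour[-1]]
        # sorted() makes the choice deterministic if two distances tie.
        nxt = min(sorted(unvisited), key=lambda c: math.dist(here, cities[c]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def tour_length(cities, tour):
    """Length of the closed round trip, including the return to the start."""
    return sum(math.dist(cities[a], cities[b])
               for a, b in zip(tour, tour[1:] + tour[:1]))


tour = nearest_neighbour_tour(CITIES, "Hamburg")
print(" -> ".join(tour + [tour[0]]))
print(round(tour_length(CITIES, tour), 2))
```

The heuristic is fast but greedy: the last legs of the trip can become long, so the result is usually not the optimal route.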
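SOM solves TSP. The inputs X and Y are connected to the Kohonen layer. Draw neuron i at position $(x, y) = (w_{1i}, w_{2i})$, where $w_{1i} = s_{ix}$ is the weight from input X and $w_{2i} = s_{iy}$ is the weight from input Y.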
SOM solves TSP. Initialisation of weights: the weights to the inputs (x, y) are calculated so that all neurons form a circle. The initial circle will be expanded to a round trip. Solutions for problems with several hundred towns are possible. The solution may not be optimal!
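The idea of a circle of neurons that expands into a round trip can be sketched in Python. This is a minimal illustration under stated assumptions, not the Pascal program from the lecture. The function name `som_tsp`, the number of neurons per city, the decay schedule and the Gaussian neighbourhood along the ring are all assumptions.

```python
import math
import random


def som_tsp(cities, neurons_per_city=3, steps=5000, eta0=0.8, seed=0):
    """Return a visiting order (city indices) found with a ring-shaped SOM."""
    rng = random.Random(seed)
    m = neurons_per_city * len(cities)

    # Initialisation: all neurons form a circle around the centre of the cities.
    cx = sum(x for x, _ in cities) / len(cities)
    cy = sum(y for _, y in cities) / len(cities)
    spread = max(math.dist((cx, cy), c) for c in cities)
    ring = [[cx + 0.5 * spread * math.cos(2 * math.pi * i / m),
             cy + 0.5 * spread * math.sin(2 * math.pi * i / m)]
            for i in range(m)]

    for t in range(steps):
        # Learning rate and neighbourhood radius shrink over time.
        frac = 1.0 - t / steps
        eta = eta0 * frac
        radius = max(m / 8 * frac, 1.0)

        # Present a random city and find the closest neuron (the winner).
        city = rng.choice(cities)
        z = min(range(m), key=lambda i: math.dist(city, ring[i]))

        # Pull the winner and its ring neighbours towards the city.
        for i in range(m):
            d = min(abs(i - z), m - abs(i - z))  # distance along the closed ring
            h = math.exp(-(d * d) / (2 * radius * radius))
            ring[i][0] += eta * h * (city[0] - ring[i][0])
            ring[i][1] += eta * h * (city[1] - ring[i][1])

    # Read off the tour: order the cities by the ring index of their winner.
    def winner(c):
        return min(range(m), key=lambda i: math.dist(c, ring[i]))

    return sorted(range(len(cities)), key=lambda k: winner(cities[k]))


# Illustrative coordinates, not taken from the slides.
cities = [(0, 0), (1, 0), (2, 1), (1, 2), (0, 1), (2, 0)]
order = som_tsp(cities)
print(order)
```

The returned list is a visiting order; the salesman returns from the last city to the first. As the slide notes, the tour found this way is not guaranteed to be optimal.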
Applications: Data Mining, Clustering (customer data, weblogs, ...). You have a lot of data, but no teaching data available: unsupervised learning. You have at least an idea about the result. Can be applied as a first approach to get some training data for supervised learning.