Fast algorithm and implementation of dissimilarity self-organizing maps Reporter: Ming-Jui Kuo D9515007.


2/43 Outline
- Introduction
- The DSOM
- Simulation Results


4/43 A drawback of the standard SOM
The standard SOM assumes that its inputs are vectors from a fixed, finite-dimensional vector space. Unfortunately, many real-world data depart strongly from this model. It is quite common, for instance, to have variable-sized data. These are natural in online handwriting recognition, where the representation of a character drawn by the user can vary in length because of the drawing conditions. Other data, such as texts, are strongly non-numerical and have a complex internal structure: they are very difficult to represent accurately in a vector space.

5/43 Related papers
[1] Teuvo Kohonen and Panu Somervuo, “Self-organizing maps of symbol strings,” Neurocomputing, vol. 21, 1998.
[2] Teuvo Kohonen and Panu Somervuo, “How to make large self-organizing maps for nonvectorial data,” Neural Networks, vol. 15, 2002.
[3] Aïcha El Golli, Brieuc Conan-Guez, and Fabrice Rossi, “A Self-Organizing Map for dissimilarity data,” in Proceedings of IFCS, 2004.
[4] Aïcha El Golli, “Speeding up the self-organizing map for dissimilarity data.”

6/43 Alias names
Median self-organizing map: Median SOM [2]
Dissimilarity self-organizing map: DSOM [3]

7/43 Related web sites by Fabrice Rossi

8/43 The goal of this paper
A major drawback of the DSOM is that its running time can be very high, especially when compared to the standard vector SOM. It is well known that the SOM algorithm behaves linearly in the number of input data; in contrast, the DSOM behaves quadratically in this number.

9/43 The goal of this paper (cont’d)
In this paper, the authors propose several modifications of the basic algorithm that allow a much faster implementation. The quadratic nature of the algorithm cannot be avoided, essentially because dissimilarity data are intrinsically described by a quadratic number of pairwise dissimilarities.

10/43 The goal of this paper (cont’d)
The standard DSOM algorithm has a cost proportional to N²M, where N is the number of observations and M is the number of clusters the algorithm has to produce, whereas the modifications of this paper lead to a cost proportional to N² + NM² (the savings are in the representation phase). An important property of all the modifications in this paper is that the resulting algorithm produces exactly the same results as the standard DSOM algorithm.

11/43 Dissimilarity Data
For a given data set X, a dissimilarity measure quantifies how different two data instances are (one-to-one, pairwise comparisons). When the data live in a metric space, a distance can serve as the dissimilarity measure.
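As a concrete illustration of pairwise dissimilarities on non-vectorial, variable-length data, one can build the matrix from edit distances; the toy strings and helper function below are a hypothetical sketch, not taken from the paper:

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

# Pairwise dissimilarity matrix D[i][j] = d(x_i, x_j) on a toy data set.
X = ["map", "maps", "soap", "som"]
D = [[edit_distance(a, b) for b in X] for a in X]
```

The resulting matrix is symmetric with a zero diagonal, which is all the DSOM needs: it never looks at the raw data, only at D.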


15/43 The DSOM algorithm
1: choose initial values for the prototypes {Initialization phase}
2: for l = 1 to L do
3:   for all observations x_i do {Template for the affectation phase}
4:     compute the affectation (best matching unit) of x_i
5:   end for
6:   for all clusters j do {Template for the representation phase}
7:     compute the new prototype of cluster j
8:   end for
9: end for
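A minimal NumPy sketch of one such iteration, assuming (as in the median SOM) that prototypes are themselves observations, so both phases can be computed from the dissimilarity matrix D alone; the function name and the simplified affectation rule are illustrative, not the paper's exact formulation:

```python
import numpy as np

def dsom_iteration(D, proto, h):
    """One iteration of a simplified DSOM.
    D: (N, N) symmetric dissimilarity matrix; proto: one observation index
    per map unit (the prototypes); h: (M, M) neighborhood kernel on the map.
    """
    N, M = D.shape[0], len(proto)
    # Affectation phase: assign each observation to its best matching unit.
    assign = np.argmin(D[:, proto], axis=1)
    # Representation phase (naive): unit j adopts the observation k that
    # minimizes the neighborhood-weighted sum of dissimilarities.  Each
    # candidate costs O(N), so the whole phase costs O(N^2 M).
    new_proto = np.empty(M, dtype=int)
    for j in range(M):
        best_k, best_cost = 0, np.inf
        for k in range(N):
            cost = np.dot(h[j, assign], D[:, k])
            if cost < best_cost:
                best_k, best_cost = k, cost
        new_proto[j] = best_k
    return new_proto, assign
```

The inner double loop of the representation phase is exactly where the paper's speed-ups (partial sums, early stopping) apply.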


17/43 The DSOM

18/43 Partial Sums
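The partial-sums idea can be sketched as follows: precompute, for every candidate observation k and every unit m, the sum S[k, m] of dissimilarities from k to the members of cluster m. Building S touches each entry of D once (O(N²)), after which scoring every candidate for every unit costs O(NM²) instead of the naive O(N²M). The function below is an illustrative sketch under those assumptions, not the paper's exact code:

```python
import numpy as np

def representation_with_partial_sums(D, assign, h):
    """Representation phase using per-cluster partial sums.
    S[k, m] = sum of D[i, k] over observations i assigned to unit m, so the
    cost of candidate k for unit j is sum_m h[j, m] * S[k, m]."""
    N, M = D.shape[0], h.shape[0]
    S = np.zeros((N, M))
    for m in range(M):
        members = np.flatnonzero(assign == m)
        if members.size:
            S[:, m] = D[:, members].sum(axis=1)   # partial sums, O(N^2) total
    cost = S @ h.T                  # cost[k, j], computed in O(N * M^2)
    return np.argmin(cost, axis=0)  # best candidate index per unit
```

Because the weighted cost is just a re-grouping of the naive sum, the selected prototypes are identical to the naive computation, matching the paper's exactness guarantee.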

19/43 Early Stopping
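Early stopping exploits the fact that dissimilarities are nonnegative: while scanning the candidates for a unit, a candidate can be abandoned as soon as its running cost exceeds the best total found so far, without changing the result. A small illustrative sketch (names are hypothetical):

```python
def best_candidate(D, weights, candidates):
    """Return (k, cost) minimizing sum_i weights[i] * D[i][k].
    Since every term is nonnegative, a candidate whose running sum already
    reaches the incumbent's total can never win, so the scan of its
    remaining terms is skipped; the selected minimizer is unchanged."""
    best_k, best_cost = None, float("inf")
    for k in candidates:
        total = 0.0
        for i, w in enumerate(weights):
            total += w * D[i][k]
            if total >= best_cost:   # cannot beat the incumbent, abandon k
                break
        else:
            best_k, best_cost = k, total
    return best_k, best_cost
```

The worst case is still a full scan, but in practice most candidates are rejected after a few terms, which is where the empirical speed-up comes from.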
