Self-organizing maps (SOMs) and k-means clustering: Part 1 Steven Feldstein The Pennsylvania State University Trieste, Italy, October 21, 2013 Collaborators: Sukyoung Lee, Nat Johnson

Teleconnection Patterns Atmospheric teleconnections are spatial patterns that link remote locations across the globe (Wallace and Gutzler 1981; Barnston and Livezey 1987). Teleconnection patterns span a broad range of time scales, from just beyond the period of synoptic-scale variability to interannual and interdecadal time scales.

Methods for Determining Teleconnection Patterns
Empirical Orthogonal Functions (EOFs) (Kutzbach 1967)
Rotated EOFs (Barnston and Livezey 1987)
One-point correlation maps (Wallace and Gutzler 1981)
Empirical Orthogonal Teleconnections (van den Dool 2000)
Self Organizing Maps (SOMs) (Hewitson and Crane 2002)
k-means cluster analysis (Michelangeli et al. 1995)

Advantages and Disadvantages of various techniques
Empirical Orthogonal Functions (EOFs): patterns maximize variance and are easy to use, but they are orthogonal in space and time and symmetric between phases, i.e., they may not be realistic, and they cannot identify the continuum
Rotated EOFs: patterns are more realistic than EOFs, but there is some arbitrariness, and they cannot identify the continuum
One-point correlation maps: realistic patterns, but the patterns are not objectively organized, i.e., there is a different pattern for each grid point
Self Organizing Maps (SOMs): realistic patterns; allow for a continuum, i.e., many NAO-like patterns, and for asymmetry between phases; but harder to use
k-means cluster analysis: Michelangeli et al. 1995

The dominant Northern Hemisphere teleconnection patterns (maps from the Climate Prediction Center): the North Atlantic Oscillation and the Pacific/North American pattern

Aim of EOF, SOM analysis, and k-means clustering: to reduce a large amount of data to a small number of representative patterns that capture a large fraction of the variability, with spatial patterns that resemble the observed data.
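As a concrete sketch of this kind of data reduction, here is a minimal k-means implementation in Python. It is illustrative only: the toy data, the function name `kmeans`, and the parameter choices are assumptions, not code from the presentation.

```python
import numpy as np

def kmeans(data, k=4, n_iter=50, seed=0):
    """Plain k-means sketch: reduce many samples to k representative patterns."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k randomly chosen samples
    centroids = data[rng.choice(len(data), k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each sample to its nearest centroid (Euclidean distance)
        labels = np.linalg.norm(data[:, None] - centroids[None],
                                axis=2).argmin(axis=1)
        # Move each centroid to the mean of its assigned samples
        for i in range(k):
            if (labels == i).any():
                centroids[i] = data[labels == i].mean(axis=0)
    return centroids, labels

rng = np.random.default_rng(1)
data = rng.random((200, 3))            # 200 toy samples in 3 dimensions
centroids, labels = kmeans(data)
```

For atmospheric applications, each row of `data` would be one daily map flattened into a vector, and the k centroids would be the representative patterns.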

Link between the PNA and Tropical Convection [figure from Horel and Wallace (1981), with the region of enhanced convection marked]

A SOM Example: Northern Hemispheric Sea Level Pressure (SLP) [figure: SOM patterns P1, P2, and P3]

Another SOM Example (Higgins and Cassano 2009)

A third example

How SOM patterns are determined
Transform the 2D sea-level pressure (SLP) data onto an N-dimensional phase space, where N is the number of gridpoints. Then, minimize the Euclidean distance ||x − m_i|| between the daily data and the SOM patterns, where x is the daily data (SLP) in the N-dimensional phase space, the m_i are the SOM patterns, and i is the SOM pattern number.
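The first step, viewing each 2-D map as a single point in an N-dimensional phase space, is just a flattening of the grid. A minimal sketch (the toy grid size and variable names are illustrative):

```python
import numpy as np

# Toy 2-D SLP field on a latitude x longitude grid (values are arbitrary)
slp = np.arange(12.0).reshape(3, 4)    # 3 latitudes x 4 longitudes

# "Phase space" view: the daily map becomes one point in N dimensions,
# where N is the number of gridpoints
x = slp.ravel()
N = x.size                             # N = 12 for this toy grid
```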

How SOM patterns are determined
E is the average quantization error: the Euclidean distance between each daily map and its best-matching SOM pattern, averaged over all days,
E = (1/T) Σ_t ||x_t − m_c(t)||, where c(t) is the index of the best-matching pattern for day t and T is the total number of days.
The m_i (SOM patterns) are obtained by minimizing E.
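With the daily data and SOM patterns held as arrays, E can be computed directly. A minimal sketch with random toy data (the array shapes and names are assumptions for illustration):

```python
import numpy as np

# Toy sizes: T=100 daily maps, N=4 gridpoints, K=6 candidate SOM patterns
rng = np.random.default_rng(0)
x = rng.standard_normal((100, 4))      # daily SLP vectors in phase space
som = rng.standard_normal((6, 4))      # SOM patterns

# Euclidean distance from every day to every SOM pattern: shape (T, K)
dists = np.linalg.norm(x[:, None, :] - som[None, :, :], axis=2)

# Each day's best-matching pattern, and the average quantization error E
bmu = dists.argmin(axis=1)
E = dists.min(axis=1).mean()
```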

SOM Learning (schematic)
Initial lattice (set of nodes) → randomly chosen data vector → best-matching unit (BMU) → nearby nodes adjusted (with neighbourhood kernel) → convergence: nodes match data

SOM Learning
1. The initial lattice (set of nodes) is specified (from random data or from EOFs).
2. A data vector is chosen at random and compared to the lattice.
3. The winning node (Best Matching Unit; BMU), the node with the smallest Euclidean distance to the data vector, is selected.
4. Nodes within a certain radius of the BMU are adjusted toward the data vector; the radius diminishes with each time step.
5. Repeat steps 2-4 until convergence.
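The five steps above can be sketched as a minimal 1-D SOM in Python. The learning-rate and radius schedules below are illustrative choices, not the ones used in the presentation:

```python
import numpy as np

def train_som(data, n_nodes=6, n_steps=2000, seed=0):
    """Minimal 1-D SOM following steps 1-5 above (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Step 1: initialise the lattice from randomly chosen data vectors
    nodes = data[rng.choice(len(data), n_nodes, replace=False)].copy()
    for t in range(n_steps):
        # Step 2: pick a data vector at random
        x = data[rng.integers(len(data))]
        # Step 3: winning node (BMU) = smallest Euclidean distance
        bmu = np.argmin(np.linalg.norm(nodes - x, axis=1))
        # Step 4: shrink the learning rate and radius with the time step
        lr = 0.5 * (1.0 - t / n_steps)
        radius = max(n_nodes / 2 * (1.0 - t / n_steps), 0.5)
        # Gaussian neighbourhood kernel on the 1-D lattice:
        # nodes near the BMU are pulled toward x, distant nodes barely move
        lattice_dist = np.abs(np.arange(n_nodes) - bmu)
        h = np.exp(-(lattice_dist ** 2) / (2 * radius ** 2))
        nodes += lr * h[:, None] * (x - nodes)
    return nodes                        # step 5: stop after n_steps

rng = np.random.default_rng(1)
data = rng.random((500, 2))             # uniform 2-D data in [0, 1]
nodes = train_som(data)
```

In practice the lattice is usually 2-D and convergence is monitored via the quantization error; the sketch keeps the lattice 1-D and runs a fixed number of steps for brevity.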

How SOM spatial patterns are determined
Transform the SOM patterns from phase space back to physical space (to obtain the SLP SOM patterns).
Each day is associated with a SOM pattern.
Calculate a frequency, f, for each SOM pattern, i.e., f(m_i) = number of days m_i is chosen / total number of days.
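The frequency calculation can be sketched in a few lines, given the daily data and SOM patterns as arrays (toy shapes and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((100, 4))      # 100 days, 4 gridpoints (toy sizes)
som = rng.standard_normal((6, 4))      # 6 SOM patterns

# Assign each day to its best-matching SOM pattern
bmu = np.linalg.norm(x[:, None, :] - som[None, :, :],
                     axis=2).argmin(axis=1)

# f(i) = number of days pattern i is chosen / total number of days
f = np.bincount(bmu, minlength=len(som)) / len(x)
```

The frequencies sum to one across the SOM patterns, since every day is assigned to exactly one pattern.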

SOMs are special! Among clustering techniques, SOM analysis is unique in that it arranges its patterns on a 2D grid with similar patterns nearby and dissimilar patterns widely separated.

Some Background on SOMs SOM analysis is a type of Artificial Neural Network that (usually) generates a 2-dimensional map. This results in a low-dimensional view of the original high-dimensional data, e.g., reducing thousands of daily maps to a small number of maps. SOMs were developed by Teuvo Kohonen of Finland.

Artificial Neural Networks
Artificial Neural Networks are used in many fields. They are loosely based upon the central nervous system of animals. In SOM analysis:
Input = daily fields
Hidden = minimization of Euclidean distance
Output = SOM patterns

A simple conceptual example of SOM analysis: uniformly distributed data between 0 and 1 in two dimensions

A table tennis example (spin of the ball)
Spin occurs primarily along two axes of rotation, with an infinite number of angular velocities along each axis.
Input: the three senses (sight, sound, touch) provide feedback, as in SOM learning
Hidden: the brain processes the information from the senses to produce the output
Output: a SOM grid of various amounts of spin on the ball; the SOM grid is different for every person
(Joo SaeHyuk, 주세혁)