A weight-incorporated similarity-based clustering ensemble method based on swarm intelligence Yue Ming NJIT#: 31351707

Outline
Background
Introduction to K-means
K-MWO algorithm
Clustering ensemble method
Simulation and conclusion

Clustering
Courses on data mining or machine learning usually start with clustering because it is both simple and useful. It is an important part of the somewhat wider area of unsupervised learning, where the data we want to describe are not labeled.
Difference from classification: there are no labels.

Clustering
Given a set of data points, group them into clusters so that:
Points within each cluster are similar to each other
Points from different clusters are dissimilar

K-Means algorithm
An effective, widely used, all-around iterative clustering algorithm.
Aim: improve the clustering objective in each iteration until it converges to a local optimum.
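
Since the slides only name the algorithm, here is a minimal sketch of the standard Lloyd-style K-means iteration (assignment step, then center-update step). It assumes NumPy; the function name and defaults are illustrative choices, not from the slides.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal Lloyd's K-means sketch: returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize centers with k random data points; this choice is exactly
    # what the next slide's drawback (1) refers to.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its assigned points.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```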

K-Means algorithm: drawbacks
(1) It is very sensitive to its initial values, as different initializations may lead to different solutions.
(2) It is based on a simple objective function and usually solves the resulting extreme-value problem with gradient-style local updates, so it can easily get trapped in a local optimum.
So… the Mussels Wandering Optimization Algorithm.
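
Drawback (1) is easy to observe empirically. A small sketch, assuming scikit-learn is installed and using a synthetic three-blob dataset (both assumptions are mine, not from the slides): with a single random initialization per run, different seeds can end at different within-cluster SSE values.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic data: three Gaussian blobs in 2-D.
X = np.vstack([rng.normal(loc=c, scale=0.6, size=(50, 2))
               for c in ([0.0, 0.0], [4.0, 0.0], [2.0, 3.0])])

for seed in range(5):
    km = KMeans(n_clusters=3, init="random", n_init=1, random_state=seed).fit(X)
    print(f"seed={seed}  within-cluster SSE={km.inertia_:.2f}")
# With n_init=1 the reached SSE can vary between seeds, which is the
# initialization sensitivity that motivates combining K-means with MWO.
```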

A New Algorithm: K-MWO
Based on K-means and mussels wandering optimization (MWO).

Mussels Wandering Optimization (MWO)
Based on swarm intelligence.
The population consists of N mussels; these individuals live in a certain spatial region of the marine bed called the habitat, which is mapped to a d-dimensional space S^d.
Each mussel has a position in this space.

Mussels Wandering Optimization (MWO)
MWO performs well and provides a new approach for solving complex optimization problems. Hence, by combining MWO with classic K-means, this work proposes K-MWO as a new clustering method based on swarm intelligence.

K-MWO  Each mussel
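
The K-MWO formulas on the following slides are not captured in this transcript, so the block below is only a hedged sketch of the idea the slides suggest: each mussel is assumed to encode one full set of k cluster centers, fitness is the K-means objective (within-cluster SSE), every candidate is refined by one K-means step (the local search part), and a simple random move toward the best mussel stands in for the actual MWO wandering update (the original MWO uses a Lévy-walk style rule). All names and parameters are illustrative.

```python
import numpy as np

def sse(X, centers):
    """K-means objective: total squared distance to the nearest center."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return (d.min(axis=1) ** 2).sum(), d.argmin(axis=1)

def kmeans_step(X, centers, labels):
    """One Lloyd update: move each center to the mean of its assigned points."""
    return np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                     for j in range(len(centers))])

def k_mwo_sketch(X, k, n_mussels=20, n_iters=50, step=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Each "mussel" encodes one candidate solution: a full set of k centers.
    swarm = [X[rng.choice(len(X), size=k, replace=False)] for _ in range(n_mussels)]
    best, best_fit = None, np.inf
    for _ in range(n_iters):
        for i, centers in enumerate(swarm):
            fit, labels = sse(X, centers)
            if fit < best_fit:
                best, best_fit = centers.copy(), fit
            # Local search ability of K-means: one refinement step.
            refined = kmeans_step(X, centers, labels)
            # Global search: placeholder "wandering" move toward the best mussel
            # plus random noise (the real MWO update uses a Levy-walk rule).
            swarm[i] = refined + step * rng.standard_normal(refined.shape) \
                       + step * (best - refined)
    return best, sse(X, best)[1]
```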

BUT…
Although many clustering algorithms are efficient at dealing with specific problems, each of them has its own limitations. They may produce different results on the same dataset, and no single clustering algorithm is applicable to all data structures and dataset types.
Clustering ensemble (CE) is recognized as an important way to address these problems.

Clustering ensemble (CE)
CE uses ensemble technology to produce a new clustering result by integrating several clustering results obtained from different clustering methods, or from the same method with different parameters.
Advantages: it can improve the quality of clustering results, cluster datasets with categorical attributes, and detect and handle isolated points and noise. It can also deal with distributed data sources and process the data in parallel.
This paper presents a new similarity-based clustering ensemble method that incorporates weight information.

Clustering ensemble method
Similarity-based clustering ensemble.
Key idea: the weights of the data points change in every iteration.

Clustering ensemble method
Key idea: start from a similarity-based ensemble method and add weight information.
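
The transcript does not include the WSCE construction itself, so the following is a hedged sketch of one plausible reading of the key idea: base partitions vote into a co-association (similarity) matrix, per-point weights scale those votes and are updated in every iteration, and the weighted matrix is re-clustered to obtain the consensus partition. The weight-update rule, the use of SciPy average-linkage clustering for the consensus step, and all names below are illustrative assumptions, not the paper's method.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def weighted_similarity_ensemble(partitions, k, n_iters=5):
    """Consensus labels from several base partitions of the same n points."""
    n = len(partitions[0])
    weights = np.ones(n)            # per-point weights, updated every iteration
    labels = None
    for _ in range(n_iters):
        # Weighted co-association matrix: S[i, j] accumulates evidence that
        # points i and j belong together, scaled by the current point weights.
        S = np.zeros((n, n))
        for p in partitions:
            p = np.asarray(p)
            same = (p[:, None] == p[None, :]).astype(float)
            S += same * np.sqrt(np.outer(weights, weights))
        S /= S.max()
        # Consensus step: hierarchical clustering on the similarity matrix.
        D = squareform(1.0 - S, checks=False)
        labels = fcluster(linkage(D, method="average"), t=k, criterion="maxclust")
        # Placeholder weight update: points clustered consistently with their
        # consensus cluster get a larger weight in the next iteration.
        agree = np.array([S[i, labels == labels[i]].mean() for i in range(n)])
        weights = n * agree / agree.sum()
    return labels

# Example usage with three base partitions of six points:
parts = [[0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1], [0, 0, 0, 0, 1, 1]]
print(weighted_similarity_ensemble(parts, k=2))
```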

Simulation & conclusion

Conclusion
This work first tests the clustering performance of K-MWO and compares it with K-means and K-PSO to demonstrate its effectiveness. It then chooses three clustering methods as the base inputs to the proposed CE method, WSCE.

Conclusion
This work has proposed a new clustering algorithm, K-MWO, which makes full use of the global optimization ability of MWO and the local search ability of K-means. It has also proposed a new weight-based clustering ensemble framework. The authors validate the framework on various datasets, and the results show its validity and excellent performance on all considered datasets.

Thank you