FLOSCAN: An Artificial Life Based Data Mining Algorithm

Presentation transcript:

FLOSCAN: An Artificial Life Based Data Mining Algorithm A. Bellaachia Computer Science Department School of Engineering and Applied Sciences George Washington University Washington, DC 20052 E-mail: bell@gwu.edu A. Bellaachia April 21, 2019

Outline: Introduction; Artificial Life; Flocking Behavior; Flocking Parameters; FLOSCAN; Experimental Results; Conclusion & Future Work

Artificial Life Behaviors

Biologically inspired

Flocking Behavior: Introduced by Reynolds in 1987 for computer graphics applications. Based on the flocking behavior of birds. Each bird (boid) is defined by its direction and speed, and the position of each boid is related to the positions and velocities of its neighbors. Simple rules applied to individual boids yield a very interesting global behavior: no leader, no predefined shape, no global constraint.

Flocking Behavior: Basic flocking rules: Separation: steer to avoid crowding local flock-mates or colliding with neighboring boids. Alignment: steer towards the average heading of local flock-mates. Cohesion: steer to move toward the average position of local flock-mates.
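The three rules above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the dictionary layout (`pos`, `vel`), the function name `steer`, and the `min_dist` threshold are hypothetical and do not come from the slides or from Reynolds' original implementation.

```python
import math

def steer(boid, neighbors, min_dist=1.0):
    """Return (separation, alignment, cohesion) steering vectors for one boid.

    boid and each neighbor are dicts with 'pos' and 'vel' as (x, y) tuples.
    The data layout and min_dist are assumptions made for this sketch.
    """
    n = len(neighbors)
    if n == 0:
        return (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)

    px, py = boid["pos"]
    vx, vy = boid["vel"]

    # Separation: push away from neighbors closer than min_dist.
    sep_x = sep_y = 0.0
    for nb in neighbors:
        dx, dy = px - nb["pos"][0], py - nb["pos"][1]
        if math.hypot(dx, dy) < min_dist:
            sep_x += dx
            sep_y += dy

    # Alignment: steer toward the average velocity (heading) of neighbors.
    avg_vx = sum(nb["vel"][0] for nb in neighbors) / n
    avg_vy = sum(nb["vel"][1] for nb in neighbors) / n
    align = (avg_vx - vx, avg_vy - vy)

    # Cohesion: steer toward the average position of neighbors.
    avg_px = sum(nb["pos"][0] for nb in neighbors) / n
    avg_py = sum(nb["pos"][1] for nb in neighbors) / n
    cohesion = (avg_px - px, avg_py - py)

    return (sep_x, sep_y), align, cohesion
```

A full simulation would combine the three vectors with weights and add the result to each boid's velocity; the weights are a tuning choice not specified here.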

Flocking Behavior (Reynolds' Model): Boid parameters: Velocity (the combination of heading and speed); Position; Angle (the boid's field of vision); Minimum distance; Maximum distance.

Flocking Model [Figure: x-y plot of boid i with velocity Vi and a neighbor velocity Vj (or the average velocity of the neighborhood boids), showing the alignment vector VAL, the attraction vector VAT, the new direction VND, and the new position.]

Attraction: [equation not preserved in the transcript], where the missing symbol denotes the maximum distance between two boids.

FLOSCAN: Flocking Model: Boid movement: Velocity = current position (Vi is the current position of boid i). Alignment: VAL = (Vi + Vj)/2, the average of the velocity of boid i and the velocity of the neighboring boid(s) Vj. New direction: VND = KND * (VAL + VAT). No separation rule is used. [Figure: x-y plot showing Vi, Vj, VAL, VAT, VND, and the new position.]

FLOSCAN: Flocking Model (variant): Similar to the previous model except for VAL. Velocity = current position. Vi is ignored in the alignment: VAL = KAL * Vj. New direction: VND = KND * (VAL + VAT). No separation rule is used. [Figure: x-y plot showing Vi, Vj, VAL = KAL * Vj, VAT, VND, and the new position.]
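The two alignment variants and the new-direction formula stated on these two slides, VAL = (Vi + Vj)/2 or VAL = KAL * Vj, and VND = KND * (VAL + VAT), can be written as a short Python sketch. The function names and the tuple representation are assumptions for illustration. The attraction vector VAT is taken as an input because its formula is not preserved in the transcript.

```python
def alignment(v_i, v_j, k_al=None):
    """Alignment vector VAL.

    With k_al=None: VAL = (Vi + Vj) / 2 (first model).
    With k_al given: VAL = KAL * Vj, ignoring Vi (second model).
    """
    if k_al is None:
        return tuple((a + b) / 2 for a, b in zip(v_i, v_j))
    return tuple(k_al * b for b in v_j)

def new_direction(v_al, v_at, k_nd):
    """New direction VND = KND * (VAL + VAT); v_at is supplied by the caller."""
    return tuple(k_nd * (a + b) for a, b in zip(v_al, v_at))
```

For example, `alignment((2, 0), (0, 4))` gives `(1.0, 2.0)`, while `alignment((2, 0), (0, 4), k_al=0.5)` gives `(0.0, 2.0)`.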

FLOSCAN: Alignment and Attraction: Alignment: [equation not preserved in the transcript], where K1 is a constant. Note that Vj can also be the average velocity of the neighborhood boids. Attraction: [equation not preserved in the transcript], where the missing symbol denotes the maximum distance between two boids.

FLOSCAN: New Direction & Position: New direction: [equation not preserved in the transcript], where KND is a constant. New position: first compute the speed vector: [equation not preserved in the transcript].

FLOSCAN: New Position: The new position vector of boid i is calculated as follows: [equation not preserved in the transcript].

FLOSCAN: Example

FLOSCAN Objectives: FLOSCAN is a density-based algorithm. Objectives: (1) Use FLOSCAN as a pre-clustering step for a clustering algorithm: data points that share common features move closer to each other, which enhances the shape of potential clusters and therefore improves the efficiency of the clustering algorithm. (2) FLOSCAN can also be used as either a clustering algorithm or a classification algorithm, capable of discovering clusters of different shapes and detecting noise points.

FLOSCAN Pseudo Code:
  Initialize the above parameters.
  Calculate the distance between each pair of documents.
  For each iteration do
    For each document di do
      Find the neighbors, N, of di using the minimum and maximum distances.
      For each neighbor dj in N do
        Calculate the align vector and attract vector of di using dj.
        Calculate the new direction vector for di.
      End do
      Calculate the new direction vector of di.
    End for
  End for
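The loop structure of the pseudo code can be sketched in Python as follows. This is a sketch under loudly stated assumptions, not the author's implementation: the attraction and position-update equations are not preserved in the transcript, so the movement rule below (move each point a fraction `step` toward the mean of its neighbors) is a stand-in. The name `floscan_sketch`, the `step` parameter, and the reading of "neighbors" as points whose distance lies between the minimum and maximum distances are also assumptions.

```python
import math

def floscan_sketch(points, d_min, d_max, iterations=10, step=0.5):
    """Iteratively move each point toward its neighbors (illustrative stand-in).

    points: list of equal-length coordinate tuples (e.g. document vectors).
    """
    pts = [tuple(p) for p in points]
    for _ in range(iterations):
        updated = []
        for i, p in enumerate(pts):
            # Neighbors: other points whose distance lies in [d_min, d_max].
            nbrs = [q for j, q in enumerate(pts)
                    if j != i and d_min <= math.dist(p, q) <= d_max]
            if not nbrs:
                updated.append(p)  # isolated points (potential noise) stay put
                continue
            # ASSUMPTION: stand-in movement rule, not the slide's formulas.
            mean = [sum(c) / len(nbrs) for c in zip(*nbrs)]
            updated.append(tuple(a + step * (m - a) for a, m in zip(p, mean)))
        pts = updated  # synchronous update: all points move at once
    return pts
```

Points with no neighbors never move, which loosely mirrors the idea that noise points remain isolated while points sharing features drift together.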

Experimental results To measure the performance of FLOSCAN, we use the LDC TDT Corpus. We have randomly chosen about 1,200 stories: about half collected from Reuters newswire and half from CNN broadcast news transcripts. Twenty-five topics were defined in the original release.

Document Representation: Use the vector model to represent documents in the collection. Remove stopwords. Assign a weight to each term in each document using the augmented weight L(tji) = 0.5 + 0.5 * (tf(tji)/tf(max)), where tf(max) = max{ tf(t1i), tf(t2i), ... , tf(tmi) } and m is the maximum number of terms in the collection.
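The augmented weight can be computed per document as in the following Python sketch. It assumes the document has already been tokenized and had stopwords removed; the function name `augmented_weights` is hypothetical.

```python
from collections import Counter

def augmented_weights(tokens):
    """Map each term in one document to 0.5 + 0.5 * tf / tf_max.

    tokens: list of terms of a single document, stopwords already removed.
    tf_max is the largest term frequency within that document.
    """
    tf = Counter(tokens)
    if not tf:
        return {}
    tf_max = max(tf.values())
    return {term: 0.5 + 0.5 * count / tf_max for term, count in tf.items()}
```

The most frequent term in a document always receives weight 1.0, and every term that occurs at least once receives a weight above 0.5.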

Evaluations F-Measure: A combination of IR precision and recall.
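The slide does not give the exact formula. As a minimal sketch, assuming the common balanced form F = 2PR / (P + R), precision and recall of a cluster against a reference topic can be combined as follows. The function names and the set-based representation of clusters and topics are assumptions.

```python
def precision_recall(cluster, topic):
    """Precision and recall of a cluster of document ids against a topic."""
    cluster, topic = set(cluster), set(topic)
    overlap = len(cluster & topic)
    precision = overlap / len(cluster) if cluster else 0.0
    recall = overlap / len(topic) if topic else 0.0
    return precision, recall

def f_measure(precision, recall):
    """Balanced F-measure (harmonic mean); assumed form, not from the slide."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For a cluster {1, 2, 3, 4} and a topic {3, 4, 5, 6, 7, 8}, precision is 0.5, recall is 1/3, and the balanced F-measure is 0.4.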

Evaluations Centroid Similarity (CS): It computes the similarity between the centroids of all clusters. Given a set of k clusters, CS is defined as follows: [equation not preserved in the transcript].
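Because the CS definition is not preserved in the transcript, the following Python sketch shows only one plausible reading: the average pairwise cosine similarity between cluster centroids. Both the averaging and the choice of cosine similarity are assumptions, as are all function names, and the author's actual definition may differ.

```python
import math
from itertools import combinations

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroid_similarity(clusters):
    """ASSUMED reading of CS: mean cosine similarity over all centroid pairs."""
    cents = [centroid(c) for c in clusters]
    pairs = list(combinations(cents, 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)
```

Under this reading, a lower value indicates centroids that are less similar to one another, that is, better separated clusters.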

Experimental Results [Five slides of result figures; their content is not preserved in the transcript.]

Conclusion & Future Work: FLOSCAN: We introduced a new flocking-based algorithm that can be used for clustering or as a preprocessing step in a data mining algorithm. We presented experimental results and a comparison to DBSCAN. Future work includes: further experiments with large datasets and other artificial-life algorithms such as the Ant algorithm; analyzing the scalability of FLOSCAN; using FLOSCAN as a classification algorithm; theoretical analysis of the initial parameters required by FLOSCAN, namely maximum distance, number of iterations, and speed value.

Questions? Thank you.