
FLOSCAN: An Artificial Life Based Data Mining Algorithm

1 FLOSCAN: An Artificial Life Based Data Mining Algorithm
A. Bellaachia Computer Science Department School of Engineering and Applied Sciences George Washington University Washington, DC 20052 A. Bellaachia April 21, 2019

2 Outline
Introduction; Artificial Life; Flocking Behavior; Flocking Parameters; FLOSCAN; Experimental Results; Conclusion & Future Work

3 Artificial Life Behaviors

4 Biologically inspired

5 Flocking Behavior
Introduced by Reynolds in 1987 for computer graphics applications. Based on the flocking behavior of birds. Each simulated bird (boid) is defined by its direction, speed, and position, which are related to the positions and velocities of its neighbors. Simple rules applied to individual boids yield very interesting global behavior with no leader, no predefined shape, and no global constraint.

6 Flocking Behavior
Basic flocking rules:
Separation: steer to avoid crowding local flock-mates or colliding with neighboring boids.
Alignment: steer towards the average heading of local flock-mates.
Cohesion: steer to move toward the average position of local flock-mates.
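The three rules can be illustrated with a minimal 2-D sketch. The `Boid` class, the helper names, and the unweighted steering vectors are assumptions made for illustration; the slides do not give Reynolds' exact formulation.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec = Tuple[float, float]


@dataclass
class Boid:
    position: Vec
    velocity: Vec


def _mean(vectors: List[Vec]) -> Vec:
    # Component-wise average; the list must be non-empty.
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


def separation(boid: Boid, mates: List[Boid]) -> Vec:
    # Steer away from flock-mates: sum of offsets pointing from each mate to the boid.
    return (sum(boid.position[0] - m.position[0] for m in mates),
            sum(boid.position[1] - m.position[1] for m in mates))


def alignment(boid: Boid, mates: List[Boid]) -> Vec:
    # Steer toward the average heading (velocity) of the flock-mates.
    avg = _mean([m.velocity for m in mates])
    return (avg[0] - boid.velocity[0], avg[1] - boid.velocity[1])


def cohesion(boid: Boid, mates: List[Boid]) -> Vec:
    # Steer toward the average position of the flock-mates.
    avg = _mean([m.position for m in mates])
    return (avg[0] - boid.position[0], avg[1] - boid.position[1])
```

In a full simulation the three vectors would typically be weighted and summed to update each boid's velocity; those weights are tuning parameters and are not specified here.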

7 Flocking Behavior (Reynolds' Model)
Boid parameters:
Velocity: the combination of heading and speed.
Position
Angle: the boid's field of vision.
Minimum distance
Maximum distance

8 Flocking Model
[Figure: vector diagram in the x-y plane showing the boid velocity Vi, the neighbor velocity Vj (or the average velocity of the neighborhood boids), the alignment vector VAL, the attraction vector VAT, the new direction VND, and the new position.]

9 Attraction
[Equation not preserved in the transcript], where [symbol not preserved] is the maximum distance between two boids.

10 FLOSCAN: Flocking Model
Boid movement:
Velocity = current position (Vi is both the velocity and the current position of boid i).
Alignment: VAL = (Vi + Vj)/2, the average of the velocity of boid i and the velocity of its neighboring boid(s), Vj.
New direction: VND = KND * (VAL + VAT).
No separation rule is used.
[Figure: x-y vector diagram showing Vi, Vj, VAL, VAT, VND, and the new position.]

11 FLOSCAN: Flocking Model
Similar to the previous model except for VAL:
Velocity = current position.
Alignment ignores Vi: VAL = KAL * Vj.
New direction: VND = KND * (VAL + VAT).
No separation rule is used.
[Figure: x-y vector diagram showing Vi, Vj, VAL, VAT, VND, and the new position.]
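The two alignment variants, VAL = (Vi + Vj)/2 and VAL = KAL * Vj, and the shared new-direction rule VND = KND * (VAL + VAT) can be sketched as follows. The function names are hypothetical, and the attraction vector VAT is passed in as an argument because its formula is not preserved in the transcript.

```python
from typing import List

Vector = List[float]


def align_average(v_i: Vector, v_j: Vector) -> Vector:
    # First model: VAL is the average of the boid's own velocity and the neighbor's.
    return [(a + b) / 2 for a, b in zip(v_i, v_j)]


def align_neighbor_only(v_j: Vector, k_al: float) -> Vector:
    # Second model: Vi is ignored and VAL is the neighbor velocity scaled by KAL.
    return [k_al * b for b in v_j]


def new_direction(v_al: Vector, v_at: Vector, k_nd: float) -> Vector:
    # Shared by both models: VND = KND * (VAL + VAT).
    return [k_nd * (a + b) for a, b in zip(v_al, v_at)]


v_i, v_j, v_at = [1.0, 0.0], [0.0, 1.0], [0.5, -0.5]  # illustrative values only
direction_model_1 = new_direction(align_average(v_i, v_j), v_at, k_nd=2.0)
direction_model_2 = new_direction(align_neighbor_only(v_j, k_al=0.5), v_at, k_nd=2.0)
```

Because Vj may also be the average velocity of the neighborhood boids, either function can be called with that average in place of a single neighbor's velocity.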

12 FLOSCAN: Attraction
Alignment: [equation not preserved in the transcript], where K1 is a constant. Note that Vj can also be the average velocity of the neighborhood boids.
Attraction: [equation not preserved in the transcript], where [symbol not preserved] is the maximum distance between two boids.

13 FLOSCAN: New Direction & Position
New direction: VND = KND * (VAL + VAT), where KND is a constant.
New position: first get the speed vector: [equation and its accompanying definition not preserved in the transcript].

14 FLOSCAN: New Position
The new position vector of boid i is calculated as follows: [equation not preserved in the transcript].

15 FLOSCAN: Example

16 FLOSCAN Objectives
FLOSCAN is a density-based algorithm.
Objectives:
FLOSCAN as a pre-clustering step to a clustering algorithm: data points that share common features move closer to each other. This sharpens the shape of potential clusters and therefore improves the efficiency of a clustering algorithm.
FLOSCAN can also be used as either a clustering algorithm or a classification algorithm: it is capable of discovering clusters of different shapes and detecting noise points.

17 FLOSCAN Pseudo Code
Initialize the above parameters.
Calculate the distance between each pair of documents.
For each iteration do
    For each document di do
        Find the neighbors, N, of di using the minimum and maximum distances.
        For each neighbor dj in N do
            Calculate the align vector and attract vector of di using dj.
            Calculate the new direction vector for di.
        End do
        Calculate the new direction vector of di.
    End do
End do
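One iteration of the loop above might look like the following sketch. Several pieces are assumptions, because the corresponding equations are not preserved in the transcript: the attraction vector is taken to be the offset from di to the neighborhood average, the new position is taken to be a step of fixed length `speed` along the new direction, and the function and parameter names are hypothetical. The alignment step uses the VAL = KAL * Vj variant with Vj as the neighborhood average.

```python
import math
from typing import List

Vector = List[float]


def distance(a: Vector, b: Vector) -> float:
    # Euclidean distance between two document vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def floscan_step(points: List[Vector], d_min: float, d_max: float,
                 k_al: float, k_nd: float, speed: float) -> List[Vector]:
    """Return the positions of all points after one flocking iteration."""
    new_points = []
    for i, p_i in enumerate(points):
        # Neighbors are the points whose distance to p_i lies in [d_min, d_max].
        neighbors = [p_j for j, p_j in enumerate(points)
                     if j != i and d_min <= distance(p_i, p_j) <= d_max]
        if not neighbors:
            new_points.append(list(p_i))  # isolated points do not move
            continue
        # Velocity = current position, so Vj is the average neighbor position.
        v_j = [sum(col) / len(neighbors) for col in zip(*neighbors)]
        v_al = [k_al * x for x in v_j]
        # ASSUMPTION: attraction points from p_i toward the neighborhood average.
        v_at = [c - x for c, x in zip(v_j, p_i)]
        v_nd = [k_nd * (a + b) for a, b in zip(v_al, v_at)]
        norm = math.sqrt(sum(x * x for x in v_nd))
        if norm == 0.0:
            new_points.append(list(p_i))
            continue
        # ASSUMPTION: move a fixed distance `speed` along the new direction.
        new_points.append([x + speed * d / norm for x, d in zip(p_i, v_nd)])
    return new_points
```

Calling `floscan_step` repeatedly for the chosen number of iterations gives the outer loop; points with no neighbors stay where they are and could later be treated as noise candidates.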

18 Experimental Results
To measure the performance of FLOSCAN, we use the LDC TDT Corpus. We have randomly chosen about 1,200 stories: about half collected from Reuters newswire and half from CNN broadcast news transcripts. Twenty-five topics were defined in the original release.

19 Document Representation
Use the vector model to represent documents in the collection.
Remove stopwords.
Assign a weight to each term in each document using the augmented weight: L(tji) = [constants not preserved in the transcript] * (tf(tji)/tf(max)), where tf(max) = max{ tf(t1i), tf(t2i), ... , tf(tmi) } and m is the number of terms in the collection.
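As an illustration of this weighting, the sketch below assumes the commonly used augmented term-frequency form 0.5 + 0.5 * tf/tf(max). The constants 0.5 are an assumption, since they are not legible in the slide, and the function name and the stopword argument are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable


def augmented_weights(tokens: Iterable[str],
                      stopwords: FrozenSet[str] = frozenset()) -> Dict[str, float]:
    # Count term frequencies after removing stopwords.
    counts = Counter(t for t in tokens if t not in stopwords)
    if not counts:
        return {}
    tf_max = max(counts.values())
    # ASSUMPTION: standard augmented weight, 0.5 + 0.5 * tf / tf_max.
    return {term: 0.5 + 0.5 * tf / tf_max for term, tf in counts.items()}


weights = augmented_weights(["the", "flock", "flock", "bird"],
                            stopwords=frozenset({"the"}))
```

With this form, the most frequent term in a document always receives weight 1.0 and every other retained term falls between 0.5 and 1.0.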

20 Evaluations
F-Measure: a combination of IR precision and recall.
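The slide does not preserve the exact formula used. A minimal sketch, assuming the standard weighted harmonic mean of precision and recall (with `beta=1` giving the usual F1 score), is:

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    # ASSUMPTION: standard F-measure; beta > 1 favors recall, beta < 1 favors precision.
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was retrieved correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

For example, `f_measure(0.5, 0.5)` evaluates to 0.5, and the score drops toward the smaller of the two inputs when precision and recall diverge.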

21 Evaluations
Centroid Similarity (CS): computes the similarity between the centroids of all clusters. Given a set of k clusters, CS is defined as follows: [equation not preserved in the transcript].
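Since the defining equation is missing from the transcript, the following is only one plausible reading: the mean pairwise cosine similarity between cluster centroids. The choice of cosine similarity, the averaging, and all function names are assumptions and may differ from the definition on the original slide.

```python
import math
from itertools import combinations
from typing import List

Vector = List[float]


def centroid(cluster: List[Vector]) -> Vector:
    # Component-wise mean of the vectors in a non-empty cluster.
    n = len(cluster)
    return [sum(col) / n for col in zip(*cluster)]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid_similarity(clusters: List[List[Vector]]) -> float:
    # ASSUMPTION: CS is the mean cosine similarity over all centroid pairs.
    pairs = list(combinations([centroid(c) for c in clusters], 2))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)
```

Under this reading, a lower value indicates better-separated clusters, because their centroids point in more dissimilar directions.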

22-26 Experimental Results
[Figures: the result charts on these five slides are not preserved in the transcript.]

27 Conclusion & Future Work
FLOSCAN: we introduce a new flocking-based algorithm that can be used for clustering or as a preprocessing step in a data mining algorithm. We present experimental results and a comparison to DBSCAN.
Future work includes:
Other experiments with large datasets and other artificial-life algorithms such as the Ant algorithm.
Analyzing the scalability of FLOSCAN.
Using FLOSCAN as a classification algorithm.
Theoretical analysis of the initial parameters required by FLOSCAN, namely the maximum distance, the number of iterations, and the speed value.

28 Questions? Thank you.

