Chaotic Mining: Knowledge Discovery Using the Fractal Dimension
Daniel Barbara, George Mason University, Information and Software Engineering Department
By Dhruva Gopal
Fractals
- What are fractals?
- Property of a fractal: self-similarity
Uses of Fractals
- Geologic activity
- Planetary orbits
- Weather
- Fluid flow
- Databases
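Fractal Dimensions
- Number of possible dimensions?
- Fractal dimension computation: $D_q = \frac{1}{q-1} \cdot \frac{\log \sum_i p_i^q}{\log r}$, where $p_i$ is the fraction of points that fall in the $i$-th cell of a grid with cell size $r$
- Hausdorff dimension ($q = 0$)
- Information dimension (limit $q \to 1$)
- Correlation dimension ($q = 2$)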
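The formula for D_q can be tried out numerically with a box-counting sketch. The version below is a minimal illustration, not the paper's implementation: the function names, the choice of grid sizes, and the synthetic point sets are all assumptions. It estimates D_q as the least-squares slope of log(sum_i p_i^q) / (q - 1) against log r, which is a common practical reading of the ratio in the formula, and it handles q = 1 through the limiting form sum_i p_i log p_i.

```python
import math
from collections import Counter

def cell_probabilities(points, r):
    """Fraction p_i of points in each occupied grid cell of side r."""
    counts = Counter(tuple(math.floor(c / r) for c in p) for p in points)
    return [n / len(points) for n in counts.values()]

def scaled_log_moment(points, r, q):
    """log(sum_i p_i^q) / (q - 1); for q = 1, the limit sum_i p_i log p_i."""
    probs = cell_probabilities(points, r)
    if q == 1:
        return sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** q for p in probs)) / (q - 1)

def generalized_dimension(points, radii, q):
    """Least-squares slope of the scaled log moment against log r."""
    xs = [math.log(r) for r in radii]
    ys = [scaled_log_moment(points, r, q) for r in radii]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)

# Synthetic point sets (assumed data): a line segment and a filled unit square.
radii = [1 / 2, 1 / 4, 1 / 8, 1 / 16]
line = [((i + 0.5) / 256, 0.0) for i in range(256)]
square = [((i + 0.5) / 32, (j + 0.5) / 32) for i in range(32) for j in range(32)]

line_d2 = generalized_dimension(line, radii, q=2)
square_dims = [generalized_dimension(square, radii, q) for q in (0, 1, 2)]
```

On these two sets the estimate comes out close to 1 for the line segment and close to 2 for the filled square, for q = 0, 1, and 2 alike, because both sets are uniform. On real data the three values generally differ.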
Examples
- Event anomalies in time series
- Self-similarity in association rules
- Analyzing patterns in datacubes
- Incremental clustering
Event Anomalies
- Time series
  - Stock price changes
  - TCP connection occurrence
- Example
  - Half-open TCP connections
  - Network spoofing
Methodology
- Half-open connections are self-similar
- Collect data points at a fixed sampling interval (in seconds)
- Use a moving window spanning k sampling intervals (k is an integer)
- The fractal dimension will show a drastic decrease in case of spoofing
- Other applications of fractals with time series
  - Password port in FTP service
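The moving-window idea can be sketched as follows. This is a hedged illustration under several assumptions that are not in the slides: samples are already scaled to [0, 1), the dimension of a window is estimated by 1-D box counting over the sample values, a "drastic decrease" means falling below a fixed fraction of a baseline taken from the first few windows, and the series itself is synthetic. The names `box_count_dimension`, `windowed_dimensions`, and `drastic_drops` are hypothetical.

```python
import math

def box_count_dimension(values, radii):
    """Slope of log N(r) against log(1/r), N(r) = number of occupied boxes."""
    xs = [math.log(1 / r) for r in radii]
    ys = [math.log(len({math.floor(v / r) for v in values})) for r in radii]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)

def windowed_dimensions(series, k, radii):
    """Dimension of every window of k consecutive samples, sliding by one."""
    return [box_count_dimension(series[s:s + k], radii)
            for s in range(len(series) - k + 1)]

def drastic_drops(dims, baseline_windows, ratio):
    """Indices of windows whose dimension is below ratio * baseline mean."""
    baseline = sum(dims[:baseline_windows]) / baseline_windows
    return [i for i, d in enumerate(dims) if d < ratio * baseline]

# Synthetic series (assumed data): 128 well-spread samples, then 128
# constant samples standing in for the anomalous regime.
series = [(i * 37 % 101) / 101 for i in range(128)] + [0.5] * 128
radii = [1 / 2, 1 / 4, 1 / 8]

dims = windowed_dimensions(series, k=64, radii=radii)
alerts = drastic_drops(dims, baseline_windows=10, ratio=0.5)
```

Windows that lie entirely in the first regime keep the baseline dimension, and the alerts begin only once the window has moved far enough into the constant regime. The threshold ratio and the baseline length are tuning choices that would need to be set per data source.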
Self-Similarity in Association Rules
- Parameters associated with a rule
  - Support
  - Confidence
- Distribution of these transactions?
  - Seasonal
  - Promotional
  - Regular
Fractals in Association Rules
- Compute the fractal dimension of a k-itemset while computing its support
- Keep the fractal dimension information for use when computing the (k+1)-itemsets
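One way to picture computing support and fractal dimension in the same pass is sketched below. It assumes, beyond what the slides state, that transactions are ordered in time, that the dimension of an itemset is estimated by box counting over the positions of the transactions that contain it, and that an absent itemset gets dimension 0. The function name and the toy transactions are hypothetical, and the reuse of this information for (k+1)-itemsets is not covered.

```python
import math

def support_and_dimension(transactions, itemset, box_sizes):
    """Return (support, box-counting dimension of the occurrence positions)."""
    positions = [t for t, items in enumerate(transactions) if itemset <= items]
    support = len(positions) / len(transactions)
    if not positions:
        return support, 0.0  # assumed convention for an absent itemset
    xs = [math.log(1 / s) for s in box_sizes]
    ys = [math.log(len({t // s for t in positions})) for s in box_sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return support, num / sum((x - mx) ** 2 for x in xs)

# Toy data (assumed): bread and milk sell in every transaction, while
# candy sells only during a short promotion covering the first 8.
transactions = [{"bread", "milk"} | ({"candy"} if t < 8 else set())
                for t in range(64)]
box_sizes = [1, 2, 4, 8, 16, 32]

regular = support_and_dimension(transactions, {"bread", "milk"}, box_sizes)
promotional = support_and_dimension(transactions, {"candy"}, box_sizes)
```

In this toy setting the regularly occurring itemset has a dimension close to 1, while the promotional itemset, whose occurrences are bunched into one short burst, has a clearly lower dimension in addition to its lower support.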
Analyzing Patterns in Datacubes
- Patterns
  - Null cells (no aggregate)
- Compute the fractal dimension of the null cells
- Drastic changes imply anomalous trends
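As a rough illustration of the null-cell idea, the sketch below box-counts the positions of empty cells in a two-dimensional slice of a cube and compares two snapshots. The representation (nested lists with `None` for a null cell), the function name, the zero-dimension convention when there are no nulls, and both snapshots are assumptions made for the example.

```python
import math

def null_cell_dimension(cube, box_sizes):
    """Box-counting dimension of the (row, column) positions of null cells."""
    nulls = [(i, j) for i, row in enumerate(cube)
             for j, value in enumerate(row) if value is None]
    if not nulls:
        return 0.0  # assumed convention when no cell is null
    xs = [math.log(1 / s) for s in box_sizes]
    ys = [math.log(len({(i // s, j // s) for i, j in nulls}))
          for s in box_sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)

# Two assumed 16 x 16 snapshots: nulls only on the diagonal, then nulls everywhere.
before = [[None if i == j else 1.0 for j in range(16)] for i in range(16)]
after = [[None for _ in range(16)] for _ in range(16)]
box_sizes = [1, 2, 4, 8]

change = null_cell_dimension(after, box_sizes) - null_cell_dimension(before, box_sizes)
```

Here the dimension moves from about 1 (nulls along a line) to about 2 (nulls filling the plane), which is the kind of drastic change that would be worth inspecting.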
Incremental Clustering
- Clustering algorithms are needed to deal with large datasets
- Extended K-means algorithm
- Use a variation of the extended K-means algorithm that uses fractal dimensions to decide point membership
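The slides do not spell out the membership rule. One plausible reading, sketched below as an assumption, is to place each incoming point in the cluster whose fractal dimension changes least when the point is added. The box sizes, the integer-grid point representation, and the names `box_dimension` and `assign` are hypothetical, and the K-means part of the method is left out.

```python
import math

BOX_SIZES = [1, 2, 4, 8]  # assumed grid scales for integer coordinates

def box_dimension(points):
    """Box-counting dimension of a set of integer (x, y) points."""
    xs = [math.log(1 / s) for s in BOX_SIZES]
    ys = [math.log(len({(x // s, y // s) for x, y in points}))
          for s in BOX_SIZES]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)

def assign(point, clusters):
    """Add point to the cluster whose dimension it perturbs least."""
    def impact(members):
        return abs(box_dimension(members | {point}) - box_dimension(members))
    best = min(clusters, key=lambda name: impact(clusters[name]))
    clusters[best].add(point)
    return best

# Assumed starting clusters: a horizontal line and a square with one gap.
clusters = {
    "line": {(x, 0) for x in range(32)},
    "square": {(x, y) for x in range(64, 72) for y in range(64, 72)} - {(66, 66)},
}

first = assign((32, 0), clusters)     # extends the line
second = assign((66, 66), clusters)   # fills the gap in the square
```

Because only the affected cluster's box counts need updating, a rule of this kind can process points one at a time, which is what makes it attractive for large datasets. With very small clusters the dimension estimate is noisy, so this toy example should not be taken as evidence of how the rule behaves in general.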
Conclusions
- The fractal dimension is a powerful parameter for uncovering anomalous patterns in databases
- The paper discusses techniques that can be used, but none are implemented
References
- R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, A.I. Verkamo, Fast Discovery of Association Rules
- John Sarraille and P. DiFalco, FD3,