Graph-Based Anomaly Detection Eiman Alshammari
Problem Definition Why and What … ??
Anomaly detection has received much attention in recent years, but little work has focused on anomaly detection in graph-based data. This project introduces a new technique for graph-based anomaly detection. A clustering technique is then applied to determine the likelihood of successful anomaly detection within graph-based data. Experimental results are provided using artificially created data.
Represent Web as Graph: Nodes represent web pages. Edges represent hyperlinks.
[Figure: example web graph; labels include page, university, texas, learning, group, projects, subdue, robotics, parallel, hyperlink, work, word, planning.]
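The web-as-graph idea above can be sketched in a few lines. This is a minimal illustration, not the project's tooling: the function name `build_web_graph` and the page names are hypothetical, and treating hyperlinks as undirected is an assumption made here for simplicity.

```python
from collections import defaultdict

def build_web_graph(hyperlinks):
    """Return an adjacency list {page: set of linked pages}."""
    graph = defaultdict(set)
    for source, target in hyperlinks:
        graph[source].add(target)
        graph[target].add(source)  # ASSUMPTION: hyperlinks treated as undirected
    return dict(graph)

# Hypothetical pages and hyperlinks, for illustration only.
links = [("home", "projects"), ("home", "people"), ("projects", "robotics")]
web = build_web_graph(links)
print(sorted(web["home"]))
```

Each page becomes a node and each hyperlink becomes an edge; a directed variant would simply drop the second `add` call.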
Data to Graph -> Graph to Subgraphs -> Subgraphs Similarities -> Clustering
1. Data to Graph. Many tools exist for converting data to graphs. These tools will be used at a later stage of the research.
2. Graph to Subgraph. Basic elements of a graph: graph, subgraph, vertex, edge.
[Figure: example graph with vertices numbered 1 to 5.]
Given Graph G
Step 1
[Figure: subgraph S1 of G (nodes A to M) with its adjacency matrix, rows and columns labeled A to M.]
Step 2 (repeated for each link)
[Figure: subgraph S2 (nodes A to L) with its adjacency matrix, rows and columns labeled A to L.]
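The per-link repetition above can be sketched as a loop that builds one subgraph for every link. The slides do not state the extraction rule in text, so the rule used here is purely an assumption: the subgraph of a link is taken to be the one induced by its two endpoints and their direct neighbours. The function name `link_subgraph` and the small example graph are hypothetical.

```python
def link_subgraph(graph, u, v):
    """Return (nodes, edges) of the subgraph around link (u, v).

    ASSUMPTION: the subgraph is induced by u, v and their direct
    neighbours; the original extraction rule may differ.
    """
    nodes = {u, v} | graph[u] | graph[v]
    edges = {
        tuple(sorted((a, b)))
        for a in nodes
        for b in graph[a]
        if b in nodes
    }
    return sorted(nodes), sorted(edges)

# Hypothetical graph G as an adjacency list.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B", "E"},
    "E": {"D"},
}

# One subgraph per link, mirroring "repeated for each link".
all_links = sorted({tuple(sorted((a, b))) for a in graph for b in graph[a]})
subgraphs = {link: link_subgraph(graph, *link) for link in all_links}
```

Swapping in a different extraction rule only requires changing how `nodes` is computed; the loop over links stays the same.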
3. Subgraphs Similarities: Adjacency Matrices
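As a minimal sketch of the adjacency-matrix representation named above, the following turns a subgraph's node list and edge list into a 0/1 matrix. The function name `adjacency_matrix` and the example subgraph are hypothetical, and the graph is assumed to be undirected, so the matrix is symmetric.

```python
def adjacency_matrix(nodes, edges):
    """Return a symmetric 0/1 matrix; row/column order follows `nodes`."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # ASSUMPTION: undirected links
    return matrix

# Hypothetical subgraph: A is linked to B and to C.
star = adjacency_matrix(["A", "B", "C"], [("A", "B"), ("A", "C")])
for row in star:
    print(row)
```

The matrix depends on the chosen node order, which is why comparing two subgraphs needs a measure that is not fooled by relabelling.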
Subgraphs Similarities: Similar matrices have the same eigenvalues. If they are exactly similar … isomorphism.
[Figure: matrices labeled W, S, L and X.]
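The eigenvalue claim above can be checked without a numerical library by comparing characteristic polynomials, since two matrices with the same characteristic polynomial have the same eigenvalues. The sketch below uses the Faddeev-LeVerrier recurrence, which is a technique chosen here for illustration and is not taken from the slides; the function names and example matrices are hypothetical.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Multiply two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(a):
    """Coefficients [c_n, ..., c_0] of det(xI - A), highest degree first."""
    n = len(a)
    coeffs = [Fraction(1)]
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + (previous coefficient) * I
        m = mat_mul(a, m)
        for i in range(n):
            m[i][i] += coeffs[-1]
        am = mat_mul(a, m)
        trace = sum(am[i][i] for i in range(n))
        coeffs.append(-trace / k)
    return coeffs

path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]       # centre is node 1
relabelled = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]  # same graph, centre is node 0
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]    # a different graph

print(char_poly(path) == char_poly(relabelled))
print(char_poly(path) == char_poly(triangle))
```

Relabelling the nodes leaves the polynomial unchanged, while the triangle yields a different one. Equal eigenvalues are a necessary condition for isomorphism but not a sufficient one, because non-isomorphic graphs can share a spectrum.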
Remember that a 1 in the matrix means an extra link or a missing link.
Find the minimum difference using XOR.
Similarity = 1 - (number of 1's in the composed (XOR) matrix) / (number of 1's in S1)
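The XOR-based similarity above can be sketched as follows. Several points are assumptions: both subgraphs are taken to have the same number of nodes, S1 is assumed to contain at least one link, the denominator is read as the number of 1's in S1, and "minimum difference" is interpreted as the minimum over all relabellings of S2. The function names are hypothetical.

```python
from itertools import permutations

def count_ones(matrix):
    """Number of 1 entries in a 0/1 matrix."""
    return sum(sum(row) for row in matrix)

def xor_ones(s1, s2):
    """Number of positions where the two matrices differ."""
    return sum(a ^ b for r1, r2 in zip(s1, s2) for a, b in zip(r1, r2))

def min_xor_difference(s1, s2):
    """Smallest XOR count over all node relabellings of s2.

    ASSUMPTION: brute force over permutations, so only for small subgraphs.
    """
    n = len(s1)
    best = None
    for p in permutations(range(n)):
        relabelled = [[s2[p[i]][p[j]] for j in range(n)] for i in range(n)]
        diff = xor_ones(s1, relabelled)
        if best is None or diff < best:
            best = diff
    return best

def similarity(s1, s2):
    """1 - (minimum XOR count) / (number of 1's in s1)."""
    return 1 - min_xor_difference(s1, s2) / count_ones(s1)

star = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]      # same shape, different labels
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # one extra link

print(similarity(star, path))
print(similarity(star, triangle))
```

Without the search over relabellings, the star and the path would differ in four positions even though they are the same graph. The brute-force search grows factorially with the number of nodes, so a practical version would need a cheaper alignment strategy.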
We define a similarity threshold. The threshold is application-dependent: its value is determined by the performance and safety requirements of the application in which the algorithm is embedded.
A link is anomalous if there exists no similarity between its subgraph and any other subgraph.
A link is not anomalous if there exists at least one subgraph that yields a similarity >= the assigned similarity.
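The decision rule above can be sketched as a simple filter. The function name `anomalous_links`, the link names and the similarity scores are all hypothetical, and "no similarity" is interpreted here as "no similarity at or above the assigned threshold".

```python
def anomalous_links(similarities, threshold):
    """Return links whose subgraph matches no other subgraph.

    `similarities` maps each link to {other_link: similarity score}.
    ASSUMPTION: a link is anomalous when no score reaches the threshold.
    """
    flagged = []
    for link, others in similarities.items():
        if not any(score >= threshold for score in others.values()):
            flagged.append(link)
    return flagged

# Hypothetical pairwise similarity scores between link subgraphs.
scores = {
    "e1": {"e2": 0.9, "e3": 0.2},
    "e2": {"e1": 0.9, "e3": 0.1},
    "e3": {"e1": 0.2, "e2": 0.1},
}
print(anomalous_links(scores, threshold=0.8))
```

With these made-up scores, only e3 lacks a sufficiently similar partner. Lowering the threshold flags fewer links, which is why its value has to be tuned per application.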
Algorithm: Something New… Something Borrowed
The algorithm
Algorithm & Complexity
Experimental Results: Did we solve the problem?
20 nodes, 37 edges
15 nodes, 21 edges
Future Direction: Experimental results will be provided using real-world network intrusion data.