Graph-Based Anomaly Detection


Graph-Based Anomaly Detection Eiman Alshammari

Problem Definition Why and What … ??

Anomaly detection has received much attention in recent years, but little work has focused on anomaly detection in graph-based data. This project introduces a new technique for graph-based anomaly detection. A clustering technique is then applied to determine the likelihood of successful anomaly detection within graph-based data. Experimental results are provided using artificially created data.

Represent Web as Graph
Nodes represent web pages; edges represent hyperlinks.
[Figure: example web graph with labels page, university, texas, learning, group, projects, subdue, robotics, parallel, hyperlink, work, word, planning]
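As a minimal sketch of this representation, the snippet below stores a tiny web graph as an adjacency map. The page names and links are hypothetical (borrowed loosely from the figure labels), and treating hyperlinks as undirected is an assumption made here for simplicity, not something the slides state.

```python
# Hypothetical pages and hyperlinks, chosen only for illustration.
hyperlinks = [
    ("university", "projects"),
    ("projects", "robotics"),
    ("projects", "subdue"),
    ("robotics", "planning"),
]


def build_graph(edges):
    """Return an undirected adjacency map: node -> set of neighbouring nodes."""
    graph = {}
    for src, dst in edges:
        # Assumption: a hyperlink is recorded in both directions.
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set()).add(src)
    return graph


web_graph = build_graph(hyperlinks)
```

Each key is a page (node) and each entry in its set is a page it shares a hyperlink (edge) with.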

Overview of the steps:
1. Data to Graph
2. Graph to Subgraphs
3. Subgraph Similarities
4. Clustering

1. Data to Graph
Many tools exist for converting data to graphs. These tools will be used at a later stage of the research.

2. Graph to Subgraph
This step introduces what a graph is and its basic elements: graph, subgraph, vertex, and edge.
[Figure: example graph with vertices 1 through 5]

Given Graph G

Step 1

[Figure: subgraph S1 over nodes A through M, shown with its adjacency matrix (rows and columns labeled A through M)]

Step 2: Repeated for each link.

[Figure: subgraph S2 over nodes A through L, shown with its adjacency matrix (rows and columns labeled A through L)]

3. Subgraph Similarities: Adjacency Matrices
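A minimal sketch of building an adjacency matrix for one subgraph follows. The node names and edges are hypothetical, and the graph is assumed to be undirected, so the matrix is symmetric.

```python
def adjacency_matrix(nodes, edges):
    """Build a 0/1 adjacency matrix; row and column order follows `nodes`."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        # Assumption: undirected links, so both symmetric entries are set.
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1
    return matrix


# Hypothetical four-node path A-B-C-D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "D")]
matrix = adjacency_matrix(nodes, edges)
```

Fixing the node order up front matters: two subgraphs can only be compared entry by entry if their rows and columns refer to the same ordering.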

Subgraph Similarities
Similar matrices have the same eigenvalues. If they are exactly similar … isomorphism.
[Figure labels: W S, W L, X]
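The eigenvalue claim can be checked with a short derivation. The symbols $A$, $B$, and $P$ are introduced here for illustration only: $A$ and $B$ stand for two adjacency matrices and $P$ for an invertible matrix relating them.

```latex
B = P^{-1} A P
\;\Longrightarrow\;
\det(B - \lambda I)
  = \det\!\bigl(P^{-1}(A - \lambda I)P\bigr)
  = \det(P^{-1})\,\det(A - \lambda I)\,\det(P)
  = \det(A - \lambda I)
```

Since the characteristic polynomials coincide, so do the eigenvalues. For isomorphic graphs, $P$ is a permutation matrix that relabels the nodes. The converse does not hold in general: non-isomorphic graphs can share the same spectrum, so equal eigenvalues are a necessary but not a sufficient check.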

Remember that a 1 in the matrix means an extra link or a missing link.

Find the minimum difference using the XOR.

$$\text{Similarity} = 1 - \frac{\text{number of 1's in the XOR result}}{\text{number of 1's in } S_1}$$
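The similarity measure, one minus the number of 1's in the XOR of the two adjacency matrices divided by the number of 1's in S1, can be sketched as below. The function name `xor_similarity` and the sample matrices are hypothetical. The sketch also assumes both matrices already use the same node ordering; searching over orderings for the minimum difference is not shown.

```python
def xor_similarity(s1, s2):
    """Return 1 - (ones in s1 XOR s2) / (ones in s1) for 0/1 matrices."""
    if len(s1) != len(s2) or any(len(a) != len(b) for a, b in zip(s1, s2)):
        raise ValueError("matrices must have the same shape")
    ones_in_s1 = sum(sum(row) for row in s1)
    if ones_in_s1 == 0:
        raise ValueError("s1 has no links, so the similarity is undefined")
    # Each 1 in the XOR marks an extra link or a missing link.
    differing = sum(a ^ b for r1, r2 in zip(s1, s2) for a, b in zip(r1, r2))
    return 1 - differing / ones_in_s1


# Hypothetical subgraphs on nodes A, B, C.
s1 = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]  # links A-B and A-C
s2 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]  # link A-B only
similarity = xor_similarity(s1, s2)
```

Here the XOR has two 1's (the missing A-C link, counted once per symmetric entry) and S1 has four, giving a similarity of 0.5. Note that under this formula the value can drop below zero when the second subgraph has many extra links.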

We define a similarity threshold. The threshold is application-dependent: its value is determined by the performance and safety requirements of the application in which the algorithm is embedded.

A link is anomalous if there exists no similarity between its subgraph and any other subgraph.
A link is not anomalous if there exists at least one subgraph that yields a similarity >= the assigned similarity threshold.
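The decision rule can be sketched as a small predicate. The function name `is_anomalous` and the sample scores are hypothetical, and the sketch reads "no similarity" as "no other subgraph reaches the threshold", which is an interpretation rather than something the slides spell out.

```python
def is_anomalous(similarities, threshold):
    """Decide whether a link is anomalous.

    `similarities` holds the similarity between the link's subgraph and
    each other subgraph; `threshold` is the application-dependent cutoff.
    """
    # Not anomalous as soon as one other subgraph is similar enough.
    return not any(score >= threshold for score in similarities)


# Hypothetical scores for two links, with an assumed threshold of 0.5.
link_one = is_anomalous([0.2, 0.4], threshold=0.5)
link_two = is_anomalous([0.2, 0.5], threshold=0.5)
```

With these values, the first link is flagged because none of its scores reaches the threshold, while the second is not because one score equals it.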

Algorithm: Something New… Something Borrowed

The algorithm

Algorithm & Complexity

Experimental Results: Did we solve the problem?

20 nodes 37 edges

15 nodes – 21 edges

Future Direction
Experimental results will be provided using real-world network intrusion data.