Anomaly Detection in Communication Networks Brian Thompson James Abello.

Slides:



Advertisements
Similar presentations
Dynamic Graph Algorithms - I
Advertisements

Walter Willinger AT&T Research Labs Reza Rejaie, Mojtaba Torkjazi, Masoud Valafar University of Oregon Mauro Maggioni Duke University HotMetrics’09, Seattle.
Generated Waypoint Efficiency: The efficiency considered here is defined as follows: As can be seen from the graph, for the obstruction radius values (200,
1 Social Influence Analysis in Large-scale Networks Jie Tang 1, Jimeng Sun 2, Chi Wang 1, and Zi Yang 1 1 Dept. of Computer Science and Technology Tsinghua.
Probabilistic Aggregation in Distributed Networks Ling Huang, Ben Zhao, Anthony Joseph and John Kubiatowicz {hling, ravenben, adj,
Wide-scale Botnet Detection and Characterization Anestis Karasaridis, Brian Rexroad, David Hoeflin.
1 BotGraph: Large Scale Spamming Botnet Detection Yao Zhao EECS Department Northwestern University.
Segmentation Graph-Theoretic Clustering.
Streaming Models and Algorithms for Communication and Information Networks Brian Thompson (joint work with James Abello)
Presented by Ozgur D. Sahin. Outline Introduction Neighborhood Functions ANF Algorithm Modifications Experimental Results Data Mining using ANF Conclusions.
The max-divergence of E’ is: Intuitively, p-divergence of d means that the probability of at least X E’,p edges occurring p-recently is 1/d A (maximal)
Managing Agent Platforms with the Simple Network Management Protocol Brian Remick Thesis Defense June 26, 2015.
Application of Graph Theory to OO Software Engineering Alexander Chatzigeorgiou, Nikolaos Tsantalis, George Stephanides Department of Applied Informatics.
BotGraph: Large Scale Spamming Botnet Detection Yao Zhao Yinglian Xie *, Fang Yu *, Qifa Ke *, Yuan Yu *, Yan Chen and Eliot Gillum ‡ EECS Department,
1 An Information Theoretic Approach to Network Trace Compression Y. Liu, D. Towsley, J. Weng and D. Goeckel.
Algorithm: For all e E t, define X e = {w e if e G t, 1 - w e otherwise}. Measure likelihood of substructure S by. Flag S as anomalous if, where is an.
A Framework For Community Identification in Dynamic Social Networks Chayant Tantipathananandh Tanya Berger-Wolf David Kempe Presented by Victor Lee.
Energy-efficient Self-adapting Online Linear Forecasting for Wireless Sensor Network Applications Jai-Jin Lim and Kang G. Shin Real-Time Computing Laboratory,
Stream Clustering CSE 902. Big Data Stream analysis Stream: Continuous flow of data Challenges ◦Volume: Not possible to store all the data ◦One-time.
Models of Influence in Online Social Networks
On Anomalous Hot Spot Discovery in Graph Streams
Detecting Node encounters through WiFi By: Karim Keramat Jahromi Supervisor: Prof Adriano Moreira Co-Supervisor: Prof Filipe Meneses Oct 2013.
Neighbourhood Sampling for Local Properties on a Graph Stream A. Pavan, Iowa State University Kanat Tangwongsan, IBM Research Srikanta Tirthapura, Iowa.
What is FORENSICS? Why do we need Network Forensics?
Suggesting Friends using the Implicit Social Graph Maayan Roth et al. (Google, Inc., Israel R&D Center) KDD’10 Hyewon Lim 1 Oct 2014.
Parallel Programming Models Jihad El-Sana These slides are based on the book: Introduction to Parallel Computing, Blaise Barney, Lawrence Livermore National.
Network Aware Resource Allocation in Distributed Clouds.
Demo. Overview Overall the project has two main goals: 1) Develop a method to use sensor data to determine behavior probability. 2) Use the behavior probability.
Scalable and Efficient Data Streaming Algorithms for Detecting Common Content in Internet Traffic Minho Sung Networking & Telecommunications Group College.
Scalable Analysis of Distributed Workflow Traces Daniel K. Gunter and Brian Tierney Distributed Systems Department Lawrence Berkeley National Laboratory.
Efficient Data Mining for Calling Path Patterns in GSM Networks Information Systems, accepted 5 December 2002 SPEAKER: YAO-TE WANG ( 王耀德 )
Influence Maximization in Dynamic Social Networks Honglei Zhuang, Yihan Sun, Jie Tang, Jialin Zhang, Xiaoming Sun.
© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. LogKV: Exploiting Key-Value.
Web Caching and Content Distribution: A View From the Interior Syam Gadde Jeff Chase Duke University Michael Rabinovich AT&T Labs - Research.
Towards Low Overhead Provenance Tracking in Near Real-Time Stream Filtering Nithya N. Vijayakumar, Beth Plale DDE Lab, Indiana University {nvijayak,
Selfishness, Altruism and Message Spreading in Mobile Social Networks September 2012 In-Seok Kang
Wide-scale Botnet Detection and Characterization Anestis Karasaridis, Brian Rexroad, David Hoeflin In First Workshop on Hot Topics in Understanding Botnets,
Intel Confidential – Internal Only Co-clustering of biological networks and gene expression data Hanisch et al. This paper appears in: bioinformatics 2002.
Spamming Botnets: Signatures and Characteristics Yinglian Xie, Fang Yu, Kannan Achan, Rina Panigrahy, Geoff Hulten, and Ivan Osipkov. SIGCOMM, Presented.
Group 8: Denial Hess, Yun Zhang Project presentation.
KAIS T On the problem of placing Mobility Anchor Points in Wireless Mesh Networks Lei Wu & Bjorn Lanfeldt, Wireless Mesh Community Networks Workshop, 2006.
LogTree: A Framework for Generating System Events from Raw Textual Logs Liang Tang and Tao Li School of Computing and Information Sciences Florida International.
Tracking Malicious Regions of the IP Address Space Dynamically.
Models and Algorithms for Event-Driven Networks PhD Defense Brian Thompson Committee: Muthu Muthukrishnan (advisor), Danfeng Yao (Virginia Tech), Rebecca.
Data Structures and Algorithms in Parallel Computing Lecture 7.
Client Assignment in Content Dissemination Networks for Dynamic Data Shetal Shah Krithi Ramamritham Indian Institute of Technology Bombay Chinya Ravishankar.
Tokyo Research Laboratory © Copyright IBM Corporation 2003KDD 04 | 2004/08/24 | Eigenspace-based anomaly detection in computer systems IBM Research, Tokyo.
Association Mining via Co-clustering of Sparse Matrices Brian Thompson *, Linda Ness †, David Shallcross †, Devasis Bassu † *†
Speaker : Yu-Hui Chen Authors : Dinuka A. Soysa, Denis Guangyin Chen, Oscar C. Au, and Amine Bermak From : 2013 IEEE Symposium on Computational Intelligence.
1 Travel Times from Mobile Sensors Ram Rajagopal, Raffi Sevlian and Pravin Varaiya University of California, Berkeley Singapore Road Traffic Control TexPoint.
Bo Zong, Yinghui Wu, Ambuj K. Singh, Xifeng Yan 1 Inferring the Underlying Structure of Information Cascades
1 Traffic Engineering By Kavitha Ganapa. 2 Introduction Traffic engineering is concerned with the issue of performance evaluation and optimization of.
FELICIAN UNIVERSITY Creating a Learning Community Using Knowledge Management and Social Media Dr. John Zanetich, Associate Professor Felician University.
GRAPH AND LINK MINING 1. Graphs - Basics 2 Undirected Graphs Undirected Graph: The edges are undirected pairs – they can be traversed in any direction.
 DM-Group Meeting Liangzhe Chen, Oct Papers to be present  RSC: Mining and Modeling Temporal Activity in Social Media  KDD’15  A. F. Costa,
Cohesive Subgraph Computation over Large Graphs
Gedas Adomavicius Jesse Bockstedt
Workshop on Data Mining in Networks ICDM 2015
What Are Routers? Routers are an intermediate system at the network layer that is used to connect networks together based on a common network layer protocol.
NetMine: Mining Tools for Large Graphs
Kijung Shin1 Mohammad Hammoud1
Introduction Secondary Users (SUs) Primary Users (PUs)
William Norris Professor and Head, Department of Computer Science
Chao Zhang1, Yu Zheng2, Xiuli Ma3, Jiawei Han1
11/17/2018 9:32 PM © Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN.
William Norris Professor and Head, Department of Computer Science
SEG5010 Presentation Zhou Lanjun.
Resource Allocation for Distributed Streaming Applications
Discovering Important Nodes through Graph Entropy
GhostLink: Latent Network Inference for Influence-aware Recommendation
Presentation transcript:

Anomaly Detection in Communication Networks Brian Thompson James Abello

Outline Introduction and Motivation Model and Approach The MCD Algorithm Experimental Results Conclusions and Future Work

Outline Introduction and Motivation Model and Approach The MCD Algorithm Experimental Results Conclusions and Future Work

Communication Networks People like to talk phone calls Twitter IP traffic May be stored in communication logs: SenderReceiverTimestamp AliceBobNov. 16, :20pm AliceChrisNov. 17, :45am BobDaveNov. 17, :00pm

Communication Networks Commonly visualized as graphs, where nodes are people and edges signify communication Edge weights may indicate the quantity or rate of communication Graph analysis tools may then be applied to gain insights into network structure or identify outliers most connected subgraph! highest degree vertex! highest weight edge! Alice Bob Chris Dave Eve

Motivation Communication networks are BIG and highly volatile Traditional techniques are ineffective for analyzing network dynamics, dealing with lots of streaming data We address questions of temporal and structural nature Traditional QuestionOur Question Which nodes have the highest degree? Which nodes have seen a recent change in connectivity patterns? Which subgraphs are the most well- connected? Which subgraphs have shown a sudden increase in activity?

Applications Monitor suspected malicious individuals or groups Detect the spread of viruses on a local network Identify blog posts that have achieved sudden popularity among a small subset of users, that might otherwise fall below the radar Flag suspicious or calling patterns without examining the content or recording conversations

Related Work  Approach 1: segment time into blocks, construct summary graph, apply static graph metrics  Metric forensics: a multi-level approach for mining volatile graphs. Henderson, et. al. KDD  Anomaly Detection Using Scan Statistics on Time Series Hypergraphs. Park, et. al. SDM  GraphScope: Parameter-free Mining of Large Time- evolving Graphs. Sun, et. al. KDD  Approach 2: time series analysis  Monitoring Time-Varying Network Streams Using State-Space Models. Cao, et. al. InfoCom 2009.

Challenges Time scale bias – appropriate time granularity may depend on which subgraph is being studied Detect local changes that may not have a visible effect on global metrics Combinatorial explosion – may not know beforehand which subgraphs to monitor Scalability – want streaming algorithms with minimal storage space costs

Outline Introduction and Motivation Model and Approach The MCD Algorithm Experimental Results Conclusions and Future Work

Recency For each pair of people, at a given point in time, tells us how much time has passed since those people communicated Allows us to focus on the most relevant activity in the network 8:00 am10:00 am12:00 pmNOW! Problem: The most frequent communicators will always seem “recent”, overshadowing others’ behavior. We call this time scale bias.

An Edge Model We model communication across an edge as a renewal process: a sequence of time-stamped events sampled from a distribution of inter-arrival times (IATs) 9:00 am10:30 am 11:30 am 2:00 pm 9:30 am

An Edge Model IATs for human interaction follow a power law The Bounded Pareto distribution gives us a concise model that matches well with real-world data and can be updated in real-time and constant space

Recency The recency function Rec : 2 T x T  [0,1] assigns a weight to an edge e at time t based on the age of the renewal process (time since the last event), decreasing from age = 0 to x max. Given an IAT distribution, there is a unique such function that is uniform over [0,1] when sampled uniformly in time. This property eliminates time scale bias. x min x max Recency of Edge in Bluetooth Dataset Bounded Pareto Distribution

Divergence Consider the weighted graph G = (V,E) representing a communication network, with w(e) = Rec(e). For, let = # of edges in E’ with Rec(e) ≥ θ. We define, where θ = 0.7  |E’| = 6  X E’,0.7 = 4  P(X ≥ 4) = 0.07  Div 0.7 (E’) = E’ Question: How do we know what’s the right threshold?

Max-Divergence Problem: No fixed threshold will catch all anomalies |E’| = |E’’| = 10 w(E’) = {0.1, 0.15, 0.25, 0.3, 0.35, 0.5, 0.6, 0.9, 0.93, 0.95} w(E’’) = {0.05, 0.1, 0.3, 0.4, 0.45, 0.76, 0.78, 0.8, 0.85, 0.93} θE[X]X E’, θ Div θ (E’)X E’’, θ Div θ (E’’) Solution:

A (maximal) θ-component of G = (V,E) is a connected subgraph C = (V’,E’) such that 1. w(e) ≥ θ for all e in E’ 2. w(e) < θ for all e not in E’ incident to V’ Maximal Components The set of θ-components form a partition of V, for all θ in [0,1]. θ = 0.7

Outline Introduction and Motivation Model and Approach The MCD Algorithm Experimental Results Conclusions and Future Work

The MCD Algorithm V2V2 V3V3 V1V1 V5V5 V4V V1V1 V2V2 V3V3 V4V4 V5V5 θComponentDiv(C) 0.9{V 1,V 2 } {V 1,V 2,V 3 } {V 1,V 2,V 3 } {V 4,V 5 } {V 1,V 2,V 3,V 4,V 5 } {V 1,V 2,V 3,V 4,V 5 } Calculate edge weights using the Recency function 2. Gradually decrease the threshold, updating components and divergence values as necessary 3. Output: Disjoint components with max divergence

Sample Output MCDθ#V(C)E-frac%E(C)%E(G) / / / / /

Outline Introduction and Motivation Model and Approach The MCD Algorithm Experimental Results Conclusions and Future Work

Data Dataset# Nodes# Edges# Timestamps ENRON – network of Enron employees BLUETOOTH – proximity of mobile devices LBNL – logs of IP traffic TWITTER – directed messages

Validation of Results

LBNL Case Study

Complexity Analysis Dataset: Twitter messages, Nov – Oct (263k nodes, 308k edges, 1.1 million timestamps) Updates O(1) per communication MCD Algorithm O(m log m), where m = # of edges; can be approximated in O(m) time

Outline Introduction and Motivation Model and Approach The MCD Algorithm Experimental Results Conclusions and Future Work

Conclusions A novel approach for analyzing temporal and structural properties of communication networks Effective as a stand-alone application, or as a tool to assist IT staff or security personnel Flexible: parameter-free, applicable to a wide variety of real-world domains Scalable: algorithms are streaming, and run in O(m) space and O(m) time in practice, where m is # of edges

Future Work Incorporate duration of communication and other edge properties into our model Extend our method to accommodate other data types, such as recommendation systems or hypergraphs Develop techniques to take past correlation of edges into account (to avoid recurring “anomalies”) Make it even more efficient – linear in number of nodes?

Acknowledgements Part of this work was conducted at Lawrence Livermore National Laboratory, under the guidance of Tina Eliassi- Rad. This project is partially supported by a DHS Career Development Grant, under the auspices of CCICADA, a DHS Center of Excellence.