A survey of network anomaly detection techniques
Mohiuddin Ahmed, Abdun Naser Mahmood, Jiankun Hu
School of Engineering and Information Technology, UNSW Canberra, ACT 2600, Australia
Journal of Network and Computer Applications 60 (2016) 19–31
Otto
Motivation: Information and Communication Technology (ICT) influences social wellbeing, economic growth and national security. ICT includes computers, mobile communication devices and networks. It is used both by legitimate users and by people with malicious intent.
Motivation: We must have tools to detect malicious intent.
The Survey: Anomaly discussion (types and detection, network attacks, mapping network attacks to anomalies). Anomaly detection technique groups (classification based, statistical based, information theory based, clustering based). Datasets, evaluation and issues.
Anomalies “An anomaly is an observation which deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism”
Anomalies: In a given dataset, anomalies may be abnormal or anomalous data. They indicate significant but rare events and prompt critical actions to be taken. Examples: unusual network traffic patterns; a change in service usage patterns; a computer has been hacked; unauthorized data is transmitted.
Generic anomaly detection framework
Challenges: Lack of a universally applicable technique. Data contains noise. Lack of publicly available labeled datasets (privacy concerns). Normal behaviors continually evolve, so techniques may not remain useful forever. Intruders are already aware of existing techniques.
Taxonomy of Techniques
Types of Anomalies: Point anomaly (a single entry, universally anomalous). Contextual anomaly (anomalous just in context, conditional). Collective anomaly (multiple entries, may be correlated).
Technique Output: Scores (ranked, thresholds). Binary (either anomalous or normal). Label (multiple well-defined categories).
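A minimal sketch of how a scored output relates to a ranked list and to a binary output. The record ids, the scores and the 0.5 cutoff are made up for illustration and do not come from the survey.

```python
# Hypothetical anomaly scores for five records (higher = more anomalous).
scores = {"r1": 0.10, "r2": 0.92, "r3": 0.35, "r4": 0.78, "r5": 0.05}


def rank_by_score(scores):
    """Return record ids ordered from most to least anomalous."""
    return sorted(scores, key=scores.get, reverse=True)


def to_binary_labels(scores, threshold):
    """Label a record 'anomalous' when its score exceeds the threshold."""
    return {
        rid: ("anomalous" if score > threshold else "normal")
        for rid, score in scores.items()
    }


ranking = rank_by_score(scores)
labels = to_binary_labels(scores, threshold=0.5)  # 0.5 is an assumed cutoff

print(ranking)
print(labels)
```

A scored output leaves the choice of cutoff to the analyst, while a binary output fixes that choice inside the technique.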
Types of Network Attacks: Denial of Service (DoS), Probes, User to Root (U2R), Remote to Local (R2L).
Attack to Anomaly Mapping
Techniques: Classification-based. Rely on expert knowledge: signatures, behavioral knowledge. Training: normal profile; attacks deviate from the norm; false positives. Datasets: expensive, time intensive.
Techniques: Classification-based. Support vector machines (SVM), Bayesian networks, neural networks, rule-based.
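To make the rule-based entry concrete, here is a small sketch of a classifier that checks a connection record against an ordered list of rules. The record fields, the rule thresholds and the labels are all hypothetical. In practice, rule-based detectors usually learn their rules from labeled training data rather than having them written by hand as they are here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Connection:
    """Hypothetical connection record; the field names are illustrative."""
    duration: float      # seconds
    failed_logins: int   # failed login attempts on this connection
    syn_count: int       # SYN packets from the same source in a time window


@dataclass
class Rule:
    label: str
    matches: Callable[[Connection], bool]


# Illustrative rules only; the thresholds are assumptions, not tuned values.
RULES: List[Rule] = [
    Rule("dos", lambda c: c.syn_count > 1000),
    Rule("r2l", lambda c: c.failed_logins >= 5),
]


def classify(conn: Connection) -> str:
    """Return the label of the first matching rule, else 'normal'."""
    for rule in RULES:
        if rule.matches(conn):
            return rule.label
    return "normal"


print(classify(Connection(duration=0.2, failed_logins=0, syn_count=5000)))
print(classify(Connection(duration=30.0, failed_logins=7, syn_count=3)))
print(classify(Connection(duration=12.0, failed_logins=0, syn_count=4)))
```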
Techniques: Statistical-based. Creation of a normal profile: false positives, false negatives. Creation of a statistical model: distance metric, anomaly threshold. Techniques: mixture model, signal processing techniques, principal component analysis.
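The profile, distance metric and threshold steps can be sketched with the simplest possible statistical model: a mean and a standard deviation. This is an illustrative stand-in, not one of the techniques named on the slide. The traffic samples and the threshold of 3 standard deviations are assumptions.

```python
import statistics


def fit_profile(normal_values):
    """Estimate a normal profile (mean, standard deviation) from attack-free data."""
    return statistics.mean(normal_values), statistics.stdev(normal_values)


def z_score(value, profile):
    """Distance of a value from the profile mean, in standard deviations."""
    mean, std = profile
    return abs(value - mean) / std


def is_anomalous(value, profile, threshold=3.0):
    """Flag the value when its distance exceeds the anomaly threshold."""
    return z_score(value, profile) > threshold


# Hypothetical packets-per-second samples taken from normal traffic.
normal_pps = [98, 102, 100, 97, 103, 101, 99, 100]
profile = fit_profile(normal_pps)

print(is_anomalous(101, profile))
print(is_anomalous(450, profile))
```

Lowering the threshold catches more attacks at the cost of more false positives; raising it does the opposite and increases false negatives.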
Techniques: Information theory. Translate distributions into single metrics. Computationally efficient. Metrics: entropy, relative entropy, conditional entropy, relative conditional entropy, information gain.
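As an illustration of collapsing a distribution into a single number, the sketch below computes Shannon entropy and relative entropy (Kullback-Leibler divergence) over destination ports. The port lists are invented. The `eps` floor for values missing from the reference distribution is a crude smoothing choice made only for this example.

```python
import math
from collections import Counter


def distribution(values):
    """Empirical probability distribution of the observed values."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def entropy(p):
    """Shannon entropy in bits: H(p) = -sum p(x) * log2 p(x)."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)


def relative_entropy(p, q, eps=1e-9):
    """Relative entropy D(p || q) in bits; eps stands in for values unseen in q."""
    return sum(
        px * math.log2(px / max(q.get(x, 0.0), eps))
        for x, px in p.items()
        if px > 0
    )


# Hypothetical destination ports observed in two time windows.
baseline = [80, 443, 80, 22, 443, 80, 53, 443]
scan = [21, 22, 23, 25, 53, 80, 110, 443]  # every port seen exactly once

p_base = distribution(baseline)
p_scan = distribution(scan)

h_base = entropy(p_base)
h_scan = entropy(p_scan)
divergence = relative_entropy(p_scan, p_base)

print(h_base, h_scan, divergence)
```

In this made-up data the scan window spreads traffic evenly over many ports, so its entropy is higher than the baseline and its divergence from the baseline is large.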
Techniques: Information theory. Correlation analysis: multivariate, dissimilarity distance metric.
Techniques: Clustering-based. Unsupervised; not dependent on expert knowledge. Three key assumptions: main clusters are for normal data; small and sparse clusters are anomalous; detection is based on a distance score. Algorithms: K-Means, K-Medoids, EM clustering, others.
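The assumptions above can be sketched with a plain k-means (Lloyd's algorithm) in a few lines. The 2-D points, the fixed initial centroids and the 20% size cutoff are assumptions chosen to keep the example small and deterministic. A real detector would pick its features, the number of clusters and the cutoffs from the data.

```python
import math


def assign(points, centroids):
    """Group points by their nearest centroid (Euclidean distance)."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centroids, iterations=10):
    """Lloyd's k-means; initial centroids are passed in so the run is deterministic."""
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, assign(points, centroids)


def distance_score(point, centroids):
    """Anomaly score: distance from the point to its nearest centroid."""
    return min(math.dist(point, c) for c in centroids)


def small_cluster_members(clusters, min_fraction=0.2):
    """Points in clusters holding less than min_fraction of all points."""
    total = sum(len(c) for c in clusters)
    return [p for c in clusters if len(c) / total < min_fraction for p in c]


# Hypothetical 2-D feature vectors: eight similar records and one far away.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3), (0.9, 0.8),
          (1.3, 1.0), (1.0, 1.2), (0.7, 0.9), (9.0, 9.5)]

centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
anomalies = small_cluster_members(clusters)

print(anomalies)
```

Here the distant record ends up alone in its own cluster and is flagged by the cluster-size rule. `distance_score` is the alternative when anomalies sit at the edge of a large cluster instead of forming their own.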
Techniques: Clustering-based. Regular clustering: grouping of data rows. Co-clustering: grouping of data rows and columns; dimensionality reduction; greater computational efficiency.
Techniques Evaluation
Conclusion: Existing anomaly detection techniques target a single system or a single network and perform local analysis; no communication and interaction exists. Challenges: comprehensive systems, large networks, dataset availability.