Unknown Malware Detection Using Network Traffic Classification


Unknown Malware Detection Using Network Traffic Classification Dmitri Bekerman, Bracha Shapira, Lior Rokach, Ariel Bar Presentation by Venkat Kotha

Outline •  Introduction and Related Work •  System Overview •  Dataset •  Evaluation •  Conclusion

Malware
•  Modern malware uses sophisticated techniques to hide itself while stealing confidential data and disrupting enterprise systems.
•  Malware programs use the Internet to communicate with the initiator of the attack.
•  To reach its Command and Control (C&C) center, malware most likely uses common, well-known network protocols so that its traffic passes through firewalls.
•  Network behavioral modeling can therefore be a useful approach for malware detection and malware family classification.

Related Work
To analyze network traffic, previous studies focused on:
•  a specific network layer or protocol (Rossow 2011, Stakhanova 2011)
•  certain malware or malware families (Stone-Gross 2009)
However, the behavior of different malware may be reflected in different layers or protocols. This renders such “partial” perspectives inadequate and difficult to adapt to the constant evolution of existing malware types, as well as to new types of malware or techniques.

Network traffic classification
•  Packet-level methods examine each packet's characteristics and application signatures.
•  Flow-level methods are based on aggregating packets into flows and extracting characteristics from each flow.
Attributes:
•  Port-based attributes are based on the target TCP or UDP port numbers assigned by the Internet Assigned Numbers Authority (IANA).
•  Payload-based attributes are based on signatures of the traffic at the application layer.
•  Statistical attributes relate to the statistical characteristics of the traffic.

Methodology
•  The first task in this study is to detect malicious communication, such as interaction with C&C servers, in order to raise alerts about the intrusion.
•  The authors' proposed solution is based on cross-layer and cross-protocol traffic classification, using supervised learning methods to detect previously unknown malware based on previously learned ones.
•  The solution is presented as dynamically adaptive, so that it stays one step ahead of attackers; these traits enable the discovery of malicious activities.

Network Features
•  The uniqueness of the solution lies in analyzing the data stream at four resolutions, with features generated from the Internet, Transport, and Application layers.

Network Features
Transaction: an interaction between a client and a server
- HTTP transaction
- DNS transaction
- SSL transaction
Session: identified by a unique 4-tuple of source and destination IP addresses and port numbers
- TCP session
- UDP session
Flow: a group of sessions between two network addresses during the aggregation period
Conversation window: a group of flows between a client and a server over an observation period
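The session and flow resolutions defined above can be illustrated with a minimal sketch. The `Packet` record, the function names, and the 60-second aggregation period are assumptions made for this example; the slides do not give the authors' data structures or period lengths.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical, already-parsed packet record (not the authors' format).
    ts: float       # capture time in seconds
    proto: str      # "TCP" or "UDP"
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def session_key(p):
    """Protocol plus the 4-tuple, ordered so both directions share one key."""
    a, b = (p.src_ip, p.src_port), (p.dst_ip, p.dst_port)
    return (p.proto, *sorted([a, b]))


def group_sessions(packets):
    """Session resolution: packets that share the same protocol and 4-tuple."""
    sessions = defaultdict(list)
    for p in packets:
        sessions[session_key(p)].append(p)
    return dict(sessions)


def group_flows(sessions, period=60.0):
    """Flow resolution: sessions between the same two addresses that start
    in the same aggregation period (period length is an assumed value)."""
    flows = defaultdict(list)
    for key, pkts in sessions.items():
        _, (ip_a, _), (ip_b, _) = key
        bucket = int(min(p.ts for p in pkts) // period)
        flows[(ip_a, ip_b, bucket)].append(key)
    return dict(flows)
```

A conversation window could be built the same way, by grouping flows per client and server pair over a longer observation period.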

Examples of Features

Feature Extraction
•  Feature extraction is done by a dedicated feature extraction component, which processes the raw network traffic, extracts the features, and provides them as input to the machine learning analyzer.
Data flow:
•  Inputs: *.pcap files and external databases such as Alexa Rank and GeoIP
•  Input processor: converts the inputs to objects
•  Parallel executor: the computation engine that extracts the features
•  Output processor: writes a CSV file containing the features
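The three stages of the data flow above can be sketched as follows. This is an illustration under stated assumptions, not the authors' component: it starts from already-parsed `(timestamp, session_id, size)` records instead of reading *.pcap files, and the three features it computes (packet count, byte count, duration) are hypothetical stand-ins for the real feature set.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor

FIELDS = ["session", "packets", "bytes", "duration"]  # hypothetical features


def input_processor(records):
    """Turn parsed (ts, session_id, size) records into per-session objects."""
    sessions = {}
    for ts, sid, size in records:
        sessions.setdefault(sid, []).append((ts, size))
    return sessions


def extract_features(item):
    """Compute the feature row for one session."""
    sid, pkts = item
    times = [t for t, _ in pkts]
    sizes = [s for _, s in pkts]
    return {
        "session": sid,
        "packets": len(pkts),
        "bytes": sum(sizes),
        "duration": max(times) - min(times),
    }


def run_pipeline(records, out, workers=4):
    """Input processor -> parallel executor -> CSV output processor."""
    sessions = input_processor(records)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the input order, so rows come back sorted by session id.
        rows = list(pool.map(extract_features, sorted(sessions.items())))
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return rows
```

Passing an `io.StringIO()` or an open file as `out` yields the CSV that the machine learning stage would consume.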

Data Flow

Machine Learning
•  Features are passed to the classification algorithms: Naïve Bayes, Decision Tree (J48), and Random Forest.
•  AUC: the area under the receiver operating characteristic (ROC) curve, a standard technique for summarizing classifier performance. It ranges from 0 to 1; 0.5 corresponds to the diagonal line (random guessing).
•  TPR: the proportion of actual positives (malicious network activities) that are correctly identified as such.
•  FPR: the proportion of actual negatives (benign network activities) that are incorrectly classified as positive.
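The three evaluation measures can be computed directly from labels and classifier scores. The sketch below assumes binary labels (1 = malicious, 0 = benign) and uses the pairwise-ranking formulation of AUC, which is equivalent to the area under the ROC curve; the function names are chosen for this example.

```python
def tpr_fpr(y_true, y_pred):
    """True positive rate and false positive rate from hard predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

A classifier that ranks every malicious instance above every benign one gets an AUC of 1.0, while random scoring gives about 0.5.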

Dataset
The sandbox malicious captures included:
a) 2,585 records obtained from the Verint sandbox
b) 7,991 records obtained from an academic sandbox
c) 4,167 records obtained from VirusTotal
d) 23,600 records obtained from Emerging Threats
e) 12,377 malicious records collected from the web and the open source community
Benign corporate traffic was captured for 10 days in a students' lab at Ben-Gurion University.
Corporate traffic was gathered by Verint from a real network and includes both malicious and benign traffic.

Labelling of data
Labelling was carried out using two methods:
•  NIDS labeling with Snort and Suricata SIDs
- based on deep-packet inspection rules
•  Verint's blacklist labeling
- unites all malicious activities into 52 unique families
- based on domain, URL, and destination IP blacklists
The corporate traffic gathered by Verint was labeled only with Verint's labels; all other datasets were labeled with both types of labels.

Dataset distribution Dataset has 19 malware families sampled mixing

Evaluation
Cross-family classification accuracy
- Differentiating benign traffic from malicious traffic
- Classifying malicious traffic into known, labeled families

Evaluation
Unknown malicious family accuracy
- Identifying previously unknown malware families

Network environment
Comparison of network environment robustness
- Results are very accurate when sandbox environments are classified by models trained on other sandbox environments.
- Results are lower (AUC of 0.7) for the real network environment.

Real network traffic experiment
- Classifying benign and malicious network traffic in a real environment
- In the real network dataset, 2,693 malicious instances were observed, related to four different families.

Data Split
•  The time split method was used: the data was split based on the deployment dates of SIDs.
•  The results reveal that from the 60/40 split point (60% of the data in the training set and the remaining 40% in the test set), the AUC remains consistently above 0.8.
•  From the 80/20 split point, the stable level rises to 0.9.
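A time split differs from a random split in that every training record precedes every test record. The sketch below assumes each record is a `(date, features, label)` tuple, where the date stands for the deployment date of the SID that labeled the record; the record layout and function name are assumptions for this example.

```python
def time_split(records, train_percent):
    """Sort records by date and cut them chronologically.

    records: list of (date, features, label) tuples; dates must sort
    chronologically (e.g. ISO "YYYY-MM-DD" strings or date objects).
    train_percent: integer share of the oldest records used for training.
    """
    ordered = sorted(records, key=lambda r: r[0])
    cut = len(ordered) * train_percent // 100
    return ordered[:cut], ordered[cut:]
```

Calling `time_split(records, 60)` and `time_split(records, 80)` gives the 60/40 and 80/20 split points mentioned above; the model never sees data that is newer than what it is tested on.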

Chronological Experiments
- The Random Forest classifier succeeds in detecting most of the malware (except in 2 cases) 4 weeks before a relevant rule was deployed.
- The detection rate deteriorates as we look further in time.

Robustness of features
- Analysis shows that only 204 of the 927 available features were ever selected by the CFS algorithm.
- On average, only 27 features were selected each week for the respective model.

Robustness of features
- Some features were consistently selected for most of the models during all periods.
- This implies that a very small set of features can serve the model for a long period of time.
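The kind of bookkeeping behind these figures can be sketched as follows. The code does not implement CFS itself; it assumes the weekly feature selections are already available as lists of feature names, and the 50% threshold for calling a feature stable is an arbitrary choice for this example.

```python
from collections import Counter


def selection_stats(weekly_selected):
    """Summarize per-week feature selections.

    weekly_selected: one list of selected feature names per weekly model.
    Returns (per-feature selection counts, number of features ever
    selected, average number of features selected per week).
    """
    counts = Counter(f for week in weekly_selected for f in set(week))
    ever_selected = len(counts)
    average = sum(len(set(w)) for w in weekly_selected) / len(weekly_selected)
    return counts, ever_selected, average


def stable_features(counts, n_weeks, min_share=0.5):
    """Features selected in at least min_share of the weekly models."""
    return sorted(f for f, c in counts.items() if c / n_weeks >= min_share)
```

Applied to the study's weekly models, `ever_selected` and `average` would correspond to the reported 204 and 27, and `stable_features` to the small set of features that keep being selected.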

Robustness in cross environments
- A specific set of features was extracted for each environment.
- A global set was also prepared, based on features selected from all other environments.
- Robustness is measured as the difference in AUC between the private (environment-specific) set and the global set.

Conclusion
- In the proposed method, features from different observation resolutions and across layers and protocols improve classification performance.
- Accuracy was not affected by the network environment, whether sandbox or real network.
- The analysis showed predictive performance significantly improved over modern rule-based network intrusion detection systems (Snort, Suricata).

Conclusion
- The proposed method was implemented with a small number of network behavior features, which remain suitable over time and across different network environments.
- The method analyzes only traffic behavior, not content, and performs no payload analysis. As a result:
- it is effective on encrypted traffic
- it is effective against malware using legitimate network resources
- users' privacy is preserved (it can be integrated with enterprise network systems)

Future work
- Apply transfer learning techniques to improve detection in untrained network environments
- Evaluate the proposed methods and models on mobile network traffic
- Test the proposed methods for malware family clustering
- Adjust the method for online detection in high-bandwidth networks

Thank You