Distributed Network Traffic Feature Extraction for a Real-time IDS

1 Distributed Network Traffic Feature Extraction for a Real-time IDS
Presented by: Dr. Ahmad Javaid
Co-authors: Ahmad Karimi, Quamar Niyaz, Dr. Weiqing Sun, and Dr. Vijay Devabhaktuni

2 Outline
Introduction
Tools Overview
IDS Architecture
Experimental Setup & Results
Conclusion

3 Introduction
An Intrusion Detection System (IDS) monitors the network for attacks.
It can be installed at different hierarchical layers of the network:
  Backbone
  Distribution
  Access
Modern networks carry a huge amount of traffic.

4 Introduction
Challenge: the IDS has to monitor all the incoming traffic.
Solution: distributed systems provide
  Parallel processing
  Huge disk space
  Reliability
  Scalability
We emphasize efficient network traffic processing and feature extraction.
1. The IDS has to monitor all the traffic within the same stipulated time interval.
2. A distributed system utilizes the resources of all the participating nodes.
3. The available disk space is the combined disk space of all the individual machines.
4. It is reliable because data is stored redundantly and there is no single point of failure.

6 Tools Overview
Our proposed system involves the following stages:
  Traffic collection
  Data storage
  Feature extraction
  Traffic classification
Different tools are used at different stages.

7 Tools Overview
Traffic collection: Netmap-libpcap
  Framework for high-speed packet I/O
  Real-time traffic collection with negligible loss (less than 1%)
  Other tools used for comparison were Dumpcap and Tshark.
  Significant losses were observed when the packet rate exceeded 0.5 million packets per second.
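
The "less than 1%" loss figure above is a ratio of packets missed to packets sent. A minimal sketch of that arithmetic follows; the function name and the counter values are hypothetical and are not measurements from the presentation.

```python
def loss_percent(packets_sent, packets_captured):
    """Return the share of sent packets that the capture tool missed, in percent."""
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")
    return 100.0 * (packets_sent - packets_captured) / packets_sent


# Hypothetical counters for illustration only.
sent = 1_000_000
captured = 995_000
loss = loss_percent(sent, captured)
print(f"loss: {loss:.2f}%  below 1%: {loss < 1.0}")
```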

8 Tools Overview
Data storage: HDFS
  Hadoop Distributed File System
  Provides scalable disk space
  Fault-tolerant
Feature extraction: Apache Spark
  Distributed, high-speed data processing framework
  In-memory processing, faster than the alternatives
  Each node processes blocks of data, hence parallel execution
Uses commodity machines. Can be expanded to virtually any number of machines. Fault-tolerant because of data replication.
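
The idea that each node processes its own blocks of data and the partial results are then merged can be sketched in plain Python. This is not Spark or HDFS code: a thread pool stands in for the cluster nodes, and the block size, function names, and sample records are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, block_size):
    """Cut the record list into fixed-size blocks, loosely like file blocks."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def count_per_source(block):
    """Work done independently on one block: packets seen per source IP."""
    return Counter(src for src, _dst in block)


def parallel_count(records, block_size=2, workers=3):
    """Process blocks concurrently, then merge the partial counts."""
    blocks = split_into_blocks(records, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_per_source, blocks))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total


# Hypothetical (source IP, destination IP) records.
records = [
    ("10.0.0.1", "10.0.0.9"),
    ("10.0.0.2", "10.0.0.9"),
    ("10.0.0.1", "10.0.0.9"),
    ("10.0.0.3", "10.0.0.8"),
    ("10.0.0.1", "10.0.0.8"),
]
totals = parallel_count(records)
print(dict(totals))
```

The merge step is the part that costs communication in a real cluster, which is why the benefit of adding nodes depends on the blocks being large enough.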

10 IDS Architecture
The IDS consists of:
  Traffic collection
  Traffic feature extraction
  Traffic classification
Traffic collection has not been the focus so far. Future work will use the Spark MLlib library, which provides many machine learning algorithms such as naive Bayes, random forests, and decision trees.

11 IDS Architecture
Traffic collection
  Traffic is mirrored to a particular port.
  Every packet is copied to the IDS from there.
Traffic feature extraction
  Stores captured packets on HDFS
  Extracts features using Spark
  Sends extracted features to the monitoring system

13 Experimental Setup And Results
Six Spark nodes and an HDFS cluster
Nodes hosted on a VMware ESXi host
ESXi runs on a Supermicro SYS-6028RWTRT
Each node was assigned 4 vCPUs, 8 GB RAM, and 60 GB disk storage
Performance evaluated by modelling the CAIDA DDoS attack dataset
tcpreplay was used to replay a dataset, and traffic was collected at 5-minute intervals
The Supermicro server shipped with Intel(R) Xeon 2.30 GHz, 96 GB RAM, and 20 CPU cores × 2.99 GHz

14 Experimental Setup And Results
CAIDA data was collected for an hour at 5-minute intervals.
The maximum data observed in a time window was 2.9 GB.
We generated up to ≈3.0 GB in 5 minutes for TCP traffic.
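
Splitting an hour of traffic into 5-minute windows and finding the busiest window can be sketched as below. The packet tuples, timestamps, and function names are hypothetical; the slides do not say how the windowing was implemented.

```python
from collections import defaultdict

WINDOW_SECONDS = 5 * 60  # the 5-minute interval used in the presentation


def window_index(timestamp, start):
    """Index of the 5-minute window that a timestamp (in seconds) falls into."""
    return int((timestamp - start) // WINDOW_SECONDS)


def bytes_per_window(packets, start):
    """Sum packet sizes per window; packets are (timestamp, size_in_bytes)."""
    totals = defaultdict(int)
    for timestamp, size in packets:
        totals[window_index(timestamp, start)] += size
    return dict(totals)


# Hypothetical packets spread over one hour (windows 0 to 11).
packets = [(0.0, 1500), (299.9, 400), (300.0, 60), (3599.0, 1000)]
volumes = bytes_per_window(packets, start=0.0)
busiest = max(volumes, key=volumes.get)
print(volumes, "busiest window:", busiest)
```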

15 Experimental Setup And Results
Comparison across varying cluster and file sizes:
  The 3 GB files took 3.17 ± 0.05 min on 6 nodes and 3.5 ± 0.1 min on 4 nodes.
  1 or 2 nodes took more than 5 minutes for the 2 GB and 3 GB files.
  1 GB files were processed within 5 minutes in all cases.
[Chart annotation: 5-minute threshold]
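
The real-time criterion behind these numbers is that extraction must finish before the next 5-minute window of traffic arrives. A minimal sketch of that check, using only the two mean timings reported above for 3 GB files (the function names are assumptions):

```python
WINDOW_MIN = 5.0  # collection interval, in minutes


def keeps_up(processing_min, window_min=WINDOW_MIN):
    """True if a window's data is processed before the next window is ready."""
    return processing_min < window_min


def headroom_min(processing_min, window_min=WINDOW_MIN):
    """Spare time per window, in minutes (negative means falling behind)."""
    return window_min - processing_min


def throughput_gb_per_min(file_gb, processing_min):
    """Average processing rate for one file."""
    return file_gb / processing_min


# Mean minutes for a 3 GB file, keyed by node count, as reported on the slide.
reported = {6: 3.17, 4: 3.5}
for nodes, minutes in reported.items():
    print(
        f"{nodes} nodes: keeps up={keeps_up(minutes)}, "
        f"headroom={headroom_min(minutes):.2f} min, "
        f"rate={throughput_gb_per_min(3.0, minutes):.2f} GB/min"
    )
```

The check uses mean timings only; the reported ± spread is small compared with the remaining headroom in both cases.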

16 Experimental Setup And Results
Current work focuses on TCP traffic feature extraction for TCP-based attack detection.
The following header fields were collected:
  Source IP
  Destination IP
  Source port
  Destination port
  IP payload
  TCP flags
Features are extracted using these headers. The extracted features are described in Table 1.
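
As an illustration of turning these header fields into features, the sketch below aggregates packets per destination IP. Table 1 is not reproduced in this transcript, so the features computed here (packet count, distinct sources, SYN-only count, payload bytes) are hypothetical examples and not the authors' feature set. The payload field is modelled as a length in bytes, which is also an assumption.

```python
from collections import defaultdict, namedtuple

# One record per packet, using the header fields listed on the slide.
Packet = namedtuple(
    "Packet", "src_ip dst_ip src_port dst_port payload_len tcp_flags"
)

SYN = 0x02  # TCP SYN flag bit
ACK = 0x10  # TCP ACK flag bit


def extract_features(packets):
    """Aggregate hypothetical per-destination features from packet headers."""
    acc = defaultdict(
        lambda: {"packets": 0, "sources": set(), "syn_only": 0, "payload_bytes": 0}
    )
    for p in packets:
        entry = acc[p.dst_ip]
        entry["packets"] += 1
        entry["sources"].add(p.src_ip)
        entry["payload_bytes"] += p.payload_len
        # A SYN without ACK is a connection attempt, a common flood indicator.
        if p.tcp_flags & SYN and not p.tcp_flags & ACK:
            entry["syn_only"] += 1
    return {
        dst: {
            "packets": e["packets"],
            "distinct_sources": len(e["sources"]),
            "syn_only": e["syn_only"],
            "payload_bytes": e["payload_bytes"],
        }
        for dst, e in acc.items()
    }


# Hypothetical sample traffic.
sample = [
    Packet("10.0.0.1", "10.0.0.9", 40001, 80, 0, SYN),
    Packet("10.0.0.2", "10.0.0.9", 40002, 80, 0, SYN),
    Packet("10.0.0.1", "10.0.0.9", 40001, 80, 120, ACK),
]
features = extract_features(sample)
print(features["10.0.0.9"])
```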

17 Experimental Setup And Results
Feature extraction output on Spark cluster

19 Conclusion
Feature extraction time is less than the period of traffic generation.
Supports real-time evaluation for a fixed time interval.
The current system can be implemented for small organizations.
The system may be applicable to larger organizations if the number of nodes in the cluster is increased.
Useful for networks with high traffic: for low-traffic networks, inter-node communication tends to neutralize the parallel processing advantage of Spark. Good for huge header data files.

20 Thank you Questions?

