Centre de Comunicacions Avançades de Banda Ampla (CCABA) Universitat Politècnica de Catalunya (UPC) Identification of Network Applications based on Machine.

Centre de Comunicacions Avançades de Banda Ampla (CCABA)
Universitat Politècnica de Catalunya (UPC)
Identification of Network Applications based on Machine Learning Techniques
COST-TMA Meeting, Samos 2008
Valentín Carela-Español, Pere Barlet-Ros, Josep Solé-Pareta
{vcarela, pbarlet,

Outline
Scenario and objectives
Existing solutions
  • Well-known ports
  • Payload based (pattern matching)
  • Machine Learning
    – Supervised
    – Unsupervised
Proposed method
Results
Conclusions and Future work

Scenario and objectives
Scenario: SMARTxAC, the Traffic Monitoring and Analysis System for the Anella Científica
  • Real-time classification
  • Independent from packet contents
  • High-speed link
Objectives:
  • Develop an ML technique to identify applications in SMARTxAC
  • Automate the ML training phase
  • Adapt our solution to Netflow
  • Study how sampling affects the solution

Existing Solutions
Well-known ports
  + Computationally lightweight
  - Very low accuracy
Payload based (pattern matching)
  + High accuracy
  - Packet contents are required
  - Computationally expensive
  - Content encryption
  - Privacy legislation
Consequence: not feasible solutions
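
To make the well-known ports approach concrete, here is a minimal Python sketch. The port table is a small hand-picked subset and the name `classify_by_port` is hypothetical; neither comes from the slides.

```python
# Hypothetical lookup table: a small subset of well-known ports (assumption).
WELL_KNOWN_PORTS = {
    22: "ssh",
    25: "smtp",
    53: "dns",
    80: "http",
    443: "https",
}


def classify_by_port(sport: int, dport: int) -> str:
    """Guess the application from the flow's ports, or return "unknown"."""
    # Try the destination port first, then the source port (reply direction).
    for port in (dport, sport):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    return "unknown"
```

The lookup is a single dictionary access, which is why the method is lightweight. Any application running on a non-standard port, or tunnelled over port 80, is mislabelled, which is consistent with the low accuracy noted above.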

Existing Solutions
Machine Learning Techniques
  - Difficult training phase
  + Packet contents are not required
  + High accuracy
  + Computationally viable
Two main possibilities:
  • Supervised methods:
    + Better accuracy for expected classes
    - Need a complete pre-labeled dataset
    - Difficult to detect when retraining is needed
    - No detection of new classes
  • Unsupervised methods:
    + Do not need a fully labeled dataset
    + Automatic detection of new classes
    + Better accuracy for new classes

Proposed method
Supervised identification based on the C4.5 algorithm
  • Developed by Ross Quinlan as an extension of ID3
  • Based on the construction of a classification tree
Training set
  • Actual traffic flows
  • Pairs of (feature vector, application)
  • Feature vector contains relevant characteristics of traffic flows
  • Application is identified using L7-filter
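
C4.5 chooses each tree split by gain ratio (information gain divided by split information). The sketch below computes it for a single discrete attribute. It is an illustration only, not the authors' implementation: the function names are hypothetical, and the threshold handling C4.5 applies to continuous attributes is left out.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio obtained by splitting `labels` on the discrete attribute `values`."""
    total = len(labels)
    # Group the class labels by attribute value.
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted entropy remaining after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0
```

For instance, an attribute that separates the classes perfectly scores higher than one that takes the same value in every flow; the latter has zero split information and is scored 0.0 here.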

Machine Learning process
1) Collection of the training set
   Representative flows of the environment to be monitored
2) Automatic flow classification → application class
   Pattern matching using L7-filter
   It can be simplified if an artificial training set is used in 1)
3) Feature extraction from the training flows
4) Construction of a C4.5 classification tree
   E.g. using Weka
5) Deployment of the tree obtained in 4) in the monitoring system
6) Retraining of the system
   Starting from phase 1)
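
Step 2) can be pictured as matching the first payload bytes of each training flow against per-application regular expressions. The Python sketch below is a simplified stand-in for L7-filter: the three patterns are illustrative and are not the real L7-filter pattern files, and dropping flows that match no pattern is an assumption made here.

```python
import re

# Illustrative patterns only (assumption); real L7-filter patterns are more elaborate.
PATTERNS = {
    "http": re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/1\.[01]"),
    "ssh": re.compile(rb"^SSH-[12]\.[0-9]"),
    "smtp": re.compile(rb"^220 .*SMTP", re.IGNORECASE),
}


def label_flow(payload: bytes) -> str:
    """Return the first application whose pattern matches, else "unknown"."""
    for app, pattern in PATTERNS.items():
        if pattern.search(payload):
            return app
    return "unknown"


def build_training_set(flows):
    """Turn (feature_vector, payload) tuples into (feature_vector, label) pairs.

    Flows that no pattern recognises are dropped (assumption of this sketch).
    """
    training = []
    for features, payload in flows:
        label = label_flow(payload)
        if label != "unknown":
            training.append((features, label))
    return training
```

Payload is needed only while building the training set; the deployed tree from step 5) then classifies on the feature vector alone.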

Accuracy

Netflow Accuracy

Accuracy

Features Accuracy
  • Best Normal Feature Subset: dport, bytes_out, avg_out_size, sport, avg_in_size, push_in
  • Best Netflow Feature Subset: dport, bytes, push

How does sampling affect it?

Conclusions and Future Work
  • Machine learning techniques are a good solution for identifying applications
  • Identification in sampled scenarios is still an open problem
Future work:
  • Find a more accurate automatic system to label the dataset
  • Build early decision trees to identify the flow as soon as possible
  • Find features that achieve higher accuracy and are more resilient to sampling
  • Test with traces from other networks to check the generality of the solution

Thank you for your attention
Questions?

SMARTxAC
SMARTxAC: Traffic Monitoring and Analysis System for the Anella Científica
  • Operational since July 2003
  • Developed under a CESCA-UPC collaboration agreement
  • Tailor-made traffic monitoring system for the Anella Científica
Main objectives
  • Low-cost platform
  • Continuous monitoring of high-speed links without packet loss
  • Detection of network anomalies and irregular usage
  • Multi-user system: network operators and institutions
Measurement of two full-duplex 10GigE links
  • Connection between Anella Científica and RedIRIS
  • Current load: > 5 Gbps / > 300 Kpps

Features
Requirements
  • Real-time extraction
  • Independence from packet contents
Feature examples (total: 25)
  • Packets and bytes per flow
  • Flow duration
  • min/avg/max packet size
  • min/avg/max TCP window size
  • min/avg/max packet interarrival time
  • Packets with flags PUSH, URG, DF, … set
  • Average increase of IPID
  • OS estimation (source and destination)
  • Also ports and protocols (but not in the traditional way)
  • …
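
A few of the listed features can be computed from packet headers alone. The Python sketch below does so for one direction of a flow. The `Packet` record and the feature names are hypothetical; the sketch covers only a handful of the 25 features and assumes a non-empty, time-ordered packet list.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Packet:
    timestamp: float  # arrival time in seconds
    size: int         # packet length in bytes
    push: bool        # True if the TCP PUSH flag is set


def extract_features(packets):
    """Compute header-only features for one flow (non-empty, time-ordered list)."""
    sizes = [p.size for p in packets]
    times = [p.timestamp for p in packets]
    # Interarrival times between consecutive packets.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packets": len(packets),
        "bytes": sum(sizes),
        "duration": times[-1] - times[0],
        "min_size": min(sizes),
        "avg_size": mean(sizes),
        "max_size": max(sizes),
        "avg_iat": mean(gaps) if gaps else 0.0,
        "push_count": sum(p.push for p in packets),
    }
```

Everything here can be maintained with running counters per flow, which is what makes real-time extraction plausible on a high-speed link.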

Netflow Features
Requirements
  • Available in the Netflow traces (version 5)
    – Unidirectional flows
Feature examples (total: 15)
  • Packets and bytes per flow
  • Flow duration
  • Average packet size
  • Average packet interarrival time
  • Flows with flags PUSH, URG, SYN, FIN, RST, ACK set
  • Type of service
  • Also ports and protocols (but not in the traditional way)
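
With Netflow version 5, only per-flow aggregates are available, so the averages have to be derived from the record's counters and timestamps. The Python sketch below shows one way to do this. The class and field names are hypothetical, although they mirror the v5 record fields (packet and byte counts, first/last timestamps in milliseconds, cumulative TCP flags, type of service).

```python
from dataclasses import dataclass

# Standard TCP flag bit positions.
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PUSH": 0x08, "ACK": 0x10, "URG": 0x20}


@dataclass
class NetflowV5Record:
    srcport: int
    dstport: int
    prot: int
    tos: int
    d_pkts: int      # packets in the flow
    d_octets: int    # bytes in the flow
    first_ms: int    # uptime (ms) at first packet
    last_ms: int     # uptime (ms) at last packet
    tcp_flags: int   # bitwise OR of the flags of all packets


def netflow_features(r):
    """Derive per-flow features from a single unidirectional record."""
    duration = (r.last_ms - r.first_ms) / 1000.0
    features = {
        "sport": r.srcport,
        "dport": r.dstport,
        "prot": r.prot,
        "tos": r.tos,
        "packets": r.d_pkts,
        "bytes": r.d_octets,
        "duration": duration,
        "avg_size": r.d_octets / r.d_pkts,
        "avg_iat": duration / (r.d_pkts - 1) if r.d_pkts > 1 else 0.0,
    }
    # Only the presence of each flag is known, not how many packets carried it.
    for name, bit in TCP_FLAGS.items():
        features["flag_" + name.lower()] = bool(r.tcp_flags & bit)
    return features
```

Because the record stores only the OR of all TCP flags, the per-packet flag counts of the full feature set reduce to per-flow booleans, and min/max values are not recoverable.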