Presenter: Kuei-Yu Hsu Advisor: Dr. Kai-Wei Ke 2013/4/29 Detecting Skype flows Hidden in Web Traffic.

Outline
Introduction
Proposed Methodology
Experimental Datasets
Experimental Results
Conclusions

Introduction
What is VoIP?
Delude restrictive firewalls
Skype Proprietary Protocol
About Detection

What is VoIP?
VoIP (Voice over Internet Protocol) refers to a way to carry phone calls over an IP data network, whether the Internet or your own internal network. VoIP calls are usually much cheaper than traditional long-distance telephone calls to PSTN users, and they are even free when a call is placed directly from one VoIP end user to another.

Delude restrictive firewalls
Network managers commonly adopt restrictive firewalls to improve the security of the internal network and to optimize the use of network resources. Such firewalls are unlikely to block Web traffic, because it is usually perceived as a fundamental service that is essential for Internet access. Applications can therefore deliver non-HTTP traffic over TCP ports 80 (HTTP) or 443 (HTTPS), fooling restrictive firewalls to gain network access.

Skype Proprietary Protocol
Skype can delude a network firewall by using Web ports to establish communication with other Skype peers. Skype adopts this strategy as a fallback mechanism in case other strategies fail to get through a restrictive firewall. Skype traffic disguised as Web traffic in this way is quite difficult for network operators to detect.

About Detection
Detection of Skype flows in Web traffic:
1. HTTP workload model
2. Goodness-of-fit tests
   1) Chi-square test
   2) Kolmogorov-Smirnov test
3. P2P VoIP characteristics
Detection process:
1. Training datasets
2. Evaluation datasets

Proposed Methodology
HTTP workload model
Goodness-of-fit tests
   1) Chi-square test
   2) Kolmogorov-Smirnov test
Skype characteristics

Proposed Methodology
1. Define an HTTP workload model and capture real Web data to build empirical distributions of some relevant parameters.
2. Capture Web traffic with VoIP calls hidden in it and calculate the same parameters for each flow. Use metrics taken from two goodness-of-fit tests to decide whether the computed parameters are compatible with the empirical distributions derived in the previous step, and classify each flow as legitimate Web traffic or not.

HTTP Workload Model
Define a model for evaluating "normal" Web behavior. The model has the following parameters:
1. Web request size
2. Web response size
3. Interarrival time between requests
4. Number of requests per page
5. Page retrieval time

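Goodness-of-fit tests: Chi-square test
The test was first investigated by Karl Pearson. Its statistic sums, over all classes, the squared difference between observed and expected frequencies, normalized by the expected frequency:
$$\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i} \qquad (1)$$
O_i: an observed frequency.
E_i: an expected (theoretical) frequency, asserted by the null hypothesis.
K: the number of classes.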
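As an illustration of how the chi-square score could be computed for one flow, the following is a minimal sketch rather than the authors' tool. The function name `chi_square_score`, the histogram representation (`bin_edges`, `expected_probs`), and the choice to skip classes whose expected frequency is zero are all assumptions made for this example.

```python
from bisect import bisect_right


def chi_square_score(sample, bin_edges, expected_probs):
    """Chi-square score of `sample` against a training histogram.

    bin_edges      -- K+1 increasing edges defining K classes (assumed layout)
    expected_probs -- K probabilities taken from the training distribution
    """
    k = len(expected_probs)
    observed = [0] * k
    for value in sample:
        # Locate the class of this value; out-of-range values are clamped
        # into the first or last class.
        idx = bisect_right(bin_edges, value) - 1
        idx = min(max(idx, 0), k - 1)
        observed[idx] += 1

    n = len(sample)
    score = 0.0
    for o_i, p_i in zip(observed, expected_probs):
        e_i = n * p_i  # expected frequency E_i under the null hypothesis
        if e_i > 0:    # assumption: classes with E_i == 0 are skipped
            score += (o_i - e_i) ** 2 / e_i
    return score


# Hypothetical values: all four observations fall in the first of two
# equally likely classes.
print(chi_square_score([1, 1, 1, 1], [0, 2, 4], [0.5, 0.5]))
```

A larger score means the flow's parameter values fit the "normal" Web distribution less well.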

Goodness-of-fit tests: Kolmogorov-Smirnov test
The test quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution:
$$D = \max_x \left| F_0(x) - S_N(x) \right| \qquad (2)$$
F_0(x): the empirical distribution function derived from the training part.
S_N(x): the cumulative step function of a sample of N observations.
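The Kolmogorov-Smirnov distance between the training distribution and one flow's sample can be sketched as below. This is an illustrative implementation under the assumption that both distributions are available as raw lists of values; the function name `ks_distance` is hypothetical.

```python
from bisect import bisect_right


def ks_distance(training, sample):
    """Largest gap between two empirical cumulative step functions."""
    train = sorted(training)
    samp = sorted(sample)
    d = 0.0
    # Both functions are right-continuous step functions, so the largest
    # gap is reached at one of the observed values.
    for x in sorted(set(train) | set(samp)):
        f0 = bisect_right(train, x) / len(train)  # F_0(x) from training data
        sn = bisect_right(samp, x) / len(samp)    # S_N(x) from the flow sample
        d = max(d, abs(f0 - sn))
    return d


# Hypothetical values for illustration only.
print(ks_distance([1, 2, 3, 4], [3, 4]))
```

D ranges from 0 (identical step functions) to 1 (completely separated samples).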

Skype characteristics
Skype does not use SIP or any other known signaling protocol for VoIP calls, and all its traffic is end-to-end encrypted. It automatically detects network characteristics and chooses the best available option for communicating with other Skype peers. It uses Web ports only as a fallback mechanism, when UDP is not available.

Experimental Datasets
1. Training datasets: model part
2. Evaluation datasets: detection part

Training Datasets: model part
A training dataset is used to characterize "normal" Web traffic behavior.
1. tcpdump: captures full HTTP packet traces, generating dump files.
2. tcpflow: reads these dump files and calculates the parameters present in the Web workload model.

Training Datasets
We read HTTP headers to clearly identify each Web request and Web response, and we also compute the inactivity time between Web messages.
Abbreviations: ISP = Internet service provider; ACD = academic institution.

Evaluation Datasets: detection part
1. tcpdump: Web packet traces were captured again, but this time only the TCP/IP headers were captured.
2. Another program: the calculations and the division of flows into Web pages are done without examining TCP payload (HTTP header) information.
Web message size: every MTU-sized packet is considered part of the same Web message, provided there is not too much inactive time between packets.
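One possible reading of the Web message size rule is sketched below: a packet continues the current message when the previous packet was full-sized and arrived recently enough. The function name `group_web_messages`, the 1460-byte full-payload size, and the 0.5 s inactivity limit are assumptions chosen for illustration, not values from the presentation.

```python
def group_web_messages(packets, full_size=1460, max_gap=0.5):
    """Estimate Web message sizes from header-only packet records.

    packets   -- (timestamp_seconds, payload_bytes) pairs for one direction
                 of a flow, in time order
    full_size -- payload size treated as "MTU-sized" (assumed value)
    max_gap   -- inactivity limit in seconds (assumed value)
    """
    messages = []
    current = 0
    prev_time = None
    prev_full = False
    for ts, size in packets:
        # The message continues only if the previous packet was full-sized
        # and the gap since that packet is short enough.
        continues = (
            prev_full and prev_time is not None and ts - prev_time <= max_gap
        )
        if current and not continues:
            messages.append(current)
            current = 0
        current += size
        prev_time = ts
        prev_full = size >= full_size
    if current:
        messages.append(current)
    return messages


# Hypothetical trace: two full packets plus a short tail, then a new message.
trace = [(0.00, 1460), (0.01, 1460), (0.02, 300), (1.00, 200)]
print(group_web_messages(trace))
```

Because only sizes and timestamps are used, this kind of grouping works on traces where the TCP payload was never captured.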

Evaluation Datasets
We used the number of requests per page as a filter to remove smaller flows. The other three parameters (Web request size, Web response size, and interarrival time between requests) are each represented by a list of values, which are used in Equations (1) and (2) to generate a χ² or a Kolmogorov-Smirnov D score.

Evaluation Datasets
We thus have three values that can be compared with thresholds to decide whether a set of related request-response messages is likely to be Skype or not. VoIP calls of different durations were produced in a controlled way by a small network of computers running Skype behind port-restrictive firewalls.

Experimental Results
Sensitivity and specificity
ROC curves
Detecting Skype flows
Evaluating real-time detection

Sensitivity and specificity
Sensitivity and specificity are statistical measures of the performance of a binary classification test, also known in statistics as a classification function. The test outcome can be positive or negative:
True positive = correctly identified
False positive = incorrectly identified
True negative = correctly rejected
False negative = incorrectly rejected
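The four outcomes above combine into sensitivity and specificity as in the following minimal sketch. The function name `sensitivity_specificity` and the convention that "positive" means "flagged as Skype" are assumptions for illustration.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for paired boolean sequences.

    labels      -- True where the flow really is Skype
    predictions -- True where the detector flagged the flow as Skype
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)          # correctly identified
    fn = sum(1 for y, p in pairs if y and not p)      # incorrectly rejected
    tn = sum(1 for y, p in pairs if not y and not p)  # correctly rejected
    fp = sum(1 for y, p in pairs if not y and p)      # incorrectly identified

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


# Hypothetical labels and detector outputs.
print(sensitivity_specificity([True, True, False, False],
                              [True, False, False, True]))
```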

ROC curves
A ROC (Receiver Operating Characteristic) curve is a graphical plot of the sensitivity against (1 − specificity) of a binary classifier. Sensitivity is the same as the true positive rate, and (1 − specificity) is equal to the false positive rate. The classifier has a discrimination threshold that is varied to produce the different points of the curve.
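Varying the discrimination threshold to produce ROC points can be sketched as follows. The function name `roc_points` is hypothetical, and the sketch assumes that a higher score (for example a larger χ² value) means the flow looks less like normal Web traffic and is therefore flagged as Skype. It also assumes at least one Skype and one non-Skype flow are present.

```python
def roc_points(scores, is_skype):
    """Return (false_positive_rate, true_positive_rate) per threshold.

    Thresholds are taken from the observed scores, from strictest to loosest.
    """
    positives = sum(1 for y in is_skype if y)
    negatives = len(is_skype) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # A flow is flagged when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, is_skype) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, is_skype) if s >= threshold and not y)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical scores: the two Skype flows score higher than the Web flows.
print(roc_points([10, 8, 3, 1], [True, True, False, False]))
```

A curve whose points lie closer to the top-left corner (low false positive rate, high true positive rate) indicates a better detector.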

Detecting Skype flows
Fig. 5. χ² detection: 90% of the 80 Skype flows are correctly identified (true positive rate) with less than 2% of the 17,294 non-Skype flows incorrectly identified (false positive rate). A 100% detection rate is reached with around 5% false positives.
Fig. 6. Kolmogorov-Smirnov D detection: a true positive rate of 70% with a false positive rate of around 2%, and 80% detection with 5% false positives.
The χ² ROC curve is always closer to the top-left corner than the K-S curve.

Evaluating real-time detection
A network administrator may want to identify the Skype calls that are currently using the network, not calls made minutes or hours ago. Here, the data is captured and analyzed over short, limited time intervals. The χ² detection using the newly generated trace (the set of all 10 s capture files) had a true positive rate of up to 85%, with fewer false positives than the χ² detection using the ISP-3 trace.
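Splitting a capture into short analysis intervals can be sketched as below. The 10-second default matches the capture files mentioned above; the function name `split_into_windows` and the packet record layout are assumptions for illustration.

```python
def split_into_windows(packets, window_seconds=10.0):
    """Group (timestamp_seconds, payload_bytes) records into fixed windows.

    Windows are measured from the first packet's timestamp; windows that
    contain no packets are omitted from the result.
    """
    if not packets:
        return []
    start = packets[0][0]
    windows = {}
    for ts, size in packets:
        index = int((ts - start) // window_seconds)
        windows.setdefault(index, []).append((ts, size))
    return [windows[i] for i in sorted(windows)]


# Hypothetical trace spanning three non-empty 10 s windows.
trace = [(0.0, 100), (5.0, 200), (12.0, 300), (31.0, 50)]
for window in split_into_windows(trace):
    print(window)
```

Each window could then be scored independently with the goodness-of-fit metrics, at the cost of having fewer observations per flow than a full-length trace.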

Conclusions
It is rather common to find non-HTTP traffic using Web ports to delude firewalls and other network elements. We evaluated a Skype detection system based on statistical tests that efficiently detects Skype flows hidden among Web traffic, without searching for particular Skype patterns or signatures and without inspecting payload information.

Conclusions
We manually produced Skype traffic to build our Web evaluation dataset and verified that the proposed parameters are able to identify Skype flows hidden among HTTP traffic. Using simple metrics taken from two goodness-of-fit tests, the χ² value and the Kolmogorov-Smirnov distance, we show that Skype flows can be clearly detected, but our results suggest that the χ² metric is a much better choice.

Conclusions
Considering the experimental results for the chi-square detection, our methodology gives network management enough flexibility to adopt different approaches to the possible detection of Skype flows in Web traffic. As future work, we intend to further analyze real-time detection by investigating the minimum time interval needed, and to build and evaluate an optimized version of our tool for real-time monitoring of network links.

References
E. P. Freire, A. Ziviani, and R. M. Salles, "Detecting Skype Flows in Web Traffic," Proc. of the IEEE/IFIP Network Operations and Management Symposium (NOMS 2008), April 2008, pp
E. P. Freire, A. Ziviani, and R. M. Salles, "Detecting VoIP Calls Hidden in Web Traffic," IEEE Transactions on Network and Service Management, Vol no. 5, pp , December

Thanks for listening