2
Firewalls and Application Level Gateways (ALGs)
Usually configured to protect against at least two types of attack:
▪ Control which sites local users connect to
▪ (Try to) limit attacks coming from the Internet
Firewalls check TCP ports and destination addresses
ALGs verify that the traffic crossing the network boundary conforms to the security policies and is not malicious
3
Tunneling techniques
▪ Disguise one application-layer protocol as another
▪ Make security policies ineffective and can lead to a dangerous illusion of security
▪ Can be based on the DNS, HTTP or SSH protocols
4
Packets are encoded at the application layer so that they conform to specific allowed protocol(s). Most commonly, three protocols are used to tunnel Internet traffic: DNS, HTTP and SSH.
5
DNS Tunneling
▪ Exploits the way regular DNS requests for a given domain are forwarded
▪ Powerful technique, since DNS is rarely blocked
▪ Can rarely achieve throughputs higher than a few kb/s due to the mechanism’s complexity, and is therefore rarely used
6
HTTP Tunneling
The packets of the tunneled flow are encoded so that they can be incorporated in one or more regular, semantically valid HTTP sessions
SSH Tunneling
SSH tunneling is also known as port forwarding
▪ Deep-packet-inspection techniques become useless because the data is encrypted
▪ As a result, today’s ALGs effectively let any protocol be tunneled through SSH
▪ That makes SSH tunneling a very powerful technique
8
The two previous authentication phases are not used by possible tunnels:
▪ The host authentication is not encrypted, so its packets can be easily discarded.
▪ The user authentication is encrypted, so it is difficult to know when it ends and the actual data exchange begins.
9
Definition: Automatic (machine) recognition, description, classification, and grouping of patterns according to specific features.
If the information about how to group the data into classes is known before examining the data, the approach is called supervised; otherwise it is called unsupervised.
The goal of a pattern recognition technique is to represent each element as a pattern and to assign the pattern to the class that best describes it.
10
Stages in a pattern recognition problem
▪ Data collection
▪ Feature selection or feature extraction
▪ Definition of patterns and classes
▪ Definition and application of the discrimination procedure
▪ Assessment and interpretation of results
11
Class description…revisited
▪ Once classes have been identified, a training set T_s(ω_i) can be created for every class ω_i
▪ A thorough inspection of T_s can lead to an analytical model describing the corresponding class
▪ Then, a decision function f has to be determined that takes the observed data x as input and outputs a prediction of the class that generated it: ω(x) = f(x)
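As a rough illustration of the decision function ω(x) = f(x), the sketch below summarizes each class by the per-feature mean of its training set and assigns x to the nearest class. The mean-based model, the squared Euclidean distance, and the names `train_class_models` and `decide` are assumptions made for this example; the slides do not say which model or distance is used.

```python
from statistics import fmean

def train_class_models(training_sets):
    """Map each class label to the per-feature mean of its training patterns.

    The mean vector is a stand-in for the 'analytical model' of a class;
    it is an illustrative choice, not the model used in the slides.
    """
    models = {}
    for label, patterns in training_sets.items():
        # zip(*patterns) groups the values of each feature across patterns
        models[label] = [fmean(column) for column in zip(*patterns)]
    return models

def decide(x, models):
    """f(x): return the label whose model is closest to x."""
    def distance(label):
        return sum((xi - mi) ** 2 for xi, mi in zip(x, models[label]))
    return min(models, key=distance)

# Toy training sets T_s(omega_i): each pattern is [packet size, inter-arrival time]
training_sets = {
    "omega_1": [[100.0, 0.01], [120.0, 0.02]],
    "omega_2": [[1400.0, 0.5], [1500.0, 0.7]],
}
models = train_class_models(training_sets)
predicted = decide([110.0, 0.015], models)
```

Here `predicted` is the label of the class whose mean pattern lies nearest to the observed data.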
12
▪ Aims at detecting tunneling activities over the HTTP and SSH protocols
▪ Focuses on building an accurate description of legitimate traffic
▪ Builds on known pattern recognition techniques
13
Building patterns and classes (1/2)
The features are gathered directly from the legitimate flows composing the TCP session
A TCP flow is represented by a pattern that takes into account the:
▪ packet size s_i
▪ inter-arrival time Δt between two consecutive packets
▪ number of packets r that are useful for measurement
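The pattern described above can be sketched as a list of (size, inter-arrival time) pairs. The function name `flow_pattern`, the input layout, and the convention of pairing each packet with the gap since the previous one (so that r pairs need r + 1 packets) are assumptions for this example; the slides do not fix these details.

```python
def flow_pattern(packets, r):
    """Build a flow pattern from captured packets.

    packets: list of (timestamp_seconds, size_bytes) in arrival order.
    r: number of (size, inter-arrival time) pairs to keep.

    Assumed convention: pair i holds the size of packet i and the time
    elapsed since packet i - 1, so r pairs require r + 1 packets.
    """
    if len(packets) < r + 1:
        raise ValueError("flow too short for the requested pattern length")
    pattern = []
    for i in range(1, r + 1):
        prev_ts, _ = packets[i - 1]
        ts, size = packets[i]
        pattern.append((size, ts - prev_ts))
    return pattern

# Toy capture: four packets of one TCP flow
capture = [(0.0, 60), (0.5, 1500), (0.75, 40), (1.75, 200)]
pattern = flow_pattern(capture, r=2)
```

Flows shorter than r + 1 packets raise an error here; a real system would need a policy for them, which the slides do not discuss.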
14
Building patterns and classes (2/2)
Class model: the concept of protocol fingerprint
▪ A protocol may be used for N different purposes
▪ Issue: how many classes must be considered?
▪ Two approaches to the issue:
1. Train the classifier with flows from a single target class (one-class classifier)
2. Add new classes composed of outlier flows to the analysis (multi-class classifier)
15
One-class tunnel detection algorithm
Algorithm definition: the decision function
▪ App = the application-layer protocol that is examined
▪ ω_t = the acceptance region (“legitimate” use of App)
▪ ω_r = the rejection region (complementary to ω_t)
▪ Given an unknown flow F, the algorithm compares its pattern representation with the fingerprint (for ω_t and ω_r) and returns an index of (dis-)similarity (anomaly score)
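A minimal sketch of a one-class decision of this kind follows. It assumes the fingerprint is one empirical histogram per packet position over (size bin, order of magnitude of Δt), and that the anomaly score is the mean negative log-probability of the flow under those histograms. The binning, the floor probability, the threshold, and all function names are illustrative assumptions, not the method defined in the slides.

```python
import math
from collections import Counter

SIZE_BIN = 100   # bytes per size bin (illustrative)
FLOOR = 1e-6     # probability given to bins never seen in training (illustrative)

def bin_of(size, dt):
    """Map a (size, inter-arrival time) pair to a coarse bin; dt must be > 0."""
    return (size // SIZE_BIN, math.floor(math.log10(dt)))

def build_fingerprint(training_flows):
    """One empirical bin-probability table per packet position."""
    r = min(len(flow) for flow in training_flows)
    fingerprint = []
    for i in range(r):
        counts = Counter(bin_of(*flow[i]) for flow in training_flows)
        total = sum(counts.values())
        fingerprint.append({b: c / total for b, c in counts.items()})
    return fingerprint

def anomaly_score(flow, fingerprint):
    """Mean negative log10-probability of the flow under the fingerprint."""
    n = min(len(flow), len(fingerprint))
    logs = [-math.log10(fingerprint[i].get(bin_of(*flow[i]), FLOOR))
            for i in range(n)]
    return sum(logs) / n

def is_legitimate(flow, fingerprint, threshold):
    """Accept (region omega_t) when the score is at or below the threshold."""
    return anomaly_score(flow, fingerprint) <= threshold

# Toy training set of legitimate flows: lists of (size_bytes, dt_seconds)
training = [
    [(300, 0.015), (1400, 0.02)],
    [(320, 0.03), (1450, 0.05)],
]
fingerprint = build_fingerprint(training)
legit_like = [(310, 0.02), (1420, 0.03)]
tunnel_like = [(60, 2.0), (80, 3.0)]
```

With a threshold of 3.0, `legit_like` falls in the acceptance region and `tunnel_like` in the rejection region, because none of its bins were seen in training.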
16
Tunnel Hunter can perform better if it is provided with more knowledge about the nature of the traffic. Multi-class classification adds an outlier class ω_o, which can reduce the number of cases where uncertainty lets through traffic that should have been rejected.
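One way to picture the effect of the outlier class is the sketch below: a flow is accepted only if its anomaly score against the legitimate fingerprint is under the threshold and it does not resemble the outlier class more closely. This combination rule and the name `classify_multiclass` are assumptions for illustration; the slides do not state how the scores are combined.

```python
def classify_multiclass(score_target, score_outlier, threshold):
    """Decide on a flow given its anomaly scores (lower = more similar).

    score_target: score against the legitimate fingerprint (omega_t).
    score_outlier: score against the outlier class (omega_o).
    The rule below is a hypothetical example, not the published algorithm.
    """
    if score_target > threshold:
        return "reject"   # too far from legitimate traffic
    if score_outlier < score_target:
        return "reject"   # borderline flow that looks more like an outlier
    return "accept"

# A borderline flow: a one-class test with threshold 3.0 would accept it,
# but it resembles the outlier class more closely, so it is rejected here.
borderline = classify_multiclass(score_target=2.5, score_outlier=1.0, threshold=3.0)
clear_legit = classify_multiclass(score_target=0.5, score_outlier=4.0, threshold=3.0)
```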
18
▪ Experiments cover HTTP and SSH
▪ Run on a 100Base-TX link
▪ Packet size s range: [40, 1500] bytes
▪ Inter-arrival time Δt range: [10^-7, 10^3] s
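Because Δt spans ten orders of magnitude, measurements are often easier to handle on a logarithmic scale. The sketch below clamps a (size, Δt) pair to the ranges above and rescales both values to [0, 1]. The log scaling and the name `normalize` are assumptions for this example; the slides give only the ranges.

```python
import math

S_MIN, S_MAX = 40, 1500        # packet size range in bytes (from the slide)
DT_MIN, DT_MAX = 1e-7, 1e3     # inter-arrival time range in seconds (from the slide)

def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))

def normalize(size, dt):
    """Clamp to the measurement ranges and rescale both features to [0, 1].

    Size is scaled linearly; dt is scaled on a log10 axis (an assumption).
    """
    s = clamp(size, S_MIN, S_MAX)
    t = clamp(dt, DT_MIN, DT_MAX)
    s_unit = (s - S_MIN) / (S_MAX - S_MIN)
    t_unit = ((math.log10(t) - math.log10(DT_MIN))
              / (math.log10(DT_MAX) - math.log10(DT_MIN)))
    return s_unit, t_unit

lowest = normalize(40, 1e-7)
highest = normalize(1500, 1e3)
```

Values outside the ranges, such as a zero inter-arrival time, are clamped to the nearest bound before scaling.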
19
The HTTP case (1/2)
▪ 20,000 flows used for gathering the training sets T_s and T″_s
▪ About 17,000 tunneled sessions were collected, divided among four protocols: POP3, SMTP, CHAT, P2P
▪ At the same time, about 15,000 non-tunneled sessions were collected to check whether the classifier lets legitimate HTTP traffic pass
20
The HTTP case (2/2)
21
The SSH case (1/2)
▪ 4,000 flows used for gathering the training sets T_s and T″_s
▪ About 10,000 tunneled sessions were collected, divided among four protocols: POP3, SMTP, CHAT, P2P
▪ At the same time, about 600 interactive sessions and about 1,700 bulk-transfer sessions were collected to check whether the classifier lets legitimate SSH/SCP traffic pass
22
The SSH case (2/2)
23
State A results (same as in one-class algorithm)
24
State B results
25
State C results
26
Tunnel Hunter problems
If an SSH tunnel is initially used for remote administration and only later for tunneling other protocols:
▪ The first state is legitimate, so the classifier will label the whole session as authorized
Sensitive to manipulation of packet sizes and timing values
27
▪ Tunnel Hunter can successfully recognize when a generic application protocol is tunneled on top of HTTP or SSH
▪ Increasing the knowledge of the system can significantly improve its performance
▪ The experimental results are very promising
▪ Virtually no legitimate traffic is blocked
▪ The vast majority of tunneled traffic is blocked
▪ Completeness near 100% (exactly 100% for HTTP)
28
Tunnel Hunter can be used to improve existing ALGs
▪ By augmenting their ability to recognize tunneled traffic
The model can be improved
▪ By introducing new variables and better studying the role of the existing variables, in order to produce stronger fingerprints
29
Questions?