Traffic Morphing: An Efficient Defense Against Statistical Traffic Analysis. Presented by Yang Gao, 11/2/2011. Authors: Charles V. Wright (MIT Lincoln Laboratory), Scott E. Coull (Johns Hopkins University), Fabian Monrose (University of North Carolina).
Outline: Potential Hazards; Countermeasures and Traffic Morphing; How It Works; Evaluation and Results
Privacy Security
Privacy Leakage: packet size and timing information, fed to classification tools, can reveal the language of a VoIP call, passwords typed in SSH, web browsing habits...
How does the attack happen? Web page browsing: Statistical Identification of Encrypted Web Browsing Traffic (Sun, Q. et al., Stanford University)
A sample of 2,000 pages drawn from 100,000 web pages. Only the number and sizes of objects are recorded. Similarity measure: Jaccard's coefficient. Trained classifier.
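As a rough illustration of the similarity measure named on this slide, the sketch below computes Jaccard's coefficient between two pages, each represented by the sizes of its objects. The function name `jaccard` and the sample sizes are hypothetical, and treating a page as a multiset of sizes (rather than a plain set) is an assumption of this sketch, not something the slide specifies.

```python
from collections import Counter

def jaccard(sizes_a, sizes_b):
    """Jaccard's coefficient |A intersect B| / |A union B| over multisets of object sizes."""
    a, b = Counter(sizes_a), Counter(sizes_b)
    union = sum((a | b).values())
    if union == 0:
        return 1.0  # two empty pages count as identical (a convention of this sketch)
    return sum((a & b).values()) / union

page_profile = [1200, 340, 340, 8800]   # hypothetical object sizes of a known page
observed = [1200, 340, 8800, 512]       # hypothetical sizes seen in an encrypted trace
print(jaccard(page_profile, observed))
```

A classifier built on this idea would compare an observed trace against each stored profile and report the best-scoring page.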
How does the attack happen? Web page browsing: Statistical Identification of Encrypted Web Browsing Traffic (Sun, Q. et al., Stanford University); Inferring the Source of Encrypted HTTP Connections (Marc Liberatore and Brian Neil Levine, UMass Amherst). VoIP: Language Identification of Encrypted VoIP Traffic
Results of the Classifiers
Countermeasures: padding; mimicking; morphing; sending at fixed time intervals (to counter timing analysis)
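To give a feel for what the padding countermeasure costs, here is a minimal sketch that pads every packet to one fixed size and reports the extra bytes sent. The 1500-byte pad size and the sample packet sizes are illustrative assumptions.

```python
def padding_overhead(packet_sizes, pad_to=1500):
    """Extra bytes sent, as a fraction of the original bytes, when padding to pad_to."""
    if any(s > pad_to for s in packet_sizes):
        raise ValueError("packet larger than the padded size")
    original = sum(packet_sizes)
    padded = pad_to * len(packet_sizes)
    return (padded - original) / original

sizes = [40, 40, 576, 1500, 300]        # hypothetical packet sizes in bytes
print(f"overhead: {padding_overhead(sizes):.1%}")
```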
Comparison
Traffic Morphing [figure: morphing]
How does the morphing work? Source: L1 L2 L1, with N_L1 : N_L2 = 2 : 1. Target: L2 L1 L2, with N_L1 : N_L2 = 1 : 2.
Traffic Morphing. Goals: close resemblance to the target packet size distribution; low overhead. Steps: morphing matrix construction.
Morphing Matrix: source sizes x1 ... xn, target sizes y1 ... yn; 2n equations and n^2 unknowns.
How to solve these equations? We do not solve them directly. Instead, use convex optimization: minimize a cost function subject to constraints.
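The counts on this slide can be written out as follows. This assumes the 2n equations are the n rows of Y = AX plus the n conditions that each column of A is a probability distribution; the entry name a_ij is notation introduced here for illustration.

```latex
\begin{aligned}
y_i &= \sum_{j=1}^{n} a_{ij}\, x_j, & i &= 1,\dots,n && \text{($n$ equations from } Y = AX\text{)}\\
\sum_{i=1}^{n} a_{ij} &= 1, & j &= 1,\dots,n && \text{($n$ equations, one per column)}\\
a_{ij} &\ge 0, & i,j &= 1,\dots,n && \text{($n^2$ unknowns)}
\end{aligned}
```

With 2n equations and n^2 unknowns the system is under-determined for n > 2, which is why a cost function is needed to pick one solution.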
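The slide calls for a general convex optimizer. The sketch below swaps in a much simpler technique that works only under stated assumptions: the cost is the expected absolute change in packet size, the only constraints are Y = AX and column sums of 1, and every source size has nonzero probability. Under those assumptions the problem is a one-dimensional transport problem, for which greedily matching the sorted sizes gives a minimum-cost solution. The names `solve_morphing_matrix` and `expected_cost` and the sample distributions are hypothetical.

```python
from fractions import Fraction as F

def solve_morphing_matrix(x, y):
    """Build a column-stochastic A with y = A x by greedily matching sorted sizes.

    x[j] and y[i] are the source and target probabilities of the j-th and
    i-th smallest packet size; every x[j] must be positive.
    """
    n = len(x)
    joint = [[F(0)] * n for _ in range(n)]   # joint[i][j] = P(target i and source j)
    left_x, left_y = list(x), list(y)
    i = j = 0
    while i < n and j < n:
        moved = min(left_y[i], left_x[j])    # move as much mass as both sides allow
        joint[i][j] += moved
        left_y[i] -= moved
        left_x[j] -= moved
        if left_y[i] == 0:
            i += 1
        elif left_x[j] == 0:
            j += 1
    # Convert the joint distribution into conditional probabilities per column.
    return [[joint[r][c] / x[c] for c in range(n)] for r in range(n)]

def expected_cost(A, x, sizes):
    """Expected absolute change in packet size (assumed cost function)."""
    n = len(x)
    return sum(A[i][j] * x[j] * abs(sizes[i] - sizes[j])
               for i in range(n) for j in range(n))

sizes = [40, 576, 1500]                      # hypothetical packet sizes, ascending
x = [F(1, 2), F(1, 4), F(1, 4)]              # hypothetical source distribution
y = [F(1, 4), F(1, 4), F(1, 2)]              # hypothetical target distribution
A = solve_morphing_matrix(x, y)
print(expected_cost(A, x, sizes))
```

Once extra constraints are added, this shortcut no longer applies and a real optimizer is needed.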
Example: source L1 L2 L1, target L2 L1 L2
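Reading this example as a source stream with L1 : L2 = 2 : 1 and a target stream with L1 : L2 = 1 : 2, one matrix that performs the morphing can be checked by hand. The matrix below is a hand-picked illustration, not necessarily the one an optimizer would return, and the source/target reading of the two streams is an assumption.

```python
from fractions import Fraction as F

x = [F(2, 3), F(1, 3)]          # source: P(L1), P(L2) from the 2 : 1 stream
y = [F(1, 3), F(2, 3)]          # target: P(L1), P(L2) from the 1 : 2 stream

# A[i][j] = P(send size i | source packet has size j); each column sums to 1.
A = [[F(1, 2), F(0)],
     [F(1, 2), F(1)]]

morphed = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
print(morphed == y)             # → True
```

In words: half of the L1 packets are sent as L2, and every L2 packet is left alone.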
Example: L1 L2 L1, L2. Reduce the packet size? Add more constraints to avoid this situation.
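One way to read the extra constraints is as forbidding any mapping from a source size to a smaller target size. That reading is an assumption. Under it, a candidate matrix can be checked with a few lines; the function name `shrinks_packets` and the two sample matrices are hypothetical.

```python
def shrinks_packets(A, sizes):
    """True if A ever maps a source size to a strictly smaller target size."""
    n = len(sizes)
    return any(A[i][j] > 0 and sizes[i] < sizes[j]
               for i in range(n) for j in range(n))

sizes = [100, 200]                        # hypothetical sizes of L1 and L2
only_grows = [[0.5, 0.0], [0.5, 1.0]]     # L1 may become L2, never the reverse
may_shrink = [[1.0, 0.5], [0.0, 0.5]]     # L2 may become L1
print(shrinks_packets(only_grows, sizes), shrinks_packets(may_shrink, sizes))
```

In an optimizer the same condition would appear as equality constraints fixing those entries of A to zero.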
Steps for Traffic Morphing. Matrix construction: (1) Select the source process and compute the probability distribution of its packet sizes. (2) Select the target process and compute the probability distribution of its packet sizes. (3) Solve for the morphing matrix with an optimization method that minimizes the cost subject to the constraints. Morphing: (1) Get the next packet to send. (2) Draw a random number to select an element in the matrix column for that packet's size. (3) Determine the corresponding target packet size. (4) Pad or reduce the packet to that size. (5) Transmit the new packet.
Traffic Morphing. Goals: close resemblance to the target packet size distribution; low overhead. Steps: morphing matrix construction; additional morphing constraints.
Pitfall 1: the system Y = AX is over-specified. Solution: multi-level programming. First, find the Z that is closest to Y: Z = A'X, minimize f_d(Y, Z). Then find the A that most efficiently maps X to Z: Z = AX, minimize f_0(A).
Traffic Morphing. Goals: close resemblance to the target packet size distribution; low overhead. Steps: morphing matrix construction; additional morphing constraints; dealing with large sample spaces.
Pitfall 2: poor scalability. A 2.8 GHz Pentium 4 runs for 1 hour on an 80x80 matrix with 6,560 constraints; packet sizes from 40 to 1500 bytes (the MTU) mean a 1460x1460 matrix. Solution: multi-level method (sub-matrix morphing).
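The per-packet morphing steps can be sketched as below. This is a minimal illustration, assuming the matrix is stored as a list of rows with A[i][j] = P(target size i | source size j) and that a reduced packet leaves leftover bytes to be sent later. The function name `morph_packet`, its return shape, and the sample matrix are hypothetical.

```python
import random

def morph_packet(payload_len, sizes, A, rng=random):
    """Pick a target size for one packet by sampling column j of A.

    Returns (target_size, pad_bytes, leftover_bytes).
    """
    j = sizes.index(payload_len)                    # column for the source size
    column = [A[i][j] for i in range(len(sizes))]   # P(target size | source size)
    target = rng.choices(sizes, weights=column, k=1)[0]
    if target >= payload_len:
        return target, target - payload_len, 0      # pad up to the target size
    return target, 0, payload_len - target          # reduced: leftover is sent later

sizes = [100, 200]                                  # hypothetical packet sizes
A = [[0.5, 0.0],
     [0.5, 1.0]]                                    # columns sum to 1
rng = random.Random(0)                              # seeded for repeatability
print([morph_packet(100, sizes, A, rng)[0] for _ in range(6)])
```

Over many packets the emitted sizes follow the target distribution, while each individual choice stays random.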
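The two levels can be written as a pair of optimization problems. Here the symbol C stands for the full constraint set (nonnegative entries, column sums of 1, and any additional morphing constraints); C and the starred symbols are notation introduced for this sketch, while f_d, f_0, A', and Z come from the slide.

```latex
\begin{aligned}
\text{Level 1:}\quad & Z^{*} = A'^{*} X, \qquad A'^{*} = \arg\min_{A' \in C} \; f_d\!\left(Y,\, A'X\right)\\
\text{Level 2:}\quad & A^{*} = \arg\min_{A \in C} \; f_0(A) \quad \text{subject to} \quad AX = Z^{*}
\end{aligned}
```

Level 1 accepts that Y itself may be unreachable and settles for the nearest reachable distribution; Level 2 then picks the cheapest matrix that reaches it.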
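The figures on this slide are consistent with a simple count: n^2 unknowns, 2n equality constraints, and one nonnegativity bound per entry, which gives 6,560 constraints for n = 80. That breakdown is an assumption made here to explain the number, and the function name `problem_size` is hypothetical.

```python
def problem_size(n):
    """Unknowns and constraints for an n x n morphing matrix (assumed breakdown)."""
    unknowns = n * n
    equalities = 2 * n          # n rows of Y = AX plus n column sums
    nonnegativity = n * n       # one a_ij >= 0 bound per entry
    return unknowns, equalities + nonnegativity

for n in (80, 1460):
    unknowns, constraints = problem_size(n)
    print(n, unknowns, constraints)
```

Going from 80 to 1460 sizes multiplies the number of unknowns by roughly 333, which is why the full matrix is not solved in one piece.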
Multi-level method
Traffic Morphing in sum. Goals: close resemblance to the target packet size distribution; low overhead. Steps: morphing matrix construction (convex optimization); additional morphing constraints (two-level multi-level programming); dealing with large sample spaces (sub-matrix morphing).
Evaluation: Encrypted Voice over IP; Web Page Identification; Defeating the Original Classifier; Evaluating Indistinguishability
Encrypted Voice over IP. "Language Identification of Encrypted VoIP Traffic: Alejandra y Roberto or Alice and Bob?" by Charles V. Wright, Lucas Ballard, Fabian Monrose, and Gerald M. Masson, Department of Computer Science, Johns Hopkins University.
White box encode
Why do even encrypted voice packets leak information? Unigram frequencies of bit rates.
2-gram resemblance
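The unigram and 2-gram statistics mentioned on these slides are relative frequencies of single values and of adjacent pairs in a packet sequence. A minimal sketch follows; the function name `ngram_frequencies` and the sample sequence are hypothetical.

```python
from collections import Counter

def ngram_frequencies(symbols, n=1):
    """Relative frequency of each length-n window in a sequence of sizes or bit rates."""
    windows = [tuple(symbols[k:k + n]) for k in range(len(symbols) - n + 1)]
    total = len(windows)
    return {gram: count / total for gram, count in Counter(windows).items()}

trace = [41, 46, 41, 60, 41, 46]          # hypothetical packet sizes in bytes
print(ngram_frequencies(trace, n=1))
print(ngram_frequencies(trace, n=2))
```

Morphing on single packets targets the unigram distribution; the 2-gram statistics show how much ordering information survives.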
Blackbox
Results for original classifier
Results for Indistinguishability
Overhead
Web page Identification
Overhead
Practical Considerations. Short network sessions: what if the source generates too few packets? Keep generating packets until a distance threshold is reached. Variations in source distribution: what if packet sizes differ between training and use? Divide and conquer. Reduced packet sizes: how to handle a reduced packet size in HTTP? Pack it into the next packet.
Traffic Morphing in a nutshell: resemblance; morphing matrix; convex optimization; overhead minimization; additional morphing constraints; dealing with large sample spaces; practical considerations (short network sessions, variations in source distribution, reduced packet sizes).
Conclusion: User privacy is vulnerable even when traffic is protected by encryption. Traffic morphing is effective and robust. Traffic morphing is applicable in practice. Traffic morphing is much more efficient than padding.
Discussion: the other side of morphing. Evading intrusion detection: mimicry attacks on system call sequences. [Figure: malicious call, call combination library; labels: deny, accept, morphing]
Thank you!