The Devil and Packet Trace Anonymization Authors: Ruoming Pang, Mark Allman, Vern Paxson and Jason Lee Published: ACM SIGCOMM Computer Communication Review,

1 The Devil and Packet Trace Anonymization
Authors: Ruoming Pang, Mark Allman, Vern Paxson and Jason Lee
Published: ACM SIGCOMM Computer Communication Review, Volume 36, Issue 1, January 2006
Presenter: Ping Wang

2 Overview
- Problem: how to anonymize packet traces before they are released
- Goal: preserve as much information as possible

3 Background
- Why share?
  - Verify previous results
  - Compare competing ideas on the same data
  - Provide a broader view
- Who shares?
  - NLANR’s PMA packet traces
  - CAIDA’s skitter measurement
  - LBNL’s internal traffic

4 Background cont.
- Available anonymization tools
  - tcpdpriv
  - ipsumdump
  - tcpurify
- These tools are not general enough; most of them focus only on header fields, primarily IP addresses

5 A new tool: tcpmkpub
- Provides a general framework for anonymizing traces
- Is based on explicit rules for each header field

6 An example specification
- All fields must be specified with a name, a length, and an action (“KEEP”, “ZERO”, or a function)
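The name/length/action rule can be illustrated with a minimal sketch. This is not tcpmkpub's specification syntax: the toy header layout, `reverse_bytes`, and `apply_rules` are hypothetical names made up for illustration.

```python
from typing import Callable, List, Tuple, Union

# An action is "KEEP", "ZERO", or a function from field bytes to new bytes.
Action = Union[str, Callable[[bytes], bytes]]
Field = Tuple[str, int, Action]  # (name, length in bytes, action)


def reverse_bytes(value: bytes) -> bytes:
    """Hypothetical stand-in for a real anonymization function."""
    return value[::-1]


# Hypothetical 8-byte header: not a real protocol and not tcpmkpub syntax.
TOY_HEADER: List[Field] = [
    ("version", 1, "KEEP"),
    ("reserved", 1, "ZERO"),
    ("src_id", 2, reverse_bytes),
    ("dst_id", 2, reverse_bytes),
    ("length", 2, "KEEP"),
]


def apply_rules(header: bytes, fields: List[Field]) -> bytes:
    """Walk the header field by field and apply each field's action."""
    out = bytearray()
    offset = 0
    for name, length, action in fields:
        value = header[offset:offset + length]
        if len(value) != length:
            raise ValueError(f"header too short for field {name!r}")
        if action == "KEEP":
            out += value              # copy the field unchanged
        elif action == "ZERO":
            out += bytes(length)      # overwrite with zero bytes
        else:
            out += action(value)      # field-specific transformation
        offset += length
    return bytes(out)
```

Because every byte of the header must be claimed by some field, a rule table like this makes it hard to leak a field by accident: anything not explicitly kept has to be zeroed or transformed.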

7 An example specification cont.
- Supports case statements for header fields that can vary

8 Anonymization Policy
- Checksums
- Link layer
- Network layer
- Transport layer

9 Checksums
- Replace the original checksum C_0 with C_c
- For checksums that cannot be verified:
  - If the packet has been corrupted, insert “1”
  - If the original packet is truncated, use C_c (and note this in the meta-data)
- Where the checksum is optional, as in UDP, use zero as the checksum
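The checksum policy above can be sketched as follows. Two assumptions are made here that the slide does not state: C_c is taken to be a checksum recomputed over the anonymized packet, and the zero case is taken to apply when the sender did not use the optional checksum. `choose_checksum` and its status labels are hypothetical; `internet_checksum` is the standard 16-bit one's-complement sum used by IP, TCP and UDP.

```python
from typing import Optional, Tuple


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement Internet checksum over data."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def choose_checksum(status: str, recomputed: int) -> Tuple[int, Optional[str]]:
    """Return (checksum to write, optional meta-data note).

    status is a hypothetical label for how the original checksum was judged:
    "valid", "corrupted", "truncated", or "unused".
    recomputed plays the role of C_c (assumed: computed on the anonymized packet).
    """
    if status == "valid":
        return recomputed, None
    if status == "corrupted":
        return 1, None                       # fixed marker value from the slide
    if status == "truncated":
        return recomputed, "checksum not verifiable: packet truncated"
    if status == "unused":
        return 0, None                       # optional checksum left as zero
    raise ValueError(f"unknown status {status!r}")
```

Keeping a marker for corrupted packets, rather than silently writing a valid checksum, lets users of the trace still count checksum failures.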

10 Link layer
- An Ethernet address is 6 bytes
  - The high 3 bytes identify the NIC vendor
- Scrambling the entire 6-byte address is not good for research
- Scrambling only the low 3 bytes leaves the vendor exposed
- Solution: remap the two parts separately
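Remapping the two halves separately might look like the sketch below. The keyed hash (HMAC-SHA256 truncated to 3 bytes) is an assumption standing in for whatever mapping the tool actually uses, and `remap_mac` is a hypothetical name. A truncated hash is not a true permutation, so collisions are possible in principle.

```python
import hashlib
import hmac


def _remap3(key: bytes, label: bytes, part: bytes) -> bytes:
    """Map a 3-byte value to another 3-byte value under a secret key."""
    return hmac.new(key, label + part, hashlib.sha256).digest()[:3]


def remap_mac(key: bytes, mac: bytes) -> bytes:
    """Remap the vendor half and the device half of a MAC independently."""
    if len(mac) != 6:
        raise ValueError("an Ethernet address must be 6 bytes")
    vendor = _remap3(key, b"vendor", mac[:3])   # same vendor -> same output
    device = _remap3(key, b"device", mac[3:])
    return vendor + device
```

With this split, two interfaces from the same vendor still share their high 3 bytes after anonymization, so vendor-level analysis remains possible without revealing which vendor it is.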

11 Network layer (1): IP addresses
- External addresses
  - Use the prefix-preserving address anonymization scheme proposed in another paper
- Internal addresses
  - Do not use the prefix-preserving address anonymization scheme
  - Use a prefix that is not used by any external address in the anonymized trace
  - Subnet and host portions are mapped separately
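To make "prefix-preserving" concrete, here is a minimal sketch in the spirit of bit-by-bit schemes such as Crypto-PAn. It is not necessarily the scheme the paper adopts: HMAC-SHA256 is swapped in as the pseudorandom function, and `anonymize_ipv4` and `common_prefix_len` are hypothetical names. Each output bit is the input bit XORed with a keyed function of the bits before it, so two addresses that share a k-bit prefix map to addresses that share exactly a k-bit prefix.

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(key: bytes, addr: int) -> int:
    """Prefix-preserving mapping of a 32-bit IPv4 address (sketch)."""
    result = 0
    for i in range(32):
        prefix = addr >> (32 - i)                 # the i bits before bit i
        msg = bytes([i]) + prefix.to_bytes(4, "big")
        flip = hmac.new(key, msg, hashlib.sha256).digest()[0] & 1
        bit = (addr >> (31 - i)) & 1
        result = (result << 1) | (bit ^ flip)     # flip depends only on the prefix
    return result


def common_prefix_len(a: int, b: int) -> int:
    """Number of leading bits two 32-bit addresses have in common."""
    return 32 - (a ^ b).bit_length()


key = b"trace-secret"                             # assumed secret, for illustration
a = int(ipaddress.IPv4Address("10.1.2.3"))
b = int(ipaddress.IPv4Address("10.1.2.200"))
shared = common_prefix_len(anonymize_ipv4(key, a), anonymize_ipv4(key, b))
```

Here `a` and `b` share a 24-bit prefix, and so do their anonymized forms. This structure is exactly what makes the scheme useful for research and also what makes it sensitive: anyone who learns one mapping learns something about its neighbors.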

12 Network layer (2)
- Scanners
  - Many organizations run a scanner as part of their security operations
  - Scanners tend to hit addresses in some order, like a.b.c.1, a.b.c.2, a.b.c.3, etc.
  - Keep the scanner’s IP address uniform across the trace and flag it in the meta-data; for the destinations of the scans, use a different mapping
- For example, suppose X1 and X2 belong to one subnet Y:
  - In traffic that does not involve a scanner, they map to X’1, X’2 in subnet Y’
  - In traffic that involves a scanner, they map to X’’1, X’’2 in subnets Z1 and Z2

13 Network layer (3)
- Multicast addresses
  - Preserved
- Private addresses
  - Preserved
- Invalid addresses
  - Remap them as if the subnet existed, but note this in the meta-data

14 Transport layer
- Preserve both port numbers and sequence numbers
- Rewrite timestamp options
  - Transform the timestamps into separate increasing counters
  - Reason: clock drift manifested in timestamp options can be leveraged to fingerprint a physical machine
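One way to picture the timestamp rewrite is a per-host counter that keeps the order of timestamp values but discards their magnitude and rate. This is a simplified sketch under stated assumptions, not the tool's algorithm: `TimestampRewriter` is a hypothetical name, and the sketch ignores 32-bit wraparound, reordered packets, and the echoed timestamp field.

```python
from typing import Dict, Hashable, Tuple


class TimestampRewriter:
    """Replace each host's timestamp values with a per-host increasing counter."""

    def __init__(self) -> None:
        # host -> (largest original value seen so far, counter assigned to it)
        self._state: Dict[Hashable, Tuple[int, int]] = {}

    def rewrite(self, host: Hashable, tsval: int) -> int:
        if host not in self._state:
            self._state[host] = (tsval, 1)
            return 1
        last, counter = self._state[host]
        if tsval > last:
            # A new, larger timestamp advances the counter by exactly one,
            # so the tick rate (and hence clock drift) is no longer visible.
            counter += 1
            self._state[host] = (tsval, counter)
        return counter


rewriter = TimestampRewriter()
rewritten = [rewriter.rewrite("hostA", v) for v in (1000, 1000, 1250, 9000)]
```

In this run the original gaps of 250 and 7750 ticks both collapse to a step of one, which is the information loss the policy is after.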

15 Testing
- Can the transformed traces really be used?
  - Use p0f to do OS fingerprinting
  - Use tcpsum to find the number of packets and bytes in both the original and transformed traces

16 Testing cont.
- Are the transformed traces really anonymous?
  - Check tcpmkpub’s own log file
  - Look for known strings in the anonymized traces, e.g. “Document”, “Setting”, “ConfirmFIleOp”
  - Look for values that look like IP addresses
  - Look for string versions of IP addresses
  - Look for MAC addresses
  - Check timestamps
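The string and address checks above can be approximated by scanning the anonymized bytes for known sensitive values. This is a minimal sketch, not the validation procedure used in the paper: `find_leaks` is a hypothetical helper, and the address used is from a documentation range.

```python
import ipaddress
from typing import Iterable, List, Tuple


def find_leaks(data: bytes,
               strings: Iterable[str],
               addresses: Iterable[str]) -> List[Tuple[str, str]]:
    """Report sensitive strings and original IPv4 addresses found in data."""
    hits: List[Tuple[str, str]] = []
    for text in strings:
        if text.encode("ascii") in data:
            hits.append(("string", text))
    for addr in addresses:
        ip = ipaddress.IPv4Address(addr)
        if ip.packed in data:                    # 4-byte binary form
            hits.append(("binary", addr))
        if str(ip).encode("ascii") in data:      # dotted-quad text form
            hits.append(("text", addr))
    return hits


sample = b"\x00\x01" + bytes([192, 0, 2, 7]) + b" GET /Document from 192.0.2.7"
leaks = find_leaks(sample, ["Document", "Setting"], ["192.0.2.7"])
```

A binary match on only 4 bytes can occur by chance in a large trace, so hits from a scan like this are candidates for manual inspection rather than proof of a leak.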

17 Paper contributions
- Develops a tool, tcpmkpub, for implementing arbitrary anonymization policies
- Uses meta-data to help researchers deal with lost information
  - Invalid checksums, scanner IP addresses
- Goes beyond IP address obfuscation to explore many other dangerous details
  - Timestamps, Ethernet addresses, etc.

18 Paper weaknesses
- Gives only two experiments to show that the anonymized traces are useful
- Could have given some anonymization results to make the policy clearer
  - For example, in the scanner case, what addresses a.b.c.1, a.b.c.2, a.b.c.3 would look like if they are involved in scanning traffic, and what they would look like if not

19 Future work
- Keep more consistency between the original and anonymized traces
- Study online anonymization
- Provide an easy-to-use tool for validating the anonymized traces
- Provide a tool for creating an anonymization policy for tcpmkpub

20 Questions?
