Balancing Risk and Utility in Flow Trace Anonymization

Balancing Risk and Utility in Flow Trace Anonymization
Martin Burkhart, ETH Zurich
burkhart@tik.ee.ethz.ch
Joint work with Daniela Brauckhoff, Elisa Boschi, Martin May

Motivation
- Sharing of traffic measurements is crucial
  - Only a limited set of sources available
  - Reproducibility of results
  - Dynamics / variability of traffic
  - Get the big picture (e.g. Internet Storm Center)
  - Keep up with globalized attacks (e.g. botnets)
- More and more traces are collected but not shared
  - Data protection legislation
  - Security concerns
  - Competitive advantage

State-Of-The-Art: Anonymization
- Black marking
- Truncation
  - E.g. last bits of IP addresses
- Permutation
  - Random
  - (Partial) prefix-preserving IP address permutation
- Enumeration
  - E.g. timestamps: keep the logical order of events
- Categorization
- Randomization (data mining community)
- K-anonymity (data mining community)
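Three of the techniques listed above can be illustrated with a minimal Python sketch. The function names, the "0.0.0.0" marker, and the choice to permute only over the addresses observed in the trace are assumptions made for illustration; the slides do not prescribe an implementation.

```python
import random

def black_mark(_value, marker="0.0.0.0"):
    """Black marking: replace every value with one fixed marker (assumed marker)."""
    return marker

def random_permutation(values, seed=None):
    """Random permutation: build a 1:1 mapping over the distinct observed values.

    Simplification (assumption): pseudonyms are drawn from the observed set
    itself rather than from the full address space.
    """
    distinct = sorted(set(values))
    shuffled = distinct[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(distinct, shuffled))

def enumerate_timestamps(timestamps):
    """Enumeration: replace each timestamp by its rank, keeping only the order."""
    rank = {t: i for i, t in enumerate(sorted(set(timestamps)))}
    return [rank[t] for t in timestamps]

addresses = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "192.168.1.7"]
mapping = random_permutation(addresses, seed=42)
print([black_mark(a) for a in addresses])
print([mapping[a] for a in addresses])
print(enumerate_timestamps([30.5, 10.0, 20.2, 10.0]))
```

Because the permutation is 1:1, equal inputs always map to equal outputs, which is the property that makes such mappings open to the re-identification attacks discussed in the case study.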

The Tradeoff in Anonymization
- It's a trade-off
- RU-Maps
  - t: anonymization strength
  - X-axis: Utility(t)
  - Y-axis: Risk(t)
- Not quantitatively studied, lack of metrics
- Strongly dependent on the application / attacker model
[Figure: RU-map of Risk(t) over Utility(t) with points for Algorithm X at t=0.1, t=0.2, t=0.4 and t=0.7, for Prefix Pres. and for Random Perm., and a marked "Sweet Spot"]

A Case Study: IP Address Truncation
- Techniques that permute IP addresses 1:1 are reversible
  - Characteristic object sizes/frequencies, behavioral profiling, fingerprint active ports, exploit prefix structure
- Apply IP address truncation and evaluate the risk and utility dimensions
  - Lower risk: hosts are aggregated to subnets
  - Lower utility: resolution of entities is reduced
- Quantifying the tradeoff: how bad is it in numbers?

IP address      8 bits trunc.  16 bits trunc.
123.45.67.89    123.45.67.0    123.45.0.0
123.45.67.123   123.45.67.0    123.45.0.0
123.45.12.34    123.45.12.0    123.45.0.0
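The truncation examples on this slide can be reproduced with a short Python sketch. It assumes that "truncating x bits" means zeroing the x least significant bits of the 32-bit address, which matches the example values; `truncate_ipv4` is a name chosen here for illustration.

```python
import ipaddress

def truncate_ipv4(addr: str, bits: int) -> str:
    """Zero the `bits` least significant bits of an IPv4 address."""
    if not 0 <= bits <= 32:
        raise ValueError("bits must be between 0 and 32")
    value = int(ipaddress.IPv4Address(addr))
    # Keep the upper (32 - bits) bits, clear the rest.
    mask = (0xFFFFFFFF >> bits) << bits
    return str(ipaddress.IPv4Address(value & mask))

for ip in ("123.45.67.89", "123.45.67.123", "123.45.12.34"):
    print(ip, truncate_ipv4(ip, 8), truncate_ipv4(ip, 16))
```

After 8 bits of truncation the first two hosts become indistinguishable, and after 16 bits all three do: this is the aggregation of hosts to subnets that lowers both risk and utility.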

Internal vs. External Prefixes
- Asymmetry in prefixes. Is this reflected in
  - Risk reduction?
  - Utility reduction?
[Figure: unique count (log scale) over prefix length (32-x) for external and internal (AS 559) addresses; annotations at x = 8: "Factor 3" and "Factor 53"]

Measuring Utility of Truncated Data
- Specific application: anomaly detection
- Compare detection quality of scans and (D)DoS attacks in original and truncated data
- Two IP-based metrics
  - Unique address count
  - Address entropy
- 3 weeks of NetFlow data
  - ~43 billion flows
  - SWITCH network
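The two IP-based metrics can be sketched as follows. The slides do not define them formally, so this sketch assumes that both are computed over the addresses seen in one time bin and that "address entropy" is the base-2 Shannon entropy of the empirical address distribution; the function names are illustrative.

```python
import math
from collections import Counter

def unique_count(addresses):
    """Number of distinct addresses observed in a time bin."""
    return len(set(addresses))

def address_entropy(addresses):
    """Shannon entropy (bits) of the empirical address distribution.

    Assumption: each flow contributes one observation of its address.
    """
    counts = Counter(addresses)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

flows = ["123.45.67.89", "123.45.67.123", "123.45.12.34", "123.45.67.89"]
print(unique_count(flows), address_entropy(flows))
```

Running the same functions on truncated addresses gives the metric values for the anonymized trace, which can then be compared against the original time series.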

Measuring Detection Quality
- Ground truth: manual identification of scans/(D)DoS attacks
- Run a Kalman filter on metric time series
- Utility measured by AUC (area under the ROC curve)
- Vary threshold
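A rough sketch of this evaluation pipeline is given below. The slides do not specify the filter model, so a scalar random-walk Kalman filter with hypothetical noise variances is assumed, and the absolute innovation is used as the anomaly score. The AUC is computed with the rank formulation, which equals the area under the ROC curve obtained by varying the detection threshold.

```python
def kalman_residuals(series, process_var=1e-2, measurement_var=1.0):
    """Absolute innovations of a scalar random-walk Kalman filter.

    Assumed model: x_k = x_{k-1} + w, z_k = x_k + v. The variances are
    illustrative defaults, not values from the presentation.
    """
    estimate, error_var = float(series[0]), 1.0
    residuals = [0.0]
    for z in series[1:]:
        error_var += process_var              # predict step
        innovation = z - estimate
        residuals.append(abs(innovation))     # anomaly score for this bin
        gain = error_var / (error_var + measurement_var)
        estimate += gain * innovation         # update step
        error_var *= 1.0 - gain
    return residuals

def auc(scores, labels):
    """Probability that an anomalous bin outscores a normal bin (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal bin")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

metric = [100, 101, 99, 100, 250, 100, 101, 99]   # toy metric time series
truth = [0, 0, 0, 0, 1, 0, 0, 0]                  # manually labeled anomaly
print(auc(kalman_residuals(metric), truth))
```

Computing the AUC once on the original trace and once on each truncated trace gives a single utility number per truncation level.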

Utility of Truncated Data
- Internal metrics degrade faster than external metrics
- Counts degrade faster than entropy

Approximating Risk of Host Identification
- In general: truncation of x bits leads to 2^(32-x) prefixes with 2^x addresses per prefix
- But: only a fraction (A) of potential addresses is usually active
- Hence: on average A*2^x active addresses per prefix
[Figure: host numbers 1, 2, 3, ... 255 within the prefix 129.130.80., e.g. A = 10%]
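The counting argument above is easy to check numerically. The first three functions follow the slide directly. The last one, `identification_risk`, is an assumption added here: it reads risk as the chance of picking the right host among the active addresses of a prefix, 1 / (A*2^x), capped at 1; the transcript itself does not state the risk formula.

```python
def prefixes_after_truncation(x: int) -> int:
    """Number of distinct prefixes left after truncating x bits."""
    return 2 ** (32 - x)

def addresses_per_prefix(x: int) -> int:
    """Number of potential addresses sharing one truncated prefix."""
    return 2 ** x

def expected_active_hosts(x: int, active_fraction: float) -> float:
    """Average number of active addresses per prefix: A * 2^x."""
    return active_fraction * 2 ** x

def identification_risk(x: int, active_fraction: float) -> float:
    """ASSUMPTION: risk as 1 / (A * 2^x), capped at 1 (not from the slides)."""
    if not 0 < active_fraction <= 1:
        raise ValueError("active_fraction must be in (0, 1]")
    return min(1.0, 1.0 / expected_active_hosts(x, active_fraction))

x, A = 8, 0.10
print(prefixes_after_truncation(x), addresses_per_prefix(x))
print(expected_active_hosts(x, A), identification_risk(x, A))
```

With x = 8 and A = 10%, a truncated /24 prefix hides about 25.6 active hosts on average, so sparser address ranges (smaller A) leave fewer hosts to hide among.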

Risk of Truncated Data
- Risk for external addresses is higher due to sparsity!
- Constant offset:
[Figure: risk of truncated data; annotations "(total: 2.2 million)" and "(total: 4.3 billion)"]

The Risk-Utility Tradeoff
[Figure: risk-utility curves with points for no truncation, 4 bits, 8 bits, 12 bits and 16 bits; the best tradeoff is marked]

Metric            x   Utility  Risk
internal entropy   8  0.94     0.035
internal entropy  12  0.87     0.002
external entropy  16  0.97     0.02

Conclusion
- We made a quantitative evaluation of the risk-utility tradeoff in anonymization
- Entropy is much more resistant to truncation than unique counts
- Risk and utility degrade faster for internal addresses
- For detection of scans and (D)DoS attacks, it is possible to get a good tradeoff with high utility and low risk

Thank You for the Attention