Outsourcing Security Analysis with Anonymized Logs
Jianqing Zhang, Nikita Borisov, William Yurcik
2nd International Workshop on the Value of Security through Collaboration
Friday, September 1, 2006

Motivation
  Managed Security Service Providers: security outsourcing is a trend
    Security monitoring is getting more complicated and sophisticated
    Economical: assemble skilled security professionals
    Effective: shared security infrastructure across organizational boundaries
  Challenges
    Sensitive data is shared
    Data protected by privacy laws
    Valuable information to competitors
    Useful information to adversaries

Managed Security Service Provider

Problem Statement
  What are the criteria for log anonymization that sufficiently protect privacy and guarantee the MSSP's efficiency?

Contributions
  Case studies of common attack types based on classic logs
  Derive a common set of anonymization criteria
    Retain time interval dependence between records
    Pseudonymize the external IP addresses re-identifiably
    Pseudonymize the internal IP addresses re-identifiably and preserve some network topology information
  First step for privacy-preserving MSSPs

NetFlows and Syslogs
  NetFlows: network-based log
    Timestamps
    IP address pairs (source/destination)
    Port pairs (source/destination)
    …
  Syslog: host-based log
    Application level critical events

Which Data is Sensitive?
  Identity information
    External (source) IP: partner, common guest and adversary
    Internal (destination) IP: internal user
  System privacy & security
    Timestamp: when the transactions happen
    Destination port number: services and applications hosted on the system
    Subnet number: internal network structure
    Number of records: overall resource usage

Log Anonymization Mechanisms
  Timestamp anonymization
    Time unit annihilation
    Random time shifts
    Enumeration
  IP address anonymization
    Truncation
    Random permutation
    Prefix-preserving pseudonymization
  Port number anonymization
    Bilateral classification
    Black marker anonymization
    Random permutation

Traffic Traces Logs: Port Scan
  Scan all ports of a single host:
    Source: same address, different port numbers
    Destination:
      Same address
      Different ports (sequentially)
    In a short time
  [Table of sample flow records starting at 18:56; columns: Start time | SrcIPaddr | SrcPort | DstIPaddr | DstPort | P | Pkts]

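The port scan signature above relies only on equality of addresses, distinctness of destination ports, and closeness in time, so it can in principle be checked on pseudonymized records. The following is a minimal sketch of such a check, not the method used by the authors. The flow tuple layout `(start_seconds, src_ip, src_port, dst_ip, dst_port)`, the function name `detect_port_scans`, and the thresholds `window` and `min_ports` are all assumptions made for illustration.

```python
from collections import defaultdict


def detect_port_scans(flows, window=10.0, min_ports=20):
    """Return (src, dst) pairs where one source hits many distinct ports
    of one destination within `window` seconds.

    flows: iterable of (start_seconds, src_ip, src_port, dst_ip, dst_port).
    Addresses may be raw or pseudonymized; only equality is used.
    """
    by_pair = defaultdict(list)
    for t, src, _sport, dst, dport in flows:
        by_pair[(src, dst)].append((t, dport))

    suspects = set()
    for pair, events in by_pair.items():
        events.sort()
        start = 0                      # left edge of the sliding window
        counts = defaultdict(int)      # dst port -> occurrences in window
        for t, dport in events:
            counts[dport] += 1
            # Drop events that fell out of the time window.
            while t - events[start][0] > window:
                old = events[start][1]
                counts[old] -= 1
                if counts[old] == 0:
                    del counts[old]
                start += 1
            if len(counts) >= min_ports:
                suspects.add(pair)
                break
    return suspects
```

Because the check never inspects the value of an address, a consistent pseudonym mapping leaves its result unchanged. The "sequentially" property of the ports, however, is only visible if destination ports are retained.
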
Traffic Traces Logs: DoS/DDoS
  SYN Flood
    Source: same address, same (or different) port numbers
    Destination:
      Same address
      Same port (intended for a particular protocol or application)
    Protocol / Packets / Packet size
    In a short time
  [Table of sample flow records starting at 21:47; columns: Start time | SrcIPaddr | SrcPort | DstIPaddr | DstPort | P | Pkts | B/Pk]

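As a rough illustration of the SYN flood signature above, the sketch below counts flows per destination address and port in fixed time buckets. It is a simplified assumption-laden example: the tuple layout, the name `detect_floods`, and the thresholds are hypothetical, and the protocol, packet count, and packet size fields mentioned on the slide are deliberately left out.

```python
from collections import Counter


def detect_floods(flows, window=1.0, min_flows=100):
    """Return (dst_ip, dst_port) targets that receive at least `min_flows`
    flows inside one `window`-second bucket.

    flows: iterable of (start_seconds, src_ip, src_port, dst_ip, dst_port).
    """
    buckets = Counter()
    for t, _src, _sport, dst, dport in flows:
        # Tumbling window: each flow falls into exactly one bucket.
        buckets[(dst, dport, int(t // window))] += 1
    return {(dst, dport)
            for (dst, dport, _bucket), n in buckets.items()
            if n >= min_flows}
```

The bucket index depends only on differences that survive a constant time shift up to bucket alignment, which is one reason the interval between records matters more to the analyst than the absolute start time.
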
Anonymization Constraints on Traffic Traces Logs
  Timestamp (Start Time)
    Event intervals and time dependence should be retained
  Anonymization
    Time unit annihilation
    Random time shifts
    Enumeration

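To make the constraint concrete, the sketch below applies the three timestamp mechanisms to a small batch and compares the intervals that remain. It is an illustration under assumptions: the function names are hypothetical, "time unit annihilation" is interpreted here as zeroing the seconds field, and a single random offset is applied to the whole batch. Python's `random` module is used for brevity and is not a cryptographic source of randomness.

```python
import random
from datetime import datetime, timedelta


def annihilate_seconds(timestamps):
    """Time unit annihilation: zero out the seconds (one possible unit)."""
    return [t.replace(second=0, microsecond=0) for t in timestamps]


def random_shift(timestamps, max_shift_seconds=86400, rng=None):
    """Random time shift: add one secret offset to every record in the batch."""
    rng = rng or random.Random()
    offset = timedelta(seconds=rng.randint(1, max_shift_seconds))
    return [t + offset for t in timestamps]


def enumerate_order(timestamps):
    """Enumeration: replace each timestamp by its rank, keeping only order."""
    rank = {t: i for i, t in enumerate(sorted(set(timestamps)))}
    return [rank[t] for t in timestamps]


def intervals(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


ts = [datetime(2006, 9, 1, 18, 56, 1),
      datetime(2006, 9, 1, 18, 56, 3),
      datetime(2006, 9, 1, 18, 56, 8)]
shifted = random_shift(ts, rng=random.Random(7))
```

With this input, `intervals(shifted)` equals `intervals(ts)`, while annihilating the seconds collapses all three records to the same minute and enumeration keeps only the order 0, 1, 2. Of the three, only the shift retains the "in a short time" evidence.
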
Anonymization Constraints on Traffic Traces Logs (cont.)
  Source/Destination IP address
    Anonymized and re-identifiable
    Retain virtual network topology (dest.)
  Anonymization
    Truncation
    Random permutation (pseudonyms): source (external) IP address
    Prefix-preserving pseudonymization: destination (internal) IP address

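The two IP mechanisms recommended above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `PseudonymTable` keeps a random one-to-one mapping that the log owner can invert, and `prefix_preserving` flips each address bit according to a keyed function of the bits before it, which is the general idea behind prefix-preserving schemes. HMAC-SHA256 is used here as the keyed function purely as an assumption for the example; all names are hypothetical.

```python
import hashlib
import hmac
import ipaddress
import secrets


class PseudonymTable:
    """Random one-to-one pseudonyms; the owner keeps the table to re-identify."""

    def __init__(self):
        self.forward = {}
        self.reverse = {}

    def pseudonymize(self, ip):
        if ip not in self.forward:
            while True:
                candidate = str(ipaddress.IPv4Address(secrets.randbits(32)))
                if candidate not in self.reverse:
                    break
            self.forward[ip] = candidate
            self.reverse[candidate] = ip
        return self.forward[ip]

    def reidentify(self, pseudonym):
        return self.reverse[pseudonym]


def _to_bits(ip):
    return format(int(ipaddress.IPv4Address(ip)), "032b")


def prefix_preserving(ip, key):
    """Bit i of the output is bit i of the input XOR a keyed bit that
    depends only on input bits 0..i-1, so shared prefixes stay shared."""
    bits = _to_bits(ip)
    out = []
    for i, bit in enumerate(bits):
        digest = hmac.new(key, bits[:i].encode(), hashlib.sha256).digest()
        out.append(str(int(bit) ^ (digest[0] & 1)))
    return str(ipaddress.IPv4Address(int("".join(out), 2)))


def common_prefix_len(ip_a, ip_b):
    """Number of leading bits two IPv4 addresses have in common."""
    n = 0
    for x, y in zip(_to_bits(ip_a), _to_bits(ip_b)):
        if x != y:
            break
        n += 1
    return n
```

Two internal hosts on the same subnet therefore map to pseudonyms on a common (pseudonymized) subnet, which retains the virtual topology, while the table-based mapping for external addresses reveals nothing beyond equality.
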
Anonymization Constraints on Traffic Traces Logs (cont.)
  Source/Destination port number
    Contains sensitive information
    More efficient if retained
  Anonymization
    Bilateral classification
    Black marker anonymization
    Random permutation

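The three port mechanisms differ in how much an analyst can still tell from the result. The sketch below is illustrative only: it assumes bilateral classification splits ports at 1024 (below versus at or above), that the black marker is the constant 0, and that the permutation is derived from a secret seed. The function names and the sentinel values are hypothetical, and `random.Random` is not a cryptographic generator.

```python
import random


def bilateral_classification(port):
    """Keep only the class of the port: below 1024 or not (assumed boundary)."""
    return 0 if port < 1024 else 65535


def black_marker(port, marker=0):
    """Replace every port with the same constant; all information is removed."""
    return marker


def make_port_permutation(seed):
    """Random permutation of the 16-bit port space, reproducible from `seed`.
    perm[p] is the pseudonym of port p."""
    ports = list(range(65536))
    random.Random(seed).shuffle(ports)
    return ports


perm = make_port_permutation(seed=2006)
scan_ports = [21, 22, 23, 24]
permuted = [perm[p] for p in scan_ports]
```

A permutation keeps "same port" and "different ports" distinguishable, so a flood on one port or a scan over many ports is still visible, but the sequential order of scanned ports and the identity of the service are hidden. Black marking removes even the distinction between ports, which is the efficiency cost the slide refers to.
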
Active Operating System Fingerprinting
  Syslog
  Syslog + Tcplog
  [Table of sample log records; columns: Time Stamp | Host Name (IP) | Message | Source Port | Dest. Port]

Anonymization Constraints on Syslog
  Attribute | Anonymization Constraints | Recommended Anonymization
  Start Time | Retain event intervals and time dependence | Random time shifts
  Source IP Address | Anonymized and re-identifiable | Pseudonyms
  Source Port | More efficient if retained | Pseudonyms
  Dest. IP Address | Retain virtual network topology; re-identifiable if anonymized | Pseudonyms + prefix-preserving
  Dest. Port | More efficient if retained; re-identifiable if anonymized | Pseudonyms
  Msg. | Retained | --

Sensitive Data After Anonymization
  Traffic volumes
    Batched upload
    Aggregate volumes
    Dummy log records
      Sacrifice the efficiency at the MSSP
      False positives and false negatives
  Size of customer base; customer retention
    Change the pseudonym mappings periodically
  Structure of the internal network
    Simple pseudonyms
    Periodic rotation of pseudonyms
    Policy dependent

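One way to picture periodic rotation of pseudonyms is to derive each pseudonym from a secret, the address, and the index of the current rotation period. The sketch below assumes a keyed hash (HMAC-SHA256) and a daily period; the function name, the period, and the 16-character truncation are hypothetical choices for illustration, not something the slides specify.

```python
import hashlib
import hmac


def rotating_pseudonym(ip, secret, timestamp, period=86400):
    """Pseudonym for `ip` that changes every `period` seconds.

    Within one period the mapping is consistent, so records can still be
    correlated; across periods the same address gets unrelated pseudonyms.
    """
    epoch = int(timestamp // period)
    message = f"{epoch}|{ip}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()[:16]


secret = b"customer-secret"
day0_morning = rotating_pseudonym("10.1.2.3", secret, 0)
day0_evening = rotating_pseudonym("10.1.2.3", secret, 86399)
day1 = rotating_pseudonym("10.1.2.3", secret, 86400)
```

Here the first two values are identical and the third differs. The log owner, who holds the secret, can re-identify a pseudonym by recomputing it over its own address list for the relevant period. The trade-off is that attacks spanning a rotation boundary become harder for the MSSP to correlate.
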
Conclusion
  Sensitive data should be anonymized for security monitoring
  Constraints on log anonymization
  Sensitive data leakage after anonymization and countermeasures
  There is a trade-off between privacy and efficiency

Future Work
  Analyze other attacks
    Anonymization strategies for a wide range of attacks
    Patterns of attack detection and general principles
  Study other log formats and types
  Analyze correlation of different logs across different organizations

Q & A
  Jianqing Zhang, Nikita Borisov, William Yurcik

Anonymization Constraints on Traffic Traces Logs
  Attribute | Anonymization Constraints | Recommended Anonymization
  Start Time | Retain event intervals and time dependence | Random time shifts
  Source IP Address | Anonymized and re-identifiable | Pseudonyms
  Source Port | More efficient if retained | Pseudonyms
  Dest. IP Address | Retain virtual network topology; anonymized and re-identifiable | Pseudonyms + prefix-preserving
  Dest. Port | More efficient if retained | --

Port Scan (cont.)
  Portmap scan:
    Source: same address, different port numbers
    Destination: various addresses, same port (portmap daemon)
    In a short time
  [Table of sample flow records starting at 10:53; columns: Start time | SrcIPaddr | SrcPort | DstIPaddr | DstPort | P | Pkts]

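The portmap scan differs from the single-host scan in that one source probes the same port on many destinations. A minimal sketch of that check follows, using fixed time buckets. The tuple layout, the name `detect_horizontal_scans`, and the thresholds are assumptions for illustration; port 111 in the usage is the conventional portmap port.

```python
from collections import defaultdict


def detect_horizontal_scans(flows, window=10.0, min_hosts=20):
    """Return (src_ip, dst_port) pairs where one source contacts at least
    `min_hosts` distinct destinations on the same port in one time bucket.

    flows: iterable of (start_seconds, src_ip, src_port, dst_ip, dst_port).
    """
    targets = defaultdict(set)
    for t, src, _sport, dst, dport in flows:
        targets[(src, dport, int(t // window))].add(dst)
    return {(src, dport)
            for (src, dport, _bucket), hosts in targets.items()
            if len(hosts) >= min_hosts}


flows = [(i * 0.1, "ext-9", 30000 + i, f"int-{i}", 111) for i in range(25)]
flows.append((0.3, "ext-4", 41000, "int-2", 80))
scanners = detect_horizontal_scans(flows)
```

With these sample records, `scanners` contains only `("ext-9", 111)`. The check needs destination addresses to stay distinguishable and the destination port to stay comparable across records, which pseudonyms provide and black marking does not.
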
DoS/DDoS (cont.)
  Distributed SYN Flood
    Source: different addresses, different port numbers
    Destination:
      Same address
      Same port (intended for a particular protocol)
    Protocol / Packets / Packet size
    In a short time
  [Table of sample flow records starting at 19:08; columns: Start time | SrcIPaddr | SrcPort | DstIPaddr | DstPort | P | Pkts | B/Pk]
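
For the distributed case the telling quantity is the number of distinct sources converging on one target. The sketch below counts them and then repeats the count on records whose source addresses have been replaced by consistent pseudonyms, to illustrate why re-identifiable pseudonyms do not hurt this kind of detection. The tuple layout, function names, and the `ext-N` label format are hypothetical.

```python
from collections import defaultdict
from itertools import count


def distinct_sources_per_target(flows, window=1.0):
    """Map (dst_ip, dst_port, bucket) to the number of distinct sources.

    flows: iterable of (start_seconds, src_ip, src_port, dst_ip, dst_port).
    """
    sources = defaultdict(set)
    for t, src, _sport, dst, dport in flows:
        sources[(dst, dport, int(t // window))].add(src)
    return {key: len(srcs) for key, srcs in sources.items()}


def pseudonymize_sources(flows):
    """Replace each source address with a consistent opaque label.
    Returns the new records and the mapping kept by the log owner."""
    mapping, ids = {}, count(1)
    out = []
    for t, src, sport, dst, dport in flows:
        if src not in mapping:
            mapping[src] = f"ext-{next(ids)}"
        out.append((t, mapping[src], sport, dst, dport))
    return out, mapping


raw = [(i * 0.01, f"198.51.100.{i}", 2000 + i, "10.0.0.5", 80)
       for i in range(60)]
anon, mapping = pseudonymize_sources(raw)
```

Both `distinct_sources_per_target(raw)` and `distinct_sources_per_target(anon)` report 60 sources for the same target and bucket, so the analyst sees the same signal without learning the real external addresses.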