A Privacy-Preserving Interdomain Audit Framework Adam J. Lee Parisa Tabriz Nikita Borisov University of Illinois, Urbana-Champaign WPES 2006
Security Auditing Necessary for the maintenance of secure and robust systems Logs contain sensitive information Often performed centrally within one organization
Motivation for Distributed Audit Coordinated attacks are a growing threat [1] –Correlated network reconnaissance –Application-level abuses But there is still that whole privacy thing… Privacy-Preserving Distributed Audit Now we can… Detect coordinated attacks Avoid single point of failure Analyze data otherwise protected under privacy legislation
[1] S. Katti, B. Krishnamurthy, and D. Katabi. Collaborating Against Common Enemies. Internet Measurement Conference.
Practical Scenarios Virtual Organizations Grid Computing Research Labs Organizations with multiple sites [Figure: Privacy Policy Spectrum running between Raw Logs and Anonymized Logs, with "This work" marked on it]
Plan of Action… 1. System Architecture 2. Threat Model 3. Log Obfuscation Techniques 4. Implementation and Evaluation 5. Discussion and Future Work
System Architecture [Figure labels: Audit Group, Auditor, Organization, Alert!]
Threat Model The Organizations… –Keep secrets secret –May try to probe other organizations The Auditor… –An “honest, but curious” adversary –Probabilistic guarantees against a Byzantine adversary
Data Formats Identifiers (e.g. DEBUG, WARN) Numbers (e.g. 80, 3.14) Trees (e.g. ) Partially Ordered Sets (e.g. RBAC systems) Lists (e.g. packet header fields)
Obfuscation Levels Full Disclosure Local Exact Match Portion Dropping Local Prefix Match Local Greater-Than Basic Numeric Transformations Local Blinded Arithmetic Complete Obfuscation
Local Exact Match Suppose we want the auditor to check whether the value of a log field matches another value, without leaking any information about the value itself… Use a keyed-hash MAC to obfuscate the value –The original data can only be recovered by brute-force search over the space of possible values [Figure: identifier values Warn, Error, Debug]
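A minimal sketch of the exact-match idea, assuming HMAC-SHA256 as the keyed-hash MAC and a key shared by the organizations in the audit group but not the auditor. The slide names neither the hash function nor the key arrangement, and `obfuscate_exact` is a hypothetical helper name.

```python
import hashlib
import hmac


def obfuscate_exact(key: bytes, value: str) -> str:
    """Replace a field value with its keyed-hash MAC (HMAC-SHA256 assumed)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Assumed: the audit group shares this key and the auditor never sees it.
group_key = b"example-audit-group-key"

log_levels = ["Warn", "Error", "Debug", "Error", "Warn"]
tokens = [obfuscate_exact(group_key, level) for level in log_levels]

# The auditor can compare tokens for equality without learning the values.
same_level = tokens[1] == tokens[3]   # both were "Error"
diff_level = tokens[0] == tokens[2]   # "Warn" vs "Debug"
```

Because the MAC is deterministic under a fixed key, equal values produce equal tokens. Without the key, the auditor cannot mount the brute-force search over a small value space such as log levels.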
Local Prefix Match Suppose we are only interested in certain IP address subnets matching in a log field… Use the keyed-hash MAC construction on each “portion” of a hierarchical log field. –Compared to other prefix-preserving schemes, can be done in one pass
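One way to sketch the per-portion construction, under assumptions the slide does not spell out: each portion's MAC covers the whole prefix up to and including that portion, so equal octets under different parent prefixes do not collide, and HMAC-SHA256 truncated to 16 hex characters stands in for the keyed hash. `obfuscate_prefix` and `shared_prefix_len` are hypothetical names.

```python
import hashlib
import hmac


def obfuscate_prefix(key: bytes, field: str, sep: str = ".") -> list:
    """MAC each portion of a hierarchical field in one left-to-right pass."""
    parts = field.split(sep)
    tokens = []
    for i in range(len(parts)):
        # Assumption: the MAC input is the prefix up to portion i.
        prefix = sep.join(parts[: i + 1])
        mac = hmac.new(key, prefix.encode("utf-8"), hashlib.sha256)
        tokens.append(mac.hexdigest()[:16])
    return tokens


def shared_prefix_len(a: list, b: list) -> int:
    """Number of leading portions on which two obfuscated fields agree."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


key = b"example-audit-group-key"
host_a = obfuscate_prefix(key, "192.168.1.10")
host_b = obfuscate_prefix(key, "192.168.7.3")
host_c = obfuscate_prefix(key, "10.0.0.1")
```

Here `shared_prefix_len(host_a, host_b)` reports that the two addresses share their first two portions, while the auditor sees only opaque tokens.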
Local Greater-Than Suppose we want to know if some user belongs to a group role in a system… Represent a transformed poset as a Bloom filter to test set membership [Figure: role poset with nodes User, Student, Staff, Graduate, Undergrad]
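A rough sketch of how a poset element could be carried as a Bloom filter, with several assumptions: the role hierarchy below is guessed from the figure labels (Graduate and Undergrad under Student, Student and Staff under User), and each role's filter holds the keyed hashes of itself and every role above it. The `BloomFilter` class, its parameters, and the helper names are illustrative, not the authors' implementation.

```python
import hashlib
import hmac


class BloomFilter:
    """Tiny Bloom filter backed by a Python int used as a bit set."""

    def __init__(self, m_bits: int = 1 << 16, k: int = 4):
        self.m, self.k, self.bits = m_bits, k, 0

    def _positions(self, item: bytes):
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# Hypothetical hierarchy (child -> parent); a general poset would allow
# several parents per node.
PARENT = {"Student": "User", "Staff": "User",
          "Graduate": "Student", "Undergrad": "Student"}


def blind(key: bytes, role: str) -> bytes:
    return hmac.new(key, role.encode("utf-8"), hashlib.sha256).digest()


def obfuscate_role(key: bytes, role: str) -> BloomFilter:
    """Bloom filter holding the blinded role and all roles above it."""
    bf = BloomFilter()
    node = role
    while node is not None:
        bf.add(blind(key, node))
        node = PARENT.get(node)
    return bf


key = b"example-audit-group-key"
graduate = obfuscate_role(key, "Graduate")
is_student = blind(key, "Student") in graduate
is_staff = blind(key, "Staff") in graduate
```

The auditor answers "does this user belong to role R?" with a membership test on blinded tokens. Bloom filters can return false positives, so the filter size and hash count would need tuning for real data.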
Local Blinded Summation Suppose we want to provide daily summary reports on intrusions and alerts to all audit members without leaking information about actual statistics. Use homomorphic encryption –Given the complexity of homomorphic computation, appropriate for batched processing
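To illustrate additively homomorphic summation, here is a toy Paillier sketch. The slide does not name a cryptosystem, so Paillier is an assumption, and the two small Mersenne primes are for demonstration only and far too small for real security. It needs Python 3.8+ for `pow(x, -1, n)`.

```python
import math
import secrets


def keygen(p: int = 2**31 - 1, q: int = 2**61 - 1):
    """Toy Paillier key pair with g = n + 1 (demo-sized primes, not secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                               # inverse of lam mod n
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1               # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n: int, priv, c: int) -> int:
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


n, priv = keygen()
daily_alert_counts = [3, 5, 7]          # one count per organization (made up)
ciphertexts = [encrypt(n, count) for count in daily_alert_counts]

total_ct = ciphertexts[0]
for ct in ciphertexts[1:]:
    total_ct = add_encrypted(n, total_ct, ct)

total = decrypt(n, priv, total_ct)      # only a key holder can do this step
```

In this sketch the party summing ciphertexts never sees an individual count; who holds the decryption key is a deployment choice the slide leaves open.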
A Basic Implementation [Figure labels: Organization, Auditor, IDS Logs, Application Logs, Traffic Logs, GLO, Analysis Engine, Alert Manager, Alert!]
Evaluation On a standard computer… –P4 2.5GHz Processor, 512 MB RAM, Linux, blah, blah The processing rates are reasonable… –NCSA IDS rates: ~30 records/second –GLO fastest: complete obfuscation of a number, poset, or identifier at ~20,000 records/second –GLO slowest: prefix-preserving match on a tree at ~7,000 records/second A typical network log is processed fast enough… –A log similar to tcpdump processes at ~3,500 records/second
Catching Liars and Cheaters How do we assure the auditor is running the correct software? Trusted computing platforms How can we detect false or incomplete alarms? Sign logs to verify alerts Plant fake log sequences How do we detect probing organizations? Define rules to detect gaming
Information Disclosure Fields in logs are often related Common knowledge can circumvent obfuscation (the crowd boos) Choose data fields to be reported carefully Consider functional dependencies
Future Work Combating information leakage Standard log conversion and optimized obfuscation Investigation into distributed attack detection Key management protocol for audit group
Cliff’s Notes Architecture and obfuscation methods for privacy-preserving distributed audit An encouraging evaluation of obfuscation techniques Some challenges and incentive for further research Questions?