Collaborative learning for security and repair in application communities
MIT site visit
April 10, 2007
Conclusion

Application Community Benefits
Increased accuracy
–Collect and process more data (behavior variations)
Amortized risk
–Each member is a sentry: failures yield information
–Evaluate proposed fixes (patches) in many situations
–A community can afford to sacrifice a few members
Shared burden
–Distribute tasks: monitoring, evaluating patches, etc.

How the community cooperates
[Diagram labels: Monitor, Learn, Create, Analyze, Monitor, Enforce]

Two protection approaches
Constraints approach
–Detect code injection, crashes
–Learn constraints correlated with problems
–Avoid problems by avoiding bad states
–Evaluate multiple fixes
Genealogy (DNA) approach
–Assign new executions to whitelist or blacklist
–Use similarities to other executions

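The "learn constraints correlated with problems" step can be illustrated with a minimal sketch. It is not the project's actual algorithm. The assumptions are that each run reports which learned constraints it violated and whether it failed, and that a simple difference of violation rates is an adequate score. The function name, the data layout, and the scoring rule are all hypothetical.

```python
from collections import Counter


def rank_constraints(runs):
    """Rank constraints by how strongly their violation tracks failure.

    `runs` is a list of (violated_constraints, failed) pairs (hypothetical
    layout). The score is P(violated | failed) - P(violated | succeeded),
    so it lies in [-1, 1]; higher means more failure-correlated.
    """
    failed_total = sum(1 for _, failed in runs if failed)
    ok_total = len(runs) - failed_total
    in_failed, in_ok = Counter(), Counter()
    for violated, failed in runs:
        # Count each constraint at most once per run.
        (in_failed if failed else in_ok).update(set(violated))
    scores = {}
    for c in set(in_failed) | set(in_ok):
        p_fail = in_failed[c] / failed_total if failed_total else 0.0
        p_ok = in_ok[c] / ok_total if ok_total else 0.0
        scores[c] = p_fail - p_ok
    # Best candidates for enforcement come first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A constraint violated in every failing run and in no successful run scores 1.0. One violated equally often in both scores 0.0 and is a poor candidate for enforcement.
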
Attacks protected against
Handles the most important attacks in practice:
–Execution of Malicious Code
    Memory-based (constraints approach)
    Script-based (constraints approach)
    Executable-based (genealogy approach)
–Denial of Service (constraints approach)
Attacks not handled:
–Privilege escalation
–Cross-site scripting
–Weak/missing permissions
–Information leak (but see Stephen McCamant’s work)

Accomplishments
New approach to detection
–Fewer false positives than constraint violation
Instrumentation of stripped Windows binaries
–Variables and program points in binaries
Technique for creating LiveShield patches
Investigated real exploits
Program genealogy approach and experiments

Future work: Constraints approach
Logging: Based on detected problems, select subset of program points to examine
Instrumentation: scaling, expressiveness
Determine which constraints to enforce
Generate multiple repairs for violated constraints
Evaluate repairs, select the best one(s)
Evaluate on more real exploits

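One way to picture "evaluate repairs, select the best one(s)" is a small sketch under stated assumptions: community members each try a candidate repair and report whether the run succeeded, and repairs are ranked by observed success rate. The report format, the `min_trials` cutoff, and the tie-breaking rule are hypothetical and not taken from the project.

```python
def select_repairs(reports, min_trials=3):
    """Return repair ids ordered best-first by observed success rate.

    `reports` is an iterable of (repair_id, succeeded) pairs, one per trial
    run by some community member (hypothetical format). Repairs with fewer
    than `min_trials` trials are left out: too little evidence to rank them.
    """
    trials, successes = {}, {}
    for repair_id, succeeded in reports:
        trials[repair_id] = trials.get(repair_id, 0) + 1
        successes[repair_id] = successes.get(repair_id, 0) + (1 if succeeded else 0)
    rated = [
        (successes[r] / trials[r], trials[r], r)
        for r in trials
        if trials[r] >= min_trials
    ]
    # Higher success rate first; more trials breaks ties; then id for stability.
    rated.sort(key=lambda t: (-t[0], -t[1], t[2]))
    return [r for _, _, r in rated]
```

Spreading the trials across many members is what lets the community amortize the risk of a bad repair.
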
Future work: Genealogy approach
Release Determina infrastructure to researchers
–Closed proprietary code, open ‘client’ interface
–For reverse engineering, tracing, application communities
More fully investigate malware family recognition
–Implement the signature and trace databases
–Sand-boxed execution before classification

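The genealogy idea of assigning a new execution to a whitelist or blacklist by similarity to known executions can be sketched as a nearest-neighbor lookup. This is only an illustration. It assumes each execution is summarized as a set of features, and uses Jaccard similarity with a fixed threshold. The feature sets, the similarity measure, and the threshold are hypothetical choices, not the project's signature or trace format.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def classify(features, whitelist, blacklist, threshold=0.5):
    """Return 'whitelist', 'blacklist', or 'unknown' for a new execution.

    The new execution takes the label of its most similar known execution,
    unless even the best match falls below the (hypothetical) threshold.
    """
    best_label, best_score = "unknown", 0.0
    for label, known in (("whitelist", whitelist), ("blacklist", blacklist)):
        for other in known:
            score = jaccard(features, other)
            if score > best_score:
                best_label, best_score = label, score
    return best_label if best_score >= threshold else "unknown"
```

An "unknown" result is where sand-boxed execution before classification would fit: the execution runs in isolation until enough evidence is gathered.
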
Future work: Red Team evaluation
Rules of engagement for Red Team evaluation

Evaluation goals (from proposal)
At the end of the project (30 months)
–Injected code attacks
    Detect 95%
    Repair 60% of those
–Attacks that ‘damage the information representation’
    Detect 50%
    Repair 30% of those
Proposed 18 month goals
–Meet injected code attack goal

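Reading "repair X% of those" as a rate conditional on detection, the goals imply an overall repair rate of 0.95 × 0.60 = 0.57 for injected code attacks and 0.50 × 0.30 = 0.15 for attacks that damage the information representation. The sketch below makes that arithmetic explicit and checks measured counts against a goal pair. The function names and the count-based interface are hypothetical.

```python
def overall_repair_rate(detect_rate, repair_rate_of_detected):
    """Fraction of all attacks that end up repaired."""
    return detect_rate * repair_rate_of_detected


def meets_goal(attacks, detected, repaired, detect_goal, repair_goal):
    """Check measured counts against a (detect, repair-of-detected) goal pair.

    `repaired` counts repaired attacks among the detected ones, so the
    repair rate is measured relative to `detected`, not to `attacks`.
    """
    if attacks == 0 or detected == 0:
        return False
    return (detected / attacks >= detect_goal
            and repaired / detected >= repair_goal)
```

For example, a hypothetical Red Team run with 100 injected code attacks, 96 detected and 60 of those repaired, would meet the 95% / 60% goal.
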
Proposed 30 month goals
Meet original injected code goals
‘Damage the information representation’ goals:
–Define ‘damage the information representation’ as crashes
    This will miss some information representation attacks
    It will also catch attacks that don’t damage information representation
    Reasonable compromise that is clearly defined
–Detect 50% and fix 30% of those (as per the proposal)