DARPA
Challenges for Anomaly Detection of Program Exploits
Anup K. Ghosh, Ph.D., DARPA/ATO
JHU Workshop on Intrusion Detection, Johns Hopkins University
June 13, 2002
Overview
– Detecting code-driven threats
– Prior work in program anomaly detection
– Applying anomaly detection to Windows processes
– Challenges to anomaly detection
Code-Driven Threats
Background
– 20 years of intrusion detection research has yielded tools sometimes capable of detecting malicious hackers
– A viable commercial anti-virus industry has emerged in the same period in the wake of PC viruses
However…
While both approaches are very good at detecting known attacks and viruses, they do not perform well at:
– detecting novel attacks and malicious code
– scaling to Internet-wide attacks
– responding in computer time
A New Threat… Code-Driven Attacks
– Malicious hackers spend their time breaking into systems one at a time
– Code-driven attacks are written once, unleashed everywhere
How Big is this Problem?
– Code Red cost an estimated $2.6 billion
– Worms will continue to exploit vulnerabilities in online software
[Figure: newly reported vulnerabilities to CERT/CC from 1995 to … Copyright IEEE, Security and Privacy, supplement to IEEE Computer.]
Why Don’t Existing Defenses Work?
Code-driven attacks:
– Go through firewalls unimpeded
– Go unnoticed by intrusion detection systems
– Propagate too fast for anti-virus vendors to disseminate signatures in time
– Have complete access to our network and file systems
– Execute with our own privileges
– Can send sensitive information out over networks
– Can spy on our computer and Web usage patterns
The Future is Ominous: Nimda Was a Harbinger
Future worms will be:
– Architecture independent
– Stealthy, hiding their processes from their victims
– Autonomous, so they can migrate independently
– Intelligent, so they can learn new exploits on the fly
– Polymorphic, to avoid signature detection
– Programmable, so they can learn vulnerabilities and be remotely controlled
This Problem Requires New Thinking
Consider:
– Intrusion detection techniques are designed to handle Internet and network-based attacks
– Anti-virus software is designed to address malicious code attacks
But neither handles code-driven attacks effectively.
We need either to learn from the strengths of these approaches or to develop a new approach entirely.
Prior Work in Program-Based Anomaly Detection
Intrusion Detection Approaches
Misuse detection
– Scans packets, logs, and commands for known malicious patterns (pattern matching)
– Upside: known attacks can be detected
– Downside: unknown, novel threats are not detected; the approach is reactionary
Anomaly detection
– Detects intrusions by statistical aberrations from normal usage
– Upside: novel or unknown intrusions can be detected
– Downside: well-known intrusions may go undetected
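The contrast between the two approaches can be sketched in a few lines. This is a minimal illustration, not a description of any deployed system: the signature strings, the baseline numbers, the z-score test, and the names `misuse_detect` and `anomaly_detect` are all assumptions made for the example.

```python
import statistics
from typing import Iterable, Sequence


def misuse_detect(payload: bytes, signatures: Iterable[bytes]) -> bool:
    """Misuse detection: flag the payload if it contains any known bad pattern."""
    return any(sig in payload for sig in signatures)


def anomaly_detect(value: float, baseline: Sequence[float],
                   z_threshold: float = 3.0) -> bool:
    """Anomaly detection: flag a measurement that deviates strongly from
    the baseline (a simple z-score test, chosen only for illustration)."""
    mean = statistics.mean(baseline)
    stdev = statistics.pstdev(baseline)
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > z_threshold


# Hypothetical signature list and baseline of requests per minute.
signatures = [b"/known-bad"]
baseline = [10, 11, 9, 10, 10, 12, 8]

print(misuse_detect(b"GET /known-bad HTTP/1.0", signatures))  # known attack
print(misuse_detect(b"GET /novel-bad HTTP/1.0", signatures))  # novel attack
print(anomaly_detect(30, baseline))                           # unusual load
```

The novel request slips past the signature list, while the statistical test has no notion of what the attack is, only that the measurement is unusual.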
Network-Based vs. Host-Based Intrusion Detection
Network-based
– Scans network packet logs for signatures of intrusive activities
– Increasing bandwidth is a challenge
– End-to-end encryption could make this approach obsolete
Host-based
– Scans machine audit logs for signatures of intrusive activities
– Traditionally monitors users’ behavior
– Many sensors/hosts require enterprise management
Process-Based Anomaly Detection
Premise of the process-based approach: “Abnormally behaving programs are a primary indicator of computer misuse.”
Approach:
– Build program behavior profiles for monitored programs and use these to detect intrusions
Goals of Process-Based Anomaly Detection
– Learn benign program behavior
– Generalize from observed behavior
– Flag deviations from learned behavior
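One simple way to picture a program behavior profile is a table of short event sequences seen during benign runs. The sketch below is an assumption-laden illustration of that general idea, not a reconstruction of any of the systems named in this talk: the window length `n`, the event names, and the functions `learn_profile` and `anomaly_score` are hypothetical.

```python
from typing import Iterable, Iterator, Sequence, Set, Tuple

Window = Tuple[str, ...]


def ngrams(trace: Sequence[str], n: int) -> Iterator[Window]:
    """Yield every run of n consecutive events in a trace."""
    for i in range(len(trace) - n + 1):
        yield tuple(trace[i:i + n])


def learn_profile(traces: Iterable[Sequence[str]], n: int = 3) -> Set[Window]:
    """Learn benign behavior: record every window seen in benign traces."""
    profile: Set[Window] = set()
    for trace in traces:
        profile.update(ngrams(trace, n))
    return profile


def anomaly_score(trace: Sequence[str], profile: Set[Window], n: int = 3) -> float:
    """Flag deviations: fraction of windows in the trace never seen in training."""
    windows = list(ngrams(trace, n))
    if not windows:
        return 0.0
    unseen = sum(1 for w in windows if w not in profile)
    return unseen / len(windows)


# Hypothetical benign traces for one monitored program.
benign = [
    ["open", "read", "write", "close"],
    ["open", "read", "read", "close"],
]
profile = learn_profile(benign)

print(anomaly_score(["open", "read", "write", "close"], profile))
print(anomaly_score(["open", "read", "exec", "close"], profile))
```

A pure lookup table like this generalizes only weakly: any trace stitched together from previously seen windows is accepted, and anything else is flagged. Alerting on a score threshold rather than on a single unseen window is one crude way to tolerate benign variation.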
Cigital’s Three Systems for Anomaly Detection
– Recurrent neural network
– String transducer
– State tester
Summary of Cigital System Performance
Scope: detects program misuse, mainly user-to-root (U2R) attacks.
– Recurrent neural network: 100% of U2R attacks at a rate of 3 false alarms (FA) per day
– String transducer: 100% of U2R attacks at a rate of 3 FA/day
– State tester: 100% of U2R attacks at a rate of 9 FA/day
Comparison, Strengths, Weaknesses
– The systems perform comparably; the short training times of the string transducer and state tester make them more desirable
– Detects program misuse attacks very reliably with few false alarms
– Will not detect attacks on programs that are not monitored, or attacks that are legitimate uses of programs
Performance as a Function of Training Data
[Figure: the horizontal axis represents the percentage of available data used for training. The vertical axis is the percentage of sessions creating false alarms when all possible attacks are detected. Curves: table lookup, string transducer, state tester.]
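The vertical-axis quantity can be computed as follows, under one reading of the figure description: pick the loosest alert threshold that still catches every attack session, then count the benign sessions that would also trip it. The session scores and the function name are hypothetical, and the assumption that each session reduces to a single anomaly score is mine.

```python
from typing import Sequence


def false_alarm_pct_at_full_detection(benign_scores: Sequence[float],
                                      attack_scores: Sequence[float]) -> float:
    """Percentage of benign sessions that raise an alarm when the threshold
    is set just low enough to detect every attack session."""
    if not benign_scores or not attack_scores:
        raise ValueError("need at least one benign and one attack session")
    threshold = min(attack_scores)  # lowest-scoring attack must still alert
    false_alarms = sum(1 for s in benign_scores if s >= threshold)
    return 100.0 * false_alarms / len(benign_scores)


# Hypothetical per-session anomaly scores.
benign = [0.0, 0.1, 0.4, 0.05]
attacks = [0.3, 0.9]
print(false_alarm_pct_at_full_detection(benign, attacks))
```

Repeating this calculation for models trained on increasing fractions of the data would give one point per fraction, which is the shape of curve the figure describes.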
Applying Anomaly Detection to Windows Processes
Approach for Windows NT
– Collects system events and identifies anomalous patterns
– Ported to use Windows NT/2000 base-object audit data
– Cigital algorithms show high performance with low false positive rates
Using strace for NT
– Data needs to be collected as it is created and streamed to the ID system; NT auditing does not meet these requirements
Advantages of using strace for NT:
– Provides additional information such as thread IDs
– Can be altered to stream data directly to our system
– Selectively captures the system calls that we need
– Can be turned on and off on the fly
Collecting Data in Real Time
– Streams of events arrive from multiple processes and multiple threads and need to be sorted accordingly
[Diagram: Events → Process Splitter]
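The splitting step can be sketched as grouping an interleaved event stream by process and thread. The `Event` record, its field names, and the sample call names are assumptions for illustration; the actual strace for NT output format is not shown in the slides.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Event(NamedTuple):
    pid: int      # process ID
    tid: int      # thread ID
    syscall: str  # name of the system call observed


def split_events(events: Iterable[Event]) -> Dict[Tuple[int, int], List[str]]:
    """Route each event to the stream for its (process, thread) pair,
    preserving arrival order within each stream."""
    streams: Dict[Tuple[int, int], List[str]] = defaultdict(list)
    for ev in events:
        streams[(ev.pid, ev.tid)].append(ev.syscall)
    return dict(streams)


# Hypothetical interleaved capture from two processes.
captured = [
    Event(100, 1, "NtCreateFile"),
    Event(200, 7, "NtReadFile"),
    Event(100, 1, "NtWriteFile"),
    Event(100, 2, "NtClose"),
    Event(200, 7, "NtClose"),
]
for key, calls in split_events(captured).items():
    print(key, calls)
```

In a live system the per-stream lists would be replaced by queues or callbacks so that each event is handed to the detector as it arrives.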
Performing Anomaly Detection
– Data from each application must be matched with the appropriate model, and the state must be updated by the ID algorithm
[Diagram labels: Algorithm, State, Model, New State]
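A rough sketch of this matching and state-update loop follows. It assumes each model is a set of allowed event windows and each state is the sliding window of recent events for one stream; both are illustrative choices, as are the class name `Detector`, the program name `webserver.exe`, and the event names.

```python
from collections import deque
from typing import Deque, Dict, Hashable, Set, Tuple

Model = Set[Tuple[str, ...]]


def step(model: Model, state: Deque[str], event: str) -> bool:
    """Advance the state by one event; return True if the new state is anomalous."""
    state.append(event)  # deque(maxlen=n) drops the oldest event automatically
    if len(state) < state.maxlen:
        return False     # not enough history yet to judge
    return tuple(state) not in model


class Detector:
    def __init__(self, models: Dict[str, Model], n: int = 3) -> None:
        self.models = models  # one model per monitored application
        self.n = n
        self.states: Dict[Tuple[str, Hashable], Deque[str]] = {}

    def observe(self, app: str, stream_id: Hashable, event: str) -> bool:
        model = self.models.get(app)
        if model is None:
            return False  # unmonitored programs are never flagged
        state = self.states.setdefault((app, stream_id), deque(maxlen=self.n))
        return step(model, state, event)


# Hypothetical model for one monitored program.
detector = Detector({"webserver.exe": {("accept", "read", "send")}})
for ev in ["accept", "read", "send", "spawn"]:
    print(ev, detector.observe("webserver.exe", (100, 1), ev))
```

Keeping state per (application, stream) pair means interleaved threads do not corrupt one another's history.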
Performance Against Code Red
– 11-fold cross-validation
– Includes 2 Code Red attack traces
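For readers unfamiliar with k-fold cross-validation, the sketch below shows how trace indices might be partitioned so that each fold is held out once for testing while the rest are used for training. The count of 22 traces and the round-robin assignment are assumptions; the slides do not say how the folds were formed.

```python
from typing import Iterator, List, Tuple


def k_fold_splits(n_items: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k folds."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and the number of items")
    # Round-robin assignment: item i goes to fold i % k.
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, test


# Hypothetical data set of 22 traces split into 11 folds.
for train, test in k_fold_splits(22, 11):
    print(len(train), test)
```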
Anomaly Detection Challenges
Training
Statistical and machine learning techniques that rely on baseline behavior profiles require extensive training.
– Training is time consuming
– Training determines the quality of results
– Training in one environment may not map well to another environment
– Overtraining is a problem for some classes of machine learning
False Positives
– Operators have low thresholds for false positives
– An acceptable rate might be < 1 per day
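As a small worked example of that budget, the rate can be normalized to alarms per day and compared against the limit. The alarm count, the monitoring period, and the function names are hypothetical.

```python
def false_alarms_per_day(false_alarms: int, monitored_hours: float) -> float:
    """Normalize a false alarm count to a per-day rate."""
    if monitored_hours <= 0:
        raise ValueError("monitored_hours must be positive")
    return false_alarms * 24.0 / monitored_hours


def within_operator_budget(rate_per_day: float, limit_per_day: float = 1.0) -> bool:
    """True if the rate is strictly below the acceptable limit."""
    return rate_per_day < limit_per_day


# Hypothetical: 21 false alarms over one week (168 hours) of monitoring.
rate = false_alarms_per_day(21, 168)
print(rate, within_operator_budget(rate))
```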
Identification
– Anomaly detection approaches tell you when something is wrong. They do not tell you what is wrong, which specific attack is executing, or where it is coming from.
Real-Time Response
– Once an intrusion is detected, systems need to identify, alert, isolate, and respond according to local security policies
Summary
– Much work has been performed in process-based anomaly detection
– Many challenges remain…
– Foremost among them: can we leverage process-based anomaly detection to detect future code-driven threats?
Questions?
Anup Ghosh
For more info, see: C. Michael & A. Ghosh, “Simple state-based approaches to program-based anomaly detection”, to appear in ACM Transactions on Information and System Security (TISSEC), 2002.