Slide 1
Probabilistic Validation of Intrusion Tolerance
Not for public distribution.
Intrusion Tolerance by Unpredictable Adaptation (ITUA)
Presented by William Sanders and Michel Cukier
OASIS PI Meeting, August 21, 2002
Slide 2
Motivation
Aim of intrusion tolerance:
– Increase the likelihood that an application will be able to continue to operate correctly in spite of malicious attacks that may occur and may result in successful intrusions
Before intrusion tolerance can be accepted as an approach to providing security, techniques must be developed to validate its efficacy
Validation should be done:
– During all phases of the design process, to make design choices
– During testing and operation, to gain confidence that the “amount” of intrusion tolerance provided is as advertised
Slide 3
Our Approach
We take a total-lifecycle approach to validation, using:
– Probabilistic analytic models throughout the system lifecycle
– Detailed simulation models as the design matures
– Intrusion injection and controlled experimentation on an implemented prototype
– Detailed models combined with results from experimentation to build an overall model
– Red teaming on the complete prototype
Models have two components:
– A model of the attacker, the system, and the workload the system is required to support
– A set of measures that provide estimates of the desired survivability properties
Slide 4
Validation Throughout System Life Cycle
[Figure: validation activities mapped onto the system life cycle. Diagram labels: High-Level Design; Detailed Design; Prototype Implementation; High-Level Analytic/Simulation Model; Analysis/Detailed Simulation; Intrusion Injection & Controlled Experimentation; Red Teaming; Specification of Attacks/Faults to Consider; Specification of Attacks/Faults to Inject; Specification of Workload]
Slide 5
Proposed Survivability Model Structure
[Figure: block diagram of the model structure. Diagram labels: Survivability Measure; Resource/Privilege State; Intrusion-Tolerance Mechanism; Application Workload; Attacker; System]
Slide 6
Outline
Will report on two new models:
– High-level analytic model of the IP Tables/SNORT control loop
– Detailed model, numerically analyzed to understand fine-grained tradeoffs between system and environmental parameter values in the ITUA Replication Management Scheme
Will report on preliminary results obtained from data collection and the link to model parameters
Slide 7
Modeling an ITUA Control Loop: Resource Consumption
The IP Tables/SNORT control loop:
– Monitor network ports for suspicious activity
– Respond to suspected attacks by filtering traffic
– When monitoring and response are local to a host, response can be quick
Model:
– Attack uses a single port
   each request on this port reserves a resource
   the resource is limited
– Attack requests cannot initially be distinguished from legitimate requests
   e.g., source is spoofed and random
   otherwise, selective filtering would be possible
– Defense times out attack requests
   e.g., TCP SYN flood: attacker does not correctly execute the protocol
   timeout interval is randomly varied to prevent prediction
– Attack has a maximum rate
Slide 8
Possible Defense Strategy: Periodically Close Port
[Figure: timeline of alternating open and close intervals for the port. Diagram labels: resource exhaustion; time; average timeout]
Slide 9
Optimal Strategy
Maximize availability for legitimate requests
The periodic strategy has the useful property that availability is constant regardless of attack duration, provided the resource is never exhausted
The optimal periodic strategy minimizes close-time while keeping resource exhaustion rare in the steady state:
– Optimum close-time is a function of open-time, attack rate, timeout, and resource limit
– Availability decreases only slowly if resource exhaustion is allowed, because an “exhausted” resource is still significantly available due to timeouts
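The dependence of close-time on open-time, attack rate, timeout, and resource limit can be explored with a toy discrete-time sketch. This is not the analytic model from the talk: the integer ticks, the fixed (rather than randomly varied) timeout, the constant number of attack requests per tick, and all function and parameter names are assumptions made only for illustration.

```python
from collections import deque


def periodic_defense_stats(open_ticks, close_ticks, timeout_ticks,
                           attacks_per_tick, resource_limit,
                           horizon_ticks=10_000):
    """Toy discrete-time model of the periodic open/close defense.

    Assumptions (not from the slides): time advances in integer ticks, the
    attacker sends a fixed number of requests per tick while the port is
    open, and every admitted request holds one slot for a fixed timeout.
    Requires open_ticks >= 1.
    Returns (fraction of ticks open with a free slot,
             fraction of ticks open but exhausted).
    """
    held = deque()  # release tick of each occupied slot, oldest first
    period = open_ticks + close_ticks
    free_open = exhausted_open = 0
    for tick in range(horizon_ticks):
        # Timed-out attack requests give their slots back.
        while held and held[0] <= tick:
            held.popleft()
        if tick % period < open_ticks:  # port is open during this tick
            if len(held) < resource_limit:
                free_open += 1
            else:
                exhausted_open += 1
            # Admit attack requests until the resource limit is reached.
            for _ in range(attacks_per_tick):
                if len(held) < resource_limit:
                    held.append(tick + timeout_ticks)
    return free_open / horizon_ticks, exhausted_open / horizon_ticks


def smallest_close_time_without_exhaustion(open_ticks, timeout_ticks,
                                           attacks_per_tick, resource_limit,
                                           max_close_ticks=100):
    """Scan close times upward and return the first with no exhausted tick."""
    for close_ticks in range(max_close_ticks + 1):
        _, exhausted = periodic_defense_stats(
            open_ticks, close_ticks, timeout_ticks,
            attacks_per_tick, resource_limit)
        if exhausted == 0:
            return close_ticks
    return None  # no close time in the scanned range avoids exhaustion
```

For example, `smallest_close_time_without_exhaustion(4, 10, 2, 10)` scans close times for an open interval of 4 ticks. How the two returned fractions combine into availability for legitimate requests depends on how those requests compete for slots freed by timeouts, which this sketch deliberately leaves out.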
Slide 10
Stochastic Activity Network Model of ITUA Intrusion-Tolerant Replication
Probabilistic modeling of the intrusion-tolerant replication management system using Stochastic Activity Networks (SANs)
A detailed model that includes attackers with varying degrees of sophistication, correlations between different phases of a single attack, attacks against individual processes and hosts, unpredictable defense strategies, and several layers of intrusion detection
Multiple measures are considered:
– Unavailability and unreliability
– Process load on a host
– Fraction of security domains excluded
– Fraction of hosts corrupted before a domain is excluded
Slide 11
SAN Models
Replica submodel: models the behavior of a single replica, including starting of the replica, attacks on the replica, detection of infiltration by the IDS, and misbehavior by infiltrated replicas and its detection by other replicas
Management submodel: models recovery by the management infrastructure through the starting of new replicas
Host submodel: models the activities on a single host, including attacks on the host, detections and false alarms by the IDS, starting of replicas and management entities, and shutting down
[Figure: Composed Rep/Join Model]
Slide 12
Comparative Performance Under Different Distributions of Constant Number of Hosts Into Domains
Observations:
– Low unavailability is possible even when the system is left without any human intervention
– Unavailability for a particular application does not change much as the number of applications increases
– Unavailability increases significantly as the number of hosts per domain increases
– The fraction of domains excluded increases with hosts per domain while the total number of domains decreases; both reduce the number of good domains available for recovery, which increases unavailability
Study:
– 12 hosts distributed into 1, 2, 3, 4, 6, or 12 domains; as hosts per domain increases, the number of domains decreases
– 2, 4, 6, or 8 applications with 7 replicas each
– One time unit = one hour
Slide 13
Comparative Performance Under Different Number of Hosts Distributed Into Constant Number of Domains
Observations:
– Slight increase in unavailability with more hosts per domain. Reason: the probability of successful intrusion into a host is the same in all experiments, so with more hosts in a domain there is a greater chance that one of them is corrupted (and detected), resulting in exclusion of its entire domain
– Considerable waste of resources with more hosts per domain, since the domain is excluded as soon as a small number of hosts are infiltrated
Study:
– Number of domains fixed at 10; number of hosts per domain varied from 1 to 4
– 4 applications with 7 replicas each
– One time unit = one hour
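The reason given for the increase in unavailability can be illustrated with a back-of-the-envelope calculation. The sketch below assumes that hosts are infiltrated independently with the same probability and that a single detected infiltration excludes the domain; the SAN model is more detailed (it includes spread within a domain and imperfect detection), and the function name and the probability value are hypothetical.

```python
def domain_exclusion_probability(p_host, hosts_per_domain):
    """Probability that at least one host in a domain is infiltrated.

    Assumes independent infiltrations with identical probability p_host
    (a simplification; the SAN model also captures attack spread and
    imperfect detection).
    """
    return 1.0 - (1.0 - p_host) ** hosts_per_domain


if __name__ == "__main__":
    p_host = 0.1  # hypothetical per-host infiltration probability
    for hosts in range(1, 5):  # 1 to 4 hosts per domain, as in the study
        p = domain_exclusion_probability(p_host, hosts)
        print(f"{hosts} host(s) per domain: P(exclusion) = {p:.4f}")
```

Under these assumptions the exclusion probability grows with every host added to a domain, while the uncorrupted hosts in an excluded domain are lost along with it.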
Slide 14
Comparison of Domain-Exclusion and Host-Exclusion Management Algorithms
The domain-exclusion algorithm excludes an entire domain if a host (or a replica on it) is found to be corrupt (preemptive strike)
The host-exclusion algorithm excludes only the relevant host
Study:
– Infiltration of the host OS and services quintuples the vulnerability of replicas and the management entity on that host
– 10 domains with 3 hosts each
– 4 applications with 7 replicas each
– Rate of spread of attack within a domain (determines how quickly and how much infiltration of a host affects other hosts) increased from 0 to 1
Observations:
– Unavailability does not change much with increasing rate of spread under the domain-exclusion scheme, while the host-exclusion scheme is quite sensitive to it
Slide 15
Effect of Rate of Misbehavior by Infiltrated Replicas
Observations:
– Intrusion tolerance improves as the misbehavior rate of infiltrated replicas increases
– For higher misbehavior rates, systems with better intrusion detection perform worse! Reason: most of the base attack rate comes from attacks on hosts. When the IDS is good, host infiltration is detected before it has time to spread to replicas on the host, resulting in exclusion of the domain (and host) even though the replicas on it may not be corrupt
Study:
– 10 domains with 2 hosts each
– 4 applications with 7 replicas each
– An attack on a host increases the vulnerability of replicas running on it by a factor of 5
– Cumulative base attack rate of 5, with direct attacks on hosts, attacks on replicas, and attacks on management distributed in a 3:1:1 ratio
– “Normal” IDS rates: detection of script-based attacks on hosts: 95%; exploratory attacks: 75%; innovative attacks: 40%; attacks on management entities and application replicas: 80% each
– For 20% efficacy, each rate is reduced to 20% of normal
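The parameter arithmetic in this study can be written out as a short sketch. The dictionary keys and function names are hypothetical labels chosen for illustration; only the numeric values (the detection percentages, the 20% efficacy factor, the cumulative rate of 5, and the 3:1:1 split) come from the slide.

```python
# "Normal" IDS detection probabilities from the slide (keys are hypothetical).
NORMAL_DETECTION = {
    "host_script_based": 0.95,
    "host_exploratory": 0.75,
    "host_innovative": 0.40,
    "management_entity": 0.80,
    "application_replica": 0.80,
}


def scale_detection(rates, efficacy):
    """Scale every detection probability by the given efficacy factor."""
    return {name: rate * efficacy for name, rate in rates.items()}


def split_rate(total_rate, ratio):
    """Split a cumulative rate into parts proportional to the given ratio."""
    weight_sum = sum(ratio)
    return [total_rate * weight / weight_sum for weight in ratio]


if __name__ == "__main__":
    # 20% efficacy: each detection rate is reduced to 20% of normal.
    for name, rate in scale_detection(NORMAL_DETECTION, 0.20).items():
        print(f"{name}: {rate:.2f}")
    # Cumulative base attack rate of 5 split 3:1:1 (hosts:replicas:management).
    host_rate, replica_rate, management_rate = split_rate(5, (3, 1, 1))
    print(host_rate, replica_rate, management_rate)
```

The 3:1:1 split makes the observation on this slide concrete: three fifths of the base attack rate is directed at hosts, so host-level detection dominates how often domains are excluded.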
Slide 16
Data Collection
Data collection:
– Needed for estimating parameter values of the models
– Focus on different vulnerabilities, attacks, workloads, …
Status: Network vulnerabilities:
– Use of Nessus
– Run on 3 networks at UIUC (data collected on 225 hosts; not analyzed yet)
Status: Host vulnerabilities:
– Development of new tool “Ferret” with:
   Perl plug-ins (one plug-in for each vulnerability checked)
   Open source license
   Addition of plug-ins from the security community
– Analysis of a former data collection performed at LAAS-CNRS
This presentation focuses on the data collection performed at LAAS-CNRS and links some results to some model parameters
Slide 17
Data Collection: Host Vulnerabilities
Data collection performed at LAAS-CNRS during 21 months ( )
Host vulnerabilities collected on a network used by more than 800 users
Results based on:
– Some well-known host vulnerabilities/configuration features
– Guessable passwords found by Crack (limited dictionary/rules)
Experiment:
– Goal: observe the behavior of the users of a real computer network without interference (e.g., results were not reported to system administrators)
– The LAAS-CNRS network is used by researchers and students in various branches of engineering (not only computer science)
– The LAAS-CNRS network is representative of a moderately secure network (at that time)
Slide 18
Results: Guessable Passwords
Observations:
– Overall increase of the ratio of vulnerabilities / users
– Sharp decrease at day 470, probably due to action by system administrators
– Jump at day 285, probably due to a change of the dictionary used by Crack
– The number of vulnerabilities changes, but the rate of new vulnerabilities is stable (0.41 before day 282 and 0.4 after day 287)
– The rate of removed vulnerabilities is also stable
– Use constant values for the rates of new/removed guessable passwords as a characterization of vulnerabilities due to guessable passwords?
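One way rates of new and removed vulnerabilities could be estimated from periodic scans is sketched below. The input format (a day number paired with the set of accounts found vulnerable on that day), the account identifiers, and the function name are all assumptions; the slides do not describe how the LAAS-CNRS data were stored or how the reported rates were computed.

```python
def vulnerability_rates(snapshots):
    """Estimate average rates of new and removed vulnerabilities per day.

    snapshots: list of (day, set_of_vulnerable_ids) pairs sorted by day
    (a hypothetical format). An id that appears between two consecutive
    scans counts as new; one that disappears counts as removed.
    Returns (new_per_day, removed_per_day).
    """
    new_total = 0
    removed_total = 0
    for (_, previous), (_, current) in zip(snapshots, snapshots[1:]):
        new_total += len(current - previous)
        removed_total += len(previous - current)
    elapsed_days = snapshots[-1][0] - snapshots[0][0]
    return new_total / elapsed_days, removed_total / elapsed_days


if __name__ == "__main__":
    # Made-up scan results, for illustration only.
    scans = [
        (0, {"user_a", "user_b"}),
        (1, {"user_a", "user_b", "user_c"}),
        (3, {"user_b", "user_c", "user_d"}),
    ]
    new_rate, removed_rate = vulnerability_rates(scans)
    print(f"new: {new_rate:.3f}/day, removed: {removed_rate:.3f}/day")
```

Computing the two rates separately over windows before and after a known discontinuity (such as the dictionary change) would be one way to check whether they are stable.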
Slide 19
Results: Vulnerabilities in Configuration Files and .rhosts
Configuration files:
– .login, .logout, .xinitrc, …
– Sharp increases/decreases combined with periods of stability
– Model of a step function for describing the ratio of vulnerabilities / users?
.rhosts file:
– Remarkable stability of the ratio of vulnerabilities / users
– Ratios between 2.5% and 4.5%
– Use of a constant value (interval or average) for characterizing the ratio?
[Figure panels: Vulnerabilities in configuration files; .rhosts files]
Slide 20
Proposed Models
We will focus on guessable passwords and vulnerabilities in configuration files from now on
Guessable passwords:
– Combination of a linear function and a step function
Vulnerabilities in configuration files:
– Step function
[Figure: two plots of Nb_vuln(t) versus t, each with t0 marked on the time axis. Panels: Guessable passwords; Vulnerabilities in configuration files]
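The two proposed shapes for Nb_vuln(t) can be written as simple piecewise functions. This is a minimal sketch: the parameter names and every numeric value in the usage example are placeholders, not values fitted to the LAAS-CNRS data.

```python
def nb_vuln_configuration(t, t0, level_before, level_after):
    """Step function: a constant level that changes at time t0."""
    return level_before if t < t0 else level_after


def nb_vuln_passwords(t, t0, intercept, slope, jump):
    """Linear trend plus a step of height `jump` at time t0."""
    step = jump if t >= t0 else 0.0
    return intercept + slope * t + step


if __name__ == "__main__":
    # Placeholder parameters, chosen only to show the shape of each curve.
    for day in (0, 100, 200, 300, 400):
        passwords = nb_vuln_passwords(day, t0=285, intercept=10.0,
                                      slope=0.5, jump=20.0)
        configuration = nb_vuln_configuration(day, t0=285,
                                              level_before=30.0,
                                              level_after=45.0)
        print(day, passwords, configuration)
```

A negative `jump` would represent a sudden cleanup, such as the sharp decrease attributed to system administrators in the guessable-password data.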
Slide 21
Linking Collected Data to Model Parameters
Link to model parameters:
– D: time before infiltration of a security domain
– attack_host: rate of attacks on a host
D and attack_rate are functions of the number of vulnerabilities and attacks
Let us assume a constant rate of attacks to exploit guessable passwords (r_attack_passwords) and vulnerabilities in configuration files (r_attack_configuration)
As a first approximation we have: [equation not recovered]
More work is needed to:
– Obtain the distributions of attacks on various vulnerabilities
– Confirm models of vulnerabilities
However, this first analysis already gives some hints on the link between the collected data and the parameter value estimations
Slide 22
Summary
Probabilistic validation is a useful technique for validating intrusion-tolerant systems
It should be used in all phases of a system’s lifecycle
Models are useful for making comparative studies and evaluating design alternatives, even if exact parameter values are not known
Better parameter value estimation is necessary, for implemented systems, to quantify the intrusion tolerance obtained
More work is needed to build better models and to better determine input parameter values