BBN Technologies and University of Illinois
2001 August 7
This document presents the intrusion model developed for the ITUA project. ITUA, which stands for Intrusion Tolerance by Unpredictable Adaptation, is DARPA-sponsored research aimed at increasing the survivability of systems. The research is being performed by BBN Technologies, the University of Illinois, and the Boeing Company.
An intrusion model defines a set of possible attacks against a computer system. We will say that each attack in this set is covered and attacks not in the set are not covered by the model. The attacks covered by the ITUA intrusion model are the attacks the ITUA project is most concerned about defending against.
Ideally, the set of attacks covered by the model should be chosen to be exactly what attackers can do in practice. So the model should cover attacks known to be possible and the attacks not covered should be those that are very difficult, infeasible, or even impossible. But this ideal model, even if it exists, is too complicated and too ephemeral to be useful: the technology attackers use and the environment they work in is ill-defined and constantly changing.
Instead, the model presented here identifies some important, but abstract, features of attacks and of their environment. The model is written in terms of these abstract features. The model covers most, but certainly not all, possible attacks. Because the model is abstract, some details that distinguish between attacks that are possible and attacks that aren't will necessarily be left out. For example, the model does not include any details about known vulnerabilities in existing operating systems.
There is a tradeoff between the model's coverage and the possibility of defense against the attacks it describes. On the one hand, if the model covers all possible attacks then no strategy can defend against the worst case. For example, if the model covers an attack in which all computer hardware is instantaneously destroyed, no software defense would be possible. On the other hand, if the model eliminates so many attacks that defense is easy, the model would not be interesting because common attacks would not be covered. The ITUA model is an attempt to get good coverage and at the same time allow some defense against the worst attack.
At a minimum, every intrusion model must characterize the attacks that are covered, but the ITUA intrusion model goes further to characterize also the mechanisms available for defense. These mechanisms are described in the same abstract terms as the attacks. The result is a model that defines a set of intrusion scenarios in which attack and defense interact. The model states some assumptions about the effectiveness of defense mechanisms; using these assumptions, the likelihood of each scenario can be computed.
The ITUA intrusion model has three parts, one part per section of this document:
This document also includes some analysis of the intrusion model:
The system to be defended consists of a set of security domains. A security domain implements a boundary that attackers have difficulty crossing. For example, a host can be a domain, especially if an attacker with privileges on one host is not automatically granted privileges on other hosts. Another example of a domain is a LAN with firewalls separating it from other networks. In this example, it is possible to configure the firewalls to make it difficult for an attacker with privileges on this LAN to gain privileges elsewhere.
Each domain is either okay or infiltrated. An infiltrated domain is one in which the attacker is able to control and damage resoures freely. For example, a domain that is a Unix host is infiltrated when attacker has gained "root" privilege on that host. An okay domain is one that has not been infiltrated.
Assumptions about security domains:
In each domain a set of processes can run. The set of processes running in a domain is dynamic, i.e., existing processes may be stopped and new processes started.
Each process is either okay or corrupt. An okay process necessarily functions according to its specification; a corrupt process need not.
Assumptions about processes:
Note that there is no way to repair a corrupt process. A similar effect can be gotten, though, by stopping the corrupt process and starting an okay one.
Processes communicate by sending and receiving messages. This communication is assumed to be "timed asynchronous" in the sense explained in Fetzer and Cristian, PODC 1996, and summarized in assumptions C1-C3.
Assumptions about communication:
Components of the system to be defended may be implemented as processes, but they can also be replicated in process groups for fault tolerance. A process within a group is called a replica.
Some groups and processes are essential while others are not. A group or process is essential if its failure to function according to its specification implies that the system fails too.
Assumptions about groups:
The attacker's goal is to cause the system to function poorly. He does this by corrupting processes until the system's tolerance to this corruption is defeated. An attack can be either partially or completely successful. In a partially successful attack, some measure or measures of quality-of-service (QoS) will be significantly degraded. In a completely successful attack, the system fails.
The basic action of the attacker is to infiltrate a single domain. He may do this in a variety of ways, including stealing a password, installing a Trojan Horse, or exploiting an operating system or protocol flaw. We assume that infiltrating any domain will always be possible, but that it will be difficult and will take the attacker some time.
After a domain has been infiltrated the attacker can corrupt its processes. Corrupting a process may be as simple as stopping it or as complicated as modifying its running core image in order to make it behave maliciously. Regardless of the complexity, we assume that corrupting processes is quick once the domain has been infiltrated.
Assumptions about attacks:
The defender has control over those parts of the system the attacker doesn't:
The defense must coordinate the actions of replicas within a group so that some failures are tolerated. This coordination will be called replication management. It includes masking the bad behavior of corrupt replicas, detecting the existence of corrupt replicas and removing them from the group, and starting new replicas.
Assumptions about replication management:
Note that the defender does not know in advance which domains and processes will be attacked first, or which process groups can be made to fail Byzantine.
The model assumes that an attack on a domain will sometimes be detected before that domain is infiltrated and before processes running in that domain are corrupted. An attack may be detected by one or more intrusion-detection systems (IDSs) looking for anomalous behavior in the domain. For example, an attacker may conduct a port scan to find system processes with known vulnerabilities. Detecting this scan often precedes an attacker gaining "root" access on a host by exploiting one of the vulnerabilities found.
Note that an attack may also be detected after infiltration because processes crash or behave abnormally. In contrast, this section expresses only the properties of detection before infiltration.
Assumptions about intrusion detection:
Assumption ID1 quantifies the fact that IDSs are not perfect: some attacks are not detected and some detections are false alarms. Presumably both P and Q are increasing with improvements in IDS technology. The ITUA project, however, is not concerned with improving the state of the art in building IDSs; the goal here is simply to model the value of that technology.
Note that some assumptions about detection are necessary. Without detection, an attacker could silently infiltrate every domain then suddenly cause every process to stop. With detection, a defensive strategy may have time in which to react to an attack.
The goal of ITUA is to maximize the system's useful life in spite of attack. The useful life will be measured by the expectation value of the amount of correct computation the system performs in the G essential groups. In other words, an average is taken over all possible system histories, summing the amount of correct computation in that history, and weighting by the probability that that history actually occurs.
Computing the system's useful life depends on two strategies, one for attack and one for defense. ITUA seeks a defense strategy for which the attacker's most damaging strategy allows the longest useful life.
Measuring the system's useful life depends also on what is being computed in the G essential groups. If computation in group 1 must block to wait for computation in group 2, then group 1 is not making full use of the time available for computation. To make the definition of useful life application-independent we assume that there are no such dependencies among the essential groups. Thus a measurement of useful life will be an upper bound on the amount of computation achievable in practice.
Given a strategy for using the mechanisms in section 4, one can compute (or estimate via simulation) the system's useful life, as defined in section 5.
The analysis of the model to date has used these simplifying assumptions:
The strategy against which others should be compared first is:
Strategy 0:
Given defense "strategy" 0, the attacker's best strategy is, obviously, to infiltrate the domain in which the replica runs and kill it. This takes time D. So the system's useful life is:
Now consider a more interesting strategy:
Strategy 1:
Given defense strategy 1, the attacker's best strategy is simply to infiltrate domains known to have replicas and to corrupt the replicas as fast as possible, namely, one domain per interval D. Then the system is a time-invariant Markov process that changes state every interval. At each state change, the defense has probability P of detecting the most recent infiltration and changing the group membership. The probability of an infiltration going undetected is (1-P). After K undetected infiltrations the system is doomed.
The system's useful life can be computed by averaging the total time for computation over all scenarios. This average is an infinite sum that can be computed directly or as a solution to a recurrence relation in K that expresses the self-similarity of the system after one interval:
Note that this useful life is obviously greater than the useful life of the system without replication, i.e., D, for values of B greater than 1/2, i.e., if replica management has less than 100% overhead, even if P=0, i.e., even if intrusion detection is worthless.
For what value of K is the useful life maximized? Note that B and R, both properties of replica management, will likely be functions of K. Suppose that the cost of replica management is dominated by message passing and that the number of messages is proportional to K^2. Then
Note that when intrusion detection is poor (a common situation!), the first term in the sum for U is always greater than the second, possibly much greater. So approximate U as
Strategy 1 does not contend very well with the fact that, at every stage, the attacker has some chance of corrupting a replica without being detected. A possible adaptation would be to increase the degree of replication when under attack. Now consider:
Strategy 2:
For this strategy, name the useful life U(K,J). Consider first the case of J=1. Then:
If K=J=1 and intrusion detection is poor, algebraic manipulation shows that strategy 2 is always better than strategy 1. It remains to be seen whether strategy 2 is still better under more general conditions, and whether refinements to the strategy will improve it.
This model quantitatively describes a set of attacks against a system and a set of defense mechanisms to counter the attacks. The defense mechanisms can be coordinated in a defense strategy. This document shows the use of mathematical analysis of the model to estimate the effectiveness of some simple strategies.
In the future, the ITUA project will seek more effective defense strategies and will attempt to analyze them with techniques similar to those described in section 6. A broader range of model parameters will be considered as well. Unpredictable defense strategies will be introduced. If the techniques of section 6 become intractible in some of these cases, other analysis techniques, such as simulation, will be applied.