Dissecting Self-* Properties Andrew Berns & Sukumar Ghosh University of Iowa
Background Autonomic systems are characterized by a number of properties that exhibit their ability to manage themselves. These are collectively known as self-* properties.
Goal Self-organizing, self-stabilizing, self-optimizing, self-adaptive, self-healing, self-scaling, self-managing: what do they precisely mean? How do they differ? Can we find a common framework that accommodates the different characterizations of the various self-* properties?
The model of a system Network of processes: topology G = (V, E) Processes execute actions. Each action by a process changes its local state. Global state S is the collection of local states. A computation is a sequence of global states.
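The model above can be sketched in a few lines of Python. This is only an illustration: the three-process line topology, the `max_action` rule (a process copies the largest value among itself and its neighbours) and the schedule are all assumptions chosen for the example, not part of the slides.

```python
from typing import Dict, List, Set, Tuple

State = Dict[str, int]  # global state S: process id -> local state

# Hypothetical topology G = (V, E): a line a - b - c.
V: List[str] = ["a", "b", "c"]
E: Set[Tuple[str, str]] = {("a", "b"), ("b", "c")}


def neighbours(p: str) -> List[str]:
    """Processes adjacent to p in the undirected graph G."""
    return sorted({v for u, v in E if u == p} | {u for u, v in E if v == p})


def max_action(p: str, s: State) -> int:
    """Illustrative action: the new local state of p is the maximum of its
    own value and its neighbours' values. Only p's local state changes."""
    return max([s[p]] + [s[q] for q in neighbours(p)])


def run(initial: State, schedule: List[str]) -> List[State]:
    """Execute one action per scheduled process and return the computation,
    i.e. the sequence of global states visited."""
    state = dict(initial)
    computation = [dict(state)]
    for p in schedule:
        state[p] = max_action(p, state)
        computation.append(dict(state))
    return computation


computation = run({"a": 3, "b": 0, "c": 1}, ["b", "c", "a"])
```

Each entry of `computation` is one global state, so properties such as safety and liveness can be phrased as predicates over this list.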
What is a “good” system? Safety. Property P must always hold: bad things never happen. Example: no deadlock, at least one token must always exist, etc. Liveness. Property Q must eventually hold: good things eventually happen. Example: termination, convergence, progress, etc. A system configuration is legal when these properties hold.
The Environment Consists of variables that a process can read but not modify. A system is legal when it satisfies its safety properties. Legality is defined with respect to an environment. Examples of environment variables: time of day, network topology, user demands for service, output from another system, failures, etc.
System vs. adversary The adversary disrupts or challenges the system. It causes failures and perturbations, allows processes to join or leave without notice, changes the global state, launches security attacks, changes the environment, etc. Ultimately the system must win.
Tolerance to adversarial actions Masking. Safety and liveness properties are NEVER violated. The system always remains in a legal configuration. Non-masking. Safety properties (but not liveness) are temporarily violated, but eventually restored. The system state may temporarily become illegal. Self-management = tolerance to adversarial actions.
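One way to make the masking / non-masking distinction concrete is to classify a finite trace of global states against a legality predicate. The sketch below is an assumption-laden simplification: "eventually restored" is approximated by "the last state of the finite trace is legal", and the token-count states and the name `classify_tolerance` are hypothetical.

```python
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")


def classify_tolerance(trace: Sequence[S], legal: Callable[[S], bool]) -> str:
    """Classify a finite trace observed after an adversarial action.

    masking      : every state in the trace is legal
    non-masking  : some state is illegal, but the trace ends in a legal state
    not tolerant : the trace ends in an illegal state
    """
    flags = [legal(s) for s in trace]
    if all(flags):
        return "masking"
    if flags[-1]:
        return "non-masking"
    return "not tolerant"


# Hypothetical example: a state is the number of tokens in the system,
# and a configuration is legal when exactly one token exists.
def one_token(tokens: int) -> bool:
    return tokens == 1


masked = classify_tolerance([1, 1, 1, 1], one_token)
recovered = classify_tolerance([1, 3, 2, 1], one_token)
stuck = classify_tolerance([1, 0, 0, 0], one_token)
```

A real system has infinite computations, so a finite trace can only suggest, not prove, which kind of tolerance holds.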
Masking vs. Non-masking Legal State space
More on tolerance Tolerance to adversarial actions also depends on the type and extent of those actions. A system may be masking tolerant to a single crash failure, but exhibit only non-masking tolerance to multiple failures.
Graceful degradation The system negotiates the adversarial action, but recovers to a configuration that satisfies a weaker predicate P’ ⊃ P (P = original safety predicate). To be graceful, the predicate P’ must be acceptable to the application.
Types of actions Actions can be internal or external. External actions can change the environment. Processes execute internal actions only. The adversary executes both internal and external actions.
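Reading P’ as a weaker predicate than P (every configuration that satisfies P also satisfies P’), the relationship can be checked by brute force on a small state space. The sketch below assumes a hypothetical three-process system where each process holds a token bit, with P = "exactly one token" and P’ = "one or two tokens"; none of these names come from the slides.

```python
from itertools import product
from typing import Callable, Iterable, Tuple

Config = Tuple[int, ...]  # one token bit per process


def is_weaker(P: Callable[[Config], bool],
              P_prime: Callable[[Config], bool],
              states: Iterable[Config]) -> bool:
    """True when every state satisfying P also satisfies P_prime,
    i.e. the set of P_prime-states contains the set of P-states."""
    return all(P_prime(s) for s in states if P(s))


def exactly_one_token(s: Config) -> bool:  # original safety predicate P
    return sum(s) == 1


def one_or_two_tokens(s: Config) -> bool:  # degraded predicate P'
    return 1 <= sum(s) <= 2


states = list(product([0, 1], repeat=3))
degraded_is_weaker = is_weaker(exactly_one_token, one_or_two_tokens, states)
original_is_weaker = is_weaker(one_or_two_tokens, exactly_one_token, states)
```

Whether "one or two tokens" is an acceptable degraded behaviour is an application-level decision that the check itself cannot make.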
Self-management Self-management is a vision. It encompasses all self-* properties. Typically attributed to systems that exhibit at least one self-star property.
The framework Generally, in defining a specific self-* property, the important issues are: (1) the interpretation of the legal configuration; (2) the type of adversarial action; (3) the type of tolerance permitted, such as masking, non-masking, or graceful degradation.
Self-stabilization Starting from an arbitrary configuration, a self-stabilizing system eventually recovers to a legal configuration (one that satisfies a predefined predicate P) and remains in that configuration thereafter. Adversarial action: transient failure corrupting the system state. Tolerance: non-masking.
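As a concrete instance of this definition, the sketch below simulates Dijkstra's K-state token ring, a classic self-stabilizing algorithm. The slides do not name this algorithm; it is chosen here purely as an illustration. The legal predicate P is "exactly one process is privileged", and the random scheduler, ring size and function names are assumptions of the example.

```python
import random
from typing import List, Tuple


def privileged(x: List[int]) -> List[int]:
    """Indices of processes that currently hold a privilege (token)."""
    n = len(x)
    out = [0] if x[0] == x[-1] else []           # process 0 compares with its predecessor n-1
    out += [i for i in range(1, n) if x[i] != x[i - 1]]
    return out


def step(x: List[int], i: int, K: int) -> List[int]:
    """Let privileged process i execute its action; only x[i] changes."""
    y = list(x)
    if i == 0:
        y[0] = (x[0] + 1) % K                    # process 0 increments modulo K
    else:
        y[i] = x[i - 1]                          # others copy their predecessor
    return y


def stabilize(x: List[int], K: int, rng: random.Random,
              max_steps: int = 10_000) -> Tuple[List[int], int]:
    """Run under a randomly chosen central scheduler until exactly one
    process is privileged. Returns the legal state and the steps taken."""
    steps = 0
    while len(privileged(x)) != 1 and steps < max_steps:
        x = step(x, rng.choice(privileged(x)), K)
        steps += 1
    return x, steps


# Arbitrary (possibly corrupted) start: 5 processes, K = 6 states each.
final, steps_taken = stabilize([3, 1, 4, 1, 5], K=6, rng=random.Random(0))
```

The start state plays the role of the transient failure: no process is told that corruption happened, yet the ring converges to a single token and keeps it (non-masking tolerance).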
Self-adaptation Environment ∈ {R_1, R_2, …, R_m}. A system adapts to its environment: if R_k holds, then P_k will hold. Can be viewed as an extension of a self-stabilizing system, where the legal configuration satisfies the predicate P = ∨_i (R_i ∧ P_i). Example: process j crashing means that the adversary changes the environment variable crashed(j) from false to true.
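The disjunctive legality predicate for self-adaptation can be written directly as code. In the sketch below the environment variable `load`, the state variable `replicas`, and the two modes are hypothetical examples; only the shape of the predicate, a disjunction over pairs (R_i, P_i), follows the slide.

```python
from typing import Callable, Dict, List, Tuple

Env = Dict[str, int]
State = Dict[str, int]
Mode = Tuple[Callable[[Env], bool], Callable[[State], bool]]  # (R_i, P_i)


def legal(env: Env, state: State, modes: List[Mode]) -> bool:
    """P = OR over i of (R_i(env) AND P_i(state))."""
    return any(R(env) and P(state) for R, P in modes)


# Hypothetical modes: one replica suffices under low load,
# at least three replicas are required under high load.
modes: List[Mode] = [
    (lambda env: env["load"] < 100, lambda st: st["replicas"] == 1),
    (lambda env: env["load"] >= 100, lambda st: st["replicas"] >= 3),
]

ok_low = legal({"load": 10}, {"replicas": 1}, modes)
# The adversary raises the load: the same state is now illegal ...
bad_high = legal({"load": 500}, {"replicas": 1}, modes)
# ... until the system adapts by adding replicas.
ok_high = legal({"load": 500}, {"replicas": 3}, modes)
```

Note that the system state did not change between the first two checks; only the environment did, which is what distinguishes adaptation from recovery after state corruption.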
Self-healing A system is self-healing with respect to a subset of external actions if the occurrence of those actions causes at most a temporary violation of the system’s legal configuration (safety property P). Adversarial action: a subset of all possible external actions. Tolerance: typically non-masking, but masking is not ruled out.
Comments on Self-healing A self-healing system may not be self-healing with respect to an enlarged set of adversarial actions. Skype is a self-healing system, but it crashed on August 16, 2007 and was down for nearly two days. Why?
Comments on Self-healing Also, self-healing frequently leads to graceful degradation.
Self-organization A system is self-organizing with respect to a subset of external actions involving process join and leave if those actions cause at most a temporary violation of the system’s legal configuration (safety property P). Adversarial action: join / leave actions (up to N processes may concurrently join or N/2 processes may concurrently leave). Tolerance: usually non-masking, but masking is not ruled out.
Self-organization An example of gathering
Comments on Self-organization A self-organizing system is expected to recover in a reasonable time. [1] imposed a requirement of sub-linear recovery time per join or leave operation. [1] Dolev & Tzachar: Empire of Colonies, Theoretical Computer Science 2009
Self-protection A system is self-protecting with respect to a set of malicious external actions if it maintains its legal configuration (data integrity and continued functionality) in the presence of those actions. Adversarial action: malicious actions. Tolerance: masking. Comment: it is hard to characterize what a malicious action is. It may be a direct security attack, or something very subtle.
Self-optimization A system is self-optimizing if, starting from an initial configuration, it spontaneously improves / maximizes the value of an objective function (cost) relevant to the system’s performance. Adversarial action: a bad initialization, or an interim action that makes the current configuration sub-optimal. Tolerance: non-masking. Comment: what if different nodes have different perceptions of cost? Selfishness adds a new dimension.
Self-optimization with selfish agents Selfish actions used to optimize a system may never reach an equilibrium configuration. From a game theory perspective, no Nash Equilibrium exists. (Figure: shortest path tree, rooted at "root", with two different types of processes.)
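The slide's shortest-path-tree example cannot be reconstructed from the text, so the sketch below uses a stand-in: the two-player matching-pennies game, which is well known to have no pure-strategy Nash equilibrium. The function checks every pure strategy profile for a profitable unilateral deviation; the payoff tables and names are assumptions of this example, and mixed strategies are not considered.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Profile = Tuple[str, str]
Payoffs = Dict[Profile, Tuple[int, int]]  # profile -> (utility of player 0, player 1)


def pure_nash_equilibria(payoffs: Payoffs,
                         strategies: Sequence[Sequence[str]]) -> List[Profile]:
    """Return all pure strategy profiles from which no player can gain
    by deviating unilaterally."""
    equilibria = []
    for s0, s1 in product(strategies[0], strategies[1]):
        u0, u1 = payoffs[(s0, s1)]
        best0 = all(payoffs[(t, s1)][0] <= u0 for t in strategies[0])
        best1 = all(payoffs[(s0, t)][1] <= u1 for t in strategies[1])
        if best0 and best1:
            equilibria.append((s0, s1))
    return equilibria


# Matching pennies: player 0 wins on a match, player 1 wins on a mismatch.
pennies: Payoffs = {
    ("H", "H"): (1, -1), ("H", "T"): (-1, 1),
    ("T", "H"): (-1, 1), ("T", "T"): (1, -1),
}
no_equilibrium = pure_nash_equilibria(pennies, [["H", "T"], ["H", "T"]])

# For contrast, a prisoner's dilemma does have a pure equilibrium.
dilemma: Payoffs = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}
one_equilibrium = pure_nash_equilibria(dilemma, [["C", "D"], ["C", "D"]])
```

When the list of equilibria is empty, selfish best-response moves can cycle forever, which is the behaviour the slide warns about.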
Self-configuration The legal configuration is defined over the configuration space. There are various notions of configuration, such as a set of optimal choices of hardware or software modules and the connections among them, consistent with the environment. Adversarial action: a subset of external actions. Tolerance: non-masking.
Self-configuration 1 User
Self-configuration 2 User
Relationships among self-star properties Self-stabilization implies self-healing with respect to any adversarial internal action. Self-organization implies self-healing with respect to join and leave operations.
Relationships among self-star properties Self-organization implies self-configuration, but the reverse is not true. For example, a self-configuring web server changes the connections between server components, processor cycles and memory capacity to provide a stable response [2], but it is not self-organizing since it cannot automatically integrate another server. [2] Wildstrom et al., ICAC 2005
Relationships among self-star properties Self-organization vs. self-stabilization: a self-stabilizing system that allows a process to join or leave a system of N processes in O(N^2) time is not self-organizing, because its recovery time per join or leave is not sub-linear (some will disagree).
Relationships among self-star properties The Chord P2P network is self-organizing, but not self-stabilizing, since once an adversarial action splits the Chord ring into two rings, they do not rejoin.
New property: self-immunity A system is self-immune with respect to an action C if (1) initially its tolerance to action C is non-masking, but (2) eventually the system is able to mask the effect of action C. The system learns from experience, and becomes smarter with time.
Self-immune behavior Legal State space
New property: self-containment Self-containment is a variant of self-protection. It prevents the total system from being compromised by external malicious actions. At most a fraction of the system is compromised, but eventually the non-compromised processes are able to offer a meaningful level of service. (The system is capable of damage control: it saves a part of itself in spite of a security attack. It is a non-masking version of self-protection, similar in spirit to the fault-containment property of self-stabilizing systems.)
Conclusion There is a need for a framework to define what we actually mean by a specific self-star property. This will not satisfy everyone’s vision, but as long as it satisfies the majority’s view, the chance of further divergence of views is minimized.