Cloud Computing and Architecture
Architectural Tactics (Tonight’s guest star: Availability)
Quality framework (Bass et al.)
Central quality attributes
– Availability
– Interoperability
– Modifiability
– Performance
– Security
– Testability
– Usability
Other qualities
– Portability
– Scalability
– Variability
– Flexibility
– Cost
– Time to market
– …
Strongly recommended reading!
A Writing Template
– Source of stimulus. This is some entity (a human, a computer system, or any other actuator) that generated the stimulus.
– Stimulus. The stimulus is a condition that needs to be considered when it arrives at a system.
– Environment. The stimulus occurs within certain conditions. The system may be in an overload condition or may be running when the stimulus occurs, or some other condition may be true.
– Artifact. Some artifact is stimulated. This may be the whole system or some pieces of it.
– Response. The response is the activity undertaken after the arrival of the stimulus.
– Response measure. When the response occurs, it should be measurable in some fashion so that the requirement can be tested.
Example: World of Warcraft
Bærbak Christensen
Example: SkyCave

Quality attribute: Availability
Source: Internal to the system
Stimulus: A crash
Artifact: Database server
Environment: Normal operation
Response: Detects the event, records it in the log, and continues in normal operation
Response measure: Within 3 seconds

Quality attribute: Performance
Source: 1000 independent clients
Stimulus: Each generates on average 2 character events per second
Artifact: SkyCave App server
Environment: Normal operation
Response: Events are processed, cave state is updated
Response measure: With at most 5 seconds of latency
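The six-part template can also be written down as a small data structure, which makes it easy to see that the response measure is the testable part. The sketch below is only an illustration: the class name `QualityAttributeScenario`, the field `limit_seconds`, and the method `is_met` are assumptions made here, not part of Bass et al. or of SkyCave.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityAttributeScenario:
    """One scenario, following the six-part writing template."""
    quality_attribute: str
    source: str
    stimulus: str
    artifact: str
    environment: str
    response: str
    limit_seconds: float  # hypothetical numeric form of the response measure

    def is_met(self, observed_seconds: float) -> bool:
        """The scenario holds if the response arrived within the limit."""
        return observed_seconds <= self.limit_seconds


# The SkyCave availability scenario from the slide.
skycave_availability = QualityAttributeScenario(
    quality_attribute="Availability",
    source="Internal to the system",
    stimulus="A crash",
    artifact="Database server",
    environment="Normal operation",
    response="Detect the event, record it in the log, continue in normal operation",
    limit_seconds=3.0,
)
```

An observed detection time (for example from a fault-injection test) can then be checked with `skycave_availability.is_met(observed)`.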
Tactic
– A design decision that influences the achievement of a quality attribute response
Example of modifiability tactic:
– Encapsulate: Introduce explicit interface to module
CloudArch Core Focus
Discussion
– If a system is not available, what is the point of all other QAs?
– Security? – Equals slowness
System quality attributes
– Availability
– Modifiability
– Performance
– Security
– Testability
– Usability
– Interoperability
– Scalability
Availability
Definition(s)
– Availability (1): Property of software that it is there and ready to carry out its task when you need it to be
– Availability (2): Ability of a system to mask or repair faults such that the cumulative service outage period does not exceed a required value over a specified time interval
– Nygard: Stability (resilience, longevity): Ability to keep processing for a long time even when there are transient impulses, persistent stresses, or component failures
Measurements
– MTBF: Mean time between failures
– MTTR: Mean time to repair
But often we talk in percentages! (downtime per year)
– 99%: 3d 15h
– 99.9%: 8h 46m
– 99.99%: 52m
– 99.9999%: 32 seconds (!)
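The percentages translate into downtime by simple arithmetic: the unavailable fraction times the length of a year. A minimal sketch, assuming a 365-day year; the function names are made up for this example. The second function uses the standard steady-state relation availability = MTBF / (MTBF + MTTR), which is not on the slide but connects the two kinds of measurement.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Allowed downtime per year for a given availability percentage."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


def availability_from_mtbf(mtbf: float, mttr: float) -> float:
    """Steady-state availability; mtbf and mttr must use the same time unit."""
    return mtbf / (mtbf + mttr)


def format_duration(seconds: float) -> str:
    """Render a duration as days, hours, minutes, seconds (rounded to 1 s)."""
    total = round(seconds)
    days, rest = divmod(total, 24 * 60 * 60)
    hours, rest = divmod(rest, 60 * 60)
    minutes, secs = divmod(rest, 60)
    return f"{days}d {hours}h {minutes}m {secs}s"


if __name__ == "__main__":
    for percent in (99.0, 99.9, 99.99, 99.9999):
        print(percent, format_duration(downtime_seconds_per_year(percent)))
```

Each extra nine cuts the downtime budget by a factor of ten, which is why the step from "three nines" to "six nines" is so expensive.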
Tactics
Lots of techs!
Tactics
Categories
– Fault detection
– Recovery
  – Preparation + Repair
  – Reintroduction
– Prevention
Detection
– Ping-echo
– Monitor: Nagios – Zabbix – …
– Exceptions
  – Time out
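Ping-echo and time out combine naturally: the monitor sends a ping and treats both a missing echo and an exception as a detected fault. The sketch below illustrates this with plain callables instead of real network traffic; `ping_echo` and the three example components are hypothetical names, and a real monitor such as Nagios or Zabbix works differently in detail.

```python
import concurrent.futures
import time


def ping_echo(ping, timeout_s: float) -> bool:
    """Return True only if ping() answers "echo" within timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(ping)
    try:
        return future.result(timeout=timeout_s) == "echo"
    except concurrent.futures.TimeoutError:
        return False  # no echo in time: fault detected by time out
    except Exception:
        return False  # the ping itself failed: fault detected by exception
    finally:
        pool.shutdown(wait=False)  # do not block on a hanging component


# Hypothetical components used only for illustration.
def healthy():
    return "echo"


def slow():
    time.sleep(0.3)  # answers, but far too late
    return "echo"


def crashed():
    raise ConnectionError("connection refused")
```

Note that the choice of `timeout_s` is itself a design decision: too short gives false alarms, too long delays detection and eats into the response measure.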
Recover: Prep and Repair
– Active redundancy (hot standby)
  – All receive and process all events
  – Millisecond failover
– Passive redundancy (warm standby)
  – Master-slave
  – Minute failover
– Spare (cold standby)
  – “I think we have an extra machine in the cellar”
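The common idea behind all three variants is that a standby takes over when the active replica fails; they differ in how up to date the standby is and therefore in failover time. A deliberately simplified sketch of the takeover step only, with replicas modelled as plain callables; `FailoverGroup` and `make_replica` are hypothetical names, and state synchronization between replicas is left out entirely.

```python
class FailoverGroup:
    """Routes requests to the active replica; promotes a standby on failure."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.active = 0  # index of the replica currently serving requests

    def handle(self, request):
        while self.active < len(self.replicas):
            try:
                return self.replicas[self.active](request)
            except Exception:
                self.active += 1  # fault detected: fail over to next standby
        raise RuntimeError("no replica left to fail over to")


def make_replica(name, crash_after=None):
    """Build a fake replica that crashes after crash_after handled requests."""
    handled = {"count": 0}

    def replica(request):
        if crash_after is not None and handled["count"] >= crash_after:
            raise ConnectionError(f"{name} is down")
        handled["count"] += 1
        return f"{name} handled {request}"

    return replica


group = FailoverGroup([make_replica("primary", crash_after=1),
                       make_replica("standby")])
```

In this sketch the client never sees the crash of the primary: the second request is simply answered by the standby.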
Recover: Prep and Repair
– Exceptions
– Rollback
  – Used in DB and [exercise: where else?]
  – Check pointing
– Retry
– Degradation
Which Nygard patterns?
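Retry and degradation are often used together: retry a few times in case the fault is transient, then fall back to a reduced but still useful answer. A minimal sketch under those assumptions; `call_with_retry`, `make_flaky`, and the "cached data" fallback are illustrative names, not taken from the lecture or from Nygard.

```python
import time


def call_with_retry(operation, fallback, attempts: int = 3, delay_s: float = 0.0):
    """Try operation() up to `attempts` times, then degrade to fallback()."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt < attempts - 1:
                time.sleep(delay_s)  # wait before the next retry
    return fallback()  # degraded service instead of no service


def make_flaky(failures: int):
    """Build an operation that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def operation():
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("transient fault")
        return "fresh data"

    return operation


def cached():
    return "cached data"
```

With two transient faults the third attempt succeeds; with a persistent fault the caller gets the cached value rather than an error.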
Recover: Reintroduction
– Shadow
  – Run in shadow mode until ‘up-to-speed’
– State Resync
  – Typical DB behaviour
  – Cold slaves must catch up with primary
  – EcoSense DB war story: stale DB
Preventing
– Removal from service
  – ‘Scrubbing’
  – It used to be that a Tomcat server would respawn every 12 hours
  – Easiest way to fix the numerous memory leaks!
– Transactions
  – ACID guarantees
Summary
– All things bad can and will happen to real systems having real users operating in the real world!
– Your systems should strive for high availability and graceful degradation
  – If you want to keep your customers!
– The architectural tool box is big!