Introduction to the Service Availability Forum

Introduction to the Service Availability Forum
.

Contents Introduction Quick AIS Specification overview
AIS Dependability services AIS Communication services Programming model DEMO

SA Forum: Who We Are, What We Do
A consortium of industry-leading companies: Network Equipment Providers Computer System Vendors Silicon, Board and System Platform Providers System Integrators Application, Middleware and Operating System Vendors Mission: Foster an ecosystem to enable the use of commercial off-the-shelf building blocks in the creation of high availability network infrastructure products, systems and services Approach: Develop and publish high availability and management software interface specifications Promote and facilitate their adoption by the industry Copyright© 2007 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

SA Forum: How We Are Unique
For high-availability software: Our specifications are hardware independent Our specifications are OS independent We collaborate with other standards bodies and integrate relevant specifications Copyright© 2007 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

Design of dependable services
Two parts of functionality Business logic Implements the service Common functionality Communication, management, fault tolerance… This is where SA Forum AIS comes into picture Copyright© 2004 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

Construction of a service
Define the functionality Plan the architecture Components, roles Integrate fault management Service state monitoring Management Plan recovery Communication between components Copyright© 2006 Service Availability™ Forum, Inc

Construction of a service
Define the functionality Plan the architecture (AMF) Components, roles Integrate fault management (AMF) Service state monitoring Management Plan recovery (CKPT) Communication between components (MSG, EVT) Copyright© 2006 Service Availability™ Forum, Inc

The Transition COTS adoption is accelerating transition from vertical to horizontal industry model Vertically integrated platform by individual vendors Platform enabled by The COTS eco-system Vendor A Vendor D Proprietary Platform Proprietary Hardware Proprietary Operating System Middleware Applications Proprietary Platform Proprietary Hardware Proprietary Operating System Middleware Applications Standard Hardware COTS-Based Platform Carrier-Grade Operating System Integrated COTS Middleware Applications Vendor D Vendor C Vendor B Vendor A Copyright© 2007 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

SA Forum Standard Interface Specifications
Highly Available Applications Application Interface Specification (AIS) AIS (SA Forum) Other Middleware and Application Services Service Availability Middleware SMI (SA Forum) Hardware Platform Interface (HPI) Systems Management Interfaces (SMI) HPI (SA Forum) Carrier Grade Operating System Hardware Platform Communications Fabric Copyright© 2007 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

Using SA Forum’s specifications to build highly available services
The AIS SPECIFICATION Copyright© 2004 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

Application Interface Specification
The Application Interface Specification (AIS) is a set of open standard interface specifications The AIS specifies the Application Programming Interface (API) for HA middleware services service entities and their behavior (life cycle, administrative operations, functionality) Example for specification/implementation HW interface: SATA HW implementation: manufactured mainboard HW user: the HDD (Samsung, Western Digital, Seagate, Hitachi…) SAI-Overview-B.02.01 This lesson gives a high level overview of the Application Interface Specification (AIS), which includes the Availability Management Framework and the AIS Services. More details on the areas of the AIS will be provided in subsequent lessons, which are organized as follows: Availability Management Framework (4 lessons) Cluster Membership Service (1 lesson) Checkpoint Service (1 lesson) Event Service (1 lesson) Message Service (1 lesson) Lock Service (1 lesson) Notification Service (2 lessons) Log Service (1 lesson) Information Model Management Service (1 lesson). Copyright© 2006 Service Availability™ Forum, Inc

AMF CLM IMM CKPT LOG NTF LCK EVT MSG Web server, Music broadcast Management HA Applications Other Middleware and Application Services SA Forum AIS APIs, SAI-Overview-B.02.01 This slide shows the areas of the Application Interface Specification: Cluster Membership Service (CLM) Availability Management Framework (AMF) Information Management Model (IMM) Log Service (LOG) Checkpoint Service (CKPT) Message Service (MSG) Event Service (EVT) Lock Service (LCK). These areas will be addressed in subsequent slides of this lesson and subsequent lessons on the AIS. HPI Middleware AIS Middleware Carrier Grade Operating System Boards, fans, power, etc. Managed Hardware Platform Copyright© 2006 Service Availability™ Forum, Inc

The AIS is divided into the following parts or areas Administration Software Management Framework Information Model Management Service Cluster Membership Service Notification Service Dependability Availability Management Framework Checkpoint Service Communication Event Service Message Service Miscellaneous Lock Service Log Service Copyright© 2004 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

Application Interface Specification Implementation
Each of the AIS Services consists of a client library and a server The client library for an AIS Service is linked into each application process that utilizes the service SAI-Overview-B.02.01 For each of the AIS Services, the Application Interface Specification defines the interface between that service (client library) and the applications that utilize it. It does not specify how the server for that service is implemented. The following slides provide a high-level overview of the Availability Management Framework and each of the AIS Services. Copyright© 2006 Service Availability™ Forum, Inc

Example service - Message Queues
Node U Node V Node W Process A Process B Process C MSG library MSG library MSG library saMsgMessageSend() saMsgMessageSend() Message Service Message Service Message Service Message m1 priority 0 sent to queue Q Message m2 priority 1 sent to queue Q Node Y Process E SAI-AIS-MSG-B.02.01, Section 2 Message Queues The primary focus of the Message Service is on one-way messages, sent by a sender who can then forget about the messages, which are queued in a message queue by the Message Service. Subsequently, the receiver reads the messages out of the message queue. The Message Service also provides an RPC-like request-reply mechanism, in which the sender of a message blocks until it receives a reply message from the receiver. A message queue is global to a cluster and is identified by a unique name in the name space of message queues. A process on any node can open a message queue to read messages from that queue. The specification does not allow more than one process to open the same message queue at the same time. However, different processes may open different queues in the same message queue group at the same time. Note: The two messages are colored differently, blue and green, to distinguish them. MSG library Queue Q saMsgMessageGet() priority area 0 Getting a message from a queue removes it from the queue saMsgMessageGet() priority area 1 priority area 2 Message Service priority area 3 Copyright© 2006 Service Availability™ Forum, Inc

The AIS SPECIFICATION Dependability Services
Copyright© 2004 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

SA Forum’s approach to making applications HA
Availability Management Framework principles Integrate best practices into the middleware, Provide means for the software developer to influence behavior Let the middleware control the application (like a Marionette) In most cases making an application/service HA requires the implementation of similar functionalities (failover, health monitoring,…) Based on this observation, the SAF developed the AMF with the following principles in mind: 1,2, but let the middleware control the application. The application provides the strings in the form of callbacks methods and AMF pulls them as required to maintain service contiuity.

SA Forum’s approach to making applications HA
Fault prevention Reduce the probability of system failure to an acceptably low value Fault tolerance Provide service in spite of faults Fault removal Diagnostics, monitoring, repair at development and runtime Fault forecasting / prediction Estimating failures and their effects E.g. software aging

Server Clustering Notions Functionalities of cluster middleware node
link cluster partition Functionalities of cluster middleware Error detection Error handling Notification Reconfiguration

Server Clustering 2. Categories of clusters HA clusters
Improve the availability of services Load-balancing clusters Share the workload among the nodes High-Performance Clusters (HPC) Scientific computing Grid computing Many independent jobs

HW and SW based Fault Tolerance
Redundant power supply SW Process replicas, failover, switchover… Clusters – hybrid solutions Process replicas on different nodes

AMF Redundancy Patterns
Provided models 2N Simple failover pair N+M Load balanced active servers and hot-standby units N-Way Load balanced active servers which are able to overtake the each other’s services N-Way Active No standby redundancy is needed but processing has to be carried out in parallel by N service units No redundancy Non-critical services of the system Redundancy models are not detailed further in this presentation.

AMF System Model Example
Service Instance B Service Instance D Service Group SG2 Service Unit S3 Service Unit S4 Service Unit S5 Service Instance A SG1 Service Unit S1 Service Unit S2 Service group SG1 supports a single service instance A, and service group SG2 supports two service instances B and D Node U Node V Node W Node X Active Standby On behalf of A, service unit S1 is assigned the active HA state and service unit S2 is assigned the standby HA state Service unit S1 contains two components C1 and C2, and service unit S2 contains two components C3 and C4 Component C1 C2 C3 C4 Component C5 C6 C7 Similarly, for service group SG2 Talk about failover Copyright© 2006 Service Availability™ Forum, Inc

The SA Forum Information Model
The most important elements of the metamodel are highlighted.

Fault Management Flow Chart
An alternate form of fault detection, including built-in diagnosis The error is detected Hypothetical cause of error is located The defective component is removed from service The system is reconfigured or restarted to function properly A faulty system component is replaced

Error detection methods
Mechanisms Interface check (e.g. illegal instruction, illegal parameter, insufficient access rights) Ad-hoc methods Acceptability testing Error checking codes Timing checks (watchdog) Diagnostic analysis (in idle time) Substitute back (integrate  derive)

AMF error detection mechanisms
Needs the change of the component Passive monitoring Using OS functionality Currently only crash of a process External active monitoring External entity is used to monitor Internal active monitoring Healthchecks Types Pull: AMF invoked Push: Component invoked   

Planning recovery - checkpointing
Define the variables to be saved State variables Normal operation Open checkpoint Save variables regularly On failure Read checkpoint Continue normal operation Copyright© 2004 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

Checkpoint Service Example of checkpointing, and restoration from a checkpoint after a fault, for a collocated checkpoint Node U Node U Service Group Service Unit S Service Unit S Active Component Active Component Standby Component open write synchronize read open Active replica Standby Active replica Animated SAF-AIS-CKPT-B.02.01 Checkpoint Service The Checkpoint Service manages a set of entities, called checkpoints, that a process uses to save its state to be used in failover or switchover situations. A checkpoint is a cluster-wide entity that is designated by a unique name. A checkpoint is structured into areas called sections. A copy of the data stored in a checkpoint is called a checkpoint replica, or simply a replica, which is typically stored in main memory rather than on disk. A given checkpoint may have several checkpoint replicas stored on different nodes in the cluster, to protect the checkpoint data against node failures. Synchronous update and asynchronous update options are provided. The concept of active replica is defined for checkpoints with the asynchronous update option. Read and write accesses are directed to the active replica and the updated data are asynchronously transferred to the other replicas. The picture shows a process writing checkpoint data to the active checkpoint replica. It also shows instantiation of another process from the checkpoint replica. Checkpoint Service Checkpoint C Checkpoint C Section abc Section abc Section xyz Section xyz Copyright© 2006 Service Availability™ Forum, Inc

Methods of recovery Goal: Recovery from errors before failures.
Trivial methods Retry operation Restart the application (cold, warm) Restart the system Techniques Forward recovery Backward recovery Compensation Simple techniques: Retry, Restart app, Restart sys Cold restart (set start state), Warm restart (reset data structures)

Methods of recovery 2. S1 S2 Forward recovery Compensation
Animation: The green line is the normal state trace of the system along S1 and S2. Rectangles are state saves. An error happens that deviates the system state from the normal trace until it is detected. Backward recovery sets the system state to a previously saved state. Forward recovery drives the system into a well defined stable, error free state. Compensation intends to compensate the effects of the error. Backward recovery S2

Rollback / backward recovery
Preconditions State restoration is possible Correct saved state Logged interactions with the environment Deterministic restart from a given state (distributed systems) Mechanisms State based rollback Operation based rollback Hybrid rollback (state + operation) State based: check-pointing (save document in word) Operation based: undo actions in Word Hybrid: undo + save Word (limited undo stack)

State based rollback Checkpoint operations Timing / triggers
activeness C1 C2 C3 C4 Time: Recovery point Data: Checkpoint Checkpoint operations Create Discard Rollback recovery Timing / triggers Time interval, number of messages sent, etc. Recovery region Active checkpoint exists Recovery regions can overlap

C2 is corrupt, we have to use C1
State based rollback Error detected C2 is corrupt, we have to use C1  C1 C2 C3 C4 Failure Error Errors between state saves  2 phase checkpointing (2 buffers) Factors that influence # of overlapping rec. regions Error detection latency Cost of buffers Coordination in distributed systems

Service continuity Goal: maintain the service continuity Type of error
Related actions Transient Ignore (recovery handles them) Permanent Reconfiguration Failover Switchover Fail-back “Graceful degradation”

Actions Failover The service is removed from the failed entity and assigned to a healthy entity Switchover The service is removed from a healthy entity to another healthy entity Fail-back The failed entity is repaired and service is assigned to it again Graceful degradation Service is carried on in a degraded state, probably with degraded functionality Entity: can be a node, a component… Failover: there are two identical entities, one of them fails Switchover: take the active out of service for maintenance Failback: two entities and one is more preferred than the other Graceful degradation: public site and the database is not available

System level error handling patterns
Fail-silent If failed then no messages are sent out until repaired E.g. in distributed systems Fail-stop Stop operation on failure E.g. train traffic control systems (turn all lights red if failure happens) Normal Silent Failing (i) Normal State: In this state, a node produces correct output. Detection of an internal failure (by the comparator process) causes the node to irreversibly enter either the failing state or the silent state. (ii) Failing State: This is an intermediate state in which the node can suffer at most one performance failure. From this state the node eventually enters the terminal silent state. (iii) Silent State: No new valid messages are produced by the node. Any messages produced by the node can only be invalid or copies of previously produced valid messages: any functioning destination node can detect these messages as unwanted. Francisco V. Brasileiro et. al. Implementing Fail-Silent Nodes for Distributed Systems (1996 )

Requirements in today’s systems
Cost efficiency Flexibility Dependable services Highly Available Reliable Availability Outage duration per year ~5 mins 0.9999 ~52 mins 0.999 ~8 hrs 0.99 ~3 days

Dependability and performance vs. costs
Note that the cost scale is logarithmic. dependability performance

FT vs. Cost International Space Station Advanced techniques Algorithms
Desktop Computer HW / SW redundancy

The AIS SPECIFICATION Communication Services

AMF CLM IMM CKPT LOG NTF LCK EVT MSG Web server, Music broadcast Management HA Applications Other Middleware and Application Services SA Forum AIS APIs, SAI-Overview-B.02.01 This slide shows the areas of the Application Interface Specification: Cluster Membership Service (CLM) Availability Management Framework (AMF) Information Management Model (IMM) Log Service (LOG) Checkpoint Service (CKPT) Message Service (MSG) Event Service (EVT) Lock Service (LCK). These areas will be addressed in subsequent slides of this lesson and subsequent lessons on the AIS. HPI Middleware AIS Middleware Carrier Grade Operating System Boards, fans, power, etc. Managed Hardware Platform Copyright© 2006 Service Availability™ Forum, Inc

Communication Message (MSG) Recommended for many to one scheme
Example: collect sensor data Event (EVT) Many to Many (publish – subscribe) scheme Example: Sentinels and villagers Copyright© 2004 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

Message Service Message Service, Message Service (MSG) Library, and Message Queue Node U Node V Node W Process A Process B Process C MSG library MSG library MSG library saMsgMessageSend() saMsgMessageSend() Message Service Message Service Message Service Message m1 priority 0 sent to queue Q Message m2 priority 1 sent to queue Q Animated SAI-AIS-MSG-B.02.01 Message Service Communication via message queues A message queue permits multipoint-to-point communication. A message queue is global to the cluster, rather than being owned by any one process. Messages are written to, and read from, message queues. Processes on any node in the cluster can write messages to a message queue. At most one process can read a message queue at any time, but that process can change over time. When a message is read from a queue, it is removed from that queue. The messages in a message queue are grouped together into priority areas. Message queues can be persistent or non-persistent. The slide shows one reading process E on node Y that is local to queue Q. Substantial performance benefits result when reading from a local queue. Writers are usually not local to the queues to which they write. Communication via message queue groups (not shown in the picture) A message queue group permits multipoint-to-multipoint communication. A message queue group is identified by a logical name. The sender addresses a message queue group using the same mechanisms used to address a single message queue. Messages addressed to a message queue group can be read by more than one process. Send/receive operations (not shown in the picture) A sender sends to a destination message queue or message queue group and waits for the reply. The reply of the process reading from the queue is sent to a buffer provided by the sender. Node Y Process E MSG library Message Queue Q saMsgMessageGet() priority area 0 Getting a message from a queue removes it from the queue saMsgMessageGet() priority area 1 priority area 2 Copyright© 2006 Service Availability™ Forum, Inc Message Service priority area 3

Event Service Example of the Event Service, with publishers and subscribers on two nodes Node U Node V Publisher EVT library Publisher EVT library Subscriber EVT library Subscriber EVT library Subscriber EVT library Subscriber EVT library Publisher EVT library Animated SAI-AIS-EVT-B.02.01 Event Service An event channel enables multiple publishers to communicate with multiple subscribers. It is global to a cluster and is identified by a unique name. Multiple publishers and multiple subscribers can communicate over the same event channel. Individual publishers and individual subscribers can communicate over multiple event channels. Subscribers are anonymous, which means that they may join and leave an event channel at any time without involving the publisher(s). Publishers can also be subscribers. An event is published by invoking the saEvtEventPublish() function and specifying as parameters the event handle and the event. A process subscribes to receive events on an event channel by invoking the saEvtEventSubscribe() function. The Event Service delivers events to a subscribing process using the saEvtEventDeliverCallback() function of that process. To stop receiving events for which it has registered, a subscriber can invoke the saEvtEventUnsubscribe() function to unsubscribe for those events. When subscribing on an event channel, a process must specify which filters to apply to published events. The Event Service delivers, to the process, only events that match the filters. Filter Filter Filter Filter Event Channel X Event Service Event Channel Y Copyright© 2006 Service Availability™ Forum, Inc

The AIS SPECIFICATION Programming model

Programming model Library life cycle
Initialize, finalize, dispatch events Service specific APIs Callbacks Request functions Copyright© 2004 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

Programming model - AMF
Library life cycle saAmfInitialize(), saAmfFinalize(), saAmfDispatch() Service specific APIs Callbacks saAmfCSISetCallbackT() Request functions saAmfHAStateGet() Copyright© 2004 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

Programming model – how to use it?
Initialize service handler saAmfInitialize() Set callbacks Get selection object saAmfSelectionObjectGet() Start main loop do { select(amfSelectionObject,…) saAmfDispatch() } while (1) Copyright© 2004 Service Availability™ Forum, Inc. - Other names and brands are properties of their respective owners.

Highly Available / Fault Tolerant Music Broadcast Application
Demo Highly Available / Fault Tolerant Music Broadcast Application

The source application is made HA
Architecture The example is a music broadcast application. The clients connect to an Icecast streamer server through the Internet using TPC/IP channels. The clients can be Winamp, Windows Media Player or any other client that are able to play an HTTP stream. The music stream is not generated by the Icecast itself but it is coming from a specific streamer application, which is a simple application that connects to the Icecast, using the Shoutcast protocol, and sends an mp3 file through the connection. Our aim is to make the Stream source application HA so that a standby instance takes over the service and the . For this reason, we integrated it into the AMF. (The steps of integration are detailed in the next slide.) In the architecture there is an additional component, the Input buffer. The Input buffer behaves as a virtual channel between the Icecast and the streamer application. It is required because the Icecast disconnects the clients when the stream source is disconnected (the TPC channel between the Icecast and the streamer application is closed), thus, the failover of source applications would not be seamless for the user. Here, instead of using advanced techniques for modifying the TCP stack (e.g. TCPCP), we decided to create a buffer that simply transmits data from the incoming channel (from the source) to the output channel (to Icecast) and does not close the connection to Icecast if the source disconnects. As a result, the clients will not need to reconnect on source application failover. The source application is made HA Clients

Online radio streaming system

AMF configuration Tolerated failures Component Node
One communication channel Integration of the stream source application into AMF The application is run on a two node cluster. Each node runs one instance of the application. One is the active, the other is the passive. In the AMF configuration the instances are represented by components (Mcomp 1/2) comprised into separate service units (MSource SU 1/2). The service units are grouped into a service group (Msource SG) that is assigned the 2N redundancy model, and the Music Source App element represents the application itself in the configuration. There is only one service instance (Msource SI 1) that represents the workload. It is assigned to the service units at runtime. The one that is assigned as active serves the music stream while the standby one waits. If the active fails the standby takes over the service and continues the serving of the music stream where it was stopped by the active. The health state of the components is monitored automatically by AMF using healthchecks. The created configuration tolerates - the failure of the stream source components - failure of a node from the cluster - failure of the communication link between any of the nodes in the cluster and the node where the Icecast is hosted but only one at a time

Where should I continue?
MP3 streamer MP3 streamer Middleware Middleware OS OS Position STATE Faulty Active Standby Active

Try it yourself Open Windows Media Player Press Ctrl+U
Enter URL: Open Winamp Add above URL to playlist

Acknowledgements András Kövi Zoltán Micskei, István Majzik
András Pataricza

References Service Availability Forum http://saforum.org
Application Interface Specification Education Material Fault Tolerance Research Group – (BME MIT)

Introduction to the Service Availability Forum

Similar presentations

Presentation on theme: "Introduction to the Service Availability Forum"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Introduction to the Service Availability Forum

Similar presentations

Presentation on theme: "Introduction to the Service Availability Forum"— Presentation transcript:

Similar presentations

About project

Feedback