Introduction to OpenSAF

Introduction to OpenSAF
David Fick, Senior Software Architect, GoAhead Software

Introduction to OpenSAF
- Service availability and high availability (HA) systems and concepts have been around for decades
- However, HA terminology tends to vary from industry to industry and company to company
- Goals of this session:
  - High-level technical overview of the Service Availability™ Forum standards
  - Overview of the support of those standards within OpenSAF
  - Allow you to:
    - Familiarize yourself with SA Forum and OpenSAF concepts and terminology, OR
    - Map the HA concepts and terminology with which you are familiar to the SA Forum and OpenSAF versions
  - Resources for getting started with OpenSAF

SA Forum Interfaces: AIS & HPI
[Diagram: layered architecture, top to bottom]
- System Management Applications
- Application Interface Specifications (AIS)
- Service Availability Middleware (SAF standards implemented by OpenSAF):
  - Software Mgmt Framework (SMF)
  - Availability Management Framework (AMF)
  - Lock (LCK)
  - Checkpoint (CKPT)
  - Information Model Mgmt (IMM)
  - Cluster Membership (CLM)
  - Event (EVT)
  - Notification (NTF)
  - Platform Mgmt (PLM)
  - Message (MSG)
  - Log (LOG)
- Operating System Virtualization
- Hardware Platform Interface (HPI)
- Hardware Platform A, Hardware Platform B, Hardware Platform C, Hardware Platform D

But how to make sense of the SA Forum “acronym soup”?
IMM, LCK, EE, SI, SMF, SU, HPI, SG, NTF, OM, AMF, CLM, PLM, CSI, CKPT, HE, LOG, MSG, EVT

AIS Service Groupings
First, understand that the AIS services fall into three logical groupings*:
- System Management Services
  - Information Model Mgmt (IMM)
  - Software Mgmt Framework (SMF)
  - Notification (NTF)
  - Log (LOG)
  - Services that manage central system capabilities commonly used by both:
    - AIS services
    - Applications
- Resource Availability Management Services
  - Availability Management Framework (AMF)
  - Cluster Membership (CLM)
  - Platform Mgmt (PLM)
  - Services that manage and monitor the state of key system resources that affect availability:
    - Hardware / Operating system
    - Cluster nodes
    - Applications
- Application Services
  - Checkpoint (CKPT)
  - Event (EVT)
  - Message (MSG)
  - Lock (LCK)
  - Optional services to support application operations such as:
    - Inter-process communication
    - State replication
    - Shared resource access control
* Not official SA Forum AIS service groupings

Fault Management Cycle
Second, AIS services that manage availability are designed around a standard fault management cycle:
- Detection: e.g. component healthchecks
- Isolation: e.g. blade power off
- Recovery: e.g. failover of workload assignments to associated standby resources
- Repair: e.g. automatic restart of failed resource
- Notification: e.g. state change notifications sent by the service managing the resource
[Diagram: cycle of Detection, Isolation, Recovery, Repair, Notification]
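The cycle can be pictured as a fixed sequence of phases that every fault passes through. The following is only a minimal sketch of that idea, not OpenSAF code: the `Phase` enum, `run_cycle`, and the handler strings are hypothetical names invented for illustration, with the handler texts borrowed from the slide's examples.

```python
from enum import Enum


class Phase(Enum):
    DETECTION = "detection"
    ISOLATION = "isolation"
    RECOVERY = "recovery"
    REPAIR = "repair"
    NOTIFICATION = "notification"


# Phases in the order the slide lists them.
CYCLE = [Phase.DETECTION, Phase.ISOLATION, Phase.RECOVERY,
         Phase.REPAIR, Phase.NOTIFICATION]


def run_cycle(resource, handlers):
    """Run one faulty resource through every phase and return the trace."""
    return [(phase, handlers[phase](resource)) for phase in CYCLE]


# Hypothetical handlers: each one only describes the slide's example action.
handlers = {
    Phase.DETECTION: lambda r: f"healthcheck failed on {r}",
    Phase.ISOLATION: lambda r: f"powered off blade hosting {r}",
    Phase.RECOVERY: lambda r: f"failed over workload of {r} to standby",
    Phase.REPAIR: lambda r: f"restarted {r}",
    Phase.NOTIFICATION: lambda r: f"sent state change notification for {r}",
}

for phase, action in run_cycle("comp1", handlers):
    print(f"{phase.value:>12}: {action}")
```

In a real system each phase would be owned by the AIS service managing the resource. The point here is only that the phases run in a fixed order.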

Resource Dependencies
Third, Availability Management in the AIS world is driven by a detailed understanding of the availability management dependencies across all resource types:
- Managed Applications
  - Simple to complex dependencies and relationships can be modeled between the various software elements
  - Dependency on a particular node also modeled
- AMF Node
  - Represents a node where AMF services are provided
  - Depends on a CLM node
- CLM Node
  - Represents a cluster node where AIS services are provided
  - Depends on an Execution Environment (optional)
- Platform Resource
  - Containment and logical dependencies represented between platform resources
  - Execution Environment (EE): represents an operating system instance (standalone or virtual)
  - Hardware Element (HE): represents a physical hardware resource in the system
[Diagram: dependency stack of Managed Applications, AMF Node, CLM Node, and Platform Resource (Execution Environment, Hardware Element)]

Common Design Patterns
Fourth, the AIS services follow common design patterns:
- API
  - Common library lifecycle
  - Naming conventions
- Resource managed by service -> Managed object
  - Typically with associated state model
  - Managed objects stored in common information model
- Administrative operations
  - X.731 style administrative operations for resources which affect availability
- Notifications automatically generated by AIS services for significant system events (alarms, state changes, etc.)

Resource Availability Management Services
- Availability Management Framework (AMF)
  - Manages the lifecycle and monitors the state of the managed applications within the system
  - More detail in upcoming slides
- Cluster Membership (CLM)
  - Provides cluster membership change notifications to AIS services and interested applications
  - OpenSAF CLM implements a cluster management protocol dealing with:
    - Cluster formation
    - Active controller selection & failover
    - Node failure detection
- Platform Management (PLM)
  - Manages state of modeled hardware elements and execution environments (operating system instances)
  - Hardware element states and events accessed through Hardware Platform Interface (HPI)
  - Manages graceful blade extraction / de-activation cases
  - Supports hardware element controls (power on/off and reset)
  - Optional service within OpenSAF

Availability Management Framework (AMF)
AMF Logical Entities: Structural Entities
- AMF Application
  - Represents the highest-level service(s) provided by the system
- Service Group (SG)
  - Represents a group of like logical resources that provide the same service(s)
  - Associated redundancy model (e.g. 1+1)
- Service Unit (SU)
  - Aggregates a set of resources which when combined provide a higher-level service
- Component
  - Represents one or more resources that perform a function within the system
[Diagram: containment hierarchy with 1..* multiplicities: AMF Application, Service Group, Service Unit, Component]

Availability Management Framework (AMF)
AMF Logical Entities: Workload Entities
- Service Instance (SI)
  - Represents a workload to be supported by the system
  - Has associated redundancy requirements (1+1, N+M, etc.)
  - Protected by an identified SG
  - Assigned to one or more SUs with an HA state of active, standby, quiescing or quiesced
- Component Service Instance (CSI)
  - Represents a more granular workload that needs to be supported by the system
  - Assigned to one or more components
[Diagram: AMF Application containing 1..* Service Groups, each with 1..* Service Units and 1..* Components; Service Instances are protected by a Service Group and assigned to Service Units; Component Service Instances are assigned to Components]
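To make the relationship between structural entities (SG, SU, component) and workload entities (SI, CSI) concrete, here is a small in-memory model. It is a sketch under stated assumptions and not the AMF API: all class and function names are hypothetical, and pairing each CSI with the component at the same index is a simplification chosen only to keep the example short.

```python
from dataclasses import dataclass, field
from enum import Enum


class HAState(Enum):
    ACTIVE = "active"
    STANDBY = "standby"
    QUIESCING = "quiescing"
    QUIESCED = "quiesced"


@dataclass
class Component:
    name: str
    csi_assignments: dict = field(default_factory=dict)  # CSI name -> HAState


@dataclass
class ServiceUnit:
    name: str
    components: list
    si_assignments: dict = field(default_factory=dict)  # SI name -> HAState


@dataclass
class ServiceInstance:
    name: str
    csis: list  # names of the CSIs that make up this SI


@dataclass
class ServiceGroup:
    name: str
    redundancy_model: str
    service_units: list
    protected_sis: list


def assign(si, su, state):
    """Assign an SI to an SU; CSI i goes to component i (illustrative rule)."""
    su.si_assignments[si.name] = state
    for csi, comp in zip(si.csis, su.components):
        comp.csi_assignments[csi] = state


su1 = ServiceUnit("SU1", [Component("comp1a"), Component("comp1b")])
su2 = ServiceUnit("SU2", [Component("comp2a"), Component("comp2b")])
si = ServiceInstance("SI1", ["CSI1", "CSI2"])
sg = ServiceGroup("SG1", "1+1", [su1, su2], [si])

assign(si, su1, HAState.ACTIVE)
assign(si, su2, HAState.STANDBY)
```

After the two calls, `SU1` holds `SI1` as active, `SU2` holds it as standby, and each component carries the matching CSI assignment.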

Availability Management Framework (AMF)
AMF Logical Entities
- Common characteristics
  - Well-defined state model for each logical entity type
  - X.731 style administrative operations
- Common AMF component types
  - SA-aware
    - Applications modified to interact with AMF through AMF API
  - Non-proxied, non-SA-aware
    - Legacy or 3rd party applications that typically cannot be modified
    - Interact with AMF through command line scripts to manage application lifecycle
    - Always assigned active HA state if running
  - Proxied, non-SA-aware
    - Applications that have knowledge of HA concepts but do not directly communicate with AMF
    - Proxy application receives HA “commands” from AMF and forwards them to proxied application through a custom interface
[Diagrams: SA-aware (AMF, AMF Library, AMF comp process; lifecycle mgmt and HA state assignment); non-proxied (AMF, non-proxied AMF comp process; lifecycle mgmt); proxied (AMF, AMF Library, proxy component, proxied AMF comp process; proxy HA state assignment AND proxied comp lifecycle mgmt & HA state assignment requests)]

Availability Management Framework (AMF)
Service Group Redundancy Models
- 2N
  - Most common redundancy model
  - Preferred assignment model per SI:
    - 1 active resource
    - 1 standby resource
  - SUs can have either all active or all standby SI assignments
  - A.k.a. 1+1, active-standby, active-backup
  [Diagram: SI1 active (A) on SU1 on Node1 and standby (S) on SU2 on Node2]
- N+M
  - Both N and M are configurable
  - Common variation: N+1
  [Diagram: SI1 and SI2 assigned across SU1 (Node1), SU2 (Node2), and SU3 (Node3), with active (A) and standby (S) assignments]
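The 1+1 (active-standby) rule that an SU holds either all active or all standby assignments can be sketched in a few lines. This is an illustration only, not how AMF computes assignments: `assign_2n` and `failover` are hypothetical helpers, and picking the first in-service SU as active and the second as standby is an assumption made for the example.

```python
def assign_2n(sis, in_service_sus):
    """Return {su: {si: role}} with one all-active and one all-standby SU.

    Assumption for this sketch: the first in-service SU becomes active,
    the second (if any) becomes standby, and further SUs stay unassigned.
    """
    assignments = {su: {} for su in in_service_sus}
    if not in_service_sus:
        return assignments
    active = in_service_sus[0]
    standby = in_service_sus[1] if len(in_service_sus) > 1 else None
    for si in sis:
        assignments[active][si] = "active"
        if standby is not None:
            assignments[standby][si] = "standby"
    return assignments


def failover(sis, sus, failed_su):
    """Recompute assignments after an SU fails; the old standby takes over."""
    survivors = [su for su in sus if su != failed_su]
    return assign_2n(sis, survivors)


before = assign_2n(["SI1", "SI2"], ["SU1", "SU2"])
after = failover(["SI1", "SI2"], ["SU1", "SU2"], "SU1")
print(before)
print(after)
```

When `SU1` fails, `SU2` ends up with every SI active and no standby remains until a repaired or spare SU returns to service.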

Availability Management Framework (AMF)
Service Group Redundancy Models
- No redundancy
  - Preferred assignment model per SI:
    - 1 active resource
  - Similar to an N+0 redundancy scheme where N is the number of protected SIs
- N-way
  - Preferred assignment model per SI:
    - 1 active resource
    - Y standby resources (where Y is configurable)
  - SUs can concurrently have both active and standby assignments
- N-way Active
  - Preferred assignment model per SI:
    - X active resources (where X is configurable)
    - No standby resource
[Diagrams: one example per model showing SI1 and SI2 assigned as active (A) or standby (S) to SUs on Node1, Node2, and Node3]

Availability Management Framework (AMF)
Error Recovery Policies
- Pre-defined AMF component error recovery policies
  - Configurable
  - Can be overridden at runtime
- Recovery policy scopes
  - Component
  - Service Unit
  - Node
- Recovery policy types
  - Restart
  - Failover
  - Failfast
- Up to 3 actions per policy
  - Isolation
  - Recovery
  - Repair
- Error escalation policies
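Error escalation means that repeated errors widen the recovery scope instead of repeating the same action forever. The sketch below illustrates that idea with simple counters. It is an assumption-laden toy, not AMF behavior: the class names and thresholds are hypothetical, node scope and failfast are left out, and it omits the time windows over which a real implementation would count errors.

```python
from dataclasses import dataclass


@dataclass
class EscalationPolicy:
    # Hypothetical thresholds chosen for the example.
    max_component_restarts: int = 3
    max_su_restarts: int = 1


class RecoveryTracker:
    """Chooses a (scope, action) pair for each reported component error."""

    def __init__(self, policy):
        self.policy = policy
        self.component_restarts = 0
        self.su_restarts = 0

    def on_error(self):
        if self.component_restarts < self.policy.max_component_restarts:
            self.component_restarts += 1
            return ("component", "restart")
        if self.su_restarts < self.policy.max_su_restarts:
            # Escalate to the service unit and start counting afresh.
            self.su_restarts += 1
            self.component_restarts = 0
            return ("service unit", "restart")
        return ("service unit", "failover")


tracker = RecoveryTracker(EscalationPolicy(max_component_restarts=2,
                                           max_su_restarts=1))
actions = [tracker.on_error() for _ in range(6)]
for number, (scope, action) in enumerate(actions, start=1):
    print(f"error {number}: {action} at {scope} scope")
```

With these thresholds the component is restarted twice, then the whole SU is restarted, and only after the errors persist does the workload fail over.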

System Management Services
Information Model Management (IMM)
- Information Model Highlights
  - Based on pre-defined object classes (including AIS classes)
  - Holds both configuration and runtime objects
  - Used by AIS services to store current configuration and runtime state info
  - Can be used by applications as well
- Object Management API
  - Object class management
  - Access object attribute values
  - Search information model
  - Configuration change requests
  - Administrative operation invocation
- Object Implementer API
  - Runtime object management
  - CCB validation and application
  - Administrative operation handling
- OpenSAF Implementation
  - Persistence of information model managed through Persistence BackEnd (PBE) feature
  - Replicated to multiple cluster nodes
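The "CCB validation and application" step is essentially a validate-then-apply transaction: every change in the bundle is checked before any of them is written. The following sketch shows that pattern with a plain dictionary. It is not the IMM API: `InfoModel`, `apply_ccb`, the object names, and the `rank` attribute are all hypothetical.

```python
class ValidationError(Exception):
    pass


class InfoModel:
    """Toy information model: object name -> attribute dictionary."""

    def __init__(self):
        self.objects = {}
        self.validators = []  # callables(name, attrs) raising ValidationError

    def apply_ccb(self, changes):
        """Apply all changes or none. `changes` maps name -> new attributes."""
        # Phase 1: every validator sees every change before anything is written.
        for name, attrs in changes.items():
            for validate in self.validators:
                validate(name, attrs)
        # Phase 2: nothing was rejected, so write the whole bundle.
        for name, attrs in changes.items():
            self.objects.setdefault(name, {}).update(attrs)


def check_rank(name, attrs):
    # Hypothetical rule standing in for an object implementer's validation.
    if attrs.get("rank", 0) < 0:
        raise ValidationError(f"{name}: rank must not be negative")


model = InfoModel()
model.validators.append(check_rank)
model.apply_ccb({"su=SU1": {"rank": 1}})

try:
    model.apply_ccb({"su=SU1": {"rank": 2}, "su=SU2": {"rank": -1}})
    rejected = False
except ValidationError:
    rejected = True
```

Because the second bundle contains one invalid change, neither of its changes is applied and `su=SU1` keeps `rank` 1.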

System Management Services
Software Management Framework (SMF)
- SMF controls migration from one deployment configuration to another
- Upgrade methods
  - Rolling upgrade
  - Single step upgrade
- [De-]Activation unit scope
  - AMF Node
  - Service Unit
- During the migration, SMF:
  - Maintains the campaign state change model
  - Takes measures to enable error recovery
  - Monitors for potential errors caused by the migration
  - Deploys error recovery procedures
[Diagram: the Software Management Framework takes an Upgrade Campaign Definition (“Upgrade Instructions”) as input; issues adaptation commands (SMF config object); installs / removes software bundles from the Software Repository on target nodes; and performs admin operations and Read/Create/Delete/Update of objects in the Information Model]
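A rolling upgrade touches one activation unit at a time and checks the result before moving on. The sketch below shows that loop together with one possible recovery procedure. It is not SMF code: `rolling_upgrade`, the node names, and the version labels are hypothetical, and rolling back already-upgraded nodes in reverse order is an assumed strategy rather than something the slide specifies.

```python
def rolling_upgrade(nodes, upgrade, verify):
    """Upgrade nodes one at a time; undo everything if a node fails to verify."""
    done = []
    for node in nodes:
        upgrade(node, "new")
        done.append(node)
        if not verify(node):
            # Assumed recovery procedure: roll back in reverse order.
            for upgraded in reversed(done):
                upgrade(upgraded, "old")
            return False
    return True


versions = {"node1": "old", "node2": "old", "node3": "old"}


def upgrade(node, version):
    versions[node] = version


# First campaign: node3 fails verification, so every node is rolled back.
failed_ok = rolling_upgrade(list(versions), upgrade, lambda n: n != "node3")
after_failure = dict(versions)

# Second campaign: every node verifies, so all nodes end on the new version.
success_ok = rolling_upgrade(list(versions), upgrade, lambda n: True)
after_success = dict(versions)
```

Upgrading one unit at a time keeps the rest of the cluster in service while each step is verified.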

System Management Services
- Notification (NTF)
  - Publish-and-subscribe semantics for system-level notifications
  - Reader interface for reading historical alarm info as well
  - Formal syntax and semantics for ITU X.73x notifications:
    - Alarm / security alarm / state change / object create / delete / attribute change
  - Used by AIS services to publish service-specific notifications
  - Alarm and security alarm notifications automatically logged through LOG service
- Log (LOG)
  - Flexible, centralized, system-wide logging mechanism
  - Pre-defined log streams: alarm, notification, system
  - Supports multiple, custom application log streams
  - Log streams are configurable on a per log stream basis
    - Including log file full action: halt, wrap, and rotate
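The three log file full actions differ in what happens to a record that arrives when the current file is at capacity. The sketch below models one reading of them: halt drops the new record, wrap overwrites the oldest record, and rotate starts a new file. This is an interpretation for illustration, not the LOG service API, and `LogStream` with its record-count limit is hypothetical.

```python
class LogStream:
    """Toy log stream; each inner list in `files` stands in for one log file."""

    def __init__(self, max_records, full_action):
        if full_action not in ("halt", "wrap", "rotate"):
            raise ValueError(f"unknown full action: {full_action}")
        self.max_records = max_records
        self.full_action = full_action
        self.files = [[]]

    def write(self, record):
        """Append a record; return False if it had to be dropped."""
        current = self.files[-1]
        if len(current) >= self.max_records:
            if self.full_action == "halt":
                return False          # stop logging, keep what is there
            if self.full_action == "wrap":
                current.pop(0)        # overwrite the oldest record
            else:
                current = []          # rotate: continue in a fresh file
                self.files.append(current)
        current.append(record)
        return True


results = {}
for action in ("halt", "wrap", "rotate"):
    stream = LogStream(max_records=2, full_action=action)
    for record in ("a", "b", "c"):
        stream.write(record)
    results[action] = stream.files
    print(action, stream.files)
```

A production stream would also cap the number of rotated files, which this sketch leaves out.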

Application Services
Checkpoint (CKPT)
- Intended as a state replication mechanism for distributed applications
- Can be used for all standby “temperature levels”
  - Cold
  - Warm
  - Hot
    - Through OpenSAF CKPT service API extension
- Semantics of a checkpoint
  - Arbitrary set of sections containing opaque data
  - Stored in one or more replicas distributed across cluster
  - Reads and writes occur against the active replica
  - Both synchronous and asynchronous replication options available
  - Collocated checkpoint option provided for highest performance
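The checkpoint semantics (sections of opaque data, an active replica, synchronous or asynchronous propagation) can be modeled with dictionaries. The sketch below is an in-process illustration only and not the CKPT API: the `Checkpoint` class, its `flush` method, and the node names are hypothetical, and a real service would propagate updates over the network.

```python
class Checkpoint:
    """Toy checkpoint: one dict of sections per replica node."""

    def __init__(self, replica_nodes, synchronous=True):
        self.replicas = {node: {} for node in replica_nodes}
        self.active = replica_nodes[0]  # reads and writes go here
        self.synchronous = synchronous
        self.pending = []               # updates awaiting propagation

    def write(self, section_id, data: bytes):
        self.replicas[self.active][section_id] = data
        if self.synchronous:
            self._propagate(section_id, data)
        else:
            self.pending.append((section_id, data))

    def read(self, section_id):
        return self.replicas[self.active][section_id]

    def flush(self):
        """Push queued asynchronous updates to the other replicas."""
        for section_id, data in self.pending:
            self._propagate(section_id, data)
        self.pending.clear()

    def _propagate(self, section_id, data):
        for node, sections in self.replicas.items():
            if node != self.active:
                sections[section_id] = data


sync_ckpt = Checkpoint(["node1", "node2"], synchronous=True)
sync_ckpt.write("session-42", b"opaque state")

async_ckpt = Checkpoint(["node1", "node2"], synchronous=False)
async_ckpt.write("session-42", b"opaque state")
lagging = dict(async_ckpt.replicas["node2"])  # snapshot before flush
async_ckpt.flush()
```

The asynchronous variant returns from `write` sooner but leaves a window in which the other replica lags, which is the trade-off between the two replication options.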

Application Services
- Event (EVT)
  - Publish-and-subscribe communication paradigm
  - Flexible event channel, pattern, and filtering definition
  - Subscriber event queue maintained within app process
- Message (MSG)
  - Messages sent to and read from message queues
  - Single message queue owner at a time
  - Message queue maintained outside app process
  - Message queues can be logically grouped
    - Messages can be sent to a message queue group
    - Associated distribution policy (round-robin, broadcast, etc.)
- Lock (LCK)
  - Cluster-wide, distributed lock service
  - Can be used to control access to cluster-level shared resources
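The distribution policy of a message queue group decides which member queue receives each message. The sketch below contrasts the two policies named on the slide, round-robin and broadcast. It is illustrative only and not the MSG API: `QueueGroup` and the queue names are hypothetical.

```python
from collections import deque
from itertools import cycle


class QueueGroup:
    """Toy message queue group with a per-group distribution policy."""

    def __init__(self, queue_names, policy="round-robin"):
        if policy not in ("round-robin", "broadcast"):
            raise ValueError(f"unknown policy: {policy}")
        self.queues = {name: deque() for name in queue_names}
        self.policy = policy
        self._next = cycle(list(queue_names))  # round-robin cursor

    def send(self, message):
        if self.policy == "broadcast":
            for queue in self.queues.values():
                queue.append(message)
        else:
            self.queues[next(self._next)].append(message)


rr_group = QueueGroup(["q1", "q2"], policy="round-robin")
bc_group = QueueGroup(["q1", "q2"], policy="broadcast")
for message in ("m1", "m2", "m3"):
    rr_group.send(message)
    bc_group.send(message)

print({name: list(q) for name, q in rr_group.queues.items()})
print({name: list(q) for name, q in bc_group.queues.items()})
```

Round-robin spreads load across the queue owners, while broadcast gives every owner a copy of each message.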

Getting Started with OpenSAF
OpenSAF Technical Educational Resources
- Developer Wiki [http://devel.opensaf.org/wiki]
- OpenSAF Developers blog [http://devel.opensaf.org/blog]
- OpenSAF mailing lists [Subscribe: http://list.opensaf.org/maillist/listinfo/]
  - Users [Archive: http://list.opensaf.org/pipermail/users/]
  - Announce [Archive: http://list.opensaf.org/pipermail/announce/]
  - Development [Archive: http://list.opensaf.org/pipermail/devel/]
- Latest documentation [http://devel.opensaf.org/hg/opensaf-4.x-documentation/archive/tip.tar.gz]
- FAQ [http://www.opensaf.org/HOA/assn14944/images/FREQUENTLY%20ASKED%20QUESTIONS%20ABOUT%20OPENSAF%20RELEASE%204%20Final%20for%20publication.docx]
- README files in source code repository

Questions