Distributed Computing Network Laboratory Reliability; Report on Models and Features for E2E Reliability ETSI GS REL 003 양현식.



Reliability / availability methods

Overview
• NFV, architecture models
• Network services elements
  - Network function
• Introduction
  - Fault management cycle
  - Protection schemes (2N, N+M, N-way)
• NFVI and NFV-MANO support for VNF reliability and availability
  - Fault management cycle phase review
  - Non-redundant / on-demand redundant VNFC configuration
  - Active-Standby
  - Active-Active

NFVI and NFV-MANO support for VNF reliability and availability
Non-redundant / on-demand redundant VNFC configuration (stateless/stateful)
Placement: free (no anti-affinity constraint); there is no concept of VNFCI protection in this mode.
State protection: VNFC state protection is not applicable.
Fault detection: NFVI.
Fault localization: NFVI and NFV-MANO.
Fault containment: may include powering off the failed nodes and/or network reconfiguration actions (performed by NFVI and NFV-MANO).
Fault remediation: NFV-MANO performs VM re-instantiation on failure. Supplementary actions may be required, e.g., network and/or storage association reconfiguration. (Fault recovery)
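The phase review above can be written down as a small lookup, which makes it easier to compare configurations later. This is only an illustrative sketch: the `Phase` enum, the `NON_REDUNDANT` mapping and `run_cycle` are names made up for this example and do not come from ETSI GS REL 003.

```python
from enum import Enum


class Phase(Enum):
    """Fault management cycle phases, in the order the slides review them."""
    DETECTION = "fault detection"
    LOCALIZATION = "fault localization"
    CONTAINMENT = "fault containment"
    REMEDIATION = "fault remediation"
    RECOVERY = "fault recovery"


# Responsible entities for the non-redundant configuration, taken from the slide.
# The slide lists no separate recovery owner, so RECOVERY is left out on purpose.
NON_REDUNDANT = {
    Phase.DETECTION: ("NFVI",),
    Phase.LOCALIZATION: ("NFVI", "NFV-MANO"),
    Phase.CONTAINMENT: ("NFVI", "NFV-MANO"),
    Phase.REMEDIATION: ("NFV-MANO",),
}


def run_cycle(responsibilities):
    """Return (phase name, responsible entities) pairs in cycle order."""
    return [(phase.value, responsibilities.get(phase, ())) for phase in Phase]


for name, owners in run_cycle(NON_REDUNDANT):
    print(f"{name}: {', '.join(owners) or 'no separate action'}")
```

Keeping the phases in an ordered enum means every configuration is walked in the same order, even when a phase has no owner.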

Non-redundant / on-demand redundant VNFC configuration (stateful-external)
Placement: free (no anti-affinity constraint); there is no concept of VNFCI protection in this mode.
State protection: VNFC state protection is the responsibility of the VNFCI. State protection is done by an externalised entity, which may be (or may utilise) a storage service provided by the NFVI, another VNFC provided by the application, or a combination thereof.
Fault detection: NFVI.
Fault localization: NFVI and NFV-MANO.
Fault containment: may include powering off the failed nodes and/or network reconfiguration actions (performed by NFVI and NFV-MANO).
Fault remediation: NFV-MANO performs VM re-instantiation on failure. Supplementary actions may be required, e.g., network and/or storage association reconfiguration. (Fault recovery)

Active–Standby VNFC redundancy configuration (stateless)
Placement: the VNFCs of the redundant pair need to be placed on different hardware servers with no or limited common failure modes.
State protection: VNFC state protection is not applicable.
Fault detection: NFVI.
Fault localization: NFVI and NFV-MANO.
Fault containment: may include powering off the failed nodes and/or network reconfiguration actions (performed by NFVI and NFV-MANO).

Active–Standby VNFC redundancy configuration (stateless), continued
Fault remediation: NFVI performs VM failover at the hypervisor layer. Supplementary actions (e.g., network reconfiguration) may be the responsibility of NFV-MANO.
Fault recovery: NFV-MANO assigns a replacement for the failed node from the cloud resource pool as the new standby entity. NFV-MANO is then responsible for on-demand diagnosis of the candidate failed entities and for initiating any subsequent physical recovery requests for entities with confirmed persistent faults.
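The placement rule and the recovery step for a stateless Active–Standby pair can be sketched as follows. This is a simplified illustration, not an NFV-MANO API: `place_pair` and `fail_over_and_recover` are hypothetical helpers, and "different hardware servers" is reduced here to "different host names".

```python
def place_pair(hosts):
    """Pick two distinct hosts for the active and the standby instance."""
    unique = sorted(set(hosts))
    if len(unique) < 2:
        raise ValueError("anti-affinity needs at least two distinct hosts")
    return {"active": unique[0], "standby": unique[1]}


def fail_over_and_recover(pair, pool):
    """Promote the standby, then draw a new standby from the resource pool.

    Returns the new pair and the failed host, which is handed over for
    on-demand diagnosis.
    """
    failed = pair["active"]
    new_active = pair["standby"]
    # The replacement must differ from both the new active and the failed host.
    candidates = [h for h in pool if h not in (new_active, failed)]
    if not candidates:
        raise RuntimeError("no spare host left in the resource pool")
    return {"active": new_active, "standby": candidates[0]}, failed


pool = ["host-a", "host-b", "host-c"]  # hypothetical host names
pair = place_pair(pool)
new_pair, failed = fail_over_and_recover(pair, pool)
print(pair, new_pair, failed)
```

Excluding the failed host from the candidates keeps it out of service until the diagnosis has confirmed whether the fault is persistent.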

Active–Standby VNFC redundancy configuration (stateful-external)
Placement: the VNFCs of the redundant pair need to be placed on different hardware servers with no or limited common failure modes. Depending on the nature of the externalized state repository, its placement may be subject to explicit or implicit anti-affinity requirements with respect to the VNFCI placement.
State protection: the VNFCI performs partial VM state replication of its critical state to an external state replica repository. This state replication may be VNFC vendor proprietary or may use third-party or open source middleware services.
Fault detection: NFVI.
Fault localization: NFVI and NFV-MANO.

Active–Standby VNFC redundancy configuration (stateful-external), continued
Fault containment: may include powering off the failed nodes and/or network reconfiguration actions (performed by NFVI and NFV-MANO).
Fault remediation: the VNFCI performs VM failover to the standby. Stateful fault remediation requires that the standby node is brought to a state consistent with the state of the external state repository (reactively or proactively). Supplementary actions (e.g., network reconfiguration) may be the responsibility of NFV-MANO.
Fault recovery: NFV-MANO assigns a replacement for the failed node from the cloud resource pool as the new standby entity. NFV-MANO is then responsible for on-demand diagnosis of the candidate failed entities and for initiating any subsequent physical recovery requests for entities with confirmed persistent faults.
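A minimal sketch of the stateful-external pattern: the active instance writes its critical state through to an external repository, and the standby loads that state before taking over. The classes `ExternalStateRepository` and `Vnfci` and their methods are hypothetical stand-ins, and the sketch assumes the standby synchronizes only at takeover, which is one possible reading of the "reactive" case.

```python
class ExternalStateRepository:
    """Stand-in for the external state replica repository."""

    def __init__(self):
        self._data = {}

    def save(self, key, value):
        self._data[key] = value

    def load_all(self):
        # Return a copy so callers cannot mutate the repository by accident.
        return dict(self._data)


class Vnfci:
    """A VNFC instance that protects its critical state externally."""

    def __init__(self, name, repo, active=False):
        self.name = name
        self.repo = repo
        self.active = active
        self.state = {}

    def update(self, key, value):
        """Active instance: change local state and write it through."""
        self.state[key] = value
        self.repo.save(key, value)

    def take_over(self):
        """Standby instance: become consistent with the repository, then serve."""
        self.state = self.repo.load_all()
        self.active = True


repo = ExternalStateRepository()
active = Vnfci("vnfci-1", repo, active=True)
standby = Vnfci("vnfci-2", repo)

active.update("session-42", "established")
active.active = False   # simulate failure of the active instance
standby.take_over()
print(standby.state)
```

A proactive variant would keep the standby synchronized continuously, trading extra replication traffic for a shorter takeover.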


Active–Standby VNFC redundancy configuration (stateful-partial)
Fault containment: NFVI and NFV-MANO perform the required containment actions. Depending on the specific failure mode and its associated scope, containment may include powering off failed nodes and/or network reconfiguration actions.
Fault remediation: the VNFM or VNF performs VM failover, initiated by the application. Remediation actions can also be split between the NFVI and the VNF: for example, the NFVI may be fully responsible for network fault remediation, while the VNF may use network APIs at the NFV-MANO layer to request network reconfiguration as part of specific VNFC failure remediation cases. The VNF/VNFM is responsible for starting the application in the state reflecting the replicated protected state.

Active–Standby VNFC redundancy configuration (stateful-partial), continued
Fault recovery: NFV-MANO assigns a replacement for the failed node from the cloud resource pool as the new standby entity. The VNFC is responsible for the state replication that brings the new standby up to date with the active state, which restores the redundancy configuration. NFV-MANO is then responsible for on-demand diagnosis of the candidate failed entities and for initiating any subsequent physical recovery requests for entities with confirmed persistent faults.
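Partial state replication can be pictured as copying only the protected part of the active instance's state to the standby. The sketch below is an assumption-laden illustration: `CRITICAL_KEYS`, `checkpoint` and `resync_new_standby` are invented names, and which state counts as critical is a VNFC design decision that the slides do not specify.

```python
import copy

# Hypothetical choice of which state the VNFC treats as critical.
CRITICAL_KEYS = {"sessions", "counters"}


def checkpoint(full_state, critical_keys=CRITICAL_KEYS):
    """Copy only the protected (critical) part of the active state."""
    return {
        key: copy.deepcopy(value)
        for key, value in full_state.items()
        if key in critical_keys
    }


def resync_new_standby(active_state):
    """Bring a newly assigned standby up to date with the active state."""
    return checkpoint(active_state)


active_state = {
    "sessions": {"s1": "up"},
    "counters": [3],
    "cache": {"x": 1},      # not protected, rebuilt after failover
}
standby_state = resync_new_standby(active_state)
print(standby_state)
```

The deep copy keeps the standby's replica independent of later changes on the active side until the next replication step.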

Active–Standby VNFC redundancy configuration (stateful-full)
Placement: the VNFCs of the redundant pair need to be placed on different hardware servers with no or limited common failure modes.
State protection: the NFVI (specifically the hypervisor) performs full VM state replication, including full VM execution state replication, as a platform service.
Fault detection: NFVI.
Fault localization: NFVI and NFV-MANO.

Active–Standby VNFC redundancy configuration (stateful-full), continued
Fault containment: NFVI and NFV-MANO perform the required containment actions. Depending on the specific failure mode and its associated scope, containment may include powering off failed nodes and/or network reconfiguration actions.
Fault remediation: NFVI performs VM failover at the hypervisor layer. Supplementary actions (e.g., network reconfiguration) may be the responsibility of NFV-MANO.

Active–Standby VNFC redundancy configuration (stateful-full), continued
Fault recovery: NFV-MANO assigns a replacement for the failed node from the cloud resource pool as the new standby entity. The NFVI layer is responsible for the state replication that brings the new standby up to date with the active state, which restores the redundancy configuration. NFV-MANO is then responsible for on-demand diagnosis of the candidate failed entities and for initiating any subsequent physical recovery requests for entities with confirmed persistent faults.

NFVI and NFV-MANO support for VNF reliability and availability

| Method | Placement | State protection | Fault detection | Fault localization | Fault containment | Fault remediation | Fault recovery |
|---|---|---|---|---|---|---|---|
| Non-redundant, stateless | Free | Not applicable | NFVI | NFVI & MANO | NFVI & MANO | VM re-instantiation (MANO) | None |
| Non-redundant, stateful (ext) | Free | Responsibility of the VNFCI | NFVI | NFVI & MANO | NFVI & MANO | VM re-instantiation (MANO) | None |
| Active/standby, stateless | Different node | Not applicable | NFVI | NFVI & MANO | NFVI & MANO | Hypervisor layer (NFVI) | Cloud resource pool |
| Active/standby, stateful (ext) | Different node | External state replica repository | NFVI | NFVI & MANO | NFVI & MANO | VM failover (VNFCI) | Cloud resource pool |
| Active/standby, partial CP (stateful) | Different node | VNFC vendor | NFVI | NFVI & MANO | NFVI & MANO | VM failover (VNFM/VNF) by application | Cloud resource pool |
| Active/standby, full VM CP (stateful) | Different node | NFVI (hypervisor) | NFVI | NFVI & MANO | NFVI & MANO | Hypervisor layer (NFVI) | Cloud resource pool |
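For quick comparisons, the remediation column of the summary can be held in a small dictionary and queried. This is only a convenience sketch: the keys, the `SUMMARY` name and the `remediated_by` helper are invented for illustration, and only the placement and remediation columns are included.

```python
# (method, variant) -> selected columns of the summary table.
SUMMARY = {
    ("non-redundant", "stateless"): {
        "placement": "free", "remediation_by": "NFV-MANO"},
    ("non-redundant", "stateful-external"): {
        "placement": "free", "remediation_by": "NFV-MANO"},
    ("active-standby", "stateless"): {
        "placement": "different node", "remediation_by": "NFVI"},
    ("active-standby", "stateful-external"): {
        "placement": "different node", "remediation_by": "VNFCI"},
    ("active-standby", "stateful-partial"): {
        "placement": "different node", "remediation_by": "VNFM/VNF"},
    ("active-standby", "stateful-full"): {
        "placement": "different node", "remediation_by": "NFVI"},
}


def remediated_by(entity):
    """List the configurations whose fault remediation is led by `entity`."""
    return [cfg for cfg, row in SUMMARY.items() if row["remediation_by"] == entity]


print(remediated_by("NFVI"))
```

The query makes the pattern in the table explicit: the hypervisor-level failover cases are the stateless pair and the full VM checkpoint pair, where no application-level state handling is needed.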