Published by Ilene Bailey. Modified over 9 years ago.
1
OPNFV Summit 2015
Doctor - Fault Management
Gerald Kunzmann, DOCOMO
Carlos Goncalves, NEC
Ryota Mibu, NEC
2
Doctor Overview
Goal
- Build fault management and maintenance framework
Approach
- Identify requirement
- Gap Analysis
- Implementation work in Upstream (OpenStack)
- Integration and testing
Status
- Initial Requirement study, architecture design, Gap analysis: Done
- Collaborative Development: On-going (3 merged Blueprints in OpenStack Liberty)
- Standardization Sync: On-going (by NFV member efforts, joint meeting)
3
Doctor Members
At project creation (Dec 2014)
- NTT DOCOMO, Sprint
- NEC, Nokia, Ericsson, Huawei, ClearPath Network, Cisco
Now (Oct 2015)
- NTT DOCOMO, Sprint, AT&T, Telecom Italia, KDDI
- NEC, Nokia, Ericsson, Huawei, ClearPath Network, Cisco
- Cloudbase Solutions, Spirent, Intel, ZTE
Growth: 2x
4
Assumption of VNF (NFV Application)
- Telco applications are basically deployed in active-standby or active-active fashion
- App state will be switched when a failure occurs
- App and App Manager (VNFM) cannot detect HW failures directly
Diagram: App (Active) and App (Standby), each running in a VM on its own Machine
5
Use Case 1: Fault management
Actors: Consumer C1, Consumer C2, Consumer C3; Virtualized Infrastructure Manager (VIM), e.g. OpenStack, reached through the OpenStack Northbound Interface
1. Fault Monitoring
- Hardware fault
- Hypervisor fault
- Host OS fault
2. Inform the Consumer? If YES, find owner of affected VMs from database
3. FaultNotification (VM ID, Fault ID)
4. Switch to SBY (standby) configuration
5. Instruction (VM ID)
6. Execute Instruction - e.g. migrate VM
Resource Map (Server – VM mapping)
- Server S1: VM-1, VM-2
- Server S2: VM-7
- Server S3: VM-4
Ownership information
- VM-1, VM-7: Consumer C1
- VM-2: Consumer C2
- VM-4: Consumer C3
Resource Pool: Hardware Server S1, Hardware Server S2, Hardware Server S3, each with a Hypervisor
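Steps 2 and 3 of this use case amount to two table lookups: from the faulty server to its VMs, then from each VM to its owner. The following is a minimal sketch of that lookup, using the server and ownership tables shown on the slide. The function name, the short keys (`S1`, `C1`), the fault ID string, and the tuple layout of the notification are assumptions made for illustration; they are not part of the Doctor project or of OpenStack.

```python
# Illustrative data copied from the slide's Resource Map and ownership table.
SERVER_TO_VMS = {
    "S1": ["VM-1", "VM-2"],
    "S2": ["VM-7"],
    "S3": ["VM-4"],
}
VM_TO_CONSUMER = {
    "VM-1": "C1",
    "VM-7": "C1",
    "VM-2": "C2",
    "VM-4": "C3",
}


def fault_notifications(server, fault_id):
    """Steps 2 and 3: find affected VMs and build one notification per VM.

    Returns a list of (consumer, vm_id, fault_id) tuples. The tuple layout is
    a hypothetical stand-in for the FaultNotification (VM ID, Fault ID) message.
    """
    notifications = []
    for vm_id in SERVER_TO_VMS.get(server, []):
        consumer = VM_TO_CONSUMER.get(vm_id)
        if consumer is not None:  # "Inform the Consumer?" answered YES
            notifications.append((consumer, vm_id, fault_id))
    return notifications


if __name__ == "__main__":
    # "HW_FAULT" is a made-up fault ID used only for this example.
    for note in fault_notifications("S1", "HW_FAULT"):
        print(note)
```

The same lookup would serve the maintenance use case, with the server coming from an administrator's request instead of a detected fault.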
6
Use Case 2: Maintenance
Actors: Administrator; Consumer C1, Consumer C2, Consumer C3; Virtualized Infrastructure Manager (VIM), e.g. OpenStack, reached through the OpenStack Northbound Interface
1. Maintenance Request (Server S3)
2. Which VMs are affected? Find Consumer owning the VM(s) from the database.
3. Maintenance Notification (VM ID)
4. Switch to SBY (standby) configuration
5. Instruction (VM ID)
6. Execute Instruction - e.g. migrate VM
Resource Map (Server – VM mapping)
- Server S1: VM-1, VM-2
- Server S2: VM-7
- Server S3: VM-4
Ownership information
- VM-1, VM-7: Consumer C1
- VM-2: Consumer C2
- VM-4: Consumer C3
Resource Pool: Hardware Server S1, Hardware Server S2, Hardware Server S3, each with a Hypervisor
7
Fault Management Sequence
Phases: Detection, Reaction
Applications (User and Administrator): App, App, App
Virtualized Infrastructure Manager (VIM) = OpenStack
Virtualized Infrastructure
- Virtual Compute, Virtual Storage, Virtual Network
- Virtualization Layer
- Hardware Resources
Diagram label: Doctor Scope
8
Key Requirements as VIM
- Consistent Resource State Awareness
- Immediate Notification
- Extensible Monitoring
- Fault Correlation
9
Doctor Architecture and Typical Scenario
Components: Application, Manager, Notifier (Alarm Conf.), Controller (Resource Map), Inspector (Failure Policy), Monitor, Virtualized Infrastructure (Resource Pool)
0. Set Alarm
1. Raw Failure
2. Find Affected
3. Update State
4. Notify all
5. Notify Error
6-. Action
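The numbered scenario can be read as a small event pipeline. The sketch below wires hypothetical `Inspector`, `Controller`, and `Notifier` classes together to show one possible reading of steps 0 to 5. The class names follow the slide, but the methods, the failure policy entries, and the resource map contents are all assumptions for illustration and do not reflect any OpenStack code.

```python
from collections import defaultdict

# Hypothetical failure policy: raw failure type -> state to set on affected resources.
FAILURE_POLICY = {"host_down": "error", "nic_degraded": "degraded"}

# Hypothetical resource map: host -> virtual resources running on it.
RESOURCE_MAP = {"host-1": ["vm-a", "vm-b"], "host-2": ["vm-c"]}


class Notifier:
    def __init__(self):
        self.alarm_conf = defaultdict(list)  # resource id -> manager callbacks

    def set_alarm(self, resource_id, callback):
        """Step 0: a manager registers interest in one resource."""
        self.alarm_conf[resource_id].append(callback)

    def notify(self, resource_id, state):
        """Step 5: deliver the error to every manager subscribed to the resource."""
        for callback in self.alarm_conf.get(resource_id, []):
            callback(resource_id, state)


class Controller:
    def __init__(self, notifier):
        self.states = {}
        self.notifier = notifier

    def update_state(self, resource_id, state):
        """Steps 3 and 4: record the new state, then hand it to the notifier."""
        self.states[resource_id] = state
        self.notifier.notify(resource_id, state)


class Inspector:
    def __init__(self, controller):
        self.controller = controller

    def on_raw_failure(self, host, failure_type):
        """Steps 1 and 2: receive a raw failure and find the affected resources."""
        state = FAILURE_POLICY.get(failure_type)
        if state is None:
            return  # no policy entry for this failure, so nothing is reported
        for resource_id in RESOURCE_MAP.get(host, []):
            self.controller.update_state(resource_id, state)


if __name__ == "__main__":
    received = []
    notifier = Notifier()
    notifier.set_alarm("vm-a", lambda rid, state: received.append((rid, state)))
    inspector = Inspector(Controller(notifier))
    inspector.on_raw_failure("host-1", "host_down")  # a Monitor would make this call
    print(received)
```

In this reading every affected resource has its state updated, but only managers that set an alarm beforehand are notified; step 6 (the manager's action) is left to the callback.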
10
Doctor OSS Map
Components: Application, Manager, Notifier (Alarm Conf.), Controller (Resource Map), Inspector (Failure Policy), Monitor, Virtualized Infrastructure (Resource Pool)
OSS projects shown on the map
- Controller: Nova, Neutron, Cinder
- Notifier: Ceilometer
- Monitor: e.g. Zabbix
- Inspector: e.g. Monasca
Steps: 0. Set Alarm; 1. Raw Failure; 2. Find Affected; 3. Update State; 4. Notify all; 5. Notify Error; 6-. Action
11
Doctor OSS Development
Components: Application, Manager, Notifier (Alarm Conf.), Controller (Resource Map), Inspector (Failure Policy), Monitor, Virtualized Infrastructure (Resource Pool)
OSS projects shown on the map
- Controller: Nova, Neutron, Cinder
- Notifier: Ceilometer
- Monitor: e.g. Zabbix
- Inspector: e.g. Monasca
Development items highlighted: Event Alarm, State Correction
Steps: 0. Set Alarm; 1. Raw Failure; 2. Find Affected; 3. Update State; 4. Notify all; 5. Notify Error; 6-. Action
12
Doctor Blueprints in Liberty Cycle
Columns: Project, Blueprint, Spec Drafter, Developer, Status
Ceilometer
- Event Alarm Evaluator: Ryota Mibu (NEC); Completed (Liberty) ✓
Nova
- New nova API call to mark nova-compute down: Tomi Juvonen (Nokia), Roman Dobosz (Intel) ✓
- Support forcing service down: Carlos Goncalves (NEC) ✓
- Get valid server state: Spec approved (Mitaka)
- Add notification for service status change: Balazs Gibizer (Ericsson); Waiting for spec approval (Mitaka)
13
Doctor BP Detail: Nova – Mark Nova-Compute Down
NEW: Force-down API, an API to update nova-compute service state, called by an External Monitoring Service (Client Monitoring)
EXISTING: periodic update of the service state
Nova services: nova api, nova conductor, nova scheduler, queue, nova DB (service state)
Host / Machine: nova compute, VM, Hypervisor, vSwitch, BMC
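As a sketch of what an external monitoring service might send to the force-down API, the snippet below builds (but does not send) an HTTP request using only the Python standard library. The path `/os-services/force-down`, the body fields, and compute API microversion 2.11 are given from memory of the Nova compute API and should be checked against the official API reference. The endpoint URL, token, and host name are placeholders, and the helper function itself is hypothetical.

```python
import json
import urllib.request


def build_force_down_request(nova_endpoint, token, host, forced_down=True):
    """Build, without sending, a request that marks nova-compute on `host` as down.

    Path, body fields, and microversion are assumptions to verify against the
    Nova API reference before use.
    """
    body = {"host": host, "binary": "nova-compute", "forced_down": forced_down}
    return urllib.request.Request(
        url=nova_endpoint.rstrip("/") + "/os-services/force-down",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,
            # Microversion believed to expose the forced_down attribute.
            "X-OpenStack-Nova-API-Version": "2.11",
        },
    )


if __name__ == "__main__":
    # Placeholder endpoint, token, and host; nothing is sent over the network.
    req = build_force_down_request("http://controller:8774/v2.1", "TOKEN", "compute-1")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`, which is left out so the example stays runnable without a Nova deployment.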
14
Doctor BP Detail: Ceilometer - Event Alarm
Notification sources: Nova, Neutron, Cinder
EXISTING (polling-based): notification, event, sample, stats
NEW Shortcut (notification-based): Notification-driven alarm evaluator
Other diagram labels: Manager, Audit Service
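The difference between the polling-based path and the notification-based shortcut is when an alarm gets evaluated: on a timer, or as soon as a notification arrives. The sketch below illustrates the second approach with a hypothetical evaluator that checks each incoming event against alarm rules immediately. The alarm and event dictionary layouts, the trait names, and the class itself are assumptions for illustration, not Ceilometer's actual data model or code.

```python
def matches(alarm, event):
    """True if the event has the alarm's type and satisfies every trait query."""
    if event["event_type"] != alarm["event_type"]:
        return False
    return all(event["traits"].get(key) == value for key, value in alarm["query"].items())


class EventAlarmEvaluator:
    """Evaluates alarms as each notification arrives, with no polling interval."""

    def __init__(self, alarms):
        self.alarms = alarms
        self.fired = []  # (alarm name, instance id) pairs, in arrival order

    def on_notification(self, event):
        for alarm in self.alarms:
            if matches(alarm, event):
                self.fired.append((alarm["name"], event["traits"].get("instance_id")))


if __name__ == "__main__":
    # Hypothetical alarm: fire when a given instance reports the "error" state.
    alarms = [{
        "name": "vm-1-error",
        "event_type": "compute.instance.update",
        "query": {"instance_id": "vm-1", "state": "error"},
    }]
    evaluator = EventAlarmEvaluator(alarms)
    evaluator.on_notification({
        "event_type": "compute.instance.update",
        "traits": {"instance_id": "vm-1", "state": "error"},
    })
    print(evaluator.fired)
```

Because evaluation happens inside `on_notification`, the delay between a state change and the alarm is bounded by message delivery rather than by a polling period.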
15
Doctor Southbound API
Sections: Configuration, Fault Messaging, Unified Event API
Configuration
- Admin: Threshold, Enable Conf., Policy Conf.
- User: Enable
Fault Messaging
- NFVI, Monitor (x3), Unified Event API, Inspector, Controller, Notifier
16
Doctor Status
Components and OSS projects
- Notifier: Ceilometer
- Controller: Nova, Neutron, Cinder
- Inspector: Monasca?
- Monitor: Zabbix, DPDK
Timeline stages: Arch. Design, Gap Analysis, Blueprint, Coding, Integration, OPNFV Release
Dates: Dec 2014, Mar 2015, Sep 2015, Feb 2016
Markers: Done, Next Step, To-Be
17
Don’t miss out...
- “Doctor – Fault Management”: Project Theater, Wednesday, 3:55 pm – 4:15 pm
- “Doctor: Failure Detection and Notification for NFV”: DOCOMO booth, PoC Demo Zone