Are You Insured Against Your Noisy Neighbor - A VSPERF Use Case

Presentation transcript:

Are You Insured Against Your Noisy Neighbor - A VSPERF Use Case. Sunku.Ranganath@Intel.com, Sridhar.Rao@Spirent.com, Shreya.Pandita@Spirent.com

Agenda: Intro to VSPERF. Intro to Intel RDT & Spirent Cloud Stress. Demo: Noisy Neighbor impact with VSPERF. Intro to RMD. Demo: Mitigating Noisy Neighbor impact with RMD. Call to Action.

Intro to VSPERF: Define, implement and execute a test suite to characterize the performance of a virtual switch in the NFVi. Based on industry standards. Ability to assign and scale CPUs for VNFs. Supports multiple traffic generators and virtual switches with various VNF deployment scenarios.

Common Contention in Cloud Deployments: Minimizing Total Cost of Ownership (TCO) often leads to oversubscription. Quality of Service (QoS) requirements. Service Level Agreements (SLAs). Metrics: Service Availability, Throughput, Latency, Scaling. Cloud vs. Network Function Virtualization deployments: Optimizing CPU resource utilization often leads to shared resource contention. Multi-tenancy and automated workload placement. Lack of control of cache by the orchestration layer.

Intel® Resource Director Technology (Intel® RDT): Cache Monitoring Technology (CMT): Identify misbehaving applications and reschedule according to priority. Cache occupancy is reported on a per Resource Monitoring ID (RMID) basis (advanced telemetry). Cache Allocation Technology (CAT): Last-Level Cache partitioning mechanism enabling separation and prioritization of apps or VMs. Misbehaving threads can be isolated to increase determinism. [Figure: for each technology, cores running apps above a shared Last-Level Cache and DRAM]

Key Concepts: Class of Service (CLOS). Threads, apps and VMs are grouped into Classes of Service (CLOS, also written COS) for resource allocation. Resource usage of any thread, app, VM, or a combination is controlled with a CLOS. The CLOS for a thread is specified via the per-core IA32_PQR_ASSOC (“PQR”) MSR. Configure resource guidelines per CLOS. Associate threads into CLOS. Hardware manages resource allocation. Default bitmask: LLC is all shared. Overlapped bitmask: LLC is partially shared; a low-priority workload will be placed in a COS with shared resources. Isolated bitmask: LLC is allocated separately to individual COS.
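The three bitmask arrangements named on this slide can be illustrated with a few lines of Python. This is only a sketch for reasoning about masks, not part of VSPERF, RDT tooling or RMD; the function names and the 20-way cache in the usage note are assumptions made for illustration. It also checks that a mask is one unbroken run of set bits, which is the shape Intel CAT capacity bitmasks are expected to have.

```python
def is_contiguous(mask: int) -> bool:
    """Return True if the set bits of mask form one unbroken run."""
    if mask <= 0:
        return False
    # Drop trailing zero bits; a contiguous run then looks like 0b0111...1,
    # so adding 1 to it produces a power of two with no bits in common.
    run = mask >> ((mask & -mask).bit_length() - 1)
    return (run & (run + 1)) == 0


def sharing(mask_a: int, mask_b: int, total_ways: int) -> str:
    """Classify how two CLOS bitmasks relate, using the slide's terms."""
    full = (1 << total_ways) - 1  # every cache way enabled
    if mask_a == full and mask_b == full:
        return "default"      # LLC is all shared
    if mask_a & mask_b:
        return "overlapped"   # LLC is partially shared
    return "isolated"         # LLC is allocated separately


if __name__ == "__main__":
    ways = 20  # hypothetical number of LLC ways
    print(sharing(0xFFFFF, 0xFFFFF, ways))
    print(sharing(0x00FF0, 0x000FF, ways))
    print(sharing(0x00F00, 0x000FF, ways))
```

With the hypothetical 20-way cache, the three calls correspond to the default, overlapped and isolated cases in that order.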

Noisy Neighbor Impact & VSPERF: VSPERF integration with Collectd provides insight into NFVi data plane resource utilization. VSPERF automates the deployment of the Phy-VM-Phy setup. Cloud Stress is used as a Noisy Neighbor. [Figure: A Phy-VM-Phy deployment. Labels: traffic generator port, 2000 flows; VNF 1 (Testpmd L2 FWD) with virtio ports 0 and 1; Open vSwitch bridge with DPDK PMDs; tenant ports 1 and 2; onboard Intel GbE NIC; Linux kernel; Intel Xeon platform, Pod 12, Node 4, NUMA node 0; two Cloud Stress noisy neighbors; dedicated-core groups of 2, 3, 4 and 4 cores]

Intro to Spirent CloudStress: Web-based infrastructure validation application. Performance and capacity planning for Compute, Memory, Storage and Network I/O. Dynamic workloads to validate NFV/Cloud infrastructure.

Intro to Spirent CloudStress [Figure labels: Virtual Firewall; NFVi (Compute, Network, Storage)]

Creating Virtual Machine Profiles [Figure labels: Spirent CloudStress; NFVi (Compute, Network, Storage)]

Creating Virtual Machine Profiles [Figure labels: two Spirent CloudStress instances; NFVi Under Test (Compute, Network, Storage)]

Capacity Planning [Figure labels: NFVi Under Test (Compute, Network, Storage)]

Cloud Stress as Noisy Neighbor: Assess the impact of resource contention on VNFs and/or NFV service chains. Generate flap or negative events on a system to cause intentional disruption; this helps in understanding the impact of noisy neighbors on the given system. What is the impact of a CPU spike on one of the VMs on a fully loaded host? If network load drops suddenly, and after a short time returns all at once, is full capacity immediately available? How do VMs on oversubscribed hosts respond to uneven loads on a small subset of VMs? [Figure labels: Noisy Neighbor; VNF Performance; vRouter, vFW, vCPE, VNF; NFVi Under Test (Compute, Network, Storage)]

Demo: Impact of Noisy Neighbor on VNF Under Test

Planning For Resources: Remote analysis of resource utilization and remote granular resource control are not optimal for latency-sensitive workloads. Planning for your cache: LLC profiling. LLC considerations. Class of Service construction.

Class Of Service Construction: Total LLC considerations: Capacity of the cache. Would you require DDIO? Isolated vs. overlapping cache COS. It is crucial to have a local agent on the host to control and enforce COS associations for latency-sensitive workloads. [Figure: Traffic flow from NIC to VMs, with the non-DDIO packet path marked as path 1]
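The cache-capacity question on this slide comes down to simple arithmetic over cache ways. The sketch below is illustrative only: the helper names, the 20 MiB / 20-way cache and the `reserved_top_ways` knob (ways you might choose to leave untouched, for example for DDIO) are assumptions and do not come from the slides; how many ways DDIO uses, and which ones, is platform specific.

```python
import math


def ways_needed(working_set_bytes: int, llc_bytes: int, total_ways: int) -> int:
    """Smallest number of cache ways whose combined size covers the working set."""
    # Each way holds llc_bytes / total_ways bytes; round up, minimum one way.
    return max(1, math.ceil(working_set_bytes * total_ways / llc_bytes))


def isolated_masks(total_ways: int, way_counts: dict, reserved_top_ways: int = 0) -> dict:
    """Pack one contiguous, non-overlapping mask per COS, starting at bit 0.

    reserved_top_ways (hypothetical knob) leaves the highest ways unassigned.
    """
    usable = total_ways - reserved_top_ways
    if sum(way_counts.values()) > usable:
        raise ValueError("requested ways exceed the usable LLC ways")
    masks = {}
    shift = 0
    for name, count in way_counts.items():
        if count < 1:
            raise ValueError(f"{name}: a COS needs at least one way")
        masks[name] = ((1 << count) - 1) << shift
        shift += count
    return masks


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Hypothetical platform: 20 MiB LLC split into 20 ways (1 MiB per way).
    print(ways_needed(3 * MIB + MIB // 2, 20 * MIB, 20))
    layout = isolated_masks(20, {"vswitch": 4, "vm1": 6, "os": 2}, reserved_top_ways=2)
    for name, mask in layout.items():
        print(f"{name}: {mask:#07x}")
```

An overlapping layout would instead let two masks share some bits; the isolated packing is shown because it is the simpler case to verify by hand.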

Enabling Options: Platform Quality of Service (pqos) tool: User space tool requiring access to Intel MSRs. Associates LLC on a per core ID basis. https://github.com/intel/intel-cmt-cat Resctrl file system: Extension of kernfs. Associates using PID, on a per thread basis. Kernel 4.10+. Resource Management Daemon: Newly open sourced. Based on resctrl fs. [Figure: Kernel resctrl fs]
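For the resctrl option, an allocation is expressed as a `schemata` line such as `L3:0=ff0;1=f` (resource name, then one hexadecimal mask per cache ID). The helpers below are a minimal sketch for building and reading such lines; the function names are made up for this example and nothing here touches the file system. On a host with resctrl mounted, a line like this would be written to the `schemata` file of a group directory under `/sys/fs/resctrl`, and thread IDs to that group's `tasks` file.

```python
def format_schemata(resource: str, masks_by_cache_id: dict) -> str:
    """Build a resctrl-style schemata line, e.g. 'L3:0=ff0;1=f'."""
    parts = [f"{cache_id}={mask:x}" for cache_id, mask in sorted(masks_by_cache_id.items())]
    return f"{resource}:" + ";".join(parts)


def parse_schemata(line: str):
    """Split a schemata line into (resource, {cache_id: mask})."""
    resource, _, body = line.strip().partition(":")
    masks = {}
    for part in body.split(";"):
        cache_id, _, mask = part.partition("=")
        masks[int(cache_id)] = int(mask, 16)  # masks are hexadecimal
    return resource.strip(), masks


if __name__ == "__main__":
    line = format_schemata("L3", {0: 0xFF0, 1: 0x00F})
    print(line)
    print(parse_schemata(line))
```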

Resource Management Daemon. What is RMD: A Linux daemon that runs on individual hosts, with pluggable interfaces to interact with orchestration, monitoring and enforcement layers. Communicates across control and data plane using a REST API. Receives resource policy from the orchestration layer and enforces it on the host. Enforces resource allocation using kernel interfaces like resctrlfs or using libraries like libpqos. Why RMD: Complex usage (mask). Real-time tuning. Varying platforms (cache size, bandwidth, NUMA). Fast-shifting workloads (local policy). Uniform interface for RDT. Simple API interface. Hosted at https://github.com/intel/rmd
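To make the "receives resource policy over a REST API" idea concrete, here is a sketch of how an orchestration-side client might prepare such a request in Python. The slides do not give the API details, so everything specific is an assumption: the `/v1/workloads` path, the port, and the `task_ids` field are hypothetical placeholders; only `max_cache` and `min_cache` are names that appear in this presentation. The request is built but deliberately not sent. Consult the RMD repository for the actual interface.

```python
import json
import urllib.request


def build_workload_request(base_url, task_ids, max_cache, min_cache):
    """Prepare (but do not send) a JSON POST asking for cache for some tasks.

    The URL path and the 'task_ids' key are assumptions for illustration.
    """
    payload = {
        "task_ids": [str(t) for t in task_ids],  # hypothetical field name
        "max_cache": max_cache,                   # named on the RMD slides
        "min_cache": min_cache,                   # named on the RMD slides
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/workloads",  # hypothetical endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical local daemon address.
    req = build_workload_request("http://127.0.0.1:8081", [12345], max_cache=4, min_cache=4)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending it would be a call to `urllib.request.urlopen(req)` against a running daemon; that step is left out because the real endpoint and response format are not described in the slides.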

RMD Architecture: Open sourced on Nov 9th 2017. Provides the construct of overlapped and isolated COSes. Helps tune the LLC for optimal performance. Simple to use with max_cache and min_cache constructs. [Figure: RMD architecture diagram, with WIP sections marked]

Configuration:
Policy       Details
osgroup      cache ways reserved for operating system usage
infragroup   cache ways will be shared with other workloads
guarantee    allocate cache for workload, max_cache == min_cache > 0
besteffort   allocate cache for workload, max_cache > min_cache > 0
shared       allocate cache for workload, max_cache == min_cache == 0
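The three workload policies differ only in how max_cache and min_cache relate, so the rule can be written as a small function. This is a sketch of the rule as stated on the slide, not RMD's own code; the function name is made up, and rejecting combinations the slide does not list is a choice made for this example.

```python
def classify_policy(max_cache: int, min_cache: int) -> str:
    """Map a (max_cache, min_cache) pair to the workload policy it implies."""
    if max_cache < 0 or min_cache < 0:
        raise ValueError("cache way counts cannot be negative")
    if max_cache == 0 and min_cache == 0:
        return "shared"       # max_cache == min_cache == 0
    if max_cache == min_cache:
        return "guarantee"    # max_cache == min_cache > 0
    if max_cache > min_cache > 0:
        return "besteffort"   # max_cache > min_cache > 0
    # Anything else (e.g. min above max, or min of 0 with max above 0)
    # is not covered by the slide's three rules.
    raise ValueError("combination not covered by the listed policies")


if __name__ == "__main__":
    for pair in [(4, 4), (4, 2), (0, 0)]:
        print(pair, classify_policy(*pair))
```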

Demo: Mitigation of Noisy Neighbor Impact with RMD

Permutations of Test Scenarios: Overlapping COS between: virtual switch and VMs; multiple VMs; OS and virtual switch. Isolated COS between: (no entries in the transcript). DDIO considerations: exclusive to VMs; exclusive to OS; shared across virtual switch & VMs. Forced contentions: limited LLC to VM under test; limited LLC to virtual switch. [Figure: Permutations of COS association. Labels: Hypervisor, PHY, PMDs, VM1, VM2, vswitchd, OS; COS0, COS1, COS2, COS3; core numbers 0-23; Total LLC, Isolated LLC, Isolated OVS LLC, Overlapped OVS LLC, Overlapped LLC across VMs; three DDIO placements]
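One way to turn these permutations into a test matrix is to enumerate them programmatically. The sketch below assumes, for illustration, that COS mode, DDIO placement and forced contention can be varied independently, and it adds a "none" entry for runs without forced contention; neither assumption comes from the slides, and this is not VSPERF's own test-case format.

```python
from itertools import product

COS_MODES = ("overlapping", "isolated")
DDIO_OPTIONS = (
    "exclusive to VMs",
    "exclusive to OS",
    "shared across virtual switch & VMs",
)
# "none" is an added baseline, not listed on the slide.
FORCED_CONTENTION = (
    "none",
    "limited LLC to VM under test",
    "limited LLC to virtual switch",
)


def scenarios():
    """Return every combination of the three axes as a list of dicts."""
    return [
        {"cos": cos, "ddio": ddio, "contention": contention}
        for cos, ddio, contention in product(COS_MODES, DDIO_OPTIONS, FORCED_CONTENTION)
    ]


if __name__ == "__main__":
    matrix = scenarios()
    print(len(matrix), "scenarios")
    for index, scenario in enumerate(matrix, start=1):
        print(index, scenario)
```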

In Summary: Noisy Neighbor effects are real and here to stay. RMD provides a REST API for the control/orchestration/management layer to request LLC for its VMs, containers and applications. Call To Action: Enable test cases for VSPERF with various combinations of cache associations. Scale the test scenarios for your projects with RMD and/or Cloud Stress.

Questions?

References:
Cloud Testing with Synthetic Workload Gen: https://www.spirent.com/-/media/White-Papers/Broadband/PAB/Cloud_testing_with_synthetic_workload_generators.pdf
Virtual Infrastructure Benchmarking: https://www.spirent.com/-/media/White-Papers/Broadband/PAB/Key_Considerations_for_Virtual_Infrastructure_Benchmarking_whitepaper.pdf
Intro to Intel RDT: https://01.org/intel-rdt-linux/blogs/fyu1/2017/resource-allocation-intel%C2%AE-resource-director-technology
Intro to RMD: https://github.com/intel/rmd
Deterministic NFV w/ Intel RDT: https://builders.intel.com/docs/networkbuilders/deterministic_network_functions_virtualization_with_Intel_Resource_Director_Technology.pdf