Are You Insured Against Your Noisy Neighbor - A VSPERF Use Case

1 Are You Insured Against Your Noisy Neighbor - A VSPERF Use Case

2 Agenda
Intro to VSPERF
Intro to Intel RDT & Spirent Cloud Stress
Demo: Noisy Neighbor impact with VSPERF
Intro to RMD
Demo: Mitigating Noisy Neighbor impact with RMD
Call to Action

3 Intro to VSPERF
Define, implement and execute a test suite to characterize the performance of a virtual switch in the NFVi.
Based on industry standards.
Ability to assign and scale CPUs for VNFs.
Supports multiple traffic generators and virtual switches with various VNF deployment scenarios.

4 Common Contention in Cloud Deployments
Minimizing Total Cost of Ownership (TCO) often leads to oversubscription.
Quality of Service (QoS) requirements.
Service Level Agreements (SLAs).
Metrics: Service Availability, Throughput, Latency, Scaling.
Cloud vs. Network Function Virtualization deployments.
Optimizing CPU resource utilization often leads to shared resource contention.
Multi-tenancy and automated workload placement.
Lack of control of cache by the orchestration layer.

5 Intel® Resource Director Technology (Intel® RDT)
Cache Monitoring Technology (CMT): Identify misbehaving applications and reschedule according to priority. Cache occupancy is reported on a per Resource Monitoring ID (RMID) basis (advanced telemetry).
Cache Allocation Technology (CAT): Last-Level Cache partitioning mechanism enabling separation and prioritization of apps or VMs. Misbehaving threads can be isolated to increase determinism.
Figure: APP, CORE, Last-Level Cache and DRAM blocks, shown once for CMT and once for CAT.
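As an illustration of per-RMID occupancy telemetry, the sketch below reads LLC occupancy through the Linux resctrl monitoring files rather than through the MSRs directly. It assumes a kernel with resctrl monitoring support mounted at /sys/fs/resctrl, where each monitoring domain exposes a mon_data/mon_L3_*/llc_occupancy file reporting bytes. The function name llc_occupancy_bytes is hypothetical and not part of any tool named in the slides.

```python
import glob
import os


def llc_occupancy_bytes(group_dir: str) -> dict:
    """Return {domain: bytes} read from mon_data/mon_L3_*/llc_occupancy.

    group_dir is a resctrl group directory (assumed: /sys/fs/resctrl or a
    subdirectory of it). Domains whose file does not hold a plain number
    are skipped.
    """
    result = {}
    pattern = os.path.join(group_dir, "mon_data", "mon_L3_*", "llc_occupancy")
    for path in sorted(glob.glob(pattern)):
        domain = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            text = f.read().strip()
        if text.isdigit():
            result[domain] = int(text)
    return result
```

Calling llc_occupancy_bytes("/sys/fs/resctrl") on such a host would report occupancy for the default group; a group with a persistently high value is a candidate noisy neighbor.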

6 Key Concepts: Class of Service (CLOS)
Threads/Apps/VMs are grouped into Classes of Service (CLOS) for resource allocation.
Resource usage of any thread, app, VM, or a combination is controlled with a CLOS.
Specify the CLOS for a thread via the per-core IA32_PQR_ASSOC ("PQR") MSR.
Steps: configure resource guidelines per CLOS; associate threads into CLOS; hardware manages resource allocation.
Default bitmask: LLC is all shared.
Overlapped bitmask: LLC is partially shared. Low-priority workloads will be placed in a COS with shared resources.
Isolated bitmask: LLC is allocated separately to individual COS.
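The three bitmask layouts can be told apart with simple bit arithmetic. The sketch below is illustrative only: the 11-way mask 0x7ff is an assumed cache geometry, and the helper names is_contiguous and classify are hypothetical. The contiguity check reflects the usual CAT requirement that the set bits of a capacity bitmask form one unbroken run.

```python
def is_contiguous(mask: int) -> bool:
    """True if mask is non-zero and its set bits form one unbroken run."""
    if mask <= 0:
        return False
    # Shift out trailing zeros; a contiguous run then looks like 0b11...1.
    run = mask >> ((mask & -mask).bit_length() - 1)
    return (run & (run + 1)) == 0


def classify(mask_a: int, mask_b: int, full_mask: int) -> str:
    """Describe how two CLOS bitmasks relate, using the slide's terms."""
    if mask_a == full_mask and mask_b == full_mask:
        return "default"      # every cache way is shared by both classes
    if mask_a & mask_b == 0:
        return "isolated"     # no cache way is shared
    return "overlapped"       # at least one cache way is shared


FULL = 0x7FF  # assumed 11-way LLC; the real width depends on the platform
print(classify(0x7F0, 0x00F, FULL))  # isolated
print(classify(0x7F0, 0x03F, FULL))  # overlapped
```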

7 Noisy Neighbor Impact & VSPERF
VSPERF integration with Collectd provides insight into NFVi data plane resource utilization.
VSPERF automates the deployment of the Phy-VM-Phy setup.
Cloud Stress acts as a Noisy Neighbor.
Figure: A Phy-VM-Phy deployment. Labels in the figure: Traffic Generator (2000 flows); VNF (Testpmd L2 FWD) with Virtio ports 0 and 1; Open vSwitch bridge with DPDK PMDs; Tenant ports 1 and 2; onboard Intel GbE NIC; Linux kernel; Intel Xeon platform (Pod 12, Node 4), NUMA node 0; Cloud Stress Noisy Neighbor; dedicated-core counts of 2, 3, 4 and 4.

8 Intro to Spirent CloudStress
Web-based infrastructure validation application.
Performance and capacity planning for Compute, Memory, Storage and Network I/O.
Dynamic workloads to validate NFV/Cloud infrastructure.

9 Intro to Spirent CloudStress
Figure: a Virtual Firewall running on the NFVi (Compute, Network, Storage).

10 Creating Virtual Machine Profiles
Figure: Spirent CloudStress running on the NFVi (Compute, Network, Storage).

11 Creating Virtual Machine Profiles
Figure: Spirent CloudStress instances running on the NFVi Under Test (Compute, Network, Storage).

12 Capacity Planning
Figure: NFVi Under Test (Compute, Network, Storage).

13 Cloud Stress as Noisy Neighbor
Assess the impact of resource contention on VNFs and/or NFV service chains.
Generate flap or negative events on a system to cause intentional disruption. This helps to understand the impact of noisy neighbors on the given system.
What is the impact of a CPU spike on one of the VMs on a fully loaded host?
If network load drops suddenly, and after a short time returns all at once, is full capacity immediately available?
On oversubscribed hosts, what is the effect of uneven loads on a small subset of VMs? <Needs more definition and clarity>
Figure: a Noisy Neighbor alongside VNFs (vRouter, vFW, vCPE) on the NFVi Under Test (Compute, Network, Storage), with VNF performance as the measured quantity.

14 Demo: Impact of Noisy Neighbor on VNF Under Test

15 Planning For Resources
Remote analysis of resource utilization and granular resource control are not optimal for latency-sensitive workloads.
Planning for your cache: LLC profiling; LLC considerations; Class of Service construction.

16 Class Of Service Construction
Considerations: capacity of the cache; whether you require DDIO; isolated vs. overlapping cache COS.
It is crucial to have a local agent on the host to control and enforce COS associations for latency-sensitive workloads.
Figure: Traffic flow from NIC to VMs (labels: Total LLC; 1: non-DDIO packet path).

17 Enabling Options
Platform Quality of Service (pqos) tool: user space tool requiring access to Intel MSRs; associates LLC on a per-core-ID basis.
Resctrl file system: extension of kernfs; associates using PID on a per-thread basis; kernel 4.10+.
Resource Management Daemon: newly open sourced; based on resctrl fs.
Figure: Kernel resctrl fs
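The resctrl option can be sketched in a few lines. The example below assumes the standard resctrl layout, in which creating a directory under the mount point creates a resource group, the schemata file takes a line such as L3:0=7f0, and the tasks file takes one task ID per write. The helper names schemata_line and create_group are hypothetical, and the root path is a parameter so the sketch can be tried against a scratch directory; on a real host it would be /sys/fs/resctrl and would need root privileges.

```python
import os


def schemata_line(masks_by_cache_id: dict) -> str:
    """Format an L3 schemata line such as 'L3:0=7f0;1=f'."""
    parts = [f"{cid}={mask:x}" for cid, mask in sorted(masks_by_cache_id.items())]
    return "L3:" + ";".join(parts)


def create_group(root: str, name: str, masks_by_cache_id: dict, pids: list) -> str:
    """Create a resource group under root, set its L3 masks, assign tasks."""
    group = os.path.join(root, name)
    os.makedirs(group, exist_ok=True)  # on resctrl, mkdir creates the group
    with open(os.path.join(group, "schemata"), "w") as f:
        f.write(schemata_line(masks_by_cache_id) + "\n")
    for pid in pids:
        # One task ID per write, matching how the tasks file is normally fed.
        with open(os.path.join(group, "tasks"), "a") as f:
            f.write(f"{pid}\n")
    return group
```

For example, create_group("/sys/fs/resctrl", "vnf", {0: 0x7F0}, [1234]) would, under these assumptions, restrict task 1234 to the upper seven ways of an 11-way LLC on cache domain 0.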

18 Resource Management Daemon
What is RMD:
A Linux daemon that runs on individual hosts, with pluggable interfaces to interact with orchestration, monitoring and enforcement layers.
Communicates across control and data plane using a REST API.
Receives resource policy from the orchestration layer and enforces it on the host.
Enforces resource allocation using kernel interfaces like resctrlfs or using libraries like libpqos.
Why RMD:
Complex usage (mask).
Real-time tuning.
Varying platforms (cache size, bandwidth, NUMA).
Fast-shifting workloads (local policy).
Uniform interface for RDT.
Simple API interface.
Hosted at

19 RMD Architecture
Open sourced on Nov 9th 2017.
Provides the construct of overlapped and isolated COSes.
Helps tune the LLC for optimal performance.
Simple to use with max_cache and min_cache constructs.
WIP sections.
Configuration and policy details:
osgroup: cache ways reserved for operating system usage.
infragroup: cache ways that will be shared with other workloads.
guarantee: allocate cache for workload, max_cache == min_cache > 0.
besteffort: allocate cache for workload, max_cache > min_cache > 0.
shared: allocate cache for workload, max_cache == min_cache == 0.
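The three workload policies differ only in how max_cache and min_cache relate, which can be expressed as a small classifier. This is a sketch of the rule as stated on the slide, not RMD's own code; the function name rmd_policy is hypothetical, and rejecting combinations the slide does not list is an assumption.

```python
def rmd_policy(max_cache: int, min_cache: int) -> str:
    """Map a (max_cache, min_cache) pair to the policy named on the slide."""
    if max_cache < 0 or min_cache < 0:
        raise ValueError("cache way counts must be non-negative")
    if max_cache == min_cache:
        # Equal and positive: fixed allocation. Equal and zero: shared pool.
        return "guarantee" if max_cache > 0 else "shared"
    if max_cache > min_cache > 0:
        return "besteffort"
    # Assumption: anything else (e.g. min above max) is treated as invalid.
    raise ValueError(f"unsupported combination: max={max_cache}, min={min_cache}")


print(rmd_policy(4, 4))  # guarantee
print(rmd_policy(4, 2))  # besteffort
print(rmd_policy(0, 0))  # shared
```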

20 Demo: Mitigation of Noisy Neighbor Impact with RMD

21 Permutations of Test Scenarios
Overlapping COS between: virtual switch and VMs; multiple VMs; OS and virtual switch.
Isolated COS between:
DDIO considerations: exclusive to VMs; exclusive to OS; shared across virtual switch & VMs.
Forced contentions: limited LLC to VM under test; limited LLC to virtual switch.
Figure: Permutations of COS association. Labels in the figure: Hypervisor; PHY PMDs, VM1, vswitchd, OS, VM2; COS0, COS1, COS2, COS3; Total LLC; Isolated LLC; Isolated OVS LLC; Overlapped OVS LLC; Overlapped LLC across VMs; DDIO options 1, 2 and 3; core IDs 11, 12, 6, 10, 4, 5, 2, 3, 0, 7-9, 13-23, 1.

22 In Summary… Call To Action…
In summary:
Noisy Neighbor effects are real and here to stay.
RMD provides a REST API for the control/orchestration/management layer to request LLC for its VMs/containers/applications.
Call to action:
Enable test cases for VSPERF with various combinations of cache associations.
Scale the test scenarios for your projects with RMD and/or Cloud Stress.

23 Questions?

24 References
Cloud Testing with Synthetic Workload Gen: Papers/Broadband/PAB/Cloud_testing_with_synthetic_workload_generators.pdf
Virtual Infrastructure Benchmarking: Papers/Broadband/PAB/Key_Considerations_for_Virtual_Infrastructure_Benchmarking_whitepaper.pdf
Intro to Intel RDT: intel%C2%AE-resource-director-technology
Intro to RMD:
Deterministic NFV w/ Intel RDT: n_with_Intel_Resource_Director_Technology.pdf

