Neutron Deployment at Scale Igor Bolotin, Cloud Architecture Vinay Bannai, SDN Architecture.

Presentation transcript:

About eBay Inc
eBay Inc. enables commerce by delivering flexible and scalable solutions that foster merchant growth.
With 145 million active buyers globally, eBay is one of the world's largest online marketplaces, where practically anyone can buy and sell practically anything.
With 148 million registered accounts in 193 markets and 26 currencies around the world, PayPal enables global commerce, processing almost 8 million payments every day.
eBay Enterprise is a leading provider of commerce technologies, omnichannel operations and marketing solutions. It serves 1000 retailers and brands.

Outline of the Presentation
Business case
Cloud at eBay Inc
Deployment Patterns
Problem Areas
How we addressed them
Future Direction
Summary
Q & A

What do our businesses need?
Agility
−Reduce time to market
−Enable innovation
Efficiency
−Elastic scale
−Reduce overall cost
Multi-Tenancy
Availability
Security & Compliance
Software Enabled Data Centers

Cloud at eBay Inc
[Diagram labels: eBay Inc Cloud; Global Orchestration; Region/DC; Identity & Image Management; AZ; Openstack Controllers; Nova Cells]

Deployment Patterns
Private Cloud for all eBay Inc properties
Global Orchestration with traffic and load balancing
Identity Management
−Region level (eventually global)
Image Management
−Region level
Nova/Cinder/Neutron
−Availability Zones
−Active/Active servers
Trove
Zabbix for monitoring
All services run behind a load balancer VIP

Deployment Patterns (contd.)
Shared Cloud between tenants
Different types of tenants
−eBay Production, PP production, StubHub, GSI Enterprise etc
−Dev/QA
−Sandbox environment, internal tenants (IT, VPCs)
Production Traffic
−All bridged and no overlays
−No DHCP
Dev/QA and some of the internal tenants
−Overlays
−DHCP

[Diagram labels: Gateway Nodes; Physical Racks]

Areas With Scale Issues
Hypervisor Scale Out
Overlay Networks
Bridged Networks
Neutron Services (DHCP, Metadata, API server)
SDN Controllers
Network Gateway Nodes
Upgrade

Hypervisor Scale Out
[Diagram labels: Nova API; Nova Cells; Nova Sched; Neutron API; DHCP Agent; SDN Contrl]

Hypervisor Scale Out
Several hundred hypervisors in a cell
Multiple cells in an AZ
Several thousand hypervisors in an AZ
Nova cells mitigate hypervisor scale
Neutron scaling
−Majority of the hypervisors support bridged VMs
−Hybrid mode with both overlay and bridged VMs

Bridged and Overlay
[Diagram labels: Network Virtualization Layer; L2; L3; VM; Tenant on Overlay Network; Tenant on Bridged Network]

Overlay Networks
Overlay technology
−VXLAN
−STT
Handling BUM traffic
−ARP
−Unknown unicast
−Multicast
Logical switches and routers
Distributed L3 routers
−Direct tunnels from hypervisor to hypervisor
Scale out deployment of Gateway nodes

Neutron Services
Keystone tokens
Single threaded Quantum/Neutron server
DHCP Servers
Healing Instance Info Caching Interval

Keystone Token Generation and Authentication
[Diagram labels: Client; Keystone Server; Nova Server; Neutron Server; Cinder Server; Image Server]
UUID based Token
–Needs to be authenticated by the keystone server for every call
PKI based Token
–Authenticated by the servers using Keystone certs
Token caching
–Prevents unnecessary token creation

Token Caching
Applies to UUID based tokens
−98% of tokens generated by inter-API services
−92% are quantum/neutron related
−Average of 25 to 30 tokens/sec created by quantum/neutron alone
−RPC call overhead, bloated token table
Fix
−Use token caching (1 hour)
−Use PKI for service tenant
−Reduces network chatter and improves performance
Openstack bugs
−Bug id :
−Bug id :
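The token caching fix can be illustrated with a minimal sketch: a client-side cache that reuses one token until a time-to-live expires, instead of asking Keystone for a new token on every inter-service call. The class name `TokenCache`, the `fetch_token` callable and the injectable `clock` are hypothetical names for illustration only; this is not the Keystone or Neutron client code, and the only value taken from the slide is the one hour lifetime.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Reuse a token until its TTL expires (illustrative, not Keystone code)."""

    def __init__(
        self,
        fetch_token: Callable[[], str],  # hypothetical: performs the real token request
        ttl_seconds: float = 3600.0,     # 1 hour, as on the slide
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token
        self._ttl = ttl_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0
        self.fetch_count = 0  # how many times the identity service was actually hit

    def get(self) -> str:
        """Return the cached token, fetching a new one only when it has expired."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch_token()
            self._expires_at = now + self._ttl
            self.fetch_count += 1
        return self._token
```

With a cache like this, the number of token creations depends on the TTL rather than on the number of API calls, which is the effect the slide attributes to caching.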

Neutron Multi Worker
Prior to Havana, one API thread handled both the REST calls and the RPC calls
Broke up the API into two threads
−One handles REST API calls
−The other handles RPC calls
Havana fixes
−DHCP renewals not handled by neutron servers, instead dhcp_release
−Multi-worker support

Heal Instance Info Cache Interval
All nova computes regularly poll the neutron server to get network info of the instances running on the compute node
Default is 10 seconds
Hundreds of hypervisors and tens of thousands of VMs add up
−Even though only one instance is checked in each interval
We adjusted the interval to 600 seconds
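A rough back-of-the-envelope calculation shows why the interval matters. The sketch below assumes each compute node issues one poll per interval; the figure of 3000 compute nodes is a hypothetical value chosen to match "several thousand hypervisors", not a number from the talk.

```python
def polls_per_second(num_compute_nodes: int, interval_seconds: float) -> float:
    """Aggregate poll rate on the Neutron server, assuming one poll per node per interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return num_compute_nodes / interval_seconds


if __name__ == "__main__":
    nodes = 3000  # hypothetical fleet size, not from the slides
    for interval in (10, 600):
        rate = polls_per_second(nodes, interval)
        print(f"{interval:>4}s interval: {rate:.1f} polls/sec")
```

Under these assumptions, moving from 10 seconds to 600 seconds cuts the steady-state poll rate by a factor of 60.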

DHCP Scaling
The most common source of problems
We employ multiple strategies
−DHCP active/standby
−Planning to support DHCP active/active
−No DHCP/Config Drive option
Production Environment
−No DHCP
−Config drive management
−Requires “cloudinit” aware images

SDN Controllers
[Diagram labels: SDN Controller; Neutron API; Nova API; OS Ctrl]

Network Gateway Nodes
Only with overlay networks
Scale out architecture
Problems with high CPU utilization
Number of flows in the gateway node
East – West traffic also hitting VIPs
−Load Balancer running as an appliance on a hypervisor
−Using SNAT
OVS Enhancements
−Use megaflows in openvswitch
−Multi-core version of ovs-vswitchd

OVS Improvements: Megaflows
Prior to OVS 1.11
Megaflow introduces wildcarding in kernel module
Fewer misses and punts to user space
Reduced number of flows in kernel
Requires OVS 1.11 or greater
Cons
−Using security groups nullifies the effects of megaflows
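The effect of wildcarding can be sketched with a toy model that counts cache misses (punts to user space) for a stream of packets. This is not how Open vSwitch is implemented; `Packet`, `count_upcalls` and the choice of header fields are hypothetical. It also assumes, as one plausible reading of the "Cons" bullet, that security group rules force the cache to match on L4 ports again.

```python
from typing import Iterable, NamedTuple, Sequence, Set, Tuple


class Packet(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def count_upcalls(packets: Iterable[Packet], key_fields: Sequence[str]) -> int:
    """Count cache misses when the flow cache matches only on key_fields.

    Fields left out of key_fields are treated as wildcarded.
    """
    seen: Set[Tuple] = set()
    upcalls = 0
    for pkt in packets:
        key = tuple(getattr(pkt, field) for field in key_fields)
        if key not in seen:
            seen.add(key)  # install a cache entry for this (masked) key
            upcalls += 1   # first packet of an unseen key goes to user space
    return upcalls


if __name__ == "__main__":
    # 100 connections from one client to one server, each from a new source port.
    traffic = [Packet("10.0.0.1", "10.0.1.5", 40000 + i, 80) for i in range(100)]
    print("exact match:       ", count_upcalls(traffic, Packet._fields))
    print("wildcard on dst_ip:", count_upcalls(traffic, ("dst_ip",)))
    print("ports in the key:  ", count_upcalls(traffic, ("dst_ip", "src_port", "dst_port")))
```

In this toy model an exact-match cache misses once per connection, a cache keyed only on the destination address misses once in total, and adding ports back into the key removes the benefit.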

OVS Improvements: Multi-Core vswitchd
[Diagram labels: User Space; Kernel Space; cpu cores; vswitchd; Kernel module; OVS < 2.0; OVS >=]

Future Work
VPC model
Neutron Tagging Blueprint
−Network Assignment for Bridged VMs
−Network Selection for VPC tenants
−Network Scheduling
−Additional meta data information
Blueprint −

Network Assignment in Bridged VMs
There are two primary ways to plumb a VM into a network
−Pass the net-id to nova boot
−Create a port and pass it to nova boot
Nova schedules the instance without much knowledge about the underlying network
BP proposes to address this issue as one of the use cases
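The two ways of attaching a VM can be sketched as the `nova boot` command lines they correspond to. The helper below only builds the argument list and does not run anything; the function name `nova_boot_args` and all UUIDs, flavor and image names are placeholders, not values from the talk.

```python
from typing import List, Optional


def nova_boot_args(
    name: str,
    flavor: str,
    image: str,
    net_id: Optional[str] = None,
    port_id: Optional[str] = None,
) -> List[str]:
    """Build a `nova boot` argv that attaches the VM by network or by pre-created port."""
    if (net_id is None) == (port_id is None):
        raise ValueError("pass exactly one of net_id or port_id")
    # Option 1: let nova create the port on the given network.
    # Option 2: hand nova a port that was already created in Neutron.
    nic = f"net-id={net_id}" if net_id is not None else f"port-id={port_id}"
    return ["nova", "boot", "--flavor", flavor, "--image", image, "--nic", nic, name]


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    print(" ".join(nova_boot_args("vm1", "m1.small", "my-image", net_id="NET_UUID")))
    print(" ".join(nova_boot_args("vm2", "m1.small", "my-image", port_id="PORT_UUID")))
```

In both cases the network is fixed by the caller before scheduling, which is the gap the slide points to: the scheduler picks a host without knowing whether that network is reachable from it.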

Network Tagging
[Diagram labels: Rack 1 / N1; Rack 2 / N2; Rack 3 / N3; Rack 4 / N4; FZ1; FZ2; FZ3; FZ4; VM]

Summary
Know your requirements
Understand your size and scale
Pick the SDN controller based on your needs
Design with multiple failure domains
Overlay, Bridged or Hybrid
Monitor your cloud for performance degradation

Thank you. Yes. We are hiring!