Using OpenStack In A Traditional Hosting Environment Jun Park (Ph.D.), Sr. Systems Architect Mike Wilson, Sr. Systems Architect EIG/Bluehost OpenStack.

Slides:



Advertisements
Similar presentations
Modular Layer 2 In OpenStack Neutron
Advertisements

All Rights Reserved © Alcatel-Lucent 2009 Enhancing Dynamic Cloud-based Services using Network Virtualization F. Hao, T.V. Lakshman, Sarit Mukherjee, H.
© 2012 IBM Corporation Architecture of Quantum Folsom Release Yong Sheng Gong ( 龚永生 ) gongysh #openstack-dev Quantum Core developer.
Bringing Together Linux-based Switches and Neutron
CloudStack Scalability Testing, Development, Results, and Futures Anthony Xu Apache CloudStack contributor.
DOT – Distributed OpenFlow Testbed
Seamless migration from Nova-network to Neutron in eBay production Chengyuan Li, Han Zhou.
What’s New: Windows Server 2012 R2 Tim Vander Kooi Systems Architect
It’s the App, Stupid! Orchestration, Automation, Scaling & What’s in Between Yaron Parasol, Uri
OpenStack Open Source Cloud Software. OpenStack: The Mission "To produce the ubiquitous Open Source cloud computing platform that will meet the needs.
Application Centric Infrastructure
10/04/12 Under the Hood: Network Virtualization with OpenStack Neutron and VMware NSX Somik Behera – NSX Product Manager Dimitri Desmidt - NSX Senior Technical.
1 Security on OpenStack 11/7/2013 Brian Chong – Global Technology Strategist.
SDN in Openstack - A real-life implementation Leo Wong.
Apache CloudStack Evolution Proposal Alex Huang Software Architect, Citrix Systems.
7th OpenSTACK USER group nordics
Copyright 2009 FUJITSU TECHNOLOGY SOLUTIONS PRIMERGY Servers and Windows Server® 2008 R2 Benefit from an efficient, high performance and flexible platform.
SDN Controller Requirement draft-gu-sdnrg-sdn-controller-requirement-00 Rong Gu (Presenter) Chen Li China Mobile.
Data Center Virtualization: Open vSwitch Hakim Weatherspoon Assistant Professor, Dept of Computer Science CS 5413: High Performance Systems and Networking.
Class 3: SDN Stack Theophilus Benson. Outline Background – Routing in ISP – Cloud Computing SDN application stack revisited Evolution of SDN – The end.
Microsoft Virtual Academy Module 4 Creating and Configuring Virtual Machine Networks.
WORKFLOWS IN CLOUD COMPUTING. CLOUD COMPUTING  Delivering applications or services in on-demand environment  Hundreds of thousands of users / applications.
Microsoft Load Balancing and Clustering. Outline Introduction Load balancing Clustering.
Getting Started with Oracle Compute Cloud
Additional SugarCRM details for complete, functional, and portable deployment.
System Center 2012 Setup The components of system center App Controller Data Protection Manager Operations Manager Orchestrator Service.
Data Center Network Redesign using SDN
Opensource for Cloud Deployments – Risk – Reward – Reality
Barracuda Load Balancer Server Availability and Scalability.
Cloud Computing Saneel Bidaye uni-slb2181. What is Cloud Computing? Cloud Computing refers to both the applications delivered as services over the Internet.
 Cloud computing  Workflow  Workflow lifecycle  Workflow design  Workflow tools : xcp, eucalyptus, open nebula.
Common Devices Used In Computer Networks
M.A.Doman Short video intro Model for enabling the delivery of computing as a SERVICE.
Cloud Computing & Amazon Web Services – EC2 Arpita Patel Software Engineer.
Presented by: Sanketh Beerabbi University of Central Florida COP Cloud Computing.
CloudNaaS: A Cloud Networking Platform for Enterprise Applications Theophilus Benson*, Aditya Akella*, Anees Shaikh +, Sambit Sahu + (*University of Wisconsin,
CON Software-Defined Networking in a Hybrid, Open Data Center Krishna Srinivasan Senior Principal Product Strategy Manager Oracle Virtual Networking.
70-293: MCSE Guide to Planning a Microsoft Windows Server 2003 Network, Enhanced Chapter 12: Planning and Implementing Server Availability and Scalability.
VMware vSphere Configuration and Management v6
EXPOSING OVS STATISTICS FOR Q UANTUM USERS Tomer Shani Advanced Topics in Storage Systems Spring 2013.
Extending OVN Forwarding Pipeline Topology-based Service Injection
SOFTWARE DEFINED NETWORKING/OPENFLOW: A PATH TO PROGRAMMABLE NETWORKS April 23, 2012 © Brocade Communications Systems, Inc.
How Adobe Built An OpenStack Cloud
20409A 7: Installing and Configuring System Center 2012 R2 Virtual Machine Manager Module 7 Installing and Configuring System Center 2012 R2 Virtual.
Arne Wiebalck -- VM Performance: I/O
How Adobe Has Built An OpenStack Cloud
Cloud Computing – UNIT - II. VIRTUALIZATION Virtualization Hiding the reality The mantra of smart computing is to intelligently hide the reality Binary->
Co-ordination & Harmonisation of Advanced e-Infrastructures for Research and Education Data Sharing Grant.
@projectcalico Sponsored by Simple, Secure, Scalable networking for the virtualized datacentre UKNOF 33 Ed 19 th January 2016.
Brian Lauge Pedersen Senior DataCenter Technology Specialist Microsoft Danmark.
Atrium Router Project Proposal Subhas Mondal, Manoj Nair, Subhash Singh.
Network Topology Computer network topology is the way various components of a network (like nodes, links, peripherals, etc) are arranged. Network topologies.
CON8473 – Oracle Distribution of OpenStack Ronen Kofman Director of Product Management Oracle OpenStack September, 2014 Copyright © 2014, Oracle and/or.
Communication Needs in Agile Computing Environments Michael Ernst, BNL ATLAS Distributed Computing Technical Interchange Meeting University of Tokyo May.
Shaopeng, Ho Architect of Chinac Group
IT Services Katarzyna Dziedziniewicz-Wojcik IT-DB.
Heitor Moraes, Marcos Vieira, Italo Cunha, Dorgival Guedes
StratusLab Final Periodic Review
StratusLab Final Periodic Review
Red Hat User Group June 2014 Marco Berube, Cloud Solutions Architect
OpenStack Ani Bicaku 18/04/ © (SG)² Konsortium.
New Solutions For Scaling The Internet Address Space
Software Defined Networking (SDN)
Network Virtualization
20409A 7: Installing and Configuring System Center 2012 R2 Virtual Machine Manager Module 7 Installing and Configuring System Center 2012 R2 Virtual.
HC Hyper-V Module GUI Portal VPS Templates Web Console
Neutron at Scale Justin Hammond - Developer
Cloud computing mechanisms
OpenStack Summit Berlin – November 14, 2018
Using OpenDaylight in Hybrid Cloud: issues or challenges
Presentation transcript:

Using OpenStack In A Traditional Hosting Environment Jun Park (Ph.D.), Sr. Systems Architect Mike Wilson, Sr. Systems Architect EIG/Bluehost OpenStack Summit 2013

EIG/Bluehost 2 what we were in about a year’s time Needed a system to manage it all Provisioning these systems had to be automated System had to scale to multiple data centers Btw, we have 2 months to come up with this

High level requirements Centralized management, horizontal scalability. Abstractions for physical and logical deployments of devices and resources. Open-source project with lots of support and momentum Typical Cloud features (migration, imaging, provisioning, etc.) Easy transition into future cloud offering with support for industry standard APIs = OpenStack! 3 EIG/Bluehost

Bluehost Environment Scale – More than 10,000 physical servers – Adding hundreds of nodes per day Network – Public network directly attached to VMs – Plan on adding private network later OpenStack/Folsom Components – Nova (compute, api, scheduler) – Quantum (network) – AMQP: Qpid (messaging) – Mysql (db) 4 EIG/Bluehost

Outline Scalability/Stability Rethink Quantum Network Operational Issues Wrap-up/Conclusions 5 EIG/Bluehost

Why Difficult To Scale up? Components that don’t scale – Messaging system – Mysql server – Heavy APIs Hard to Diagnose – No simulator/emulator for high scalability testing – No quantitative guide as to how to scale up – No detailed error message or even not at all – Requires detailed knowledge of codebase 6 EIG/Bluehost Look at that line!

Nova Enhancements Monitoring/troubleshooting – Added service ping (similar to grizzly) – Additional task_states – Better errors to instance_faults Functionality – Added LVM resize and injection – Added stop_soft and stop_hard similar to reboot Stability/Scalability – Some bug fixes to OVS VIF driver – Added custom scheduler 7 EIG/Bluehost

MySQL/Innodb Concurrency Nova behavior – Direct connection to DB (addressed in grizzly though) – Too MANY queries; much more read than write – Some queries often return huge number of results – Periodicity of loads due to periodic tasks Mysql behavior – innodb_thread_concurrency = num of threads that can execute queries in the queue – innodb_concurrency_tickets = number of rows you can return before requeue (default 500!) – With default settings your mysql server will fall over with a few thousand compute nodes EIG/Bluehost 8

9 So the answer is just increase concurrency and tickets right? Sadly, no. Tuning concurrency and tickets is a must! But we still requeue. We don’t know why, but we have a workaround. Our workaround: Send most read queries to a cluster of mysql slaves. I say most because many queries would be sensitive to possible slave replication lag

Messaging System Qpid Chosen – Speed and easy cluster ability – Incredibly unstable at large scale (Even at small scale) – Possibly poorly configured – Older release in CentOS6 Problem – Broker bottleneck – Unnecessarily complex Recommendation – AVOID BROKER! – ZeroMQ as replacement EIG/Bluehost 10

Server Farm > 10,000 physical servers Server Farm > 10,000 physical servers BlueHost OpenStack Group of Controllers Nova-api Nova-scheduler Quantum-server Keystone Group of Controllers Nova-api Nova-scheduler Quantum-server Keystone Read-only mysql slave Read-only mysql slaves mysql master Qpid servers Load Balanced EIG/Bluehost Host BH-enhanced Nova- compute BH-enhanced OVS Quantum Plugin

Rethink Quantum Network Problem – Quantum API only for DB; no efficient API for Nova – No API-around design for actual network objects – Premature OpenvSwitch (OVS) quantum plugin Our approach – Adding intelligence to OVS plugin – Optimizing API E.g., get_instance_nw_info() in nova; Python-level join operation – Expand current implicit communication with nova- compute (although not ideal) 12 EIG/Bluehost

OpenvSwitch (OVS) Support – OpenFlow 1.0 (1.2, 1.3 in experiment) – Various Hypervisors including KVM – Merged into main stream since Linux 3.3 Functionality – Filtering rules and associated actions E.g., anti-IP spoofing, DMAC filtering – Replacement or superset of ebtables/iptables/linuxbridge – QoS or Linux traffic control (TC) – Various third party controllers for full-fledged OF functionality 13 EIG/Bluehost

Controller Host Quantum With Nova-Compute Quantum-server Quantum-agent Nova-compute Quantum DB Nova-api server 2. Call quantum-api to create port 4. Create OVS tap 5. Set up external-ids on tap 1. Monitor any change of external- ids of the existing taps 1. Nova boot API 3. Modify external-ids on tap & setup OVS flows 2. Call quantum-api to get port info 3. Create port in DB Why nova-compute still create tap? Why nova-compute still create tap? 14 EIG/Bluehost 6. Do the rest of steps for creating a new VM Implicit communication with nova-compute Implicit communication with nova-compute

BH-Enhanced OVS Quantum Plugin Idiosyncratic Requirements – Focus on a Zone; plan on addressing multiple-datacenters issue – Direct public IP using flat shared provider network with No NAT – Still required to be isolated while allowing intra-traffic among VMs Our Approach – Developing “Distributed OpenFlow controller” using OVS plugin – Either no-tag or 24bit ID (e.g., QinQ, VxLAN, GRE), but NO VLAN – Caveats: Neither API-around approach nor Virtual Appliance New Features – Anti-IP/ARP spoofing OF flows – Multiple-IPs per public port – Network bandwidth throttling (QoS) – Optimal Intra-traffic flows 15 EIG/Bluehost

BH-Enhanced OVS Quantum Plugin br-int tap1 tap2 br-int-eth0 br-eth0 phy-br-eth0 eth0 pair of veth DMAC matching for incoming packets TPA matching in ARP query Anti-IP spoofing: SRC IP matching for outgoing packets VM1 VM2 Public Networks Deploy QoS for outgoing packets 10 Mbps 50 Mbps Int-loopback phy-loopback pair of veth DMAC matching for internal traffic to VMs 16 EIG/Bluehost Tag=1 (MTU size bug) Tag=2

Operational Issues Reboot hosts – Problem Circular dependencies between libvirtd and nova-compute OVS bug, allows to add non-existing tap interfaces – Solution A simple workaround to restart services in rc.local Restart services – Problem Quantum recreates taps, resulting in network hiccups – Solution Check out existing taps, no touch when unnecessary 17 EIG/Bluehost

Operational Issues Monitor Health – Problem Hard to track down – Solution Adding health check APIs XML customization – Problem No way to modify XML on the fly Not even allowed for static customization – Solution Add more parameters 18 EIG/Bluehost

Wrap-up/Conclusions EIG/Bluehost 19

Problem vs. Solution EIG/Bluehost 20 Problem Solution Monitoring/troubleshootin g Service ping, task_states, etc. Nova No LVM Overloaded scheduler Add LVM driver Custom scheduler OVS VIF driver issue Mysql Fix bugs Overloaded Read-only Mysql slave server Innodb issue Optimized confs Qpid Instability issue Clustered qpid Future plan No broker, ZeroMQ Quantum Heavy API Optimized API Premature OVS plugin Add features (Anti-IP, No-tag flows, QoS) Operations Reboot, restart of services Simple workarounds XML XML customization (cpu custom)

For True OpenStack Success Scalability  Scalable Messaging System & DB Infra Networks  Good Networks Abstraction So … We don’t do live demo until … 21 EIG/Bluehost

We’re sorry, we haven’t had time to contribute a whole lot YET. But here’s some code: for BH nova for BH Quantum (as in live production) Browse the branches, many have to do with features discussed today 22 EIG/Bluehost