
Ananta: Cloud Scale Load Balancing
Presenter: Donghwi Kim

Background: Datacenter
- Each server has a hypervisor and VMs; each VM is assigned a Direct IP (DIP).
- Each service has zero or more external end-points and is assigned one Virtual IP (VIP).
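A purely illustrative Python sketch of the VIP/DIP relationship above; the names (VipConfig, services) and addresses are assumptions, not Ananta's actual configuration schema:

```python
# Illustrative data model: one VIP per service, many DIPs (one per VM) behind it.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class VipConfig:
    vip: str                                            # one Virtual IP per service
    endpoints: List[str] = field(default_factory=list)  # external end-points, e.g. "tcp:80"
    dips: List[str] = field(default_factory=list)       # Direct IPs of the service's VMs

services: Dict[str, VipConfig] = {
    "100.1.1.1": VipConfig(vip="100.1.1.1",
                           endpoints=["tcp:80"],
                           dips=["10.0.0.1", "10.0.0.2", "10.0.0.3"]),
}
print(services["100.1.1.1"].dips)
```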

Background: Datacenter
- Each datacenter hosts many services.
- A service may work with another service in the same datacenter, a service in another datacenter, or a client over the Internet.

Background: Load balancer
- The entry point of a server pool.
- Distributes workload across worker servers.
- Hides the server pool from clients using network address translation (NAT).

Inbound VIP Communication
- The load balancer performs destination address translation (DNAT).
- (Diagram: a packet from the Internet arrives as src: Client, dst: VIP; the LB rewrites the destination to DIP1, DIP2, or DIP3 and delivers it to a front-end VM.)
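Below is a minimal Python sketch of the inbound DNAT rewrite the diagram depicts, with packets modeled as (src, dst, payload) tuples; the addresses and the hash-based choice stand in for the load balancer's real selection policy and are not Ananta's code:

```python
import hashlib

# Minimal DNAT sketch: rewrite the VIP destination to one of the DIPs.
VIP = "100.1.1.1"
DIPS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

def dnat_inbound(packet):
    """Rewrite the VIP destination of an inbound packet to one of the DIPs."""
    src, dst, payload = packet
    if dst == VIP:
        h = int(hashlib.sha256(src.encode()).hexdigest(), 16)
        dst = DIPS[h % len(DIPS)]        # same client keeps hitting the same DIP
    return (src, dst, payload)

# A packet from a client addressed to the VIP is redirected to a front-end VM's DIP.
print(dnat_inbound(("203.0.113.7", VIP, b"GET /")))
```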

Outbound VIP Communication
- The load balancer performs source address translation (SNAT).
- (Diagram: a back-end VM of Service 1 (DIP2) sends src: DIP2, dst: VIP2; the LB rewrites the source to VIP1 before the packet reaches Service 2.)

State of the Art
- A load balancer is a hardware device.
- Expensive, slow failover, no scalability.

Cloud Requirements
Scale:
- Requirement: ~40 Tbps throughput using 400 servers. State of the art: 20 Gbps for $80,000.
- Requirement: 100 Gbps for a single VIP. State of the art: up to 20 Gbps per VIP.
Reliability:
- Requirement: N+1 redundancy. State of the art: 1+1 redundancy or slow failover.
- Requirement: quick failover.

Cloud Requirements
Any service anywhere:
- Requirement: servers and LB/NAT can be placed across L2 boundaries. State of the art: NAT is supported only within the same L2 domain.
Tenant isolation:
- Requirement: an overloaded or abusive tenant cannot affect other tenants. State of the art: excessive SNAT from one tenant causes a complete outage.

Ananta

SDN
- SDN: managing a flexible data plane via a centralized control plane.
- (Diagram: a controller in the control plane programs switches in the data plane.)

Breaking down the load balancer's functionality
- Control plane: VIP configuration, monitoring.
- Data plane: destination/source selection, address translation.

Design
- Ananta Manager: source selection; not scalable (like an SDN controller).
- Multiplexer (Mux): destination selection.
- Host Agent: address translation; resides in each server's hypervisor.

Data plane
- 1st tier (Router): packet-level load spreading via ECMP.
- 2nd tier (Multiplexer): connection-level load spreading, destination selection.
- 3rd tier (Host Agent): stateful NAT.
- (Diagram: routers spread packets destined for VIP1/VIP2 across the Muxes; each Mux selects a DIP; the Host Agent in each server's VM switch translates dst: VIP to dst: DIP for the local VM.)
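A rough Python sketch of the 2nd-tier behavior described above: the Mux hashes the connection's 5-tuple to choose a DIP and remembers the choice per flow so the connection sticks to one server. The class name, hash, and addresses are assumptions for illustration, not Ananta's implementation:

```python
import hashlib

# Connection-level spreading: hash the 5-tuple to pick a DIP, remember it per flow.
class Mux:
    def __init__(self, vip_to_dips):
        self.vip_to_dips = vip_to_dips        # {vip: [dip, ...]}
        self.flow_table = {}                  # {5-tuple: dip}

    def select_dip(self, five_tuple):
        src_ip, src_port, dst_ip, dst_port, proto = five_tuple
        if five_tuple not in self.flow_table:
            dips = self.vip_to_dips[dst_ip]
            h = int(hashlib.sha256(repr(five_tuple).encode()).hexdigest(), 16)
            self.flow_table[five_tuple] = dips[h % len(dips)]
        return self.flow_table[five_tuple]    # the packet is then tunneled to this DIP

mux = Mux({"100.1.1.1": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]})
print(mux.select_dip(("203.0.113.7", 51000, "100.1.1.1", 80, "tcp")))
```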

Inbound connections
- (Diagram: the client sends s: CLI, d: VIP; a router forwards it to a Mux; the Mux encapsulates the packet as s: MUX, d: DIP; the Host Agent decapsulates it and delivers s: CLI, d: DIP to the VM. The VM's reply s: DIP, d: CLI is rewritten by the Host Agent to s: VIP, d: CLI and routed back to the client without passing through the Mux.)

Outbound (SNAT) connections
- (Diagram: a VM sends s: DIP:555, d: SVR:80; the Host Agent asks Ananta Manager for a port ("Port??"), which maps VIP:777 to the DIP; the packet leaves as s: VIP:777, d: SVR:80. The return packet s: SVR:80, d: VIP:777 reaches a Mux, is encapsulated as s: MUX, d: DIP:555, and the Host Agent restores it to s: SVR:80, d: DIP:555 for the VM.)
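A simplified Python sketch of the outbound SNAT exchange in the diagram: the host agent obtains a VIP port from the manager, rewrites the outgoing source, and reverses the mapping for return traffic. All names, ports, and the allocation policy are illustrative assumptions:

```python
# Simplified model of the outbound SNAT exchange; not Ananta's actual code.
class AnantaManagerStub:
    def __init__(self, vip, first_port=1024):
        self.vip, self.next_port = vip, first_port

    def allocate_port(self, dip):
        """Allocate a (VIP, port) pair that will map back to this DIP."""
        port = self.next_port
        self.next_port += 1
        return self.vip, port

class HostAgentSnat:
    def __init__(self, manager):
        self.manager = manager
        self.snat_map = {}                    # (vip, vip_port) -> (dip, dip_port)

    def outbound(self, dip, dip_port, dst):
        """Rewrite an outgoing packet's source DIP:port to VIP:port."""
        vip, vip_port = self.manager.allocate_port(dip)
        self.snat_map[(vip, vip_port)] = (dip, dip_port)
        return (vip, vip_port), dst           # packet leaves as src VIP:port

    def inbound(self, src, vip, vip_port):
        """Restore the original DIP:port on a returning packet."""
        return src, self.snat_map[(vip, vip_port)]

agent = HostAgentSnat(AnantaManagerStub("100.1.1.1"))
print(agent.outbound("10.0.0.2", 555, ("198.51.100.9", 80)))
print(agent.inbound(("198.51.100.9", 80), "100.1.1.1", 1024))
```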

Reducing the Load on Ananta Manager
Optimizations:
- Batching: allocate 8 ports at a time instead of one.
- Pre-allocation: 160 ports per VM.
- Demand prediction: consider recent request history.
Result: less than 1% of outbound connections ever reach Ananta Manager, and SNAT request latency is reduced.
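A small Python sketch of the batching optimization above: the host agent requests ports in batches of 8 and serves later connections from its local cache, so only a small fraction of connections reach the manager. The class names and port range are assumptions for illustration:

```python
# Batching sketch: ask the manager for 8 ports at a time, serve the rest locally.
BATCH_SIZE = 8

class PortAllocator:                          # stands in for Ananta Manager
    def __init__(self, first_port=1024):
        self.next_port = first_port
        self.calls = 0                        # how often the manager is actually hit

    def allocate_batch(self, n=BATCH_SIZE):
        self.calls += 1
        ports = list(range(self.next_port, self.next_port + n))
        self.next_port += n
        return ports

class HostAgentPortCache:
    def __init__(self, manager):
        self.manager = manager
        self.free_ports = []

    def get_port(self):
        if not self.free_ports:               # only now does a request reach the manager
            self.free_ports = self.manager.allocate_batch()
        return self.free_ports.pop()

manager = PortAllocator()
agent = HostAgentPortCache(manager)
ports = [agent.get_port() for _ in range(20)]  # 20 outbound connections
print(manager.calls)                           # -> 3 manager requests instead of 20
```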

VIP traffic in a datacenter
- A large portion of the traffic that passes through the load balancer is intra-DC.

Step 1: Forward Traffic
- (These four steps describe Fastpath, which lets intra-DC VIP traffic bypass the Muxes after connection setup.)
- (Diagram: a VM with DIP1, behind VIP1, sends data packets destined for VIP2; they reach MUX2, which forwards them to DIP2.)

Step 2: Return Traffic
- (Diagram: DIP2's return packets, destined for VIP1, reach MUX1, which forwards them to DIP1.)

Step 3: Redirect Messages
- (Diagram: the Muxes send redirect messages toward the host agents of DIP1 and DIP2, telling each side which DIP the other end's VIP resolves to for this connection.)

Step 4: Direct Connection
- (Diagram: subsequent data packets flow directly between DIP1 and DIP2, bypassing the Muxes.)
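A simplified Python sketch of how a host agent might act on the redirect messages from steps 3 and 4: once it learns which DIP the remote VIP resolves to for a connection, later packets are sent directly to that DIP instead of via a Mux. The names and tuples are illustrative, not Ananta's code:

```python
# Fastpath sketch: before a redirect, send toward the VIP (via a Mux);
# after a redirect, send straight to the learned DIP.
class FastpathHostAgent:
    def __init__(self):
        self.redirects = {}                         # (flow, vip) -> dip

    def on_redirect(self, flow, vip, dip):
        """Install the VIP -> DIP mapping announced for this connection."""
        self.redirects[(flow, vip)] = dip

    def next_hop(self, flow, dst_vip):
        """Return where to send this connection's packets next."""
        return self.redirects.get((flow, dst_vip), dst_vip)

agent = FastpathHostAgent()
flow = ("10.0.0.1", 40000, "100.1.2.2", 80, "tcp")   # DIP1's connection to VIP2
print(agent.next_hop(flow, "100.1.2.2"))             # "100.1.2.2": still via a Mux
agent.on_redirect(flow, "100.1.2.2", "10.0.1.5")     # redirect: VIP2 -> DIP2
print(agent.next_hop(flow, "100.1.2.2"))             # "10.0.1.5": direct to DIP2
```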

SNAT Fairness
- Ananta Manager is not scalable; SNAT resources are shared fairly, with a tenant's share growing with its number of VMs (more VMs, more resources).
- (Diagram: pending SNAT requests are queued per DIP, with at most one outstanding request per DIP; per-VIP queues feed a global SNAT processing queue that dequeues round-robin across VIPs and is processed by a thread pool.)
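A rough Python sketch of the queuing structure above: per-VIP queues drained round-robin, with at most one pending SNAT request per DIP, so one tenant cannot monopolize Ananta Manager. Class names and details are assumptions for illustration only:

```python
from collections import deque

# SNAT fairness sketch: per-VIP queues, round-robin dequeue, one pending request per DIP.
class SnatScheduler:
    def __init__(self):
        self.vip_queues = {}        # vip -> deque of (dip, request)
        self.pending_dips = set()   # DIPs with an outstanding request

    def enqueue(self, vip, dip, request):
        if dip in self.pending_dips:
            return False            # at most one pending SNAT request per DIP
        self.pending_dips.add(dip)
        self.vip_queues.setdefault(vip, deque()).append((dip, request))
        return True

    def dequeue_round_robin(self):
        """Yield requests one per VIP in turn until all queues are empty."""
        while any(self.vip_queues.values()):
            for vip, q in list(self.vip_queues.items()):
                if q:
                    dip, request = q.popleft()
                    self.pending_dips.discard(dip)
                    yield vip, dip, request

sched = SnatScheduler()
for i in range(3):
    sched.enqueue("VIP1", f"10.0.0.{i}", f"req{i}")
sched.enqueue("VIP2", "10.0.1.1", "reqA")
print(list(sched.dequeue_round_robin()))   # VIP1 and VIP2 are served alternately
```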

Packet Rate Fairness
- Each Mux keeps track of its top talkers (the VIPs with the highest packet rates).
- When packet drops occur, Ananta Manager withdraws the topmost top talker from all Muxes.
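A minimal Python sketch of the top-talker accounting described above: each Mux counts packets per VIP, and on overload the manager would withdraw the VIP with the highest rate. Time windows and the real counting mechanism are omitted; the names are illustrative:

```python
from collections import Counter

# Top-talker sketch: count packets per VIP and report the heaviest one.
class MuxCounters:
    def __init__(self):
        self.packets_per_vip = Counter()

    def on_packet(self, vip):
        self.packets_per_vip[vip] += 1

    def top_talker(self):
        vip, _count = self.packets_per_vip.most_common(1)[0]
        return vip

mux = MuxCounters()
for vip in ["VIP1"] * 900 + ["VIP2"] * 100:
    mux.on_packet(vip)
# On packet drops, Ananta Manager would withdraw this VIP from every Mux:
print(mux.top_talker())   # -> "VIP1"
```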

Reliability
- When Ananta Manager fails: Paxos provides fault tolerance through replication (typically 5 replicas).
- When a Mux fails: first-tier routers detect the failure via BGP and stop sending traffic to that Mux.

Evaluation

Impact of Fastpath
Experiment:
- One 20-VM tenant acts as the server.
- Two 10-VM tenants act as clients.
- Each VM sets up 10 connections and uploads 1 MB of data.

Ananta Manager's SNAT latency
- Ananta Manager's port allocation latency over a 24-hour observation period.

SNAT Fairness
- Normal users (N) make 150 outbound connections per minute.
- A heavy user (H) keeps increasing its outbound connection rate.
- Measured: SYN retransmits and SNAT latency.
- Result: normal users are not affected by the heavy user.

Overall Availability
- Average availability over a month: 99.95%.

Summary: how Ananta meets the cloud requirements
- Scale: the Mux relies on ECMP; Host Agents scale out naturally.
- Reliability: Ananta Manager uses Paxos; Mux failures are handled via BGP.
- Any service anywhere: Ananta operates at layer 4 (the transport layer).
- Tenant isolation: SNAT fairness and packet rate fairness.

Discussion
- Ananta may lose some connections when it recovers from a Mux failure, because there is no way to copy a Mux's internal state.
- (Diagram: the failed Mux held a 5-tuple-to-DIP table for existing TCP flows; when the first-tier router shifts those flows to a new Mux, the new Mux's table is empty, so the mappings are unknown.)

Discussion
- Detection of a Mux failure takes up to 30 seconds (the BGP hold timer); why not add dedicated health monitoring?
- Fastpath does not preserve the order of packets.
- Passing through a software component (the Mux) may increase connection-establishment latency,* and Fastpath does not relieve this.
- The scale of the evaluation is small (e.g., bandwidth of 2.5 Gbps, not Tbps); another paper argues that Ananta would need about 8,000 Muxes to cover a mid-size datacenter.*

*Duet: Cloud Scale Load Balancing with Hardware and Software, SIGCOMM '14

Thanks! Any questions?

Lessons learnt
- Centralized controllers work: despite significant challenges in per-flow processing (e.g., SNAT), they provide overall higher reliability and an easier-to-manage system.
- Co-location of the control plane and data plane provides faster local recovery; fate sharing eliminates the need for a separate, highly available management channel.
- Protocol semantics are violated on the Internet: bugs in external code forced us to change the network MTU.
- Owning our own software has been a key enabler for faster turn-around on bugs, DoS detection, flexibility to design new features, and better monitoring and management.

Backup: ECMP
- Equal-Cost Multi-Path routing: hash the packet header and choose one of the equal-cost paths.
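A minimal Python sketch of ECMP next-hop selection as described above: hash the packet header fields and index into the list of equal-cost next hops, so a given flow consistently takes one path. The field choice and hash function are illustrative assumptions:

```python
import hashlib

# ECMP sketch: hash header fields, index into the equal-cost next hops.
def ecmp_next_hop(src_ip, dst_ip, src_port, dst_port, proto, paths):
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    h = int(hashlib.md5(key).hexdigest(), 16)
    return paths[h % len(paths)]

muxes = ["mux-1", "mux-2", "mux-3"]   # equal-cost next hops announced via BGP
print(ecmp_next_hop("203.0.113.7", "100.1.1.1", 51000, 80, "tcp", muxes))
```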

Backup: SEDA

Backup: SNAT

VIP traffic in a data center

CPU usage of Mux
- CPU usage over a typical 24-hour period by 14 Muxes in a single Ananta instance.

Remarkable Points
- The first middlebox architecture that moves part of its functionality to the host.
- Deployed in and serving Microsoft datacenters for more than 2 years.