1
Lessons Learned – Building PayPal Cloud
Chinmay Naik, Lead Software Engineer, Cloud Engineering
Anand Palanisamy, Manager, Software Development, Cloud Engineering
(OpenStack Summit – Hong Kong – 2013)
2
About PayPal
- 137,000,000 users
- $300,000 in payments processed by PayPal each minute
- 193 markets / 26 currencies
- PayPal is the world's most widely used digital wallet
3
Structure of the presentation
- Challenges we are trying to address
- Why OpenStack has emerged as a problem solver
- Getting OpenStack ready for production primetime
- Success stories
4
What are we trying to solve?
5
Some of Our Challenges
- Seamless, on-demand infrastructure capacity
- Do we really want those hundred tickets to deploy a service?
- Drive developer agility
- Provide a self-service tool for application lifecycle management
- Provide a platform to enable faster innovation
6
Who will get us there?
7
OpenStack is the winner
- Solves Infrastructure-as-a-Service
- It's open source: no vendor lock-in
- Fast-growing developer community
- Open standards and API driven
- Industry best practices prevent reinventing the wheel
8
Open source cannot always be used off the shelf
9
Our Technology Stack

- User interface: Operations Portal (Asgard, Horizon, Ceilometer) and PD Deployment Portal (traffic management, monitoring, metering, stages, workflow)
- Orchestration: orchestration engine on Cloud Formation (Heat)
- Foundational services: Nova, Cinder, Swift, Keystone, Neutron, Horizon, plus PayPal-specific LBaaS, DNSaaS, and FWaaS
- Software infrastructure: Cobbler, ISC DHCP, Salt, BIND, Zabbix, on an RHEL 6.x hypervisor
- Hardware infrastructure: x86 compute, local storage (HP 4x600 GB, mirrored), Cisco 4948 & Arista 7050 network with Nicira NVP, F5 load balancer

Two entry points into the infrastructure: PayPal product developers, and cloud operators managing the cloud centrally. Deployments are orchestrated using Heat; a minimal sketch of driving Heat follows.
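Since deployments are centrally orchestrated through Heat, a stack template is the unit of deployment. Below is a minimal sketch (not PayPal's actual templates) of driving Grizzly-era Heat from Python with python-heatclient; the endpoint, token, image name, and template body are illustrative assumptions.

import json
from heatclient.client import Client

# A CloudFormation-compatible template with a single app-tier instance.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "One app-tier VM",
    "Resources": {
        "AppServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {"ImageId": "rhel-6.4", "InstanceType": "m1.medium"},
        },
    },
}

# Endpoint and token are assumptions; in practice they come from Keystone.
heat = Client('1', endpoint='http://heat.example.com:8004/v1/TENANT_ID',
              token='AUTH_TOKEN')
heat.stacks.create(stack_name='app-tier-demo', template=json.dumps(template))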
10
Tuning Nova for High Availability

- Scheduling enhancements for failure and availability domains
- Custom PayPal filter scheduler (a sketch of such a filter follows this slide)
- Tenant-based compute-zone filters in Folsom; host-aggregate filtering in Grizzly
- 25% distribution among fault zones for HA

A rack of servers is an important entity: it defines a fault zone (availability zone).
1. Use host aggregates to define the availability zone for all hosts in a half rack.
2. Use host aggregates for the front and mid tiers (production) and on a per-requirement basis, then map tenants to these aggregates.

It is tenant based; production requires special tenants to have their VMs land on specific computes.
- In Grizzly: modified host aggregates and added a new table for the tenant-to-host-aggregate mapping.
- In Folsom: used the concept of compute zones. A compute zone could contain hosts from different availability (fault) zones, so we wrote our own compute-zone filter. A reserved compute zone guarantees a host is dedicated to the zone's owner and no one else lands on it.

25% availability-zone distribution: equal distribution of VMs across zones for high availability.
- The custom PayPal filter scheduler calculates the VMs per availability zone for the tenant requesting instances; this information drives the 25% filtering.
- A weigher then ranks hosts by availability-zone fullness.
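A minimal sketch of how such a rule can be expressed as a custom Nova scheduler filter, using the Folsom/Grizzly-era BaseHostFilter interface. This is not PayPal's code: the _zone_of and _tenant_zone_counts helpers are hypothetical stand-ins for lookups that would really go through host aggregates and the Nova DB API.

from nova.scheduler import filters


class QuarterPerZoneFilter(filters.BaseHostFilter):
    """Reject hosts whose fault zone already holds 25% or more of the
    requesting tenant's VMs, forcing a spread across at least 4 zones."""

    def host_passes(self, host_state, filter_properties):
        spec = filter_properties.get('request_spec', {})
        tenant_id = spec.get('instance_properties', {}).get('project_id')

        zone = self._zone_of(host_state)              # hypothetical helper
        counts = self._tenant_zone_counts(tenant_id)  # hypothetical helper
        total = sum(counts.values())

        # Pass only if landing here keeps this zone at or under 25%.
        return (counts.get(zone, 0) + 1) / float(total + 1) <= 0.25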
11
Nova Changes

- Instance host-name uniqueness
- Auto-assigning floating IPs to VMs
- Rack-aware networking
- Leveraging config-drive
- Nova conductor: security vs. load on RabbitMQ

Instance host naming (also helps meet some of our ops-tooling requirements):
- Template based, and configurable per tenant.
- Host-name validation logic at the nova-api level rejects non-standard characters.

Auto-assigned floating IPs, for external connectivity in the environments that require it:
- Plugged into Nova at instance-launch time; Nova orchestrates the Quantum API calls to allocate a floating IP and assign it to the instance (a client-side sketch of the flow follows this slide).

Rack-aware networking (in Grizzly) selects the correct Neutron network to allocate IPs from for launched instances; bridged vs. overlay networks.

Config-drive stores cell-specific configuration, device-type labels, etc.

Nova conductor services: security vs. load on RabbitMQ in large deployments.
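PayPal plugged this into Nova itself; doing the same flow from a client is an illustrative stand-in. A minimal sketch: boot an instance, find the port Nova created for it, then allocate a Quantum/Neutron floating IP on that port. Endpoints, credentials, image/flavor IDs, and the external network ID are assumptions.

from novaclient.v1_1 import client as nova_client
from neutronclient.v2_0 import client as neutron_client

nova = nova_client.Client('demo', 'secret', 'demo-tenant',
                          'http://keystone.example.com:5000/v2.0')
neutron = neutron_client.Client(
    username='demo', password='secret', tenant_name='demo-tenant',
    auth_url='http://keystone.example.com:5000/v2.0')

server = nova.servers.create(name='web-0001', image='IMAGE_ID',
                             flavor='FLAVOR_ID')

# Find the port Nova created for the instance, then attach a floating IP.
port = neutron.list_ports(device_id=server.id)['ports'][0]
fip = neutron.create_floatingip({'floatingip': {
    'floating_network_id': 'EXTERNAL_NET_ID',
    'port_id': port['id']}})
print(fip['floatingip']['floating_ip_address'])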
12
Keystone Changes

- Integrating Keystone with LDAP
- Auto-tenancy feature
- Tenant-based hostnames & DNS zones
- Client-side token caching
- Team-admin feature

Keystone is integrated with AD and OpenLDAP for easier authentication of all internal users.

Auto tenancy (for specific clouds): a user can start using the cloud right away.
- The tenant name is assigned from the username.
- A default Member or team-admin role on that tenant is created for the user.

Tenant metadata:
- The extras field is used to save key/value pairs; the concept is similar to host aggregates, with tenants tagged by key/value pairs.
- Horizon gained features that let users select from their tenant's DNS list at instance-launch time.

Client-side token caching (a sketch follows this slide):
- The Quantum client calls made by Nova create a lot of Keystone tokens.
- Caching tokens on the client side and reusing them reduces the total number of tokens stored in Keystone and speeds up Keystone.

Team admin (was to be implemented with the domains concept in the v3 Keystone APIs):
- You do not need the OpenStack admin to manage a tenant; you can be team admin of a few tenants.
- A new user role, team admin, is added.
- Configurable via team_admin_roles = Member: the roles with which team admins add or remove normal users on tenants.
- Helpful for listing the roles of corporate users and tenants.

All of these features are configurable.
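A minimal sketch of client-side token caching in the spirit described above (the real change lived inside Nova's Quantum client calls). Not PayPal's code: the 5-minute refresh margin and the assumed 24-hour token lifetime are illustrative. Uses the Grizzly-era keystoneclient v2.0 API.

import time
from keystoneclient.v2_0 import client as keystone_client


class CachedToken(object):
    """Fetch a Keystone token once and reuse it until near expiry."""

    MARGIN = 300  # refresh five minutes before our deadline

    def __init__(self, **auth_kwargs):
        self._auth_kwargs = auth_kwargs
        self._token = None
        self._deadline = 0

    def get(self):
        if self._token is None or time.time() > self._deadline - self.MARGIN:
            ks = keystone_client.Client(**self._auth_kwargs)
            self._token = ks.auth_token
            self._deadline = time.time() + 24 * 3600  # assumed lifetime
        return self._token


token_source = CachedToken(username='nova', password='secret',
                           tenant_name='service',
                           auth_url='http://keystone.example.com:5000/v2.0')
headers = {'X-Auth-Token': token_source.get()}  # reused on every API call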
13
DNS-as-a-Service Integration

- Automatic project-based zones
- Floating IPs

Details (a hypothetical sketch of the registration hook follows this slide):
- Allows each instance to have a unique IP-to-FQDN binding registered in production DNS.
- REST-API driven and integrated into Nova: allocation and deallocation of entries are handled at instance creation and deletion time.
- The tenant-based DNS zoning feature leverages tenant metadata to support different zones per tenant, on a need basis.
- DNS support is extended to Quantum floating IPs as well.
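PayPal's DNSaaS API is internal, so the endpoint, payload shape, and zone naming below are hypothetical. The sketch shows the kind of REST hook Nova would call at instance create and delete time.

import json
import urllib2

DNSAAS = 'http://dnsaas.example.com/v1'  # hypothetical endpoint


def register_instance(fqdn, ip, token):
    """Create an A record binding the instance's FQDN to its IP."""
    req = urllib2.Request(
        '%s/records' % DNSAAS,
        data=json.dumps({'name': fqdn, 'type': 'A', 'data': ip}),
        headers={'Content-Type': 'application/json',
                 'X-Auth-Token': token})
    return urllib2.urlopen(req)


def deregister_instance(fqdn, token):
    """Remove the record when the instance is deleted."""
    req = urllib2.Request('%s/records/%s' % (DNSAAS, fqdn),
                          headers={'X-Auth-Token': token})
    req.get_method = lambda: 'DELETE'
    return urllib2.urlopen(req)

# Called from the (hypothetical) Nova create/delete hooks, e.g.:
# register_instance('web-0001.tenant-a.example.com', '10.2.3.4', token)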
14
Load-Balancer-as-a-Service

- Registration and auto-discovery
- Rich tenant- and operator-facing APIs
- Propagating changes to multiple LBs
- Change-management integration

The two main things about it: (1) REST-API driven, and (2) tenant-based segregation.
- Registration and discovery of physical load balancers.
- Management of VIPs, pools, monitors, L7 rules, SSL certs, and services through the GUI, the PaaS, and Heat (an analogous sketch using the open Neutron LBaaS API follows this slide).
- Devices are not exposed to cloud users but are visible to operators.
- Operator-facing APIs for managing devices, config backup/restore, and config sync across primary and secondary LBs.
- Granular job status, resubmission of failed jobs, 100% async, pre- and post-validation.
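PayPal's LBaaS API is internal; as an analogous open example, this sketch uses the Grizzly-era Neutron LBaaS v1 client to create a pool, add a member, attach a health monitor, and expose a VIP. Credentials, the subnet ID, and addresses are assumptions.

from neutronclient.v2_0 import client as neutron_client

neutron = neutron_client.Client(
    username='demo', password='secret', tenant_name='demo-tenant',
    auth_url='http://keystone.example.com:5000/v2.0')

pool = neutron.create_pool({'pool': {
    'name': 'web-pool', 'protocol': 'HTTP', 'lb_method': 'ROUND_ROBIN',
    'subnet_id': 'SUBNET_ID'}})['pool']

neutron.create_member({'member': {
    'pool_id': pool['id'], 'address': '10.2.3.4', 'protocol_port': 80}})

monitor = neutron.create_health_monitor({'health_monitor': {
    'type': 'HTTP', 'delay': 5, 'timeout': 3,
    'max_retries': 2}})['health_monitor']
neutron.associate_health_monitor(
    pool['id'], {'health_monitor': {'id': monitor['id']}})

vip = neutron.create_vip({'vip': {
    'name': 'web-vip', 'pool_id': pool['id'], 'protocol': 'HTTP',
    'protocol_port': 80, 'subnet_id': 'SUBNET_ID'}})['vip']
print(vip['address'])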
15
Other Success Stories
16
User Experience

- Ease of use, adoption, multi-version, multi-region
- Velocity use case in Asgard itself; cell deployment with centralized LDAP login
- Managing different releases of OpenStack with a simple JSON config change (a sketch follows this slide)
- Options to pick and choose
- NVD3-based graphs and a Bootstrap-based GUI
- Easy install
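A minimal sketch of driving multi-version, multi-region support from a JSON config, as the slide describes. The file layout and keys are hypothetical; the point is that adding a region or an OpenStack release is a config edit, not a code change.

import json

# clouds.json (hypothetical layout):
# {
#   "regions": {
#     "us-west": {"release": "folsom",
#                 "keystone": "http://ks-usw.example.com:5000/v2.0"},
#     "asia":    {"release": "grizzly",
#                 "keystone": "http://ks-asia.example.com:5000/v2.0"}
#   }
# }

with open('clouds.json') as f:
    regions = json.load(f)['regions']

for name, cfg in sorted(regions.items()):
    # The UI would pick release-specific client code paths off "release".
    print('%s -> %s (%s)' % (name, cfg['keystone'], cfg['release']))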
17
User interface screenshots (the original slide contained screenshots only)
18
User interface screenshots, continued (screenshots only)
19
Deployment Pain Points
- DevStack != production
- Keeping up with trunk
- A single Keystone service
- Performance & scalability
- Error handling
20
Confidential and Proprietary
21
Courtesies for images used
22
Thank you