Virtualization for the Win! Scaling Electronic Sports League’s servers way up
  Sreeram Sammeta, Paul Lindberg
  Intel
Agenda
  MMO hosting is well understood, but hosting lots of LAN game servers can be hard.
  What is virtualization?
  Can we virtualize LAN game server code?
  Electronic Sports League (ESL) tests showed it can be done.
  New hardware and software technologies let us virtualize more.
  Do more with less!
Electronic Sports League*: Largest online gaming community in Europe
  Electronic Sports League (ESL) has >1 million active members (1)
  Mission-critical game servers
    Sensitive to transaction latency
    Often single-thread CPU intensive
  (1) Source: ESL web site, August 19, 2009
Industry knows how to host a “typical” MMO
  Like any multi-tier IT shop
  Dedicated network between tiers
  SW/HW designed for throughput and availability, as needed
  [Diagram labels: Database, Compute, Network, Internet, Game clients (1000s)]
Hosting lots of LAN game servers can be harder
  1 is easy
  How would you host 100 LAN games? 1000? 10,000?
  [Diagram labels: Server, Internet, Game clients (16 typical)]
Scaling up LAN game servers could be hard + expensive
  LAN game server code is sometimes not built to make hosting easy
    Usually CPU-intensive
      Needs lots of CPU “headroom”
    Assumes it owns the machine?
      Network: single IP address?
      File system
  Impractical/unmanageable to run many server procs on one machine
  “Simple” way: a few game procs per server ...and lots of servers
  But lots of servers means lots of $$, space, and W
  Is there a better way?
ESL had a challenge: Host lots of game servers without compromising
  LAN game servers, like Counter-Strike* 1.6, are usually:
    CPU intensive: single-thread vs. multi-thread
    Memory intensive: size/throughput/latency
    Network I/O intensive: throughput/latency
  How could ESL host lots of game servers (especially LAN games) to meet demand?
  Must maintain Quality of Service (QoS)!
Virtualization can help!
  Virtualization is a best practice used by many IT shops to consolidate servers for cost and efficiency
  Can we really get enough performance in a virtual machine (VM) to satisfy gamers?
    Perception: Can’t!
    We say: Can!
  Here’s how…
Virtualization shares hardware
  [Diagram, without virtualization: CS 1, CS 2, CS 3 on Windows* Server 2003 on physical hardware (CPU, memory, network, storage)]
  [Diagram, with virtualization: VMware* Hypervisor (OS + Virtual Machine Manager) on physical hardware (CPU, memory, network, storage), hosting Windows* Server 2003 (1) … Windows* Server 2003 (n) on virtual HW, each running CS 1, CS 2, CS 3]
ESL Proof of Concept (PoC) showed it can be done
  Hypothesis: Virtualization of gaming servers may be possible
  Use the latest technologies
    Intel® Xeon® 7400 processor based servers
    Intel VMDq NICs
    VMware* ESX* 3.5U1 & NetQueue*
  Test if virtualization adds network latency, in the Intel lab
  Private ESL lab
  Public testing on the Internet with real ESL members
  Success!
PoC Hardware makes it easier
  Intel® Xeon® processor 7400 Series
    Performance boost from 6-core with 16 MB L3 cache
    Energy-efficiency boost from 45nm high-k process technology
    Enhanced hardware assist features for virtualization
  [Diagram labels: 4 sockets with 6 cores each (7400 Series); 4x 1066 MHz FBD; 32 slots, 32 GB tested (256 GB max); configurable PCI Express; ESB2 I/O Bridge]
Network I/O can be slow if not tuned for virtualization
  VMM overhead
  Switching load
  Interrupt bottleneck
  [Chart: Throughput (Gb/s), No VMDq; callout: >=60% of the NIC capacity unused]
  [Diagram labels: VM 1, VM 2, … VM n; Virtual NIC; Virtualization Hypervisor; NIC; LAN]
  Throughput measures receive side (Rx) I/O performance of 10GbE LAN. Source: Intel.
Queues for each VM give near-native throughput
  VMDq & NetQueue*
    Optimize switching
    Load balance interrupts
  [Chart: Throughput (Gb/s); bars: No VMDq; VMDq; VMDq Jumbo Frames; callouts: >2x throughput! Near native 10GbE]
  [Diagram labels: VM 1, VM 2, … VM n; Virtual NIC; VMware* with NetQueue*; NIC with VMDq; LAN]
  Tests measure wire speed receive (Rx) side performance with VMDq on Intel® Gigabit Ethernet Controller. Source: Intel.
PoC Software fits it together
  VMware* ESX* 3.5 U1
  Virtual Center* 2.5
  NetQueue* enabled (16 queues)
  1 virtual CPU per VM
  2 GB memory per VM
  Windows* Server* bit
  Counter-Strike* 1.6
It’s all about latency: Don’t make it worse
  Player sends ~ byte update to server
  Server sends ~2000 bytes in return
  In-game transaction latency = round-trip network latency + game server processing time
  Source: ESL observations.
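The latency relationship on this slide can be written as a small function. This is a minimal sketch: the function name and the sample millisecond values are illustrative assumptions, not ESL measurements.

```python
def in_game_transaction_latency_ms(network_rtt_ms: float,
                                   server_processing_ms: float) -> float:
    """In-game transaction latency = round-trip network latency
    + game server processing time (both in milliseconds)."""
    if network_rtt_ms < 0 or server_processing_ms < 0:
        raise ValueError("latencies must be non-negative")
    return network_rtt_ms + server_processing_ms


# Hypothetical inputs, chosen only to illustrate the sum.
example_latency_ms = in_game_transaction_latency_ms(4.0, 1.0)
print(f"in-game transaction latency: {example_latency_ms} ms")
```

Because the two terms simply add, any latency that virtualization adds to the network path shows up one-for-one in what the player experiences, which is why the following slides focus on keeping that added latency small.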
VMDq keeps latency near native levels!
  Virtualization with no VMDq increases latency
  VMDq latency is near-native
  Negligible effect on latency! << In-Game Transaction Latency (5 ms best case)
  Source: Intel Lab. Performance measured using the netperf benchmark (UDP latency test with 8 parallel streams) running on Intel® Xeon® processors 7300 (2.93 GHz).
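The slide's figures came from netperf. As a rough stand-in for readers who want to see what a UDP round-trip latency test does, here is a minimal sketch in Python that echoes datagrams over loopback and times each round trip. It is not netperf and uses a single stream rather than 8 parallel ones; the function names, the 64-byte payload, and the packet count are assumptions made for illustration.

```python
import socket
import statistics
import threading
import time
from typing import List


def run_echo_server(sock: socket.socket, count: int) -> None:
    """Echo `count` datagrams back to their senders, then return."""
    for _ in range(count):
        data, addr = sock.recvfrom(65535)
        sock.sendto(data, addr)


def measure_udp_rtt_ms(host: str, port: int, payload: bytes,
                       count: int) -> List[float]:
    """Send `count` datagrams one at a time and time each round trip."""
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(2.0)  # fail instead of hanging on a lost packet
        for _ in range(count):
            start = time.perf_counter()
            client.sendto(payload, (host, port))
            client.recvfrom(65535)
            rtts.append((time.perf_counter() - start) * 1000.0)
    return rtts


def loopback_demo(count: int = 50, payload_size: int = 64) -> List[float]:
    """Run the echo server in a thread and measure RTT over 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    port = server.getsockname()[1]
    worker = threading.Thread(target=run_echo_server,
                              args=(server, count), daemon=True)
    worker.start()
    try:
        return measure_udp_rtt_ms("127.0.0.1", port,
                                  b"x" * payload_size, count)
    finally:
        worker.join(timeout=2.0)
        server.close()


if __name__ == "__main__":
    samples = loopback_demo()
    print(f"median RTT: {statistics.median(samples):.3f} ms")
```

To compare native and virtualized paths in this style, one would run the echo side on the machine or VM under test and the measuring side on a separate client, then compare the two distributions against the 5 ms best-case in-game transaction latency quoted above.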
Live tests find ideal load
  Private ESL & Public Internet testing revealed no impact on In-Game Transaction Latency!
  Source: ESL Lab. Performance measured using esxtop & power meter with reference s/w stack running on Intel® Xeon® processors 7400 (2.67 GHz).
Replace 18 servers with 1!
  Before: 1P Core™2 Duo (2 cores)
    6 game servers (3 per core)
    72 gamers
  After: 4P Xeon 7400 (24 cores)
    108 game servers (3 per VM, 36 VMs)
    1296 gamers
  18x game servers per machine (18:1 consolidation ratio)
  18x gamers per machine
  Same CPU headroom, same user experience!
  Source: ESL Lab. Performance measured using esxtop & power meter with reference s/w stack running on Intel® Xeon® processors 7400 (2.67 GHz). Power savings calculated based on ESL actual power rate & Yahoo $/€ exchange rate as of
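The consolidation figures on this slide follow from simple multiplication. A minimal sketch of the arithmetic, using only the slide's numbers; the variable and function names are illustrative, and the 12 gamers per game server is derived here from 72 gamers on 6 game servers rather than stated on the slide.

```python
def game_servers(units: int, servers_per_unit: int) -> int:
    """Total game servers when each unit (core or VM) runs a fixed number."""
    return units * servers_per_unit


# Before: 1P Core 2 Duo, 2 cores, 3 game servers per core.
before_servers = game_servers(units=2, servers_per_unit=3)

# After: 4P Xeon 7400, 36 VMs, 3 game servers per VM.
after_servers = game_servers(units=36, servers_per_unit=3)

# Derived from the slide: 72 gamers spread over 6 game servers.
gamers_per_game_server = 72 // before_servers

before_gamers = before_servers * gamers_per_game_server
after_gamers = after_servers * gamers_per_game_server
consolidation_ratio = after_servers // before_servers

print(f"game servers: {before_servers} -> {after_servers}")
print(f"gamers:       {before_gamers} -> {after_gamers}")
print(f"consolidation ratio: {consolidation_ratio}:1")
```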
18:1 consolidation yields big efficiency
  18 servers into 1!
  [Diagram, before: 18 Intel® Core™2 Duo machines, (1) through (18), each running Windows Svr 2003 with CS (1) through CS (6)]
  [Diagram, after: one Intel® Xeon® Processor 7400 series based server running VMware* ESX 3.5 U1 with VM (1) through VM (36), each running Win 2003 with CS (1), CS (2), CS (3)]
18:1 consolidation gives big power savings
  Before: 1P Core 2 Duo (2 cores)
    350 W / machine (6300 W / 18 machines)
    Power $731K / 1000 machines
  After: 4P Xeon 7400 (24 cores)
    710 W / machine
    Power $83K / equivalent machines
  Power savings: $648K per year for each 1000 machines converted!
  Power and other factors give a good Total Cost of Ownership (TCO)
  Actual performance and savings may vary.
  Source: ESL Lab. Performance measured using esxtop & power meter with reference s/w stack running on Intel® Xeon® processors 7400 (2.67 GHz). Power savings calculated based on 24x7x365 usage, ESL actual power rate & Yahoo $/€ exchange rate as of
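The power figures can be checked with a short calculation. This is a sketch under stated assumptions: the slide does not give ESL's power rate, so the code back-solves an implied rate from the $731K figure, and it rounds 1000/18 up to 56 replacement machines. Both choices are assumptions of this example, as are all names in it.

```python
import math

HOURS_PER_YEAR = 24 * 365  # slide assumes 24x7x365 usage


def annual_energy_kwh(machines: int, watts_per_machine: float) -> float:
    """Energy drawn in one year by `machines` running continuously."""
    return machines * watts_per_machine * HOURS_PER_YEAR / 1000.0


# Before: 1000 Core 2 Duo machines at 350 W each, costing $731K per year.
before_machines = 1000
before_kwh = annual_energy_kwh(before_machines, 350.0)
before_cost_usd = 731_000.0

# Assumption: implied power rate, back-solved from the slide's $731K.
implied_rate_usd_per_kwh = before_cost_usd / before_kwh

# After: 18:1 consolidation, rounded up to whole machines (assumption).
after_machines = math.ceil(before_machines / 18)
after_kwh = annual_energy_kwh(after_machines, 710.0)
after_cost_usd = after_kwh * implied_rate_usd_per_kwh

savings_usd = before_cost_usd - after_cost_usd

print(f"implied rate: ${implied_rate_usd_per_kwh:.4f}/kWh")
print(f"after: {after_machines} machines, ${after_cost_usd / 1000:.0f}K per year")
print(f"savings: ${savings_usd / 1000:.0f}K per year")
```

Under these assumptions the result lands on the slide's $83K and $648K figures, which suggests the slide used a similar whole-machine rounding.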
ESL and gamers loved it!
  "Playing on virtualized gameservers running on Intel® and VMware* technologies gives professional gamers no disadvantages compared with playing on a non virtualized server. Everything ran smoothly and I did not notice anything unusual. A perfect setup for professional gaming."
    Navid Javadi aka mousesports|Kapio
  "The new Quad-core Intel® Xeon® processor 7400 series were completely overwhelming in all terms. The Intel Xeon MP processor … based servers with Intel VMDq technology enable us to efficiently run our servers with reduced costs and without any negative impacts."
    Bjoern Metzdorf, Director Information Technology, Electronic Sports League
New virtualization tech might help you, too!
  Do you have “non-virtualize-able” apps? Really? Try them!
    Can have very low network latency
    Can consolidate many servers into 1
  Consolidating servers can lead to big efficiency and power savings
  Read more: archive/performance-sensitive-application.php
Legal Disclaimer
  INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.
  Intel may make changes to specifications and product descriptions at any time, without notice.
  All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
  Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
  Code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user.
  Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance.
  Intel, Intel Inside, Xeon and the Intel logo are trademarks of Intel Corporation in the United States and other countries.
*Other names and brands may be claimed as the property of others. Copyright © 2009 Intel Corporation.