Intelligent Placement of Datacenters for Internet Services
  Íñigo Goiri, Kien Le, Jordi Guitart, Jordi Torres, and Ricardo Bianchini
Motivation
  Internet services require thousands of servers
  Use multiple “mirror” datacenters
    – High availability and fault tolerance
    – Low response time
  Spend millions building and operating datacenters
  Consume enormous amounts of brown energy
Datacenter construction costs
  Each datacenter costs >$100M to construct
    – The smaller datacenters are rated at ~25MW
  Examples:
    – Microsoft DCs in Virginia & Chicago: $500M each
Energy costs and carbon emissions
  Company     Servers   Energy/year (MWh)   Energy cost/year   CO2/year (metric tons)
  eBay        16K       0.6 x 10^5          $3.7M              0.4 x 10^5
  Akamai      40K       1.7 x 10^5          $10M               1.0 x 10^5
  Rackspace   50K       2 x 10^5            $12M               1.2 x 10^5
  Microsoft   >200K     >6 x 10^5           >$36M              >3.6 x 10^5
  Google      >500K     >6.3 x 10^5         >$38M              >3.8 x 10^5
  Sources: [Qureshi’09], EPA
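Dividing the cost and emissions columns of the table by the energy column gives the per-MWh rates the figures imply. The sketch below does only that arithmetic; treating the ">" lower bounds for Microsoft and Google as exact values is an assumption made for illustration.

```python
# Rows copied from the table: (company, energy MWh/year, cost $/year, CO2 t/year).
# The ">" lower bounds for Microsoft and Google are treated as exact (assumption).
ROWS = [
    ("eBay", 0.6e5, 3.7e6, 0.4e5),
    ("Akamai", 1.7e5, 10e6, 1.0e5),
    ("Rackspace", 2.0e5, 12e6, 1.2e5),
    ("Microsoft", 6.0e5, 36e6, 3.6e5),
    ("Google", 6.3e5, 38e6, 3.8e5),
]


def implied_rates(rows):
    """Return {company: (dollars per MWh, metric tons of CO2 per MWh)}."""
    return {name: (cost / mwh, co2 / mwh) for name, mwh, cost, co2 in rows}


if __name__ == "__main__":
    for name, (price, carbon) in implied_rates(ROWS).items():
        print(f"{name:10s} ${price:6.2f}/MWh  {carbon:.2f} t CO2/MWh")
```

Every row works out to roughly $60 per MWh and roughly 0.6 metric tons of CO2 per MWh.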
Intelligent Placement of Datacenters
  Goal: Manage the monetary and environmental costs
  Define framework
  Model costs and datacenter characteristics
  Define optimization problem
  Create solution approaches
  Collect cost and location-related data
  Create placement tool
Outline
  Motivation
  Placing datacenters
  Evaluation
  Conclusion
Selecting datacenter locations
  Model datacenter placement
    – Network latencies
    – Availability
  CAPEX costs
    – Distance to electricity and networking infrastructure
    – Land and construction (maximum PUE)
    – Power delivery, cooling, backup equipment
    – Servers and networking equipment
  OPEX costs
    – Maintenance and administration
    – Electricity and water prices (average PUE)
  Incentives (taxes)
Formulating the problem
  Goal
    – Minimize CAPEX and OPEX
  Constraints
    – Response times < MAX LATENCY for all users
    – Min consistency delay between 2 DCs < MAX DELAY
    – Min system availability > MIN AVAILABILITY
  Output
    – Number of servers at each location
    – Minimum cost
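The three constraints above can be sketched as a feasibility check on a candidate set of datacenters. All names here (`Limits`, `is_feasible`, the dictionary layouts) are hypothetical. Two modeling choices are assumptions rather than something the slides state: the consistency bound is applied to every pair of chosen datacenters, and system availability assumes independent failures with the service up while any one datacenter is up.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Limits:
    max_latency_ms: float    # MAX LATENCY
    max_delay_ms: float      # MAX DELAY
    min_availability: float  # MIN AVAILABILITY


def system_availability(dc_availability, n_dcs):
    """Availability when the service is up while any one DC is up.

    Assumes independent failures; this is an illustrative model only.
    """
    return 1.0 - (1.0 - dc_availability) ** n_dcs


def is_feasible(dcs, user_latency_ms, dc_delay_ms, dc_availability, limits):
    """Check a candidate list of DC names against the three constraints.

    user_latency_ms[user][dc] is the latency from a user region to a DC.
    dc_delay_ms[frozenset((a, b))] is the delay between DCs a and b.
    """
    if not dcs:
        return False
    # Every user must reach its closest chosen DC within the latency bound.
    for latencies in user_latency_ms.values():
        if min(latencies[d] for d in dcs) >= limits.max_latency_ms:
            return False
    # Every pair of chosen DCs must be within the consistency-delay bound.
    for a, b in combinations(dcs, 2):
        if dc_delay_ms[frozenset((a, b))] >= limits.max_delay_ms:
            return False
    # The combined availability must exceed the required minimum.
    return system_availability(dc_availability, len(dcs)) > limits.min_availability
```

A check like this only filters candidates; the cost minimization itself is a separate step.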
Solving the (non-linear) problem
  Linear Programming
    – Does not support non-linear costs
  Brute force
    – Too slow
  Simple heuristics
    – May not produce accurate results efficiently
Our approach for solving the problem
  Evaluate each potential solution
    – Quickly via Linear Programming (LP)
  Consider neighboring configurations
    – Simulated annealing (SA)
  Cost optimization process
    – Combine SA and LP
  [Figure: SA moves from the current solution to a near neighbor; LP evaluates each one]
  [Figure: alternating SA and LP steps, with costs of $13.8M/month, $9.2M/month, $10.7M/month, and $10.3M/month]
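The combination of SA and LP can be sketched on a toy instance. This is not the authors' implementation: the sites, costs, and cooling schedule below are hypothetical. The outer loop uses simulated annealing to choose which sites are open, where the fixed cost per open site is the non-linear part. The inner step assigns servers at minimum linear cost. Because this toy inner LP has one demand constraint plus per-site capacity bounds, filling the cheapest sites first gives its exact optimum; a realistic instance would call an LP solver instead.

```python
import math
import random

# Hypothetical candidate sites. Costs are in $M/month and servers in thousands;
# none of these numbers come from the talk.
SITES = {
    "A": {"fixed": 3.0, "per_server": 0.10, "capacity": 40},
    "B": {"fixed": 1.0, "per_server": 0.15, "capacity": 40},
    "C": {"fixed": 2.0, "per_server": 0.12, "capacity": 30},
    "D": {"fixed": 0.5, "per_server": 0.30, "capacity": 60},
}


def lp_cost(open_sites, sites, servers_needed):
    """Inner step: cheapest server assignment for a fixed set of open sites.

    Returns (cost, allocation), or (inf, {}) if the open sites lack capacity.
    """
    remaining = servers_needed
    alloc = {}
    cost = 0.0
    # Cheapest-first filling is optimal for this single-constraint LP.
    for name in sorted(open_sites, key=lambda s: sites[s]["per_server"]):
        take = min(remaining, sites[name]["capacity"])
        alloc[name] = take
        cost += take * sites[name]["per_server"]
        remaining -= take
    if remaining > 0:
        return math.inf, {}
    # Each open site pays its fixed cost, even if it hosts no servers.
    cost += sum(sites[s]["fixed"] for s in open_sites)
    return cost, alloc


def anneal(sites, servers_needed, steps=2000, t0=1.0, seed=0):
    """Outer step: simulated annealing over the set of open sites."""
    rng = random.Random(seed)
    names = sorted(sites)
    current = frozenset(names)  # start with every site open
    cur_cost, _ = lp_cost(current, sites, servers_needed)
    if math.isinf(cur_cost):
        raise ValueError("not enough capacity even with every site open")
    best, best_cost = current, cur_cost
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9  # linear cooling schedule
        cand = current ^ {rng.choice(names)}     # toggle one site: a near neighbor
        cand_cost, _ = lp_cost(cand, sites, servers_needed)
        delta = cand_cost - cur_cost
        # Always accept improvements; accept worse moves with decaying probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_cost = cand, cand_cost
            if cur_cost < best_cost:
                best, best_cost = current, cur_cost
    return best, best_cost


if __name__ == "__main__":
    chosen, cost = anneal(SITES, servers_needed=60)
    print(sorted(chosen), f"${cost:.1f}M/month")
```

On this toy instance, exhaustive enumeration shows that opening sites A and B is the cheapest feasible choice, at $11.0M/month, which gives a reference point for checking the annealer.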
Summary of our approach
  Generate a grid of tentative locations
  Collect data about each location
  Define datacenter characteristics
  Instantiate optimization problem
  Solve optimization problem
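The first step, generating a grid of tentative locations, might look like the following sketch. The function name, the fixed spacing in degrees, and the rough bounding box for the contiguous US are all assumptions; the slides do not say how the grid is built.

```python
import math


def make_grid(lat_min, lat_max, lon_min, lon_max, step_deg):
    """Return (lat, lon) points covering a bounding box at a fixed spacing."""
    # The small epsilon guards against floating-point error in the division.
    n_lat = int(math.floor((lat_max - lat_min) / step_deg + 1e-9)) + 1
    n_lon = int(math.floor((lon_max - lon_min) / step_deg + 1e-9)) + 1
    return [
        (lat_min + i * step_deg, lon_min + j * step_deg)
        for i in range(n_lat)
        for j in range(n_lon)
    ]


if __name__ == "__main__":
    # Rough bounding box for the contiguous US (approximate, for illustration).
    grid = make_grid(25.0, 49.0, -125.0, -67.0, step_deg=2.0)
    print(len(grid), "tentative locations")
```

Each grid point would then be annotated with the location data collected in the next step, and points that fall outside usable land would need to be filtered out.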
Tool demo
  We built a tool that
    – Embodies the problem
    – Includes input data for the US
    – Offers multiple solution approaches
  Short video at:
Comparing locations for a 60k-server DC
Interesting questions
  How much does…
    – … lower latency cost?
    – … higher availability cost?
    – … faster consistency cost?
    – … a green DC network cost?
    – … a chiller-less DC network cost?
Cost of a 60k-server green DC network
  Green DC network costs $100k/month more, except when latency <70ms
Cost of a 60k-server chiller-less DC network
  Chiller-less DC network is cheaper, but it cannot achieve low latencies
Conclusions
  First scientific work on smart datacenter placement
    – Proposed framework and optimization problem
    – Proposed solution approach
    – Characterized many locations across the US
    – Built a tool to automate the process
    – Answered many interesting questions
  Results show that smart placement can save millions
  Work enables smaller companies to reap the benefits
Future work
  Extend with data from Europe
  Include tax incentives
  Test the tool with data from real services
Maximum user response time
  Maximum latency of 75 milliseconds
Location-dependent data
  Network backbones
    – Connectivity
    – Response time
  Power plants and transmission lines
    – Power capacity
    – CO2 emissions
  Pricing
    – Land
    – Electricity
    – Water
  Weather
    – Temperature → PUE
Location-dependent data
  Example:
    – Network backbones
    – Major cities
    – Electricity price
Datacenter characteristics
  Number of servers and internal networking
  Cooling cost (function of PUE)
  Infrastructure cost (power and networking)
  Building costs
  Land required
  Water consumption
  Staff costs
  Example: Building costs range from $8/W to $22/W
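Two of these characteristics reduce to simple arithmetic, sketched below. The function names are hypothetical. The building-cost function assumes the per-watt figure applies to the datacenter's rated power. The electricity function uses the standard definition of PUE, total facility energy divided by IT energy. The electricity price in the usage lines is a placeholder, not a value from the talk.

```python
HOURS_PER_MONTH = 8760 / 12  # average month, in hours


def building_cost_usd(rated_power_w, cost_per_watt_usd):
    """Building cost, assuming the $/W figure applies to rated power."""
    return rated_power_w * cost_per_watt_usd


def monthly_electricity_cost_usd(it_power_kw, average_pue, price_usd_per_kwh):
    """Monthly electricity cost: IT power scaled by PUE, times hours and price."""
    facility_kw = it_power_kw * average_pue  # PUE = facility energy / IT energy
    return facility_kw * HOURS_PER_MONTH * price_usd_per_kwh


if __name__ == "__main__":
    rated_w = 25e6  # ~25MW, the rating quoted for smaller datacenters
    low = building_cost_usd(rated_w, 8)
    high = building_cost_usd(rated_w, 22)
    print(f"building: ${low / 1e6:.0f}M to ${high / 1e6:.0f}M")
    # Placeholder inputs: 10 MW of IT load, PUE 1.5, $0.06/kWh.
    print(f"electricity: ${monthly_electricity_cost_usd(10_000, 1.5, 0.06):,.0f}/month")
```

Under that per-watt reading, a ~25MW datacenter at $8/W to $22/W corresponds to a building cost of $200M to $550M.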