FAWN: A Fast Array of Wimpy Nodes*
Bogdan Eremia, SCPD
*by David Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan
Energy in computing
• Power is a significant burden on computing
• 3-year TCO soon to be dominated by power
(Figure: hydroelectric dam)
“Google’s power consumption ... would incur an annual electricity bill of nearly $38 million” [Qureshi, SIGCOMM 2009]
“Energy consumption by ... data centers could nearly double ... (by 2011) to more than 100 billion kWh, representing a $7.4 billion annual electricity cost” [EPA Report 2007]
Annual cost of energy for Google, Amazon, Microsoft = annual cost of all first-year CS PhD students
Monday, October 12, 2009
Can we reduce energy use by a factor of ten?
• Still serve the same workloads
• Avoid increasing capital cost
FAWN: Fast Array of Wimpy Nodes
Improve the computational efficiency of data-intensive computing using an array of well-balanced low-power systems.
• Traditional server (220W): CPU, memory, disk
• FAWN node (40W): AMD Geode CPU, 256MB DRAM, 4GB CompactFlash
Goal: reduce peak power
(Figure: a traditional datacenter draws 1000W per server plus substantial power distribution and cooling overhead, roughly 20% energy loss; FAWN servers draw <100W each)
Overview
• Background
• FAWN Principles
• FAWN-KV Design
• Evaluation
• Conclusion
Towards balanced systems
(Figure: CPU cycle time, DRAM access time, and disk seek time in nanoseconds, 1980–2005; the widening gap between CPU and disk leaves resources wasted)
Rebalancing options:
• Today’s CPUs + array of fastest disks
• Slower CPUs + today’s disks
• Slow CPUs + fast storage
Targeting the sweet spot in efficiency
• Fastest processors exhibit superlinear power usage
• Fixed power costs can dominate efficiency for slow processors
• FAWN targets the sweet spot in system efficiency when including fixed costs
(Figure: instructions/sec/W vs. instructions/sec, both in millions, for a custom ARM mote, XScale 800MHz, Atom Z500, and Xeon 7350; includes 0.1W power overhead)
Targeting the sweet spot in efficiency
(Figure: the same efficiency plot, annotated with the rebalancing options: today’s CPU + fastest disks, slower CPU + today’s disk, slow CPU + array of fast storage; FAWN sits at the most efficient point)
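The sweet-spot argument can be made concrete with a small calculation. This is an illustrative sketch, not the talk's measurements: the speed and power numbers below are hypothetical stand-ins for the mote/wimpy/server classes on the plot, and only the 0.1W fixed overhead comes from the slide.

```python
# Hypothetical numbers illustrating the slide's claim: with a fixed
# platform power overhead, neither the slowest nor the fastest CPU
# maximizes instructions/sec per watt.

FIXED_OVERHEAD_W = 0.1  # fixed power cost per node, as on the slide's plot

def efficiency(ips_millions: float, cpu_watts: float) -> float:
    """Millions of instructions/sec per watt, including fixed overhead."""
    return ips_millions / (cpu_watts + FIXED_OVERHEAD_W)

# (name, millions of instructions/sec, CPU power in watts) -- hypothetical
systems = [
    ("mote",        5,     0.01),   # very slow: fixed overhead dominates
    ("wimpy node",  1000,  0.8),    # FAWN-class sweet spot
    ("server CPU",  50000, 80.0),   # superlinear power at the high end
]

best = max(systems, key=lambda s: efficiency(s[1], s[2]))
for name, ips, w in systems:
    print(f"{name:12s} {efficiency(ips, w):8.1f} MIPS/W")
print("most efficient:", best[0])
```

With these stand-in numbers the mid-range node wins on MIPS/W even though the server CPU is 50x faster, matching the shape of the curve on the slide.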
Overview
• Background
• FAWN Principles
• FAWN-KV Design
  • Architecture
  • Constraints
• Evaluation
• Conclusion
Data-intensive key-value workloads
• Critical infrastructure service
• Service-level agreements for performance/latency
• Random-access, read-mostly, hard to cache
FAWN-KV: Our Key-Value Proposition
• Energy-efficient cluster key-value store
• Goal: improve queries/Joule
• Prototype: Alix3c2 nodes with flash storage (500MHz CPU, 256MB DRAM, 4GB CompactFlash)
Unique challenges:
• Efficient and fast failover
• Wimpy CPUs, limited DRAM
• Flash is poor at small random writes
FAWN-KV Architecture
• Front-end: acts as gateway, routes requests, manages back-ends
• Back-ends form a KV ring via consistent hashing; each runs the FAWN-DS datastore
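The KV ring above is ordinary consistent hashing. A minimal sketch, assuming SHA-1 as the ring hash and without the virtual nodes a production ring (including FAWN-KV's) would use:

```python
# Minimal consistent-hashing ring (illustrative; node names are made up,
# and real deployments add virtual nodes for load balance).
import hashlib
from bisect import bisect_right

def h(key: str) -> int:
    """Position on the ring: integer value of the key's SHA-1 digest."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self.points = sorted((h(n), n) for n in nodes)
        self.hashes = [p for p, _ in self.points]

    def owner(self, key: str) -> str:
        """A key is owned by the first node clockwise from its hash."""
        i = bisect_right(self.hashes, h(key)) % len(self.points)
        return self.points[i][1]

ring = Ring(["backend-A", "backend-B", "backend-C"])
print(ring.owner("some-key"))
```

The payoff of this scheme, relied on later in the talk, is that adding or removing a back-end reassigns only one contiguous key range rather than rehashing everything.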
FAWN-KV Architecture: design drivers
• FAWN-DS (limited resources): avoid random writes
• FAWN-KV (efficient failover): avoid random writes
Log-structured Datastore
• Log-structuring avoids small random writes
• Get: one random read from the log; Put and Delete: append to the log
• FAWN-DS (limited resources, avoid random writes) ✔; FAWN-KV (efficient failover) still to come
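The Get/Put/Delete mapping above can be sketched as a toy log-structured store. This is a simplified illustration in the spirit of FAWN-DS, not the authors' code: an in-memory index maps each key to its latest log offset, writes only ever append, and the record format (and the assumption that keys contain no `=`) is invented here.

```python
# Toy log-structured store: Put/Delete append sequentially (flash-friendly),
# Get does a single random read via the in-memory index.
import io

TOMBSTONE = b"\x00"  # assumed marker recording a deletion in the log

class LogStore:
    def __init__(self):
        self.log = io.BytesIO()   # stands in for the on-flash log file
        self.index = {}           # key -> (offset, length) of latest record

    def _append(self, key: str, value: bytes) -> None:
        off = self.log.seek(0, io.SEEK_END)
        rec = key.encode() + b"=" + value + b"\n"
        self.log.write(rec)
        self.index[key] = (off, len(rec))   # point index at newest record

    def put(self, key: str, value: bytes) -> None:
        self._append(key, value)            # sequential write only

    def delete(self, key: str) -> None:
        self._append(key, TOMBSTONE)        # delete is also an append

    def get(self, key: str):
        if key not in self.index:
            return None
        off, length = self.index[key]
        self.log.seek(off)                  # the one random read
        value = self.log.read(length).split(b"=", 1)[1][:-1]
        return None if value == TOMBSTONE else value
```

Updates leave stale records behind in the log; reclaiming that space is exactly the background compaction the next slides discuss.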
On a node addition
(Figure: hash ring with nodes A–H and a hash index over values; the joining node takes over the key range (H,B])
• Node additions and failures require transfer of key ranges
Nodes stream data ranges
• The data range is streamed from B to A
• Atomic update of the datastore list minimizes locking
• Concurrent inserts proceed while the datastore is compacted; background operations stay sequential
• Continues to meet SLAs
• FAWN-DS (avoid random writes) ✔  FAWN-KV (efficient failover) ✔
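The range transfer on the slides above can be sketched in a few lines. This is illustrative only: a plain dict stands in for the on-flash log, the 16-bit hash space and function names are made up, and the real system streams records and swaps the datastore list atomically rather than rebuilding dicts.

```python
# Sketch of splitting off the key range (lo, hi] for a joining node.
import hashlib

RING = 2**16  # assumed toy hash-space size

def hpos(key: str) -> int:
    """Position of a key on the toy ring."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % RING

def in_range(pos: int, lo: int, hi: int) -> bool:
    """Half-open ring range (lo, hi], wrapping modulo the hash space."""
    return (lo < pos <= hi) if lo < hi else (pos > lo or pos <= hi)

def split(store: dict, lo: int, hi: int):
    """Records in (lo, hi] go to the new node; the source keeps the rest."""
    moved = {k: v for k, v in store.items() if in_range(hpos(k), lo, hi)}
    kept = {k: v for k, v in store.items() if k not in moved}
    return kept, moved
```

Because every key lands in exactly one of the two stores, the source can serve reads throughout and flip ownership with one atomic list update once the stream completes.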
FAWN-KV Take-aways
• Log-structured datastore
  • Avoids random writes at all levels
  • Minimizes locking during failover
  • Careful resource use but high performing
• Replication and strong consistency
  • Variant of chain replication (see paper)
Overview
• Background
• FAWN Principles
• FAWN-KV Design
• Evaluation
• Conclusion
Evaluation Roadmap
• Key-value lookup efficiency comparison
• Impact of background operations
• TCO analysis for random-read workloads
FAWN-DS Lookups

System                   QPS    Watts   QPS/Watt
Alix3c2 / Sandisk (CF)   1298   3.75    346
Desktop / Mobi (SSD)     4289   83      51.7
MacbookPro / HD          66     29      2.3
Desktop / HD             171    87      1.96

• FAWN-based system is over 6x more efficient than 2008-era traditional systems
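The efficiency column is just QPS divided by power; recomputing it from the slide's QPS and Watts measurements shows where the "over 6x" claim comes from:

```python
# Recompute QPS/Watt from the slide's measured QPS and power draw.
measurements = {
    "Alix3c2 / Sandisk (CF)": (1298, 3.75),
    "Desktop / Mobi (SSD)":   (4289, 83),
    "MacbookPro / HD":        (66,   29),
    "Desktop / HD":           (171,  87),
}

qps_per_watt = {name: qps / watts for name, (qps, watts) in measurements.items()}

fawn = qps_per_watt["Alix3c2 / Sandisk (CF)"]
best_traditional = qps_per_watt["Desktop / Mobi (SSD)"]
print(f"FAWN: {fawn:.0f} QPS/W, best traditional: {best_traditional:.1f} QPS/W")
print(f"ratio: {fawn / best_traditional:.1f}x")
```

The desktop SSD box serves 3x the absolute queries per second, but at 22x the power, which is why the wimpy node wins on queries per Joule.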
Impact of background ops
(Figure: queries per second under Peak, Compact, Split, and Merge, measured at peak query load and at 30% of peak load)
Background operations have:
• Moderate impact at peak load
• Negligible impact at 30% load
When to use FAWN for random-access workloads?
TCO = Capital Cost + Power Cost ($0.10/kWh)
• Traditional (200W, ~$2000-8000 per node): five 2TB disks, 160GB PCI-e flash SSD, 64GB FBDIMM
• FAWN (10W per node, ~$250-500 per node): one 2TB disk, 64GB SATA flash SSD, 2GB DRAM
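The slide's TCO formula is easy to evaluate directly. A sketch using the slide's $0.10/kWh rate and node wattages over a 3-year horizon; the capital costs below take midpoints of the slide's price ranges, and the node counts in the usage check are hypothetical:

```python
# 3-year TCO = capital cost + energy cost at $0.10/kWh.
HOURS_3Y = 3 * 365 * 24
PRICE_PER_KWH = 0.10

def tco(capital: float, watts: float, nodes: int = 1) -> float:
    """Total cost of ownership for `nodes` identical machines over 3 years."""
    energy_cost = watts / 1000 * HOURS_3Y * PRICE_PER_KWH
    return nodes * (capital + energy_cost)

traditional = tco(capital=5000, watts=200)  # midpoint of ~$2000-8000
fawn_node   = tco(capital=375,  watts=10)   # midpoint of ~$250-500
print(f"traditional node, 3y TCO: ${traditional:,.0f}")
print(f"FAWN node, 3y TCO:        ${fawn_node:,.0f}")
```

Under these assumptions even ten FAWN nodes cost less over three years than one traditional node; the real trade-off, which the talk's result figure captures, is how many of each are needed for a given dataset size and query rate.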
(Figure: lowest 3-year TCO as a function of dataset size and query rate; regions show where traditional servers or the FAWN configurations win)
Conclusion
• FAWN architecture reduces energy consumption of cluster computing
• FAWN-KV addresses challenges of wimpy nodes for key-value storage
  • Log-structured, memory-efficient datastore
  • Efficient replication and failover
  • Meets energy-efficiency and performance goals
“Each decimal order of magnitude increase in parallelism requires a major redesign and rewrite of parallel code” - Kathy Yelick