FAWN: Fast Array of Wimpy Nodes
A technical paper presentation in fulfillment of the requirements of CIS 570 – Advanced Computer Systems – Fall 2013
Scott R. Sideleau (ssideleau@umassd.edu)
14-Nov-2013
Overview
– Identify the problem space
– FAWN as a solution
  – Architecture principles
  – Unique key-value storage
– Evaluate and benchmark a 21-node FAWN cluster
– Identify when FAWN makes sense
Theoretical Problem Space
– CPU–I/O gap
  – Modern processors are so fast relative to storage that much of their time is spent idle, waiting on I/O
– CPU power consumption scales super-linearly with speed
  – The ever-larger caches needed to keep superscalar pipelines fed are one driver
– Dynamic Voltage and Frequency Scaling (DVFS) is inefficient
  – e.g., Intel SpeedStep technology
  – A throttled CPU still draws roughly 50% of its peak power
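As background (standard first-order CMOS reasoning, not taken from the slides or the FAWN paper), the super-linear cost of higher clock speeds and the limits of DVFS follow from the dynamic-power relation below; leakage and other fixed platform costs are ignored here, and they are exactly what keeps a throttled CPU near half of its peak draw.

```latex
P_{\text{dyn}} \approx C\,V^{2}\,f, \qquad V \propto f \;\Longrightarrow\; P_{\text{dyn}} \propto f^{3}
```

Halving the clock can therefore cut dynamic power by roughly 8x, which is why many slow "wimpy" cores can deliver more work per Joule than one fast core.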
What's the real problem? Electricity is expensive!
– Home usage is measured in kW; data center usage is measured in MW
– Facebook spends up to $1 million a month on electricity
  – With only three data centers: Oregon, USA; Virginia, USA; Sweden
Facebook's Not Playing Around
– Its fourth data center will be powered by renewable wind energy (Iowa, USA)
– Source: http://goo.gl/sFmmxz, dated 14-Nov-2013
Proposed Solution
– Fast Array of Wimpy Nodes (FAWN)
  – Bridge the CPU–I/O gap: use slower CPUs and faster flash storage
  – Reduce power consumption per node: embedded CPUs consume significantly less power
  – Address distributed storage for the new architecture
    – New key-value storage system (FAWN-KV); see the sketch below
    – Complementary per-node data store (FAWN-DS)
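To make the proposed division of labor concrete, here is a minimal, hypothetical sketch of a FAWN-KV-style front-end routing requests to back-end nodes. The class and method names (FrontEnd, BackEnd.put/get) and the plain dict standing in for the per-node FAWN-DS store are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the FAWN-KV split between front-end and back-end
# nodes; the names and the hash-based routing are illustrative assumptions.
import hashlib


class BackEnd:
    """A wimpy node: answers get/put for the keys it owns."""

    def __init__(self, name):
        self.name = name
        self.store = {}          # stands in for the per-node FAWN-DS store

    def put(self, key, value):
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)


class FrontEnd:
    """Routes client requests to the back-end responsible for each key."""

    def __init__(self, backends):
        self.backends = backends

    def _owner(self, key):
        # Hash the key and pick a back-end; real FAWN-KV maps key ranges
        # with consistent hashing (see the ring sketch later).
        digest = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        return self.backends[digest % len(self.backends)]

    def put(self, key, value):
        self._owner(key).put(key, value)

    def get(self, key):
        return self._owner(key).get(key)


fe = FrontEnd([BackEnd("wimpy-0"), BackEnd("wimpy-1"), BackEnd("wimpy-2")])
fe.put("user:42", "alice")
print(fe.get("user:42"))         # -> alice
```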
System Architecture (figure)
Basic Functions (figure)
Replication & Consistency (figure)
Understanding Flash Storage
– Fast random reads
  – Up to 175x faster than HDDs
  – Varies wildly between makes/models
– Efficient I/O
  – Very low power
  – High queries-per-Joule rate vs. HDDs
– Slow random writes
  – Expensive erase/write cycle
  – Motivation for log-structured (i.e., sequential) data storage; see the sketch below
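That last bullet is the key design consequence: if writes are always sequential appends, flash never pays the random-write penalty. Below is a minimal log-structured store in the spirit of FAWN-DS, assuming a single append-only data file plus an in-memory index mapping each key to its latest offset; the record layout and class name are simplifications, not the paper's on-flash format.

```python
# Minimal log-structured key-value store: writes are sequential appends to a
# data file, reads do one in-memory index lookup plus one random read.
import os
import struct


class LogStore:
    HEADER = struct.Struct(">II")        # key length, value length

    def __init__(self, path):
        self.index = {}                  # key -> offset of its latest record
        self.f = open(path, "ab+")       # append-only data file

    def put(self, key, value):
        self.f.seek(0, os.SEEK_END)
        offset = self.f.tell()
        self.f.write(self.HEADER.pack(len(key), len(value)))
        self.f.write(key)
        self.f.write(value)              # sequential append only
        self.f.flush()
        self.index[key] = offset         # the old record becomes garbage

    def get(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        self.f.seek(offset)
        klen, vlen = self.HEADER.unpack(self.f.read(self.HEADER.size))
        self.f.seek(klen, os.SEEK_CUR)   # skip the stored key bytes
        return self.f.read(vlen)


store = LogStore("shard.log")
store.put(b"key1", b"hello")
print(store.get(b"key1"))                # -> b'hello'
```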
Optimized Maintenance Functions
– Split
  – Used when adding a node to the cluster
  – Read the old store sequentially and write each key to whichever of the two new data stores covers its range
– Merge
  – Used when deleting a node from the cluster
  – The stores cover mutually exclusive ranges, so simply append one data store to the other
– Compact
  – Cleans up entries in a data store
  – Skips orphaned, out-of-range, and deleted entries while writing the rest to a new data store (see the sketch below)
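A compact pass over such a log is a single sequential scan that copies only live, in-range records into a fresh store. The sketch below is illustrative only: the dict-based record format, the in_range helper, and the wrapping key-range convention are assumptions, not FAWN-DS's actual implementation.

```python
# Sketch of the Compact operation: scan the old store sequentially and keep
# only useful records, skipping deleted, orphaned, and out-of-range entries.

def in_range(key_hash, lo, hi):
    """True if key_hash falls in the (possibly wrapping) range (lo, hi]."""
    if lo < hi:
        return lo < key_hash <= hi
    return key_hash > lo or key_hash <= hi   # range wraps around the ring


def compact(old_records, lo, hi):
    live = {}                                 # key -> latest surviving record
    for rec in old_records:                   # one sequential pass
        if not in_range(rec["key_hash"], lo, hi):
            continue                          # out of range (moved elsewhere)
        if rec.get("deleted"):
            live.pop(rec["key"], None)        # drop delete markers and victims
            continue
        live[rec["key"]] = rec                # later records orphan earlier ones
    return list(live.values())                # written sequentially as the new store


old = [
    {"key": "a", "key_hash": 10, "value": 1},
    {"key": "a", "key_hash": 10, "value": 2},        # orphans the first "a"
    {"key": "b", "key_hash": 900, "value": 3},        # out of range
    {"key": "c", "key_hash": 20, "value": 4},
    {"key": "c", "key_hash": 20, "deleted": True},    # delete marker
]
print(compact(old, lo=0, hi=100))             # -> only the latest "a" survives
```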
Optimized Sequential Reads & Writes (figure)
Front-end Consistent Hashing (figure)
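The figure from this slide is not reproduced here; as a stand-in for the idea it showed, the following is a minimal consistent-hash ring of the kind a front-end maintains: keys and (virtual) node positions are hashed onto the same ring, and a key belongs to the first node clockwise from it. The node names and virtual-node count are made up for illustration.

```python
# Minimal consistent-hash ring: a key is owned by the first node clockwise
# from its hash. Adding a node only reassigns the ranges it takes over.
import bisect
import hashlib


def ring_hash(s):
    return int(hashlib.sha1(s.encode()).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, vnodes=16):
        # Each physical node contributes several virtual positions on the ring.
        self.points = sorted(
            (ring_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )
        self.keys = [p for p, _ in self.points]

    def owner(self, key):
        i = bisect.bisect(self.keys, ring_hash(key)) % len(self.points)
        return self.points[i][1]              # first node clockwise


ring = Ring(["wimpy-0", "wimpy-1", "wimpy-2"])
print(ring.owner("user:42"))
# Adding a fourth node moves only the keys in the ranges it takes over:
ring2 = Ring(["wimpy-0", "wimpy-1", "wimpy-2", "wimpy-3"])
print(ring2.owner("user:42"))
```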
Node Join (figure)
Node Leave
– Rather than splitting data stores, the remaining nodes merge them
– In reality, this means…
  – Adding a new replica to each chain the departing node belonged to
  – So the processing is essentially the same as a join event (see the sketch below)
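FAWN-KV replicates each key range along a chain of back-ends, with writes entering at the head and reads served by the tail. The bookkeeping for a leave can then look like the schematic below, which simply recruits a replacement replica for every chain the departing node was in; the data copy (merge), ring update, and ordering of the membership change are omitted, and the class and method names are invented.

```python
# Schematic chain-replication bookkeeping for a leave event. This is a
# simplified illustration, not the authors' protocol.

class Chain:
    """Replica chain for one key range: writes at the head, reads at the tail."""

    def __init__(self, nodes):
        self.nodes = list(nodes)             # head ... tail

    def head(self):
        return self.nodes[0]

    def tail(self):
        return self.nodes[-1]

    def handle_leave(self, departed, spare_pool):
        if departed not in self.nodes:
            return
        self.nodes.remove(departed)
        replacement = spare_pool.pop()       # recruit a new replica
        self.nodes.append(replacement)       # then copy (merge) data to it,
                                             # just as a join would


chain = Chain(["wimpy-0", "wimpy-1", "wimpy-2"])
chain.handle_leave("wimpy-1", spare_pool=["wimpy-3"])
print(chain.head(), chain.tail())            # -> wimpy-0 wimpy-3
print(chain.nodes)                           # -> ['wimpy-0', 'wimpy-2', 'wimpy-3']
```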
Failure Detection
– Nodes are assumed to be fail-stop
  – Front-end and back-end nodes gossip at a known rate
  – On timeout, the front-end initiates a leave operation for the failed node (sketched below)
– The current design copes only with node failures
  – Coping with network failures requires future work
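In code, a fail-stop detector like the one described reduces to remembering the last heartbeat from each back-end and declaring any node that stays silent past a timeout as failed. The sketch below is hypothetical; the TIMEOUT value, the names, and the on_failure hook (standing in for the front-end's leave processing) are assumptions.

```python
# Hypothetical fail-stop detector: the front-end records heartbeats from each
# back-end and treats any node silent longer than TIMEOUT as failed,
# triggering the same leave processing shown earlier.
import time

TIMEOUT = 3.0                                # seconds; illustrative value


class FailureDetector:
    def __init__(self, on_failure):
        self.last_seen = {}                  # node -> time of last heartbeat
        self.on_failure = on_failure         # e.g. the front-end's leave handler

    def heartbeat(self, node):
        self.last_seen[node] = time.monotonic()

    def check(self):
        now = time.monotonic()
        for node, seen in list(self.last_seen.items()):
            if now - seen > TIMEOUT:         # fail-stop assumption: declare it dead
                del self.last_seen[node]
                self.on_failure(node)


detector = FailureDetector(on_failure=lambda n: print(f"initiate leave for {n}"))
detector.heartbeat("wimpy-2")
detector.check()                             # nothing yet; call periodically
```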
Single-Node Evaluation
– Performance is almost entirely dependent on the flash media
21-Node Evaluation
– In general, the back-ends prove to be well matched
21-Node Evaluation
– Relatively responsive throughout maintenance operations
21-Node Evaluation
– Slightly slower than production key-value systems
  – Worst-case response times are on par
21-Node Evaluation
– Power draw is low and consistent across operations
21-Node Evaluation
– Power draw is low and consistent across operations
– Queries per Joule are an order of magnitude higher than in traditional production distributed systems (worked example below)
  – ~1 billion instructions per Joule
  – 1/3 the clock frequency
  – 1/10 (or less) the power
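Queries per Joule is simply the sustained query rate divided by power draw (1 W = 1 J/s). The numbers below are hypothetical round figures chosen only to illustrate the arithmetic behind an order-of-magnitude gap; they are not the paper's measurements.

```python
# Back-of-the-envelope queries-per-Joule comparison with made-up round numbers.
def queries_per_joule(queries_per_sec, watts):
    # 1 watt = 1 joule/second, so (queries/s) / (J/s) gives queries per joule
    return queries_per_sec / watts


wimpy = queries_per_joule(queries_per_sec=1_000, watts=4)       # ~250 q/J
brawny = queries_per_joule(queries_per_sec=10_000, watts=400)   # ~25 q/J
print(wimpy / brawny)                                           # ~10x per joule
```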
When does FAWN matter?
– It depends on the workload…
Questions? Thanks very much!