1
FAWN: A Fast Array of Wimpy Nodes
Presented by: Clint Sbisa & Irene Haque
2
Motivation
  Large-scale data-intensive applications
    Facebook, LinkedIn, Dynamo
  CPU-I/O gap
    Storage, network and memory bottlenecks
    Low CPU utilization
  CPU power
    Slower CPUs execute more queries per second per Watt
    1 billion vs. 100 million instructions per Joule
    Inefficient energy-saving techniques
  Memory power
3
FAWN
  Data-intensive, computationally simple workloads
    Small objects: 100 B to 1 KB
  Cluster of embedded CPUs using flash storage
    Efficient
    Fast random reads
    Slow random writes
  FAWN-KV
    Key-value storage
    Consistent hashing
  FAWN-DS
    Data store
    Log-structured
4
FAWN-DS
  Log-structured key-value store
  Contains all values in a key range for each virtual ID
  Hash Index maps 160-bit keys
    Index bucket = i low-order index bits
    Key fragment = next 15 low-order bits
    6-byte in-memory Hash Index entry stores the key fragment and a pointer
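The bit slicing on this slide can be sketched as follows. This is a minimal illustration, not the FAWN-DS code: the function names `split_key`, `pack_entry` and `unpack_entry` are hypothetical, and the entry layout (a 16-bit field holding the 15-bit fragment plus one spare bit used here as a valid flag, followed by a 32-bit log offset) is an assumption chosen so that one entry is exactly 6 bytes.

```python
import struct

KEYFRAG_BITS = 15  # from the slide: key fragment = next 15 low-order bits


def split_key(key160: int, index_bits: int) -> tuple[int, int]:
    """Split a 160-bit key into (bucket, key_fragment)."""
    bucket = key160 & ((1 << index_bits) - 1)                   # i low-order bits
    frag = (key160 >> index_bits) & ((1 << KEYFRAG_BITS) - 1)   # next 15 bits
    return bucket, frag


def pack_entry(frag: int, offset: int, valid: bool = True) -> bytes:
    """Pack one index entry into 6 bytes (assumed layout, see lead-in)."""
    head = (int(valid) << KEYFRAG_BITS) | frag   # 1 flag bit + 15 fragment bits
    return struct.pack("<HI", head, offset)      # 2 bytes + 4 bytes, no padding


def unpack_entry(entry: bytes) -> tuple[int, int, bool]:
    """Inverse of pack_entry: return (frag, offset, valid)."""
    head, offset = struct.unpack("<HI", entry)
    return head & ((1 << KEYFRAG_BITS) - 1), offset, bool(head >> KEYFRAG_BITS)
```

Because only 15 bits of the key are kept in memory, a matching fragment is a hint rather than proof; a lookup would still need to compare the full key stored alongside the value in the log.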
5
FAWN-DS
  Basic functions
    Store
    Lookup
    Delete
  Concurrent operations
  Virtual node maintenance
    Split
    Merge
    Compact
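A toy, in-memory sketch of how Store, Lookup, Delete and Compact might fit together in a log-structured store. The class name `LogStore` and the use of a Python list as the log and a dict as the index are assumptions made for illustration; the real data store works on flash and uses the compact hash index described earlier.

```python
class LogStore:
    """Append-only log plus an in-memory index of the latest record per key."""

    def __init__(self):
        self.log = []     # (key, value) records; value None marks a delete
        self.index = {}   # key -> position of the newest record in the log

    def store(self, key, value):
        # Writes never update in place: append, then repoint the index.
        self.index[key] = len(self.log)
        self.log.append((key, value))

    def lookup(self, key):
        pos = self.index.get(key)
        return None if pos is None else self.log[pos][1]

    def delete(self, key):
        # Append a delete marker and drop the key from the index.
        if key in self.index:
            self.log.append((key, None))
            del self.index[key]

    def compact(self):
        # Rewrite the log with live records only, in their original order.
        live = sorted(self.index.items(), key=lambda kv: kv[1])
        records = [self.log[pos] for _, pos in live]
        self.log, self.index = [], {}
        for key, value in records:
            self.store(key, value)
```

For example, after `store("a", 1)`, `store("a", 2)`, `store("b", 3)` and `delete("b")` the log holds four records, and `compact()` shrinks it to the single live record for `"a"`.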
6
FAWN-KV
  Consistent hashing of back-end VIDs
  Management node
    Assigns each front-end to the circular key space
  Front-end nodes
    Each manages its own key space
    Forward out-of-range requests
  Back-end nodes (VIDs)
    Contact a front-end when joining
    Each owns a key range
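The mapping from keys to back-end VIDs on a circular key space can be sketched as below. This is a generic consistent-hashing sketch under stated assumptions: ring positions come from SHA-1 (which yields 160-bit values), a key is owned by the first VID at or after its position going around the ring, and the names `Ring`, `ring_pos` and `owner` are hypothetical.

```python
import bisect
import hashlib


def ring_pos(name: str) -> int:
    """Position on the ring: the SHA-1 digest read as a 160-bit integer."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class Ring:
    def __init__(self, vids):
        # Sort VIDs by ring position so lookups can use binary search.
        self.points = sorted((ring_pos(v), v) for v in vids)
        self.positions = [p for p, _ in self.points]

    def owner(self, key: str) -> str:
        # First VID at or after the key's position, wrapping past the end.
        i = bisect.bisect_left(self.positions, ring_pos(key))
        return self.points[i % len(self.points)][1]
```

The useful property is that adding a VID only moves the keys that now fall into the new VID's range; every other key keeps its owner, which is what keeps the data transferred during a join small.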
7
FAWN-KV
  Chain replication
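The slide names chain replication without detail, so here is a minimal sketch of the general technique: writes enter at the head of a chain, are passed node by node to the tail, and reads are answered by the tail. The `Replica` class and `build_chain` helper are hypothetical, and failure handling and networking are left out entirely.

```python
class Replica:
    """One node in a replication chain, holding a full copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.next = None   # successor in the chain; None for the tail

    def put(self, key, value):
        # Apply locally, then forward down the chain. The tail's name is
        # returned as the acknowledgement once every replica has the write.
        self.data[key] = value
        if self.next is not None:
            return self.next.put(key, value)
        return self.name

    def get(self, key):
        return self.data.get(key)


def build_chain(names):
    """Link replicas in order and return (head, tail)."""
    nodes = [Replica(n) for n in names]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    return nodes[0], nodes[-1]
```

A client would call `head.put(k, v)` and `tail.get(k)`. Since the tail only sees a write after every earlier replica has applied it, a read from the tail never returns data that is missing elsewhere in the chain.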
8
FAWN-KV
  Join
    Split key range
    Pre-copy
    Chain insertion
    Log flush
  Leave
    Merge key range
    Join into each chain
9
Individual Node Performance
  Lookup speed
  Bulk store speed: 23.2 MB/s, or 96% of raw speed
10
Individual Node Performance
  Put speed
  Compared to BerkeleyDB: 0.07 MB/s, which shows the necessity of log-based filesystems
11
Individual Node Performance
  Read- and write-intensive workloads
12
System Benchmarks
  System throughput and power consumption
13
Impact of Ring Membership Changes
  Query throughput during node join and maintenance operations
14
Impact of Ring Membership Changes
  Query latency
15
Alternative Architectures
  Large dataset, low query rate → FAWN+Disk
  Small dataset, high query rate → FAWN+DRAM
  Middle range → FAWN+SSD
16
Conclusion
  Fast and energy-efficient processing of random read-intensive workloads
  Over an order of magnitude more queries per Joule than traditional disk-based systems