1
MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing
Bin Fan, David G. Andersen, Michael Kaminsky
MemC3: internal improvements to memcached servers: concurrency, memory efficiency, and better performance
Presenter: Son Nguyen
2
Memcached internals: LRU caching built from a chaining hash table and a doubly linked list
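This baseline design can be sketched as follows (a simplified Python model, not memcached's actual C code; class and method names are illustrative). Note that even get() rewrites list pointers, which is why every access must take a lock:

```python
class Node:
    __slots__ = ("key", "value", "prev", "next", "chain")

    def __init__(self, key, value):
        self.key, self.value = key, value
        self.prev = self.next = None   # LRU-list pointers (two per item)
        self.chain = None              # next node in this bucket's chain


class BaselineCache:
    """Chaining hash table for lookup + doubly linked list for LRU order."""

    def __init__(self, nbuckets=16, capacity=4):
        self.buckets = [None] * nbuckets
        self.capacity, self.size = capacity, 0
        self.head = self.tail = None   # head = most recently used

    def _find(self, key):
        node = self.buckets[hash(key) % len(self.buckets)]
        while node is not None and node.key != key:
            node = node.chain          # walk the chain: not constant time
        return node

    def _move_to_head(self, node):
        if node is self.head:
            return
        if node.prev: node.prev.next = node.next
        if node.next: node.next.prev = node.prev
        if node is self.tail: self.tail = node.prev
        node.prev, node.next = None, self.head
        if self.head: self.head.prev = node
        self.head = node
        if self.tail is None: self.tail = node

    def get(self, key):
        node = self._find(key)
        if node is None:
            return None
        self._move_to_head(node)       # a read mutates the list structure
        return node.value

    def set(self, key, value):
        node = self._find(key)
        if node is not None:
            node.value = value
            self._move_to_head(node)
            return
        if self.size == self.capacity:
            self._evict_lru()
        node = Node(key, value)
        i = hash(key) % len(self.buckets)
        node.chain, self.buckets[i] = self.buckets[i], node
        self.size += 1
        self._move_to_head(node)

    def _evict_lru(self):
        victim = self.tail
        self.tail = victim.prev        # unlink from the LRU list
        if self.tail: self.tail.next = None
        else: self.head = None
        i = hash(victim.key) % len(self.buckets)
        node, prev = self.buckets[i], None
        while node is not victim:      # unlink from the bucket chain
            prev, node = node, node.chain
        if prev: prev.chain = victim.chain
        else: self.buckets[i] = victim.chain
        self.size -= 1
```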
3
Goals
Reduce space overhead (bytes/key)
Improve throughput (queries/sec)
Target: read-intensive workloads with small objects
Result: 3x throughput, 30% more objects stored
4
Doubly linked list's problems
At least two pointers per item -> expensive in space
Both reads and writes change the list's structure -> threads must lock the list (no concurrency)
5
Solution: CLOCK-based LRU
Approximates LRU
Multiple readers / single writer
Circular queue instead of a linked list -> less space overhead
6
CLOCK example
Initially: entries (ka, va), (kb, vb), (kc, vc), (kd, vd), (ke, ve), each with a recency bit
Read(kd): only sets kd's recency bit to 1; entries unchanged
Write(kf, vf): the clock hand sweeps, clearing recency bits, and evicts an entry with recency 0; (kf, vf) replaces (kc, vc)
Write(kg, vg): likewise, (kg, vg) replaces (ka, va)
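The CLOCK scheme above can be sketched as follows (a toy model with illustrative names; in MemC3 the recency bits live alongside the hash table entries). A read only sets a bit, so readers never restructure anything:

```python
class ClockCache:
    """Approximate LRU: a circular array of entries with recency bits."""

    def __init__(self, capacity=5):
        self.slots = [None] * capacity   # each slot: [key, value, recency_bit]
        self.index = {}                  # key -> slot number (stand-in for the hash table)
        self.hand = 0                    # the clock hand

    def get(self, key):
        i = self.index.get(key)
        if i is None:
            return None
        self.slots[i][2] = 1             # Read(k): just set the recency bit
        return self.slots[i][1]

    def set(self, key, value):
        # Advance the hand, clearing recency bits ("second chance"),
        # until an empty slot or one with recency 0 is found.
        while True:
            slot = self.slots[self.hand]
            if slot is None or slot[2] == 0:
                break
            slot[2] = 0
            self.hand = (self.hand + 1) % len(self.slots)
        if self.slots[self.hand] is not None:
            del self.index[self.slots[self.hand][0]]   # evict the victim
        self.slots[self.hand] = [key, value, 1]
        self.index[key] = self.hand
        self.hand = (self.hand + 1) % len(self.slots)
```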
7
Chaining hash table's problems
Linked lists -> costly space overhead for pointers
Pointer dereferences are slow (no benefit from the CPU cache)
Reads are not constant time (chains can grow long)
8
Solution: Cuckoo hashing
Uses 2 hash tables
Each bucket has exactly 4 slots (a bucket fits in a CPU cache line)
Each (key, value) object can therefore reside in one of 8 possible slots
9
Cuckoo Hashing
[Figure: key ka hashes to two candidate buckets, HASH1(ka) and HASH2(ka); (ka, va) may live in either]
10
Cuckoo Hashing
Read: always at most 8 slot lookups (constant time, fast)
Write(ka, va):
Find an empty slot among ka's 8 candidate slots
If all are full, randomly kick some (kb, vb) out and take its slot
Now find an empty slot for (kb, vb)
Repeat up to 500 times or until an empty slot is found
If still unplaced, expand the table
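The read/write scheme above can be sketched like this (a simplified toy, not MemC3's C implementation; names and the hashing trick are illustrative, and it omits MemC3's tags and locking):

```python
import random

SLOTS_PER_BUCKET = 4
MAX_KICKS = 500


class CuckooTable:
    """Two tables of 4-slot buckets; every key has 8 candidate slots."""

    def __init__(self, nbuckets=16):
        self.nbuckets = nbuckets
        self.tables = [[[None] * SLOTS_PER_BUCKET for _ in range(nbuckets)]
                       for _ in range(2)]

    def _buckets(self, key):
        # Two independent hashes (salted Python hash as a stand-in).
        return [(0, hash(("t1", key)) % self.nbuckets),
                (1, hash(("t2", key)) % self.nbuckets)]

    def get(self, key):
        # At most 8 slot probes: constant-time reads.
        for t, b in self._buckets(key):
            for slot in self.tables[t][b]:
                if slot is not None and slot[0] == key:
                    return slot[1]
        return None

    def set(self, key, value):
        entry = (key, value)
        for _ in range(MAX_KICKS):
            # Try all 8 candidate slots of the current entry.
            for t, b in self._buckets(entry[0]):
                bucket = self.tables[t][b]
                for i in range(SLOTS_PER_BUCKET):
                    if bucket[i] is None:
                        bucket[i] = entry
                        return True
            # All full: kick a random victim out and re-home it next round.
            t, b = random.choice(self._buckets(entry[0]))
            i = random.randrange(SLOTS_PER_BUCKET)
            entry, self.tables[t][b][i] = self.tables[t][b][i], entry
        return False   # would trigger table expansion in MemC3
```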
11
Cuckoo Hashing
[Figure: Insert a: (ka, va) is placed in an empty slot of one of its two candidate buckets]
12
Cuckoo Hashing
[Figure: Insert b: (kb, vb) likewise finds an empty slot in one of its candidate buckets]
13
Cuckoo Hashing
[Figure: Insert c: kc's candidate slots are occupied, so an existing item is moved to its alternate bucket and (kc, vc) takes the freed slot]
Done!
14
Cuckoo Hashing
Problem: after (kb, vb) is kicked out but before it lands in its alternate slot, a reader may look up kb and get a false cache miss
Solution: compute the kick-out sequence (the cuckoo path) first, then move items backward, starting from the empty slot, so each key is always readable in some slot
Before: (b, c, Null) -> (a, c, Null) -> (a, b, Null) -> (a, b, c)
Fixed: (b, c, Null) -> (b, c, c) -> (b, b, c) -> (a, b, c)
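The fix can be sketched on a toy table (a flat array standing in for the two bucket arrays; function names are mine, not the paper's): first search out a path of displacements ending at an empty slot, then perform the moves in reverse order, so each displaced key is copied to its destination before its old slot is overwritten:

```python
import hashlib

NSLOTS = 16   # toy layout: slots 0-7 act as table 1, slots 8-15 as table 2


def candidates(key):
    """Each key's two allowed slots, one in each half (deterministic toy hash)."""
    h = hashlib.md5(key.encode()).digest()
    return [h[0] % 8, 8 + h[1] % 8]


def find_path(table, key, visited=frozenset()):
    """DFS for a cuckoo path: a chain of occupied slots ending in an empty one."""
    for s in candidates(key):
        if table[s] is None:
            return [s]
    for s in candidates(key):
        if s not in visited:
            rest = find_path(table, table[s][0], visited | {s})
            if rest is not None:
                return [s] + rest
    return None


def insert(table, key, value):
    path = find_path(table, key)
    if path is None:
        return False   # would trigger table expansion
    # Move items BACKWARD along the path: fill the empty slot first, so a
    # displaced key is always present in at least one of its candidate
    # slots and a concurrent reader never sees a false miss.
    for dst, src in zip(reversed(path), reversed(path[:-1])):
        table[dst] = table[src]
    table[path[0]] = (key, value)
    return True
```

Each step copies an entry into a slot that is already empty (or just vacated), which is exactly the "Fixed" sequence on the slide, at the cost of walking the path twice.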
15
Cuckoo path
[Figure: Insert a: both candidate buckets of ka are full; a cuckoo path of displacements ending in an empty slot is found before anything is moved]
Disadvantage: traverses the hash tables twice (once to find the path, once to move the items)
16
Cuckoo path backward insert
[Figure: Insert a: items on the cuckoo path are moved backward, starting from the empty slot, then (ka, va) is written]
Disadvantage: traverses the hash tables twice
17
Cuckoo's advantages
Concurrency: multiple readers / single writer
Read-optimized (buckets fit in the CPU cache)
Still amortized O(1) writes
30% less space overhead
95% table occupancy
18
Evaluation: 68% throughput improvement in the all-hit case; 235% in the all-miss case
19
Evaluation: 3x throughput on a "real" workload
20
Discussion
Writes are slower than with a chaining hash table:
Chaining hash table: … million keys/sec
Cuckoo: 7 million keys/sec
Idea: find the cuckoo path in parallel
Benchmarks do not show much improvement
Can we make writes concurrent?