1
MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing
Bin Fan, David G. Andersen, Michael Kaminsky
MemC3: internal improvements to memcached servers: concurrency, memory efficiency, and better performance
Presenter: Son Nguyen
2
Memcached internals: LRU caching built from a chaining hash table and a doubly linked list
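This baseline design can be sketched as follows (a simplified Python model, not memcached's actual C code; class and method names are illustrative). Note that even get() rewrites list pointers, which is why every access must take a lock:

```python
class Node:
    __slots__ = ("key", "value", "prev", "next", "chain")

    def __init__(self, key, value):
        self.key, self.value = key, value
        self.prev = self.next = None   # LRU-list pointers (two per item)
        self.chain = None              # next node in this bucket's chain


class BaselineCache:
    """Chaining hash table for lookup + doubly linked list for LRU order."""

    def __init__(self, nbuckets=16, capacity=4):
        self.buckets = [None] * nbuckets
        self.capacity, self.size = capacity, 0
        self.head = self.tail = None   # head = most recently used

    def _find(self, key):
        node = self.buckets[hash(key) % len(self.buckets)]
        while node is not None and node.key != key:
            node = node.chain          # walk the chain: not constant time
        return node

    def _move_to_head(self, node):
        if node is self.head:
            return
        if node.prev: node.prev.next = node.next
        if node.next: node.next.prev = node.prev
        if node is self.tail: self.tail = node.prev
        node.prev, node.next = None, self.head
        if self.head: self.head.prev = node
        self.head = node
        if self.tail is None: self.tail = node

    def get(self, key):
        node = self._find(key)
        if node is None:
            return None
        self._move_to_head(node)       # a read mutates the list structure
        return node.value

    def set(self, key, value):
        node = self._find(key)
        if node is not None:
            node.value = value
            self._move_to_head(node)
            return
        if self.size == self.capacity:
            self._evict_lru()
        node = Node(key, value)
        i = hash(key) % len(self.buckets)
        node.chain, self.buckets[i] = self.buckets[i], node
        self.size += 1
        self._move_to_head(node)

    def _evict_lru(self):
        victim = self.tail
        self.tail = victim.prev        # unlink from the LRU list
        if self.tail: self.tail.next = None
        else: self.head = None
        i = hash(victim.key) % len(self.buckets)
        node, prev = self.buckets[i], None
        while node is not victim:      # unlink from the bucket chain
            prev, node = node, node.chain
        if prev: prev.chain = victim.chain
        else: self.buckets[i] = victim.chain
        self.size -= 1
```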
3
Goals
Reduce space overhead (bytes/key)
Improve throughput (queries/sec)
Target: read-intensive workloads with small objects
Result: 3x throughput, 30% more objects stored
4
Doubly linked list's problems
At least two pointers per item -> expensive in space
Both reads and writes change the list's structure -> threads must lock the list (no concurrency)
5
Solution: CLOCK-based LRU
Approximates LRU
Multiple readers / single writer
Circular queue instead of a linked list -> less space overhead
6
CLOCK example
Initially: entries (ka, va), (kb, vb), (kc, vc), (kd, vd), (ke, ve), each with a recency bit
Read(kd): only sets kd's recency bit to 1; entries unchanged
Write(kf, vf): the clock hand sweeps, clearing recency bits, and evicts an entry with recency 0; (kf, vf) replaces (kc, vc)
Write(kg, vg): likewise, (kg, vg) replaces (ka, va)
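The CLOCK scheme above can be sketched as follows (a toy model with illustrative names; in MemC3 the recency bits live alongside the hash table entries). A read only sets a bit, so readers never restructure anything:

```python
class ClockCache:
    """Approximate LRU: a circular array of entries with recency bits."""

    def __init__(self, capacity=5):
        self.slots = [None] * capacity   # each slot: [key, value, recency_bit]
        self.index = {}                  # key -> slot number (stand-in for the hash table)
        self.hand = 0                    # the clock hand

    def get(self, key):
        i = self.index.get(key)
        if i is None:
            return None
        self.slots[i][2] = 1             # Read(k): just set the recency bit
        return self.slots[i][1]

    def set(self, key, value):
        # Advance the hand, clearing recency bits ("second chance"),
        # until an empty slot or one with recency 0 is found.
        while True:
            slot = self.slots[self.hand]
            if slot is None or slot[2] == 0:
                break
            slot[2] = 0
            self.hand = (self.hand + 1) % len(self.slots)
        if self.slots[self.hand] is not None:
            del self.index[self.slots[self.hand][0]]   # evict the victim
        self.slots[self.hand] = [key, value, 1]
        self.index[key] = self.hand
        self.hand = (self.hand + 1) % len(self.slots)
```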
7
Chaining hash table's problems
Linked lists -> costly space overhead for pointers
Pointer dereferences are slow (no benefit from the CPU cache)
Reads are not constant time (chains can grow long)
8
Solution: Cuckoo hashing
Uses 2 hash tables
Each bucket has exactly 4 slots (a bucket fits in a CPU cache line)
Each (key, value) object can therefore reside in one of 8 possible slots
9
Cuckoo Hashing
[Figure: key ka hashes to two candidate buckets, HASH1(ka) and HASH2(ka); (ka, va) may live in either]
10
Cuckoo Hashing
Read: always at most 8 slot lookups (constant time, fast)
Write(ka, va):
Find an empty slot among ka's 8 candidate slots
If all are full, randomly kick some (kb, vb) out and take its slot
Now find an empty slot for (kb, vb)
Repeat up to 500 times or until an empty slot is found
If still unplaced, expand the table
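The read/write scheme above can be sketched like this (a simplified toy, not MemC3's C implementation; names and the hashing trick are illustrative, and it omits MemC3's tags and locking):

```python
import random

SLOTS_PER_BUCKET = 4
MAX_KICKS = 500


class CuckooTable:
    """Two tables of 4-slot buckets; every key has 8 candidate slots."""

    def __init__(self, nbuckets=16):
        self.nbuckets = nbuckets
        self.tables = [[[None] * SLOTS_PER_BUCKET for _ in range(nbuckets)]
                       for _ in range(2)]

    def _buckets(self, key):
        # Two independent hashes (salted Python hash as a stand-in).
        return [(0, hash(("t1", key)) % self.nbuckets),
                (1, hash(("t2", key)) % self.nbuckets)]

    def get(self, key):
        # At most 8 slot probes: constant-time reads.
        for t, b in self._buckets(key):
            for slot in self.tables[t][b]:
                if slot is not None and slot[0] == key:
                    return slot[1]
        return None

    def set(self, key, value):
        entry = (key, value)
        for _ in range(MAX_KICKS):
            # Try all 8 candidate slots of the current entry.
            for t, b in self._buckets(entry[0]):
                bucket = self.tables[t][b]
                for i in range(SLOTS_PER_BUCKET):
                    if bucket[i] is None:
                        bucket[i] = entry
                        return True
            # All full: kick a random victim out and re-home it next round.
            t, b = random.choice(self._buckets(entry[0]))
            i = random.randrange(SLOTS_PER_BUCKET)
            entry, self.tables[t][b][i] = self.tables[t][b][i], entry
        return False   # would trigger table expansion in MemC3
```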
11
Cuckoo Hashing
[Figure: Insert a: (ka, va) is placed in an empty slot of one of its two candidate buckets]
12
Cuckoo Hashing
[Figure: Insert b: (kb, vb) likewise finds an empty slot in one of its candidate buckets]
13
Cuckoo Hashing
[Figure: Insert c: kc's candidate slots are occupied, so an existing item is moved to its alternate bucket and (kc, vc) takes the freed slot]
Done!
14
Cuckoo Hashing
Problem: after (kb, vb) is kicked out but before it lands in its alternate slot, a reader may look up kb and get a false cache miss
Solution: compute the kick-out sequence (the cuckoo path) first, then move items backward, starting from the empty slot, so each key is always readable in some slot
Before: (b, c, Null) -> (a, c, Null) -> (a, b, Null) -> (a, b, c)
Fixed: (b, c, Null) -> (b, c, c) -> (b, b, c) -> (a, b, c)
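The fix can be sketched on a toy table (a flat array standing in for the two bucket arrays; function names are mine, not the paper's): first search out a path of displacements ending at an empty slot, then perform the moves in reverse order, so each displaced key is copied to its destination before its old slot is overwritten:

```python
import hashlib

NSLOTS = 16   # toy layout: slots 0-7 act as table 1, slots 8-15 as table 2


def candidates(key):
    """Each key's two allowed slots, one in each half (deterministic toy hash)."""
    h = hashlib.md5(key.encode()).digest()
    return [h[0] % 8, 8 + h[1] % 8]


def find_path(table, key, visited=frozenset()):
    """DFS for a cuckoo path: a chain of occupied slots ending in an empty one."""
    for s in candidates(key):
        if table[s] is None:
            return [s]
    for s in candidates(key):
        if s not in visited:
            rest = find_path(table, table[s][0], visited | {s})
            if rest is not None:
                return [s] + rest
    return None


def insert(table, key, value):
    path = find_path(table, key)
    if path is None:
        return False   # would trigger table expansion
    # Move items BACKWARD along the path: fill the empty slot first, so a
    # displaced key is always present in at least one of its candidate
    # slots and a concurrent reader never sees a false miss.
    for dst, src in zip(reversed(path), reversed(path[:-1])):
        table[dst] = table[src]
    table[path[0]] = (key, value)
    return True
```

Each step copies an entry into a slot that is already empty (or just vacated), which is exactly the "Fixed" sequence on the slide, at the cost of walking the path twice.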
15
Cuckoo path
[Figure: Insert a: both candidate buckets of ka are full; a cuckoo path of displacements ending in an empty slot is found before anything is moved]
Disadvantage: traverses the hash tables twice (once to find the path, once to move the items)
16
Cuckoo path backward insert
[Figure: Insert a: items on the cuckoo path are moved backward, starting from the empty slot, then (ka, va) is written]
Disadvantage: traverses the hash tables twice
17
Cuckoo's advantages
Concurrency: multiple readers / single writer
Read-optimized (buckets fit in the CPU cache)
Still amortized O(1) writes
30% less space overhead
95% table occupancy
18
Evaluation: 68% throughput improvement in the all-hit case; 235% in the all-miss case
19
Evaluation: 3x throughput on a "real" workload
20
Discussion
Writes are slower than with a chaining hash table:
Chaining hash table: … million keys/sec
Cuckoo: 7 million keys/sec
Idea: find the cuckoo path in parallel
Benchmarks do not show much improvement
Can we make writes concurrent?