Scalable Lock-Free Stack Algorithm
Wael Yehia
York University
February 8, 2010
My Paper
Danny Hendler, Nir Shavit, and Lena Yerushalmi. A scalable lock-free stack algorithm. In SPAA 2004: Proceedings of the Sixteenth Annual ACM Symposium on Parallel Algorithms, June 27-30, 2004, Barcelona, Spain, pages 206–215. Revised and published in Journal of Parallel and Distributed Computing, Volume 70, Issue 1, January 2010, pages 1–12.
Stacks
A stack is a Last In, First Out (LIFO) abstract data structure [Wikipedia].
Its elements can be of any abstract data type.
It provides two operations:
– Pop(): removes and returns the top element
– Push(v): adds v to the top of the stack
A simple sequential implementation
Stack { Cell *top; }
Cell { Cell *next; int value; }
Example of a stack after 3 pushes: push(1), push(5), push(2)
[Figure: top points to a linked list of cells holding 2, 5, 1]
Push(3)
[Figure: the stack before and after Push(3)]
Pop()
[Figure: the stack before and after Pop(), and the returned cell]
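The sequential Push and Pop above can be sketched in Java as follows. This is only an illustration of the Stack/Cell layout from the slides; the class name SeqStack, the isEmpty helper, and the choice to throw on an empty stack are assumptions, not part of the original.

```java
// Sketch only: a single-threaded linked stack mirroring the Stack/Cell layout.
final class SeqStack {
    // One cell of the linked list; `next` points toward the bottom of the stack.
    static final class Cell {
        final int value;
        final Cell next;

        Cell(int value, Cell next) {
            this.value = value;
            this.next = next;
        }
    }

    private Cell top; // null means the stack is empty

    // Push(v): link a new cell in front of the current top.
    void push(int v) {
        top = new Cell(v, top);
    }

    // Pop(): unlink the top cell and return its value (assumed: throw if empty).
    int pop() {
        if (top == null) {
            throw new IllegalStateException("stack is empty");
        }
        int v = top.value;
        top = top.next;
        return v;
    }

    boolean isEmpty() {
        return top == null;
    }
}
```

With the pushes from the example (1, 5, 2) followed by Push(3), successive pops return 3, 2, 5, 1.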
Intuitive lock-free shared stack
Similar to the sequential version, but the top pointer is modified only through a CAS (compare-and-swap) operation.

bool push(E x) {
    head = stack.top;
    x.next = head;
    return CAS(&stack.top, head, x);   // false: another thread changed top first
}

E pop() {
    head = stack.top;
    if (head == NULL) return EMPTY;
    next = head.next;
    if (CAS(&stack.top, head, next)) return head;
    else return FAIL;                  // top changed since it was read
}

bool CAS(L, Old, New) {
    atomically {
        if (*L == Old) { *L = New; return true; }
        else return false;
    }
}
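A rough Java counterpart of this CAS-based stack is sketched below, using java.util.concurrent.atomic.AtomicReference for the top pointer. The names CasStack and tryPush are hypothetical. As an assumption for illustration, push and pop wrap the single CAS attempt in a retry loop, and pop returns null instead of a separate EMPTY value.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch only: a stack whose top pointer changes only through CAS.
final class CasStack<E> {
    private static final class Node<E> {
        final E value;
        Node<E> next;

        Node(E value) {
            this.value = value;
        }
    }

    private final AtomicReference<Node<E>> top = new AtomicReference<>();

    // One attempt: false means another thread changed top first.
    boolean tryPush(E x) {
        Node<E> node = new Node<>(x);
        Node<E> head = top.get();
        node.next = head;
        return top.compareAndSet(head, node);
    }

    // Assumed retry policy: keep trying until the CAS succeeds.
    void push(E x) {
        while (!tryPush(x)) {
            // lost the race; read top again and retry
        }
    }

    // Returns the top value, or null if the stack was observed empty.
    E pop() {
        while (true) {
            Node<E> head = top.get();
            if (head == null) {
                return null;
            }
            if (top.compareAndSet(head, head.next)) {
                return head.value;
            }
        }
    }
}
```

Under heavy load every thread spins in these loops on the same AtomicReference, which is the contention problem the next slide describes.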
Problems with this approach
High memory contention on the top pointer under high load.
Many CAS attempts are likely to fail, i.e. CAS(L, old, new) == false.
Solution: use elimination as a backoff mechanism.
Backoff mechanism
An old technique used in various places, such as packet-switching networks and Ethernet.
Idea: spread the accesses to a busy location out in time.
In our case: instead of repeatedly trying to modify the top pointer, spread the threads' accesses out in time.
Various ways to spread out: randomly, evenly, or based on traffic history.
Our approach is to wait for a predefined time.
[Figure: timeline of 4 threads t1, t2, t3, and t4 that collided and are spread out in time randomly]
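Waiting for a predefined time between attempts could look roughly like the Java sketch below. FixedBackoff and retryWithBackoff are hypothetical names, and LockSupport.parkNanos is just one possible way to pause; the slides do not say how the wait is implemented.

```java
import java.util.concurrent.locks.LockSupport;
import java.util.function.BooleanSupplier;

// Sketch only: retry an operation, pausing a fixed time after each failure.
final class FixedBackoff {
    // Runs `attempt` until it returns true and reports how many attempts it took.
    static int retryWithBackoff(BooleanSupplier attempt, long waitNanos) {
        int attempts = 1;
        while (!attempt.getAsBoolean()) {
            // Back off: stay away from the contended location for a while.
            // parkNanos may return early, which only shortens the pause.
            LockSupport.parkNanos(waitNanos);
            attempts++;
        }
        return attempts;
    }
}
```

A randomized or history-based scheme would change only how waitNanos is chosen before each pause.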
Elimination technique
Opposite operations such as push and pop cancel each other's effect on the stack.
Ex: push(1) followed by pop() leaves the stack in the same state.
A matched pair of push() and pop() can simply exchange data and terminate without ever touching the stack.
Data exchange means that the popping thread reads the pushed element from the pushing thread.
The problem is finding these pairs.
[Figure: the stack initially, after push(1), and after pop()]
The new algorithm
Combines backoff schemes and elimination:
Each thread first tries to execute its operation directly on the stack.
If it fails, it backs off and tries to find a partner thread with an opposite operation.
If elimination fails, it waits for some time and retries on the stack, and so on.

while (true) {
    if (performOpOnStack()) return;
    if (tryToCollide()) return;
    wait(sometime);
}
Collision array
Each thread needs to find a partner to eliminate with; for this it uses a collision array.
A simple array of thread ids, of predefined length.
A thread picks a random location and writes its id there.
Two threads collide if they choose the same location.
Collided threads check whether their operations match (one push, one pop) and, if so, eliminate by exchanging data.
Otherwise the elimination fails and they proceed to execute their operations on the stack.
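The overall idea (try the stack, then try to collide at a random slot, then retry) can be sketched in Java as below. This is not the id-based collision array described above: as a swapped-in simplification, each slot is a java.util.concurrent.Exchanger, so the data exchange is done by the library. A push offers its value and a pop offers null. The class name, slot count, and timeout are all assumptions, and the bounded wait for a partner doubles as the backoff delay.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Exchanger;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

// Sketch only: Exchanger slots stand in for the id-based collision array.
final class EliminationStackSketch<E> {
    private static final class Node<E> {
        final E value;
        Node<E> next;

        Node(E value) {
            this.value = value;
        }
    }

    private final AtomicReference<Node<E>> top = new AtomicReference<>();
    private final List<Exchanger<E>> slots = new ArrayList<>();
    private final long waitMicros; // fixed time to wait for a partner

    EliminationStackSketch(int slotCount, long waitMicros) {
        for (int i = 0; i < slotCount; i++) {
            slots.add(new Exchanger<>());
        }
        this.waitMicros = waitMicros;
    }

    // Pushes a non-null value (null is reserved to mean "I am a pop").
    void push(E x) {
        Node<E> node = new Node<>(x);
        while (true) {
            Node<E> head = top.get();
            node.next = head;
            if (top.compareAndSet(head, node)) return; // direct attempt succeeded
            try {
                if (collide(x) == null) return; // partner was a pop and took x
                // partner was another push: no elimination, retry on the stack
            } catch (TimeoutException e) {
                // nobody showed up in time: retry on the stack
            }
        }
    }

    // Returns the popped value, or null if the stack was observed empty.
    E pop() {
        while (true) {
            Node<E> head = top.get();
            if (head == null) return null;
            if (top.compareAndSet(head, head.next)) return head.value;
            try {
                E other = collide(null);
                if (other != null) return other; // partner was a push
                // partner was another pop: retry on the stack
            } catch (TimeoutException e) {
                // nobody showed up in time: retry on the stack
            }
        }
    }

    // Picks a random slot and waits a bounded time for a partner's item.
    private E collide(E mine) throws TimeoutException {
        int i = ThreadLocalRandom.current().nextInt(slots.size());
        try {
            return slots.get(i).exchange(mine, waitMicros, TimeUnit.MICROSECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the flag, treat as no partner
            throw new TimeoutException();
        }
    }
}
```

A larger slot count lowers the chance that two threads meet, while a smaller one makes push/push and pop/pop mismatches more likely.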
Example
A system of 4 threads.
Initial state of the stack and collision array:
[Figure: stack with top pointer; collision array of length 2; label EMPTY]
3 threads try to execute
t1: push(3), t2: pop(), t3: push(1)
All 3 threads fail to modify the top pointer, so they back off and try to collide:
[Figure: t1, t2, and t3 backing off from the stack to the collision array]
Elimination in progress
t2 and t3 are suitable for elimination.
t2 reads the value “1” from t3 and both return.
t1 finds no partner, so it waits and then goes back to the stack.
[Figure: t2 (pop) and t3 (push(1)) return; t1 waits]
Possible improvement and the next step
Backing off for a constant time is not always the best solution.
Dynamically resizing the collision array.
Next step:
– Implement the algorithm
– Compare it to Java's synchronized Stack