The Ordering Requirements of Relativistic and Reader-Writer Locking Approaches to Shared Data Access
Philip W. Howard, with Jonathan Walpole, Josh Triplett, Paul E. McKenney
11/2/2011
Outline
- The Problem
- The RWL Solution
- The RP Solution
- Other problems and their solutions
- Multiple Writers
- Performance

The Problem
[diagram: linked list A → B → C; removing B leaves A → C]

The Problem
[timeline: Reader 1 and Reader 2 each obtain ref ... drop ref; the Writer unlinks the node and then reclaims it, possibly while a reader still holds a reference]

The Problem

    /* Writer */
    void init(stuff *node)
    {
        node->a = COMPUTE_A;
        node->b = COMPUTE_B;
        node->initd = TRUE;
    }

    /* Reader */
    while (!node->initd) {}
    if (node->a) drop_nuke();
    if (node->b) drop_napalm();

The Problem

    void init(stuff *node)
    {
        int value;

        value = some_computation;
        node->a = COMPUTE_A;
        node->b = COMPUTE_B;
        node->initd = value;
    }

The Problem

    /* Writer */
    void init(stuff *node)
    {
        node->a = COMPUTE_A;
        node->b = COMPUTE_B;
        node->initd = TRUE;
    }

    /* Reader */
    while (!node->initd) {}
    if (node->a) drop_nuke();
    if (node->b) drop_napalm();

The Problem
- Compiler reordering
- CPU reordering
- Memory system reordering
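The reordering hazard in the init()/reader example above can be closed with explicit ordering. Here is a minimal sketch using C11 atomics (an assumption on my part; in the kernel setting of these slides, smp_wmb()/smp_rmb() would play the same role): the writer stores initd with release semantics and the reader loads it with acquire semantics, so neither the compiler nor the CPU may move the stores to a and b past the flag.

```c
#include <stdatomic.h>

struct stuff {
    int a;
    int b;
    atomic_int initd;
};

/* Writer: the release store orders the writes to a and b
 * before the store to initd, for both compiler and CPU. */
void init(struct stuff *node)
{
    node->a = 1;    /* stands in for COMPUTE_A */
    node->b = 1;    /* stands in for COMPUTE_B */
    atomic_store_explicit(&node->initd, 1, memory_order_release);
}

/* Reader: once the acquire load sees initd set, the
 * initialized values of a and b are guaranteed visible. */
int wait_and_check(struct stuff *node)
{
    while (!atomic_load_explicit(&node->initd, memory_order_acquire))
        ;
    return node->a && node->b;
}
```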

How are these dependencies maintained?
[timeline repeated: Reader 1 and Reader 2 each obtain ref ... drop ref; Writer: unlink ... reclaim]

How does → work?

    Thread 1:
        A: a = 1;
    Thread 2:
        B: if (a) ...

If B observes the store at A, then A → B; otherwise B → A.

How does → work?

    Thread 1:
        A: a = 1;
        mb();
        C: c = 1;
    Thread 2:
        while (!a) {}
        B: ...        /* A → B */
    Thread 3:
        if (c)
            D: ...    /* A → D */

With Reader-Writer Locks
[timeline: reader: read-lock, obtain ref ... drop ref, read-unlock; writer: write-lock, unlink, reclaim, write-unlock]

With Reader-Writer Locks
[timeline repeated with a different interleaving of the reader and writer critical sections]

With Reader-Writer Locks
[timeline repeated with another interleaving of the reader and writer critical sections]

Locking primitives must:
- Impose the semantics of the lock
- Prevent the compiler, CPU, or memory system from reordering operations across them

With Relativistic Programming
[timeline: reader: start-read, obtain ref ... drop ref, end-read; writer: unlink, then start-wait ... end-wait, then reclaim]

With Relativistic Programming
[timeline repeated with a different interleaving of the reader and writer]

RP primitives must:
- Impose the semantics of wait-for-readers
- Prevent the compiler, CPU, or memory system from reordering operations across them

Outline
- The Problem
- The RWL Solution
- The RP Solution
- Other problems and their solutions
  - Insert
  - Move Down
  - Move Up
  - General Case
- Multiple Writers
- Performance

Insert
[diagram: linked list C → E; inserting D gives C → D → E]

Insert
[timeline: Reader 1 and Reader 2 each obtain ref ... deref; Writer: init the new node, then link it]

How does → work?

    Thread 1:
        A: a = 1;
        mb();
        C: c = 1;
    Thread 2:
        while (!a) {}
        B: ...        /* A → B */
    Thread 3:
        if (c)
            D: ...    /* A → D */

Insert
Use rp-publish to perform the link operation
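A sketch of what rp-publish might look like, assuming C11 atomics (the Linux-kernel analogue would be rcu_assign_pointer()): a release store guarantees the new node is fully initialized before the link that makes it reachable to readers.

```c
#include <stdatomic.h>
#include <stddef.h>

struct lnode {
    int key;
    struct lnode * _Atomic next;
};

/* rp-publish: release store, so all prior initialization of the
 * new node is ordered before the pointer becomes reachable. */
#define rp_publish(ptr, val) \
    atomic_store_explicit((ptr), (val), memory_order_release)

/* Insert node after pred: initialize fully, then publish the link. */
void list_insert_after(struct lnode *pred, struct lnode *node)
{
    node->next = atomic_load_explicit(&pred->next, memory_order_relaxed);
    rp_publish(&pred->next, node);   /* readers now see C -> D -> E */
}
```

Concurrent readers traversing the list see either the old link (C → E) or the new one (C → D → E), never a partially initialized D.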

Move Down
[tree diagrams: before and after trees with nodes A–H]

Move Down
[tree diagrams with nodes A–H, including the copy F']
1. init F'
2. link F'
3. unlink F

Move Down
[timeline: writer: init(F'), link(F'), unlink(F), reclaim(F); reader: deref(H), deref(F), deref(D), deref(E); tree diagrams as above]

Move Down
[timeline as above, with the reader's path deref(H), deref(F), deref(D), deref(F'), deref(E)]

Move Down
[timeline as above, with the reader's path deref(H), deref(D), deref(F'), deref(E)]

RBTree Delete with Swap (Move Up)
[tree diagrams: before and after trees with nodes A–F; C is copied to C', which replaces the deleted node]

Move Up
[timeline: writer: init(C'), link(C'), unlink(C), reclaim(C); reader: deref(F), deref(C'); tree diagrams as above]

Move Up
[timeline as above, with the reader's path deref(F), deref(B), deref(E), deref(C)]

The General Case
Mutable vs. immutable data

The General Case for Writers
- Copy nodes to update immutable data or to make a collection of updates appear atomic
- Use rp-publish to update mutable data
- Use wait-for-readers when updates are in traversal order
[diagram: linked list A → B → C]

The General Case for Readers
- Use rp-read for reading mutable data
- Only dereference mutable data once; the code below reads node->next twice and may see two different nodes if a writer runs in between:

    if (node->next->key == key) {
        return node->next->value;
    }
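The single-dereference fix can be sketched as below, assuming an rp_read that maps to an acquire load (the kernel analogue would be rcu_dereference()): load the mutable pointer once into a local, then work only with that snapshot.

```c
#include <stdatomic.h>
#include <stddef.h>

struct lnode {
    int key;
    int value;
    struct lnode * _Atomic next;
};

/* rp-read: one ordered load of a mutable pointer. */
#define rp_read(ptr) atomic_load_explicit((ptr), memory_order_acquire)

/* Return 1 and fill *value_out if node's successor has the key. */
int lookup_next(struct lnode *node, int key, int *value_out)
{
    struct lnode *next = rp_read(&node->next);  /* dereference once */
    if (next != NULL && next->key == key) {
        *value_out = next->value;               /* reuse the snapshot */
        return 1;
    }
    return 0;
}
```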

Outline
- The Problem
- The RWL Solution
- The RP Solution
- Other problems and their solutions
- Multiple Writers
- Performance

Multiple Writers
[timeline: writer W with concurrent readers R1 and R2]

Multiple Writers
[timeline: writers W1 and W2 with a concurrent Reader]

The phone company
[cartoon with speech bubbles: "Can't wait to get my new phone", "Can't wait to ditch this phone" (twice), "Where's my phone book?"]

Multiple Writers
[timelines: W1 and W2 with a Reader, without and with a wait-for-readers between the two writers]

RP vs RWL delays
[timelines of W1, Reader, W2 under RP, reader-preference RWL, writer-preference RWL, and TORP]

Trigger Events
[chart comparing trigger events under RP, RWLR, and RWLW]

Outline
- The Problem
- The RWL Solution
- The RP Solution
- Other problems and their solutions
- Multiple Writers
- Performance

How does → work?

    Thread 1:
        A: a = 1;
        mb();
        C: c = 1;
    Thread 2:
        while (!a) {}
        B: ...        /* A → B */
    Thread 3:
        if (c)
            D: ...    /* A → D */

Reader-Writer Locks
[timeline: read-lock ... read-unlock sections serialized against a write-lock ... write-unlock section]

Reader-Writer Locks

    Reader:
        while (writing) {}
        reading = 1;

    Writer:
        while (reading) {}
        writing = 1;
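The flag-based sketch above illustrates the idea but races as written: a reader and a writer can both pass their while tests before either store becomes visible. A hedged, minimal C11 version of the same idea (a toy under my own assumptions, not a production rwlock and not the deck's code) uses an atomic reader count plus a recheck to close that window:

```c
#include <stdatomic.h>

static atomic_int readers;   /* number of active readers */
static atomic_int writing;   /* 1 while a writer holds the lock */

void read_lock(void)
{
    for (;;) {
        while (atomic_load(&writing))
            ;                               /* wait for the writer */
        atomic_fetch_add(&readers, 1);      /* announce ourselves */
        if (!atomic_load(&writing))         /* recheck: did a writer sneak in? */
            return;
        atomic_fetch_sub(&readers, 1);      /* back off and retry */
    }
}

void read_unlock(void)
{
    atomic_fetch_sub(&readers, 1);
}

void write_lock(void)
{
    int expected = 0;
    /* claim the writer flag atomically, then drain active readers */
    while (!atomic_compare_exchange_weak(&writing, &expected, 1))
        expected = 0;
    while (atomic_load(&readers))
        ;
}

void write_unlock(void)
{
    atomic_store(&writing, 0);
}
```

The sequentially consistent atomics here also supply the ordering the next slide demands: no operation inside the critical section can be reordered past the lock or unlock.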

Relativistic Programming
[timeline: a start-wait ... end-wait section overlapping two start-read ... end-read sections]

Relativistic Programming

    start-read(i)  { reader[i] = 1; }
    end-read(i)    { reader[i] = 0; }
    start-wait()   {}
    end-wait()
    {
        for (i) {
            while (reader[i]) {}
        }
    }

Relativistic Programming

    start-read(i)  { reader[i] = Epoch; }
    end-read(i)    { reader[i] = 0; }
    start-wait()
    {
        Epoch++;
        my_epoch = Epoch;
    }
    end-wait()
    {
        for (i) {
            while (reader[i] && reader[i] < my_epoch) {}
        }
    }
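The epoch scheme above can be fleshed out as the following sketch, assuming C11 atomics and a single writer (names follow the slide; this is a toy, not the deck's implementation). A reader records the epoch it entered in; wait-for-readers advances the epoch and then waits only for readers whose recorded epoch predates its own, so readers that started after the wait began are skipped.

```c
#include <stdatomic.h>

#define NREADERS 4

static atomic_long reader[NREADERS];  /* 0 = idle, else epoch at start-read */
static atomic_long Epoch = 1;

void start_read(int i)
{
    /* record the current epoch so a waiter can tell whether
     * this reader predates its wait */
    atomic_store_explicit(&reader[i],
        atomic_load_explicit(&Epoch, memory_order_acquire),
        memory_order_release);
}

void end_read(int i)
{
    atomic_store_explicit(&reader[i], 0, memory_order_release);
}

/* wait-for-readers: advance the epoch, then spin until every
 * reader slot is either empty or from a newer epoch. */
void wait_for_readers(void)
{
    long my_epoch = atomic_fetch_add(&Epoch, 1) + 1;
    for (int i = 0; i < NREADERS; i++) {
        long e;
        while ((e = atomic_load_explicit(&reader[i], memory_order_acquire)) != 0
               && e < my_epoch)
            ;  /* reader i predates our epoch; keep waiting */
    }
}
```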

Performance Benefits of RP
- Less expensive read-side primitives
- Less serialization / more concurrency
- Less waiting

Benchmarked Methods
    nolock   No synchronization
    rp       Relativistic Programming
    torp     Totally Ordered RP
    rwlr     Reader-Writer Lock, read preference
    rwlw     Reader-Writer Lock, write preference

Read Performance (size=1) [chart]

Read Performance (size=1000) [chart]

Update Scalability [chart]

Update Scalability (part 2) [chart]

Update Scalability (part 3) [chart]

Conclusions
- Correctness can be preserved by limiting allowable orderings
- RP read primitives are less expensive than RWL primitives
- RP allows more concurrency and less waiting

Conclusions
RP can preserve both correctness and scalability of reads