The ATOMOS Transactional Programming Language
Department of Computer Science
Presenters: Dennis Gove, Matthew Marzilli



Slide 2: What is Atomos?
- Atomos delivers to Java:
  - Implicit transactions
  - Strong atomicity
  - A programming-language approach
  - A scalable multiprocessor implementation
- These constructs help make parallel programming:
  - More intuitive for the programmer
  - Easier to reason about with respect to execution
  - Able to exceed the performance of lock-based systems

Slide 3: The Downside of Locks
- Traditional threaded programs use locks to protect critical sections of code.
- Typically there are one or more locks per shared data structure.
- Heavy (coarse-grained) locking can lead to serialization.
- Fine-grained locking helps performance, but the increased code complexity risks deadlock and priority inversion.
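As a sketch of the complexity the slide warns about (not code from the slides; the Account and Bank names are hypothetical): a fine-grained transfer between two accounts must hold both accounts' locks, and acquiring them in inconsistent orders across threads can deadlock. Imposing a global lock order avoids the deadlock, but that discipline is exactly the extra code complexity transactions aim to remove.

```java
// Hypothetical fine-grained locking example: each Account is its own lock.
class Account {
    final int id;
    int balance;
    Account(int id, int balance) { this.id = id; this.balance = balance; }
}

class Bank {
    // Always lock accounts in ascending id order, so two concurrent
    // transfers (a->b and b->a) can never wait on each other in a cycle.
    static void transfer(Account from, Account to, int amount) {
        Account first  = from.id < to.id ? from : to;
        Account second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }
}
```

With transactions, the whole transfer body would simply sit inside one atomic block and the ordering discipline disappears.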

Slide 4: A New Parallel Construct - Transactions
- Declare a portion of code atomic.
- Allows the programmer to focus on where atomicity is necessary.
- Provides non-blocking synchronization.

    atomic {
        count = count + 1;
    }

- The atomic keyword marks the start of the transaction; at the closing brace the transaction ends and the change to count is "committed", becoming visible to other threads.

Slide 5: Violations
- Transactions must detect violations of data dependencies to ensure atomicity.
- A violation occurs when one transaction's "read set" intersects another transaction's "write set".
- A violation causes a transaction to "roll back" and begin again.
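The read-set/write-set check can be sketched in a few lines of plain Java. This is an illustration of the rule stated above, not the Atomos runtime; the Txn class and conflictsWith method are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch: a transaction tracks the locations it read and wrote.
class Txn {
    final Set<String> readSet = new HashSet<>();
    final Set<String> writeSet = new HashSet<>();

    // True if 'this' must roll back because 'other' committed writes to
    // something this transaction has read (read set intersects write set).
    boolean conflictsWith(Txn other) {
        for (String location : readSet)
            if (other.writeSet.contains(location)) return true;
        return false;
    }
}
```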

Slide 6: Details of What Atomos Provides
- Implicit transactions:
  - Atomic sections let programmers use parallel constructs without specific transactional knowledge.
  - Explicit transactions, by contrast, require "transactional awareness" from the programmer.
- Strong atomicity:
  - Non-transactional (non-atomic) code does not see the state of uncommitted transactions.
  - Non-transactional updates to shared memory still violate transactions.
  - Under weak atomicity, isolation is guaranteed only between transactions.

Slide 7: Details of What Atomos Provides (continued)
- Programming language:
  - Some transactional systems are only libraries.
  - Language-level semantics allow a compiler to generate safe and efficient code.
- Multiprocessor scalability:
  - Atomos provides an implementation that takes advantage of multiprocessor trends.

Slide 8: Atomos Synchronization Primitives - atomic
- Transactions are defined by the atomic statement.
- Remember "strong atomicity": atomic sections serialize with non-transactional code as well.

    atomic {
        counter++;
    }

- Programmers usually mean "atomic" when they write lock() or synchronized().

Slide 9: Atomos Synchronization Primitives - Nesting
- Nested atomic statements follow "closed-nesting" semantics.
- An inner atomic statement merges its read and write sets into its parent's upon commit.
- On a violation, only the affected parent and its children must be rolled back.
- Closed vs. open nesting is revisited after some examples.
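The closed-nesting commit rule above can be sketched directly: the child's footprint folds into the parent instead of becoming globally visible. This is an illustration only; NestedTxn and commitClosed are assumed names, not Atomos internals.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of closed nesting: a child transaction commits by
// merging its read and write sets into its parent's sets.
class NestedTxn {
    final NestedTxn parent;            // null for a top-level transaction
    final Set<String> readSet = new HashSet<>();
    final Set<String> writeSet = new HashSet<>();

    NestedTxn(NestedTxn parent) { this.parent = parent; }

    // Closed-nesting commit: the child's footprint becomes part of the
    // parent's footprint, so a later violation rolls both back together.
    void commitClosed() {
        if (parent != null) {
            parent.readSet.addAll(readSet);
            parent.writeSet.addAll(writeSet);
        }
    }
}
```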

Slide 10: Atomos Synchronization Primitives - watch and retry
- watch: watch a variable for a change.
- retry: roll back and restart the atomic block.
  - retry communicates the "watch set" (the set of all watched variables) to the scheduler.
  - The scheduler then listens for violations within the watch set and, upon one, reschedules the waiting thread.
- The next slides show these constructs in a producer-consumer example.
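In standard Java, the closest analogue to watch/retry is blocking on a monitor until a watched condition changes, rather than spinning. The sketch below is only an emulation under that assumption (Atomos implements this with violation detection in the scheduler, not with locks); the WatchSet name and its methods are hypothetical.

```java
import java.util.function.BooleanSupplier;

// Rough Java analogue of watch/retry: "retry" parks the thread, and a
// writer's update to a watched variable wakes it to re-check the condition.
class WatchSet {
    private final Object monitor = new Object();

    // Block until the watched condition holds (re-checked after each wakeup).
    void await(BooleanSupplier ready) {
        synchronized (monitor) {
            try {
                while (!ready.getAsBoolean()) monitor.wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Called by a writer after updating a watched variable.
    void signal() {
        synchronized (monitor) { monitor.notifyAll(); }
    }
}
```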

Slide 11: Producer-Consumer Example (using Java synchronized)

    public int get() {
        synchronized (this) {
            while (!available)
                wait();
            available = false;
            notifyAll();
            return contents;
        }
    }

    public void put(int value) {
        synchronized (this) {
            while (available)
                wait();
            contents = value;
            available = true;
            notifyAll();
        }
    }

Slide 12: Producer-Consumer Example (using Atomos Constructs)

    public int get() {
        atomic {
            if (!available) {
                watch available;
                retry;
            }
            available = false;
            return contents;
        }
    }

    public void put(int value) {
        atomic {
            if (available) {
                watch available;
                retry;
            }
            contents = value;
            available = true;
        }
    }

Slide 13: Barrier Example (using Java synchronized)

    synchronized (lock) {
        count++;
        if (count != thread_count)
            lock.wait();
        else
            lock.notifyAll();
    }

Slide 14: Barrier Example (using Atomos Constructs)

    atomic {
        count++;
    }
    atomic {
        if (count != thread_count) {
            watch count;
            retry;
        }
    }

Slide 15: Closed vs. Open Nested Transactions
- Recall that nested atomic statements use closed nesting.
- What if updates from a child transaction must be visible across all threads immediately?
- Open nested transactions commit their changes globally right away, without waiting for the parent.

Slide 16: Closed vs. Open Nested Transactions (figure, not transcribed)

Slide 17: Closed vs. Open Nested Transactions (closed nesting)

    public static int generateID() {
        atomic {
            return id++;
        }
    }

    public static void createOrder(...) {
        atomic {
            Order order = new Order();
            order.setID(generateID());
            orders.put(new Integer(order.getID()), order);
        }
    }

Slide 18: Closed vs. Open Nested Transactions (open nesting)

    public static int generateID() {
        open {
            return id++;
        }
    }

    public static void createOrder(...) {
        atomic {
            Order order = new Order();
            order.setID(generateID());
            orders.put(new Integer(order.getID()), order);
        }
    }
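In plain Java, the observable effect of the open-nested generateID (its counter update becomes visible immediately, independent of the enclosing transaction) is roughly what an atomic counter gives: the counter advances even if the caller's surrounding work is later redone, so IDs may be skipped but never reused. This is an analogy sketch, not the Atomos mechanism; the IdGen name is hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Analogy for open-nested ID generation: each call takes effect immediately
// and independently of any surrounding work.
class IdGen {
    private static final AtomicInteger id = new AtomicInteger();

    static int generateID() {
        return id.getAndIncrement();
    }
}
```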

Slide 19: Loop Speculation
- Atomos provides a loop construct that lets existing for-loops quickly take advantage of transactional parallelism.
- It also gives the programmer control over the ordering of these transactions.
- Loops can be ordered or unordered.

Slide 20: Loop Speculation

    void histogram(int[] A, int[] bin) {
        for (int i = 0; i < A.length; i++)
            bin[A[i]]++;
    }

    void histogram(int[] A, int[] bin) {
        Loop.run(false, 20, Arrays.asList(A), new LoopBody() {
            public void run(Object o) {
                bin[A[((Integer)o).intValue()]]++;
            }
        });
    }
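The Atomos Loop.run construct is not available in standard Java, but a rough analogue of the unordered case is a parallel stream with per-bin atomic increments: the histogram result is the same regardless of which iterations run concurrently. This sketch is an analogy under that assumption, not the slide's Atomos code.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.stream.IntStream;

// Unordered-speculation analogue: iterations run in parallel, and each
// bin update is individually atomic, so no ordering is required.
class Histogram {
    static int[] histogram(int[] a, int bins) {
        AtomicIntegerArray bin = new AtomicIntegerArray(bins);
        IntStream.range(0, a.length)
                 .parallel()
                 .forEach(i -> bin.incrementAndGet(a[i]));
        int[] out = new int[bins];
        for (int i = 0; i < bins; i++) out[i] = bin.get(i);
        return out;
    }
}
```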

Slide 21: Loop Speculation: Ordering (figure, not transcribed)

Slide 22: Evaluation
- How does Atomos compare with Java?
  - Embarrassingly parallel workloads: matches Java performance.
  - High contention between threads: exceeds Java performance.
- Four major benchmarks were used.

Slide 23: Evaluation - SPECjbb2000
- A server-side Java benchmark; embarrassingly parallel, with only a 1% chance of contention between threads.
- Meant to compare basic Java performance with Atomos.
- synchronized statements were automatically changed to atomic.
- Threads and warehouses vary from 1 to 32.

Slide 24: Evaluation - SPECjbb2000 (results figure)
- Atomos matches Java on "embarrassingly parallel" performance.

Slide 25: Evaluation - TestHashtable
- The biggest benefit of Atomos is "optimistic speculation" instead of threads' "pessimistic waiting".
- The TestHashtable micro-benchmark compares varying implementations of java.util.Map.
- Multiple threads contend over a single Map instance with 50% get and 50% put operations.
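The shape of that micro-benchmark can be sketched in plain Java: several threads hammer one shared Map with a 50/50 mix of gets and puts. The slides do not show the benchmark code, so everything below (MapStress, the key scheme, the op counts) is an illustrative assumption, with a ConcurrentHashMap standing in for the transactional map.

```java
import java.util.Map;

// Hypothetical sketch of a TestHashtable-style stress run: 'threads'
// workers each issue opsPerThread operations, half puts and half gets,
// against one shared Map instance. Returns the final map size.
class MapStress {
    static int run(Map<Integer, Integer> map, int threads, int opsPerThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int seed = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < opsPerThread; i++) {
                    int key = (seed * opsPerThread + i) % 64;  // 64 hot keys
                    if (i % 2 == 0) map.put(key, i);  // 50% puts
                    else map.get(key);                // 50% gets
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try { w.join(); } catch (InterruptedException e) { }
        }
        return map.size();
    }
}
```

Swapping in Collections.synchronizedMap versus a concurrent implementation is what exposes the contention difference the slide describes.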

Slide 26: Evaluation - TestHashtable (results figure)
- Think back to the watch and retry statements: watch/retry instead of waiting helps high-concurrency performance.

Slide 27: Evaluation - Conditional Waiting with TestWait
- Focuses on the performance of the producer-consumer problem, with heavy use of conditional-waiting semantics.
- 32 threads operate on 32 shared queues.

Slide 28: Evaluation - Conditional Waiting with TestWait (results figure, not transcribed)

Slide 29: Evaluation - Loop.run with TestHistogram
- Random numbers between 0 and 100 are counted into bins.
- The Java implementation uses a lock for each bin.
- The Atomos implementation uses a transaction for each update.

Slide 30: Evaluation - Loop.run with TestHistogram (results figure, not transcribed)

Slide 31: Conclusion
- An intuitive model for parallel applications.
- "Optimistic speculation" instead of pessimistic waiting.
- Strong performance both for embarrassingly parallel workloads and for high contention between threads.