Tools for the development of parallel applications

Presentation transcript:

Tools for the development of parallel applications Assaf Schuster Technion – Israel Institute of Technology 4/28/2019 Intel MultiCore Conference

Message Passing?
- People will not give up shared memory programming
- Do not want to learn messaging APIs
- Shared memory considered a “natural” extension of serial code
- Distributed computing sounds scary; multithreading? “I’ve tried it, it worked”
- Shared memory considered “easier to implement” on upper layers of memory hierarchy

Multithreading Issue #1: Debugging Tools
- The hard problems: over- and under-synchronization
- Very few tools
- No full coverage
- No precise tools
- No scalable tools
- No educational material
- Some positive research techniques – very limited

Therac 25
- A medical radiation machine to treat cancer
- 6 patients got a radiation overdose
- 4 died
- 2 injured

State of the Art - Industry
- Intel Thread Checker?
- IBM’s ConTest:
  1. Add yields at random locations to expose deserted interleavings
  2. Watch the output: any problem?
  3. Look for the causes of the problem in the interleaving
- Steps 2 & 3 are human-based: “travel debugging”
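The "add yields at random locations" idea can be illustrated with a toy simulation. This is only a sketch and not how ConTest itself is implemented: the names `increment`, `run_with_noise`, and `find_failing_seeds` are hypothetical, threads are modeled as Python generators, and a seeded random scheduler stands in for injected yields. The seed makes each interleaving reproducible, which is what allows a failing run to be replayed.

```python
import random

def increment(shared):
    """One unsynchronized increment, split at an instrumentation point."""
    local = shared["counter"]       # read the shared counter
    yield                           # injected yield: the scheduler may switch here
    shared["counter"] = local + 1   # write back, possibly overwriting another update

def run_with_noise(seed, n_threads=2):
    """Run n_threads increments under a seeded random scheduler; return the final counter."""
    rng = random.Random(seed)
    shared = {"counter": 0}
    runnable = [increment(shared) for _ in range(n_threads)]
    while runnable:
        thread = rng.choice(runnable)   # random scheduling decision
        try:
            next(thread)                # run the thread up to its next yield
        except StopIteration:
            runnable.remove(thread)     # thread finished
    return shared["counter"]

def find_failing_seeds(expected=2, seeds=range(200)):
    """Step 2 of the workflow, automated: which seeds produce a wrong output?"""
    return [s for s in seeds if run_with_noise(s) != expected]
```

Re-running `run_with_noise` with a seed returned by `find_failing_seeds` reproduces the same lost update, so the search for its cause (step 3) starts from a fixed interleaving.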

State of the Art – Research
- On-the-fly data race (DR) detection for C++ programs
- Instrumentation-based
[Pozniansky and Schuster, PPoPP 2003] (4-way IBM Netfinity server, 550MHz, Win-NT)

Playing With Detection Granularity
- Coverage: a single interleaving
[Pozniansky and Schuster, PPoPP 2003] (4-way IBM Netfinity server, 550MHz, Win-NT)

State of the Art - Research
- Model Checking DRs (IBM tools)
- Boosted with Dynamic Information (Lockset)
- Provides a witness to the race
- Detects races in many interleavings
- Scalability: 4 threads max
- Scalability: Sequential Consistency only
[Sagiv, Shacham, Schuster, PPoPP 2005]
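The dynamic lockset information mentioned on this slide can be sketched as follows. This is a simplified, Eraser-style approximation and not the algorithm of the cited tools: the trace format, the function name `lockset_races`, and the sample `TRACE` are all assumptions made for illustration. For each shared variable it keeps the set of locks held on every access so far, and reports the variable once that set is empty and more than one thread has touched it.

```python
def lockset_races(trace):
    """trace: iterable of (thread, op, target), op in {"acquire", "release", "read", "write"}.
    Returns the variables with no common lock across accesses by two or more threads."""
    held = {}         # thread   -> locks currently held
    candidates = {}   # variable -> locks held on every access so far
    accessors = {}    # variable -> threads that accessed it
    suspects = set()
    for thread, op, target in trace:
        locks = held.setdefault(thread, set())
        if op == "acquire":
            locks.add(target)
        elif op == "release":
            locks.discard(target)
        else:  # read or write of a shared variable
            accessors.setdefault(target, set()).add(thread)
            if target in candidates:
                candidates[target] &= locks      # refine the candidate lockset
            else:
                candidates[target] = set(locks)  # first access: start from held locks
            if not candidates[target] and len(accessors[target]) > 1:
                suspects.add(target)
    return suspects

# Hypothetical trace: x is always protected by lock L, y is not.
TRACE = [
    ("T1", "acquire", "L"), ("T1", "write", "x"), ("T1", "release", "L"),
    ("T2", "acquire", "L"), ("T2", "write", "x"), ("T2", "release", "L"),
    ("T1", "write", "y"),
    ("T2", "acquire", "L"), ("T2", "read", "y"), ("T2", "release", "L"),
]
```

On this trace `lockset_races(TRACE)` reports only `y`. A lockset report is a suspicion rather than a proof, which is where a model checker that produces a witness interleaving adds value.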

Multithreading Issue #2: Memory Models
- Observation: most current multithreaded “scalable” applications were/are created by programmers who do not know what a memory model (MM) is all about.
- No educational material
- “Here’s my new great book: introduction to parallel programming…”
- Existing debugging tools ignore MM

Memory Models Education
Initially X=Y=0

T1          | T2
=======     | =======
Print(X)    | Print(Y)
Y = 1       | X = 1

Is <<< 1 1 >>> a legitimate output?
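One way to approach the question is to enumerate every sequentially consistent interleaving. The sketch below assumes T1 runs Print(X) then Y = 1 and T2 runs Print(Y) then X = 1; the helper names `interleavings` and `sc_outcomes` are hypothetical. It only answers the question for sequential consistency, so a weaker memory model may give a different answer.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that preserves each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

# Assumed reading of the slide: each thread prints one variable, then sets the other.
T1 = [("T1", "print", "X"), ("T1", "store", "Y")]
T2 = [("T2", "print", "Y"), ("T2", "store", "X")]

def sc_outcomes():
    """Return the set of (value printed by T1, value printed by T2) over all SC schedules."""
    outcomes = set()
    for schedule in interleavings(T1, T2):
        memory = {"X": 0, "Y": 0}
        printed = {}
        for thread, kind, var in schedule:
            if kind == "print":
                printed[thread] = memory[var]
            else:
                memory[var] = 1
        outcomes.add((printed["T1"], printed["T2"]))
    return outcomes
```

Under this reading, `sc_outcomes()` contains (0, 0), (0, 1), and (1, 0) but never (1, 1): for T1 to print 1, T2 must already have finished, so T2 printed Y before T1 could set it.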

Java Memory Model
- The first popular language attempting a “formal definition”.
- JLS 1995, 17 pages.
- Theorem: JMM is Coherent [Gontmakher and Schuster, IPDPS 1997]
- Did they know?
- Consequence: ***All*** tested JVMs in 1999-2000 breach JMM specification.

Coherence makes reads prevent common compiler optimizations
p and q might point to the same object

p.x = 0     | a = p.x
p.x = 1     | b = q.x
            | c = p.x    (cannot put c = a)

assert(p == q ⇒ a ≤ b ≤ c)

Reads can make a process see writes by another process. The read “kills” later reuse of local values.

Coherence prevents important compiler optimizations. One of the common optimizations is the reuse of a local copy of a global variable. When there is a read of another global variable that may be the same as the first, this might be a read of a modified value of the global variable, and thus the previously read value cannot be reused; otherwise Coherence is violated.
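The effect of reusing the local copy can be checked by brute force. The sketch below is an illustration under stated assumptions: p and q alias the same object, x starts at 0, one process performs the two writes while another performs the reads, and the names `READER_REUSE`, `outcomes`, and `is_monotone` are hypothetical. It compares the original reader with one where `c = p.x` has been replaced by `c = a`.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that preserves each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

WRITER = [("write", 0), ("write", 1)]                           # p.x = 0; p.x = 1
READER = [("read", "a"), ("read", "b"), ("read", "c")]          # a = p.x; b = q.x; c = p.x
READER_REUSE = [("read", "a"), ("read", "b"), ("reuse", "c")]   # c = a (the optimization)

def outcomes(reader):
    """Return all (a, b, c) triples reachable when p and q alias one location."""
    results = set()
    for schedule in interleavings(WRITER, reader):
        x = 0       # assumed initial value of the aliased field
        regs = {}
        for kind, arg in schedule:
            if kind == "write":
                x = arg
            elif kind == "read":
                regs[arg] = x
            else:   # reuse the earlier local copy instead of reading memory
                regs[arg] = regs["a"]
        results.add((regs["a"], regs["b"], regs["c"]))
    return results

def is_monotone(triple):
    a, b, c = triple
    return a <= b <= c
```

Every triple in `outcomes(READER)` is monotone, while `outcomes(READER_REUSE)` contains (0, 1, 0): the read through q observed the new value, so reusing the older copy for c makes the location appear to go backwards.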

August 2004: A New JMM
- Java Community Process, JSR-133
- Bill Pugh, Sarita Adve, and others
- Took three years (could have been longer)
- Is now being adopted as a standard for C++
- But is it any good?

Causality requirements: guarantees for correctly synchronized programs
- The program is correctly synchronized if it is data-race free on a sequentially consistent platform
- Java must guarantee sequentially consistent behavior for correctly synchronized programs

Initially, x = y = 0

Thread 1        | Thread 2
r1 = x;         | r2 = y;
if (r1 != 0)    | if (r2 != 0)
  y = 1;        |   x = 1;

r1 == r2 == 0 is the only legal behavior
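The claim that this program is correctly synchronized can be checked by enumerating its sequentially consistent executions. The sketch below uses hypothetical names (`run`, `sc_results`) and a tiny hand-written encoding of the two threads; it also counts how many writes actually execute, since a program whose writes never run has no conflicting accesses and therefore no data race.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that preserves each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

# Thread 1: r1 = x; if (r1 != 0) y = 1;
T1 = [("load", "r1", "x"), ("store_if_nonzero", "r1", "y")]
# Thread 2: r2 = y; if (r2 != 0) x = 1;
T2 = [("load", "r2", "y"), ("store_if_nonzero", "r2", "x")]

def run(schedule):
    """Execute one schedule; return (r1, r2, number of writes that actually ran)."""
    memory = {"x": 0, "y": 0}
    regs = {}
    writes = 0
    for kind, reg, var in schedule:
        if kind == "load":
            regs[reg] = memory[var]
        elif regs[reg] != 0:    # guarded store
            memory[var] = 1
            writes += 1
    return regs["r1"], regs["r2"], writes

def sc_results():
    return {run(schedule) for schedule in interleavings(T1, T2)}
```

`sc_results()` is the single triple (0, 0, 0): in every sequentially consistent execution both reads see 0 and neither write runs, so the program is data-race free and r1 == r2 == 0 is the only behavior the guarantee permits.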

Causality Requirements: Safety Guarantees
Initially, x = y = 0

Thread 1    | Thread 2
r1 = x;     | r2 = y;
y = r1;     | x = r2;

Incorrectly synchronized. Still: r1 == r2 == 42 must not be allowed

Java must provide safety guarantees for incorrectly synchronized programs: values must not appear “out of thin air”

Causality Test Case 6
Initially, A = B = 0

Thread 1            | Thread 2
1: r1 = A;          | 3: r2 = B;
   if (r1 == 1)     |    if (r2 == 1)
2:   B = 1;         | 4:   A = 1;
                    |    if (r2 == 0)
                    | 5:   A = 1;

r1 == r2 == 1 ALLOWED?!
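To see why this outcome is surprising, the sketch below enumerates sequentially consistent executions of the program as written, and then of a variant in which Thread 2's write to A is hoisted above its read of B. The hoisted variant is an assumption introduced here for illustration: it reflects the kind of compiler reasoning (Thread 2 writes 1 to A on both branches) that is often used to justify the outcome. The names `T2_HOISTED` and `sc_outcomes` are hypothetical.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that preserves each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

# Thread 1: r1 = A; if (r1 == 1) B = 1;
T1 = [("load", "r1", "A"), ("store_if_eq", "r1", 1, "B")]
# Thread 2: r2 = B; if (r2 == 1) A = 1; if (r2 == 0) A = 1;
T2 = [("load", "r2", "B"), ("store_if_eq", "r2", 1, "A"), ("store_if_eq", "r2", 0, "A")]
# Assumed transformation: the write to A moved before the read of B.
T2_HOISTED = [("store", "A"), ("load", "r2", "B")]

def sc_outcomes(thread1, thread2):
    """Return all (r1, r2) pairs reachable under sequential consistency."""
    results = set()
    for schedule in interleavings(thread1, thread2):
        memory = {"A": 0, "B": 0}
        regs = {}
        for op in schedule:
            if op[0] == "load":
                _, reg, var = op
                regs[reg] = memory[var]
            elif op[0] == "store_if_eq":
                _, reg, value, var = op
                if regs[reg] == value:
                    memory[var] = 1
            else:   # unconditional store of 1
                memory[op[1]] = 1
        results.add((regs["r1"], regs["r2"]))
    return results
```

For the program as written, `sc_outcomes(T1, T2)` yields only (0, 0) and (1, 0). With `T2_HOISTED` the pair (1, 1) becomes reachable, so a memory model that permits this reordering must also permit r1 == r2 == 1.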

Conclusions
- Parallelization algorithms, load balancing (most of the work in 30 years): usually non-issues
- Debugging tools for multithreading
  - We do not have good technology, yet
  - And may never have: it is a hard problem
- Memory models for shared memory
  - We do not know what we want, yet
  - I do not see programmers understanding them
- Solution
  - Better programming paradigms
  - Data-race free languages
  - Valiant’s BSP (shared memory-like, easy to debug)
- Next panel!