Correcting Threading Errors with Intel® Parallel Inspector.


Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Objectives
After successful completion of this module you will be able to:
- Use Parallel Inspector to detect and identify a variety of threading correctness issues in threaded applications
- Determine if library functions are thread-safe

Agenda
- What is Intel® Parallel Inspector?
- Detecting race conditions
- Detecting potential for deadlock
- Checking library thread-safety

Motivation
- Developing threaded applications can be a complex task
- A new class of problems is caused by the interaction between concurrent threads
  - Data races or storage conflicts: more than one thread accesses memory without synchronization
  - Deadlocks: a thread waits for an event that will never happen

Intel® Parallel Inspector
- Debugging tool for threaded software
  - Plug-in to Microsoft* Visual Studio*
- Finds threading bugs in OpenMP*, Intel® Threading Building Blocks, and Win32* threaded software
- Quickly locates bugs that can take days to find using traditional methods and tools
  - Isolates problems, not the symptoms
  - A bug does not have to occur for the tool to find it!

Intel® Parallel Inspector Features
- Integrated into the Microsoft Visual Studio .NET* IDE (2005 and 2008 editions)
- Supports different compilers
  - Microsoft* Visual* C++ .NET*
  - Intel Parallel Composer
- View (drill down to) source code for diagnostics
- One-click help for diagnostics
  - Possible causes and solution suggestions

Parallel Inspector: Analysis
- Dynamic, as software runs
  - Data (workload)-driven execution
- Includes monitoring of:
  - Thread and sync APIs used
  - Thread execution order
    - Scheduler impacts results
  - Memory accesses between threads
- Code path must be executed to be analyzed

Parallel Inspector: Before You Start
- Instrumentation: background
  - Adds calls to a library to record information
    - Thread and sync APIs
    - Memory accesses
  - Increases execution time and size
- Use small data sets (workloads)
  - Execution time and space are expanded
  - Multiple runs over different paths yield best results
- Workload selection is important!

Workload Guidelines
- Execute problem code once per thread to be identified
- Use smallest possible working data set
  - Minimize data set size
    - Smaller image sizes
  - Minimize loop iterations or time steps
    - Simulate minutes rather than days
  - Minimize update rates
    - Lower frames per second
- Finds threading errors faster!

Building for Parallel Inspector
- Compile
  - Use dynamically linked thread-safe runtime libraries (/MDd)
  - Generate symbolic information (/ZI)
  - Disable optimization (/Od)
- Link
  - Preserve symbolic information (/DEBUG)
  - Specify relocatable code sections (/FIXED:NO)

Binary Instrumentation
- Build with a supported compiler
- Running the application
  - Must be run from within Parallel Inspector
  - Application is instrumented when executed
  - External DLLs are instrumented as used

Starting Parallel Inspector
1. Build the Debug version of the application with the appropriate flags set.
2. Select Parallel Inspector from the Tools menu. You can choose to look for memory errors or threading errors.
3. The Configure Analysis window pops up. Select the level of analysis to be carried out by Parallel Inspector. The deeper the analysis, the more thorough the results and the longer the execution time. Click Run Analysis.
4. The initial (raw) results come up after analysis. Click the Interpret Results button to filter the raw data into more human-consumable formats.
5. The analysis results are gathered together in related categories. Double-click a line from the Problem Sets pane to see the source code that generated the diagnostic.
6. The source lines involved in a data race can be shown.

Activity 1a - Potential Energy
- Build and run serial version
- Build threaded version
- Run application in Parallel Inspector to identify threading problems

Race Conditions
- Execution order is assumed but cannot be guaranteed
  - Concurrent access of same variable by multiple threads
- Most common error in multithreaded programs
- May not be apparent at all times

Solving Race Conditions
Solution: Scope variables to be local to threads
- When to use
  - Value computed is not used outside parallel region
  - Temporary or "work" variables
- How to implement
  - OpenMP scoping clauses (private, shared)
  - Declare variables within threaded functions
  - Allocate variables on thread stack
  - TLS (Thread Local Storage) API
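
As a rough illustration of scoping work variables to each thread, the sketch below keeps every accumulator on the stack of the thread that uses it and combines the results only after all threads have finished. It uses portable C++ `std::thread` as a stand-in for the Win32 and OpenMP mechanisms named on the slide; the function name `parallelSum` and its parameters are hypothetical and are not part of the original material.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: sums `values` using `numThreads` worker threads.
// Each worker accumulates into a variable local to its own invocation,
// so no two threads ever write the same memory location.
long parallelSum(const std::vector<int>& values, unsigned numThreads) {
    assert(numThreads > 0);
    std::vector<long> partial(numThreads, 0);  // one result slot per thread
    std::vector<std::thread> workers;
    const std::size_t chunk = (values.size() + numThreads - 1) / numThreads;

    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&values, &partial, chunk, t] {
            long local = 0;  // "work" variable on this thread's stack
            const std::size_t begin = std::min(values.size(), t * chunk);
            const std::size_t end = std::min(values.size(), begin + chunk);
            for (std::size_t i = begin; i < end; ++i) local += values[i];
            partial[t] = local;  // single write to a slot only this thread owns
        });
    }
    for (std::thread& w : workers) w.join();

    long total = 0;  // combined serially, after every worker has finished
    for (long p : partial) total += p;
    return total;
}
```

Because the only shared writes go to distinct slots and the final addition happens after the joins, no lock is needed in this version.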

Solving Race Conditions
Solution: Control shared access with critical regions
- When to use
  - Value computed is used outside parallel region
  - Shared value is required by each thread
- How to implement
  - Mutual exclusion and synchronization
  - Lock, semaphore, event, critical section, atomic…
- Rule of thumb: Use one lock per data element
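
A minimal sketch of the critical-region approach, assuming `std::mutex` as a portable substitute for a Win32 critical section. The struct `GuardedCounter` and the function `countWithLock` are made-up names for illustration; the struct pairs one lock with the one data element it protects, following the rule of thumb above.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared value bundled with the single lock that guards it.
struct GuardedCounter {
    std::mutex lock;
    long value = 0;
};

// Starts `numThreads` threads that each add 1 to the counter `perThread` times
// and returns the final count once all threads have been joined.
long countWithLock(unsigned numThreads, long perThread) {
    GuardedCounter counter;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&counter, perThread] {
            for (long i = 0; i < perThread; ++i) {
                std::lock_guard<std::mutex> hold(counter.lock);  // enter critical region
                ++counter.value;                                 // one thread at a time
            }                                                    // lock released each iteration
        });
    }
    for (std::thread& w : workers) w.join();
    return counter.value;
}
```

Taking the lock on every increment is deliberately simple; accumulating locally and locking once per thread would reduce contention.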

Activity 1b - Potential Energy
- Fix errors found by Parallel Inspector

Deadlock
- Caused by thread waiting on some event that will never happen
- Most common cause is locking hierarchies
  - Always lock and unlock in the same order
  - Avoid hierarchies if possible

    DWORD WINAPI threadA(LPVOID arg)
    {
      EnterCriticalSection(&L1);
        EnterCriticalSection(&L2);
          processA(data1, data2);
        LeaveCriticalSection(&L2);
      LeaveCriticalSection(&L1);
      return(0);
    }

    DWORD WINAPI threadB(LPVOID arg)
    {
      EnterCriticalSection(&L2);
        EnterCriticalSection(&L1);
          processB(data2, data1);
        LeaveCriticalSection(&L1);
      LeaveCriticalSection(&L2);
      return(0);
    }

ThreadA: L1, then L2
ThreadB: L2, then L1
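
One way to apply the "same order" rule to threadA and threadB is sketched below. It is not the original Win32 code: `std::mutex` stands in for the critical sections `L1` and `L2`, the bodies that update `data1` and `data2` are placeholders for `processA` and `processB`, and `runBoth` is a hypothetical driver.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex L1, L2;         // stand-ins for the two critical sections
int data1 = 0, data2 = 0;  // placeholder shared data

// Both thread functions take L1 first and L2 second, so neither can hold
// one lock while waiting for the other in the opposite order.
void threadA() {
    std::lock_guard<std::mutex> first(L1);
    std::lock_guard<std::mutex> second(L2);
    data1 += 1;  // placeholder for processA(data1, data2)
    data2 += 1;
}

void threadB() {
    std::lock_guard<std::mutex> first(L1);   // same order as threadA
    std::lock_guard<std::mutex> second(L2);
    data2 += 10;  // placeholder for processB(data2, data1)
    data1 += 10;
}

// Hypothetical driver: runs both threads to completion and returns the sum.
int runBoth() {
    std::thread a(threadA), b(threadB);
    a.join();
    b.join();
    return data1 + data2;
}
```

The guards release the locks in reverse order of acquisition when each function returns, which matches the nesting shown on the slide.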

Deadlock
- Add lock per element
- Lock only elements, not whole array of elements

    typedef struct {
      // some data things
      SomeLockType mutex;
    } shape_t;

    shape_t Q[1024];

    // Elements are passed by pointer so the locks taken are the ones
    // stored in the array, not copies.
    void swap(shape_t *a, shape_t *b)
    {
      lock(&a->mutex);
      lock(&b->mutex);
      // Swap data between a and b
      unlock(&b->mutex);
      unlock(&a->mutex);
    }

Thread 4: swap(&Q[986], &Q[34]);   grabs mutex 986
Thread 1: swap(&Q[34], &Q[986]);   grabs mutex 34

Each thread now holds the lock the other one needs, so neither can proceed.
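
The per-element swap can avoid this deadlock if every thread agrees on which of the two locks to take first. The sketch below orders the locks by element address, which is one common convention and not something the slides prescribe. `Shape`, `swapShapes`, and `stressSwap` are hypothetical names, and `std::mutex` replaces the unspecified `SomeLockType`.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

struct Shape {
    int data = 0;      // placeholder for "some data things"
    std::mutex mutex;  // one lock per element
};

void swapShapes(Shape& a, Shape& b) {
    if (&a == &b) return;  // locking the same mutex twice would self-deadlock
    // Lock the element with the lower address first, whatever the argument order.
    Shape* first = &a;
    Shape* second = &b;
    if (std::less<Shape*>()(second, first)) std::swap(first, second);
    std::lock_guard<std::mutex> holdFirst(first->mutex);
    std::lock_guard<std::mutex> holdSecond(second->mutex);
    std::swap(a.data, b.data);
}

// Two threads swap the same pair with opposite argument order, `rounds` times each.
// An even total number of swaps must leave both elements unchanged.
bool stressSwap(int rounds) {
    Shape Q[1024];
    Q[34].data = 34;
    Q[986].data = 986;
    std::thread t4([&Q, rounds] { for (int i = 0; i < rounds; ++i) swapShapes(Q[986], Q[34]); });
    std::thread t1([&Q, rounds] { for (int i = 0; i < rounds; ++i) swapShapes(Q[34], Q[986]); });
    t4.join();
    t1.join();
    return Q[34].data == 34 && Q[986].data == 986;
}
```

`std::lock` from the standard library is an alternative that acquires both mutexes without imposing an explicit order.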

Windows* Critical Section
- Lightweight, intra-process only mutex
- Most useful and most used
- New type

    CRITICAL_SECTION cs;

- Create and destroy operations

    InitializeCriticalSection(&cs);
    DeleteCriticalSection(&cs);

Windows* Critical Section

    CRITICAL_SECTION cs;

- Attempt to enter protected code

    EnterCriticalSection(&cs);

  - Blocks if another thread is in critical section
  - Returns when no thread is in critical section
- Upon exit of critical section

    LeaveCriticalSection(&cs);

  - Must be from obtaining thread

Example: Critical Section

    #include <windows.h>

    #define NUMTHREADS 4

    int bigComputation(void);   // assumed to be defined elsewhere

    CRITICAL_SECTION g_cs;      // why does this have to be global?
    int g_sum = 0;

    DWORD WINAPI threadFunc(LPVOID arg)
    {
      int mySum = bigComputation();
      EnterCriticalSection(&g_cs);
      g_sum += mySum;           // threads access one at a time
      LeaveCriticalSection(&g_cs);
      return 0;
    }

    int main(void)
    {
      HANDLE hThread[NUMTHREADS];
      InitializeCriticalSection(&g_cs);
      for (int i = 0; i < NUMTHREADS; i++)
        hThread[i] = CreateThread(NULL, 0, threadFunc, NULL, 0, NULL);
      WaitForMultipleObjects(NUMTHREADS, hThread, TRUE, INFINITE);
      DeleteCriticalSection(&g_cs);
      return 0;
    }

Activity 2 - Deadlock
- Use Intel® Parallel Inspector to find and correct the potential deadlock problem.

Thread Safe Routines
- All routines called concurrently from multiple threads must be thread safe
- How to test for thread safety?
  - Use OpenMP and Parallel Inspector for analysis
  - Use sections to create concurrent execution

Thread Safety Example
- Check for safety issues between
  - Multiple instances of routine1()
  - Instances of routine1() and routine2()
- Set up sections to test all permutations
- Still need to provide data sets that exercise relevant portions of code

    #pragma omp parallel sections
    {
      #pragma omp section
        routine1(&data1);
      #pragma omp section
        routine1(&data2);
      #pragma omp section
        routine2(&data3);
    }

Two Ways to Ensure Thread Safety
- Routines can be written to be reentrant
  - Any variables changed by the routine must be local to each invocation
  - Don't modify globally shared variables
- Routines can use mutual exclusion to avoid conflicts with other threads
  - If accessing shared variables cannot be avoided
- What if third-party libraries are not thread safe?
  - Will likely need to control threads' access to the library
- It is better to make a routine reentrant than to add synchronization
  - Avoids potential overhead
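
To make the reentrancy point concrete, the sketch below contrasts a routine that hides its state in a static variable with a version whose state is supplied by the caller, in the spirit of `rand` versus `rand_r`. All function names are hypothetical, and the generator constants are the well-known example from the C standard, used here only for illustration.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Not reentrant: the generator state is a hidden static shared by every
// caller, so concurrent calls would race on it. Shown for contrast only.
int nextRandomShared() {
    static unsigned state = 1;
    state = state * 1103515245u + 12345u;
    return static_cast<int>((state / 65536u) % 32768u);
}

// Reentrant: the only variable the routine changes belongs to the caller.
int nextRandom(unsigned& state) {
    state = state * 1103515245u + 12345u;
    return static_cast<int>((state / 65536u) % 32768u);
}

// Generates `count` values from `seed` using state local to this invocation.
std::vector<int> generate(unsigned seed, int count) {
    unsigned state = seed;
    std::vector<int> out;
    for (int i = 0; i < count; ++i) out.push_back(nextRandom(state));
    return out;
}

// Two threads run the reentrant routine concurrently with the same seed;
// since they share no state, both must produce the same sequence.
bool threadsAgree(unsigned seed, int count) {
    std::vector<int> a, b;
    std::thread ta([&a, seed, count] { a = generate(seed, count); });
    std::thread tb([&b, seed, count] { b = generate(seed, count); });
    ta.join();
    tb.join();
    return a == b;
}
```

The reentrant version needs no lock at all, which is the overhead saving the slide refers to.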

Activity 3 - Thread Safety
- Use OpenMP framework to call library routines concurrently
- Three library calls = 6 combinations to test
  - A:A, B:B, C:C, A:B, A:C, B:C
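
The six combinations can be generated mechanically: for n routines there are n(n+1)/2 unordered pairs once each routine is also paired with itself. The harness below is a sketch that runs each pair on two `std::thread` objects; it is a substitute for the OpenMP sections used in the activity. `NamedRoutine` and `runAllPairs` are hypothetical names, and the routines passed in are whatever library calls are under test.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical wrapper: a label plus the library call to exercise.
struct NamedRoutine {
    std::string name;
    std::function<void()> call;
};

// Runs every unordered pair of routines (including each routine with itself)
// on two concurrent threads and returns the labels of the pairs that were run.
std::vector<std::string> runAllPairs(const std::vector<NamedRoutine>& routines) {
    std::vector<std::string> tested;
    for (std::size_t i = 0; i < routines.size(); ++i) {
        for (std::size_t j = i; j < routines.size(); ++j) {
            std::thread first(routines[i].call);
            std::thread second(routines[j].call);
            first.join();
            second.join();
            tested.push_back(routines[i].name + ":" + routines[j].name);
        }
    }
    return tested;
}
```

Running the harness under the analysis tool still requires data sets that drive each routine through the code paths of interest; the harness only guarantees that every pairing is attempted.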

Intel® Parallel Inspector: What's Been Covered
- Threading errors are easy to introduce
- Debugging these errors by traditional techniques is hard
- Intel® Parallel Inspector catches these errors
  - Errors do not have to occur to be detected
  - Greatly reduces debugging time
  - Improves robustness of the application
