Intel Software College Tuning Threading Code with Intel® Thread Profiler for Explicit Threads.

Presentation transcript:

Intel Software College Tuning Threading Code with Intel® Thread Profiler for Explicit Threads

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Objectives
After successful completion of this module you will be able to:
- Use Thread Profiler to recognize and fix common performance problems in applications using Windows* threads

Agenda
- Look at Intel® Thread Profiler features
- Define Critical Path Analysis
- Examine Thread Profiler data views available
- Review common performance issues of multithreaded applications
  - Focus on load imbalance
  - Focus on synchronization contention
- Describe general optimizations to gain better performance

Motivation
- Developing efficient multithreaded applications is hard
- New performance problems are caused by the interaction between concurrent threads
  - Load imbalance
  - Contention on synchronization objects
  - Threading overhead

Intel® Thread Profiler
- Plugs in to the VTune performance environment
- Instrumentation-based data collector in VTune
- Identifies performance issues in OpenMP* or threaded applications using the Win32* API, POSIX* threads, and Intel® Threading Building Blocks
- Pinpoints performance bottlenecks that directly affect execution time

Intel® Thread Profiler Features
- Supports several different compilers
  - Intel® C++ and Fortran Compilers, v7 and higher
  - Microsoft* Visual* C++ .NET* 2002, 2003 & 2005 Editions
- Integrated into Microsoft Visual Studio .NET* IDE
- Binary instrumentation of applications
- Different views and filters available to assist and organize analysis
- Uses critical path analysis

What is the Critical Path?
- Threaded applications contain multiple execution flows
- A new flow is created when a thread is created or resumes
- A flow ends when a thread terminates or blocks on a synchronization primitive
- The critical path is the longest execution flow
[Figure: timeline from T0 to T15 for Threads 1, 2, and 3. Events shown: acquire lock L, wait for L, release L, Thread 1 waits for Threads 2 & 3, Thread 2 terminates, Thread 3 terminates, Thread 1 terminates.]
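To make "longest execution flow" concrete, here is a minimal sketch that treats flow segments as nodes in a dependency graph and computes the length of the longest chain. The `Segment` model and the function name are hypothetical, chosen for illustration; this is not a description of how Thread Profiler itself computes the critical path.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One execution flow segment: how long it runs and which earlier
// segments must finish before it can start (hypothetical model).
struct Segment {
    double duration;
    std::vector<int> preds;  // indices of predecessor segments
};

// Length of the longest dependency chain. Assumes segments are listed
// in topological order, i.e. every predecessor index is smaller.
double critical_path_length(const std::vector<Segment>& segs) {
    std::vector<double> finish(segs.size(), 0.0);
    double longest = 0.0;
    for (std::size_t i = 0; i < segs.size(); ++i) {
        double start = 0.0;
        for (int p : segs[i].preds) {
            assert(p >= 0 && static_cast<std::size_t>(p) < i);
            start = std::max(start, finish[p]);  // wait for the slowest predecessor
        }
        finish[i] = start + segs[i].duration;
        longest = std::max(longest, finish[i]);
    }
    return longest;
}
```

Shortening a segment that is not on the longest chain leaves the result unchanged, which is why the deck focuses tuning effort on the critical path.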

Critical Path Analysis
System utilization (relative to the system executing the application)
- Idle: no threads
- Serial: a single thread
- Under Utilized: more than one thread, fewer threads than cores
- Fully Utilized: # threads == # cores
- Over Utilized: # threads > # cores
Thread interaction categories
- Cruise: threads running without interference
- Overhead: thread operation overhead
- Blocking: thread waiting on external event
- Impact: thread preventing some other thread from executing
If the critical path is shortened, the application will run in less time
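The utilization categories above can be read as a simple classification rule. The sketch below encodes that rule directly; the enum and function names are illustrative assumptions, and the order of the checks (Idle and Serial first) is one reasonable reading of the slide, not a documented tool behavior.

```cpp
#include <cassert>

// Utilization categories as listed on the slide.
enum class Utilization { Idle, Serial, UnderUtilized, FullyUtilized, OverUtilized };

// Classify one instant of the run from the number of active threads
// and the number of cores. Both parameter names are illustrative.
Utilization classify(int active_threads, int cores) {
    assert(active_threads >= 0 && cores >= 1);
    if (active_threads == 0) return Utilization::Idle;
    if (active_threads == 1) return Utilization::Serial;
    if (active_threads < cores) return Utilization::UnderUtilized;
    if (active_threads == cores) return Utilization::FullyUtilized;
    return Utilization::OverUtilized;
}
```

Under this reading, a 2-core system never reports Under Utilized: one active thread is Serial and two is Fully Utilized.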

System Utilization
- Examines processor utilization to determine concurrency level of the application
- Concurrency is the number of active threads
[Figure: the Thread 1, 2, and 3 timeline (T0 to T15) with time on the horizontal axis, colored by concurrency level: Idle, Serial, Under Utilized, Fully Utilized, Over Utilized. Categorization shown for a system configuration with 2 processors.]
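Since concurrency is defined as the number of active threads, the time spent at each concurrency level can be derived from per-thread active intervals. The following is a sketch under that assumption; `Interval` and `concurrency_histogram` are hypothetical names, and the sweep over sorted start/end events is a generic technique rather than the tool's documented method.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// A half-open span [start, end) during which one thread is active.
struct Interval { double start, end; };

// time_at[k] = total time during which exactly k threads were active,
// measured between the first start and the last end.
std::vector<double> concurrency_histogram(const std::vector<Interval>& active) {
    std::vector<std::pair<double, int>> events;
    for (const Interval& iv : active) {
        assert(iv.start < iv.end);
        events.push_back({iv.start, +1});
        events.push_back({iv.end, -1});
    }
    // At equal timestamps an end (-1) sorts before a start (+1).
    std::sort(events.begin(), events.end());
    std::vector<double> time_at(active.size() + 1, 0.0);
    int level = 0;
    double prev = events.empty() ? 0.0 : events.front().first;
    for (const auto& e : events) {
        time_at[level] += e.first - prev;  // time spent at the current level
        prev = e.first;
        level += e.second;
    }
    return time_at;
}
```

Each entry of the result can then be labeled Idle, Serial, and so on by comparing its index with the number of processors.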

Execution Time Categories
- Analyze thread interaction and behavior along critical path
- Record objects that cause critical path (CP) transitions
[Figure: the Thread 1, 2, and 3 timeline (T0 to T15) with time on the horizontal axis, colored by thread interaction: Cruise time, Overhead, Blocking time, Impact time. Categorization shown for a system configuration with 2 processors.]

Merging Concurrency and Behavior
- Start with system utilization
- Further categorize by behavior
[Figure: the critical path over time, categorized by concurrency level and by thread behavior.]

Thread Profiler Views
- Critical Path View: shows breakdown of the critical path
- Profile View: shows the breakdown of selected critical paths
  - User can select other views of the selected profile: concurrency level, threads, objects
- Timeline View: shows thread activity and critical path transitions for the entire application
- Source View: transition source view, creation source view

Activity 1a
- Threaded version of potential code
- Is there a performance issue?
Goal
- Run application through Thread Profiler
- Examine thread activities by reviewing different views

Thread Profiler Profile View
[Screenshot labels: Profile Pane, Timeline Pane]

Profile Pane – Concurrency Level View
- Two threads ran in parallel ~33% of the time
- Ran single threaded ~65% of the time
- Let's look at the Thread View

Profile Pane – Thread View
[Screenshot callouts: time on the critical path, active time of the thread, lifetime of the thread]
- Let's look at the Object View

Profile Pane – Object View
[Screenshot callout: this object caused all of the impact]
- Let's look at the Timeline View

Timeline Pane

Source View

Activity 1b
- Threaded version of potential code
- Is there a performance issue?
Goal
- Examine thread activities by reviewing different views
- Determine system utilization
- Identify any performance issues

Review Activity 1
- Concurrency Level view can be used to determine system utilization by the application
- Timeline view enables you to understand the thread activity in your application
- Instrumentation time will be included in first run results; thus, for applications running in a short amount of time, a second run may produce more realistic timings

Common Performance Issues
- Load balance: improper distribution of parallel work
- Synchronization: excessive use of global data, contention for the same synchronization object
- Parallel overhead: due to thread creation, scheduling, and similar costs
- Granularity: not sufficient parallel work

Load Imbalance
- Unequal work loads lead to idle threads and wasted time
[Figure: timeline of Threads 0 to 3 between "Start threads" and "Join threads", showing busy and idle time for each thread.]

Redistribute Work to Threads: Static assignment
- Are the same number of tasks assigned to each thread?
- Do tasks take different processing time?
  - Do tasks change in a predictable pattern?
  - Rearrange (static) order of assignment to threads
  - Use dynamic assignment of tasks
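As an illustration of rearranging the static order of assignment, the sketch below compares two static schemes when task cost grows in a predictable pattern: contiguous blocks versus round-robin (cyclic) assignment. The function names and the cost model are assumptions made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-thread total cost when tasks are handed out in contiguous blocks.
std::vector<long> block_loads(const std::vector<long>& cost, int threads) {
    assert(threads > 0);
    std::vector<long> load(threads, 0);
    std::size_t chunk = (cost.size() + threads - 1) / threads;  // ceiling division
    for (std::size_t i = 0; i < cost.size(); ++i) load[i / chunk] += cost[i];
    return load;
}

// Per-thread total cost when task i goes to thread i % threads.
std::vector<long> cyclic_loads(const std::vector<long>& cost, int threads) {
    assert(threads > 0);
    std::vector<long> load(threads, 0);
    for (std::size_t i = 0; i < cost.size(); ++i) load[i % threads] += cost[i];
    return load;
}
```

With costs 1 through 8 on two threads, the block scheme gives loads of 10 and 26, while the cyclic scheme gives 16 and 20, so the same tasks end up noticeably better balanced without any runtime scheduling.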

Redistribute Work to Threads: Dynamic assignment
- Is there one big task being assigned?
  - Break up large task into smaller parts
- Are small computations agglomerated into larger task?
  - Adjust number of computations in a task
  - More small computations into single task?
  - Fewer small computations into single task?
  - Bin packing heuristics
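One common way to implement dynamic assignment is to let each thread claim the next chunk of work from a shared counter. The sketch below uses portable C++ (`std::thread`, `std::atomic`) instead of the Windows* threads API the deck targets; `dynamic_sum` and its `chunk` parameter are hypothetical names, with `chunk` playing the role of "number of computations in a task".

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Sum `items` with dynamic assignment: each worker repeatedly claims the
// next `chunk` items from a shared counter until none remain.
long dynamic_sum(const std::vector<long>& items, int num_threads, std::size_t chunk) {
    assert(num_threads > 0 && chunk > 0);
    std::atomic<std::size_t> next{0};
    std::atomic<long> total{0};
    auto worker = [&] {
        long local = 0;  // per-thread partial result
        for (;;) {
            std::size_t begin = next.fetch_add(chunk);  // claim one task
            if (begin >= items.size()) break;
            std::size_t end = std::min(begin + chunk, items.size());
            for (std::size_t i = begin; i < end; ++i) local += items[i];
        }
        total.fetch_add(local);  // one shared update per thread
    };
    std::vector<std::thread> pool;
    for (int t = 0; t < num_threads; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
    return total.load();
}
```

A larger `chunk` means fewer trips to the shared counter but coarser balancing; a smaller one balances better at the cost of more overhead, which is the trade-off the slide's questions point at.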

Unbalanced Workloads
[Screenshot callouts: threads are unbalanced; active times not equal]

Activity 2 – Load Imbalance
- Threaded version of potential code with thread pools
- Has a load balance performance issue

Review Activity 2
- Threads view can be used to determine activity levels of each thread within the application
- Timeline view enables you to understand the thread activity in your application

Synchronization
- By definition, synchronization serializes execution
- Lock contention means more idle time for threads
[Figure: timeline of Threads 0 to 3 showing busy, idle, and in-critical-region time for each thread.]

Synchronization Fixes
- Eliminate synchronization
  - Expensive but necessary evil
- Use storage local to threads
  - Use local variable for partial results, update global after local computations
  - Allocate space on thread stack (alloca)
  - Use thread-local storage API (TlsAlloc)
- Use atomic updates whenever possible
  - Some global data updates can use atomic operations (Interlocked API family)
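The "local variable for partial results" fix can be sketched with a numerical integration that approximates pi, similar in spirit to the integration activity later in the deck. This is an illustrative sketch in portable C++ rather than Win32; `integrate_pi` is a hypothetical name and the lab's actual code is not shown in the source.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Midpoint-rule integration of 4 / (1 + x^2) over [0, 1], which approximates pi.
// Each thread accumulates into a local variable and touches the shared sum once.
double integrate_pi(long steps, int num_threads) {
    assert(steps > 0 && num_threads > 0);
    const double width = 1.0 / steps;
    double sum = 0.0;     // shared result
    std::mutex sum_lock;  // protects sum
    auto worker = [&](int id) {
        double local = 0.0;  // private partial result, no locking needed
        for (long i = id; i < steps; i += num_threads) {
            double x = (i + 0.5) * width;
            local += 4.0 / (1.0 + x * x);
        }
        std::lock_guard<std::mutex> guard(sum_lock);
        sum += local;  // single protected update per thread
    };
    std::vector<std::thread> pool;
    for (int t = 0; t < num_threads; ++t) pool.emplace_back(worker, t);
    for (auto& th : pool) th.join();
    return sum * width;
}
```

Locking inside the loop instead would acquire the lock once per step rather than once per thread, which is the kind of contention the Object View is meant to expose.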

Atomic Updates
Use Win32 Interlocked* intrinsics in place of synchronization object

    static long counter;

    // Fast
    InterlockedIncrement(&counter);

    // Slower
    EnterCriticalSection(&cs);
    counter++;
    LeaveCriticalSection(&cs);
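For readers without a Windows* toolchain, the same idea can be tried with `std::atomic`, used here as a portable stand-in for the Interlocked* family. The function name and parameters are hypothetical, and relaxed ordering is chosen on the assumption that only the final count matters.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Increment a shared counter from several threads without a lock.
long count_with_atomic(int num_threads, long per_thread) {
    assert(num_threads > 0 && per_thread >= 0);
    std::atomic<long> counter{0};
    auto worker = [&] {
        for (long k = 0; k < per_thread; ++k) {
            // Atomic read-modify-write; no critical section needed.
            counter.fetch_add(1, std::memory_order_relaxed);
        }
    };
    std::vector<std::thread> pool;
    for (int t = 0; t < num_threads; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();  // join makes all increments visible
    return counter.load();
}
```

Calling `count_with_atomic(4, 10000)` should return exactly 40000, whereas a plain `long` incremented without protection could lose updates.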

Synchronization Fixes
- Reduce size of critical regions protected by synchronization object
  - Larger critical regions tie up sync objects longer; other threads sit idle longer waiting to acquire objects
  - Only accesses to shared variables need to be protected
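A minimal sketch of shrinking a critical region: the per-item computation runs outside the lock and only the shared container update is protected. The code uses `std::mutex` as a portable stand-in for a Win32 synchronization object, and `expensive` is a hypothetical placeholder for work that touches no shared state.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for per-item work that uses no shared data.
long expensive(long x) { return x * x + 1; }

// Apply `expensive` to every input; results arrive in unspecified order.
std::vector<long> transform_all(const std::vector<long>& in, int num_threads) {
    assert(num_threads > 0);
    std::vector<long> out;  // shared between threads
    std::mutex out_lock;    // protects out
    auto worker = [&](int id) {
        for (std::size_t i = id; i < in.size(); i += num_threads) {
            long r = expensive(in[i]);  // computed outside the lock
            std::lock_guard<std::mutex> guard(out_lock);
            out.push_back(r);           // only the shared access is protected
        }
    };
    std::vector<std::thread> pool;
    for (int t = 0; t < num_threads; ++t) pool.emplace_back(worker, t);
    for (auto& th : pool) th.join();
    return out;
}
```

Moving the call to `expensive` inside the guarded scope would still be correct, but it would hold the lock for the whole computation and serialize the threads.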

Synchronization Fixes
- Use best synchronization object for job
- Critical Section
  - Local object
  - Available to threads within the same process
  - Lower overhead (~8X faster than mutex)
- Mutex
  - Kernel object
  - Accessible to threads within different processes
  - Deadlock safety (can only be released by owner)
- Other objects are available

Object Contention
[Screenshot callouts: "This object caused all of the impact"; "What is all this?"; "These four threads… are impacting threads by this object"]

Activity 3
- Threaded version of numerical integration
- Has serious performance issues
Goal
- Understand thread activity
- Use the Thread Profiler groupings
- Examine synchronization and its effect on performance
- Fix performance issue

Review Activity 3
- Grouping objects and threads shows which objects impact which threads
- Apply the heuristics from labs for locating bottlenecks in the source code
- For longer running applications, the difference between first and second run times is negligible

General Optimizations
- Serial optimizations
  - Serial optimizations along the critical path should affect execution time
- Parallel optimizations
  - Reduce synchronization object contention
  - Balance workload
  - Functional parallelism
- Analyze benefit of increasing number of processors
- Analyze the effect of increasing the number of threads on scaling performance

Intel® Thread Profiler for Explicit Threads: What's Been Covered
- Identifying performance issues can be time consuming without tools
- Tools are required to understand and to optimize parallel efficiency and hardware utilization
- Thread Profiler helps you understand your application's thread activity, system utilization, and scaling performance
