Multiprocessing & The .Net Parallel Extensions. Guy Ben Haim, Senior Application Engineer, Intel. Asaf Shelly, Senior Consultant, Pacific Software.
Session Objectives and Agenda: Multicore; Parallel Software; .Net Parallel Extensions; Q&A; Summary
What is Multicore? Pentium Processor; Dual Core; Quad Core
Moore’s Law – GHz to Multicore. [Chart: performance over time, 2006 marked; Performance Through Frequency and Performance Through Multi-Core.] Intel MC assistance: threading, multi-tasking, training, tools.
Intel Processor Advancement: multiple execution cores ramping across Intel platforms
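Why Multi Core? Baseline CPU at 2 GHz: power 100%, performance 100%.
CPU that is 20% Faster: at 2.4 GHz, power 174%, performance 113% (baseline at 2 GHz: power 100%, performance 100%).
CPU that is 20% Slower: at 1.6 GHz, power 50%, performance 87% (2 GHz: power 100%, performance 100%; 2.4 GHz: power 174%, performance 113%).
Multi Core: Energy Efficient Performance. Dual core at 1.6 GHz: power 100%, performance 174% (single core at 2 GHz: power 100%, performance 100%; at 2.4 GHz: power 174%, performance 113%). Each 1.6 GHz core draws about 50% of the baseline power and delivers 87% of the baseline performance, so two of them fit in the original power budget and, when both are kept busy, deliver about 2 × 87% = 174%.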
What does Multi-Core mean? Performance
Start Thinking Parallel Software Today. Instructions – Assembly – Making it work: thinking like a CPU. Functions – C, Pascal, Basic – Faster code: procedural thinking. Objects – C++, Java, C#, Delphi, VB – Manage code: OOD, OOP, thinking in objects. Tasks – ? – Optimize runtime: thinking parallel.
Situation Today: Experts, freelance specialists, skilled groups; API is not intuitive; Hard to understand execution flow; Problematic design patterns; Little awareness of tools; Hidden problems; Hard to test and debug
Understanding Parallel Computing: Resources Ownership; Global data / shared data; Collisions and race conditions; Task design; Conjunction points
Task Oriented Design. [Diagram of tasks: Modify, Write, Open, Modify, Scan]
Simple For:
for ( int y = 0; y < bmp.Height; y++ )
{
    for ( int x = 0; x < bmp.Width; x++ )
    {
        Pixels[ x, y ] = bmp.GetPixel( x, y );
    }
}
Parallel For:
Parallel.For( 0, bmp.Height, y =>
{
    for ( int x = 0; x < bmp.Width; x++ )
    {
        Pixels[ x, y ] = bmp.GetPixel( x, y );
    }
});
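A minimal sketch of the row-parallel loop above, written against the released `System.Threading.Tasks.Parallel.For`. The class name `PixelCopy` and the plain `int[,]` source are assumptions for illustration: they stand in for the slide's `bmp` and `Pixels`, because `System.Drawing.Bitmap` is generally not safe to call from several threads at once.

```csharp
using System;
using System.Threading.Tasks;

public static class PixelCopy
{
    // Hypothetical stand-in for the slide's bitmap copy: the source is a plain
    // array, which is safe to read from several threads at the same time.
    public static int[,] CopyParallel(int[,] source)
    {
        int width = source.GetLength(0);
        int height = source.GetLength(1);
        var pixels = new int[width, height];

        // Each iteration owns one row y, so no two iterations write the same element.
        Parallel.For(0, height, y =>
        {
            for (int x = 0; x < width; x++)
            {
                pixels[x, y] = source[x, y];
            }
        });

        return pixels;
    }
}
```

Only the outer loop is parallelized; the inner loop over `x` stays sequential, so each unit of work is a whole row.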
.Net Parallel Extensions - Performance
Parallel Class: Parallel.For; Parallel.Do; Parallel.ForEach. In-place code / function. Object type.
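The "in-place code / function" distinction can be sketched with `Parallel.ForEach`, whose body may be a lambda written at the call site or a named method passed as a delegate. `WorkItem`, `ForEachDemo` and `Square` are hypothetical names, not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class WorkItem
{
    public int Input;
    public int Output;
}

public static class ForEachDemo
{
    // Named function that can serve as a loop body.
    public static void Square(WorkItem item)
    {
        item.Output = item.Input * item.Input;
    }

    // In-place code: the body is a lambda at the call site.
    public static void SquareAllInPlace(List<WorkItem> items)
    {
        Parallel.ForEach(items, item => { item.Output = item.Input * item.Input; });
    }

    // Function: the body is an existing method wrapped in a delegate.
    public static void SquareAllWithFunction(List<WorkItem> items)
    {
        Action<WorkItem> body = Square;
        Parallel.ForEach(items, body);
    }
}
```

In both variants each iteration writes only to its own `WorkItem`, so the iterations share no mutable data.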
Parallel Do: Parallel Quick Sort:
void QuicksortParallel(,, )
{
    int pivot = Partition(arr, left, right);
    Parallel.Do(
        () => QuicksortParallel(arr, left, pivot - 1),
        () => QuicksortParallel(arr, pivot + 1, right));
}
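A runnable sketch of the quicksort above. In the released .NET Framework 4 the method corresponding to the preview's `Parallel.Do` is `Parallel.Invoke`, which is what this version calls. The `int[]` element type, the base case, the `SequentialThreshold` cutoff and the `Partition` body are assumptions added for illustration; the slide does not show them.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelSort
{
    // Assumed cutoff: ranges smaller than this recurse sequentially,
    // because spawning parallel work for tiny ranges costs more than it saves.
    private const int SequentialThreshold = 2048;

    public static void QuicksortParallel(int[] arr, int left, int right)
    {
        if (left >= right) return; // base case, not shown on the slide

        int pivot = Partition(arr, left, right);

        if (right - left < SequentialThreshold)
        {
            QuicksortParallel(arr, left, pivot - 1);
            QuicksortParallel(arr, pivot + 1, right);
        }
        else
        {
            // The two halves touch disjoint index ranges, so they can run in parallel.
            Parallel.Invoke(
                () => QuicksortParallel(arr, left, pivot - 1),
                () => QuicksortParallel(arr, pivot + 1, right));
        }
    }

    // Lomuto-style partition using the middle element as pivot (assumed scheme).
    private static int Partition(int[] arr, int left, int right)
    {
        int mid = left + (right - left) / 2;
        Swap(arr, mid, right);
        int pivotValue = arr[right];
        int store = left;
        for (int i = left; i < right; i++)
        {
            if (arr[i] < pivotValue)
            {
                Swap(arr, i, store);
                store++;
            }
        }
        Swap(arr, store, right);
        return store;
    }

    private static void Swap(int[] arr, int i, int j)
    {
        int tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }
}
```

`Parallel.Invoke` returns only after both actions have finished, so each call level acts as its own join point.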
PLINQ
.Net Parallel Extensions – PLINQ
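As a small illustration of PLINQ, the sketch below turns an ordinary LINQ query into a parallel one with `AsParallel()`. `PlinqDemo` and its method names are hypothetical; the operators used (`AsParallel`, `AsOrdered`, `Where`, `Select`, `Sum`, `ToArray`) are the standard `System.Linq` ones.

```csharp
using System;
using System.Linq;

public static class PlinqDemo
{
    // Sum of the squares of the even numbers in 1..count.
    // The result does not depend on the order in which elements are processed.
    public static long SumOfEvenSquares(int count)
    {
        return Enumerable.Range(1, count)
            .AsParallel()
            .Where(n => n % 2 == 0)
            .Select(n => (long)n * n)
            .Sum();
    }

    // Same query, but AsOrdered() asks PLINQ to preserve the source order
    // in the output, at some cost in throughput.
    public static long[] EvenSquaresInOrder(int count)
    {
        return Enumerable.Range(1, count)
            .AsParallel()
            .AsOrdered()
            .Where(n => n % 2 == 0)
            .Select(n => (long)n * n)
            .ToArray();
    }
}
```

For example, `PlinqDemo.SumOfEvenSquares(10)` adds 4 + 16 + 36 + 64 + 100 and returns 220.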
Task Parallel Library: Parallel For, Do, ForEach; PLINQ; Tasks over threads; Tasks over cores; TaskManager; Conjunction points
.Net Parallel Extensions – Task Parallel Library
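A minimal sketch of tasks with a conjunction (join) point, using the released `System.Threading.Tasks` API and the default scheduler. `TaskDemo`, `SumOfTwoHalves` and `SumRange` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskDemo
{
    public static int SumOfTwoHalves(int[] data)
    {
        int mid = data.Length / 2;

        // Two independent tasks; the runtime decides which threads and cores run them.
        Task<int> lower = Task.Run(() => SumRange(data, 0, mid));
        Task<int> upper = Task.Run(() => SumRange(data, mid, data.Length));

        // Conjunction point: continue only when both tasks have completed.
        Task.WaitAll(lower, upper);

        return lower.Result + upper.Result;
    }

    // Sums data[from] .. data[to - 1]; each task reads its own range only.
    private static int SumRange(int[] data, int from, int to)
    {
        int sum = 0;
        for (int i = from; i < to; i++)
        {
            sum += data[i];
        }
        return sum;
    }
}
```

The code describes units of work and where they must meet; it never creates or names a thread.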
.Net Parallel Extensions – RayTracer
Tips: Shared variables are globals; Parallel loops are not loops; Define data as loop-internal; Race conditions are still here; Don’t use locks!! Don’t use mutexes
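One way to read "define data as loop-internal" and "don't use locks" is sketched below: each worker accumulates into its own local value, and the shared total is touched once per worker with an atomic add rather than a lock on every iteration. `SumDemo` is a hypothetical name; the `Parallel.For<TLocal>` overload with `localInit` and `localFinally` is part of the released library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SumDemo
{
    public static long ParallelSum(int[] data)
    {
        long total = 0; // shared: written only in the final step of each worker

        Parallel.For<long>(
            0, data.Length,
            () => 0L,                                   // loop-internal accumulator, one per worker
            (i, loopState, local) => local + data[i],   // no shared writes inside the loop body
            local => Interlocked.Add(ref total, local)  // one atomic add per worker
        );

        return total;
    }
}
```

Writing `total += data[i]` directly inside the body would compile and run, but it is a race condition: two workers can read the same old value and one update is lost.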
Threading Tools: Intel® Thread Checker, used to create correct multi-threaded code; Intel® Thread Profiler, used to analyze performance. Intel Software Solutions Group:
Data Race example: serial program. S1: x = 1.0; y = 2.0; A1 = 0; S2: A1 = x * y; S3: A_SUM = 2 * A1; What is the value of A_SUM? A_SUM = 4.
Data Race example (Cont.): Initiate x = 1.0; y = 2.0; A1 = 0; Thread1: A1 = x * y; Thread2: A_SUM = 2 * A1; What is the value of A_SUM if Thread1 runs before Thread2? A_SUM = 4. If Thread2 runs before Thread1? A_SUM = 0. Execution order is not guaranteed.
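The race above goes away once the dependency between the two statements is made explicit. The sketch below is one possible fix, not the one shown in the session: the second computation is expressed as a continuation of the first, so it cannot start before `A1` exists. `DataRaceDemo` and `ComputeOrdered` are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class DataRaceDemo
{
    public static double ComputeOrdered()
    {
        double x = 1.0, y = 2.0;

        // Plays the role of Thread1: produces A1 = x * y.
        Task<double> a1 = Task.Run(() => x * y);

        // Plays the role of Thread2: scheduled only after a1 has completed,
        // so it always sees the computed A1.
        Task<double> aSum = a1.ContinueWith(t => 2 * t.Result);

        return aSum.Result; // blocks until the chain has finished
    }
}
```

With the ordering enforced, `ComputeOrdered()` returns 4 on every run, matching the serial program.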
Intel® Thread Checker Diagnostics
Source Code Viewer
Performance Profile. [Chart: speedup vs. number of threads.] Possible causes for this scalability profile: 1. Insufficient parallel work 2. Load imbalance 3. Synchronization overhead 4. Memory bandwidth limitations
Finding Serial and Parallel Time
Load Imbalance: Multi-threading should be managed. Programming should consider load imbalance.
Load Imbalance: Unequal work loads lead to idle threads and wasted time. [Timeline: Threads 0–3 from "start threads" to "join threads", each shown as busy or idle over time.]
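The effect of unequal work loads can be estimated with a small calculation. The sketch below is a model, not a benchmark: given a cost per work item, it compares the finish time when items are split into equal contiguous blocks with the finish time when each item goes to whichever worker becomes free first. `LoadBalanceModel` and both method names are hypothetical, and costs are in arbitrary time units.

```csharp
using System;
using System.Linq;

public static class LoadBalanceModel
{
    // Static split: worker w gets one contiguous block of items.
    // The finish time is the busy time of the most loaded worker.
    public static int StaticFinishTime(int[] costs, int workers)
    {
        int block = (costs.Length + workers - 1) / workers;
        int worst = 0;
        for (int w = 0; w < workers; w++)
        {
            int busy = 0;
            int end = Math.Min((w + 1) * block, costs.Length);
            for (int i = w * block; i < end; i++)
            {
                busy += costs[i];
            }
            worst = Math.Max(worst, busy);
        }
        return worst;
    }

    // Dynamic assignment: the next item goes to the worker that is free earliest.
    public static int DynamicFinishTime(int[] costs, int workers)
    {
        var freeAt = new int[workers];
        foreach (int cost in costs)
        {
            int earliest = 0;
            for (int w = 1; w < workers; w++)
            {
                if (freeAt[w] < freeAt[earliest]) earliest = w;
            }
            freeAt[earliest] += cost;
        }
        return freeAt.Max();
    }
}
```

With four expensive items (cost 8) followed by twelve cheap ones (cost 1) on four workers, the static split finishes at time 32 because one worker receives all the expensive items, while dynamic assignment finishes at time 11, which equals the total work of 44 divided by four.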
Synchronization: Programming should consider synchronization issues
Synchronization: By definition, synchronization serializes execution. Lock contention means more idle time for threads. [Timeline: Threads 0–3 over time, each shown as busy, idle, or in a critical section.]
Real example, before fix: [Profile: serial time, parallel time, switching overhead]
Real example, after fix: [Profile: serial time, parallel time] 2x speedup
Summary: Parallelize or Perish!
Do we really want Parallel Code? Do users even care?
Change in Mindset: Everything is stopped, waiting for the photographer. Everyone is working independently.
Developers are writing functions. Developers are managing tasks.
Doing things the way we always have. Things are going to be different.
Keep yourself in the loop: Public event by Pacific Software; Register for the User Group; The Asynchronous Operations Web Site has all the online resources that you need... and more; Register for my five-day course titled Multiprocessing Traps and Pitfalls; Use our poster to let people know that you know
Resources: Download the Microsoft .Net Parallel Extensions bc7f180ba&displaylang=en; Asynchronous Operations Web Site; Intel’s Multicore; Pacificsoft Training and Consulting; Microsoft Forum for Parallel Computing
Make a difference: Let us know what you think; Feedback for the .Net Parallel Extensions Dev team; Video blog about parallel computing; Fill the feedback form …
It pays to fill in feedback! How? Via the email sent at the end of each day, at the Business Center in the HP area, or at the internet stations in the Hilton and Dan hotels. Filled in feedback? You get a Live It! shirt. Filled in feedback on all three days of the conference? You have a chance to win a flight ticket to Thailand courtesy of the BTC agency, a BlackJack device courtesy of Samsung, an HTC device courtesy of Newpan, a Media Center courtesy of DataSafe, and more...
© 2007 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.