Published by Edwin Hopkins. Modified over 9 years ago.
1
Function Level Parallelism Driven by Data Dependencies
By Sean Rul, Hans Vandierendonck, Koen De Bosschere
dasCMP 2006, December 10
2
Eras in Processor Architecture
[Figure: processor performance in MIPS (log scale, 0.01 to 1,000,000) versus year (1970 to 2015). Processors marked: 4004, 8086, 286, 386, 486, i386, Pentium 4, Conroe. Annotations: March 2005; Increased Parallel Power.]
Eras labelled in the figure:
  Pipelined architecture
  Superscalar, speculative: instruction level parallelism
  HyperThreaded, multi-threaded, multi-core: thread & processor level parallelism with special purpose HW
3
Do we have the software?
Embarrassingly parallel:
  Ray tracing
  Bioinformatics
Inherently sequential:
  Old(er) programs
  Broad range of general purpose applications
4
What can we do? Parallelize …
By hand:
  Time consuming
Conservative:
  Correct static analysis
  Restrictions limit the amount of function level parallelism
Non-conservative:
  Assume perfect knowledge of dependencies
  Requires verification: programmer feedback, or speculative hardware/software support
  Reveals more function level parallelism
5
Framework: overview
[Diagram boxes: Sequential program; Profiling; Abstraction; Matching patterns; Possible parallelization; Multi-threaded program]
6
Framework: collect information
[Diagram boxes: Sequential program; Profiling; Abstraction; Matching patterns; Possible parallelization; Multi-threaded program]
7
Profiling
Program execution trace: loads and stores
Data dependencies at function level, for example:
  F1: Store X
  F2: Load X
[Table, one per address/object X. Columns: Producer; Times produced; Consumer (F1, F2, F3). Cell values as extracted, column alignment lost: F11, F2212, F32]
Not safe: the information may be input dependent!
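The idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' profiler. The trace format (function, operation, address) and the name `function_level_dependencies` are assumptions made for the example. The sketch remembers the last function that stored to each address and counts a producer/consumer pair each time a load reads that address.

```python
from collections import defaultdict

def function_level_dependencies(trace):
    """Count producer -> consumer pairs from a load/store trace.

    `trace` is a hypothetical format: an iterable of
    (function_name, "load" | "store", address) tuples in execution order.
    """
    last_writer = {}          # address -> function that last stored to it
    deps = defaultdict(int)   # (producer, consumer) -> number of loads observed
    for func, op, addr in trace:
        if op == "store":
            last_writer[addr] = func
        elif op == "load" and addr in last_writer:
            deps[(last_writer[addr], func)] += 1
    return dict(deps)

# Made-up trace: F1 stores X, F2 loads X, F2 stores Y, F3 loads Y twice.
trace = [
    ("F1", "store", "X"),
    ("F2", "load", "X"),
    ("F2", "store", "Y"),
    ("F3", "load", "Y"),
    ("F3", "load", "Y"),
]
deps = function_level_dependencies(trace)
```

As the slide warns, the result only describes the inputs that were profiled. A different input may exercise dependencies that never show up in the trace.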
8
Framework: convert information
[Diagram boxes: Sequential program; Profiling; Abstraction; Matching patterns; Possible parallelization; Multi-threaded program]
9
Control Flow Graph
[Figure: graph over the functions m, f, g, h, i, j, k, l. Edges are annotated with the number of executions (x 10, x 20, x 30, x 100); nodes are annotated with their share of execution time (1%, 10%, 14%, 15%, 20%).]
10
Interprocedural Data Flow Graph
[Figure: graph over the functions m, f, g, h, i, j, k, l. Legend: intercluster data stream; intracluster data stream.]
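The distinction in the legend can be illustrated as follows. This is a sketch under stated assumptions: the edges and the cluster assignment below are invented for the example and are not read off the figure, and `classify_streams` is a hypothetical helper name. A data stream counts as intracluster when its producer and consumer fall in the same cluster, and as intercluster otherwise.

```python
def classify_streams(edges, cluster_of):
    """Split producer -> consumer edges into intra- and intercluster streams.

    `edges` is a list of (producer, consumer) function names;
    `cluster_of` maps each function name to a cluster id (both hypothetical).
    """
    intra, inter = [], []
    for producer, consumer in edges:
        if cluster_of[producer] == cluster_of[consumer]:
            intra.append((producer, consumer))
        else:
            inter.append((producer, consumer))
    return intra, inter

# Invented example: f and g form one cluster, h and i another.
cluster_of = {"f": 0, "g": 0, "h": 1, "i": 1}
edges = [("f", "g"), ("g", "h"), ("h", "i")]
intra, inter = classify_streams(edges, cluster_of)
```

Only the intercluster streams would need communication between threads if each cluster were mapped to its own thread.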
11
Classifying data dependencies
12
Data Sharing Graph
[Figure: graph connecting the functions m, f, g, h, i, j, k, l to the data structures ds 1 to ds 9.]
Legend:
  Rectangular node: function
  Elliptic node: data structure
  Edges: read or write
  Data structures are either cluster private or cluster shared
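One way to read the private/shared distinction is sketched below. The access list, the cluster assignment, and the helper name `classify_data_structures` are all assumptions made for illustration, not data from the figure. A data structure is treated as cluster private when every function that touches it belongs to a single cluster, and as cluster shared otherwise.

```python
from collections import defaultdict

def classify_data_structures(accesses, cluster_of):
    """Label each data structure as cluster private or cluster shared.

    `accesses` is a hypothetical list of (function, data_structure, mode)
    tuples with mode "read" or "write"; `cluster_of` maps functions to clusters.
    """
    clusters_touching = defaultdict(set)
    for func, ds, _mode in accesses:
        clusters_touching[ds].add(cluster_of[func])
    return {
        ds: "cluster private" if len(clusters) == 1 else "cluster shared"
        for ds, clusters in clusters_touching.items()
    }

# Invented example: ds1 stays inside cluster 0, ds2 crosses into cluster 1.
cluster_of = {"f": 0, "g": 0, "h": 1}
accesses = [
    ("f", "ds1", "write"),
    ("g", "ds1", "read"),
    ("g", "ds2", "write"),
    ("h", "ds2", "read"),
]
kinds = classify_data_structures(accesses, cluster_of)
```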
13
Framework: analyze information
[Diagram boxes: Sequential program; Profiling; Abstraction; Matching patterns; Possible parallelization; Multi-threaded program]
14
Parallel Constructs
Pipeline constructs:
  Finding unidirectional producer/consumer relations
  Using the Control Flow Graph and the Interprocedural Data Flow Graph
Other constructs, such as master/slave, are possible
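A minimal sketch of what "unidirectional producer/consumer relations" could mean in code is given below. The helper names and the example dependencies are hypothetical, and the slides do not say how the actual pattern matching is implemented. The first function keeps only pairs where data flows one way; the second checks that a proposed stage order never sends data backwards.

```python
def unidirectional_relations(deps):
    """Return producer -> consumer pairs with no flow in the reverse direction."""
    pairs = set(deps)
    return sorted(
        (p, c) for p, c in pairs if p != c and (c, p) not in pairs
    )

def is_pipeline(stages, deps):
    """True if every dependency goes from an earlier (or the same) stage to a later one.

    Assumes every function named in `deps` also appears in `stages`.
    """
    position = {stage: i for i, stage in enumerate(stages)}
    return all(position[p] <= position[c] for p, c in deps)

# Invented example: A feeds B; B and C feed each other.
deps = [("A", "B"), ("B", "C"), ("C", "B")]
one_way = unidirectional_relations(deps)
forward_only = is_pipeline(["A", "B", "C"], [("A", "B"), ("B", "C")])
with_feedback = is_pipeline(["A", "B", "C"], deps)
```

In this example B and C exchange data in both directions, so they would have to share a stage (or be synchronized) rather than form two separate pipeline stages.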
15
Parallelization
Using the Data Sharing Graph
Communication entails sequentiality
Which data can be made private?
Techniques named on the slide: synchronization, duplication
16
Implementing pipeline constructs
[Figure with two variants.]
Heterogeneous threads: labels Thread 1 to Thread 4 and Stage 1 to Stage 4
Homogeneous threads: labels Thread 1 to Thread 3 and Iteration 1 to Iteration 3
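Reading the heterogeneous variant as "one thread per pipeline stage", a pipeline can be sketched with Python's standard `threading` and `queue` modules. This is an illustration only: the evaluation in the talk used C with linuxthreads, and the function name `run_pipeline` and the toy stages are assumptions. Each stage runs in its own thread and hands its results to the next stage through a FIFO queue.

```python
import queue
import threading

_DONE = object()  # sentinel that tells each stage the input is finished

def run_pipeline(stages, items):
    """Run `items` through `stages`, one thread per stage, preserving order."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]

    def worker(stage, inbox, outbox):
        while True:
            item = inbox.get()
            if item is _DONE:
                outbox.put(_DONE)   # pass the shutdown signal downstream
                return
            outbox.put(stage(item))

    threads = [
        threading.Thread(target=worker, args=(stage, queues[i], queues[i + 1]))
        for i, stage in enumerate(stages)
    ]
    for t in threads:
        t.start()

    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)

    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results

# Toy two-stage pipeline: add one, then double.
results = run_pipeline([lambda x: x + 1, lambda x: x * 2], [1, 2, 3])
```

Because each stage is a single thread reading from a FIFO queue, results come out in input order. In CPython the global interpreter lock limits the speedup for CPU-bound stages, so the sketch shows the structure rather than the performance.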
17
Evaluation of Bzip2
Bzip2 from SPEC2000 with the reference input program
Platform: quad Itanium processor, Linux 2.4.20, GCC 2.96 with linuxthreads
Analysis of the compression part reveals a pipeline of 4 stages
Analysis of the decompression part reveals a pipeline of 2 stages
Manual verification
18
Speedup results for parallelized Bzip2
Compression speedup is 3.60
Decompression speedup is 1.41
Global speedup is 2.45
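The three numbers can be related with a small calculation. This rests on an assumption the slides do not state: that the global speedup is the total sequential time divided by the total parallel time over a run that compresses and then decompresses. Under that assumption, if a fraction f of the sequential time is spent compressing, the global speedup is 1 / (f / 3.60 + (1 - f) / 1.41). The function names below are hypothetical.

```python
def combined_speedup(frac_compress, s_compress, s_decompress):
    """Overall speedup when a fraction of sequential time is compression."""
    return 1.0 / (
        frac_compress / s_compress + (1.0 - frac_compress) / s_decompress
    )

def implied_fraction(s_global, s_compress, s_decompress):
    """Solve the formula above for the compression fraction."""
    return (1.0 / s_decompress - 1.0 / s_global) / (
        1.0 / s_decompress - 1.0 / s_compress
    )

# Speedups reported on the slide.
f = implied_fraction(2.45, 3.60, 1.41)
check = combined_speedup(f, 3.60, 1.41)
```

Under this reading, the reported figures would be consistent with compression taking about 70% of the sequential run time. That share is derived from the assumption, not reported in the talk.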
19
Conclusion
Non-deterministic parallelization reveals more function level parallelism
Framework in a nutshell:
  Profiling control and data flow
  Detecting function-level parallelism with the call graph and the interprocedural data flow graph
  The data sharing graph reveals communication requirements
Evaluation for Bzip2 resulted in a global speedup of 2.45