Instructor’s Intent for this course


1 Instructor’s Intent for this course

2 Making connections between concepts
Concepts shown: Cache, C, Pipelining, Locality, Code, tf, Threads, Parallelism, MPI, OpenMP, Locks, Depend.

3 Challenge Question
for (i = 0; i < 100000; i++)
    a[i + 1000] = a[i] + 1;
Dependences run along the chains a[0], a[1000], a[2000] … and a[1], a[1001], a[2001] …
The “dependence distance” is 1000.
First idea: make the “dependence distance” fall outside of the loop.

4 General Example
for (i = 0; i < n; i++)
    a[i+m] = a[i] + 1;
For m > 0: dependence between iterations if m < n.
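The condition on this slide can be checked mechanically. The sketch below is only an illustration, restricted to m >= 0; the function names are hypothetical and not part of the slides. It compares the closed-form test with a brute-force search for an iteration that reads an element written by an earlier iteration.

```c
#include <stdbool.h>

/* Closed-form condition, assuming m >= 0: iteration i writes a[i+m],
   and iteration i+m reads that element if it is still inside the loop. */
bool has_flow_dependence(int m, int n) {
    return m > 0 && m < n;
}

/* Brute-force check of the same property: is there a later iteration r
   that reads the element an earlier iteration w wrote? */
bool has_flow_dependence_brute(int m, int n) {
    for (int w = 0; w < n; w++)
        for (int r = w + 1; r < n; r++)
            if (w + m == r)   /* w writes a[w+m], r reads a[r] */
                return true;
    return false;
}
```

For the challenge question, m = 1000 and n = 100000, so both functions report a dependence. With n = 1000 the first read of a written element would fall outside the loop, and both report none.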

5 Answer 1
for (i = 0; i < 100; i++)
    #pragma omp parallel for
    for (j = i*1000; j < (i+1)*1000; j++)
        a[j+1000] = a[j] + 1;
Not ideal: parallelizes only the inner loop.
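A compilable sketch of Answer 1, assuming the array holds at least 101000 ints; the constants DIST and BLOCKS and the function name are illustrative. It uses no OpenMP runtime calls, so it also builds and runs serially when OpenMP is not enabled.

```c
enum { DIST = 1000, BLOCKS = 100 };

/* Assumes a[] holds at least (BLOCKS + 1) * DIST ints. */
void answer1(int *a) {
    for (int i = 0; i < BLOCKS; i++) {
        /* Inside one block, the reads a[j] and the writes a[j + DIST]
           touch disjoint ranges, so these iterations are independent. */
        #pragma omp parallel for
        for (int j = i * DIST; j < (i + 1) * DIST; j++)
            a[j + DIST] = a[j] + 1;
    }
}
```

The blocks themselves still run one after another, and a parallel loop is started 100 times, which is one way to read the slide's "not ideal" remark.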

6 Answer 2
#pragma omp parallel for private(j)
for (i = 0; i < 1000; i++)
    for (j = 0; j < 100; j++)
        a[i + (j+1)*1000] = a[i + j*1000] + 1;
Not ideal: the same thread “jumps” through the array.
Poor cache locality, which leads to high false sharing (because the array is written “interspersed” by different threads).
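A compilable sketch of Answer 2 under the same assumptions (array of at least 101000 ints, illustrative names). Declaring j inside the loop gives each thread its own copy, since OpenMP makes only the parallelized loop's own index private by default.

```c
enum { DIST = 1000, BLOCKS = 100 };

/* Assumes a[] holds at least (BLOCKS + 1) * DIST ints. */
void answer2(int *a) {
    /* Each value of i owns one dependence chain:
       a[i], a[i + DIST], a[i + 2*DIST], ... */
    #pragma omp parallel for
    for (int i = 0; i < DIST; i++)
        for (int j = 0; j < BLOCKS; j++)   /* walk the chain in order */
            a[i + (j + 1) * DIST] = a[i + j * DIST] + 1;
}
```

Each chain is walked serially with a stride of 1000 elements, which is the "jumping" access pattern the slide criticizes.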

7 False Sharing Example
One cache line holds a[0] a[1] a[2] a[3] a[4] a[5] a[6].
Some of its elements are written by processor 0, others by processor 1.
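One common way to avoid the situation on this slide is to pad per-worker data so that no two workers write to the same cache line. The sketch below is an illustration, not something taken from the slides: the 64-byte line size, the number of slots, and all names are assumptions.

```c
#include <stddef.h>

#define CACHE_LINE 64   /* assumed line size in bytes; hardware-dependent */
#define NSLOTS 4        /* hypothetical number of workers */

/* One counter per worker, padded to CACHE_LINE bytes so that the value
   fields of two different slots never fall in the same cache line. */
typedef struct {
    long value;
    char pad[CACHE_LINE - sizeof(long)];
} padded_counter;

/* Counts the positive entries of data[0..n-1] using NSLOTS separate slots. */
long count_positive(const int *data, int n) {
    padded_counter slot[NSLOTS] = {{0}};
    #pragma omp parallel for
    for (int t = 0; t < NSLOTS; t++) {
        int lo = (int)((long)n * t / NSLOTS);
        int hi = (int)((long)n * (t + 1) / NSLOTS);
        for (int k = lo; k < hi; k++)
            if (data[k] > 0)
                slot[t].value++;   /* slot t is written by one worker only */
    }
    long total = 0;
    for (int t = 0; t < NSLOTS; t++)
        total += slot[t].value;
    return total;
}
```

Without the pad field, the four counters would sit next to each other in memory and could share one line, which is the pattern shown on the slide.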

8 Answer 3
#pragma omp parallel for private(j, stride)
for (i = 1; i <= 100; i++) {
    stride = i*1000;
    for (j = 0; j < 1000; j++)
        a[stride+j] = a[j] + i;
}
Maintain the “intent” of the code, but completely restructure it.
For i in 1000 … 1999, a[i] is a[i-1000] incremented by 1.
For i in 2000 … 2999, a[i] is a[i-2000] incremented by 2, etc.
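A compilable sketch of Answer 3, with illustrative names and an array of at least 101000 ints assumed. The outer loop here runs through i = 100 so that the last block, a[100000] … a[100999], is written as it is by the serial loop of the challenge question. Declaring stride and j inside the loop keeps them private to each thread.

```c
enum { DIST = 1000, BLOCKS = 100 };

/* Assumes a[] holds at least (BLOCKS + 1) * DIST ints. */
void answer3(int *a) {
    #pragma omp parallel for
    for (int i = 1; i <= BLOCKS; i++) {
        int stride = i * DIST;          /* start of the block written here */
        for (int j = 0; j < DIST; j++)
            a[stride + j] = a[j] + i;   /* reads only a[0..DIST-1] */
    }
}
```

Every iteration reads only the first 1000 elements, which no iteration writes, so the outer loop carries no dependence and each thread writes a contiguous block.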

9 Answer 4 (additional 0.2% bonus)
#pragma omp parallel for
for (i = 1000; i < ; i++)
    a[i] = a[i%1000] + i/1000;
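A compilable sketch of Answer 4. The loop's upper bound is not shown in the transcript; the bound N + DIST = 101000 used here is an assumption, chosen because the serial loop of the challenge question writes up to a[100999]. The names are illustrative.

```c
enum { DIST = 1000, N = 100000 };

/* Assumes a[] holds at least N + DIST ints. */
void answer4(int *a) {
    /* ASSUMED bound: N + DIST, so the same elements are written as by
       the serial loop a[i + DIST] = a[i] + 1 for i = 0 .. N-1. */
    #pragma omp parallel for
    for (int i = DIST; i < N + DIST; i++)
        a[i] = a[i % DIST] + i / DIST;   /* chain start plus chain position */
}
```

Here i % 1000 picks the start of the element's dependence chain and i / 1000 is its position along the chain, so a single flat loop replaces the nested one.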

