1
Optimization: The Art of Computing
Intel Challenge experience and other tricks … Mathieu Gravey
2
Golden principle of Optimizing
[Figure: Algorithm, Implementation, Hardware; axis labels partly lost in extraction ("…-term", "Performance")]
3
Example: Prime Number Algorithm
For i=2 to N
    bool isPrime=true;
    For j=2 to N
        If (mod(i,j)==0 and i != j)
            isPrime=false;
            break;
        end if
    end for
    if (isPrime)
        add i to the listOfPrimeNumber
End for
4
Example: Prime Number Algorithm
For i=2 to N
    bool isPrime=true;
    For j=2 to i-1
        If (mod(i,j)==0)
            isPrime=false;
            break;
        end if
    end for
    if (isPrime)
        add i to the listOfPrimeNumber
End for
5
Example: Prime Number Algorithm
For i=2 to N
    bool isPrime=true;
    For j=2 to √i
        If (mod(i,j)==0)
            isPrime=false;
            break;
        end if
    end for
    if (isPrime)
        add i to the listOfPrimeNumber
End for
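A minimal C sketch of the √i version above. The function names `is_prime_trial` and `list_primes` are illustrative, not from the slides, and the bound is written as `j <= n / j` (an assumption made here to avoid floating-point `sqrt` and overflow of `j * j`).

```c
#include <stdbool.h>
#include <stddef.h>

/* Trial division: n is prime if no j in [2, sqrt(n)] divides it.
   "j <= n / j" is the integer form of "j <= sqrt(n)". */
static bool is_prime_trial(unsigned n)
{
    if (n < 2)
        return false;
    for (unsigned j = 2; j <= n / j; j++) {
        if (n % j == 0)
            return false;   /* early exit, like the "break" in the pseudocode */
    }
    return true;
}

/* Stores the primes <= limit in out (capacity cap) and returns how many
   were stored. Hypothetical helper; limit is assumed well below UINT_MAX. */
static size_t list_primes(unsigned limit, unsigned *out, size_t cap)
{
    size_t count = 0;
    for (unsigned i = 2; i <= limit && count < cap; i++) {
        if (is_prime_trial(i))
            out[count++] = i;
    }
    return count;
}
```

Calling `list_primes(30, buf, 16)` on a 16-element buffer fills it with the primes up to 29.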
6
Example: Prime Number Algorithm
// parallelize the job
For i=2 to N
    bool isPrime=true;
    For j=2 to √i
        If (mod(i,j)==0)
            isPrime=false;
            break;
        end if
    end for
    if (isPrime)
        add i to the listOfPrimeNumber
End for
8
Example: Prime Number Algorithm
// parallelize the job
For i=2 to N
    bool isPrime=true;
    // vectorize the job: no break, so every iteration does the same work
    For j=2 to √i
        isPrime = isPrime && (mod(i,j)!=0);
    end for
    if (isPrime)
        add i to the listOfPrimeNumber
End for
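The branch-free inner loop above could look like this in C. This is only a sketch: `isqrt_u` and `is_prime_branch_free` are names invented here, and whether the compiler really emits SIMD code for an integer modulo depends on the compiler and the target. Without OpenMP enabled the pragma is simply ignored and the loop runs serially.

```c
#include <stdbool.h>

/* Largest r with r * r <= n, using integers only (hypothetical helper). */
static unsigned isqrt_u(unsigned n)
{
    unsigned r = 0;
    while ((unsigned long long)(r + 1) * (r + 1) <= n)
        r++;
    return r;
}

/* No early exit: every j is tested and the results are AND-ed together,
   which gives the loop the fixed trip count that "omp simd" expects. */
static bool is_prime_branch_free(unsigned n)
{
    if (n < 2)
        return false;
    const unsigned root = isqrt_u(n);
    int is_prime = 1;
    #pragma omp simd reduction(&:is_prime)
    for (unsigned j = 2; j <= root; j++)
        is_prime &= (n % j != 0);
    return is_prime != 0;
}
```

The price is that composite numbers no longer stop at their first divisor, so this version only pays off if the vectorized loop is faster than the early exit.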
9
Example: Prime Number Algorithm
add 2 to the listOfPrimeNumber
// parallelize the job
For i=3 to N step 2
    bool isPrime=true;
    // vectorize the job
    For j=3 to √i step 2
        isPrime = isPrime && (mod(i,j)!=0);
    end for
    if (isPrime)
        add i to the listOfPrimeNumber
End for
10
Example: Prime Number Algorithm
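add 2 to the listOfPrimeNumber
// parallelize the job (only odd candidates: start at 3, not 2)
For i=3 to N step 2
    bool isPrime=true;
    // vectorize the job (only odd divisors)
    For j=3 to √i step 2
        isPrime = isPrime && (mod(i,j)!=0);
    end for
    if (isPrime)
        add i to the listOfPrimeNumber
End for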
11
Example: Prime Number Algorithm
// parallelize the job
For i=2 to N
    bool isPrime=true;
    // vectorize the job
    For j in listOfPrimeNumber and j≤√i
        isPrime = isPrime && (mod(i,j)!=0);
    end for
    if (isPrime)
        add i to the listOfPrimeNumber in order
End for
12
Example: Prime Number Algorithm
add 2 and 3 to the listOfPrimeNumber
// parallelize the job
For i=5 to N with mod(i,6)==1 or mod(i,6)==5
    bool isPrime=true;
    // vectorize the job
    For j in listOfPrimeNumber and j≤√i
        isPrime = isPrime && (mod(i,j)!=0);
    end for
    if (isPrime)
        add i to the listOfPrimeNumber in order
End for
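A serial C sketch of this last version, assuming the candidates are generated by alternating steps of 2 and 4 from 5 (which yields exactly the numbers that are 1 or 5 modulo 6). The name `primes_wheel6` is made up for the example. Parallelizing the outer loop is left out on purpose, because each candidate reads the ordered list that earlier iterations wrote.

```c
#include <stddef.h>

/* Fills primes[] (capacity cap) with the primes <= limit, in order,
   and returns how many were stored. limit is assumed well below UINT_MAX. */
static size_t primes_wheel6(unsigned limit, unsigned *primes, size_t cap)
{
    size_t count = 0;
    if (limit >= 2 && count < cap) primes[count++] = 2;
    if (limit >= 3 && count < cap) primes[count++] = 3;

    /* Candidates 5, 7, 11, 13, 17, ...: the step alternates 2, 4. */
    unsigned step = 2;
    for (unsigned i = 5; i <= limit && count < cap; i += step, step = 6 - step) {
        int is_prime = 1;
        /* Divide only by known primes p with p <= sqrt(i), i.e. p <= i / p. */
        for (size_t k = 0; k < count && primes[k] <= i / primes[k]; k++)
            is_prime &= (i % primes[k] != 0);
        if (is_prime)
            primes[count++] = i;
    }
    return count;
}
```

Because only primes up to √i are used as divisors, the inner loop gets much shorter than in the earlier versions as N grows.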
13
Basic principles
- Pareto principle
- Structure
- Parallelization
- Vectorization
Image source: inotes4you.files.wordpress.com
14
Basic principles
- Start with the main issues
- Global view
- Critical issue
- Monkey development
- Start simple, go to complex
- Iterative process
- Optimizing: start by slowing down
- Global picture!
15
Rules Guidelines
- Be lazy
- Don’t reinvent the wheel
- Don’t be idle
- Design pattern
- Global variables are your enemies
- Don’t overgeneralize
16
Rules Guidelines
- Trust the compiler
- Simple for you = simple for compiler | computer
- Share your knowledge with the compiler
17
Rules Guidelines
- Think different, try, change and try again …
- Don’t aim for the Best, but something Good and Better
18
Concrete trick : Memory
- Array vs. List
- Prefetch | random access
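To make the array-versus-list point concrete, here is a small C sketch with two sums over the same values. `struct node`, `sum_array` and `sum_list` are illustrative names. The array walk touches consecutive addresses, which hardware prefetchers typically handle well; the list walk must load each `next` pointer before it knows where to go, and the nodes may be scattered in memory.

```c
#include <stddef.h>

/* Singly linked list node: each element carries a pointer to the next one. */
struct node {
    long value;
    struct node *next;
};

/* Contiguous storage: addresses are predictable, so the next cache line
   can be fetched while the current one is still being summed. */
static long sum_array(const long *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pointer chasing: the address of the next element is only known
   once the current node has been loaded. */
static long sum_list(const struct node *head)
{
    long total = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        total += p->value;
    return total;
}
```

Both functions return the same result; any speed difference comes from the memory access pattern and should be measured on the target machine.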
19
Concrete trick : First step Optimization
Compiler optimization
    icpc myCodeFile -O3 -xHost -o myCompiledProgram
    ⚠ -g
- const: no writes
- inline
- restrict/__restrict__: no read updates
- Loop unrolling
- __builtin_expect((x),(y))
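The hints listed above can be combined in one small C function. This is a sketch under assumptions: `saxpy` and the `UNLIKELY` macro are names chosen for the example, and `__builtin_expect` is a GCC/Clang builtin (also accepted by the Intel compilers), so a plain fallback is provided for other compilers.

```c
#include <stddef.h>

/* Branch hint: tell the compiler the condition is expected to be false. */
#if defined(__GNUC__) || defined(__clang__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)
#endif

/* y[i] += a * x[i]
   const    : x is never written through this pointer.
   restrict : x and y do not overlap, so values read from x do not have
              to be re-read after each store to y.
   inline   : small function, a good candidate for inlining. */
static inline int saxpy(size_t n, float a,
                        const float *restrict x, float *restrict y)
{
    if (UNLIKELY(x == NULL || y == NULL))
        return -1;              /* rare error path */
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
    return 0;
}
```

The `restrict` promise is the caller's responsibility: passing overlapping arrays is undefined behaviour.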
20
Concrete trick : OpenMP
Vectorization => SIMD
- #pragma omp simd
- Multi-operation with one instruction
- ⚠ non-aligned data
Multi-Thread
- L3 cache-communication
- Shared memory
How to use :
    #pragma omp parallel for default(none) shared(x,y) firstprivate(array) reduction(max:MaxValue) schedule(static)
    for(int i=0; i< 10000; i++){
        something …
    }
    #pragma omp critical
    #pragma omp barrier
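A compilable sketch of the `parallel for` pattern above, assuming the loop body computes the maximum of `x[i] + y[i]`. The function name `max_of_sum` is invented here. With `default(none)` every variable used in the region has to be listed explicitly, which is why `n` appears in `shared`. Compiled without OpenMP support, the pragma is ignored and the loop runs serially with the same result.

```c
/* Returns max over i of x[i] + y[i]; assumes n >= 1. */
static double max_of_sum(const double *x, const double *y, int n)
{
    double max_value = x[0] + y[0];

    /* Each thread keeps a private max_value; the partial maxima are
       combined when the loop ends. schedule(static) gives every thread
       one contiguous block of iterations. */
    #pragma omp parallel for default(none) shared(x, y, n) \
            reduction(max:max_value) schedule(static)
    for (int i = 0; i < n; i++) {
        double s = x[i] + y[i];
        if (s > max_value)
            max_value = s;
    }
    return max_value;
}
```

The `reduction` clause removes the need for `#pragma omp critical` around the update, which would serialize the threads.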
21
Multi-Chip | Multi-Sockets
NUMA (Non-uniform memory access)
- Remote memory is slower than local memory
- Position in memory => first touch
- Parallelize the initialisation with : schedule(static)
- Read-only data => copy in each local memory
- Thread Affinity
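A sketch of first-touch placement in C, assuming the operating system uses a first-touch page policy (the usual default on Linux). `alloc_first_touch` and `scale` are illustrative names. The array is initialised with the same `schedule(static)` split that the compute loop uses later, so each thread tends to work on pages that sit in its own socket's memory.

```c
#include <stdlib.h>

/* Allocates n doubles and zeroes them in parallel. Under a first-touch
   policy, a page is placed near the thread that writes it first. */
static double *alloc_first_touch(size_t n)
{
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i++)
        a[i] = 0.0;
    return a;
}

/* Same static schedule as the initialisation: thread t processes the
   block of indices it touched first. */
static void scale(double *a, size_t n, double factor)
{
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i++)
        a[i] *= factor;
}
```

This only helps if threads stay on the same cores between the two loops, which is what thread affinity is for; OpenMP exposes it through the standard `OMP_PROC_BIND` and `OMP_PLACES` environment variables.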
22
Questions ?