Published by Paul Smith. Modified over 9 years ago.
1
Lecture 3 Tuesday, February 10, 2015 [With the help of free online resources]
2
Divide and Conquer
What is divide and conquer? Split the entire range into smaller, manageable parts. Solve each part separately. Combine (merge) the partial results to get the result for the original, larger problem.
3
Recursive Divide and Conquer
What is recursive divide and conquer? It is divide and conquer applied recursively: first divide the problem into two parts, then divide each of those parts into two smaller parts, and so on until the parts reach a small enough size. cilk_for is also implemented as a recursive divide and conquer: it splits the loop range in a divide-and-conquer way.
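The recursive splitting of a loop range described above can be sketched in plain C. This is only an illustration of the idea, not the actual Cilk runtime implementation: the names dc_for, square_into and the GRAIN cutoff are hypothetical, and the calls run serially here. Comments mark where a Cilk version could spawn and sync.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical grain size: ranges of at most this length run as a plain loop. */
#define GRAIN 4

/* Apply body(i, ctx) to every i in [lo, hi) by recursively halving the range. */
static void dc_for(size_t lo, size_t hi,
                   void (*body)(size_t, void *), void *ctx)
{
    assert(lo <= hi);
    if (hi - lo <= GRAIN) {
        /* Small enough: stop dividing and do the work directly. */
        for (size_t i = lo; i < hi; ++i)
            body(i, ctx);
        return;
    }
    size_t mid = lo + (hi - lo) / 2;
    dc_for(lo, mid, body, ctx);  /* a Cilk version could cilk_spawn this call */
    dc_for(mid, hi, body, ctx);  /* the second half runs in the current strand */
    /* a Cilk version would cilk_sync here before returning */
}

/* Example loop body: store i*i into the array passed through ctx. */
static void square_into(size_t i, void *ctx)
{
    long *a = ctx;
    a[i] = (long)(i * i);
}
```

Calling dc_for(0, n, square_into, a) fills a[0..n-1] with squares. The two halves touch disjoint indices, which is what makes running them in parallel safe.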
4
Homework: Implement it and show the performance.
5
Divide and Conquer: Fibonacci numbers
In mathematics, the Fibonacci numbers (or Fibonacci sequence) are the numbers in the integer sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
F(0) = 0, F(1) = 1, and F(n) = F(n - 1) + F(n - 2) for n >= 2.

Serial version:

    int fib(int n)
    {
        if (n < 2) return n;
        int x = fib(n - 1);
        int y = fib(n - 2);
        return x + y;
    }

Parallel version (Cilk):

    int fib(int n)
    {
        if (n < 2) return n;
        int x = cilk_spawn fib(n - 1);  /* may run in parallel with the next call */
        int y = fib(n - 2);
        cilk_sync;                      /* wait for the spawned call before using x */
        return x + y;
    }

Source: http://www.mathsisfun.com/numbers/fibonacci-sequence.html
6
Merge Sort
Sort n numbers recursively using divide and conquer.

MergeSort(A, p, r)
1. if (p < r) then
2.     q = (p + r) / 2
3.     MergeSort(A, p, q)
4.     MergeSort(A, q + 1, r)
5.     Merge(A, p, q, r)

Picture: Wikipedia
Amdahl's law ??
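A minimal C sketch of the MergeSort pseudocode above, assuming int keys and inclusive index bounds p..r as on the slide. The helper names (merge, merge_sort_rec, merge_sort) and the shared scratch buffer tmp are choices made for this example, not something the slide specifies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs A[p..q] and A[q+1..r] (inclusive) via scratch tmp. */
static void merge(int *A, int *tmp, int p, int q, int r)
{
    int i = p, j = q + 1, k = p;
    while (i <= q && j <= r)
        tmp[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];
    while (i <= q) tmp[k++] = A[i++];   /* leftovers from the left run */
    while (j <= r) tmp[k++] = A[j++];   /* leftovers from the right run */
    memcpy(A + p, tmp + p, (size_t)(r - p + 1) * sizeof *A);
}

/* Lines 1-5 of the pseudocode. */
static void merge_sort_rec(int *A, int *tmp, int p, int r)
{
    if (p < r) {
        int q = p + (r - p) / 2;          /* same midpoint, avoids int overflow */
        merge_sort_rec(A, tmp, p, q);     /* the two halves use disjoint ranges, */
        merge_sort_rec(A, tmp, q + 1, r); /* so they could run in parallel */
        merge(A, tmp, p, q, r);           /* this merge is serial */
    }
}

/* Sort A[0..n-1]; returns 0 on success, -1 if the scratch buffer cannot be allocated. */
static int merge_sort(int *A, int n)
{
    assert(n >= 0);
    if (n < 2) return 0;
    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (!tmp) return -1;
    merge_sort_rec(A, tmp, 0, n - 1);
    free(tmp);
    return 0;
}
```

In this sketch the two recursive calls are independent, but each Merge step is a serial loop. That serial portion is one plausible reading of why the slide raises Amdahl's law.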
7
Matrix multiplication

Iterative-MM ( Z, X, Y )
// X, Y, Z are n × n matrices,
// where n is a positive integer
1. for i ← 1 to n do
2.     for j ← 1 to n do
3.         Z[ i ][ j ] ← 0
4.         for k ← 1 to n do
5.             Z[ i ][ j ] ← Z[ i ][ j ] + X[ i ][ k ] * Y[ k ][ j ]

Par-Iterative-MM ( Z, X, Y )
// X, Y, Z are n × n matrices,
// where n is a positive integer
1. parallel for i ← 1 to n do
2.     parallel for j ← 1 to n do
3.         Z[ i ][ j ] ← 0
4.         for k ← 1 to n do
5.             Z[ i ][ j ] ← Z[ i ][ j ] + X[ i ][ k ] * Y[ k ][ j ]

Source: http://www.mathsisfun.com/algebra/matrix-multiplying.html
http://algebra.nipissingu.ca/tutorials/matrices.html
8
Iterative matrix multiplication

Par-Iterative-MM ( Z, X, Y )
// X, Y, Z are n × n matrices,
// where n is a positive integer
1. parallel for i ← 1 to n do
2.     parallel for j ← 1 to n do
3.         Z[ i ][ j ] ← 0
4.         for k ← 1 to n do
5.             Z[ i ][ j ] ← Z[ i ][ j ] + X[ i ][ k ] * Y[ k ][ j ]

All six orderings of the i, j, k loops are valid (provided Z is set to zero before the loops start). Figure out which one is the fastest!
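As a starting point for comparing loop orderings, here is a serial C sketch of two of the six: i-j-k and i-k-j. It assumes row-major n × n matrices stored in flat arrays, and the function names are made up for this example. Both compute the same Z; which is faster depends on the machine and memory layout, so it should be measured rather than assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major n x n matrices in flat arrays: M[i][j] is M[i * n + j]. */

/* Zero Z first so that any loop order can accumulate into it. */
static void mm_zero(double *Z, size_t n)
{
    for (size_t t = 0; t < n * n; ++t)
        Z[t] = 0.0;
}

/* i-j-k order: the inner loop walks down a column of Y (stride n). */
static void mm_ijk(double *Z, const double *X, const double *Y, size_t n)
{
    assert(Z && X && Y);
    mm_zero(Z, n);
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            for (size_t k = 0; k < n; ++k)
                Z[i * n + j] += X[i * n + k] * Y[k * n + j];
}

/* i-k-j order: the inner loop walks along a row of Z and of Y (stride 1). */
static void mm_ikj(double *Z, const double *X, const double *Y, size_t n)
{
    assert(Z && X && Y);
    mm_zero(Z, n);
    for (size_t i = 0; i < n; ++i)
        for (size_t k = 0; k < n; ++k)
            for (size_t j = 0; j < n; ++j)
                Z[i * n + j] += X[i * n + k] * Y[k * n + j];
}
```

The other four orderings follow by permuting the three for lines in the same way. Timing each variant for a large n, and counting cache misses for it, is one way to approach the exercise.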
9
Job script and submitting a job on the cluster

Filename: jobscript.sh

    #!/bin/bash
    export CILK_NWORKERS=16
    for ((i=1; i<=5; i=i+1))
    do
        ./hello_world
    done

Submit it with:

    qsub -q jjahan jobscript.sh

Job script with PBS directives:

    #!/bin/bash
    #PBS -N NAME_OF_JOB
    # Name displayed on the queue when you issue 'qstat -a'
    #PBS -l nodes=1:ppn=16
    # Requests 1 node with 16 processors per node, for a total of 16
    #PBS -l walltime=00:15:00
    # Requested run time for the job in hh:mm:ss format. The queue min/max will supersede this request.
    #PBS -q queue_name
    # Queue to submit to, if other than the default
    #PBS -j oe
    # Program output and error are directed to a single file
    # Assuming MPI program, calling the MVAPICH mpirun and mpihello.exe (compiled with the MVAPICH wrapper for GCC)
    ./mm
10
Measuring cache misses
What is the CPU cache? How does it work?
Cache miss:
Cache hit:
Measuring cache misses:
  In theory:
  By tools: PAPI
Source: http://www.pantherproducts.co.uk/index.php?pageid=cpucache
11
Measuring Cache misses Demo on IACS cluster
12
Recursive MM
Source: http://www3.cs.stonybrook.edu/~rezaul/Spring-2012/CSE613/CSE613-lecture-8.pdf
13
Recursive MM
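The slide content for recursive matrix multiplication did not survive extraction, so the following is only a sketch of the standard quadrant-based scheme, not necessarily the exact algorithm shown in the lecture. It assumes n is a power of two and row-major storage with row stride ld. The name rec_mm and its signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Z += X * Y for n x n blocks inside row-major matrices with row stride ld.
 * Assumes n is a power of two. The caller zeroes Z to obtain Z = X * Y. */
static void rec_mm(double *Z, const double *X, const double *Y,
                   size_t n, size_t ld)
{
    assert(n > 0 && (n & (n - 1)) == 0);
    if (n == 1) {                 /* base case: 1 x 1 blocks */
        Z[0] += X[0] * Y[0];
        return;
    }
    size_t h = n / 2;             /* split each matrix into four h x h quadrants */
    const double *X11 = X, *X12 = X + h, *X21 = X + h * ld, *X22 = X + h * ld + h;
    const double *Y11 = Y, *Y12 = Y + h, *Y21 = Y + h * ld, *Y22 = Y + h * ld + h;
    double *Z11 = Z, *Z12 = Z + h, *Z21 = Z + h * ld, *Z22 = Z + h * ld + h;

    /* First group: four products writing to four different Z quadrants,
     * so a Cilk version could spawn them and then sync. */
    rec_mm(Z11, X11, Y11, h, ld);
    rec_mm(Z12, X11, Y12, h, ld);
    rec_mm(Z21, X21, Y11, h, ld);
    rec_mm(Z22, X21, Y12, h, ld);

    /* Second group: the remaining four products, again one per Z quadrant.
     * It must start only after the first group has finished. */
    rec_mm(Z11, X12, Y21, h, ld);
    rec_mm(Z12, X12, Y22, h, ld);
    rec_mm(Z21, X22, Y21, h, ld);
    rec_mm(Z22, X22, Y22, h, ld);
}
```

For a full n × n matrix, call rec_mm(Z, X, Y, n, n) after setting Z to zero. A practical version would usually switch to an iterative kernel below some block size instead of recursing down to 1 × 1.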