Distributed and Parallel Processing
George Wells
UNIX System V IPC: Shared Memory Control
The shmctl() system call:
– Query internal data structures
– Set permissions
– Mark shared memory for deletion (waits until no processes are attached)
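A minimal sketch of these three uses of shmctl(), assuming a segment identifier shmid obtained from an earlier shmget() call (the function name and the 0644 mode are illustrative only):

#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Query, update and remove a segment identified by shmid */
void control_segment (int shmid)
{
    struct shmid_ds ds;

    shmctl(shmid, IPC_STAT, &ds);        // Query internal data structures
    printf("size: %lu, attached: %lu\n",
        (unsigned long) ds.shm_segsz,
        (unsigned long) ds.shm_nattch);

    ds.shm_perm.mode = 0644;             // Set permissions
    shmctl(shmid, IPC_SET, &ds);

    shmctl(shmid, IPC_RMID, NULL);       // Mark for deletion: removed only
                                         // once no processes remain attached
} // control_segment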
Example: Shared Queue

#define WRITE_SEM 0
#define READ_SEM 1
#define QUEUE_SIZE 10

int shmid, semid;
int *shmaddr;
int hd = -1, tl = -1;
int size = QUEUE_SIZE * sizeof(int);

shmid = shmget(key, size, IPC_CREAT | 0600);
shmaddr = shmat(shmid, 0, 0);

// Now create semaphore set with 2 semaphores
semid = semget(key, 2, IPC_CREAT | 0600);
union semun semopts;
semopts.val = QUEUE_SIZE;   // Space available
semctl(semid, WRITE_SEM, SETVAL, semopts);
semopts.val = 0;            // No data to read yet
semctl(semid, READ_SEM, SETVAL, semopts);

No error detection/handling
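One portability note, not from the slide itself: on Linux the calling program must define union semun before using semctl() with SETVAL; the definition below follows the semctl(2) manual page (some other systems already provide it in <sys/sem.h>):

/* Required by the caller on Linux before calling semctl() */
union semun {
    int val;                  /* value for SETVAL */
    struct semid_ds *buf;     /* buffer for IPC_STAT, IPC_SET */
    unsigned short *array;    /* array for GETALL, SETALL */
};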
Example: Writing to the queue

struct sembuf sb;
sb.sem_num = WRITE_SEM;
sb.sem_op = -1;
sb.sem_flg = 0;
semop(semid, &sb, 1);   // Wait for space

hd = (hd+1) % QUEUE_SIZE;
shmaddr[hd] = item;

sb.sem_num = READ_SEM;
sb.sem_op = 1;
semop(semid, &sb, 1);   // Signal data is available
Example: Removing from the queue

struct sembuf sb;
sb.sem_num = READ_SEM;
sb.sem_op = -1;
sb.sem_flg = 0;
semop(semid, &sb, 1);   // Wait for data

tl = (tl+1) % QUEUE_SIZE;
item = shmaddr[tl];

sb.sem_num = WRITE_SEM;
sb.sem_op = 1;
semop(semid, &sb, 1);   // Signal space available
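The following sketch combines the fragments above into one complete program: the parent enqueues twenty integers and a forked child dequeues and prints them. The key choice (ftok), the item count, the sem_change helper and the minimal error handling are assumptions added for illustration, not part of the original example.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/sem.h>
#include <sys/wait.h>

#define WRITE_SEM 0
#define READ_SEM 1
#define QUEUE_SIZE 10

union semun { int val; struct semid_ds *buf; unsigned short *array; };

static void sem_change (int semid, int sem, int delta)
{
    struct sembuf sb = { sem, delta, 0 };
    if (semop(semid, &sb, 1) == -1)
        { perror("semop"); exit(1); }
} // sem_change

int main (void)
{
    key_t key = ftok(".", 'Q');                      // Assumed key
    int shmid = shmget(key, QUEUE_SIZE * sizeof(int), IPC_CREAT | 0600);
    int semid = semget(key, 2, IPC_CREAT | 0600);
    int *queue = (int *) shmat(shmid, NULL, 0);
    union semun opts;

    opts.val = QUEUE_SIZE;                           // Space available
    semctl(semid, WRITE_SEM, SETVAL, opts);
    opts.val = 0;                                    // No data to read yet
    semctl(semid, READ_SEM, SETVAL, opts);

    if (fork() == 0)                                 // Child: consumer
    {
        int tl = -1;
        for (int i = 0; i < 20; i++)
        {
            sem_change(semid, READ_SEM, -1);         // Wait for data
            tl = (tl+1) % QUEUE_SIZE;
            printf("read %d\n", queue[tl]);
            sem_change(semid, WRITE_SEM, 1);         // Signal space available
        }
        return 0;
    }

    int hd = -1;                                     // Parent: producer
    for (int i = 0; i < 20; i++)
    {
        sem_change(semid, WRITE_SEM, -1);            // Wait for space
        hd = (hd+1) % QUEUE_SIZE;
        queue[hd] = i;
        sem_change(semid, READ_SEM, 1);              // Signal data is available
    }
    wait(NULL);
    shmctl(shmid, IPC_RMID, NULL);                   // Clean up both IPC objects
    semctl(semid, 0, IPC_RMID);
    return 0;
} // main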
OpenMP (Open Multi-Processing)
Semi-automatic parallelisation
Multi-platform shared memory multiprocessing
C, C++ and Fortran
Makes use of:
– compiler directives
– library routines
– environment variables
Hello World

#include <stdio.h>
#include <omp.h>

int main ()
{
    #pragma omp parallel
    printf("Hello from thread %d, nthreads %d\n",
        omp_get_thread_num(), omp_get_num_threads());
    return 0;
} // main

Uses fork/join model
Use { ... } for more than one statement
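A minimal sketch of the braces form, where the parallel region is a block of several statements (the message text is illustrative):

#include <stdio.h>
#include <omp.h>

int main ()
{
    #pragma omp parallel
    {
        int id = omp_get_thread_num();       // Private to each thread
        printf("Hello from thread %d\n", id);
        printf("Goodbye from thread %d\n", id);
    } // Implicit join of the thread team here
    return 0;
} // main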
Compiling
Using gcc:
– gcc -fopenmp HelloWorld.c
Pragmas are ignored by compilers that don't support OpenMP
At runtime:
– Automatically creates a thread team
– Number of threads dependent on processors available
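The thread count can also be requested explicitly, either with the OMP_NUM_THREADS environment variable before running the program, with the omp_set_num_threads() library routine, or per region with the num_threads clause (a clause overrides the routine, which overrides the environment variable). A short sketch; the values 4 and 2 are arbitrary examples:

#include <stdio.h>
#include <omp.h>

int main ()
{
    omp_set_num_threads(4);                  // Request 4 threads for later regions

    #pragma omp parallel num_threads(2)      // Override for this region only
    printf("Thread %d of %d\n",
        omp_get_thread_num(), omp_get_num_threads());
    return 0;
} // main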
Array Initialisation

int main ()
{
    const int N = 1000;   // Example size
    int i, a[N];
    #pragma omp parallel for
    for (i = 0; i < N; i++)
        a[i] = 2 * i;
    return 0;
} // main

Divides loop iterations between threads
Scheduling
Optional settings control thread scheduling:
#pragma omp for schedule(static)
Static:
– Default, each thread allocated section of for loop
Dynamic:
– Threads allocated single iterations on demand
– Useful if amount of work per iteration varies
– Can request more than single iterations:
#pragma omp for schedule(dynamic, 3)
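A sketch of dynamic scheduling on a loop whose iterations do unequal amounts of work; the work() function, MAX and the chunk size of 3 are illustrative assumptions:

#include <stdio.h>
#include <omp.h>

#define MAX 100

/* Simulated uneven workload: later iterations do more work */
double work (int k)
{
    double total = 0;
    for (int j = 0; j < k * 1000; j++)
        total += j * 0.5;
    return total;
} // work

int main ()
{
    double results[MAX];

    #pragma omp parallel for schedule(dynamic, 3)
    for (int k = 0; k < MAX; k++)
        results[k] = work(k);        // Threads grab 3 iterations at a time

    printf("results[%d] = %f\n", MAX-1, results[MAX-1]);
    return 0;
} // main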
Sharing Variables
OpenMP for ensures the loop control variable is not shared
Variables declared in the loop block are not shared
Other variables are shared by default
– Use private clause to prevent sharing: private(a)

int a = 0;
#pragma omp for
for (int k = 0; k < MAX; k++) {
    int c = k*k;   // c is not shared
    a += c;        // a is shared: race condition
}
Reductions
We can arrange to have the total summed automatically:

int a = 0;
#pragma omp parallel for reduction(+:a)
for (int k = 0; k < MAX; k++) {
    int c = k*k;
    a += c;
}

Other reductions:
– *, -, ^, ||, |, &&, &
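As a further illustration (not on the original slide), OpenMP 3.1 and later also provide min and max reductions; a short sketch with arbitrary test data:

#include <stdio.h>
#include <omp.h>

#define MAX 1000

int main ()
{
    int a[MAX], biggest = 0;

    for (int k = 0; k < MAX; k++)
        a[k] = (k * 37) % 1001;          // Arbitrary test data

    #pragma omp parallel for reduction(max:biggest)
    for (int k = 0; k < MAX; k++)
        if (a[k] > biggest)
            biggest = a[k];              // Per-thread maxima combined at the end

    printf("largest value: %d\n", biggest);
    return 0;
} // main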
Critical Sections
Alternatively:

int a = 0;
#pragma omp parallel for
for (int k = 0; k < MAX; k++) {
    int c = k*k;
    #pragma omp atomic
    a += c;
}

Or:

int a = 0;
#pragma omp parallel for
for (int k = 0; k < MAX; k++) {
    int c = k*k;
    #pragma omp critical
    a += c;
}
Sharing Variables
firstprivate
– Like private, but initialised from the original variable
lastprivate
– Like private, but the value from the sequentially last iteration is copied back to the original variable
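A brief sketch showing both clauses together; the variable names and MAX are assumptions for illustration:

#include <stdio.h>
#include <omp.h>

#define MAX 100

int main ()
{
    int offset = 10;   // Copied into each thread by firstprivate
    int last = 0;      // Receives the value from the final iteration

    #pragma omp parallel for firstprivate(offset) lastprivate(last)
    for (int k = 0; k < MAX; k++) {
        last = k + offset;      // Each thread has its own offset and last
    }

    // last now holds the value from iteration MAX-1, i.e. 99 + 10 = 109
    printf("last = %d\n", last);
    return 0;
} // main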
Synchronisation
Barriers
– Cause all threads to synchronise
– #pragma omp barrier
Implicit barrier at the end of work-sharing constructs and parallel sections
– The work-sharing barrier can be removed using nowait

#pragma omp parallel
{
    #pragma omp for nowait
    for (int k=0; k < MAX; k++)
        method1();
    // No barrier at end of for
    method2();
} // Barrier at end of parallel
method3();
Parallel Processing Recap
Threads
– Java threads
– Java Concurrency library
CSP
– Programming model
– Formal notation supporting analysis/reasoning
– occam, JCSP
Interprocess communication
OpenMP
– Semi-automatic parallelisation
Distributed Processing
How can cooperative systems be built using networks of computers?
Abstraction levels (from higher to lower):
– Object “spaces”
– Net. services, ORBs, mobile agents
– RPC/RMI
– Client-server, peer-to-peer
– Message passing
Distributed Message Passing
Many libraries exist
– MPI (Message Passing Interface)
– PVM (Parallel Virtual Machine)
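To give a flavour of such a library, a minimal MPI sketch in C: process 0 sends an integer to process 1. This is an illustrative addition, not from the slides; it needs at least two processes and is typically compiled with mpicc and launched with mpirun, the details depending on the MPI installation.

#include <stdio.h>
#include <mpi.h>

int main (int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                    // Start the MPI runtime
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);      // This process's number
    MPI_Comm_size(MPI_COMM_WORLD, &size);      // Total number of processes

    if (rank == 0)                             // Process 0 sends to process 1
    {
        int value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    }
    else if (rank == 1)
    {
        int value;
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Process 1 received %d (of %d processes)\n", value, size);
    }

    MPI_Finalize();
    return 0;
} // main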
Remote Procedure/Method Calls
Provides familiar procedure/method calling interface for distributed communication
[Diagram: a program calls f(x); the procedure f(a) executes and returns]
Remote Call
[Diagram: the program on CPU 1 calls f(x); the procedure f(a) runs on CPU 2 and returns across the network]
Parameters, return values, etc. passed over network
Remote Method Calls
Program obtains reference to a remote object
– Calls methods
Proxy/stub receives call locally
– Marshals parameters
– Sends parameters to server
– Waits for return
– Receives any return value
– Returns to caller
Server Side
Skeleton code on server waits for incoming “call”
– Unmarshals parameters
– Calls method
– Waits for method to return
– Marshals return value(s), if necessary
– Sends message back to client
Java: Remote Method Invocation (RMI)
Write Java interface to specify actions to be performed by remote objects
– Must extend java.rmi.Remote
– Methods must throw java.rmi.RemoteException
– Parameters/return values must be Serializable
Write class that implements interface
Create object
– “Export” to obtain stub
– Register to make available on network
RMI — Client Side
Look up remote object using registry
– Returns stub (implements the interface)
Call methods of stub
Example: Remote Mandelbrot Service Interface

import java.rmi.Remote;
import java.rmi.RemoteException;

/** This interface specifies the remote object
  * methods.
  */
public interface MandelbrotCalculator extends Remote
{
    public byte[][] calculateMandelbrot (int xsize, int ysize,
        double x1, double x2, double y1, double y2)
        throws RemoteException;
} // interface MandelbrotCalculator