Parallel and Distributed Programming Kashif Bilal
Parallel and Distributed Programming
Generally we have two types of distributed and parallel systems:
– Multiprocessors (shared memory systems): a single memory is accessible to every processor.
– Multicomputers (message passing systems): every processor has its own local memory.
Parallel Programming Architectures
Generally we have two basic architectures for parallel programming:
– Supervisor-Worker structure (Master-Slave model)
– Hierarchy structure (Tree model)
Supervisor Worker Model
There is only one level of hierarchy in this structure: one supervisor and many workers.
Supervisor:
– normally interacts with the user
– activates the workers
– assigns work to the workers
– collects results from the workers
Workers perform the calculations and send the results back to the supervisor.
Hierarchy Structure The hierarchy structure allows the workers to create new levels of workers. The top-level supervisor is the initiating task, which creates a set of workers at the second level. These workers may create other sets of workers.
Process
A program (task) in execution: a process is a set of executable instructions (a program) that runs on a processor.
The process is the basic entity for achieving parallelism on both multiprocessors and multicomputers.
Distributed System Classification
Michael Flynn classified computers into four categories (Flynn's taxonomy):
– SISD – Single Instruction stream / Single Data stream
– SIMD – Single Instruction stream / Multiple Data stream, or SPMD – Single Program / Multiple Data stream
– MISD – Multiple Instruction stream / Single Data stream (no real application)
– MIMD – Multiple Instruction stream / Multiple Data stream
Message Passing The method by which data from one processor’s memory is copied to the memory of another processor. In distributed memory systems, data is generally sent as packets of information over a network from one processor to another.
Thumb Recognition Example
Suppose we have a database of 10 lakh (1,000,000) thumb impressions and their related data.
A user arrives; the system takes the user's impression and searches the database for the user's information.
Suppose one database match takes 1/100 of a second.
To search the complete database, the system will take approximately 1,000,000 × 0.01 = 10,000 seconds, i.e. about 2.7 hours.
Algorithm for Thumb Recognition

Main() {
    Picture = capture the thumb impression to be matched
    Details = Match(Picture)
}

Match(Picture) {
    Pick records from the database one by one and match each with Picture
    If (matched) return the details of that record
}
Distributed Algorithm

Main() {
    Receive impression, start record number and end record number from the master
    Details = Match(Picture, start_no, end_no)
    Send Details to the master (supervisor) task
}

Match(Picture, start_no, end_no) {
    Pick records from the database one by one, from record start_no to end_no, and match each with Picture
    If (matched) return the details of that record
}
Problems
– How do workers receive data from the supervisor?
– How do workers send details back to the supervisor process?
– How does the supervisor allocate work to, and communicate with, the workers?
– Etc.
Possible Solutions
– Make a new programming language for distributed and parallel programming.
– Change existing languages.
– Build libraries and APIs (sets of functions) to perform all tasks related to distributed and parallel programming, such as remote task spawning, communication, and synchronization.
PVM and MPI
PVM and MPI are two famous libraries used for parallel and distributed programming.
PVM is a bit older than MPI.
MPI is now considered the standard for building parallel and distributed programs.
Both libraries provide functions to perform the different tasks required in parallel programs.
PVM (Parallel Virtual Machine)
Started as a research project in 1989. The Parallel Virtual Machine (PVM) was originally developed at Oak Ridge National Laboratory and the University of Tennessee.
PVM offers a powerful set of process control and dynamic resource management functions.
PVM
It provides programmers with a library of routines for:
– initiation and termination of tasks
– synchronization
– alteration of the virtual machine configuration
– message passing via a number of simple constructs
– interoperability among different heterogeneous computers: programs written for one architecture can be copied to another architecture, compiled, and executed without modification
PVM
PVM has two components:
– a library of PVM routines
– a daemon (pvmd) that should reside on all the hosts in the virtual machine
The pvmd serves as a message router and controller. It provides a point of contact, authentication, process control, and fault detection.
One task (the supervisor) is normally started manually on one machine; the other tasks are then instantiated automatically by the supervisor.
Task Creation… A task in PVM can be started manually or can be spawned from another task. The initiating task is always activated manually by simply running its executable code on one of the hosts. Other PVM tasks can be created dynamically from within other tasks. The function pvm_spawn() is used for dynamic task creation. The task that calls the function pvm_spawn() is referred to as the parent and the newly created tasks are called children.
To create a child, you must specify:
– the machine on which the child will be started
– a path to the executable file on the specified machine
– the number of copies of the child to be created
– an array of arguments to the child tasks
pvm_spawn()
num = pvm_spawn(Child, Arguments, Flag, Where, HowMany, &Tids)
This function has six parameters and returns the actual number of successfully created tasks in num.
– Child: name of the task (executable) to be started.
– Arguments: same as command line arguments.
– Flag: decides which machine(s) will run the spawned task.
Flag values:
– PvmTaskDefault (0): PVM can choose any machine to start the task
– PvmTaskHost (1): Where specifies a particular host
– PvmTaskArch (2): Where specifies a type of architecture
Where: depends on the value of Flag.
HowMany: number of tasks to be spawned.
Tids: array to store the TIDs of the spawned tasks.
Return value: total number of tasks actually created.
Examples:
n1 = pvm_spawn("simple_pvm", 0, 1, "mynode11", 1, &tid1)
– Will create 1 task running the executable simple_pvm on node mynode11.
numt = pvm_spawn("simple_pvm", 0, PvmTaskArch, "LINUX", 5, &tids)
– Will create 5 tasks running simple_pvm on a specific architecture, i.e. LINUX.
res = pvm_spawn("simple_pvm", NULL, PvmTaskDefault, "", 5, children)
– Will ask PVM to choose the nodes itself.
Task IDs
All PVM tasks are identified by an integer task identifier.
When a task is created it is assigned a unique identifier (TID).
Task identifiers can be used to identify senders and receivers during communication.
They can also be used to assign functions to different tasks based on their TIDs.
Retrieval of TIDs
Task's own TID: pvm_mytid()
    mytid = pvm_mytid();
Child's TID: pvm_spawn()
    pvm_spawn(..., ..., ..., ..., ..., &tid);
Parent's TID: pvm_parent()
    my_parent_tid = pvm_parent();
Daemon's TID: pvm_tidtohost()
    daemon_tid = pvm_tidtohost(id);
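A minimal sketch showing these retrieval calls together in one C program (the pvm_* calls are the library's own; the surrounding program is illustrative):

#include <stdio.h>
#include "pvm3.h"

int main() {
    int mytid = pvm_mytid();           /* enroll in PVM and get this task's TID */
    int ptid  = pvm_parent();          /* TID of the spawning task, or PvmNoParent */
    int dtid  = pvm_tidtohost(mytid);  /* TID of the daemon on this task's host */

    printf("tid=%x parent=%x daemon=%x\n", mytid, ptid, dtid);
    pvm_exit();                        /* leave the virtual machine */
    return 0;
}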
Communication among Tasks
Communication among PVM tasks is performed using the message passing approach, achieved via a library of routines and a daemon.
The user program communicates with the PVM daemon, and the daemon determines the destination of each message.
Communication is generally asynchronous.
[Figure: a message travels from the sending task's user application through the PVM library to its daemon, then across the network to the receiving task's daemon, library, and user application.]
How to Send a Message
Sending a message requires 3 steps:
1. A send buffer is initialized.
2. The message is packed into the buffer.
3. The completed message is sent to its destination(s).
Receiving a message requires 2 steps:
1. The message is received.
2. The received items are unpacked.
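A minimal sketch of both sides in C (the destination TID dest_tid, the tag value 99, and the variable are illustrative):

/* sender */
int value = 42;
pvm_initsend(PvmDataDefault);   /* step 1: initialize the send buffer (XDR) */
pvm_pkint(&value, 1, 1);        /* step 2: pack one int into the buffer */
pvm_send(dest_tid, 99);         /* step 3: send the buffer with tag 99 */

/* receiver */
pvm_recv(-1, 99);               /* step 1: block until a tag-99 message arrives */
pvm_upkint(&value, 1, 1);       /* step 2: unpack the int from the receive buffer */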
Message Sending: Buffer Creation (before packing)
bufid = pvm_initsend(encoding_option)
bufid = pvm_mkbuf(encoding_option)

Encoding option          Meaning
0 (PvmDataDefault)       XDR
1 (PvmDataRaw)           No encoding
2 (PvmDataInPlace)       Leave data in place
Message Sending: Data Packing
pvm_pk*()
– pvm_pkstr() – one argument:
    pvm_pkstr("This is my data");
– others – three arguments:
    1. pointer to the first item
    2. number of items to be packed
    3. stride
    pvm_pkint(my_array, n, 1);
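One buffer can carry several packed items, possibly of different types; the receiver must unpack them in the same order. A sketch (the variables, worker_tid, and tag 7 are illustrative):

int n = 4;
int data[4] = {1, 2, 3, 4};
double scale = 0.5;

pvm_initsend(PvmDataDefault);
pvm_pkint(&n, 1, 1);            /* pack the count first */
pvm_pkint(data, n, 1);          /* then the array itself */
pvm_pkdouble(&scale, 1, 1);     /* different types can share one buffer */
pvm_pkstr("done");              /* a NUL-terminated string */
pvm_send(worker_tid, 7);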
Message Sending
Point to point (one receiver):
    info = pvm_send(tid, tag)
Broadcast (multiple receivers):
    info = pvm_mcast(tids, n, tag)
    info = pvm_bcast(group_name, tag)
Pack and send in one step:
    info = pvm_psend(tid, tag, my_array, length, datatype)
Each call returns an integer status code info; a negative value indicates an error.
Receiving a Message
PVM supports three types of message receiving functions: blocking, nonblocking, and timeout.
Blocking:
    bufid = pvm_recv(tid, tag)
    (-1 is a wildcard in either tid or tag)
Nonblocking:
    bufid = pvm_nrecv(tid, tag)
    (bufid = 0 if no message was received)
Timeout:
    bufid = pvm_trecv(tid, tag, timeout)
    (bufid = 0 if no message was received before the timeout)
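A sketch of the three styles; note that in PVM 3 the timeout argument of pvm_trecv() is actually a struct timeval pointer, so that form is used below (tag 5 is illustrative):

#include <sys/time.h>

int bufid;

/* blocking: wait for any tag-5 message from any sender */
bufid = pvm_recv(-1, 5);

/* nonblocking: returns 0 immediately if nothing has arrived yet */
bufid = pvm_nrecv(-1, 5);
if (bufid == 0) {
    /* no message: do other work and poll again later */
}

/* timeout: give up after 10 seconds */
struct timeval tmout = {10, 0};
bufid = pvm_trecv(-1, 5, &tmout);
if (bufid == 0) {
    /* timed out */
}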
Data Unpacking
pvm_upk*()
– pvm_upkstr() – one argument:
    pvm_upkstr(string);
– others – three arguments:
    1. pointer to the first item
    2. number of items to be unpacked
    3. stride
    pvm_upkint(my_array, n, 1);
Receiving and unpacking in a single step:
    info = pvm_precv(tid, tag, my_array, len, datatype, &src, &atag, &alen)
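pvm_psend() and pvm_precv() form a matched one-step pair. A sketch (dest_tid and tag 3 are illustrative):

int my_array[10];
int src, atag, alen;

/* sender: pack and send 10 ints in one call */
pvm_psend(dest_tid, 3, my_array, 10, PVM_INT);

/* receiver: receive and unpack in one call; src, atag and alen
   report the actual sender, tag and item count */
pvm_precv(-1, 3, my_array, 10, PVM_INT, &src, &atag, &alen);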
Work Assignment (different programs)
info1 = pvm_spawn("worker1", 0, 1, "lpc01", 1, &tid1)
info2 = pvm_spawn("worker2", 0, 1, "lpc02", 1, &tid2)
info3 = pvm_spawn("worker3", 0, 1, "lpc03", 1, &tid3)
info4 = pvm_spawn("worker4", 0, 1, "lpc04", 1, &tid4)
Distributed Algorithm (recap)

Main() {
    Receive impression, start record number and end record number from the master
    Details = Match(Picture, start_no, end_no)
    Send Details to the master (supervisor) task
}

Match(Picture, start_no, end_no) {
    Pick records from the database one by one, from record start_no to end_no, and match each with Picture
    If (matched) return the details of that record
}
Distributed Algorithm using PVM: Master

Main() {
    Input thumb impression (Picture)
    int arr[2], children[100], i;
    int start = 1;
    pvm_spawn("slave", 0, PvmTaskDefault, "", 100, children);
    for (i = 0; i < 100; i++) {
        pvm_initsend(PvmDataDefault);   /* fresh buffer for each worker */
        arr[0] = start;                 /* first record for this worker */
        arr[1] = start + 999;           /* last record for this worker */
        pvm_pkint(arr, 2, 1);
        pvm_send(children[i], 1);       /* send the record range with tag 1 */
        start += 1000;
        /* pack and send Picture to children[i] in a second message */
    }
    pvm_recv(-1, -1);                   /* wait for a match from any worker */
}
Distributed Algorithm using PVM: Slave

Main() {
    int arr[2];
    pvm_recv(pvm_parent(), 1);          /* receive the record range */
    pvm_upkint(arr, 2, 1);
    pvm_recv(pvm_parent(), -1);         /* receive the picture */
    Match(Picture, arr[0], arr[1]);
}

Match(Picture, start_no, end_no) {
    Pick records from the database one by one, from record start_no to end_no, and match each with Picture
    If (matched) return the details of that record   /* or pvm_send(pvm_parent(), 2) */
}
Distributed Algorithm using PVM: master and slave in a single program

Main() {
    int tid = pvm_parent();
    if (tid == PvmNoParent) {
        /* supervisor code here */
    } else {
        /* worker code here */
    }
}
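A compilable sketch of this single-program pattern, assuming the executable is installed under the name spmd_demo (the name, worker count, and reply protocol are illustrative): the supervisor spawns copies of its own executable, and each worker reports its TID back.

#include <stdio.h>
#include "pvm3.h"

#define NWORKERS 4

int main() {
    int ptid = pvm_parent();

    if (ptid == PvmNoParent) {
        /* supervisor branch: spawn workers and collect one reply each */
        int children[NWORKERS], i, wtid;
        pvm_spawn("spmd_demo", NULL, PvmTaskDefault, "", NWORKERS, children);
        for (i = 0; i < NWORKERS; i++) {
            pvm_recv(-1, 1);
            pvm_upkint(&wtid, 1, 1);
            printf("Supervisor: worker %x reported in\n", wtid);
        }
    } else {
        /* worker branch: send own TID to the supervisor */
        int mytid = pvm_mytid();
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&mytid, 1, 1);
        pvm_send(ptid, 1);
    }
    pvm_exit();
    return 0;
}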
PVM Master Program

#include <stdio.h>
#include <stdlib.h>
#include "pvm3.h"

int main() {
    int myTID;
    int x = 12;
    int children[10];
    int res;

    myTID = pvm_mytid();
    printf("Master: TID is %d\n", myTID);
    res = pvm_spawn("slave", NULL, PvmTaskDefault, "", 1, children);
    if (res < 1) {
        printf("Master: pvm_spawn error\n");
        pvm_exit();
        exit(1);
    }
    pvm_initsend(PvmDataDefault);
    pvm_pkint(&x, 1, 1);
    pvm_send(children[0], 1);

    pvm_recv(-1, -1);
    pvm_upkint(&x, 1, 1);
    printf("Master has received x=%d\n", x);

    pvm_exit();
    return 0;
}
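The deck shows only the master; a plausible matching slave (assuming the protocol above: receive one int with tag 1 and send a result back, where the increment and the reply tag 2 are illustrative) would be:

#include <stdio.h>
#include "pvm3.h"

int main() {
    int x;
    int ptid = pvm_parent();    /* TID of the master that spawned us */

    pvm_recv(ptid, 1);          /* wait for the master's tag-1 message */
    pvm_upkint(&x, 1, 1);
    printf("Slave: received x=%d\n", x);

    x = x + 1;                  /* illustrative work on the data */
    pvm_initsend(PvmDataDefault);
    pvm_pkint(&x, 1, 1);
    pvm_send(ptid, 2);          /* master receives this with wildcards (-1, -1) */

    pvm_exit();
    return 0;
}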
How to Compile and Execute?

gcc -I /opt/pvm3/include myprogram.c -L /opt/pvm3/lib/LINUX/ -lpvm3 -o myprogramexe

Illustration:
– gcc = C compiler
– -I /opt/pvm3/include = path to the PVM include files
– myprogram.c = name of the source code file
– -L /opt/pvm3/lib/LINUX/ = search path for the pvm3 library
– -lpvm3 = link against the PVM library
– -o = name of the output (executable) file
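To run the compiled programs, the usual PVM workflow is roughly as follows (the paths and the LINUX architecture name depend on the local installation, so treat this as an assumed setup):

pvm                      # start the PVM console; this also starts pvmd
pvm> add mynode11        # add hosts to the virtual machine
pvm> quit                # leave the console; pvmd keeps running

# place spawned executables where pvmd looks for them,
# typically $HOME/pvm3/bin/$PVM_ARCH
cp slave ~/pvm3/bin/LINUX/

./myprogramexe           # run the master from an ordinary shell

pvm                      # later, shut the virtual machine down
pvm> halt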