Multi-process, Multi-threaded / Amir Averbuch, Nezer J. Zaidenberg
References – From APUE 2e / Select and pselect – Ch / Process and forking – Ch. 8 / Threads – Ch
Doing things in parallel / Many times we are faced with a system that must handle multiple requests in parallel: / Handling multiple inputs on multiple terminals (or sockets, sessions, etc.) / Processing multiple requests in a server / Handling several transactions without hanging if one transaction takes too long / Doing things while waiting for something else (I/O, computation, etc.) / "Busy waiting" is usually a bad idea. / Several APIs provide alternatives.
Busy waiting / Busy waiting (v) – a process that keeps asking the kernel: do I have something to do? (Do I have I/O? Have I waited long enough?) It burns CPU time without doing useful work.
Doing things with a single process / The initial solution was to do things in a single process, using an API that allows for concurrency. / The select(2) API is the most common. / Other APIs include: / aio_XXX API (and kaio_XXX) / Various forms of "graceful multi-tasking" / Signals / select(2) is the API most widely used today.
Select(2) / The situation: / Input arrives over several file descriptors (in UNIX an open terminal, a communication socket, and actual file I/O are all handled through file descriptors). / Output may be written to several interfaces, and writing may take time (less frequent). / Waiting for exceptions on file descriptors (practically non-existent). / Processing each input or output usually takes very little time.
Select(2) API / int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout); / nfds - the first nfds file descriptors are checked in each set, so it should equal the highest fd used + 1 (since fds start from zero). / fd_set - actually a bit array; the OS provides macros to manipulate it (FD_ZERO, FD_SET, FD_CLR, FD_ISSET). / timeout - the maximum time to wait; select returns 0 if it expires before any descriptor is ready.
Select example

#include <stdio.h>
#include <sys/select.h>

int main(void)
{
    struct timeval tv;
    fd_set readfds;

    tv.tv_sec = 10;              /* wait at most 10 seconds */
    tv.tv_usec = 0;
    FD_ZERO(&readfds);
    FD_SET(0, &readfds);         /* watch fd 0 (stdin) */

    /* nfds = highest fd in any set + 1 */
    int ready = select(1, &readfds, NULL, NULL, &tv);
    if (ready == -1)
        perror("select");
    else if (ready > 0 && FD_ISSET(0, &readfds)) {
        int c = getc(stdin);
        printf("%c was pressed\n", c);
    } else
        printf("timeout\n");
    return 0;
}
Problems with select / Only file descriptors can be handled (on Windows, ONLY sockets). One cannot wait on a file descriptor together with a semaphore, a computation, a mutex, etc. / Unfairness with large sets on some implementations. / select modifies its arguments (the timeval and the fd_sets), and no assumption can be made about how they are modified (e.g. how much time is left in the timeval), so they must be re-initialized before every call. / Many modern UNIX systems support poll(2), a select replacement. On such systems select is usually implemented using poll (but poll is not available everywhere!).
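Because select modifies its arguments, a common pattern is to rebuild the fd_set and the timeval immediately before each call. The following is a minimal sketch of that pattern, not a definitive implementation; the helper name first_readable and its return convention are assumptions made for illustration.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for any of the `count`
 * descriptors in fds[] to become readable.
 * Returns the index of the first readable descriptor,
 * -1 on timeout, -2 on error. */
int first_readable(const int *fds, int count, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int maxfd = -1;

    /* Rebuild both arguments on every call: a previous select()
     * may have changed them in unspecified ways. */
    FD_ZERO(&readfds);
    for (int i = 0; i < count; i++) {
        FD_SET(fds[i], &readfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* nfds is the highest descriptor number plus one */
    int ready = select(maxfd + 1, &readfds, NULL, NULL, &tv);
    if (ready < 0)
        return -2;
    if (ready == 0)
        return -1;

    for (int i = 0; i < count; i++)
        if (FD_ISSET(fds[i], &readfds))
            return i;
    return -1;
}
```

A server loop would typically call such a helper repeatedly and read(2) from the descriptor it reports; since nothing is kept between calls, nothing stale can leak from one call to the next.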
When to use select(2) / Use select(2) when all of the following hold: / Multiple inputs must be handled / Inputs can be treated sequentially / Treating an individual request is very short / All inputs are file descriptors / You wish to avoid threading/multi-process problems / When inputs come in a different form it may be possible to use another API; the same considerations apply.
What not to use instead of select / Other async I/O methods, unless you know what you are doing (use poll if you like, but take note of portability issues) / Busy waiting / An extra thread/process to do what select can do just fine
select and pselect / Modern UNIX implementations also include the pselect(2) system call, which is similar to select(2). / pselect(2) takes almost the same parameters, with two differences: / The timeout is a struct timespec, so wait times are given in nanoseconds instead of microseconds. / A signal mask is given; the signals in it are blocked while waiting.
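The two differences can be seen in a short sketch. The helper name wait_readable_ns is an assumption made for illustration; the choice of SIGCHLD as the signal to block during the wait is also only an example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: wait for fd to become readable for at most
 * sec seconds plus nsec nanoseconds, with SIGCHLD blocked for the
 * duration of the wait.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable_ns(int fd, time_t sec, long nsec)
{
    fd_set readfds;
    struct timespec ts = { .tv_sec = sec, .tv_nsec = nsec };
    sigset_t mask;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* This mask replaces the thread's signal mask only while
     * pselect() is waiting; the old mask is restored on return. */
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);

    int ready = pselect(fd + 1, &readfds, NULL, NULL, &ts, &mask);
    if (ready <= 0)
        return ready;
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}
```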
Running multiple tasks
Thread, Process, Task - definitions / Process (n) – a running program, with its own memory protected by the OS from other processes. / Thread (n) – a "mini-program" or "program within a program": a separate running environment inside a process. / A single process may contain numerous threads. / Threads have memory protection from other processes (including the threads in those processes) but not from other threads in the same process. / Task (n) – will be used to refer to either a thread or a process.
Clarification / Every process we know of has at least one thread: the main() thread.
Multi tasking methods / Graceful (cooperative) multitasking – each task specifies when it "agrees" to be moved off the CPU for another task. The lwp library is an example. / No longer used by mainstream operating systems today / Mac OS Classic and Windows 3.11 are examples / Pre-emptive multitasking – the kernel decides which process receives the CPU and when. The kernel moves tasks in and out of the running state.
Multi tasking definition / Pre-empt (v) – to take the CPU away from a running task so that another task can run / Scheduler (n) – the part of the OS kernel that is responsible for pre-empting tasks and putting new tasks to execute
Multi-process programming / Running multiple tasks in different processes. / Task switching is managed by the OS (pre-emptive multitasking). / Each process has its own memory space (heap, stack, global variables, process environment). / Information and synchronization must be passed from process to process using an inter-process communication (IPC) API, such as Unix domain sockets.
Creating a new process: fork(2) / fork creates a new process identical to the current one except for the return value of fork(2): 0 in the child, the child's process id in the parent. / Other methods to start new programs under UNIX: / system(3) (run a command) / execXXX (function family that replaces the current process image with a new one)
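Fork example + why is "hello world" printed twice?

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    printf("hello world");   /* no newline: the text stays in the stdio buffer */
    fork();                  /* the child gets a copy of that buffer */
    printf("\n");
    fflush(stdout);          /* parent and child each flush their own copy */
    return 0;
}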
Answer / printf(3) works with buffers (which we can fflush(3) later). / The first printf(3) just copied the text to the buffer. / fork(2) duplicated the process (including the buffer). / Both buffers were flushed.
This example doesn't work on every system (printf(3) buffering and flushing behavior is not the same everywhere; it depends on the C library and on where stdout is directed), but when it does work it's KEWL!
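The effect can be made reproducible by writing to a temporary file, which stdio fully buffers, instead of the terminal. The sketch below is only an illustration under that assumption; the function name count_hello is hypothetical. It counts how many copies of the text reach the file, with and without an fflush(3) before the fork(2).

```c
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns how many times "hello world" ends up
 * in a temporary file, or -1 on error. */
int count_hello(int flush_before_fork)
{
    FILE *f = tmpfile();             /* regular file: fully buffered */
    if (f == NULL)
        return -1;

    fprintf(f, "hello world");       /* stays in the stdio buffer */
    if (flush_before_fork)
        fflush(f);                   /* buffer is empty when fork() copies it */

    pid_t pid = fork();
    if (pid == -1) {
        fclose(f);
        return -1;
    }
    if (pid == 0) {                  /* child */
        fputc('\n', f);
        fflush(f);
        _exit(0);                    /* skip stdio cleanup of other streams */
    }
    waitpid(pid, NULL, 0);           /* parent: let the child write first */
    fputc('\n', f);
    fflush(f);

    char buf[128];
    rewind(f);
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    buf[n] = '\0';
    fclose(f);

    int count = 0;
    for (const char *p = buf; (p = strstr(p, "hello world")) != NULL; p++)
        count++;
    return count;
}
```

With flush_before_fork set, the text is written once; without it, parent and child each flush their own copy of the buffer. Flushing (or avoiding stdio) before fork(2) is the usual way to keep output from being duplicated.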
How to pass information between processes / Using network sockets / Using Unix domain sockets / Using Sys V/POSIX IPC (message queues) / Using shared memory / Using RPC (or COM/CORBA/RMI etc.) / File locking "semaphore" kludge, Linux unmapped shared memory kludge / Platform-specific APIs (Linux sendfile, Sun Doors etc.)
In this course / We will discuss network and Unix domain sockets as a means to deliver information. / We will discuss file locking via fcntl(2) as a means to implement semaphores. / Other methods are described in APUE.
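As a small taste of Unix domain sockets, here is a sketch in which a child process sends a message to its parent over a connected socket pair. It uses socketpair(2), which is one convenient way to obtain two connected Unix domain sockets; the function name receive_from_child is an assumption made for illustration.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sends msg over a Unix domain
 * socket pair; the parent collects it into buf (NUL-terminated).
 * Returns the number of bytes received, or -1 on setup failure. */
ssize_t receive_from_child(const char *msg, char *buf, size_t buflen)
{
    int sv[2];
    if (buflen == 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                       /* child: the sender */
        close(sv[0]);
        size_t len = strlen(msg);
        _exit(write(sv[1], msg, len) == (ssize_t)len ? 0 : 1);
    }

    close(sv[1]);                         /* parent: the receiver */
    size_t total = 0;
    ssize_t n;
    /* Read until the buffer is full or the child closes its end. */
    while (total < buflen - 1 &&
           (n = read(sv[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';

    close(sv[0]);
    waitpid(pid, NULL, 0);                /* do not leave a zombie behind */
    return (ssize_t)total;
}
```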
Waiting for a process to die / A process usually runs unaffected by the processes it has spawned. / When a process terminates it returns a return code (the int from "int main()") to its parent process. / The parent process usually (unless we do something smart) ignores it. / A parent process can wait for a specific child process (or for any child process) to terminate using the wait(2) and waitpid(2) API.
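The "something smart" can be sketched as follows: the parent forks, the child runs a task and exits with its result, and the parent collects that result with waitpid(2). The names run_in_child and answer are hypothetical and exist only for this illustration.

```c
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run task() in a child process and return the
 * child's exit code, or -1 if the child could not be created or did
 * not exit normally. */
int run_in_child(int (*task)(void))
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(task());           /* child: fork() returned 0 */

    int status;                  /* parent: fork() returned the child's pid */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Example task used to exercise the helper. */
int answer(void)
{
    return 42;
}
```

Note that only the low 8 bits of the exit code reach the parent, so this channel is suitable for small status values rather than data.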
Zombie process / A process that has terminated but whose parent has not yet received its termination status (which usually means something is wrong with the parent) remains in the system as a "zombie" process. / "Orphaned" processes are adopted by init (process number 1), which always waits for its children to die.
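A long-running parent can avoid accumulating zombies by periodically collecting whichever children have already terminated. A minimal sketch, assuming the parent does not need the exit statuses (the name reap_children is hypothetical):

```c
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: collect every child that has already
 * terminated, without blocking. Returns the number collected. */
int reap_children(void)
{
    int reaped = 0;

    /* pid -1 means "any child"; WNOHANG makes waitpid() return 0
     * instead of blocking when no child has terminated yet. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    return reaped;
}
```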
exit(3) / A process can die and notify its parent of its exit status by calling exit(3), which ends in the _exit(2) system call. / Calling it terminates the calling process.
Network sockets example for IPC / Beej's guide to network programming provides a helpful tutorial on how to communicate between two processes on a single host. This guide will be described at recitation. / e/bgnet.html
Select - revisited / When a child process terminates, the parent process receives a signal (SIGCHLD). If the parent handles that signal, select(2) is interrupted and returns -1 with errno set to EINTR. / If you code a multi-process application and use select, you should usually treat this return status as harmless and call select again (or mask SIGCHLD and use pselect(2)).
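One way to treat EINTR as harmless is simply to call select again. The sketch below assumes a SIGCHLD handler is installed; the names install_sigchld_handler and wait_readable_retry are hypothetical. For simplicity it restarts with the full timeout after each interruption, so the total wait can exceed timeout_ms.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t sigchld_seen = 0;

static void on_sigchld(int sig)
{
    (void)sig;
    sigchld_seen = 1;            /* just record that a child terminated */
}

/* Hypothetical helper: install the handler above for SIGCHLD. */
int install_sigchld_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Hypothetical helper: like a plain select() on one descriptor, but an
 * interruption by a signal is not reported as an error.
 * Returns 1 if readable, 0 on timeout, -1 on a real error. */
int wait_readable_retry(int fd, int timeout_ms)
{
    for (;;) {
        fd_set readfds;
        struct timeval tv;

        FD_ZERO(&readfds);       /* rebuilt on every attempt */
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        int ready = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (ready == -1 && errno == EINTR)
            continue;            /* interrupted, e.g. by SIGCHLD: try again */
        return ready > 0 ? 1 : ready;
    }
}
```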
Problems with multi process / Since processes are memory protected, it is relatively hard to synchronize and pass information between multiple processes. / Using an IPC API subjects us to the constraints of that API. / Process overhead, especially process creation overhead, is heavy. / Context switching is expensive.
Software engineering: when to use a multi-process environment / Requests should be handled simultaneously. / select is not suitable. / Processes are created infrequently (or preferably, only once). / Relatively low number of processes overall. / IPC is not needed frequently. / You want process memory protection.
When not to use processes / Whenever we can (reasonably) do the job in one process. / Lots of information is transferred. / High performance is needed and you don't know what you are doing (context switching is expensive). / In almost any case where a thread would be just as good, much simpler, and won't hurt us.
User threads / Multi-thread programming / Processes are managed by the OS in separate memory spaces, which requires us to use IPC to transfer information between processes. / Threads are mini-processes. They share the heap, the process environment and the global variable scope, but each thread has a stack of its own. / Using threads, the entire heap is shared memory! (Actually the entire process!)
Threads API / The POSIX 95 threads API is now common to all UNIX OSes and should be used for all new applications that need threads on UNIX. / Legacy applications may use a different threads API (usually predating POSIX 95), such as Solaris threads. Those APIs are usually almost identical to the POSIX API. / Microsoft Windows has a similar API.
In this course / We will cover the POSIX threads API. / We will briefly discuss the Microsoft Windows threads API. / We will give an example of a cross-platform thread class.
pthread_create(3) / Creates a new thread / Takes a function pointer to serve as the thread's main function / Threads can be manipulated (waited for, prioritized) in a similar way to processes, but only from within the process.
Critical section / Very often we reach a situation where two tasks need access to the same memory area. / This can happen with processes and shared memory. / It occurs very frequently with threads. / Allowing both tasks access at the same time will very often result in corrupt reads or writes: / When both try to write / When one writes and one reads / There is no problem with two reads.
Memory corruption / When two tasks try to access the same memory space, we would like to guarantee that: / A read returns either the old or the new state of the memory (not a mishmash). / After multiple writes, the state from exactly one of the writes resides completely in memory (not a mishmash of two writes). / Failing that, we have memory corruption.
Handling critical section / Elimination (preferred method) / Locking / Mutex / semaphores etc. / Risk memory overrun - DO NOT DO IT. Even if you are 100% sure you know what you are doing!!!! (And if you do, consult someone, think again, and consult somebody else too!)
pthread_mutex / POSIX 95 provides two main forms of synchronization: / Mutex (mutual exclusion) - a device that locks other threads out of a critical section while I (I am a thread) am using it. / Cond - a sort of "reverse mutex": a device that keeps me (I am a thread) out of a critical section while another thread prepares it for use.
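A minimal sketch of a mutex guarding a critical section: several threads increment one shared counter, and the lock ensures that no increment is lost. The names (shared_counter, count_with_mutex) and the thread and iteration counts are assumptions chosen for the example.

```c
#include <pthread.h>

#define NTHREADS   4
#define INCREMENTS 100000

static long shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *add_many(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        shared_counter++;                 /* the critical section */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Hypothetical driver: runs NTHREADS incrementing threads and returns
 * the final counter value, or -1 if a thread could not be created. */
long count_with_mutex(void)
{
    pthread_t tid[NTHREADS];
    int started = 0;

    shared_counter = 0;
    while (started < NTHREADS &&
           pthread_create(&tid[started], NULL, add_many, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    return started == NTHREADS ? shared_counter : -1;
}
```

Without the lock and unlock calls the increments from different threads could interleave, and the result would typically fall short of NTHREADS * INCREMENTS.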
Deadlock (software engineering bug) / Consider a state where two resources are required in order to do something. / Each resource is protected by a mutex. / Two tasks each lock one mutex and wait for the other mutex to become available. / Both tasks hang and no work is done. / It is up to the software engineer to avoid deadlocks.
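One common way to avoid this particular deadlock, offered here as a sketch rather than as the only remedy, is to make every task acquire the two mutexes in the same global order. The example orders locks by address; the account structure and all names are hypothetical.

```c
#include <pthread.h>
#include <stdint.h>

struct account {
    pthread_mutex_t lock;
    long balance;
};

static struct account acct_a = { PTHREAD_MUTEX_INITIALIZER, 1000 };
static struct account acct_b = { PTHREAD_MUTEX_INITIALIZER, 1000 };

/* Needs both accounts locked. Every caller takes the lock at the
 * lower address first, so two opposite transfers cannot each hold
 * one lock while waiting for the other. */
void transfer(struct account *from, struct account *to, long amount)
{
    if (from == to)
        return;                  /* would lock the same mutex twice */

    struct account *first  = (uintptr_t)from < (uintptr_t)to ? from : to;
    struct account *second = (first == from) ? to : from;

    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);
    from->balance -= amount;
    to->balance += amount;
    pthread_mutex_unlock(&second->lock);
    pthread_mutex_unlock(&first->lock);
}

static void *a_to_b(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++)
        transfer(&acct_a, &acct_b, 1);
    return NULL;
}

static void *b_to_a(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++)
        transfer(&acct_b, &acct_a, 1);
    return NULL;
}

/* Hypothetical driver: two threads transfer in opposite directions.
 * Returns the combined balance, or -1 on thread creation failure. */
long run_opposite_transfers(void)
{
    pthread_t t1, t2;
    if (pthread_create(&t1, NULL, a_to_b, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, b_to_a, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return acct_a.balance + acct_b.balance;
}
```

If transfer instead locked from first and to second, the two threads above could each grab one lock and wait forever for the other.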
Recursive mutex / What happens if a thread locks a mutex and then, by some chain of events, locks it again? / By no means should the thread be blocked (deadlocked) by itself. / When the thread unlocks the mutex (which was locked twice), is it now unlocked, or must it be unlocked twice? / Different implementations have different answers: Linux requires an equal number of locks and unlocks, while the default Solaris behavior is to release all locks at once. / The default behavior can be changed (on Linux or Solaris) by specifying whether the mutex is or is not recursive. / Recursive = the Linux interpretation.
Using recursive mutexes is usually a discouraged way to write code (programmers reading the code tend to think the mutex is unlocked while in practice it is still locked). But programmers do it anyway...
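With the POSIX API, a mutex is made recursive through a mutex attribute of type PTHREAD_MUTEX_RECURSIVE. The sketch below locks such a mutex several times from one thread and counts how many unlocks succeed before the mutex reports that the caller no longer owns it. The function name unlocks_needed is hypothetical, and the example relies on the error return that POSIX specifies for unlocking a recursive mutex the thread does not hold.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

/* Hypothetical demo: lock a recursive mutex `locks` times from the
 * calling thread, then unlock until an unlock fails.
 * Returns the number of successful unlocks, or -1 on setup failure. */
int unlocks_needed(int locks)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (pthread_mutex_init(&m, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    for (int i = 0; i < locks; i++)
        pthread_mutex_lock(&m);      /* re-locking by the owner succeeds */

    int unlocks = 0;
    while (pthread_mutex_unlock(&m) == 0)
        unlocks++;                   /* fails once the lock count hits zero */

    pthread_mutex_destroy(&m);
    return unlocks;
}
```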
pthread_cond / A cond is a "reverse mutex": unlike a mutex, which is available on first use and blocks others until it is released, a cond blocks the thread that first waits on it and "releases" that thread when a second thread signals it. / A cond is typically used in a "producer-consumer" environment where the consumer is ready to consume before the producer is ready to produce. The consumer waits on the cond; the producer signals it when something is available.
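In the POSIX API a cond is always used together with a mutex that protects the condition being waited for. Here is a minimal producer-consumer sketch under that arrangement; the single-slot "queue" and every name in it (item_ready, has_item, consume_one) are assumptions made for the example.

```c
#include <pthread.h>

static pthread_mutex_t slot_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t item_ready = PTHREAD_COND_INITIALIZER;
static int has_item = 0;         /* the condition the consumer waits for */
static int item = 0;

static void *producer(void *arg)
{
    pthread_mutex_lock(&slot_lock);
    item = *(int *)arg;
    has_item = 1;
    pthread_cond_signal(&item_ready);    /* wake the waiting consumer */
    pthread_mutex_unlock(&slot_lock);
    return NULL;
}

/* Hypothetical driver: starts a producer for `value`, waits until the
 * item is available and returns it, or -1 if no thread was created. */
int consume_one(int value)
{
    pthread_t tid;
    int result;

    if (pthread_create(&tid, NULL, producer, &value) != 0)
        return -1;

    pthread_mutex_lock(&slot_lock);
    while (!has_item)            /* loop: wakeups may be spurious */
        pthread_cond_wait(&item_ready, &slot_lock);
    result = item;
    has_item = 0;
    pthread_mutex_unlock(&slot_lock);

    pthread_join(tid, NULL);
    return result;
}
```

pthread_cond_wait releases the mutex while the consumer sleeps and re-acquires it before returning, which is why the producer can get in to set has_item.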
Pthread create

int pthread_create(pthread_t *thread,
                   const pthread_attr_t *attr,
                   void *(*start_routine)(void *),
                   void *arg);
Arguments for pthread_create / The 1st argument receives the thread id (so that we can later do things with the thread). / The 2nd argument is used for creation attributes (can be safely ignored in this course). / The 3rd argument is the thread start routine (the thread's "int main()"). / The 4th argument is the argument passed to the thread function (the thread's "argc/argv"). / More on that in the recitation.
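The four arguments can be seen together in a short sketch that creates one thread, hands it a job through the void * argument and waits for it with pthread_join(3). The struct and function names are hypothetical.

```c
#include <pthread.h>

/* Hypothetical job description passed through the void * argument. */
struct square_job {
    int input;
    int output;
};

/* The thread's "main": receives the 4th argument of pthread_create. */
static void *square_worker(void *arg)
{
    struct square_job *job = arg;
    job->output = job->input * job->input;
    return NULL;
}

/* Hypothetical helper: computes n * n in a separate thread.
 * Returns the result, or -1 if the thread could not be run. */
int square_in_thread(int n)
{
    pthread_t tid;                        /* 1st argument: receives the id */
    struct square_job job = { n, 0 };

    /* 2nd argument NULL: default creation attributes */
    if (pthread_create(&tid, NULL, square_worker, &job) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)     /* wait for the thread to finish */
        return -1;
    return job.output;
}
```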
Windows create thread

HANDLE WINAPI CreateThread(
    __in_opt  LPSECURITY_ATTRIBUTES   lpThreadAttributes,
    __in      SIZE_T                  dwStackSize,
    __in      LPTHREAD_START_ROUTINE  lpStartAddress,
    __in_opt  LPVOID                  lpParameter,
    __in      DWORD                   dwCreationFlags,
    __out_opt LPDWORD                 lpThreadId
);
Comparison of Windows and UNIX thread functions / The Windows 1st, 2nd and 5th arguments are covered by the UNIX 2nd argument (the thread attributes). / The 3rd and 4th Windows arguments correspond to the 3rd and 4th UNIX arguments (the thread function and its argument). / The 6th Windows argument corresponds to the 1st UNIX argument (thread id).
Different OSes have different APIs, but the same principles rule everywhere (including embedded OSes, realtime OSes, mainframes, cellphone OSes etc.).
Threads benefits / Threads provide an easy method for multi-programming (because we have an easier time passing information). / Threads are lighter to create and delete than processes. / Threads have easy access to other threads' variables (and thus don't need an IPC protocol). / Context switching between threads is usually cheaper than between processes. / Threads are cool and sexy.
Problems when using threads / No need for IPC means all problems with locking and unlocking are up to the programmer. All seasoned programmers have several horror stories of chasing bugs past midnight in a dreaded threaded environment! / Context switching has a cost, which often makes a single thread more efficient than multiple threads. / Because threads are cool and sexy, thread use is often overdone. De-threading is a common task in many mature applications.
Common misconception about thread stacks / Each thread requires its own stack in order to enter functions, define automatic variables, etc. / So the OS gives a new stack to each thread. / But the OS has no memory protection between threads, period. / That means that if we create a pointer to a variable on one thread's stack, other threads can change that variable through it, with no locking.
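A tiny sketch makes the point concrete: a variable that lives on the calling thread's stack is changed by another thread that was merely handed its address. The function names are hypothetical.

```c
#include <pthread.h>

static void *overwrite(void *arg)
{
    *(int *)arg = 99;            /* writes into the other thread's stack */
    return NULL;
}

/* Hypothetical demo: returns the value of a local variable after a
 * second thread has run, or -1 if the thread could not be created. */
int local_after_other_thread(void)
{
    int local = 1;               /* lives on the calling thread's stack */
    pthread_t tid;

    if (pthread_create(&tid, NULL, overwrite, &local) != 0)
        return -1;
    pthread_join(tid, NULL);     /* also keeps `local` alive long enough */
    return local;
}
```

Nothing stopped the second thread from writing to the first thread's stack. The join matters as well: had the function returned earlier, the pointer would have referred to a stack frame that no longer exists.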
As always... when people do things that will only confuse other programmers, it is discouraged.
Thread safety and re-entrant code / Consider the function strtok(3). / Besides the fact that this function is one of the worst atrocities devised by mankind, it is also non-reentrant. / The function keeps a char * in static (global) storage so that subsequent calls can pass NULL as the first argument. / Consider what happens to this function when multiple threads use it simultaneously.
Calling strtok from two threads / The first thread calls strtok, passing a char pointer that is saved in strtok's static char *. / The second thread calls strtok, overwriting the first thread's pointer. / The first thread calls strtok with NULL and continues from the second thread's string. / Poetic justice? Just what the caller deserves?
Strtok example cont'd. / The global char * is a "CRITICAL SECTION" in the sense that it may not be used by two different threads at the same time. / So the second thread's call to strtok ruins it for the first. / A different function that doesn't use a global buffer was offered: strtok_r(), which keeps its position in a pointer supplied by the caller. / Similarly ctime() now has ctime_r(), localtime() has localtime_r(), etc.
Remove the critical section

#include <string.h>

char *strtok(char *str, const char *sep);
char *strtok_r(char *str, const char *sep, char **last);
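Because strtok_r keeps its position in a caller-supplied pointer, two tokenizations can be in progress at the same time, whether in two threads or, as in this single-threaded sketch, nested inside one another. The function name count_fields and the ";" and "," separators are assumptions chosen for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Hypothetical helper: text holds records separated by ';', each made
 * of fields separated by ','. Returns the total number of fields.
 * Like strtok, strtok_r modifies the string it scans. */
int count_fields(char *text)
{
    char *outer_state;           /* position of the record scan */
    char *inner_state;           /* position of the field scan */
    int fields = 0;

    for (char *rec = strtok_r(text, ";", &outer_state);
         rec != NULL;
         rec = strtok_r(NULL, ";", &outer_state)) {
        for (char *fld = strtok_r(rec, ",", &inner_state);
             fld != NULL;
             fld = strtok_r(NULL, ",", &inner_state))
            fields++;
    }
    return fields;
}
```

With plain strtok the inner scan would overwrite the single saved position, and the outer loop would not find the remaining records.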
Compiling multi-threaded code / Multi-threaded code requires several compile-time considerations. / Usually a compile/link switch (-lpthread or -pthread on UNIX platforms, /MT (/MTd) on Microsoft Windows). / Linking multi-threaded and non-multi-threaded code together may result in link or runtime errors on different platforms.
Software engineering: when to use threads / We cannot do things efficiently in a single thread. / Multi-processing is required. / Lots of data is shared between threads. / We don't need OS memory protection. / We think the new thread is absolutely necessary.
Common misuse of threads / Creating a new thread for every request received by a server: / Threads are expensive to create and delete. / Often causes starvation (the OS doesn't know which thread needs the CPU). / Reduces overall performance. / Creating multiple threads for each running transaction on a server (such as a DB). / Instead, create a thread pool of worker threads that share a work queue.
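A heavily simplified sketch of the thread pool idea: a fixed number of worker threads pull items from one shared, mutex-protected queue until it is empty. The queue here is just a pre-filled array with a "next item" index, and the "work" is squaring a number; a real server would add items while the workers run and would use a cond to let idle workers sleep. All names are hypothetical.

```c
#include <pthread.h>

#define WORKERS 4

struct work_queue {
    pthread_mutex_t lock;
    const int *items;
    int count;
    int next;                    /* index of the next unclaimed item */
    long total;                  /* sum of the results so far */
};

static void *worker(void *arg)
{
    struct work_queue *q = arg;

    for (;;) {
        pthread_mutex_lock(&q->lock);
        if (q->next == q->count) {       /* queue is empty: worker is done */
            pthread_mutex_unlock(&q->lock);
            return NULL;
        }
        int item = q->items[q->next++];  /* claim one item */
        pthread_mutex_unlock(&q->lock);

        long result = (long)item * item; /* the work, done outside the lock */

        pthread_mutex_lock(&q->lock);
        q->total += result;
        pthread_mutex_unlock(&q->lock);
    }
}

/* Hypothetical driver: processes all items with a pool of WORKERS
 * threads. Returns the sum of squares, or -1 if no worker started. */
long sum_of_squares_pooled(const int *items, int count)
{
    struct work_queue q;
    pthread_t tid[WORKERS];
    int started = 0;

    if (pthread_mutex_init(&q.lock, NULL) != 0)
        return -1;
    q.items = items;
    q.count = count;
    q.next = 0;
    q.total = 0;

    while (started < WORKERS &&
           pthread_create(&tid[started], NULL, worker, &q) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    pthread_mutex_destroy(&q.lock);
    return started > 0 ? q.total : -1;
}
```

The number of threads stays fixed no matter how many items arrive, which is the point of the pool: thread creation cost is paid once, not per request.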
Common misuse of threads 2 / Creating many “this little thread only does this” threads. / It becomes impossible to design reasonable locking and unlocking state machines. / Once the number of threads goes up, the many thread-to-thread interfaces, with their locking and unlocking, are guaranteed to cause bugs. / Only create threads when things must be done in parallel and no other thread can reasonably do the task.
Summary
Single process / Should provide the best overall performance. / Easiest to program. / May be hard to design, specifically if it needs to handle inputs from multiple source types. / May be prone to hanging on a specific request. / Should be preferred whenever the complexity arising from multiplicity is not severe.
Multi process / Uses the OS to create processes, swap processes and context switch, thus adding load. / IPC makes it hard to program. / Usually easy to design if process tasks are easily separated. / Should be preferred when IPC is minimal and we wish to have better control over memory access in each process.
Multi thread / Uses the OS to create threads and context switch, adding load, though not as much as processes because threads are lighter. / Easy to program and to pass information between threads, but also dangerous. / Usually hard to design so as to avoid deadlocks, bottlenecks, etc. / Should be preferred when lots of IPC is needed. / Dangerous: novice programmers reading and writing unprotected memory segments.
Common multi-threaded design patterns
Producer - Consumer / Producer: / Produce something. / Put it in the queue. / Inform the consumer it is ready. / Consumer: / Wait on the queue. / Take stuff from the queue. / Consume it. / Return to the queue.
Producer - Consumer / Producer-consumer is typically used with handler threads: some thread does some work and puts the result in a queue for other thread(s) to consume. / Sometimes a series of producer-consumer stages defines a single transaction. / Real-world examples: handling requests in a web server, a DB server, or any other server that gets requests over a single pipe and has several handling threads.
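The producer and consumer steps can be sketched with a small blocking queue. The names BlockingQueue and produce_and_consume are hypothetical, and the sketch assumes C++11 standard threading primitives; the comments map each call back to the steps listed on the slide.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical queue shared by the producer and the consumer.
template <typename T>
class BlockingQueue {
public:
    void put(T item) {                       // producer: put it in the queue
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(item));
        }
        ready_.notify_one();                 // inform a consumer it is ready
    }
    void close() {                           // producer: nothing more will come
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }
    // Consumer: wait on the queue and take one item.
    // Returns false once the queue is closed and empty.
    bool take(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};

// One producer thread produces 1..n; the calling thread consumes and sums.
long produce_and_consume(int n) {
    BlockingQueue<int> queue;
    std::thread producer([&queue, n] {
        for (int i = 1; i <= n; ++i) queue.put(i);
        queue.close();
    });
    long total = 0;
    int item = 0;
    while (queue.take(item)) total += item;  // consume it, then return to the queue
    producer.join();
    return total;
}
```

The consumer sleeps inside take() instead of polling, so no busy waiting is involved.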
Guard / Converting non-reentrant code to reentrant code is sometimes a tedious task. / Code from multiple threads may enter a non-reentrant scope from many places. / If we use locking and forget to release the mutex we may suffer from deadlocks (releasing the mutex is sometimes not as trivial as it sounds, because legacy code tends to have many surprises in store, such as break, continue, goto and other “goodies”). / A Guard or “Scope Mutex” is a class that wraps a mutex implementation. / The class destructor releases the mutex. / By using the C++ destructor mechanism we ensure that the mutex is released when we leave the “critical segment”.
Scope Mutex header
class CScopeMutex {
public:
    CScopeMutex(Cmutex& mutex);
    ~CScopeMutex() { unlock(); }   // leaving the scope releases the mutex
    void wait();
    void signal();
    inline void lock() { wait(); }
    inline void unlock() { signal(); }
private:
    Cmutex& Mutex;
};
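To show why the destructor matters, here is a minimal sketch of a guard protecting a function with an early return. The slides do not show Cmutex, so the version below is a hypothetical stand-in built on a pthread mutex; ScopeGuard and first_negative are likewise illustrative names, and the guard is assumed to lock in its constructor.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical stand-in for the slides' Cmutex (not shown in the source).
class Cmutex {
public:
    Cmutex() { pthread_mutex_init(&m_, nullptr); }
    ~Cmutex() { pthread_mutex_destroy(&m_); }
    void lock() { pthread_mutex_lock(&m_); }
    void unlock() { pthread_mutex_unlock(&m_); }
    bool try_lock() { return pthread_mutex_trylock(&m_) == 0; }
private:
    pthread_mutex_t m_;
    Cmutex(const Cmutex&);              // non-copyable
    Cmutex& operator=(const Cmutex&);
};

// Locks in the constructor, unlocks in the destructor.
class ScopeGuard {
public:
    explicit ScopeGuard(Cmutex& m) : mutex_(m) { mutex_.lock(); }
    ~ScopeGuard() { mutex_.unlock(); }
private:
    Cmutex& mutex_;
    ScopeGuard(const ScopeGuard&);      // non-copyable
    ScopeGuard& operator=(const ScopeGuard&);
};

// Legacy-style function with several exits; none of them unlocks by hand.
int first_negative(Cmutex& m, const int* values, int count) {
    ScopeGuard guard(m);
    for (int i = 0; i < count; ++i) {
        if (values[i] < 0) return i;    // early return: the destructor still unlocks
    }
    return -1;
}
```

After first_negative returns through either exit, the mutex can be locked again immediately, which is exactly the property that is easy to lose with hand-written unlock calls.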
Signal / Using select is very easy and is very often required. / We cannot wait on both a socket and a cond (condition variable) using select. / Instead we will use a socket buffer: when we wish to wait for the cond, we read 1 byte from the socket buffer (using select(2), of course). / When we wish to release the cond, we write 1 byte.
Signal header
class Csignal {
private:
    int fd[2];
    char buf;
    void InitSignal();
public:
    Csignal();
    virtual ~Csignal();
    Csignal(const Csignal& other) { InitSignal(); }
    int send();
    int signal() { return send(); }
    int wait();
    int GetWaitFD();
};
Signal implementation
Csignal::Csignal()
{
    InitSignal();
    buf = 42;
}

void Csignal::InitSignal()
{
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd) == -1)
        THROW_SOCKETERROR;
}

Csignal::~Csignal()
{
    close(fd[0]);
    close(fd[1]);
}
Signal example
int Csignal::send()
{
    if (::send(fd[0], &buf, sizeof(char), 0) != sizeof(char))
        THROW_ERRNO;
    return 1;
}

int Csignal::wait()
{
    char res;
    if (recv(fd[1], &res, sizeof(char), 0) != 1)
        THROW_ERRNO;
    return 1;
}

int Csignal::GetWaitFD()
{
    return fd[1];
}
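The reason for exposing GetWaitFD() is that the descriptor can be placed in the same select(2) call as ordinary sockets. A minimal sketch of that idea follows; the function name wait_for_either and its parameters are hypothetical, and signal_fd simply stands for whatever GetWaitFD() would return.

```cpp
#include <cassert>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper: waits until either descriptor is readable or the
// timeout expires. Returns the readable descriptor, or -1 on timeout/error.
// `signal_fd` plays the role of Csignal::GetWaitFD().
int wait_for_either(int data_fd, int signal_fd, int timeout_ms) {
    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(data_fd, &readable);
    FD_SET(signal_fd, &readable);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int max_fd = data_fd > signal_fd ? data_fd : signal_fd;
    if (select(max_fd + 1, &readable, nullptr, nullptr, &tv) <= 0)
        return -1;

    // When both are ready, this sketch reports the signal first.
    return FD_ISSET(signal_fd, &readable) ? signal_fd : data_fd;
}
```

A thread blocked in wait_for_either wakes up as soon as another thread writes the single byte, so one loop can serve both network input and inter-thread notifications.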
Further reading and examples / Numerous libraries exist on the net that manage OS services and provide infrastructure design patterns for an efficient multi-platform environment. / Examples include: / NSPR (Netscape Portable Runtime) / ICE / ACE, which I prefer
Example code: Thread wrapper class / This class creates a thread using Windows threads, Solaris threads or POSIX threads. / The class has an “Execute” method to be overridden by derived classes (the derived class “is a” thread). / I will only discuss thread creation. / A real-world implementation should also include: / Methods to wait for termination, suspend, prioritize / Attributes to get status (started, stopped, terminated) and return code / Queues for working threads / Etc.
CThread: header file
class CThread {
public:
    CThread(CMutexedSignal* FinishedSignal = NULL);
    virtual ~CThread();
    virtual void * Execute() = 0;
    int CreateThread();
    void WaitFor();
    void Terminate();
    thread_t Thread;
    CMutexedSignal* FinishedSignal;
    bool Started;
    bool Finished;
};
CThread function body
int CThread::CreateThread()
{
#ifdef WIN32
    HANDLE Thread;
    DWORD ID;
    Thread = ::CreateThread(NULL, 0, call_start, (void*)this, 0, &ID);
    this->Thread = Thread;
    return (int)(Thread == NULL);
#else
    return ctf_thread_create(&Thread, call_start, this);
#endif
}
Call_start - Thread main()
#ifndef WIN32
extern "C" {
static void * call_start(void * This)
#else
DWORD WINAPI call_start(LPVOID This)
#endif
{
    if (This) {
        ((CThread *)This)->Started = true;
        ((CThread *)This)->Execute();
        ((CThread *)This)->Finished = true;
        if (((CThread *)This)->FinishedSignal)
            ((CThread *)This)->FinishedSignal->signal();
    }
    return NULL;
}
#ifndef WIN32
}
#endif
ctf_thread_create
inline int ctf_thread_create(pthread_t *thread, void* (*start_routine)(void *), void* arg)
{
#ifdef Solaris_threads
    return thr_create(NULL, (size_t)0, start_routine, arg, 0, thread);
#else // POSIX THREADS
    return pthread_create(thread, NULL, start_routine, arg);
#endif
}
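The slides stop at thread creation and do not show a derived class. As a rough, POSIX-only sketch of how the “derived class is a thread” idea might be used, the following reduces CThread to its essentials. SimpleThread, simple_thread_entry and SquareThread are hypothetical names, and the signalling, Windows and Solaris branches of the original are deliberately left out.

```cpp
#include <cassert>
#include <pthread.h>

// Entry point with C linkage, as in the slides' call_start.
extern "C" void* simple_thread_entry(void* self);

// Hypothetical POSIX-only reduction of the CThread idea.
class SimpleThread {
public:
    SimpleThread() : started_(false) {}
    virtual ~SimpleThread() {}
    virtual void* Execute() = 0;          // overridden by derived classes

    // Returns 0 on success, like pthread_create.
    int Start() {
        int rc = pthread_create(&thread_, nullptr, simple_thread_entry, this);
        started_ = (rc == 0);
        return rc;
    }
    // Blocks until Execute() has returned and hands back its result.
    void* WaitFor() {
        void* result = nullptr;
        if (started_) {
            pthread_join(thread_, &result);
            started_ = false;
        }
        return result;
    }
private:
    pthread_t thread_;
    bool started_;
};

extern "C" void* simple_thread_entry(void* self) {
    return static_cast<SimpleThread*>(self)->Execute();
}

// A derived class "is a" thread: it only supplies the work to do.
class SquareThread : public SimpleThread {
public:
    explicit SquareThread(int n) : n_(n), result_(0) {}
    void* Execute() override {
        result_ = n_ * n_;
        return nullptr;
    }
    int result() const { return result_; }   // read only after WaitFor()
private:
    int n_;
    int result_;
};
```

The caller constructs a SquareThread, calls Start(), and reads result() only after WaitFor() has returned, since pthread_join is what makes the worker's write visible to the caller.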