An Enhanced Portable Thread Manager Presentation by Vijay Murthi Supervisor: David Levine Committee Members: Behrooz Shirazi and Bob Weems.


1 An Enhanced Portable Thread Manager
Presentation by Vijay Murthi
Supervisor: David Levine
Committee Members: Behrooz Shirazi and Bob Weems

2 Multithreading
Creating portable and automatically scalable parallel software has been a goal for many researchers and practitioners since the advent of parallel computing.
Threading is an effective way to use the resources of a shared-memory system.
– Similar to multiprocessing, except that threads share the same address space whereas processes do not.
– Inter-thread communication is much faster and less restrictive.
– Less overhead than processes for creation, deletion, and management (such as context switching).
Threads are sometimes useful even on single-processor machines.
– Example: an application accessing a database while waiting for input from the user.
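
The point that threads share one address space can be illustrated with a minimal POSIX threads sketch. It is not taken from the presentation and does not use ThreadMan; the names shared_counter, add_to_counter, and run_two_workers are assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One variable, visible to every thread in the process (shared address space). */
static long shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Worker: adds *arg increments to the shared counter, one at a time, under a mutex. */
static void *add_to_counter(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        pthread_mutex_lock(&counter_lock);
        shared_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts two workers, waits for both, and returns the final counter value. */
long run_two_workers(long increments_each)
{
    pthread_t a, b;
    int rc;

    shared_counter = 0;
    rc = pthread_create(&a, NULL, add_to_counter, &increments_each);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, add_to_counter, &increments_each);
    assert(rc == 0);
    (void)rc;

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

Both workers update the same variable without copying data between them, which is what makes inter-thread communication cheap. The mutex is the price of that sharing: without it the increments could interleave and be lost.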

3 Existing Solutions and Bottlenecks
Hand-coded threading directives inside the source code for calls to thread libraries.
Parallel programming languages.
Compiler-based directives.
– These methods are unduly expensive in terms of software development time, development cost, and the programming expertise required.
Programmers still need to concentrate on the ‘how’ of programming rather than the ‘what’ of programming.

4 Programming using PARSA™
Makes parallel programming similar to sequential programming.
Provides two levels of abstraction:
– Abstraction from low-level parallel programming issues
– Abstraction from deployment issues
Two tightly coupled tools have been developed to achieve this:
– Software Development Environment (SDE): addresses programming issues
– Thread Manager: addresses deployment and portability issues

5 Software Development Environment
– Based on an object-based programming methodology that automatically transforms a project into parallel source code that scales with the number of CPUs.
– Projects consist of graphical objects and arcs.
– Each object represents a project task to be performed.
– Arcs indicate the dependencies between objects.

6 Programming using SDE
Interfaces define the “contract” a graphical object has with other graphical objects in a project.
Each graphical object can have an INPUT interface and an OUTPUT interface.
Arcs are lines connecting a desired OUTPUT port to another INPUT port.
Semantically, however, arcs represent data “being passed” from a source object to a destination object.

7 Projects
Two types of projects:
– Applications and Functions
Function projects are like applications, except that:
– Functions have inputs and outputs.
– The code generated is reentrant.
– Invocation is similar to calling functions in standard languages such as C or C++.
– FunctionInputs: pass the input arguments to the corresponding objects to start execution.
– FunctionOutputs: deliver the output variables back to the calling program.

8 Execution Model
The project implicitly defines the order of execution.
The SDE has a built-in source code generator for automatic code generation.
The generated source code makes calls to the thread manager for run-time management of threads.
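
One way to picture how a project of objects and arcs "implicitly defines the order of execution" is a topological ordering of the dependency graph. The sketch below is an assumption about the general idea, not PARSA's actual scheduler; MAX_OBJECTS, the adjacency-matrix layout, and execution_order are hypothetical names.

```c
#include <assert.h>

#define MAX_OBJECTS 16

/*
 * arcs[i][j] != 0 means an arc runs from object i's OUTPUT to object j's INPUT,
 * so object j may only start after object i has finished.
 * Writes one valid execution order into order[] and returns the number of
 * objects scheduled, or -1 if the arcs contain a cycle.
 */
int execution_order(int n, int arcs[MAX_OBJECTS][MAX_OBJECTS], int order[MAX_OBJECTS])
{
    int pending[MAX_OBJECTS] = {0}; /* unfinished predecessors per object */
    int done[MAX_OBJECTS] = {0};
    int count = 0;

    assert(n >= 0 && n <= MAX_OBJECTS);

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (arcs[i][j])
                pending[j]++;

    while (count < n) {
        int picked = -1;
        /* Pick the lowest-numbered object whose inputs are all available. */
        for (int i = 0; i < n; i++) {
            if (!done[i] && pending[i] == 0) {
                picked = i;
                break;
            }
        }
        if (picked < 0)
            return -1; /* every remaining object waits on another: cycle */

        done[picked] = 1;
        order[count++] = picked;

        /* Finishing this object satisfies one input of each successor. */
        for (int j = 0; j < n; j++)
            if (arcs[picked][j])
                pending[j]--;
    }
    return count;
}
```

For a diamond-shaped project (object 0 feeds objects 1 and 2, which both feed object 3) the function yields the order 0, 1, 2, 3. Objects 1 and 2 have no arc between them, so a parallel run time would be free to execute them concurrently.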

9 ThreadMan™ Thread Manager
A dynamically linkable library with a standard API.
Eliminates the need for programmers to develop code to manage run-time execution.
Ensures that parallel software executes according to the execution model.
Supports various forms of parallelism.
Makes the generated code portable.
– NOTE: Programmers need to take care of their own system-specific library calls.

10 Contribution
ThreadMan enhancements
– Motivation
– Design
– Runtime analysis
Reentrant code generation in PARSA
– Motivation
– Design
– Runtime analysis

11 ThreadMan Enhancements

12 Motivation
Problems found in the architecture and implementation of ThreadMan v1.x:
– Poor API design: inconsistent naming conventions.
– Redundant information stored and updated: poor run-time performance and memory utilization.
– PARSA-generated source code bloat: more code had to be generated for run-time management of PARSA projects.
– Memory leaks: unreliable code; applications could crash.
– Limited system support: no support for MS Windows.

13 Design
ThreadMan has been completely redesigned.
– An entirely new architecture for better run-time performance.
– A new Application Programming Interface (API).
– Consistent naming conventions.
– Efficient handling of data structures.
– Expanded system support: API functions added to support MS Windows.
– Programmer accessible: provides APIs for the most commonly used threading, mutex, and semaphore directives.

14 ThreadMan Architecture

15 ThreadMan Components

16 Different types of parallelism
ThreadMan supports different types of parallelism:
– Regular parallelism (data)
– Irregular parallelism
– Repeat parallelism (loop)
– Nested parallelism (parallelism inside parallelism)
Usually, ThreadMan is invoked as a main thread that manages the execution of other threads or thread managers.
A child thread manager is invoked by the parent whenever there is repeat or nested parallelism to control.
– The children execute independently.
– The parent ThreadMan waits for its children and the other threads it has spawned to complete.
– After execution, control is passed back to the parent.
ThreadMan can manage its children and other threads simultaneously.
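
The parent/child relationship described above (children run independently, the parent waits, then control returns to the parent) follows the familiar fork-join pattern. The sketch below shows that pattern with plain POSIX threads; it is not ThreadMan code, and child_task, child_main, and parent_run_children are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_CHILDREN 4

/* Per-child input and output; each child touches only its own slot. */
struct child_task {
    int id;
    int result;
};

/* Child: works independently of its siblings and stores id squared. */
static void *child_main(void *arg)
{
    struct child_task *task = arg;
    task->result = task->id * task->id;
    return NULL;
}

/* Parent: spawns the children, waits for all of them, then combines results. */
int parent_run_children(void)
{
    pthread_t threads[NUM_CHILDREN];
    struct child_task tasks[NUM_CHILDREN];
    int total = 0;

    for (int i = 0; i < NUM_CHILDREN; i++) {
        tasks[i].id = i;
        tasks[i].result = 0;
        int rc = pthread_create(&threads[i], NULL, child_main, &tasks[i]);
        assert(rc == 0);
        (void)rc;
    }

    /* Control returns to the parent only after every child has completed. */
    for (int i = 0; i < NUM_CHILDREN; i++) {
        pthread_join(threads[i], NULL);
        total += tasks[i].result;
    }
    return total;
}
```

Nested parallelism would correspond to a child that itself calls a function like parent_run_children, becoming the parent of its own group of threads.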

17 Runtime Analysis
Hardware
– Dual-processor Sun Ultra Enterprise 3000 with Solaris 2.6.
– Dual-bootable Pentium II machine with Windows NT Server 4.0 and Linux 6.2.
– The same compiler was used to compile the sequential and parallel source code.
Applications
– Two applications were developed, each in sequential C and in PARSA version 2.0.
– Merge sort possesses irregular and repeat parallelism.
– Matrix multiplication contains regular parallelism.
– The thread manager manages the run-time execution of the parallel versions across the Solaris, Linux, and Windows NT platforms.
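
Regular (data) parallelism of the kind attributed to matrix multiplication can be sketched by giving each thread a contiguous band of result rows. This is a hand-written POSIX threads illustration, not the PARSA-generated benchmark; DIM, NUM_THREADS, row_range, multiply_rows, and parallel_multiply_check are assumed names, and the tiny matrix size is chosen only to keep the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DIM 4
#define NUM_THREADS 2

static double A[DIM][DIM], B[DIM][DIM], C[DIM][DIM];

/* Half-open band of result rows [first, last) owned by one thread. */
struct row_range {
    int first;
    int last;
};

/* Computes C = A * B for the rows in the given band only. */
static void *multiply_rows(void *arg)
{
    const struct row_range *r = arg;
    for (int i = r->first; i < r->last; i++) {
        for (int j = 0; j < DIM; j++) {
            double sum = 0.0;
            for (int k = 0; k < DIM; k++)
                sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }
    }
    return NULL;
}

/* Multiplies A by the identity in parallel; returns 1 if C equals A, else 0. */
int parallel_multiply_check(void)
{
    pthread_t threads[NUM_THREADS];
    struct row_range ranges[NUM_THREADS];

    for (int i = 0; i < DIM; i++) {
        for (int j = 0; j < DIM; j++) {
            A[i][j] = (double)(i + j);
            B[i][j] = (i == j) ? 1.0 : 0.0;
            C[i][j] = -1.0;
        }
    }

    /* Every thread runs the same code on a different, equally sized band. */
    for (int t = 0; t < NUM_THREADS; t++) {
        ranges[t].first = t * DIM / NUM_THREADS;
        ranges[t].last = (t + 1) * DIM / NUM_THREADS;
        int rc = pthread_create(&threads[t], NULL, multiply_rows, &ranges[t]);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < NUM_THREADS; t++)
        pthread_join(threads[t], NULL);

    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++)
            if (C[i][j] != A[i][j])
                return 0;
    return 1;
}
```

No locking is needed because each thread writes a disjoint set of rows of C and only reads A and B. Merge sort, by contrast, splits work recursively and unevenly, which is why the slides classify it as irregular.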

18 Dual Processor Performance
The speedup data points were calculated by dividing the sequential execution times by the parallel execution times, which normalizes the performance on each system.
Scalable applications execute at very nearly the maximum theoretical speedup of 2 on the Solaris and Linux systems, with slightly lower performance on the Windows system.
On Linux, merge sort speedup slightly exceeds the theoretical limit. This behavior was seen consistently and may be due to caching effects.
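
The speedup calculation described above is a simple ratio. The helper below states it in code, together with parallel efficiency (speedup divided by processor count), which is a standard companion metric that the slides do not mention. The function names are assumptions.

```c
#include <assert.h>

/* Speedup = sequential execution time / parallel execution time. */
double speedup(double sequential_seconds, double parallel_seconds)
{
    assert(parallel_seconds > 0.0);
    return sequential_seconds / parallel_seconds;
}

/* Efficiency = speedup / number of processors; 1.0 means ideal scaling. */
double efficiency(double speedup_value, int processors)
{
    assert(processors > 0);
    return speedup_value / processors;
}
```

With made-up timings of 10.0 s sequential and 5.0 s parallel (not measurements from the presentation), speedup is 2.0, the theoretical maximum on two processors, and efficiency is 1.0. A speedup above the processor count, as reported for merge sort on Linux, corresponds to an efficiency above 1.0.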

19 Dual Processor Performance (Contd.)
Some applications by nature do not scale very well.
– This depends on the algorithmic design of the program.
These applications may not be suitable for multithreading. The same holds when they are developed with PARSA.
NOTE: Non-scalable applications developed using multithreading may sometimes exhibit poor performance because of the overhead that multithreading incurs.

20 Reentrant Code Generation using PARSA

21 Reentrant Code
Reentrant code: multiple invocations of the code can execute safely when running concurrently.
– Examples: functions, web-based applications, and libraries.
Lack of reentrant behavior limits how multithreaded code can be deployed.
– Unreliable results can be generated because multiple invocations of the code can corrupt each other's data during execution.
Non-reentrant code is unacceptable for many general-purpose uses.
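
The difference between non-reentrant and reentrant code can be shown with two versions of the same small function. This is a generic C illustration, not PARSA output; label_shared and label_local are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Non-reentrant: every invocation writes into the same static buffer,
 * so a later call overwrites the result an earlier caller is still using.
 */
const char *label_shared(int id)
{
    static char buffer[32];
    snprintf(buffer, sizeof buffer, "task-%d", id);
    return buffer;
}

/*
 * Reentrant: the caller supplies the storage, so concurrent invocations
 * each work on their own memory and cannot corrupt one another.
 */
char *label_local(int id, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    snprintf(out, out_size, "task-%d", id);
    return out;
}
```

Calling label_shared(1) and then label_shared(2) returns the same pointer both times, and the first result now reads "task-2". With label_local, two callers that pass separate buffers keep "task-1" and "task-2" intact. The corruption is visible here even in a single thread; with concurrent invocations it becomes timing-dependent and harder to reproduce.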

22 Motivation
Originally, PARSA-generated code was not designed to be reentrant.
– Investigation revealed problems with multithreaded functions.
The problem:
– The PARSA-generated source code shares a global data space for passing data. Multiple threads spawned during execution pass data through this global space.
– This feature efficiently utilizes the shared-memory architecture of multithreaded systems (e.g., single-processor systems and symmetric multiprocessors (SMPs)).
Multithreaded function projects developed in PARSA can generate unexpected (i.e., incorrect) results when multiple invocations of the function are executed concurrently.
– All invocations share the same data space and can produce unreliable results.
However, application projects developed in PARSA are unaffected because they are executed with their own state at run time.
– A solution was needed to eliminate the use of the global shared space and create an individual local space for each invocation.

23 Design Issues
The design required a balance between conflicting requirements.
– Uniformity of PARSA-generated code: multithreaded functions may require fundamentally different code because of reentrancy issues.
– Reduced volume of code: a separate source code generator and thread manager may be needed for generating reentrant multithreaded functions.
– Run-time performance: a solution for multithreaded functions could impact the run-time performance of applications.

24 Design
New API functions added to ThreadMan:
– parsa_global_new: dynamically allocates data-passing memory. Each invocation has its own local memory allocated.
– parsa_global_delete: frees the allocated data-passing memory.
– parsa_global_getref: gets the pointer to the invocation’s locally allocated memory.
Reentrant code generation feature integrated with the source code generator.
– The source code generator generates additional code that calls the new API.
Changes to the UI
– Options that enable reentrant code generation for applications.
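
The slides name three API functions but do not give their signatures, so the sketch below deliberately uses stand-in names (demo_space_new, demo_space_getref, demo_space_delete) rather than guessing at the real ThreadMan interface. It only illustrates the allocate / get-reference / free pattern, under the assumption that each invocation holds a handle to its own data-passing memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-invocation data-passing space (not a ThreadMan type). */
struct demo_space {
    size_t size;
    void *data;
};

/* Allocates a zero-filled space for one invocation; returns NULL on failure. */
struct demo_space *demo_space_new(size_t size)
{
    struct demo_space *space = malloc(sizeof *space);
    if (space == NULL)
        return NULL;
    space->data = calloc(1, size);
    if (space->data == NULL) {
        free(space);
        return NULL;
    }
    space->size = size;
    return space;
}

/* Returns the pointer to this invocation's own memory. */
void *demo_space_getref(struct demo_space *space)
{
    return space->data;
}

/* Frees the space once the invocation has delivered its outputs. */
void demo_space_delete(struct demo_space *space)
{
    if (space != NULL) {
        free(space->data);
        free(space);
    }
}

/* A function-style entry point: every call gets a private accumulator. */
int sum_of_squares(int n)
{
    struct demo_space *space = demo_space_new(sizeof(int));
    assert(space != NULL);

    int *accumulator = demo_space_getref(space);
    for (int i = 1; i <= n; i++)
        *accumulator += i * i;

    int result = *accumulator;
    demo_space_delete(space);
    return result;
}
```

Because the accumulator lives in memory allocated per call rather than in a global, two concurrent calls to sum_of_squares cannot overwrite each other's intermediate data. The cost is one allocation and one free per invocation, which is the kind of run-time trade-off the Design Issues slide raises.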

25 Runtime Analysis
Dual-processor Sun Ultra Enterprise 3000.
Solaris version 2.6.
The same compiler was used to compile the sequential and parallel source code.
Single- and dual-processor analyses were performed on the same machine.

26 Single Processor Performance
Graphs show the speedup of the parallel software relative to the sequential version on a single-processor Solaris machine.
The parallel code performance is almost equal to the sequential performance on a single-processor machine.
This suggests that the thread manager adds little overhead when managing threads.

27 Dual Processor Performance
Graphs show the speedup of the parallel software relative to the sequential version on a two-processor Solaris machine.
The parallel code performance is almost equal to twice the sequential performance for higher loads.
This shows that the thread manager scales as the number of processors grows.

28 Papers
– Published a paper titled "A Tool Based Methodology for Development of Automatically Scalable and Reusable Parallel Code" for "The Tenth IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems", Fort Worth, Texas, October 2002.
– Published a paper titled "Creating Portable and Automatically Scalable Parallel Software Using the PARSA™ Programming Methodology" for "The 5th IEEE International Conference on Algorithms and Architectures for Parallel Processing", Beijing, China, October 2002.
– Presented a paper titled "Performance Analysis and Scalability of the ThreadMan(TM) Thread Manager" for "The 2002 International Conference on Parallel and Distributed Processing Techniques and Applications", Las Vegas, Nevada, July 2002.
– ThreadMan API published as a white paper by PrismPTI, Inc., 2001. [Downloadable at http://omega.uta.edu/~vxm7387/projects.html]

29 Conclusion
The thesis presents ThreadMan, an integral component of PARSA that
– eliminates the need for programmers to develop code that controls the run-time execution of their parallel software projects.
– supports development of multithreaded applications and functions.
– supports different forms of parallelism.
– makes PARSA-generated parallel source code portable across a wide range of platforms.
Reentrant multithreaded functions can be developed in PARSA that will execute safely in a wide range of deployment environments:
– C functions.
– C++ methods.
– Web-based applications.

