Parallelisation of Desktop Environments Nasser Giacaman Supervised by Dr Oliver Sinnen Department of Electrical and Computer Engineering, The University of Auckland
Overview: Introduction to parallel computing; Motivation; State of the art; Objectives; Methodology; Conclusion
What is parallel computing?
Sequential computing vs. parallel computing
Parallel computing challenges: Lack of a central unifying model Communication mechanisms Machine architectures
Parallel computing challenges: Task decomposition
Parallel computing challenges: Dependency analysis and Scheduling
Background to desktop parallelisation: Introduction to parallel computing; Motivation (The need for desktop parallelisation; Additional challenges for desktop parallelisation); State of the art; Objectives; Methodology; Conclusion
The need for desktop parallelisation (timeline figure, built up over four slides: ~1980, ~1985, ~1990, ~1995)
The need for desktop parallelisation
Parallelise (figure: Sequential program -> Parallel program)
Additional Challenges for Desktop Apps: Non-dedicated system. Other applications will be competing for the processor.
Additional Challenges for Desktop Apps: Overhead sensitive (figure: Sequential vs. Parallel)
Additional Challenges for Desktop Apps: Different application types (repetitive computation, computationally intense, one-off computation, interactive)
Background to desktop parallelisation: Introduction to parallel computing; Motivation; State of the art (Typical desktop applications; The parallelisation process and tools); Objectives; Methodology; Conclusion
Structure of a typical desktop application (diagram labels: GUI Thread with Event loop, Event queue, Event handler..., Paint GUI, update GUI components; worker Thread with execute long task, GUI update. KEY: Thread, execution flow, data structure, code)
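The structure on this slide can be sketched in plain Java. This is a minimal illustration, not code from any real GUI toolkit: the class `EventLoopSketch`, its `post`, `onButtonClick`, `longTask`, `runLoop` and `demo` methods, and the `QUIT` sentinel are all hypothetical names. It assumes one GUI thread that drains an event queue, and one worker thread that executes the long task and posts the GUI update back as another event.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the event-loop structure; not a real toolkit API.
final class EventLoopSketch {
    // Sentinel event that tells the loop to stop.
    private static final Runnable QUIT = () -> { };

    private final BlockingQueue<Runnable> eventQueue = new LinkedBlockingQueue<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    final List<String> log = new ArrayList<>(); // touched only by the GUI thread

    // Any thread may post an event; only the GUI thread runs the handlers.
    void post(Runnable handler) {
        eventQueue.add(handler);
    }

    // Event handler: hand the long task to the worker, which posts the GUI update back.
    void onButtonClick() {
        log.add("click handled");
        worker.submit(() -> {
            int result = longTask();                       // runs off the GUI thread
            post(() -> log.add("GUI updated: " + result)); // runs on the GUI thread
            post(QUIT);
        });
    }

    // Stand-in for a long computation.
    static int longTask() {
        int sum = 0;
        for (int i = 1; i <= 1000; i++) {
            sum += i;
        }
        return sum;
    }

    // The event loop: take one event at a time and run its handler.
    void runLoop() throws InterruptedException {
        while (true) {
            Runnable event = eventQueue.take();
            if (event == QUIT) {
                break;
            }
            event.run();
        }
        worker.shutdown();
        worker.awaitTermination(5, TimeUnit.SECONDS);
    }

    // The calling thread plays the role of the GUI thread.
    static List<String> demo() {
        EventLoopSketch app = new EventLoopSketch();
        app.post(app::onButtonClick);
        try {
            app.runLoop();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return app.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because every handler runs on the single loop thread, the handlers themselves are still processed one after another. Only the long task has been moved off the GUI thread.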
The current situation: Most desktop application development uses object-oriented (OO) languages and class libraries. 8 of the 10 most popular languages are OO*, such as Java, C++, VB.NET, Python, C#, Ruby (* TPC index)
How to parallelise a desktop application In general, 2 types of parallelism: Data parallelism Task parallelism
How to parallelise a desktop application: Data parallelism
Parallelising compiler: automatic parallelisation, but too specific (only simple loops over array variables) and conservative.
OpenMP: compiler directives. Better, but not object oriented (integer-index based), e.g.
    #pragma omp parallel for
    for (int i = 0; i < 25; i++) { ... }
How to parallelise a desktop application: Task parallelism
Thread libraries (e.g. Qt, Java, Pthreads): everything manual; coarse grained and high overhead; code restructuring.
OpenMP sections: again, too coarse grained.
ThreadWeaver: a good approach... but... code restructuring.
Objectives: Re-think the approach of desktop programming. Need to provide parallelism in an OO way, for both data and task parallelism. Find the common “patterns” and merge them with OOP, so that they are familiar to developers. Focus on maintainable code, benefiting from parallel hardware with minimal effort.
Overview: Introduction to parallel computing; Motivation; State of the art; Objectives; Methodology; Conclusion
Data Parallelism: Image resizing - Sequential version (figure labels: hasNext(), next())
Data Parallelism: Image resizing - Sequential version
    List list = getImages();
    Iterator it = list.iterator();
    while ( it.hasNext() ) {
        Image image = it.next();
        resize( image );
    }
Data Parallelism: Image resizing - Typical parallel approach (figure)
Data Parallelism: Image resizing - Java-style Iterator (figure: every thread calls hasNext() and next())
Data Parallelism: Image resizing - OpenMP
    List list = getImages();
    Iterator it = list.iterator();
    // copy the elements into an array so that they can be indexed by integer
    Image *array = new Image[ list.size() ];
    int count = 0;
    while ( it.hasNext() ) {
        array[count++] = it.next();
    }
    #pragma omp parallel for
    for (int i = 0; i < list.size(); i++) {
        resize( array[i] );
    }
Data Parallelism: Image resizing - Our solution (figure: every thread calls next( image ))
Data Parallelism: Image resizing - Our solution
    List list = getImages();
    ParallelIterator it = new ParallelIterator(list);
    // each thread does this
    Image image;
    while (it.next( image )) {
        resize( image );
    }
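To illustrate why a single `next( image )` call suits several threads better than a separate `hasNext()` / `next()` pair, here is a minimal Java sketch. It is not the authors' ParallelIterator: the class `SharedIteratorSketch` and its `processAll` helper are hypothetical, the integer addition stands in for `resize( image )`, and the sketch assumes the collection holds no null elements so that `null` can signal the end of the traversal.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the authors' ParallelIterator implementation.
final class SharedIteratorSketch<T> {
    private final Iterator<T> source;

    SharedIteratorSketch(Iterable<T> collection) {
        this.source = collection.iterator();
    }

    // hasNext() and next() are fused into one atomic step, so two threads
    // can never both claim the last element. Returns null when exhausted
    // (this sketch assumes the collection holds no null elements).
    synchronized T next() {
        return source.hasNext() ? source.next() : null;
    }

    // Starts threadCount threads that share one iterator; adding each item
    // to a shared counter stands in for resize(image).
    static int processAll(List<Integer> items, int threadCount) {
        SharedIteratorSketch<Integer> it = new SharedIteratorSketch<>(items);
        AtomicInteger total = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                Integer item;
                while ((item = it.next()) != null) { // each thread does this
                    total.addAndGet(item);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(processAll(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8), 4));
    }
}
```

Because the wrapper only needs an ordinary `Iterator`, the same code works for a linked list or any other collection that can only be traversed sequentially. No copy into an integer-indexed array is required.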
Data Parallelism: The Parallel Iterator - Results. Supports all collection types, even if inherently sequential. Scheduling on the user's behalf (static, dynamic, guided; chunk size). Reductions. Global semantics for the break statement. Negligible overhead.
Task Parallelism
    class MyClass {
        // class variables
        void myMethod() {
            // task A
            // task B
            // task C
        }
    }

    class MyClass {
        // class variables
        void myMethod() {
            // call tasks
        }
        TASK void taskA() {...}
        TASK void taskB() {...}
        TASK void taskC() {...}
    }

    class TaskA : Thread {
        // class variables
        void run() {...}
    }
    class TaskB : Thread {
        // class variables
        void run() {...}
    }
    class TaskC : Thread {
        // class variables
        void run() {...}
    }
Task Parallelism: Encapsulate a task within a method. Follows good OOP practice. Unlike threads, these tasks can share resources. Issues: task completion, dependencies. The task keyword is like OpenMP's section, but supports fine-grained parallelism and dependencies.
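The idea of tasks as ordinary methods, with explicit completion and dependencies, can be approximated in standard Java. The sketch below is not the proposed `task` keyword: it swaps in `java.util.concurrent.CompletableFuture`, and the class `TaskSketch`, the method bodies, and the assumption that task C depends on both A and B are all illustrative.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: CompletableFuture stands in for the proposed task keyword.
final class TaskSketch {
    // Each task is an ordinary method of the class.
    static int taskA() {
        return 2;
    }

    static int taskB() {
        return 3;
    }

    // Assumed for illustration: task C needs the results of A and B.
    static int taskC(int a, int b) {
        return a * b;
    }

    static int myMethod() {
        // A and B have no dependencies, so they may run in parallel.
        CompletableFuture<Integer> a = CompletableFuture.supplyAsync(TaskSketch::taskA);
        CompletableFuture<Integer> b = CompletableFuture.supplyAsync(TaskSketch::taskB);
        // Dependency: C starts only after both A and B have completed.
        CompletableFuture<Integer> c = a.thenCombine(b, TaskSketch::taskC);
        // Task completion: block until the final result is available.
        return c.join();
    }

    public static void main(String[] args) {
        System.out.println(myMethod());
    }
}
```

The three tasks stay inside one class and need no extra `Thread` subclasses, which is the code-restructuring cost the slides attribute to thread libraries.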
Rethinking desktop application structure: Queuing events is inherently sequential. Create parallelisable tasks. Accessing GUI components through multiple threads. OO parallel patterns.
Conclusions: Multicores are here! Additional challenges exist for parallel desktop applications. Allow the expression of parallelism in a way developers are already familiar with (OOP), e.g. ParallelIterator. Data and task parallelism are not enough; we need to look at the desktop application structure.
Thank you! (figure labels: CPU_0, CPU_1)
“Welcome to the world of Bob the Builder! With their chorus of "Can we fix it?" "Yes, we can!" Bob and his crew of talking machines teach the value of teamwork, problem solving and achieving something by working together” From Bob the Builder website...
Additional Challenges for Desktop Apps: Unknown target system
Structure of a typical desktop application: Sequential desktop program (diagram labels: GUI Thread with Event loop, Event queue, Event handler..., Paint GUI, GUI update. KEY: Thread, code, execution flow, data structure)
Blah blah blah... parallelising compiler
Related work: PARC++ (Parallel C++) & PRESTO; COOL (Concurrent Object Oriented Language) & CC++ (Compositional C++); PSTL (Parallel Standard Template Library) & STAPL (Standard Template Adaptive Parallel Library); ThreadWeaver; Qt Concurrent; java.util.concurrent package in Java 1.5
The Parallel Iterator: Different scheduling types are allowed, as in OpenMP (static, dynamic, guided). Different chunk sizes may also be specified.
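A rough Java sketch of how the three scheduling types divide a traversal into chunks. The class `ChunkSketch` and its methods are hypothetical. The formulas follow one common reading of the OpenMP schedule kinds and are assumptions, not the Parallel Iterator's actual policy: static gives each thread one near-equal block, dynamic hands out fixed-size chunks on demand, and guided hands out shrinking chunks of about remaining / threads, never below a minimum chunk size.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of chunk sizes under each scheduling type.
final class ChunkSketch {
    // Static: one block per thread, block sizes differ by at most one.
    static List<Integer> staticChunks(int n, int threads) {
        List<Integer> sizes = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            sizes.add(n / threads + (t < n % threads ? 1 : 0));
        }
        return sizes;
    }

    // Dynamic: fixed-size chunks, handed to whichever thread asks next.
    static List<Integer> dynamicChunks(int n, int chunk) {
        List<Integer> sizes = new ArrayList<>();
        int remaining = n;
        while (remaining > 0) {
            int size = Math.min(chunk, remaining);
            sizes.add(size);
            remaining -= size;
        }
        return sizes;
    }

    // Guided: each chunk is ceil(remaining / threads), but at least minChunk
    // (the final chunk may be smaller if fewer elements are left).
    static List<Integer> guidedChunks(int n, int threads, int minChunk) {
        List<Integer> sizes = new ArrayList<>();
        int remaining = n;
        while (remaining > 0) {
            int size = Math.max(minChunk, (remaining + threads - 1) / threads);
            size = Math.min(size, remaining);
            sizes.add(size);
            remaining -= size;
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(staticChunks(10, 4));     // [3, 3, 2, 2]
        System.out.println(dynamicChunks(10, 4));    // [4, 4, 2]
        System.out.println(guidedChunks(20, 4, 2));  // [5, 4, 3, 2, 2, 2, 2]
    }
}
```

Larger chunks mean fewer requests to the shared iterator and so less overhead. Smaller chunks balance the load better when elements take uneven time to process.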
Additional Features: Reductions. Example: values 37, 78, 32, 76 with sum = 0 initially; one thread takes 37, 78 (my_sum = 115), another takes 32, 76 (my_sum = 108); combined sum = 223. Semantics of break for early traversal termination: local vs global.
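The reduction on this slide can be reproduced with a small Java sketch. The class `ReductionSketch` and its `parallelSum` method are hypothetical. The sketch assumes a static split of the array between threads, with each thread accumulating a private `mySum` that is combined into `sum` only after the thread has finished.

```java
// Hypothetical sketch of a sum reduction with per-thread partial results.
final class ReductionSketch {
    static int parallelSum(int[] values, int threadCount) {
        int[] partial = new int[threadCount]; // one my_sum slot per thread
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                // Static split: thread id handles the index range [from, to).
                int from = id * values.length / threadCount;
                int to = (id + 1) * values.length / threadCount;
                int mySum = 0; // private to this thread, so no locking is needed
                for (int i = from; i < to; i++) {
                    mySum += values[i];
                }
                partial[id] = mySum;
            });
            threads[t].start();
        }
        int sum = 0;
        for (int t = 0; t < threadCount; t++) {
            try {
                threads[t].join(); // wait for the thread before reading its slot
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
            sum += partial[t];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Two threads: 37 + 78 = 115 and 32 + 76 = 108, combined 223.
        System.out.println(parallelSum(new int[] {37, 78, 32, 76}, 2));
    }
}
```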