IM.Grid: A Grid Computing Solution for image processing Image Mining Group @ Institut Pasteur Korea HongKee Moon
Introduction. Goal: increase the speed of computation. Make it much faster! How? With a super powerful computer, or with distributed computing. Multithreading: a thread is a light-weight processing unit, and a program can have multiple threads running simultaneously. Grid computing: a program exploits multiple remote computers; difficult to implement. GPGPU-Grid computing: a program exploits multiple remote computers powered by GPGPU; very difficult to implement.
Index: Goals & Structure; Comparisons; Demo; Multithreads & Grids; Results; Possible applications
Goals: reliability; high performance
Image Processing Pipeline
Structure
Structure
Batch Processing (diagram): load an image; process the image; results
Multithreading (diagram): load two images; process the images; results
Grid Computing (diagram): network; load ten images; process the images; results
GPGPU-Grid Computing (diagram): network; load ten images; process the images; results
IM 1.0: Batch process mode. Processes the screening well by well. Takes 384 seconds (if processing one well of a 384-well plate takes 1 second). Costs less memory but is slow. As you have already seen in the batch process diagram, it processes the screening well by well.
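The timing figures on this slide and the two that follow (384, 384/N, 384/M seconds) all come from one idealized formula. The sketch below only restates that arithmetic; the function name `estimated_seconds` is hypothetical, and the model assumes no loading, scheduling, or network overhead.

```python
def estimated_seconds(wells: int, seconds_per_well: float, workers: int = 1) -> float:
    """Idealized wall-clock time when `workers` wells are processed at once.

    Assumption: loading, scheduling, and network overhead are ignored.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return wells * seconds_per_well / workers


batch_time = estimated_seconds(384, 1.0)                 # one well at a time
parallel_time = estimated_seconds(384, 1.0, workers=10)  # ten wells at a time
print(batch_time, parallel_time)
```

In practice the measured times will be higher than this estimate, since the model leaves out everything except the per-well processing time.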
IM 2.0 + Multithreads: Multithreaded process. Processes multiple wells of the screening simultaneously. Takes 384/N seconds (N: number of threads). High performance, but needs a large amount of memory for loading N images, which can cause an out-of-memory exception. With multithreading, it processes multiple wells simultaneously. If you have 10 threads, it takes only about 38 seconds. However, it may throw an out-of-memory exception if it loads multiple big pictures in parallel.
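A minimal sketch of the multithreaded mode, assuming a bounded thread pool. It is not IM 2.0's actual code: `load_image`, `process_image`, `analyze_well`, and `analyze_plate` are hypothetical placeholders. The point it illustrates is that the number of worker threads also bounds how many images are held in memory at once.

```python
from concurrent.futures import ThreadPoolExecutor


def load_image(well: int) -> list[int]:
    # Hypothetical stand-in for reading one well's image from disk.
    return [well] * 4


def process_image(pixels: list[int]) -> int:
    # Hypothetical stand-in for the real image analysis.
    return sum(pixels)


def analyze_well(well: int) -> int:
    # Each worker thread holds one image at a time, so peak memory is
    # roughly max_workers images rather than the whole plate.
    return process_image(load_image(well))


def analyze_plate(wells: range, max_workers: int) -> list[int]:
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input wells.
        return list(pool.map(analyze_well, wells))


results = analyze_plate(range(384), max_workers=10)
print(len(results))
```

Lowering `max_workers` trades speed for memory, which is one way to avoid the out-of-memory case. Note that in CPython, threads mainly speed up I/O-bound work such as loading; CPU-bound processing would need processes or native code.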
IM 2.0 + Grids: Same as the multithreaded process: processes multiple wells of the screening simultaneously. Takes 384/M seconds (M: number of grids). Requires communication with the grids. Guarantees high performance with less memory. Suitable for HT-HCS. The performance of grids can be more or less the same as that of the multithreading approach.
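The grid mode can be sketched as a dispatcher that hands wells to remote machines and collects small results. This is a local simulation only: `GridNode`, `dispatch`, the `threshold` parameter, and the round-robin assignment are all assumptions made for illustration, not IM.Grid's real protocol.

```python
from dataclasses import dataclass, field


@dataclass
class GridNode:
    """Hypothetical stand-in for one remote grid computer."""
    name: str
    processed: list[int] = field(default_factory=list)

    def run(self, well: int, threshold: int) -> dict:
        # On a real grid this would execute remotely: the node loads the image
        # from its own storage, so only parameters and results cross the network.
        pixels = [well % 7, well % 5, well % 3]  # placeholder image data
        self.processed.append(well)
        return {"well": well, "count": sum(p > threshold for p in pixels)}


def dispatch(wells: range, nodes: list[GridNode], threshold: int) -> list[dict]:
    results = []
    for i, well in enumerate(wells):
        node = nodes[i % len(nodes)]  # simple round-robin assignment (assumption)
        results.append(node.run(well, threshold))
    return results


nodes = [GridNode(f"grid-{k}") for k in range(10)]
results = dispatch(range(384), nodes, threshold=1)
print(len(results), len(nodes[0].processed))
```

The loop here runs serially for clarity; a real client would issue the calls concurrently so that all M grids work at the same time.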
IM 2.0 + GPGPU + Grids
Demo. Compare batch mode and multithreads: IM2 multithread demo.avi. Compare multithreads and grid computing. Multithreads Video.avi: Dell 690 workstation, 2 x 1.8GHz quad-core with 4GB memory. Grid Video.avi: 10 Dell PowerEdge blade servers; each grid has 2 x 3GHz processors with hyper-threading and 2GB memory. Now I'm showing you several movies in which we tested our approaches.
Multithreads & Grids. Multithreads: multiple instances of the plugin; allocates (N x plugin instance) + (N x images) in the memory of one computer; image data is loaded in the PC; high network bandwidth use; better performance if the PC has multiple cores. Grids: multithreads are used only for communication with multiple computers (sends parameters and receives results); less memory usage; image data is loaded in each grid; low network bandwidth use; performance is guaranteed regardless of PC performance.
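The memory comparison above can be made concrete with a rough estimate. The first function restates the slide's formula; the second is an assumption that the PC in grid mode only holds a small parameter/result message per grid. All sizes used below are made-up illustrative values, not measurements.

```python
def multithread_memory_mb(n_threads: int, plugin_mb: float, image_mb: float) -> float:
    # Slide formula: (N x plugin instance) + (N x images) on one computer.
    return n_threads * plugin_mb + n_threads * image_mb


def grid_client_memory_mb(n_grids: int, message_mb: float) -> float:
    # Assumption: the PC keeps only one small parameter/result message per grid.
    return n_grids * message_mb


# Illustrative sizes only (hypothetical values).
print(multithread_memory_mb(10, plugin_mb=50, image_mb=200))
print(grid_client_memory_mb(10, message_mb=0.01))
```

With these example sizes the image term dominates the multithreaded total, which is why loading N big pictures in parallel is the main memory risk.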
Result 1. We wanted to check how the approaches differ in computation speed, so we gave them hard summation problems. The curves for batch and multithread modes are quite steep, whereas grid computing stays nearly flat. The performance gap grows as the number of iterations increases.
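The slides do not say which summation was used, so the sketch below assumes a sum of squares purely as a placeholder workload. It shows how a large summation can be split into independent chunks whose partial sums are combined afterwards; `split_range` and `partial_sum` are hypothetical helper names.

```python
def split_range(start: int, stop: int, parts: int) -> list[range]:
    """Split [start, stop) into `parts` contiguous chunks of near-equal size."""
    base, extra = divmod(stop - start, parts)
    chunks, lo = [], start
    for k in range(parts):
        hi = lo + base + (1 if k < extra else 0)
        chunks.append(range(lo, hi))
        lo = hi
    return chunks


def partial_sum(chunk: range) -> int:
    # Placeholder workload (assumption): sum of squares over one chunk.
    return sum(i * i for i in chunk)


chunks = split_range(0, 100_000, parts=10)
# Each partial sum is independent, so it could be sent to its own worker.
total = sum(partial_sum(c) for c in chunks)
print(total == sum(i * i for i in range(100_000)))
```

Because the chunks share no state, the same split works for threads on one PC or for separate grids; only the place where `partial_sum` runs changes.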
Result 2. We also wanted to test how network bandwidth affects image processing. In grid mode, processing times are almost the same on the 1GB line and on wireless, because the PC did not have to load images when it used multiple grids. In batch mode, however, there are huge differences.
Possible applications: High Throughput-High Content Screening. Search for optimal parameters during new algorithm development. Solve complex problems with a divide-and-conquer strategy. Can be used for other general purposes. Let us know if you want to use Grids!
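As an illustration of the parameter-search application, the sketch below enumerates every combination of candidate values and scores each one independently. The parameter names (`threshold`, `radius`), their values, and the `score` function are hypothetical; a real score would run the algorithm on reference images.

```python
from itertools import product


def parameter_grid(options: dict[str, list]) -> list[dict]:
    """Enumerate every combination of the given parameter values."""
    names = list(options)
    return [dict(zip(names, values)) for values in product(*options.values())]


def score(params: dict) -> float:
    # Hypothetical quality measure; a real one would run the algorithm on
    # reference images and compare against known results.
    return -abs(params["threshold"] - 120) - abs(params["radius"] - 3)


candidates = parameter_grid({"threshold": [80, 120, 160], "radius": [1, 3, 5]})
# Each score() call is independent, so the candidates could be spread over grids.
best = max(candidates, key=score)
print(len(candidates), best)
```

Since every candidate is evaluated in isolation, this kind of search scales with the number of grids in the same way as processing wells does.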
Reference “IM.Grid, a Grid Computing Approach in High Throughput-High Content Screening”, The 9th IEEE/ACM International Conference on Grid Computing, 2008