Parallelizing an Image Compression Toolbox
MSE Project - Presentation 1
Hadassa Baker
Topics of Discussion
- Introduction
- Overview
- Requirements
- Methodology
- Image Toolbox Description
- Project Plan
- Cost Estimation
- SQA Plan
Introduction
- Digital motion pictures are gaining popularity in industries such as motion picture production, surveillance, and museums.
- Digital image files are used to create digital motion pictures.
- Digital image files are generally large and usually require compression to be used effectively.
- Image compression is generally computationally intensive because a large amount of data must be processed.
Introduction
To process 1 hour of high-definition video:
- 1 frame = 1920 pixels wide * 1080 pixels high * 3 components/pixel (RGB) = 6,220,800 B = 6220.8 KB (at 1 byte per component)
- Frame rate: typically 24 frames/second
- Total number of frames per hour = 24 frames/s * 60 s/min * 60 min/hr = 86,400
- Total data size = 86,400 * 6,220,800 B = 537,477,120,000 B = 537,477 MB/hr (about 537 GB/hr)
- The compression process therefore needs to be sped up.
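The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: it assumes 1 byte per colour component and decimal units (1 KB = 1000 B, 1 MB = 10^6 B), which is what the slide's figures imply but do not state.

```python
# Raw data volume for one hour of uncompressed HD video.
# Assumptions (not stated on the slide): 1 byte per colour component,
# decimal units (1 KB = 1000 B, 1 MB = 1e6 B).
WIDTH, HEIGHT, COMPONENTS = 1920, 1080, 3
BYTES_PER_COMPONENT = 1
FRAMES_PER_SECOND = 24

frame_bytes = WIDTH * HEIGHT * COMPONENTS * BYTES_PER_COMPONENT
frames_per_hour = FRAMES_PER_SECOND * 60 * 60
total_bytes = frame_bytes * frames_per_hour

print(frame_bytes / 1000, "KB per frame")
print(frames_per_hour, "frames per hour")
print(total_bytes, "bytes per hour")
print(total_bytes / 1e6, "MB per hour")
```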
Overview Purpose To explore the use of parallel programming techniques to speed up a computationally intensive image compression and decompression process
Overview
Goal
- Rewrite the sequential source code of an image compression toolbox as a parallel program in an effort to speed up the software.
- Analyze factors that affect execution speed, such as the number of processors.
- Look for general trends.
Requirement Specification
- The image toolbox is a sequential command-line program that takes a RAW image file as input, encodes it, and outputs a compressed, encoded file.
- In the reverse direction, the toolbox takes an encoded file as input, decodes it, and outputs a RAW image file.
Main Requirements
- The encoding and decoding processes of the image toolbox will be rewritten as a parallel program.
- The RAW image reader and writer will be replaced with a TIFF image reader and writer.
- The usefulness of parallel programming in speeding up the image compression toolbox will be assessed.
Use Cases
Use Case 1: Compressing an Image
- Description: The user wants to compress a TIFF file.
- Scenario: The user runs the image compression console program to compress a TIFF image. The user provides the name and path of a TIFF image file as program input, and a name and path for the compressed output file. The "cmp" extension is used for the compressed file.
- Specific requirements:
  - Correctness: The compressed output file produced by the parallel program should be identical to the compressed output file produced by the sequential program.
Use Cases
Use Case 2: Decompressing an Image
- Description: The user wants to decompress a compressed file into a TIFF image file.
- Scenario: The user runs the image compression console program to decompress a cmp file and write it out to a TIFF file. The user provides the name and path of a compressed file as program input, and a name and path for the TIFF file.
- Specific requirements:
  - Correctness: The output TIFF file produced by the parallel program should be identical to the output TIFF file produced by the sequential program.
Methodology
- Described in "Designing and Building Parallel Programs" by Ian Foster.
- Structures the design process as four distinct stages: partitioning, communication, agglomeration, and mapping.
- Partitioning: the computation to be performed and the data it operates on are decomposed into smaller tasks.
Methodology
- Communication: the communication structures between tasks that are needed for correct execution of the program are defined.
- Agglomeration: the outcome of the partitioning and communication stages is evaluated with respect to performance and implementation cost, and tasks are combined into larger tasks where necessary.
- Mapping: each task is mapped to a processor in such a way that communication between processors is reduced and execution is sped up.
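As an illustration of how the stages might apply to image data, the sketch below treats each image row as a fine-grained task (partitioning), groups contiguous rows into blocks (agglomeration), and assigns one block per processor (mapping). The row-wise decomposition and the function name `block_partition` are assumptions made for this example, not the design chosen for the toolbox.

```python
def block_partition(num_rows, num_procs):
    """Return one (start, stop) row range per processor.

    Block sizes differ by at most one row, so the work is balanced
    when every row costs roughly the same to process.
    """
    base, extra = divmod(num_rows, num_procs)
    ranges = []
    start = 0
    for rank in range(num_procs):
        # The first `extra` processors take one additional row each.
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Hypothetical example: a 1080-row frame mapped onto 4 processors.
for rank, (start, stop) in enumerate(block_partition(1080, 4)):
    print(f"processor {rank}: rows {start}..{stop - 1}")
```

Contiguous blocks keep neighbouring rows on the same processor, so only the rows at block boundaries would need to be exchanged if a stage requires data from adjacent rows.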
Performance Modeling
- Goal: develop mathematical expressions that specify certain metrics as a function of problem size, number of processors, number of tasks, and other important characteristics.
- The models will be built on an idealized multicomputer parallel architecture.
- Performance models will be prepared for:
  - Execution time: the time that elapses from when the first processor starts executing on the problem to when the last processor completes execution.
  - Parallel scalability: how algorithm performance varies with parameters such as problem size, processor count, number of tasks, and message startup cost, and how effectively the program can use an increased number of processors.
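A performance model of this kind might look like the sketch below. It assumes perfectly divisible computation, no idle time, and a per-message cost of startup time plus per-word transfer time; all parameter names and cost values are placeholders chosen for illustration, not measurements of the toolbox.

```python
def execution_time(n, p, t_c, t_s, t_w, msgs_per_proc, words_per_msg):
    """Modelled execution time for problem size n on p processors.

    t_c: compute time per data item
    t_s: message startup cost
    t_w: transfer cost per word
    Assumes perfect load balance and no idle time.
    """
    compute = t_c * n / p
    # A single processor sends no messages.
    communicate = 0.0 if p == 1 else msgs_per_proc * (t_s + t_w * words_per_msg)
    return compute + communicate


def speedup(n, p, **costs):
    """Ratio of modelled one-processor time to modelled p-processor time."""
    return execution_time(n, 1, **costs) / execution_time(n, p, **costs)


def efficiency(n, p, **costs):
    """Fraction of ideal (linear) speedup achieved on p processors."""
    return speedup(n, p, **costs) / p


# Hypothetical cost parameters, chosen only to show the trend.
costs = dict(t_c=1.0, t_s=50.0, t_w=0.5, msgs_per_proc=2, words_per_msg=100)
for p in (1, 2, 4, 8, 16):
    print(p, round(speedup(10_000, p, **costs), 2), round(efficiency(10_000, p, **costs), 2))
```

Under this model the communication term does not shrink as processors are added, so efficiency falls as the processor count grows for a fixed problem size.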
Image Compression Toolbox
- A wavelet-based image compression tool written by Satish Kumar.
- The source code was obtained from the internet.
- The program is written in C++.
- Permission is granted by the author to use the software for research purposes.
- Contains a collection of functions that are commonly used in wavelet-based image compression techniques.
Image Compression
The RAW image file is read, and the compression process then has four steps:
1. Wavelet Transformation
2. Optimal Bit Allocation
3. Quantization
4. Entropy Encoding
Image Compression Toolbox
Wavelet Transformation
- The low-frequency components of the data are separated from the high-frequency components.
- On an image plane, the low-frequency components represent the base of the image, where there is little variation between neighboring coefficients.
- The high-frequency components represent areas where there are sharper differences between neighboring components.
- High-pass and low-pass filters are applied to the image data, first horizontally and then vertically, to split it into two frequency bands in each direction.
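The horizontal-then-vertical filtering can be illustrated with a one-level Haar transform, the simplest wavelet filter pair. This is a minimal sketch only: the Haar filters are an assumption made for the example and are not necessarily the filters used in the toolbox, and the code assumes even image dimensions.

```python
def haar_1d(signal):
    """One transform level: low-pass outputs followed by high-pass outputs.

    Low-pass = pairwise averages, high-pass = pairwise half-differences.
    Assumes the signal length is even.
    """
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low + high


def haar_2d(image):
    """Apply haar_1d to every row (horizontal pass), then to every column."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    # Transpose back so the result has the same orientation as the input.
    return [list(row) for row in zip(*cols)]


# A flat 4x4 image: all energy ends up in the low-frequency quadrant.
flat = [[5, 5, 5, 5] for _ in range(4)]
for row in haar_2d(flat):
    print(row)
```

In the output, the top-left quadrant holds the low-frequency (base) coefficients and the other three quadrants hold the high-frequency detail, which is zero for a flat image.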
Image Compression Toolbox
Optimal Bit Allocation
- Each class is allocated a portion of the total bit budget such that the compressed image has the minimum possible distortion.
- The aim of bit allocation using rate-distortion techniques is to prevent overflow of the bit budget while maximizing image/video quality.
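One common way to approximate such an allocation is a greedy scheme: hand out the budget one bit at a time, each time to the class whose distortion drops the most. The sketch below assumes the textbook distortion model D_k(b) = variance_k * 2^(-2b); both the model and the greedy strategy are assumptions for illustration and may differ from the algorithm in the toolbox.

```python
def allocate_bits(variances, total_bits):
    """Greedily split total_bits among classes to reduce total distortion.

    Assumed distortion model per class: D(b) = variance * 2 ** (-2 * b).
    """
    bits = [0] * len(variances)
    for _ in range(total_bits):
        # Distortion reduction each class would gain from one more bit.
        gains = [
            v * 2.0 ** (-2 * b) - v * 2.0 ** (-2 * (b + 1))
            for v, b in zip(variances, bits)
        ]
        best = max(range(len(gains)), key=gains.__getitem__)
        bits[best] += 1
    return bits


# Hypothetical class variances: high-variance classes receive more bits.
print(allocate_bits([16.0, 2.0], 4))
```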
Image Compression Toolbox
Quantization
- The division of a quantity into a discrete number of small parts that are integral multiples of a common quantity.
- A uniform scalar quantizer is used: uniform because the levels are equally spaced, and scalar because each data value is processed individually.
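A uniform scalar quantizer can be sketched in a few lines. The step size and the round-to-nearest rule below are assumptions for the example; the toolbox may choose its step sizes and reconstruction points differently.

```python
import math


def quantize(value, step):
    """Map a value to the index of the nearest multiple of step."""
    return math.floor(value / step + 0.5)


def dequantize(index, step):
    """Reconstruct the value as an integral multiple of the step size."""
    return index * step


step = 2.0  # hypothetical step size
for x in (7.3, -7.3, 0.4, 3.0):
    q = quantize(x, step)
    print(x, "->", q, "->", dequantize(q, step))
```

Each value is handled on its own (scalar), the reconstruction levels are equally spaced (uniform), and the reconstruction error never exceeds half the step size.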
Image Compression Toolbox
Entropy Encoding
- The arithmetic coding method is used.
- Arithmetic coding takes a stream of input symbols and replaces it with a single number greater than 0 and less than 1.
- The arithmetic coding process requires the input symbols to be encoded sequentially, one after another.
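The idea can be shown with a toy arithmetic coder that uses exact fractions. This is a sketch under simplifying assumptions: the symbol probabilities are fixed and known to both sides, the message length is passed to the decoder, and no bit-level output is produced, unlike a practical coder such as the one in the toolbox.

```python
from fractions import Fraction


def build_intervals(probabilities):
    """Give each symbol its own [low, high) slice of [0, 1)."""
    intervals, low = {}, Fraction(0)
    for symbol, p in probabilities.items():
        intervals[symbol] = (low, low + p)
        low += p
    return intervals


def encode(symbols, probabilities):
    """Narrow [0, 1) once per symbol and return a number inside the result."""
    intervals = build_intervals(probabilities)
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        # Each step depends on the interval left by the previous symbol,
        # which is why the symbols must be processed in order.
        s_low, s_high = intervals[s]
        low, width = low + width * s_low, width * (s_high - s_low)
    return low + width / 2


def decode(code, length, probabilities):
    """Recover `length` symbols by repeatedly locating code in a slice."""
    intervals = build_intervals(probabilities)
    out = []
    for _ in range(length):
        for symbol, (s_low, s_high) in intervals.items():
            if s_low <= code < s_high:
                out.append(symbol)
                code = (code - s_low) / (s_high - s_low)
                break
    return out


# Hypothetical three-symbol alphabet.
probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
message = list("abac")
code = encode(message, probs)
print(code, decode(code, len(message), probs))
```

The loop in `encode` carries its state from one symbol to the next, which is the sequential dependency that makes this stage harder to parallelize than the others.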
Project Plan
Initial Phase
- Develop overall requirements
- Documentation
  - Vision Document
  - Project Plan
  - Image Toolbox Description
  - SQA Plan
- Milestone: Presentation 1. Get approval from the committee and/or incorporate changes and suggestions.
Project Plan
Architecture Phase
- Design the architecture of the parallel program for the image toolbox
- Documentation
  - Algorithm design
  - Parallel program design analysis
  - Refined Vision Document, Project Plan, and SQA Plan
  - Test Plan
- Milestone: Presentation 2. Get approval from the committee and/or incorporate changes and suggestions.
Project Plan
Implementation Phase
- Implement the parallel program
- Perform testing
- Documentation
  - Well-documented source code
  - Test report
  - Test evaluation report
- Milestone: Presentation 3. Get approval from the committee and/or incorporate changes and suggestions.
Project Plan
Cost Estimation
- COCOMO, organic model (for uncomplicated projects): Person-months = 2.4 * KDSI^1.05
- KDSI: project size in thousands of delivered source instructions.
- Function point analysis works best with business-type applications, so the size of the image compression toolbox is used as the estimate instead.
- The image compression toolbox contains approximately 1300 lines of source code, so KDSI = 1.3.
- Effort = 2.4 * 1.3^1.05 = approximately 3.2 person-months
- Duration = 2.5 * Effort^0.38 = 2.5 * 3.2^0.38 = approximately 3.9 months
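The two COCOMO formulas on this slide can be evaluated directly. The sketch below uses the organic-mode coefficients given on the slide and the 1300-line size estimate; the function name `cocomo_organic` is chosen for this example only.

```python
def cocomo_organic(kdsi):
    """Return (effort in person-months, duration in months) for organic mode."""
    effort = 2.4 * kdsi ** 1.05
    duration = 2.5 * effort ** 0.38
    return effort, duration


# About 1300 lines of source code, i.e. 1.3 KDSI.
effort, duration = cocomo_organic(1.3)
print(f"effort   = {effort:.2f} person-months")
print(f"duration = {duration:.2f} months")
```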
SQA Plan
Tools
- Microsoft C++ 6.0
- Microsoft Visio
- Microsoft Word
Deliverables
- Vision Document
- Project Plan Document
- SQA Plan Document
- Image Toolbox Description
SQA Plan
Deliverables (cont.)
- Architecture Design Document
- Test Plan Document
- Image Toolbox sequential source code
- Image Toolbox parallel source code
- Test Report Document
- Test Evaluation Report Document