Pipelined and Parallel Computing: Data Dependency Analysis. Hongtao Du, AICIP Research, Mar 9, 2006.


2 Motivation Image processing –Complicated algorithms –Large data sets Pipelined and parallel computing –Speedup? Not necessarily! Solution: efficiently –Allocate processes –Distribute data

3 Outline Pipelined and parallel computing overview Related work on data dependency Independence Uniform dependency Regional dependency Dependency on image sequence and multispectral image Data Distributing Schemes

4 Computing Overview

5 Driving Force Data-driven –How to divide data sets into different sizes for multiple computing resources –How to coordinate data flows along different directions Function-driven –How to perform different functions of one task on different computing resources at the same time.
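A minimal Python sketch of the function-driven view, using the textbook step-count model (an assumption of this sketch, not stated on the slide): a k-stage pipeline overlaps the stages of one task, so n items finish in n + k - 1 steps rather than n * k. The stage functions are arbitrary illustrations.

```python
def run_pipeline(stages, items):
    """Apply k stage functions to n items and report the step counts a
    pipeline would need (overlapped) versus sequential execution."""
    k, n = len(stages), len(items)
    results = []
    for item in items:
        out = item
        for stage in stages:          # each stage is one function of the task
            out = stage(out)
        results.append(out)
    steps_pipelined = n + k - 1       # fill the pipeline, then 1 result/step
    steps_sequential = n * k          # one stage at a time, no overlap
    return results, steps_pipelined, steps_sequential

stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
results, piped, seq = run_pipeline(stages, [1, 2, 3, 4])
```

With 3 stages and 4 items, the overlapped pipeline takes 6 steps where sequential execution takes 12, which is the payoff of mapping different functions onto different computing resources.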

6 Data - Flynn's Taxonomy Single Instruction Flow Single Data Stream (SISD) Multiple Instruction Flow Single Data Stream (MISD) Single Instruction Flow Multiple Data Stream (SIMD) Multiple Instruction Flow Multiple Data Stream (MIMD) –Shared memory –Distributed memory

7 Data Dependency Decreases or even eliminates the speedup Caused by edge pixels that fall on different blocks (figure: Block and Reverse diagonal partitionings)

8 Algorithm and Data Dependency Algorithm Design Stage –The algorithm depends on the problem and the data (image). Algorithm Implementation Stage –The data (image) is restricted to specific algorithms. –Dependency analysis follows from the features of the algorithms. Dependency –Window size –Existence

9 Outline Pipelined and parallel computing overview Related work on data dependency Independence Uniform dependency Regional dependency Dependency on image sequence and multispectral image Data Distributing Schemes

10 Definitions and Notations Suppose –Input image I = {i(x, y)}, where input pixel i(x, y) has coordinate (x, y) and M x N is the size of the input image. –A function f is a portion of an image processing algorithm. –Output data set O = {o(x, y)}, with output data o(x, y). If O is an image, then o(x, y) is a pixel in the output image. Definition: –A dependency edge b is a connection from data i(x1, y1) to data i(x2, y2), denoting that the output data o(x, y) depends on the values of both i(x1, y1) and i(x2, y2), i.e., o(x, y) = f(i(x1, y1), i(x2, y2)).

11 The dependency model of I is defined as D = (I, B), where –I is the set of input data i(x, y), and –B is the set of dependency edges b between the data in I.

12 Independence Definition: For a function f, the input data set is considered independent if and only if each output data depends only on the current input data and no other data. That is, o(x, y) = f(i(x, y)) for each output data o(x, y). No extra storage space is required. Pixel-based image processing algorithms –Contrast stretching –Power-law transformation
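A pure-Python sketch of an independent function, using the power-law transformation named on the slide; the constants c and gamma are illustrative choices. Because each output pixel needs only the matching input pixel, the image can be split across processors in any shape with no communication.

```python
def power_law(image, c=1.0, gamma=0.5):
    """o(x, y) = c * i(x, y) ** gamma, computed pixel by pixel.
    Each output depends only on the corresponding input pixel."""
    return [[c * (pixel ** gamma) for pixel in row] for row in image]

# Tiny 2x2 image with intensities normalized to [0, 1].
image = [[0.0, 0.25], [0.25, 1.0]]
out = power_law(image)
```

Any row, column, or block of `image` could be handed to a different processor and transformed independently, which is exactly why such algorithms need no dependency analysis.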

13 Uniform Dependency Definition: For a function f, the input data set has uniform dependency, if and only if 1. The output data depends on more than one input data. 2. The dependency exists and is identical for every output data. The uniform dependency is expressed as o(x, y) = f(i(x1, y1), ..., i(xj, yj)) for each output data o(x, y), where j > 1 is the number of input data and the offsets (xk - x, yk - y) are identical for every output position.

14 Example: 3x3 filter Assign weights Definition: The weight w(b) of a dependency edge b is the amount of new input data required if the window moves one step along the direction of the dependency edge. For a 3x3 window, a horizontal or vertical move brings in 3 new pixels, while a diagonal move brings in 5.
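A pure-Python sketch of a uniform dependency, taking a 3x3 mean filter as an assumed concrete instance of the slide's 3x3 example (border pixels are simply copied here). Every interior output pixel needs the same 3x3 neighborhood, so a block assigned to one processor must also receive a one-pixel halo from its neighbors.

```python
def mean_filter_3x3(image):
    """Each interior output pixel is the mean of its 3x3 neighborhood:
    the same (uniform) dependency pattern at every position."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]           # borders copied unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = sum(window) / 9.0
    return out

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
filtered = mean_filter_3x3(image)
```

Sliding the window one pixel right reads 3 fresh input pixels, matching the weight-3 horizontal edge described above.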

15 Regional Dependency Definition: For a function f, the input data set has regional dependency, if and only if 1. The output data depends on more than one input data. 2. One or more dependencies exist only for part of the output data. 3. All other data are independent. The regional dependency is expressed as o(x, y) = f(i(x1, y1), ..., i(xj, yj)) for output data inside the dependent region, and o(x, y) = f(i(x, y)) elsewhere.
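A hypothetical illustration of regional dependency; the rectangular region bounds and the horizontal 3-point average are invented for this sketch. Only pixels inside the region depend on their neighbors; all others are copied independently, so only blocks overlapping the region need halo exchange when partitioning.

```python
def regional_smooth(image, y0, y1, x0, x1):
    """Pixels inside rows [y0, y1) and columns [x0, x1) are averaged
    with their horizontal neighbors; everything else is independent."""
    w = len(image[0])
    out = [row[:] for row in image]           # independent pixels: plain copy
    for y in range(y0, y1):
        for x in range(max(x0, 1), min(x1, w - 1)):
            out[y][x] = (image[y][x - 1] + image[y][x] + image[y][x + 1]) / 3.0
    return out

image = [[0, 3, 6, 9], [0, 3, 6, 9]]
out = regional_smooth(image, 0, 1, 1, 3)     # region covers row 0 only
```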

16 Grouping Data with Regional Dependency

17 Hierarchical Data Grouping and Partitioning

18 Outline Pipelined and parallel computing overview Related work on data dependency Independence Uniform dependency Regional dependency Dependency on image sequence and multispectral image Data Distributing Schemes

19 Dependency on Image Sequence and Multispectral Image Suppose –Input image I = {i(x, y, z)}, where (x, y) is the coordinate of the input pixel and z is the temporal or spectral axis. –A function f is a portion of an image processing algorithm. –Output data set O = {o(x, y, z)}, with output data o(x, y, z).

20 The input data required from the input data set I in order to calculate o(x, y, z) necessarily lie in a finite set of data locations, which together form a region R inside the input data set I, where i(x, y, z) is the data location designated by row x, column y, of frame z. If N regions R1, ..., RN are required from the input data set, then the total data dependency set is R = R1 ∪ R2 ∪ ... ∪ RN.

21 Case 1. No dependency exists between individual images along the z axis. The 3-D input data set is treated as multiple single images, regardless of whether independence, uniform, or regional dependencies exist within the individual images.

22 Case 2. Uniform dependencies exist on pixels with the same coordinates along the z axis, but no dependency exists within individual images. The data dependency is similar to the uniform dependency on a single image, but along the z axis. The basic partitioning unit is one column of pixels along the z axis.
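A sketch of Case 2 using a simple temporal average along z as an assumed example operation: each output pixel depends only on the pixels at the same (x, y) across all frames, so one z-column is the natural partitioning unit and columns can be distributed freely.

```python
def temporal_average(stack):
    """stack[z][y][x] holds frame z. Average each (x, y) position over
    all frames: dependency runs only along the z axis."""
    n = len(stack)
    h, w = len(stack[0]), len(stack[0][0])
    return [[sum(stack[z][y][x] for z in range(n)) / n
             for x in range(w)] for y in range(h)]

# Two frames, each 1 row x 2 columns.
stack = [[[0, 2]], [[4, 6]]]
avg = temporal_average(stack)
```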

23 Case 3. Uniform dependencies exist both along the z axis and within the same image. The processing window moves in the 3-D data set. A weight is assigned to each possible movement direction in the 3-D data set.

24 Case 4. Regional dependencies exist along the z axis, or both along the z axis and within the same image. Expand the hierarchical data grouping and partitioning to the 3-D data set. Several 3-D data blocks may be formed in the data set. At each level, the relationship among data or data blocks is either independence or uniform dependency.

25 Outline Pipelined and parallel computing overview Related work on data dependency Independence Uniform dependency Regional dependency Dependency on image sequence and multispectral image Data Distributing Schemes

26 Data Distributing Schemes (figure: Block, Scatter, Contiguous point, Contiguous row)
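A small Python sketch of two of the schemes, applied to rows for simplicity (mapping whole rows rather than pixels, and the processor count, are simplifying assumptions of this sketch): contiguous rows give each processor one consecutive band, while scatter deals rows out round-robin.

```python
def contiguous_rows(n_rows, n_procs):
    """Contiguous-row scheme: row r goes to processor r // band, where
    band is the (ceiling) number of rows per processor."""
    band = -(-n_rows // n_procs)          # ceiling division
    return [r // band for r in range(n_rows)]

def scatter_rows(n_rows, n_procs):
    """Scatter scheme: row r goes to processor r % n_procs."""
    return [r % n_procs for r in range(n_rows)]

contig = contiguous_rows(8, 4)
scat = scatter_rows(8, 4)
```

Contiguous bands keep neighboring rows together (cheap for window filters); scatter balances load when work varies across the image but splits every neighborhood.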

27 Communication Patterns and Costs Communication expense is the first concern in data partitioning. Successor/Predecessor (S-P) pattern North/South/East/West (NSEW) pattern Parameters: t_s is the message preparation latency, r is the transmission speed (Byte/s), p is the number of processors, n is the number of data, and l is the length of each data item to be transmitted.
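The parameters above fit the standard startup-plus-transfer cost model; this sketch assumes that model, with a per-message cost of t_s + (items * l) / r. The neighbor counts (2 for S-P, 4 for NSEW) and all numeric values are illustrative assumptions.

```python
def message_cost(items, l, t_s, r):
    """Startup latency plus transfer time for one border message of
    `items` data, each l bytes, at r bytes/s."""
    return t_s + (items * l) / r

def sp_cost(border, l, t_s, r):
    """S-P pattern: exchange a border with 2 neighbors (successor and
    predecessor in a 1-D decomposition)."""
    return 2 * message_cost(border, l, t_s, r)

def nsew_cost(border, l, t_s, r):
    """NSEW pattern: exchange a border with 4 neighbors in a 2-D grid."""
    return 4 * message_cost(border, l, t_s, r)

one_msg = message_cost(100, 4, 0.5, 800)   # 100 items of 4 bytes
```

Because each message pays the startup latency t_s, the 4-neighbor NSEW pattern costs twice as many startups as S-P even when the 2-D blocks have shorter borders, which is why communication expense drives the choice of partitioning.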

28 Thank you!