Bernstein’s Conditions. Techniques to Exploit Parallelism in Sequential Programming Hierarchy of levels of parallelism: Procedure or Methods Statements.


Bernstein’s Conditions

Techniques to Exploit Parallelism in Sequential Programming
Hierarchy of levels of parallelism:
- Procedures or methods
- Statements
- Expressions

Techniques to Exploit Parallelism in Sequential Programming
Expression Level (Implicit Parallelism)
Using the associative, commutative, and distributive properties of arithmetic expressions, the compiler can produce object code in which more than one instruction executes in parallel.
Associativity: Z = ((A+B)+C)+D → Z = (A+B) + (C+D)
In the second form, A+B and C+D do not depend on each other, so they can be computed at the same time.
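The reassociation can be sketched in Python as follows. This is only an illustration, assuming integer operands (floating-point addition is not exactly associative, so the two forms may differ slightly for floats). The function names `chain_sum` and `tree_sum` are hypothetical and do not come from the slides.

```python
def chain_sum(a, b, c, d):
    """Left-to-right form: each addition waits on the previous one."""
    t1 = a + b
    t2 = t1 + c      # depends on t1
    return t2 + d    # depends on t2, so the chain is three steps deep


def tree_sum(a, b, c, d):
    """Reassociated form: t1 and t2 do not depend on each other."""
    t1 = a + b       # independent of t2
    t2 = c + d       # independent of t1
    return t1 + t2   # depends on both, so the chain is two steps deep


print(chain_sum(1, 2, 3, 4), tree_sum(1, 2, 3, 4))
```

Both functions perform three additions; the difference is the length of the longest dependency chain, which is what limits how many instructions can overlap.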

Techniques to Exploit Parallelism in Sequential Programming
Statement Level
This level is known as the D-structure level (for Dijkstra). Statements fall into one of three categories:
- Sequence: S1, S2
- Selection: IF b THEN S1 ELSE S2
- Iteration: (while, repeat, …)
Bernstein, A.J. (1966), “Analysis of Programs for Parallel Processing”, IEEE Transactions on Electronic Computers, EC-15, pg

Sequence Example
S1: x ← a+b
Each instruction has an input space and an output space, IS and OS respectively.
IS is the set of variables whose values the instruction reads. For the example above, IS = {a, b}.
OS is the set of variables whose values the instruction changes. For the example above, OS = {x}.
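As a rough illustration, the input and output spaces of a single assignment can be extracted with Python's standard `ast` module. This sketch assumes the statement is written in Python syntax (`=` in place of the assignment arrow) and uses only plain scalar names; array elements such as `A[i]` would need extra handling. The helper name `input_output_sets` is hypothetical.

```python
import ast


def input_output_sets(statement):
    """Return (IS, OS) for one simple assignment such as 'x = a + b'."""
    tree = ast.parse(statement)
    reads, writes = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                writes.add(node.id)   # name appears as an assignment target
            else:
                reads.add(node.id)    # name appears as an operand
    return reads, writes


IS, OS = input_output_sets("x = a + b")
print(sorted(IS), sorted(OS))
```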

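Bernstein’s Conditions
Given S1 and S2, with S1 preceding S2 in program order, IF
- IS(S1) ∩ OS(S2) = ∅ (no anti-dependence: S2 does not overwrite a variable that S1 reads)
- OS(S1) ∩ IS(S2) = ∅ (no data dependence: S2 does not read a variable that S1 writes)
- OS(S1) ∩ OS(S2) = ∅ (no output dependence: S1 and S2 do not write the same variable)
THEN S1 and S2 can be executed in parallel.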
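The three conditions translate directly into set operations. The following is a minimal sketch, assuming the input and output spaces are already available as Python sets of variable names; the function name `can_run_in_parallel` is hypothetical.

```python
def can_run_in_parallel(is1, os1, is2, os2):
    """Check the three conditions for statements S1 and S2."""
    return (
        is1.isdisjoint(os2)       # S2 does not overwrite what S1 reads
        and os1.isdisjoint(is2)   # S2 does not read what S1 writes
        and os1.isdisjoint(os2)   # S1 and S2 do not write the same variable
    )


# S1: x = a + b    S2: y = c + d    (no shared variables)
print(can_run_in_parallel({"a", "b"}, {"x"}, {"c", "d"}, {"y"}))  # True

# S1: x = a + b    S2: y = x + 1    (S2 reads x, which S1 writes)
print(can_run_in_parallel({"a", "b"}, {"x"}, {"x"}, {"y"}))       # False
```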

Iteration Example
for i ← 1 to 6 do
  A[i] ← B[i] + C[i]
Using vectorization (expansion) we can write this as
A[1] ← B[1]+C[1]
A[2] ← B[2]+C[2]
A[3] ← B[3]+C[3]
…
Applying Bernstein’s conditions tells us that the above statements can be executed in parallel.
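A small Python sketch of the same loop, run once sequentially and once with the six iterations handed to a thread pool. The data values and the helper `body` are made up for illustration. In CPython this shows that the result does not depend on execution order rather than demonstrating a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

B = [0, 1, 2, 3, 4, 5, 6]          # index 0 unused so indices match 1..6
C = [0, 10, 20, 30, 40, 50, 60]
A = [0] * 7


def body(i):
    """One iteration: reads B[i] and C[i], writes only A[i]."""
    A[i] = B[i] + C[i]


# Sequential reference result.
expected = [0] + [B[i] + C[i] for i in range(1, 7)]

# Each iteration touches its own elements, so the order of execution is free.
with ThreadPoolExecutor(max_workers=6) as pool:
    list(pool.map(body, range(1, 7)))

print(A == expected)
```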

Selection
Given
A ← B+1
IF A>5
  C ← A+1; Go to 5
ELSE
  D ← A+C; Go to 7
END IF
5: E ← A+1
6: F ← A+1
To exploit parallelism the program can be written as
A ← B+1

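Program Transformation
Original loop (X is reused, which creates a dependency between the first two statements and the last two):
FOR I ← 1 TO n do
  X ← A[I] + B[I]
  Y[I] ← 2 * X
  X ← C[I] / D[I]
  P ← X + 2
ENDFOR
Renaming the second use of X to XX removes the dependency:
FOR I ← 1 TO n do
  X ← A[I] + B[I]
  Y[I] ← 2 * X
  XX ← C[I] / D[I]
  P ← XX + 2
ENDFOR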
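The two loop bodies can be compared in Python. This is a sketch with 0-based lists and made-up input data; the function names `original` and `renamed` are hypothetical. It checks only that the renaming preserves the results, which is the precondition for running the two halves of the body independently.

```python
def original(A, B, C, D):
    Y = [0] * len(A)
    P = None
    for i in range(len(A)):
        X = A[i] + B[i]
        Y[i] = 2 * X
        X = C[i] / D[i]    # reuses X: must wait until Y[i] has read the old X
        P = X + 2
    return Y, P


def renamed(A, B, C, D):
    Y = [0] * len(A)
    P = None
    for i in range(len(A)):
        X = A[i] + B[i]
        Y[i] = 2 * X
        XX = C[i] / D[i]   # fresh name: shares no variable with the lines above
        P = XX + 2
    return Y, P


args = ([1, 2], [3, 4], [8, 9], [2, 3])
print(original(*args) == renamed(*args))
```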

Scalar Expansion
Original loop (every iteration writes the same scalar X):
FOR I ← 1 TO n do
  X ← A[I] + B[I]
  Y[I] ← 2 * X
ENDFOR
Expanding X into an array removes the data dependency between iterations:
FOR I ← 1 TO n do
  X[I] ← A[I] + B[I]
  Y[I] ← 2 * X[I]
ENDFOR
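A Python sketch of the effect, using 0-based lists and hypothetical function names. Writing the expanded form as two whole-array steps is one possible way to exploit the expansion and is an assumption beyond what the slide shows; it works only because each X[I] now has its own storage.

```python
def scalar_loop(A, B):
    Y = [0] * len(A)
    for i in range(len(A)):
        X = A[i] + B[i]    # one shared scalar, rewritten in every iteration
        Y[i] = 2 * X
    return Y


def expanded(A, B):
    # X[I] <- A[I] + B[I]: every element is computed independently
    X = [a + b for a, b in zip(A, B)]
    # Y[I] <- 2 * X[I]: every element is computed independently
    Y = [2 * x for x in X]
    return Y


print(scalar_loop([1, 2, 3], [4, 5, 6]) == expanded([1, 2, 3], [4, 5, 6]))
```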

Loop Unrolling
FOR I ← 1 TO n do
  X[I] ← A[I] * B[I]
ENDFOR
→
X[1] ← A[1] * B[1]
X[2] ← A[2] * B[2]
…
X[n] ← A[n] * B[n]
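For a concrete picture, here is the loop next to a fully unrolled version in Python. The sketch fixes n = 4 and uses 0-based indices; both choices, and the function names, are assumptions made for illustration.

```python
def rolled(A, B):
    n = len(A)
    X = [0] * n
    for i in range(n):
        X[i] = A[i] * B[i]
    return X


def unrolled_4(A, B):
    """Fully unrolled for n = 4: no loop counter, four independent statements."""
    X = [0] * 4
    X[0] = A[0] * B[0]
    X[1] = A[1] * B[1]
    X[2] = A[2] * B[2]
    X[3] = A[3] * B[3]
    return X


print(rolled([1, 2, 3, 4], [5, 6, 7, 8]) == unrolled_4([1, 2, 3, 4], [5, 6, 7, 8]))
```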

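Loop Fusion or Jamming
Original loops:
FOR I ← 1 TO n do
  X[I] ← Y[I] * Z[I]
ENDFOR
FOR I ← 1 TO n do
  M[I] ← P[I] + X[I]
ENDFOR
a) Fused into one loop:
FOR I ← 1 TO n do
  X[I] ← Y[I] * Z[I]
  M[I] ← P[I] + X[I]
ENDFOR
b) Fused with X[I] substituted (X is no longer stored, so this form applies only when X is not needed afterwards):
FOR I ← 1 TO n do
  M[I] ← P[I] + Y[I] * Z[I]
ENDFOR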
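The three forms can be written side by side in Python to confirm that they compute the same M. This is a sketch with 0-based lists, made-up data, and hypothetical function names.

```python
def two_loops(Y, Z, P):
    n = len(Y)
    X, M = [0] * n, [0] * n
    for i in range(n):
        X[i] = Y[i] * Z[i]
    for i in range(n):
        M[i] = P[i] + X[i]
    return X, M


def fused(Y, Z, P):
    """Form a): one loop, X is still stored."""
    n = len(Y)
    X, M = [0] * n, [0] * n
    for i in range(n):
        X[i] = Y[i] * Z[i]
        M[i] = P[i] + X[i]
    return X, M


def fused_substituted(Y, Z, P):
    """Form b): X[i] is replaced by its defining expression and never stored."""
    n = len(Y)
    M = [0] * n
    for i in range(n):
        M[i] = P[i] + Y[i] * Z[i]
    return M


data = ([1, 2], [3, 4], [10, 20])
print(two_loops(*data), fused(*data), fused_substituted(*data))
```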