Automatic Generation of Parallel OpenGL Programs
Robert Hero
CMPS 203
December 2, 2004
Motivation
The use of graphics hardware can significantly increase the performance of visualization applications
Many visualization algorithms can be modified to run on parallel systems
Most applications don’t take advantage of both parallelism and the performance of graphics cards
Background
Previous Research into Automatic Parallelism
  – Typically integrated with compiler
  – System or Application specific
  – Requires User interaction
OpenGL
  – Uses graphics hardware
  – Not designed for parallel implementation
  – Some research into extending the OpenGL API with parallel commands
Automatic Parallelization
Two-stage process
  – First create a Program Dependency Graph (PDG)
      Contains control flow information and data dependencies
  – Then analyze the PDG
      Find loop regions
      Look at data dependencies within each loop region
Program Dependency Graph
Contains three types of nodes
  – Region Nodes
      Functions and Loops
      Contains OpenGL state information
        – Must be sent to each processor when a loop is run in parallel
  – Control Nodes
      If/else and switch
  – Statement Nodes
      Variable declarations, Assignments, OpenGL commands
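
The three node types above could be represented roughly as follows. This is a minimal sketch, not the presenter's implementation: the names `NodeKind`, `GLStateSnapshot`, `PdgNode`, `countKind` and `buildExample` are hypothetical, the state snapshot is reduced to a list of strings, and the small example graph is illustrative only.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node kinds, mirroring the three types listed on the slide.
enum class NodeKind { Region, Control, Statement };

// Hypothetical stand-in for the OpenGL state a region node carries.
// Which state is actually tracked is not specified in the slides.
struct GLStateSnapshot {
    std::vector<std::string> settings;  // e.g. "GL_DEPTH_TEST=enabled"
};

struct PdgNode {
    NodeKind kind;
    std::string label;                               // e.g. "For Loop", "int x"
    GLStateSnapshot glState;                         // meaningful for Region nodes only
    std::vector<std::unique_ptr<PdgNode>> children;  // control-flow nesting
    std::vector<const PdgNode*> dependsOn;           // data dependencies

    PdgNode(NodeKind k, std::string l) : kind(k), label(std::move(l)) {}

    // Append a child node and return a non-owning pointer to it.
    PdgNode* addChild(NodeKind k, const std::string& l) {
        children.push_back(std::make_unique<PdgNode>(k, l));
        return children.back().get();
    }
};

// Count the nodes of one kind in the subtree rooted at `node`.
int countKind(const PdgNode& node, NodeKind kind) {
    int n = (node.kind == kind) ? 1 : 0;
    for (const auto& child : node.children) n += countKind(*child, kind);
    return n;
}

// Build a small illustrative graph: a function containing a declaration and
// a loop, with an if inside the loop guarding one OpenGL-style call.
std::unique_ptr<PdgNode> buildExample() {
    auto fn = std::make_unique<PdgNode>(NodeKind::Region, "Function");
    PdgNode* declX = fn->addChild(NodeKind::Statement, "int x");
    PdgNode* loop = fn->addChild(NodeKind::Region, "For Loop");
    PdgNode* branch = loop->addChild(NodeKind::Control, "if");
    PdgNode* call = branch->addChild(NodeKind::Statement, "Display(x)");
    call->dependsOn.push_back(declX);  // Display(x) reads x
    return fn;
}
```

Keeping the state snapshot on the region node matches the point above that the state must travel with a loop when it is distributed; a real system would need a much richer description of that state.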
[Figure: example Program Dependency Graph. Node labels: Function, For Loop, if, int x, int y, int z, cout, Display(x), Display(y). Legend: Region Node, Statement Node, Control Node.]
Loops that can be Parallelized
A loop can be parallelized if the data within the loop doesn’t depend on an assignment within the loop
A loop can also be parallelized if the data in the loop depends on an assignment in the loop whose value can be predicted before the loop runs
  – e.g. loop counters
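
The two rules above can be sketched as a simple check over a summary of the loop body. This is a deliberately conservative simplification, not the analysis from the talk: `Stmt`, `canParallelize` and the two example loops are hypothetical names, each statement is reduced to the sets of variables it reads and assigns, and any read of a variable assigned anywhere in the loop is rejected unless that variable is marked predictable.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one statement in a loop body.
struct Stmt {
    std::set<std::string> reads;   // variables the statement uses
    std::set<std::string> writes;  // variables the statement assigns
};

// Returns true if no statement reads a variable that is assigned inside the
// loop, except for variables listed in `predictable` (e.g. the loop counter,
// whose value in each iteration is known before the loop starts).
bool canParallelize(const std::vector<Stmt>& body,
                    const std::set<std::string>& predictable) {
    // Collect every variable assigned anywhere in the loop body.
    std::set<std::string> assigned;
    for (const Stmt& s : body)
        assigned.insert(s.writes.begin(), s.writes.end());

    // Reject the loop if any read depends on an unpredictable assignment.
    for (const Stmt& s : body)
        for (const std::string& v : s.reads)
            if (assigned.count(v) && !predictable.count(v))
                return false;
    return true;
}

// Models: for (i = 0; i < n; i++) Display(i);
std::vector<Stmt> counterOnlyLoop() {
    return { Stmt{{"i"}, {}},        // Display(i)
             Stmt{{"i"}, {"i"}} };   // i++
}

// Models: for (i = 0; i < n; i++) sum = sum + i;
std::vector<Stmt> accumulatorLoop() {
    return { Stmt{{"sum", "i"}, {"sum"}},  // sum = sum + i
             Stmt{{"i"}, {"i"}} };         // i++
}
```

With `i` marked predictable, `canParallelize(counterOnlyLoop(), {"i"})` accepts the first loop, while `canParallelize(accumulatorLoop(), {"i"})` rejects the second because `sum` is both read and assigned in the loop. The check also rejects loop-local temporaries that could safely be given a private copy per processor, which is one place a fuller analysis would be less strict.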
Future Work
Implementing a full parser
Better analysis of OpenGL commands
  – Many complex algorithms rely on special uses of alpha and texture buffers that can be detected
Performance Considerations
  – Detect when the cost of parallelizing a loop is more than just running it sequentially