
1 DDDDRRaW: A Prototype Toolkit for Distributed Real-Time Rendering on Commodity Clusters
Thu D. Nguyen and Christopher Peery, Department of Computer Science, Rutgers University
John Zahorjan, Department of Computer Science & Engineering, University of Washington

2 IPDPS 2001 Overview
- Improve real-time rendering performance using distributed rendering on commodity clusters
  - Real-time rendering -> interactive rendering applications
  - Improve performance -> render more complex scenes at interactive rates
- Why real-time rendering?
  - A critical component of an increasing number of continuous media applications
    - Virtual reality, data visualization, CAD, flight simulators, etc.
  - Rendering performance will continue to be a bottleneck
    - Model complexity is increasing as fast as (or faster than) hardware performance
    - Part of the challenge is to leverage increasingly powerful hardware accelerators

3 Challenges
- How to structure the distributed renderer to leverage hardware-assisted rendering
  - Information that is useful for work partitioning and assignment may be hidden in the hardware rendering pipeline
- How to minimize non-parallelizable overheads (avoiding Amdahl’s Law)
- How to decouple the bandwidth requirement from the complexity of the scene and the cluster size
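The concern about non-parallelizable overheads can be made concrete with Amdahl's Law. The sketch below is illustrative only: the function name and the 10% serial fraction used in the note are assumptions, not measurements from the talk.

```python
def amdahl_speedup(serial_fraction: float, nodes: int) -> float:
    """Upper bound on speed-up when `serial_fraction` of the per-frame
    work cannot be spread across rendering nodes (Amdahl's Law)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)
```

For example, with a hypothetical 10% serial share, `amdahl_speedup(0.10, 12)` is about 5.7, well short of the ideal 12.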

4 Image Layer Decomposition (ILD)
- Per-frame rendering load is partitioned using ILD, presented at IPDPS 2000
- Briefly review ILD because it affects DDDDRRaW’s architecture and performance
- Basic idea: assign scene objects such that the sets of objects assigned to different nodes are not mutually occlusive
- Advantages of using ILD
  - Do not need the position of polygons in 2D
    - This information may be hidden inside the graphics pipeline
  - Do not need Z-buffer information
    - This reduces the required bandwidth by at least 50%
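To see why non-mutually-occlusive layers need no Z-buffer, consider a minimal compositing sketch. It assumes each node returns a color layer in which undrawn pixels are marked `None`, and that the layers are already ordered from farthest to nearest; the types and function name are hypothetical, not DDDDRRaW's API.

```python
from typing import List, Optional

Pixel = Optional[int]       # None marks a pixel the node did not draw
Layer = List[List[Pixel]]   # rows of pixels


def composite_back_to_front(layers: List[Layer]) -> Layer:
    """Composite layers ordered farthest to nearest.

    Nearer layers simply overwrite farther ones, so no per-pixel
    depth comparison (and no Z-buffer transfer) is needed.
    """
    if not layers:
        raise ValueError("at least one layer is required")
    height, width = len(layers[0]), len(layers[0][0])
    frame: Layer = [[None] * width for _ in range(height)]
    for layer in layers:
        for y in range(height):
            for x in range(width):
                if layer[y][x] is not None:
                    frame[y][x] = layer[y][x]
    return frame
```

Only color data crosses the network in this scheme, which is the intuition behind the bandwidth saving stated on the slide.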

5 Spatial partitioning and Image Layer Decomposition (ILD)
[Figure: two diagrams, each with regions labeled 1-6]

6 ILD: Work Assignment
- Non-mutually occlusive assignment -> legal for back-to-front compositing
- Use a heuristic-based algorithm to
  - Balance load across the cluster
  - Minimize the screen real estate covered by each assignment
[Figure: regions 1-6 grouped into an assignment marked "Legal"]
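One simple way to obtain a legal, roughly balanced assignment is sketched below. This is not the authors' heuristic: it assumes the scene pieces can already be listed in a valid back-to-front order for the current viewpoint, uses a hypothetical per-piece cost (for example a polygon count), and ignores the screen-coverage goal. Giving each node a contiguous run of that order keeps the layers compositable in node order.

```python
from typing import List, Tuple


def assign_layers(objects_back_to_front: List[Tuple[str, int]],
                  nodes: int) -> List[List[str]]:
    """Split a back-to-front list of (name, cost) pairs into `nodes`
    contiguous groups of similar total cost."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    total = sum(cost for _, cost in objects_back_to_front)
    groups: List[List[str]] = [[] for _ in range(nodes)]
    node, running = 0, 0
    for name, cost in objects_back_to_front:
        # Advance to the next node once the cost assigned so far
        # has reached this node's cumulative share of the total.
        while node < nodes - 1 and running >= total * (node + 1) / nodes:
            node += 1
        groups[node].append(name)
        running += cost
    return groups
```

Because every group is a contiguous slice of the depth order, concatenating the groups reproduces that order, which is what back-to-front compositing relies on.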

7 Implementation: Architecture
[Figure: architecture diagram]
- Display node: App. and DDDDRRaW Library (Partitioning, Assignment, Decompress, Compositing), plus the Display
- Rendering nodes: one DDDDRRaW Library instance each (Rendering, Compress)
- Other labels on the diagram: VRML Scene, Display Window, Viewpoint, Work Assignment, Partial Image

8 Implementation Details
- Implemented an optimization to ILD: dynamic selection of octants to be rendered
  - Minimizes the overhead of geometric transformation due to polygon splitting (in scene decomposition)
- Compression of image layers before communication
  - Reduces the bandwidth requirement to accommodate slower networks (e.g., 100 Mb/s LANs)
- Use dynamic clipping to enforce octant boundaries for scenes with smooth shading and/or texturing
  - Simplification to ease implementation of the prototype; this clipping could/should be done statically
  - 20-25 percent overhead for 5 of our 6 test scenes that would not be present in a production system
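The slides do not say which compression scheme is applied to image layers. As a stand-in, the sketch below uses run-length encoding, which suits layers where large areas are background; the function names and the choice of RLE are assumptions made for illustration.

```python
from typing import List, Tuple


def rle_encode(pixels: List[int]) -> List[Tuple[int, int]]:
    """Collapse runs of identical pixel values into (count, value) pairs."""
    runs: List[Tuple[int, int]] = []
    for value in pixels:
        if runs and runs[-1][1] == value:
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            runs.append((1, value))
    return runs


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (count, value) pairs back into the original pixel list."""
    pixels: List[int] = []
    for count, value in runs:
        pixels.extend([value] * count)
    return pixels
```

A rendering node would encode each layer before sending it, and the display node would decode it before compositing.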

9 Performance Measurement
- Application: VRML viewer VRweb (http://www.iicm.edu/vrwave)
- Collected 6 VRML scenes from the web
  - Use fixed paths through the scenes to measure performance in terms of average frame rate (frames/sec)
- Two clusters representing different points in the technology spectrum
  - Cluster of 5 SGI O2s
    - 180 MHz MIPS R5000, 256 MB memory, SGI Graphics Accelerator, 100 Mb/s switched Ethernet LAN
    - IRIX 6.5.7
  - Cluster of 13 PCs
    - Pentium III 800 MHz, 512 MB memory, Giganet 1 Gb/s cLAN
    - Red Hat Linux (kernel 2.2.14), Mesa 3D library version 3.2

10 Two Test Scenes

11 Overheads on SGI O2s
Columns: Display Node (P=1, P=2, P=4), Rendering Node (P=1, P=2, P=4)

Operation            Time (ms)
ILD                  2.08   1.97   8.68
Clear Image Buffer   3.50
Decompress           18.08  22.84  30.28
Display Frame        0.18
Compress             36.03  27.13  17.70

12 Overheads on PCs
Columns: Display Node (P=1, P=4, P=8, P=12), Rendering Node (P=1, P=4, P=8, P=12)

Operation            Time (ms)
ILD                  2.62   2.63   2.70
Clear Image Buffer   4.98   5.01   5.37   5.24
Decompress           3.29   4.11   4.33   4.46
Display Frame        15.79  15.34  15.73
Compress             7.42   7.52   7.46   7.79

13 Speed-up of Average Frame Rate on O2s

14 Speed-up of Average Frame Rate on PCs

15 Speed-up of Rendering Component on PCs

16 Conclusions
- Can build an ILD-based distributed renderer to significantly improve real-time rendering performance on commodity hardware
- DDDDRRaW currently scales to modestly sized clusters
  - This limitation is due to non-optimal hardware configurations
  - This is NOT because more suitable hardware is not available!
  - Expect good scalability to clusters of 16-32 nodes
- Overlapping communication with computation increases average frame rate, but ONLY at the expense of increasing frame latency
  - The problem is CPU contention between rendering and communication
  - Either need dedicated hardware or can only optimize after reaching 10-15 fps, the nominal interactive frame rate
- Project URL: www.cs.washington.edu/research/ddddrraw/

17 Overlapping Communication & Computation
- Communication and compression are significant sources of overhead
- Apply a standard parallel optimization technique: overlap communication of rendered image layers for one frame with rendering of the next
- Requires pipelining of DDDDRRaW
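A back-of-the-envelope model shows how overlap can raise frame rate while also raising latency. Everything here is an assumption for illustration: the two-phase split, the `contention` slowdown factor standing in for CPU contention, and the millisecond values in the note are not taken from the measurements.

```python
from typing import Tuple


def frame_timing(render_ms: float, comm_ms: float, overlap: bool,
                 contention: float = 1.0) -> Tuple[float, float]:
    """Return (frames_per_second, latency_ms) for a two-phase frame.

    Without overlap, each frame renders and then communicates.
    With overlap, frame i communicates while frame i+1 renders; both
    phases are slowed by `contention` because they share the CPU.
    """
    if overlap:
        render = render_ms * contention
        comm = comm_ms * contention
        period = max(render, comm)   # a frame completes every `period` ms
        latency = render + comm      # each frame still passes both phases
    else:
        period = render_ms + comm_ms
        latency = period
    return 1000.0 / period, latency
```

With hypothetical values of 60 ms rendering, 40 ms communication and a 1.2x contention slowdown, the model gives 10 fps at 100 ms latency without overlap, and roughly 13.9 fps at 120 ms latency with it.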

18 The DDDDRRaW Pipeline
[Figure: three-stage pipeline]
- Stage 1 (display node): ILD, Send
- Stage 2 (rendering nodes): Receive, Render, Compress, Send
- Stage 3 (display node): Receive, Decompress, Composite & Display
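The three-stage structure can be mimicked with threads and bounded queues. This is a toy sketch, not DDDDRRaW code: it assumes the stages run in the order ILD and send, then render and compress, then decompress and display, and each stage only passes a placeholder value along.

```python
import queue
import threading
from typing import List


def run_pipeline(num_frames: int) -> List[str]:
    """Push frame numbers through three concurrent stages and return
    the layers in the order the final stage displayed them."""
    to_render: queue.Queue = queue.Queue(maxsize=1)
    to_display: queue.Queue = queue.Queue(maxsize=1)
    displayed: List[str] = []

    def stage1() -> None:  # stands in for ILD + send work assignment
        for frame in range(num_frames):
            to_render.put(frame)
        to_render.put(None)  # sentinel: no more frames

    def stage2() -> None:  # stands in for receive + render + compress + send
        while (frame := to_render.get()) is not None:
            to_display.put(f"layer-{frame}")
        to_display.put(None)

    def stage3() -> None:  # stands in for receive + decompress + composite
        while (layer := to_display.get()) is not None:
            displayed.append(layer)

    threads = [threading.Thread(target=s) for s in (stage1, stage2, stage3)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return displayed
```

The queues of size one let stage 1 start preparing the next frame while stage 2 is still busy with the current one, which is the overlap the pipeline is meant to provide.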

19 Average Frame Rates

20 Average Frame Latency

