Siggraph 2009 RenderAnts: Interactive REYES Rendering on GPUs. Kun Zhou, Qiming Hou, Zhong Ren, Minmin Gong, Xin Sun, Baining Guo. JAEHYUN CHO

2 Outline ● REYES Rendering ● System Overview ● GPU REYES Rendering ● Dynamic Scheduling ● Multi-GPU Rendering ● Results ● Conclusion

4 REYES rendering ● “Renders Everything You Ever Saw” ● Developed in the 1980s by Carpenter and Cook ● Produces photo-realistic images ● Main idea ● Subdivide every primitive into micropolygons ● In use by Pixar ● PhotoRealistic RenderMan (PRMan)

5 Basic REYES pipeline ● Modeling Application → primitives → Bucketing → Bound → Too Large? (Yes: Split, then back to Bound; No: Dice) → unshaded micropolygons → Shade → shaded micropolygons → Sample → visible points → Composite & Filter → pixels
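
The bound/split loop at the front of this pipeline can be sketched as follows. This is a minimal illustration, not the RenderAnts implementation: the `Patch` type (a 1-D parametric interval with a constant projection scale), the `max_size` threshold, and the midpoint split rule are all assumptions made to keep the example short.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Patch:
    """Hypothetical primitive: a parametric interval with a screen-space extent."""
    u0: float
    u1: float
    pixels_per_unit: float  # assumed constant projection scale

    def bound(self) -> float:
        # Screen-space size of the patch (stand-in for a real bounding box).
        return (self.u1 - self.u0) * self.pixels_per_unit

    def split(self):
        # Split at the parametric midpoint into two smaller patches.
        mid = 0.5 * (self.u0 + self.u1)
        return (Patch(self.u0, mid, self.pixels_per_unit),
                Patch(mid, self.u1, self.pixels_per_unit))


def bound_split(primitives, max_size):
    """Split primitives until each bound is small enough to dice (max_size > 0)."""
    queue = deque(primitives)
    diceable = []
    while queue:
        p = queue.popleft()
        if p.bound() > max_size:   # "Too Large?" -> Yes: split and re-queue
            queue.extend(p.split())
        else:                      # No: hand the primitive to the dicer
            diceable.append(p)
    return diceable


patches = bound_split([Patch(0.0, 1.0, 64.0)], max_size=16.0)
```

Here a patch that covers 64 pixels is split twice, giving four patches of 16 pixels each that are ready for dicing.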

7 System overview ● Map all basic REYES stages to the GPU ● Add three dynamic scheduling stages ● Support multi-GPU rendering

8 RenderAnts system pipeline

10 Bound/Split and Dice

11 Bound/Split and Dice ● Bound/Split ● All input primitives are stored in a queue ● Primitives in the queue are bounded and split in parallel ● Dice ● Primitives in the dicing region are subdivided into micropolygons in parallel
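
The dicing step can be illustrated with a small sketch. It assumes a bilinear patch given by four corner points and a fixed `nu` x `nv` dicing rate; both are simplifications chosen for this example, since the slides do not say which primitive types or dicing rates RenderAnts uses. The loops run serially here, but every vertex is computed independently, which is what makes the step suitable for one GPU thread per vertex.

```python
def lerp(a, b, t):
    """Linear interpolation between two points given as tuples."""
    return tuple((1.0 - t) * x + t * y for x, y in zip(a, b))


def dice(p00, p10, p01, p11, nu, nv):
    """Dice a bilinear patch into an nu x nv grid of quad micropolygons.

    Returns (vertices, quads); each quad holds four indices into vertices.
    """
    vertices = []
    for j in range(nv + 1):
        v = j / nv
        left, right = lerp(p00, p01, v), lerp(p10, p11, v)
        for i in range(nu + 1):
            vertices.append(lerp(left, right, i / nu))

    quads = []
    for j in range(nv):
        for i in range(nu):
            k = j * (nu + 1) + i  # index of the quad's first corner
            quads.append((k, k + 1, k + nu + 2, k + nu + 1))
    return vertices, quads


vertices, quads = dice((0, 0), (4, 0), (0, 4), (4, 4), nu=4, nv=4)
```

A 4 x 4 dicing rate produces 25 vertices and 16 micropolygons.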

12 Shade

13 Shade ● Main idea ● Translate RenderMan shader instructions to GPU shader instructions ● Use a shader compiler ● Every micropolygon vertex is shaded

14 Shade ● Out-of-core texture fetch ● Textures are too large to load into GPU memory at one time ● Use a CPU-side cache manager ● On a cache miss, the GPU is interrupted; the cache manager reads the texture from disk and copies it to the GPU
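
A CPU-side cache manager of this kind might look like the sketch below. The least-recently-used eviction policy, the tile granularity, and the `load_from_disk` callback are assumptions; the slides only state that a miss triggers a disk read followed by a copy to the GPU.

```python
from collections import OrderedDict


class TextureCache:
    """Hypothetical CPU-side texture cache with LRU eviction (assumed policy)."""

    def __init__(self, capacity, load_from_disk):
        self.capacity = capacity              # max number of resident tiles
        self.load_from_disk = load_from_disk  # callback standing in for disk I/O
        self.tiles = OrderedDict()            # tile_id -> texel data, oldest first
        self.misses = 0

    def fetch(self, tile_id):
        if tile_id in self.tiles:
            # Hit: mark the tile as most recently used.
            self.tiles.move_to_end(tile_id)
            return self.tiles[tile_id]
        # Miss: in the real system the GPU would be interrupted here while
        # the tile is read from disk and copied to GPU memory.
        self.misses += 1
        data = self.load_from_disk(tile_id)
        self.tiles[tile_id] = data
        if len(self.tiles) > self.capacity:
            self.tiles.popitem(last=False)    # evict the least recently used tile
        return data


cache = TextureCache(capacity=2, load_from_disk=lambda tid: f"texels:{tid}")
for tid in ["a", "b", "a", "c", "b"]:
    cache.fetch(tid)
```

With room for two tiles, the access pattern above causes four misses, because tile `b` is evicted before it is requested again.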

15 Sample

16 Sample ● Main idea ● Each pixel in the sampling region is divided into subpixels ● If a micropolygon covers the sample location of a subpixel, compute and store a sample point [Figure labels: sample point of left micropolygon; sample point of right micropolygon]

17 Sample ● Compute sample point ● Interpolate the color, opacity and depth values of the micropolygon at the sample location
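
The coverage test and interpolation can be sketched with barycentric coordinates. This assumes each micropolygon has been split into triangles and uses a single color channel for brevity; neither detail comes from the slides.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of 2-D point p in triangle abc, or None if degenerate."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        return None  # zero-area triangle
    w0 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w1 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w0, w1, 1.0 - w0 - w1


def sample_triangle(p, positions, attributes):
    """Interpolated per-vertex attributes at p, or None if p is not covered."""
    weights = barycentric(p, *positions)
    if weights is None or min(weights) < 0.0:
        return None  # sample location lies outside the triangle
    return tuple(sum(w * attr[k] for w, attr in zip(weights, attributes))
                 for k in range(len(attributes[0])))


positions = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
# Per-vertex (color, opacity, depth); color is one channel for brevity.
attributes = [(1.0, 1.0, 1.0), (0.0, 1.0, 2.0), (0.0, 0.5, 3.0)]
hit = sample_triangle((1.0, 1.0), positions, attributes)
miss = sample_triangle((5.0, 5.0), positions, attributes)
```

A covered sample location yields one interpolated (color, opacity, depth) sample point; an uncovered location yields nothing to store.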

18 Composite & Filter

19 Composite & Filter ● Composite ● Sort the sample points of each subpixel in depth order ● Composite the sample points of each subpixel in depth order until the depth of the subpixel is reached, in parallel across subpixels ● Filter ● For each pixel, blend the color and opacity of its subpixels, in parallel across pixels
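
A minimal sketch of the two steps for a single pixel follows. Several choices are assumptions: the "depth of subpixel" is read as the depth of the nearest opaque surface (`opaque_depth`), compositing uses the standard front-to-back "over" rule, color is a single channel, and the filter is a plain average rather than whatever filter kernel the renderer applies.

```python
def composite(samples, opaque_depth=float("inf")):
    """Front-to-back composite of (depth, color, opacity) sample points."""
    color, alpha = 0.0, 0.0
    for depth, c, a in sorted(samples):  # tuples sort by depth first
        if depth > opaque_depth:
            break                         # hidden behind the opaque surface
        color += (1.0 - alpha) * a * c
        alpha += (1.0 - alpha) * a
    return color, alpha


def box_filter(subpixels):
    """Blend subpixel (color, opacity) pairs into one pixel by averaging."""
    n = len(subpixels)
    return (sum(c for c, _ in subpixels) / n,
            sum(a for _, a in subpixels) / n)


samples = [(2.0, 1.0, 1.0), (1.0, 0.5, 0.5)]
front_to_back = composite(samples)
clipped = composite(samples, opaque_depth=1.5)
pixel = box_filter([front_to_back, clipped])
```

Each subpixel is independent in the composite step and each pixel is independent in the filter step, which is what allows both to run in parallel.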

20 Advanced features ● Shadows ● Use shadow maps generated in a shadow pass ● Motion blur & depth of field ● Use an accumulation buffer ● Assign a unique sample time to each subpixel ● Sample only the subpixels whose sample time equals the current rendering time

22 Dynamic scheduling ● Main idea ● Maximize parallelism at each stage ● Estimate memory requirements at each stage

23 Dicing scheduler

24 Dicing scheduler ● Main factor in memory requirements ● Total micropolygon data ● Estimating memory requirements ● Total # of micropolygons is computed from the total # of primitives

25 Dicing scheduler ● Main idea ● Split the current bucket into dicing regions ● Until the # of primitives in the processing region fits in available GPU memory ● Use binary space partitioning (BSP)

26 How to split dicing region? ● Let # of primitives that fit in GPU memory = 2 [Figure labels: bucket, primitive]

27 How to split dicing region? ● Let # of primitives that fit in GPU memory = 2 [Figure labels: bucket, subregion, primitive]

28 How to split dicing region? ● Let # of primitives that fit in GPU memory = 2 [Figure labels: bucket, subregion, primitive]
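
The splitting rule illustrated on these slides can be sketched as a recursive BSP. The midpoint split along the longer axis, the axis-aligned bounding boxes, and the `min_size` stopping guard are assumptions made for this example; the slides only state that the bucket is split with BSP until the primitive count fits in GPU memory.

```python
def overlaps(box, region):
    """True if two (x0, y0, x1, y1) rectangles intersect."""
    return (box[0] < region[2] and box[2] > region[0] and
            box[1] < region[3] and box[3] > region[1])


def split_region(region, boxes, max_prims, min_size=1.0):
    """Recursively split region until each leaf holds at most max_prims boxes.

    Returns a list of (region, boxes_in_region) leaves.
    """
    inside = [b for b in boxes if overlaps(b, region)]
    x0, y0, x1, y1 = region
    w, h = x1 - x0, y1 - y0
    # Stop when the region fits, or when it is too small to split further.
    if len(inside) <= max_prims or max(w, h) <= min_size:
        return [(region, inside)]
    if w >= h:   # split along the longer axis at its midpoint
        mid = 0.5 * (x0 + x1)
        halves = [(x0, y0, mid, y1), (mid, y0, x1, y1)]
    else:
        mid = 0.5 * (y0 + y1)
        halves = [(x0, y0, x1, mid), (x0, mid, x1, y1)]
    leaves = []
    for half in halves:
        leaves.extend(split_region(half, inside, max_prims, min_size))
    return leaves


# Four primitive bounding boxes in an 8 x 8 bucket; at most 2 fit in memory.
boxes = [(1, 1, 2, 2), (3, 3, 3.5, 3.5), (1, 5, 2, 6), (5, 1, 6, 2)]
leaves = split_region((0, 0, 8, 8), boxes, max_prims=2)
```

With a limit of two primitives, as on the slides, the bucket is first cut in half and the crowded left half is cut once more, giving three dicing regions.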

29 Shading scheduler

30 Shading scheduler ● Main factor in memory requirements ● Temporary data allocated during shader execution ● Estimating memory requirements ● Different shaders require different amounts of temporary data

31 Shading scheduler ● Main idea ● Split the micropolygon list into sublists ● Until the # of micropolygons for the current shader execution fits in available GPU memory ● Schedule once per shader execution
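
One way to picture this scheduler is as a batch-size calculation per shader. The function name, the per-micropolygon byte counts, and the fixed-size batching are hypothetical; the slides do not say how RenderAnts sizes or forms its sublists.

```python
def schedule_shading(num_micropolygons, temp_bytes_per_micropolygon, available_bytes):
    """Split a micropolygon list into index ranges that fit in GPU memory.

    Returns a list of (start, end) half-open ranges, one per shader launch.
    """
    batch = available_bytes // temp_bytes_per_micropolygon
    if batch == 0:
        raise MemoryError("a single micropolygon does not fit in GPU memory")
    return [(start, min(start + batch, num_micropolygons))
            for start in range(0, num_micropolygons, batch)]


# The same list needs more launches for a shader with larger temporaries.
heavy_shader = schedule_shading(10, temp_bytes_per_micropolygon=100, available_bytes=350)
light_shader = schedule_shading(10, temp_bytes_per_micropolygon=50, available_bytes=350)
```

Because the estimate depends on the shader, the split is recomputed for each shader execution rather than once per region.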

32 Sampling scheduler

33 Sampling scheduler ● Main factor in memory requirements ● Total data of the subpixel framebuffer and sample points ● Estimating memory requirements ● Framebuffer size equals the region size ● Use a line-scanning process to estimate the # of sample points

34 Sampling scheduler ● Main idea ● Split the current dicing region into sampling regions ● Until the # of sample points in the processing region plus the region size fits in available GPU memory ● Use binary space partitioning (BSP)

36 Multi-GPU rendering ● Main idea ● Minimize inter-GPU communication ● Balance workloads among GPUs

37 How to minimize inter-GPU communication? ● Each GPU maintains a complete list of all primitives in a bucket ● Only region descriptions are transferred between GPUs

38 How to minimize inter-GPU communication? ● Let A, B, C denote the GPUs [Figure labels: bucket; GPU A]

39 How to minimize inter-GPU communication? ● Let A, B, C denote the GPUs [Figure labels: bucket, subregion; GPUs A, B]

40 How to minimize inter-GPU communication? ● Let A, B, C denote the GPUs [Figure labels: bucket, subregion; GPUs A, B, C]

41 How to balance workloads among GPUs? ● Split a region only when both conditions hold ● # of primitives > threshold ● An idle GPU exists

42 How to balance workloads among GPUs? ● Let threshold = 2 [Figure labels: bucket, subregion, primitive; GPUs A, B]

43 How to balance workloads among GPUs? ● Let threshold = 2 [Figure labels: bucket, subregion, primitive; GPUs A, B, C]
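
The two splitting conditions can be sketched as a small simulation. Regions are reduced to 1-D intervals, and the `balance` function, the midpoint split, and the `count_prims` callback are hypothetical names and simplifications for this example. Only the region description changes hands, since every GPU is assumed to hold the full primitive list already.

```python
def balance(region, count_prims, threshold, gpus):
    """Assign 1-D regions (x0, x1) to GPUs by halving overloaded regions.

    A region is split only if it holds more than `threshold` primitives
    AND an idle GPU is available to take the other half.
    """
    assignments = {gpus[0]: region}
    idle = list(gpus[1:])
    work = [gpus[0]]                      # GPUs whose regions still need checking
    while work and idle:
        gpu = work.pop(0)
        x0, x1 = assignments[gpu]
        if count_prims((x0, x1)) <= threshold:
            continue                      # small enough: keep the region as is
        mid = 0.5 * (x0 + x1)
        helper = idle.pop(0)
        assignments[gpu] = (x0, mid)      # owner keeps the first half
        assignments[helper] = (mid, x1)   # only this description is sent
        work += [gpu, helper]
    return assignments


prims = [0.5, 1.5, 2.5, 6.5, 7.0]         # primitive positions along one axis


def count(r):
    return sum(1 for p in prims if r[0] <= p < r[1])


result = balance((0, 8), count, threshold=2, gpus=["A", "B", "C"])
```

With a threshold of 2 and three GPUs, A first shares the bucket with B, then hands half of its still-overloaded subregion to C; splitting stops once no idle GPU remains.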

45 Results

46 Rendering Performance

47 Rendering Time on GPU ● Breakdown of the rendering time on GPU ● Initialization time (data loading from CPU to GPU) is relatively short

48 Scaled Performance on GPU

50 Conclusions ● Advantages ● Faster than CPU-based rendering ● Performance scalability ● Disadvantages ● Geometry scalability ● Motion/focal blur ● Improved in [Hou et al. 2010]

51 Questions & Answers

52 Finish! Thank You

