Published by Alan Higgins. Modified over 9 years ago.
1
Interactive Ray Tracing #2 Peter Djeu April 22, 2003
2
Interactive Ray Tracing S. Parker, W. Martin, P. Sloan, P. Shirley, B. Smits, C. Hansen The University of Utah
3
Goals Implement a brute force interactive ray tracer in software for the SGI Origin 2000 –hardware renderers are inflexible, whereas software renderers can be extended and retested with new algorithms Study and try to compensate for the problems that come with interactive ray tracing (e.g. lighting, shadows, splines)
4
Why is Ray Tracing Appealing? 1. It scales well (keep throwing processors at the problem until the renderer is interactive) 2. Rendering time is sub-linear in the number of primitives in the scene (unlike rasterization, which is linear) 3. Ray tracing is more flexible and the image quality is better (ex: more primitives are allowed, ray tracing generates shadows, highlights, transparency)
5
Overview of the System (page 1)
6
Overview of the System (page 2)
7
Rendering Mode 1: Conventional Mode Create a static set of ray bundles (called jobs) where the job size spans a range of sizes. Jobs are handed out first-come, first-served, largest first, so that the small jobs left at the end balance the load across processors. When all bundles have been processed, display the current frame.
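The scheduling idea on this slide can be sketched in a few lines. This is only an illustration under assumptions: the rule used to shrink job sizes (each job is a fixed fraction of the remaining pixels divided by the worker count), the names `make_jobs` and `render_frame`, and the lock that stands in for a hardware counter are all hypothetical and are not taken from the paper.

```python
import threading

def make_jobs(total_pixels, workers, largest_fraction=0.5, min_size=32):
    """Build a static job list whose sizes shrink toward the end of the frame."""
    jobs, start = [], 0
    while start < total_pixels:
        remaining = total_pixels - start
        # Assumed rule: a job is a share of what is left, never below min_size.
        size = max(min_size, int(remaining * largest_fraction / workers))
        size = min(size, remaining)
        jobs.append((start, start + size))
        start += size
    return jobs

def render_frame(total_pixels, workers, shade):
    """Workers pull jobs first-come, first-served until the list is exhausted."""
    jobs = make_jobs(total_pixels, workers)
    frame = [None] * total_pixels
    next_job = 0
    lock = threading.Lock()  # stands in for an atomic fetch-and-increment

    def worker():
        nonlocal next_job
        while True:
            with lock:
                index = next_job
                next_job += 1
            if index >= len(jobs):
                return
            lo, hi = jobs[index]
            for pixel in range(lo, hi):
                frame[pixel] = shade(pixel)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # the frame is displayed only after every job is done
    return frame, jobs
```

Because the job list is built once and the large jobs come first, a processor that finishes early simply picks up the next (smaller) job, which is where the load balancing comes from.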
8
Conventional Mode: Diagram
9
The Nitty Gritty of Conventional Mode Use job sizes that are multiples of 128 bytes. This is because the machine has 128 byte cache lines (no false sharing). Use the Origin’s fetch and op instruction as a fast synchronization tool –61 µsec on Origin vs. 6 msec on Irix, a big difference
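As a small worked example of the 128-byte rule, the helper below rounds a job up to a whole number of cache lines. The 4 bytes per pixel default and the name `align_job` are assumptions for illustration; the slide only gives the cache line size.

```python
CACHE_LINE_BYTES = 128  # cache line size quoted on the slide

def align_job(pixels, bytes_per_pixel=4):
    """Round a job size (in pixels) up to a whole number of cache lines.

    Assumes bytes_per_pixel divides CACHE_LINE_BYTES evenly.
    """
    pixels_per_line = CACHE_LINE_BYTES // bytes_per_pixel
    lines = -(-pixels // pixels_per_line)  # ceiling division
    return lines * pixels_per_line
```

With 4-byte pixels a cache line holds 32 pixels, so a requested job of 100 pixels becomes 128 pixels. If each job also starts on a cache-line boundary, two processors never write to the same line.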
10
Thoughts on Conventional Mode Good: The algorithm is very simple, but still achieves load balancing. Bad: How many bundles should be created? How large should the largest bundles be? Other limits to a static algorithm… Bad: How are rays grouped into bundles? Do the bundles respect locality (like in Pharr’s paper)?
11
Rendering Mode 2: Frameless Assign a set of pixels to each processor. Each processor will compute its set of pixels as fast as it can, but the screen will be updated at an independent rate. We get guaranteed framerate at the cost of inconsistent image quality.
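The decoupling of rendering from display can be sketched as a deterministic simulation. Everything here is a hypothetical stand-in: the tick-based timing, the per-processor `speeds`, the shuffled pixel assignment, and the function names are assumptions chosen to show the idea, not the authors' implementation.

```python
import random

def assign_pixels(num_pixels, processors, seed=0):
    """Deal a shuffled pixel order to the processors, one disjoint set each."""
    order = list(range(num_pixels))
    random.Random(seed).shuffle(order)  # scatter each set across the screen
    return [order[p::processors] for p in range(processors)]

def simulate(num_pixels, processors, speeds, ticks, display_every, scene_at):
    """speeds[p]: pixels processor p finishes per tick.
    scene_at(pixel, t): color of the pixel if it is rendered at time t."""
    sets = assign_pixels(num_pixels, processors)
    cursor = [0] * processors
    framebuffer = [None] * num_pixels
    displayed = []
    for t in range(ticks):
        for p in range(processors):
            for _ in range(speeds[p]):
                pixel = sets[p][cursor[p] % len(sets[p])]
                framebuffer[pixel] = scene_at(pixel, t)
                cursor[p] += 1
        if (t + 1) % display_every == 0:
            # The display copies whatever is there; it never waits for rendering.
            displayed.append(list(framebuffer))
    return displayed
```

A slow processor does not delay the display; its pixels are simply older (or still empty) in the displayed image, which is the inconsistent image quality the slide mentions.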
12
Frameless Rendering: Diagram
13
Thoughts on Frameless Mode The two competing goals for locality are interesting: when assigning pixels, strong locality means better cache utilization, but strong locality also means that an entire portion of the screen may have a noticeable artifact if that particular processor is overburdened. –How do we find a balance? –Or does such a tradeoff completely kill frameless rendering? Where are the pictures of frameless mode?
14
The Transition to Interactive A static ray tracer can use a variety of hacks which do not apply to interactive ray tracers (ex: lighting and an object’s material hacked based on viewpoint). The authors propose some new ray tracing techniques to improve: lighting, material models, shadows, intersection computations.
15
Whitted Lighting and a Material Model with Categories A robust lighting model with coefficients that determine the category and visual appearance of materials. For efficiency, coefficients that are not needed are set to 0: –Diffuse: no highlights nor specular, just diffuse –Metal: no diffuse, just specular and highlights –Dielectric: (ex: glass, water), formula used for specular coefficients –Polished: complex formula used for overall color
16
Ambient Lighting Problem: Ambient light is usually hacked into a ray tracer so that points not directly in light are lit up. However, if they face away from the light, these regions appear flat (no real ray tracing occurs). Solution: assign a color to “fully facing the light” and a color to “fully facing away from the light.” For all surfaces, interpolate between the two colors based on the surface orientation. No need for additional light rays.
17
Directionally Varying Ambient Lighting Problem: Ambient light is usually hacked into a ray tracer so that points not directly in light are lit up. However, if they face away from the light, these regions appear flat (no real ray tracing occurs). Solution: assign a color to “fully facing the light” and a color to “fully facing away from the light.” For all surfaces, interpolate between the two colors based on the surface orientation. No need for additional light rays.
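The interpolation can be written as a short function. The linear weight based on the cosine between the surface normal and the light direction is an assumption made for this sketch; the slide only says that an interpolated color is used, and the function names are hypothetical.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ambient(normal, to_light, facing_color, away_color):
    """Blend two ambient colors by how much the surface faces the light."""
    n, l = normalize(normal), normalize(to_light)
    # Assumed weight: 1 when fully facing the light, 0 when fully facing away.
    w = 0.5 * (1.0 + dot(n, l))
    return tuple(w * f + (1.0 - w) * a
                 for f, a in zip(facing_color, away_color))
```

The cost is one dot product per shading point and no extra rays, and surfaces that face away from the light still show shape because their ambient color varies with orientation.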
18
Directionally Varying Ambient Lighting in Action
19
Inner / Outer Object Shadows to Approximate Area Lights Problem: Realistic shadows have an umbra and a penumbra created by area lights, not hard shadows Solution: treat an area light as a point. Based on the size / shape of the light, construct an inner and outer object for the shadow caster. Create two shadow regions, and interpolate the transparency between them to simulate the soft shadow.
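A minimal sketch of the inner / outer idea for a spherical shadow caster follows. The inner and outer radii are passed in directly rather than derived from the light's size and shape, and the linear falloff between them is an assumption; `light_fraction` is a hypothetical name.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def light_fraction(point, light, center, inner_r, outer_r):
    """Fraction of light reaching `point` past a spherical occluder.

    0.0 means the shadow ray passes through the inner object (umbra),
    1.0 means it misses the outer object entirely (fully lit).
    """
    seg = sub(light, point)  # shadow ray toward the point light
    # Closest point on the shadow segment to the occluder's center.
    t = dot(sub(center, point), seg) / dot(seg, seg)
    t = max(0.0, min(1.0, t))
    closest = tuple(p + t * s for p, s in zip(point, seg))
    offset = sub(closest, center)
    d = math.sqrt(dot(offset, offset))
    if d <= inner_r:
        return 0.0
    if d >= outer_r:
        return 1.0
    # Assumed linear blend across the penumbra region.
    return (d - inner_r) / (outer_r - inner_r)
```

Only one shadow ray per light is cast, so the soft edge costs little more than a hard shadow, at the price of being an approximation of a true area light.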
20
Diagram of Inner / Outer Objects
21
Picture of Soft Shadows in Practice
22
Subdividing Spline Surfaces Problem: Usually, splines are tessellated in ray tracers, which means there can be an explosion in memory usage and / or pipeline saturation Solution: use a bottom up technique of creating bounding volumes, then compute intersections using Broyden’s method. This is fast (~ 3 iterations / query) and is memory efficient.
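The root-finding step can be illustrated with Broyden's method on a 2x2 system. Several things here are assumptions for the sake of a runnable sketch: the ray is expressed as the intersection of two planes, a simple paraboloid patch stands in for a spline surface, the initial Jacobian comes from finite differences, and the starting guess plays the role that the bounding volume hit would play. All function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_as_planes(origin, direction):
    """Two planes (normal, offset) whose intersection is the ray's line."""
    axis = [0.0, 0.0, 0.0]
    axis[min(range(3), key=lambda i: abs(direction[i]))] = 1.0
    n1 = cross(direction, axis)
    n2 = cross(direction, n1)
    return [(n1, -dot(n1, origin)), (n2, -dot(n2, origin))]

def broyden_2d(F, x, tol=1e-10, max_iter=50, h=1e-6):
    """Solve F(x) = 0 for a 2-vector x; returns (solution, iterations)."""
    fx = F(x)
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):  # finite-difference Jacobian, computed only once
        xh = list(x)
        xh[j] += h
        fh = F(xh)
        for i in range(2):
            J[i][j] = (fh[i] - fx[i]) / h
    for iteration in range(1, max_iter + 1):
        det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
        # Solve J * dx = -fx by Cramer's rule.
        dx = [(-fx[0] * J[1][1] + fx[1] * J[0][1]) / det,
              (-fx[1] * J[0][0] + fx[0] * J[1][0]) / det]
        x_new = [x[0] + dx[0], x[1] + dx[1]]
        f_new = F(x_new)
        if max(abs(f_new[0]), abs(f_new[1])) < tol:
            return x_new, iteration
        # Rank-one update: J += ((df - J dx) dx^T) / (dx . dx)
        df = [f_new[i] - fx[i] for i in range(2)]
        Jdx = [J[i][0] * dx[0] + J[i][1] * dx[1] for i in range(2)]
        denom = dx[0] * dx[0] + dx[1] * dx[1]
        for i in range(2):
            for j in range(2):
                J[i][j] += (df[i] - Jdx[i]) * dx[j] / denom
        x, fx = x_new, f_new
    raise RuntimeError("Broyden iteration did not converge")

def intersect(surface, origin, direction, guess):
    """Find (u, v) where surface(u, v) lies on the ray's line."""
    planes = ray_as_planes(origin, direction)
    F = lambda uv: [dot(n, surface(uv[0], uv[1])) + d for n, d in planes]
    return broyden_2d(F, list(guess))
```

For example, `intersect(lambda u, v: (u, v, 0.25 * (u * u + v * v)), (0.3, 0.6, 2.0), (0.2, -0.1, -1.0), (0.5, 0.5))` returns the patch parameters of the hit point and the number of iterations used. Unlike Newton's method, no surface derivatives are evaluated after the initial estimate, and nothing is tessellated or stored per surface.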
23
Results: page 1
24
Results: page 2
25
Results: page 3 Rendering Room scene (small scene): 9.4 Mb/s Rendering the Female Dataset (large scene): 2.1 Mb/s to 8.4 Mb/s, note this is less than before Most scenes fit within 4 Mb of secondary cache Dynamic (aka moving) objects: no acceleration, they are processed using the standard algorithm Depth complexity has little effect on rendering speed.
26
Critique of Results (page 1) Why does one chart go up to 64 proc.’s, while the other has a max of 128 proc.’s? How was the ideal performance calculated? Note that the ideal line is NOT the same on both charts: (64, 7x) vs. (64, 10x) Why is there a drop-off in both charts? –Any ideas?
27
Critique of Results (page 2) No attempt was made to explain why a larger scene “ironically” uses less memory bandwidth than a smaller scene. –Coherence? Occlusion? Something else? Will most scenes in the future fit within the secondary cache (4 Mb in this case)? The authors mainly address the making of an interactive ray tracer for static scenes. The results presented seem more like an afterthought.
28
Conclusions There is still much work to be done in the world of ray tracing, including: –anti-aliasing, dynamic scenes, performance guarantees, API creation, hardware Creating (and using) better ray tracers means that we will be better able to focus our efforts for future work –usefulness of soft shadows, BRDF’s, reflections
29
State of the Art in Interactive Ray Tracing I. Wald and P. Slusallek Saarland University, Germany
30
Goals Create a survey of contemporary raytracing. Topics include: –the weaknesses of rasterization –different ways to ray trace –ray tracing on different platforms (supercomputers, PC’s, PC clusters) –recent research Talk about their research
31
Problems with Rasterization With respect to the number of polygons in the scene, the complexity is O(n) rather than O(log n) Hard to scale to parallel architectures because of high communication needs Hard to incorporate a shader into the pipeline
32
Another look at Rasterization O(n) rather than O(log n) –since when has O(n) been a problem in terms of scalability? Of course, O(log n) is better, but… Hard to scale to parallel architectures Hard to incorporate a shader –is this still true when we have shader languages such as Cg and Microsoft’s HLSL?
33
Benefits of Ray Tracing Flexible –different types of rays, different primitives O(log n) Shading only done on visible components Shaders easier to add (no pipeline) (?) Correct reflections, refractions Parallel and Scalable Coherent when using a Pharr-like algorithm
34
Raytracing is faster, but…. Almost all tests currently use just primary rays. Shadows and reflections will drop frame rate by a constant factor. Acceleration structures like BSP trees are the source for the speed, but they are heavily dependent on static scenes. They do not (currently) support dynamic objects.
35
Different Forms of Ray Tracing 1. Rasterization-Based – do a quick rasterization pass, and then add ray traced effects (artifacts) 2. Image-Based – kind of like frameless rendering from Parker’s paper (artifacts) 3. Approximation-Based – sample certain regions and interpolate (artifacts) 4. Acceleration-Based – construct fast-culling data structures, exploit coherence, respect the memory hierarchy (no artifacts?)
36
Approximate Ray Tracing Main idea: the visual feedback from interactivity (i.e. frame rate) can often be more important than visual correctness –ex: Sonic 2 and Blast Processing Examples: –Rasterize, then use corrective textures for highlights –the RenderCache reuses rays within an error bound (however, this is great for off-line global illumination) –the Holodeck keeps all generated rays on disk and reuses them
37
Perceptually Guided Corrective Texturing
38
Ray Tracing Platform 1: Supercomputers Using a 96-proc. SGI PowerChallenge, Muuss was able to ray trace a scene that could not be rasterized (1995) Parker et al. used an SGI Origin 2000 to create a ray tracer that could support triangle and non-triangle scenes (1999)
39
Ray Tracing Platform 2: Desktop PC’s Why? –Supercomputers are rare, while PC’s are everywhere –Work for stand-alone PC’s could lead to efficient ray tracers on cluster PC’s Challenges of using a CPU –reduce branches and complexity, respect the memory hierarchy, reduce memory bandwidth
40
Points to Note on the Desktop PC Implementation Shading takes up far less than 10% of the total rendering time SIMD CPU instructions (aka vector ops) produce only a 2x speedup –Do 4 rays on one tri., not 4 tri.’s on 1 ray Wald’s implementation was compared to freely available POV-Ray and Rayshade –11x – 15x speedup
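The "4 rays on one tri." layout can be sketched as follows. Plain Python has no SIMD, so the inner loop merely stands in for the four lanes; the Möller–Trumbore test used here is a standard ray/triangle test chosen for the illustration and is not claimed to be the one used in Wald's implementation.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def intersect_packet(origins, directions, v0, v1, v2, eps=1e-9):
    """Test a packet of rays against ONE triangle (Möller–Trumbore).

    Returns one entry per ray: the hit distance t, or None for a miss.
    """
    # Triangle setup is done once and shared by every ray in the packet.
    e1, e2 = sub(v1, v0), sub(v2, v0)
    hits = []
    for o, d in zip(origins, directions):  # one iteration per SIMD lane
        p = cross(d, e2)
        det = dot(e1, p)
        if abs(det) < eps:  # ray parallel to the triangle's plane
            hits.append(None)
            continue
        s = sub(o, v0)
        u = dot(s, p) / det
        q = cross(s, e1)
        v = dot(d, q) / det
        t = dot(e2, q) / det
        inside = u >= 0.0 and v >= 0.0 and u + v <= 1.0 and t > eps
        hits.append(t if inside else None)
    return hits
```

Neighboring rays tend to visit the same triangles, so the triangle data is fetched once per packet, which is consistent with the slide's preference for 4 rays on 1 triangle over 4 triangles on 1 ray.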
41
Table 2 (less is more)
42
Ray Tracing vs. Rasterization (on Desktop PC’s) We can already achieve the crossover point (see bottom row of the next slide) SGI Performer (a rasterizer) running on powerful desktops is comparable to ray tracing on a more modest desktop
43
Table 3 (bigger is better)
44
Conclusions from Desktop PC Results Raytracing has a high startup cost per ray, but… It scales well as scenes get more complex, a crossover point exists regardless of screen resolution You can do correct reflections, etc.
45
Figure 8
46
Ray Tracing Platform 3: Clusters of PC’s Because ray tracing is embarrassingly parallel, let’s try to build a cheap PC cluster-based ray tracer Challenges: –no shared memory on PC clusters Setup: a scheduling machine, a display machine, and lots of processing machines
47
Data Management in the Cluster World An NFS based data fetch system blocks on a data miss, and this is too costly Instead, the scene cache is managed in software via an asynchronous loader thread; a ray is suspended until its data arrives Compression-Decompression is used for voxels transferred over the network
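The suspend-and-resume behavior can be sketched with a loader thread and two queues. This is a simplified illustration under assumptions: `fetch` is a hypothetical stand-in for the network transfer (decompression would happen there), a ray is reduced to an id plus the single voxel it needs, and the class and function names are invented for the example.

```python
import queue
import threading

class SceneCache:
    """Software-managed voxel cache filled by an asynchronous loader thread."""

    def __init__(self, fetch):
        self.fetch = fetch            # hypothetical: voxel_id -> voxel data
        self.data = {}                # voxels already resident on this machine
        self.requests = queue.Queue()
        self.arrived = queue.Queue()
        threading.Thread(target=self._loader, daemon=True).start()

    def _loader(self):
        while True:
            voxel = self.requests.get()
            self.arrived.put((voxel, self.fetch(voxel)))

def trace_all(rays, cache, shade):
    """rays: list of (ray_id, voxel_id). Returns {ray_id: shaded result}."""
    ready = list(rays)
    suspended = {}                    # voxel_id -> rays parked on that voxel
    results = {}

    def absorb(block):
        try:
            voxel, data = cache.arrived.get(block=block)
        except queue.Empty:
            return
        cache.data[voxel] = data
        ready.extend(suspended.pop(voxel, []))  # resume the waiting rays

    while ready or suspended:
        if not ready:
            absorb(block=True)        # no other work left: wait for the loader
            continue
        ray_id, voxel = ready.pop()
        if voxel in cache.data:
            results[ray_id] = shade(ray_id, cache.data[voxel])
        else:
            if voxel not in suspended:
                cache.requests.put(voxel)       # request each voxel only once
            suspended.setdefault(voxel, []).append((ray_id, voxel))
        absorb(block=False)
    return results
```

The tracing loop blocks only when every remaining ray is waiting on data, so a cache miss costs one parked ray instead of a stalled processor.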
48
Other Issues in the Cluster World Preprocess - Create an adaptive BSP tree with small voxels in detailed regions and large voxels in sparse regions, O(n log n) Load balancing – assign voxels to machines that have already done them, only good for small scenes Interconnect – Gigabit ethernet and switch –is this fair?
49
Results for the Cluster Ray Tracer On a 12.5 million tri. power plant model, 3-5 fps almost constantly (8-10 fps with SIMD instructions), comparable to rasterizer Adding reflective rays: the performance hit is proportional to # of traced rays, reduced coherence -> little effect on performance (!) Stress test of a 4x Power Plant (50 million tri’s) found that indoor scenes were not affected (2 extra BSP tree levels), while outside scenes with motion suffered from large voxel transfer over network
50
Network Saturation vs. Scalability
51
Hardware Support in the Future RAYA – simulations say build a ray tracer on a single chip Smart Memories – a programmable and configurable architecture, should be able to get 50 fps at 512 x 512 (!) Saarland’s own architecture – a ray tracing pipeline, coherence is enforced by having ray traversal and intersection run on a clock
52
Smart Memories
53
Saarland’s Pipeline
54
Ongoing Ray Tracing Research Dynamic scenes –some work done, Reinhard proposes making large objects live in coarser levels of the hierarchy to maintain constant update cost Ray tracing API –try to make it like OpenGL, like C and Java Interactive Global Illumination –more like an application of ray tracing
55
Reinhard on dynamic hierarchies
56
Parting Thoughts Ray tracing and rasterization are, in a way, converging –Occlusion culling, hierarchical z-buffer, advanced shading Still different in that ray tracing selects only the geometry needed, while rasterization needs to conservatively send all tri’s that might be visible “We strongly believe that what we see today is only the beginning of an exciting new field of computer graphics.”