Slide 1: Perception-Based Global Illumination, Rendering and Animation Techniques
Karol Myszkowski, Max-Planck-Institut für Informatik
Slide 2: Outline
Perceptually based Animation Quality Metric
AQM applications:
–IBR techniques in walkthrough animation
–Global illumination for dynamic environments
Visual attention driven interactive rendering
The atrium model
Slide 3: Animation Quality Metric
Slide 4: Questions of Appearance Preservation
The concern is not whether images are the same; rather, the concern is whether images appear the same.
How much computation is enough? How much reduction is too much?
An objective metric of image quality that takes into account basic characteristics of human perception could help answer these questions without human assistance.
Slide 5: Motivation
In the traditional approach to rendering high-quality animation sequences, every frame is considered separately. This precludes accounting for the visual sensitivity to temporal detail.
Our goal is to improve the performance of walkthrough animation rendering by considering both the spatial and temporal aspects of human perception.
We want to focus computational effort on those image details that can be readily perceived in the animated sequence.
Slide 6: Modeling important characteristics of the Human Visual System
Contrast Sensitivity Function (CSF), which specifies the detection threshold for a stimulus as a function of its spatial and temporal frequencies.
Temporal and spatial mechanisms (channels), which are used to represent the visual information at various scales and orientations, as the primary visual cortex is believed to do.
Visual masking, which affects the detection threshold of a stimulus as a function of an interfering background stimulus that is closely coupled in space and time.
Slide 7: Spatiovelocity Contrast Sensitivity Function
Contrast sensitivity data for traveling gratings of various spatial frequencies were derived in Kelly's psychophysical experiments (1960).
Daly (1998) extended Kelly's model to account for target tracking by eye movements.
[Figure: spatiovelocity CSF surface; axes: log visual sensitivity, log velocity (deg/sec), log spatial frequency (cycles/deg), temporal frequency (Hz)]
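A minimal Python sketch of the Kelly spatiovelocity CSF with Daly's eye-movement compensation, as referenced above. The constants (c0, c1, c2, drift and smooth-pursuit velocities) follow commonly cited fits from Daly (1998); the exact values used in this work are not given on the slide, so treat them as approximate.

    import numpy as np

    def retinal_velocity(v_image, v_min=0.15, v_max=80.0):
        # Daly's eye-movement model: the eye tracks the target with ~82% efficiency,
        # plus natural drift (v_min) and a smooth-pursuit limit (v_max), both in deg/s.
        return v_image - np.minimum(0.82 * v_image + v_min, v_max)

    def sv_csf(rho, v_r, c0=1.14, c1=0.67, c2=1.7):
        # Kelly's spatiovelocity CSF with Daly's scaling constants c0..c2.
        # rho: spatial frequency [cycles/deg], v_r: retinal velocity [deg/s].
        v = np.maximum(c2 * v_r, 1e-3)                 # avoid log(0) for static patterns
        k = 6.1 + 7.3 * np.abs(np.log10(v / 3.0)) ** 3
        rho_max = 45.9 / (v + 2.0)                     # peak frequency shifts with velocity
        return c0 * k * v * (c1 * 2 * np.pi * rho) ** 2 * np.exp(-c1 * 4 * np.pi * rho / rho_max)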
Slide 8: Spatial and orientation mechanisms
The following filter banks are commonly used:
Gabor functions (Marcelja80),
Steerable pyramid transform (Simoncelli92),
Discrete Cosine Transform (DCT),
Difference of Gaussians (Laplacian) pyramids (Burt83, Wilson91),
Cortex transform (Watson87, Daly93).
Slide 9: Cortex transform: organization of the filter bank
Slide 10: Cortex transform: orientation bands
[Figure: input image and its decomposition into orientation bands]
Slide 11: Visual masking
Masking is strongest between stimuli located in the same perceptual channel, and many vision models are limited to this intra-channel masking.
The following threshold elevation model is commonly used:
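The formula itself is not reproduced in the extracted text. One commonly used form of the threshold elevation function (e.g. the one in Daly's Visible Differences Predictor; the constants are fitted and not given on the slide) is

    T_e \;=\; \Bigl(1 + \bigl(k_1\,(k_2\,|c_m|)^{s}\bigr)^{b}\Bigr)^{1/b}

where c_m is the normalized contrast of the masking signal in a given channel, and k_1, k_2, s, b control the knee and slope of the masking curve.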
Slide 12: Experimental findings on the human perception of animated sequences
The requirements imposed on the quality of still images must be higher than for images used in an animated sequence. The quality requirements can usually be relaxed as the velocity of the visual pattern increases.
The perceived sharpness of blurred visual patterns increases with their motion velocity, which is attributed to higher-level processing in the visual system.
The human eye is less sensitive to higher spatial frequencies than to lower frequencies.
Slide 13: Video quality metrics
Virtually all state-of-the-art perception-based video quality metrics account for the discussed HVS characteristics.
A majority of the existing video quality metrics have been developed to evaluate the performance of digital video coding and compression techniques, e.g., Lambrecht (1996), Lubin (1997), and Watson (1998).
The lack of comparative studies makes it unclear which metric performs best.
We use our own metric, which takes advantage of data readily available for synthetic images.
Slide 14: Deriving pixel flow using Image-Based Rendering techniques
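A minimal sketch of how per-pixel flow can be derived by 3D warping, assuming the renderer provides a depth map for the current frame and the camera matrices of both frames; the function and matrix names are hypothetical, not taken from the presented system.

    import numpy as np

    def pixel_flow(depth, K, cam_to_world_a, world_to_cam_b):
        """Image-space motion of every pixel between two known camera poses (3D warping).

        depth: (H, W) eye-space depth of frame A; K: 3x3 intrinsics (same for both frames);
        cam_to_world_a, world_to_cam_b: 4x4 rigid transforms. Returns (H, W, 2) flow in pixels.
        """
        H, W = depth.shape
        u, v = np.meshgrid(np.arange(W), np.arange(H))
        # Unproject pixels of frame A to camera space using the depth buffer.
        pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
        cam_a = (np.linalg.inv(K) @ pix[..., None])[..., 0] * depth[..., None]
        # Transform to world space, then into the camera of frame B.
        cam_a_h = np.concatenate([cam_a, np.ones_like(depth)[..., None]], axis=-1)
        cam_b = (world_to_cam_b @ cam_to_world_a @ cam_a_h[..., None])[..., 0][..., :3]
        # Project into frame B and take the pixel displacement as the flow vector.
        proj = (K @ cam_b[..., None])[..., 0]
        uv_b = proj[..., :2] / proj[..., 2:3]
        return uv_b - np.stack([u, v], axis=-1)

The resulting per-pixel velocities (converted to deg/s using the display geometry and frame rate) are exactly the data the spatiovelocity CSF stage of the AQM needs.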
Slide 15: Animation Quality Metric (AQM)
A perception-based visible differences predictor for still images (Eriksson et al., 1998) was extended.
Pixel flow derived via 3D warping provides the velocity data required by Kelly's SV-CSF model.
[Diagram: Image 1 and Image 2 each pass through amplitude compression, global contrast, the spatiovelocity CSF, cortex filtering, and visual masking; the per-channel differences are pooled into a map of perceptual differences. The pixel flow feeding the CSF stages is obtained by a 3D warp from the range data and camera parameters.]
Slide 16: Using IBR techniques to improve the performance of animation rendering
Assumptions:
- static environments
- predefined animation path
Joint work with P. Rokita and T. Tawara
Slide 17: Animation rendering - objectives
Use ray tracing to compute all keyframes and selected glossy and transparent objects.
For in-between frames, derive as many pixels as possible using computationally inexpensive Image-Based Rendering techniques.
The animation quality as perceived by the human observer must not be affected.
Slide 18: Keyframe placement - our approach
Our goal is to find an inexpensive, automatic solution that reduces the animation artifacts which can be perceived by the human observer.
Our solution consists of two stages:
–initial keyframe placement, which reduces the number of pixels that cannot be properly derived using IBR techniques due to occlusion problems,
–further refinement of the keyframe placement, which takes perceptual considerations into account and is guided by AQM predictions (see the sketch below).
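A minimal sketch of how the AQM-guided refinement stage could be organized, assuming hypothetical helpers render_keyframe, warp, and aqm_percent_visible_diff; it illustrates recursive segment splitting under these assumptions rather than the exact procedure used in the presentation.

    def refine_segment(frames, i, j, threshold, scene):
        """Recursively split segment [i, j] until predicted warping artifacts stay below threshold."""
        if j - i < 2:
            return [i, j]
        key_i = render_keyframe(scene, frames[i])      # ray-traced keyframes (hypothetical helper)
        key_j = render_keyframe(scene, frames[j])
        mid = (i + j) // 2
        # Derive the middle in-between frame twice, once from each surrounding keyframe.
        from_i = warp(key_i, frames[i].camera, frames[mid].camera)
        from_j = warp(key_j, frames[j].camera, frames[mid].camera)
        # If the AQM predicts visible differences between the two derivations,
        # IBR alone is not good enough here: insert a keyframe at mid and recurse.
        if aqm_percent_visible_diff(from_i, from_j, frames[mid]) > threshold:
            return refine_segment(frames, i, mid, threshold, scene)[:-1] + \
                   refine_segment(frames, mid, j, threshold, scene)
        return [i, j]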
Slide 19: In-between frame generation
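A rough sketch of how an in-between frame could be composited from the two surrounding keyframes, with occlusion holes and view-dependent pixels completed by ray tracing. The helper names (warp_with_depth, empty_frame_like, is_glossy_or_transparent, ray_trace_pixel) are assumptions made for illustration.

    def generate_inbetween(key_prev, key_next, camera_mid, scene):
        # Warp both keyframes into the in-between view; each warp also returns
        # a per-pixel validity mask (False where occlusions leave holes).
        img_a, ok_a, depth_a = warp_with_depth(key_prev, camera_mid)
        img_b, ok_b, depth_b = warp_with_depth(key_next, camera_mid)
        frame = empty_frame_like(img_a)
        for p in frame.pixels():
            if ok_a[p] and ok_b[p]:
                # Both keyframes cover this pixel: keep the sample that is nearer in depth.
                frame[p] = img_a[p] if depth_a[p] <= depth_b[p] else img_b[p]
            elif ok_a[p] or ok_b[p]:
                frame[p] = img_a[p] if ok_a[p] else img_b[p]
            else:
                frame[p] = ray_trace_pixel(scene, camera_mid, p)   # fill holes from scratch
            if is_glossy_or_transparent(scene, camera_mid, p):
                # View-dependent shading cannot be reused across views; recompute it.
                frame[p] = ray_trace_pixel(scene, camera_mid, p)
        return frame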
Slide 20: Examples of final frames
[Images: a supersampled frame used in traditional animations, and the corresponding frame derived using our approach]
In both cases the perceived quality of the animation appears to be similar!
Slide 21: Exploiting spatial and temporal coherence of indirect lighting in high-quality animation rendering
Assumptions:
- dynamic environments
- predefined animation path
Joint work with T. Tawara, H. Akamine and H.-P. Seidel
Slide 22: Indirect lighting in animated sequences
Basic characteristics:
Usually quite costly to compute.
Usually changes slowly and smoothly in both the temporal and spatial domains.
Practical approaches:
Compute indirect lighting for every n-th frame and assume that it does not change for in-between frames.
In some global illumination frameworks, interpolation between a pair of keyframes is possible.
Slide 23: Possible problems
Popping effects.
Improper image appearance in regions illuminated mostly by indirect lighting.
Slide 24: Our framework
Density Estimation Particle Tracing algorithm.
Illumination reconstruction using the histogram density estimation method (photon bucketing into a dense mesh).
Photon storage for possible re-use in neighboring frames.
Photons are computed for each frame and "attached" to mesh elements, even for moving objects.
Direct lighting is computed from scratch for each frame using ray tracing.
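A minimal sketch of the histogram density estimation step described above: photon energy is accumulated ("bucketed") per mesh element and converted to an irradiance estimate. The data layout and the find_element lookup are assumptions, not the presentation's actual implementation.

    from collections import defaultdict

    def bucket_photons(photons, mesh):
        """Histogram density estimation: accumulate photon power per mesh element."""
        flux = defaultdict(float)
        for p in photons:                          # p.position: hit point, p.power: radiant flux [W]
            elem = mesh.find_element(p.position)   # hypothetical lookup of the element that was hit
            flux[elem] += p.power
        # Irradiance estimate per element: collected flux divided by the element's area.
        return {elem: power / mesh.area(elem) for elem, power in flux.items()}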
Slide 25: Temporal photon processing
Contradictory requirements:
maximize the number of photons collected in the temporal domain, to reduce the noise that is inherent in stochastic solutions;
minimize the number of neighboring frames for which those photons were traced, to avoid collecting invalid photons.
Slide 26: Temporal photon processing: our solution
An energy-based stochastic error metric is used to guide the photon collection in the temporal domain.
–The metric is applied to each mesh element and to every frame.
–Thus, it must be inexpensive.
The perception-based AQM is used for finding the minimal number of photons per frame that reduces the noise level below the visibility threshold.
Slide 27: Algorithm
1. Initialization: determine the initial number of photons per frame.
2. Adjust the animation segment length depending on the temporal variations of indirect lighting, which are measured using energy-based criteria.
3. Adjust the number of photons per frame based on the AQM response, to limit the perceivable noise.
4. Spatiotemporal reconstruction of indirect lighting.
5. Spatial filtering step.
(A sketch of this procedure follows.)
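A rough sketch of how steps 1-5 might be organized as a loop over animation segments. The segment-splitting criterion, the AQM-based noise test, and all helper names are assumptions made for illustration only.

    def render_indirect_lighting(frames, init_photons, aqm_threshold, scene):
        photons_per_frame = init_photons                             # step 1: initialization
        for segment in split_by_temporal_variation(frames, scene):   # step 2: energy-based criteria
            # Step 3: grow the photon budget until the AQM no longer predicts visible noise.
            while True:
                trace_photons(scene, segment, photons_per_frame)
                indirect = spatiotemporal_reconstruction(segment)    # step 4: reuse photons across frames
                if aqm_predicted_noise(indirect, segment) <= aqm_threshold:
                    break
                photons_per_frame *= 2
            spatial_filter(indirect)                                  # step 5: remove residual noise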
Slide 28: [Images: reference solution, adaptive photon collection, and the AQM-predicted differences]
Slide 29: The AQM predictions
Slide 30: Distribution of mesh elements for frame K as a function of the number of preceding (negative values) and following frames for which temporal photon processing was possible. The maximum assumed temporal expansion is specified in the legend.
Slide 31: The AQM-predicted percentage of pixels with perceivable differences as a function of the number of photons per frame.
Slide 32: [Images: frames computed with 10,000 and 25,000 photons per frame, each with temporal processing ON and OFF]
Slide 33: Visual Attention Driven Progressive Rendering for Interactive Walkthroughs
Joint work with J. Haber, H. Yamauchi and H.-P. Seidel
Slide 34: Rendering glossy surfaces in interactive applications
The Lambertian lighting component is stored in illumination maps:
–High-quality radiosity solutions.
–Mesh and textures used to reconstruct the illumination.
The quality of graphics-hardware-supported solutions is too low.
Ray tracing is too costly to perform for all pixels representing glossy surfaces.
Slide 35: How to obtain the best image quality as perceived by human observers?
Use visual attention models to drive corrective computations for glossy objects that are likely to be "attended":
–Consider both the saliency- and task-driven selection of those objects.
–Shading artifacts of "unattended" objects are likely to remain unnoticed.
Use a progressive rendering approach (see the sketch below):
–Hierarchical sample splatting in image space.
–Cache samples and re-use them for similar views.
Use multiple processors to increase the sample count.
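A minimal sketch of attention-driven progressive splatting, assuming a precomputed per-pixel priority map (normalized to [0, 1]) and a hypothetical ray_trace_sample helper; splats start coarse and shrink level by level, so attended regions converge first within the per-frame sample budget. This is an illustrative scheme, not the presented system's actual splatting hierarchy.

    import numpy as np

    def progressive_corrective_splatting(priority, frame, budget, ray_trace_sample,
                                         levels=6, base_splat=32):
        """Spend a per-frame budget of ray-traced samples where the attention model says it matters.

        priority: (H, W) map combining saliency and task-driven importance (higher = more important).
        frame:    (H, W, 3) image produced by OpenGL that the samples correct in place.
        """
        H, W = priority.shape
        samples_used = 0
        for level in range(levels):
            splat = max(base_splat >> level, 1)        # splat size halves at each level
            for y in range(0, H, splat):
                for x in range(0, W, splat):
                    if samples_used >= budget:
                        return frame
                    block = priority[y:y + splat, x:x + splat]
                    # Finer levels are only spent on blocks the attention model ranks highly.
                    if block.mean() < level / levels:
                        continue
                    color = ray_trace_sample(x + splat // 2, y + splat // 2)
                    frame[y:y + splat, x:x + splat] = color   # splat the corrected sample
                    samples_used += 1
        return frame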
Slide 36: Visual attention model
Bottom-up processing is purely saliency-driven and follows the attention model developed by Itti et al. (1998).
Top-down processing is added to account for volition-controlled and task-dependent attention.
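A simplified sketch of how the two pathways might be combined into a single per-object priority. The Itti-style saliency computation is reduced to a single intensity channel here, and the additive weighting scheme is an assumption for illustration rather than the exact model used.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def bottom_up_saliency(intensity, scales=(2, 4, 8)):
        """Very reduced Itti-style saliency: center-surround differences on an intensity channel."""
        saliency = np.zeros_like(intensity, dtype=np.float64)
        for s in scales:
            center = gaussian_filter(intensity, sigma=s)
            surround = gaussian_filter(intensity, sigma=4 * s)
            saliency += np.abs(center - surround)   # conspicuous where center differs from surround
        return saliency / (saliency.max() + 1e-12)

    def object_priority(saliency, object_ids, task_weight):
        """Combine bottom-up saliency with top-down, task-dependent weights per object."""
        priority = {}
        for oid in np.unique(object_ids):
            mask = object_ids == oid
            priority[oid] = saliency[mask].mean() + task_weight.get(oid, 0.0)
        return priority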
Slide 37: System overview
Slide 38: Visual attention processing
[Images: OpenGL rendering, corrective splatting, the converged solution (5 s), and the saliency map]
Slide 39: [Images: results for splat levels 1-2, 1-3, and 1-6; model by Stamminger and Drettakis]
Slide 40: Adaptive splatting
[Images: warped old samples with zoom-in; new samples at levels 1-2 (58 samples) and levels 1-3 (188 samples)]
Slide 41: Dynamic load balance
Onyx3 with 8 processors:
–6 processors for corrective ray tracing.
–Colors indicate pixels computed by different processors.
Slide 42: Timings measured on Onyx3
Computing the saliency map: less than 0.3 s.
Aging of samples: 0.01 s; warping and splatting cached samples: 0.02 s (for 50,000 samples).
OpenGL rendering: 0.05 s for 90,000 triangles.
Ray-traced samples: 0.05 s for 2,500 samples using 6 processors.
This makes frame rates of 8-10 fps possible for scenes composed of fewer than 100,000 triangles.
Slide 43: Summary
We proposed an Animation Quality Metric suitable for estimating the quality of sequences of synthetic images.
We developed a system for animation rendering featuring perception-based guidance of in-between frame computation, which reduces rendering costs.
We proposed an animation rendering technique with spatio-temporal photon processing, which makes efficient computation of global illumination for dynamic environments possible.
We used a visual attention model to drive corrective computations during walkthrough animation in environments with arbitrary reflectance functions.