Part 4: Many-Light Rendering on Mobile Hardware 40 min Markus Billeter VMML University of Zürich.

Slides:



Advertisements
Similar presentations
POST-PROCESSING SET09115 Intro Graphics Programming.
Advertisements

Deferred Shading Optimizations
COMPUTER GRAPHICS SOFTWARE.
CS123 | INTRODUCTION TO COMPUTER GRAPHICS Andries van Dam © 1/16 Deferred Lighting Deferred Lighting – 11/18/2014.
COMPUTER GRAPHICS CS 482 – FALL 2014 NOVEMBER 10, 2014 GRAPHICS HARDWARE GRAPHICS PROCESSING UNITS PARALLELISM.
Understanding the graphics pipeline Lecture 2 Original Slides by: Suresh Venkatasubramanian Updates by Joseph Kider.
Graphics Pipeline.
Status – Week 257 Victor Moya. Summary GPU interface. GPU interface. GPU state. GPU state. API/Driver State. API/Driver State. Driver/CPU Proxy. Driver/CPU.
RealityEngine Graphics Kurt Akeley Silicon Graphics Computer Systems.
SDSM SAMPLE DISTRIBUTION SHADOW MAPS The Evolution Of PSSM.
Prepared 5/24/2011 by T. O’Neil for 3460:677, Fall 2011, The University of Akron.
High-Quality Parallel Depth-of- Field Using Line Samples Stanley Tzeng, Anjul Patney, Andrew Davidson, Mohamed S. Ebeida, Scott A. Mitchell, John D. Owens.
Fast GPU Histogram Analysis for Scene Post- Processing Andy Luedke Halo Development Team Microsoft Game Studios.
CS6500 Adv. Computer Graphics © Chun-Fa Chang, Spring 2003 Texture Mapping II April 10, 2003.
3D Graphics Processor Architecture Victor Moya. PhD Project Research on architecture improvements for future Graphic Processor Units (GPUs). Research.
IN4151 Introduction 3D graphics 1 Introduction to 3D computer graphics part 2 Viewing pipeline Multi-processor implementation GPU architecture GPU algorithms.
Approximate Soft Shadows on Arbitrary Surfaces using Penumbra Wedges Tomas Akenine-Möller Ulf Assarsson Department of Computer Engineering, Chalmers University.
Advanced Texture Mapping May 10, Today’s Topics Mip Mapping Projective Texture Shadow Map.
Z-Buffer Optimizations Patrick Cozzi Analytical Graphics, Inc.
The Graphics Pipeline CS2150 Anthony Jones. Introduction What is this lecture about? – The graphics pipeline as a whole – With examples from the video.
Paper by Alexander Keller
Status – Week 260 Victor Moya. Summary shSim. shSim. GPU design. GPU design. Future Work. Future Work. Rumors and News. Rumors and News. Imagine. Imagine.
GPU Graphics Processing Unit. Graphics Pipeline Scene Transformations Lighting & Shading ViewingTransformations Rasterization GPUs evolved as hardware.
4.2. D EFERRED S HADING Exploration of deferred shading (rendering)
Filtering Approaches for Real-Time Anti-Aliasing /
© Copyright Khronos Group, Page 1 Harnessing the Horsepower of OpenGL ES Hardware Acceleration Rob Simpson, Bitboys Oy.
Basics of a Computer Graphics System Introduction to Computer Graphics CSE 470/598 Arizona State University Dianne Hansford.
GPU Programming Robert Hero Quick Overview (The Old Way) Graphics cards process Triangles Graphics cards process Triangles Quads.
Computer Graphics Graphics Hardware
Chris Kerkhoff Matthew Sullivan 10/16/2009.  Shaders are simple programs that describe the traits of either a vertex or a pixel.  Shaders replace a.
Interactive Time-Dependent Tone Mapping Using Programmable Graphics Hardware Nolan GoodnightGreg HumphreysCliff WoolleyRui Wang University of Virginia.
09/09/03CS679 - Fall Copyright Univ. of Wisconsin Last Time Event management Lag Group assignment has happened, like it or not.
CS 450: COMPUTER GRAPHICS REVIEW: INTRODUCTION TO COMPUTER GRAPHICS – PART 2 SPRING 2015 DR. MICHAEL J. REALE.
Week 2 - Friday.  What did we talk about last time?  Graphics rendering pipeline  Geometry Stage.
OpenGL ES Performance (and Quality) on the GoForce5500 Handheld GPU Lars M. Bishop, NVIDIA Developer Technologies.
Introduction to Parallel Rendering Jian Huang, CS 594, Spring 2002.
3D Graphics for Game Programming Chapter IV Fragment Processing and Output Merging.
CS 480/680 Intro Dr. Frederick C Harris, Jr. Fall 2014.
Shadow Mapping Chun-Fa Chang National Taiwan Normal University.
Tone Mapping on GPUs Cliff Woolley University of Virginia Slides courtesy Nolan Goodnight.
GRAPHICS PIPELINE & SHADERS SET09115 Intro to Graphics Programming.
Tiled Forward Shading Johan Medeström. Project Goals Render a scene with lots of lights Learn more OpenGL and shading techniques Learn more about OpenCL/Compute.
Accelerated Stereoscopic Rendering using GPU François de Sorbier - Université Paris-Est France February 2008 WSCG'2008.
Stencil Routed A-Buffer
- Laboratoire d'InfoRmatique en Image et Systèmes d'information
Advanced Computer Graphics Spring 2014 K. H. Ko School of Mechatronics Gwangju Institute of Science and Technology.
Based on paper by: Rahul Khardekar, Sara McMains Mechanical Engineering University of California, Berkeley ASME 2006 International Design Engineering Technical.
Emerging Technologies for Games Deferred Rendering CO3303 Week 22.
Mobile Graphics Patrick Cozzi University of Pennsylvania CIS Spring 2012.
Computer Graphics 3 Lecture 6: Other Hardware-Based Extensions Benjamin Mora 1 University of Wales Swansea Dr. Benjamin Mora.
Computing & Information Sciences Kansas State University Lecture 12 of 42CIS 636/736: (Introduction to) Computer Graphics CIS 636/736 Computer Graphics.
Maths & Technologies for Games Graphics Optimisation - Batching CO3303 Week 5.
From Turing Machine to Global Illumination Chun-Fa Chang National Taiwan Normal University.
COMPUTER GRAPHICS CS 482 – FALL 2015 SEPTEMBER 29, 2015 RENDERING RASTERIZATION RAY CASTING PROGRAMMABLE SHADERS.
Image Processing A Study in Pixel Averaging Building a Resolution Pyramid With Parallel Computing Denise Runnels and Farnaz Zand.
SPONSORED BY SA2014.SIGGRAPH.ORG Efficient Real-Time Shading with Many Lights 2014 Many Light Rendering on Mobile Hardware Part 4/4 Presenter: Markus Billeter.
GLSL Review Monday, Nov OpenGL pipeline Command Stream Vertex Processing Geometry processing Rasterization Fragment processing Fragment Ops/Blending.
Postmortem: Deferred Shading in Tabula Rasa Rusty Koonce NCsoft September 15, 2008.
Image Fusion In Real-time, on a PC. Goals Interactive display of volume data in 3D –Allow more than one data set –Allow fusion of different modalities.
Computer Graphics Graphics Hardware
COMPUTER GRAPHICS CHAPTER 38 CS 482 – Fall 2017 GRAPHICS HARDWARE
Patrick Cozzi University of Pennsylvania CIS Fall 2013
Graphics Processing Unit
Deferred Lighting.
Introduction to OpenGL
Real-time Rendering Shadow Maps
Graphics Processing Unit
UMBC Graphics for Games
Computer Graphics Graphics Hardware
Introduction to OpenGL
Presentation transcript:

Part 4: Many-Light Rendering on Mobile Hardware 40 min Markus Billeter VMML University of Zürich

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Many-Light on Mobiles Outline

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Many-Light on Mobiles Outline –What’s different?

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Many-Light on Mobiles Outline –What’s different? –Review with a twist towards Mobile HW

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Many-Light on Mobiles Outline –What’s different? –Review with a twist towards Mobile HW –Case study: Clustered Implementation

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 M OBILE H ARDWARE Part 1

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Mobile vs. Desktop So far: considered modern high-end systems How is mobile hardware different?

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Mobile vs. Desktop (II) Considerations: –Lower specs, similar amounts of pixels

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Mobile vs. Desktop (II) Considerations: –Lower specs, similar amounts of pixels –Energy consumption (more) important –Fewer features for now…

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 A few numbers… (I) Android OpenGL Version: OpenGL ES VersionPercentage 2.063% 3.035% 3.12% Data from June 2015 Source:

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 A few numbers… (II) Unity Mobile Stats (various platforms): OpenGL ES Version(*)Percentage 2.058% 3.040% 3.1~2% Data from April 2015; (*) = Shader Generations Source:

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 A few numbers… (III) Respectable amount of ES 3.0 devices Very few ES 3.1 devices for now :-( Big chunk of ES 2.0

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 A few numbers… (III) Respectable amount of ES 3.0 devices Very few ES 3.1 devices for now :-( Big chunk of ES 2.0 … will include ES 2.0 some considerations

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 OpenGL ES ES 2.0 : shader-based and FBO support –But no MRT in core

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 OpenGL ES ES 2.0 : shader-based and FBO support –But no MRT in core ES 3.0 “fixes” that

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 GPU Compute What about compute shaders?

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 GPU Compute What about compute shaders? –Tricky

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 GPU Compute (II) OpenCL –Typically not officially supported –(some devices have it anyway)

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 GPU Compute (II) OpenCL –Typically not officially supported –(some devices have it anyway) GL Compute >= ES 3.1

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Mobile Architecture

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Mobile Architecture Mobile GPUs are commonly tile based –“TBR” (Tile Based Renderer) Contrast: desktop GPUs are IMR –“IMR” = Immediate Mode Renderer

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Mobile Architecture Mobile GPUs are commonly tile based –“TBR” (Tile Based Renderer) Contrast: desktop GPUs are IMR –“IMR” = Immediate Mode Renderer –IMR mobile GPUs exist too, though.

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Immediate Mode Renderer (IMR) “Normal” rendering Geometry transformed & rasterized immediately

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Immediate Mode Renderer (IMR) “Normal” rendering Geometry transformed & rasterized immediately Framebuffer resides mainly in VRAM

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Immediate Mode Renderer (IMR) “Normal” rendering Geometry transformed & rasterized immediately Framebuffer resides mainly in VRAM Multiple writes to VRAM

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tile Based Renderer (TBR)

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tile Based Renderer (TBR) Framebuffer divided into tiles Geometry binned into tiles

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tile Based Renderer (TBR) Framebuffer divided into tiles Geometry binned into tiles Tiles processed later independently

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tile Based Renderer (TBR) Framebuffer divided into tiles Geometry binned into tiles Tiles processed later independently Tile’s portion of the framebuffer stays in fast on-chip memory!

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tile Based Renderer (TBR) (II) Tile’s FB-contents stored to RAM when needed

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tile Based Renderer (TBR) (II) Tile’s FB-contents stored to RAM when needed Best case: once –When all rendering to that tile has finished

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Some Examples By the presented classification –TBR: Mali, Adreno, PowerVR … –IMR: Tegra (and most desktop GPUs)

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Intro - Summary Majority of mobile GPUs are TBR –Pick a method that works well with this

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Intro – Summary (II) TBR: keep tile’s portion of the FB on-chip

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Intro – Summary (II) TBR: keep tile’s portion of the FB on-chip –Make sure it stays there –Loads/stores from/to RAM use precious BW

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Intro – Summary (II) TBR: keep tile’s portion of the FB on-chip –Make sure it stays there –Loads/stores from/to RAM use precious BW Goal: keep data on-chip when possible

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Intro – Summary (II) TBR: keep tile’s portion of the FB on-chip –Make sure it stays there –Loads/stores from/to RAM use precious BW Goal: keep data on-chip when possible –Preferably without affecting IMR negatively

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Important! Tile-based renderer (TBR) != Tiled Shading

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Important! TBR: hardware property –It’s out of your hands Tiled Shading: software algorithm –Your choice Src: Wikipedia/Exynos, by User:Köf3, CC-SA3

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Important! It’s perfectly valid to use Tiled Shading on a TBR platform –As we’ll see shortly.

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 M ANY -L IGHT M ETHODS R EVISITED Part 2

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Many-Light Methods Revisited Ola listed a number of methods in his introduction earlier in the course –Quickly revisit these Plus two new methods for mobile arch.

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Forward Traditional Deferred Clustered Deferred Clustered Forward Practical Forward Deferred Tile Store Forward Light Stack

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled & Clustered I’ll lump tiled and clustered shading together for now

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled & Clustered I’ll lump tiled and clustered shading together for now Basic idea is similar enough –Pick the one that suits your use case better –E.g., tiled for 2D-ish settings, clustered for 3D.. Just Cause 2 (Avalanche Studios) Star Craft 2 (Blizzard) vs

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Forward Traditional Deferred Clustered Deferred Clustered Forward Practical Forward Deferred Tile Store Forward Light Stack

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Traditional Deferred Clustered Deferred Clustered Forward Practical Forward Deferred Tile Store Forward Light Stack

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 “Plain” Forward Assign lights per geometry batch Loop over lights in fragment shader

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 “Plain” Forward Assign lights per geometry batch Loop over lights in fragment shader Default method with no extra frills Possible “anywhere”

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 “Plain Forward” (II) Scales badly with number of lights –See Ola’s introduction

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Traditional Deferred Clustered Deferred Clustered Forward Practical Forward Deferred Tile Store Forward Light Stack

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Traditional Deferred Clustered Deferred Clustered Forward Practical Forward Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Clustered Deferred Clustered Forward Practical Forward Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Traditional Deferred Render scene to G-Buffers Color Normal Position(Z)

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Traditional Deferred Render scene to G-Buffers Render lights using proxy geometry Color Position(Z) Normal

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Traditional Deferred Render scene to G-Buffers Render lights using proxy geometry In shader: Read G-Buffers, compute lighting, blend results. Color Position(Z) Normal

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Traditional Deferred (II) Many reads from G-Buffer + many writes to Framebuffer

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Traditional Deferred (II) Many reads from G-Buffer + many writes to Framebuffer Store G-Buffer to RAM (and potentially restore depth to on-chip)

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Clustered Deferred Clustered Forward Practical Forward Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Clustered Deferred Clustered Forward Practical Forward Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clustered Forward Practical Forward Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Deferred Designed to avoid some of the issues of traditional deferred

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Deferred (II) Render scene to G-Buffers Compute light assignment –Use e.g. depth from G-Buffers … … … L0L0 L1L1 L2L2 L3L3 L4L4 L5L5 … L6L6 4 … …

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Deferred (II) Render scene to G-Buffers Compute light assignment –Use e.g. depth from G-Buffers … … … L0L0 L1L1 L2L2 L3L3 L4L4 L5L5 … L6L6 4 … …

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Deferred (II) Render scene to G-Buffers Compute light assignment –Use e.g. depth from G-Buffers Full screen pass: compute lighting

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Deferred (III) Lighting pass: –Read G-Buffer once –Find pixel’s tile/cluster –Loop over assigned lights => compute lighting –Write result to FB once

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Deferred (IV) Still requires off-chip G-Buffers –But avoids reading multiple times from them –Also avoids loading back depth to on-chip

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Deferred (V) Original clustered method relies on compute shaders to –Identify active clusters –Perform light assignment

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Deferred (V) Original clustered method relies on compute shaders to –Identify active clusters –Perform light assignment Avoid with Emil’s Practical Clustered

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Deferred (VI) For tiled shading –Perform light assignment on CPU –Compute per-tile bounds in fragment shader –No compute shaders, but potential read-back

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clustered Forward Practical Forward Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clustered Forward Practical Forward Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clust. Def. Practical Forward Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Forward Use depth-pass to determine per-tile Z-bounds or clusters –For light assignment … … … L0L0 L1L1 L2L2 L3L3 L4L4 L5L5 … L6L6 4 … …

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Forward (II) Render scene normally For each sample –Determine sample’s tile/cluster –Read associated light list

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Forward (II) Render scene normally For each sample –Determine sample’s tile/cluster –Read associated light list Tiled Forward a.k.a. Forward+

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Forward (III) No G-Buffers / MRT!

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Forward (III) No G-Buffers / MRT! Depth buffer still transferred to RAM –And potentially back

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Tiled/Clustered Forward (IV) Tiled Forward possible in ES 2.0 Handy extensions: –Render to depth texture –Dynamic looping in fragment shader

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clust. Def. Practical Forward Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clust. Def. Practical Forward Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No ✓ Depth2PreZ ✓*✓* ✓*✓*

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clust. Def. Clust. Fwd. Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No ✓ Depth2PreZ ✓*✓* ✓*✓*

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Practical Clustered Forward Clustered variant presented by Emil

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Practical Clustered Forward Clustered variant presented by Emil Applicable to both deferred and forward –Or a mix of them Mostly interested in forward-only…

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Practical Clustered Forward (II) Perform light-assignment up-front –To dense cluster structure –Potentially on CPU Then render scene in “normally”

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clust. Def. Clust. Fwd. Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No ✓ Depth2PreZ ✓*✓* ✓*✓*

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clust. Def. Clust. Fwd. Deferred Tile Store Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No ✓ Depth2PreZ ✓*✓* ✓*✓* ✓ Color1Yes ✓✓

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clust. Def. Clust. Fwd. Practical Fwd. Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No ✓ Depth2PreZ ✓*✓* ✓*✓* ✓ Color1Yes ✓✓

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Deferred with Tile Storage Very similar to traditional deferred.

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Deferred with Tile Storage Very similar to traditional deferred. Stores G-Buffer temporarily on-chip –G-Buffer only available while a tile is being processed!

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Deferred with Tile Storage (II) Relies on OpenGL Extensions –EXT_shader_pixel_local_storage –ARM_shader_framebuffer_fetch –ARM_shader_framebuffer_fetch_depth_stencil

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Deferred with Tile Storage (III) Key: EXT_shader_pixel_local_storage

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Deferred with Tile Storage (III) Key: EXT_shader_pixel_local_storage –Gives small amount of on-chip per-pixel storage –Preserved across fragment shader instances –But not backed by external RAM

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Deferred with Tile Storage (IV) A bit finicky –Writes to ordinary color output destroys storage contents –Incompatible with MSAA See Sam Martin’s original presentation from SIGGRAPH Read Spec.

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Vulkan Note Probably achievable in Vulkan? –E.g., with the transient FB-attachments –Becomes Traditional Deferred on IMR.

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clust. Def. Clust. Fwd. Practical Fwd. Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No ✓ Depth2PreZ ✓*✓* ✓*✓* ✓ Color1Yes ✓✓

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clust. Def. Clust. Fwd. Practical Fwd. Forward Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No ✓ Depth2PreZ ✓*✓* ✓*✓* ✓ Color1Yes ✓✓ ✓ Color1No

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clust. Def. Clust. Fwd. Practical Fwd. Def. Tile Store Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No ✓ Depth2PreZ ✓*✓* ✓*✓* ✓ Color1Yes ✓✓ ✓ Color1No

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Forward with Light Stack Uses on-chip storage for lights instead

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Forward with Light Stack Uses on-chip storage for lights instead Do depth-only pre-pass Splat lights to build per-pixel light lists –Store using EXT_shader_pixel_local_storage

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Forward with Light Stack (I) Forward render geometry

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Forward with Light Stack (II) Limited light list size

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Forward with Light Stack (II) Limited light list size Incompatible with MSAA

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Forward with Light Stack (II) Limited light list size Incompatible with MSAA Blending possible –“By hand” –Requires some bytes per per-pixel storage

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Forward with Light Stack (III) Limitations may be relaxed in the future? –MSAA –More storage

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clust. Def. Clust. Fwd. Practical Fwd. Def. Tile Store Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No ✓ Depth2PreZ ✓*✓* ✓*✓* ✓ Color1Yes ✓✓ ✓ Color1No

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd Trad. Def. Clust. Def. Clust. Fwd. Practical Fwd. Def. Tile Store Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No ✓ Depth2PreZ ✓*✓* ✓*✓* ✓ Color1Yes ✓✓ ✓ Color1No ✓ Color2PreZ ✓

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd. Trad. Def. Clust. Def. Clust. Fwd Practical Fwd Def. Tile Store Fwd. Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No ✓ Depth2PreZ ✓*✓* ✓*✓* ✓ Color1Yes ✓✓ ✓ Color1No ✓ Color2PreZ ✓

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Methods Plain Fwd. Trad. Def. Clust. Def. Clust. Fwd Practical Fwd Def. Tile Store Fwd. Light Stack Many Light Off-Chip Buffers? geometry passes over- shading MSAABlend.GL|ES Color1Yes ✓✓ ✓ G-Buffers1No ✓ G-Buffers1No ✓ Depth2PreZ ✓*✓* ✓*✓* ✓ Color1Yes ✓✓ ✓ Color1No ✓ Color2PreZ ✓

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Practical Clustered Forward Supported on many devices –TBR: stays on-chip –IMR: nothing special MSAA & Blending

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Practical Clustered Forward Supported on many devices –TBR: stays on-chip –IMR: nothing special MSAA & Blending I’m maybe a bit biased here… ;-)

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 C USTOM P RACTICAL C LUSTERED F ORWARD. Part 3

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Clustering, Redux Shown a few different ways of clustering –“original” sparse exponential way –Emil’s up-front dense clustering –2D-Tiling

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Clustering, Redux Shown a few different ways of clustering –“original” sparse exponential way –Emil’s up-front dense clustering –2D-Tiling One more variation

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Problem Up-front clustering assigns into a dense structure

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Problem Up-front clustering assigns into a dense structure Possibly a ton of clusters

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Problem (II) Reduce #clusters… How?

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Problem (II) Reduce #clusters… How? –Lower resolution?

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Problem (II) Reduce #clusters… How? –Lower resolution? Observation: Especially problematic close to the camera

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Problem (II) Reduce #clusters… How? –Lower resolution? Observation: Especially problematic close to the camera

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Problem (III) Lot’s of unnecessary work during light assignment. –Almost guaranteed to occur…

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 The Problem (IV) Discussed previously –E.g., move back first Z-subdivision –Still get a lot of slices in XY

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Cascaded Clustering Different solution: Cascaded Clustering

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Cascaded Clustering Different solution: Cascaded Clustering Subdivide frustum into cascades and select individual resolution for each

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Cascaded Clustering Different solution: Cascaded Clustering Subdivide frustum into cascades and select individual resolution for each

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Cascaded Clustering (II) Select resolutions to give –Approx. cubical clusters with NxN footprint –But enforcing a minimum size of each cluster

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Cascaded Clustering (II) Select resolutions to give –Approx. cubical clusters with NxN footprint –But enforcing a minimum size of each cluster Currently use 12 cascades –Tweakable…

Quick Results

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Quick Results 192 lights Clustering: 0.35ms –Galaxy Alpha –3500 non-empty clusters –with 3.5 lights avg. (worst = 20 lights)

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Quick Results 30ms per frame average –Worst: ~35ms –Small lights + limited overlap With PreZ

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 Conclusion

Real-Time Many-Light Management and Shadows with Clustered Shading – 2015 References Tiled Deferred Blending, Ladlani, 2014 The Revolution in Mobile Game Graphics, Martin and Bjørge 2014 Challenges with High Quality Mobile Graphics, Martin et al An Evolution of Mobile Graphics, Shebanow 2013 Performance Tuning for Tile-Based Architectures, Merry 2012 Courses from earlier this week: –Moving Mobile Graphics –An Overview of Next-Generation Graphics APIs Forward+: Bringing deferred lighting to the next level, Harada et al Light Indexed Deferred Lighting, Treblico 2007 Our work on Tiled / Clustered Shading: –Clustered Deferred and Forward Shading, Olsson et al –Tiled and Clustered Forward Shading, Olsson et al –Tiled Forward Shading, Billeter et al (This slide show brought to you at approx SPS [slides per second])