Unreal Engine 4 East Coast DevCon 2014 UE4 Mobile Performance Niklas Smedberg Senior Engine Programmer, Epic Games.

Slides:



Advertisements
Similar presentations
Using Graphics Processors for Real-Time Global Illumination UK GPU Computing Conference 2011 Graham Hazel.
Advertisements

Exploration of advanced lighting and shading techniques
International Symposium on Low Power Electronics and Design Qing Xie, Mohammad Javad Dousti, and Massoud Pedram University of Southern California ISLPED.
COMPUTER GRAPHICS CS 482 – FALL 2014 NOVEMBER 10, 2014 GRAPHICS HARDWARE GRAPHICS PROCESSING UNITS PARALLELISM.
Lecture 38: Chapter 7: Multiprocessors Today’s topic –Vector processors –GPUs –An example 1.
Game Programming 09 OGRE3D Lighting/shadow in Action
ADVANCED SKIN SHADING WITH FACEWORKS Nathan Reed — NVIDIA March 24, 2014.
GI 2006, Québec, June 9th 2006 Implementing the Render Cache and the Edge-and-Point Image on Graphics Hardware Edgar Velázquez-Armendáriz Eugene Lee Bruce.
Extensibility in UE4 Customizing Your Games and the Editor Gerke Max Preussner
This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit
Fast GPU Histogram Analysis for Scene Post- Processing Andy Luedke Halo Development Team Microsoft Game Studios.
Integrating a Cloud Service Joe Overview Example from our games Support in UE4 to achieve that.
Rasterization and Ray Tracing in Real-Time Applications (Games) Andrew Graff.
Tools for Investigating Graphics System Performance
ATI GPUs and Graphics APIs Mark Segal. ATI Hardware X1K series 8 SIMD vertex engines, 16 SIMD fragment (pixel) engines 3-component vector + scalar ALUs.
Z-Buffer Optimizations Patrick Cozzi Analytical Graphics, Inc.
Chapter 3.1 Teams and Processes. 2 Programming Teams In the 1980s programmers developed the whole game (and did the art and sounds too!) Now programmers.
Graphic Architecture introduction and analysis
Master Project Preparation Murtaza Hussain. Unity (also called Unity3D) is a cross-platform game engine with a built-in IDE developed by Unity Technologies.
Presentation by David Fong
The Pentium: A CISC Architecture Shalvin Maharaj CS Umesh Maharaj:
There has never been a better time to build a game that targets PC, tablets, phone and Xbox!
Niklas Smedberg Senior Engine Programmer, Epic Games
Of Bytes, Cycles and Battery Life. Who am [2] [1]
Week 1 Game Design & Development for Mobile Devices.
High Performance in Broad Reach Games Chas. Boyd
Nvidia Tegra 2 The world's first mobile super chip.
CSU0021 Computer Graphics © Chun-Fa Chang CSU0021 Computer Graphics September 10, 2014.
© Copyright Khronos Group, Page 1 Harnessing the Horsepower of OpenGL ES Hardware Acceleration Rob Simpson, Bitboys Oy.
Unit 70 Task 1 Matthew Wilson, Gary Rich, Callum Tracey, Luke Scrannage.
GPU Programming Robert Hero Quick Overview (The Old Way) Graphics cards process Triangles Graphics cards process Triangles Quads.
1 Introduction to Computer Graphics with WebGL Ed Angel Professor Emeritus of Computer Science Founding Director, Arts, Research, Technology and Science.
Codeplay CEO © Copyright 2012 Codeplay Software Ltd 45 York Place Edinburgh EH1 3HP United Kingdom Visit us at The unique challenges of.
Introducing To 3D Modeling George Atanasov Telerik Corporation
Havok. ©Copyright 2006 Havok.com (or its licensors). All Rights Reserved. HavokFX Next Gen Physics on ATI GPUs Andrew Bowell – Senior Engineer Peter Kipfer.
NVIDIA PROPRIETARY AND CONFIDENTIAL Occlusion (HP and NV Extensions) Ashu Rege.
Computer Graphics Graphics Hardware
Nick Sims Like a motherboard, a graphics card is a printed circuit board that houses a processor and RAM.
Chris Kerkhoff Matthew Sullivan 10/16/2009.  Shaders are simple programs that describe the traits of either a vertex or a pixel.  Shaders replace a.
4/23/2017 4:23 AM © 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered.
Advanced Computer Graphics Depth & Stencil Buffers / Rendering to Textures CO2409 Computer Graphics Week 19.
OpenGL ES Performance (and Quality) on the GoForce5500 Handheld GPU Lars M. Bishop, NVIDIA Developer Technologies.
1 Introduction to Computer Graphics with WebGL Ed Angel Professor Emeritus of Computer Science Founding Director, Arts, Research, Technology and Science.
1 Perception and VR MONT 104S, Fall 2008 Lecture 21 More Graphics for VR.
Development and Debugging Tools for Windows Phone 7 Series Cullen Waters Software Development Engineer II Advanced Technology Group, Microsoft Corporation.
Advanced Computer Graphics Shadow Techniques CO2409 Computer Graphics Week 20.
Introduction to Level Optimization Sai-Keung Wong Chiao Tung University, Taiwan, R.O.C. Reference: Mastering Unreal Technology (MUT)
 Introduction to SUN SPARC  What is CISC?  History: CISC  Advantages of CISC  Disadvantages of CISC  RISC vs CISC  Features of SUN SPARC  Architecture.
Stencil Routed A-Buffer
Advanced Computer Graphics Spring 2014 K. H. Ko School of Mechatronics Gwangju Institute of Science and Technology.
MULTICORE PROCESSOR TECHNOLOGY.  Introduction  history  Why multi-core ?  What do you mean by multicore?  Multi core architecture  Comparison of.
Computer Graphics 3 Lecture 6: Other Hardware-Based Extensions Benjamin Mora 1 University of Wales Swansea Dr. Benjamin Mora.
Tegra K1 is the first-ever console-class mobile processor, giving you full support for PC-class gaming technologies like DirectX 12, OpenGL.
From Turing Machine to Global Illumination Chun-Fa Chang National Taiwan Normal University.
Advanced Programmable Shading: Beyond Per-vertex and Per-pixel Shading.
Unity при побудові 3D ігор для Windows 8 та Windows Phone Олег Прiдюк Технічний євангеліст, Unity Technologies.
Postmortem: Deferred Shading in Tabula Rasa Rusty Koonce NCsoft September 15, 2008.
Image Fusion In Real-time, on a PC. Goals Interactive display of volume data in 3D –Allow more than one data set –Allow fusion of different modalities.
© ExplorNet’s Centers for Quality Teaching and Learning 1 Explain hardware and display components for laptop configuration. Objective Course Weight.
How to use a Pixel Shader CMT3317. Pixel shaders There is NO requirement to use a pixel shader for the coursework though you can if you want to You should.
Computer Graphics Graphics Hardware
P2 Understand hardware technologies for game platforms
Graphics Processor Graphics Processing Unit
Chapter 1 An overview on Computer Graphics
Tooling Breakout Session
Chapter 1 An overview on Computer Graphics
Stepping in the world of
UMBC Graphics for Games
Computer Graphics Graphics Hardware
UE4 Vulkan Updates & Tips
Presentation transcript:

Unreal Engine 4 East Coast DevCon 2014 UE4 Mobile Performance Niklas Smedberg Senior Engine Programmer, Epic Games

Unreal Engine 4 East Coast DevCon 2014 Content Part 1: Understanding mobile performance – Mobile GPU Hardware – Thermal limits – Performance guidelines Part 2: Adapt and conquer – Cross-platform profiling – Platform-specific profiling – Scaling your game based on device

Unreal Engine 4 East Coast DevCon 2014 Part 1: Understanding Mobile Performance Mobile hardware is evolving at a crazy rapid rate Next-generation mobile GPUs: – Fully featured (DirectX 11) – Peak performance comparable to Xbox 360 and PS GFLOPS and 26 GB/s – Able to run full UE4 desktop high-end rendering pipeline (e.g. NVIDIA K1) Phone users upgrade hardware very frequently – But tablet users don’t – Also, new large low-price markets are opening up – Result: Extremely wide performance range

Unreal Engine 4 East Coast DevCon 2014 Performance Trends (FP16 GFLOPS) 2010 SGX SGX 543MP SGX 543MP G Adreno, K1, GX6650

Unreal Engine 4 East Coast DevCon 2014 Common Mobile GPU Families Qualcomm Snapdragon Adreno Old: Adreno 2xxNow: Adreno 3xxSoon: Adreno 4xx ARM Mali Old: 400Now: T604, T628Soon: T720, T760 Imagination Technologies Old: SGX 5xxNow: Series 6Soon: Series 6XT NVIDIA Tegra Old: Tegra 3, 4Now: K1Soon: …

Unreal Engine 4 East Coast DevCon 2014 Tile-based Mobile GPU Mobile GPUs are usually tile-based (next-gen too) Tile-based: ImgTec, Qualcomm*, ARM Direct: NVIDIA, Intel, Qualcomm*, Vivante * Qualcomm Adreno can render either tile-based or direct to frame buffer – Extension: GL_QCOM_binning_control

Unreal Engine 4 East Coast DevCon 2014 Tile-Based Mobile GPU Summary: Split the screen into tiles – E.g. 32x32 pixels (ImgTec) or 300x300 (Qualcomm) The whole tile fits within GPU, on chip Process all drawcalls for one tile – Write out final tile results to RAM Repeat for each tile to fill the image in RAM

Unreal Engine 4 East Coast DevCon 2014 ImgTec Tile-based Rendering Process GameVertex Processing Tile Data (RAM) Pixel Processing (Top-most only) Frame Buffer (RAM) Cmd Buffer (RAM) Hidden Surface Removal Tile Memory Per Tile:

Unreal Engine 4 East Coast DevCon 2014 Framebuffer Resolve/Restore Expensive to switch Frame Buffer Object on Tile-based GPUs – Saves the current FBO to RAM – Reloads the new FBO from RAM Best performance: – A single rendertarget for the entire frame – No post-processing passes Does not apply to NVIDIA Tegra GPUs! – This made it simpler for us to use our full desktop rendering pipeline on K1 – “Rivalry” tech demo (showing 5:00 pm today)

Unreal Engine 4 East Coast DevCon 2014 Thermal Limits Hardware CPU and GPU clock frequencies change all the time! – Many times per milli-second! – To save battery – To prevent overheating Qualcomm Trepn Profiler – mobile-development/ increase-app-performance/ trepn-profiler

Unreal Engine 4 East Coast DevCon 2014 Thermal Limits Check your performance when device is cool Check again when it’s hot CPU uses much more power and heat than the GPU – Also, memory bandwidth generates a lot of heat Avoid unnecessary CPU usage – Spin-loops – Frequently waking up threads just to put them to sleep again

Unreal Engine 4 East Coast DevCon 2014 Performance Guidelines Always make sure lighting has been built before looking at performance Use as little post-process effects as you can get away with Make sure precomputed visibility has been set up properly Minimize overdraw (translucent or masked materials) Target draw calls per frame Use as few texture lookups as possible in your materials Documentation: –

Unreal Engine 4 East Coast DevCon 2014 Performance Tier 1 – 2 1. LDR (Low Dynamic Range) – Fastest mode – Use when you don’t need lighting or post-process effects – Disable “Mobile HDR” in Rendering section in your Project Settings 2. Basic Lighting – Allows HDR lighting and some post-process effects – Use only static lights – Use only fully rough materials, not shiny (specular) – Disable Bloom and anti-aliasing

Unreal Engine 4 East Coast DevCon 2014 Performance Tier 3 – 4 3. Full HDR Lighting – High-quality lighting with best support for normal maps – Realistic specular reflections on surfaces with per-pixel roughness – Use only static lights – Bloom and anti-aliasing are recommended – Place reflection captures carefully for best results 4. Full HDR Lighting with per-pixel lighting from the Sun – Specify one directional light as stationary (the Sun) – All other lights are static – High-quality distance field shadows

Unreal Engine 4 East Coast DevCon 2014 Interlude: End of Part 1 Questions? Keep going? Coffee break? Ready for more?

Unreal Engine 4 East Coast DevCon 2014 Part 2: Adapt and Conquer Very difficult to scale on CPU performance – Gameplay features can’t easy be switched off – Also, CPUs aren’t as different as GPUs are – Make sure you are never gamethread-bound on any device Scale your game purely based on GPU performance – Primarily resolution and post-process effects – Ship it!

Unreal Engine 4 East Coast DevCon 2014 Cross-platform Console Commands Common commands: – Stat Unit – Stat UnitGraph – Stat FPS – Stat SceneRendering – Stat Slow – ViewMode ShaderComplexity Documentation: – PerformanceProfiling/StatCommands/index.html

Unreal Engine 4 East Coast DevCon 2014 Console Command: Stat Unit Always the first step when checking performance

Unreal Engine 4 East Coast DevCon 2014 Console Command: Stat SceneRendering Shows Renderthread CPU performance and drawcalls

Unreal Engine 4 East Coast DevCon 2014 Console Commmand: ViewMode ShaderComplexity Visualize expensive materials in the PC ES2 previewer Shows approximate performance cost per material Green is good, red is bad. Pink or white is extremely expensive!

Unreal Engine 4 East Coast DevCon 2014 iOS Performance New Metal graphics API in iOS 8 – Much faster on CPU – Up to 20x faster on renderthread – Allows for thousands of drawcalls on iOS devices with A7 processors Scale graphics quality based on exact device model – Still very few different device models, easy to target each one specifically – Resolution (MobileContentScaleFactor) – Post-process features – Etc…

Unreal Engine 4 East Coast DevCon 2014 Platform-Specific Profiling Each GPU family has their own profiling tools – Apple: Xcode GL Debugger (and Metal) – Qualcomm: Adreno Profiler – NVIDIA: Tegra Graphics Debugger – ImgTec: PVRTune, PVRTrace – ARM: Mali Graphics Debugger For CPU profiling – Apple: Instruments (Time Profiler) – NVIDIA: Tegra System Profiler – ARM: DS-5

iOS Performance Profiling Screenshot from Xcode, which shows: – How we clear FBO at the beginning of every render pass – Other important performance info

Unreal Engine 4 East Coast DevCon 2014 Qualcomm Adreno Profiler

Unreal Engine 4 East Coast DevCon 2014 NVIDIA Tegra Graphics Debugger

Unreal Engine 4 East Coast DevCon 2014 ImgTec PVRTune and PVRTrace

Unreal Engine 4 East Coast DevCon 2014 ARM Mali Graphics Debugger

Unreal Engine 4 East Coast DevCon 2014 Device Profiles UE4 selects one device profile at startup – Detects device model and capabilities Tweak each device profile for your game – Config/DefaultDeviceProfiles.ini – Each Device Profile can customize engine features, like: +CVars=r.MobileContentScaleFactor=2 +CVars=r.BloomQuality=1 +CVars=r.DepthOfFieldQuality=1 +CVars=r.LightShaftQuality=1 Documentation: –

Unreal Engine 4 East Coast DevCon 2014 UE4 Mobile Performance Questions? Documentation, Tutorials and Help at: AnswerHub: Engine Documentation: Official Forums: Community Wiki: YouTube Videos: Community IRC: Unreal Engine 4 Roadmap lmgtfy.com/?q=Unreal+engine+Trello #unrealengine on FreeNode