Concurrency & Parallelism in UE4

Slides:



Advertisements
Similar presentations
Operating Systems Semaphores II
Advertisements

Epic’s Build Tools & Infrastructure
Synchronization and Deadlocks
Global Environment Model. MUTUAL EXCLUSION PROBLEM The operations used by processes to access to common resources (critical sections) must be mutually.
EEE 435 Principles of Operating Systems Interprocess Communication Pt II (Modern Operating Systems 2.3)
Game Framework & Sample Projects
Extensibility in UE4 Customizing Your Games and the Editor Gerke Max Preussner
Integrating a Cloud Service Joe Overview Example from our games Support in UE4 to achieve that.
CS 162 Discussion Section Week 3. Who am I? Mosharaf Chowdhury Office 651 Soda 4-5PM.
Engine Overview A Programmer’s Glimpse at UE4 Gerke Max Preussner
1 Threads CSCE 351: Operating System Kernels Witawas Srisa-an Chapter 4-5.
5.6 Semaphores Semaphores –Software construct that can be used to enforce mutual exclusion –Contains a protected variable Can be accessed only via wait.
CS220 Software Development Lecture: Multi-threading A. O’Riordan, 2009.
Concurrency: Mutual Exclusion, Synchronization, Deadlock, and Starvation in Representative Operating Systems.
Game Framework & Sample Projects
Race Conditions CS550 Operating Systems. Review So far, we have discussed Processes and Threads and talked about multithreading and MPI processes by example.
Computer Science 162 Discussion Section Week 3. Agenda Project 1 released! Locks, Semaphores, and condition variables Producer-consumer – Example (locks,
Niklas Smedberg Senior Engine Programmer, Epic Games
Discussion Week 3 TA: Kyle Dewey. Overview Concurrency overview Synchronization primitives Semaphores Locks Conditions Project #1.
Chapter 6 Concurrency: Deadlock and Starvation Operating Systems: Internals and Design Principles, 6/E William Stallings Dave Bremer Otago Polytechnic,
CS510 Concurrent Systems Introduction to Concurrency.
Object Oriented Analysis & Design SDL Threads. Contents 2  Processes  Thread Concepts  Creating threads  Critical sections  Synchronizing threads.
The University of Adelaide, School of Computer Science
1162 JDK 5.0 Features Christian Kemper Principal Architect Borland.
Concurrent Programming. Concurrency  Concurrency means for a program to have multiple paths of execution running at (almost) the same time. Examples:
Practical OOP using Java Basis Faqueer Tanvir Ahmed, 08 Jan 2012.
1 Web Based Programming Section 8 James King 12 August 2003.
CS 346 – Chapter 4 Threads –How they differ from processes –Definition, purpose Threads of the same process share: code, data, open files –Types –Support.
Operating Systems ECE344 Ashvin Goel ECE University of Toronto Mutual Exclusion.
Lecture 8 Page 1 CS 111 Online Other Important Synchronization Primitives Semaphores Mutexes Monitors.
Consider the program fragment below left. Assume that the program containing this fragment executes t1() and t2() on separate threads running on separate.
1 Interprocess Communication (IPC) - Outline Problem: Race condition Solution: Mutual exclusion –Disabling interrupts; –Lock variables; –Strict alternation.
Java Thread and Memory Model
Concurrency Control 1 Fall 2014 CS7020: Game Design and Development.
4.1 Introduction to Threads Overview Multithreading Models Thread Libraries Threading Issues Operating System Examples Windows XP Threads Linux Threads.
3/12/2013Computer Engg, IIT(BHU)1 OpenMP-1. OpenMP is a portable, multiprocessing API for shared memory computers OpenMP is not a “language” Instead,
Unit 4: Processes, Threads & Deadlocks June 2012 Kaplan University 1.
CSCI1600: Embedded and Real Time Software Lecture 17: Concurrent Programming Steven Reiss, Fall 2015.
CS510 Concurrent Systems Jonathan Walpole. Introduction to Concurrency.
L6: Threads “the future is parallel” COMP206, Geoff Holmes and Bernhard Pfahringer.
Concurrency (Threads) Threads allow you to do tasks in parallel. In an unthreaded program, you code is executed procedurally from start to finish. In a.
Tuning Threaded Code with Intel® Parallel Amplifier.
CHAPTER 6 Threads, Handlers, and Programmatic Movement.
Big Picture Lab 4 Operating Systems C Andras Moritz
Fiber Based Job Systems Seth England. Preemptive Scheduling Competition for resources Use of synchronization primitives to prevent race conditions in.
Java Thread Programming
Segments Introduction: slides minutes
Background on the need for Synchronization
Jim Fawcett CSE681 – Software Modeling & Analysis Fall 2002
Multithreading Tutorial
Computer Engg, IIT(BHU)
Jim Fawcett CSE681 – Software Modeling and Analysis Fall 2005
Threads and Locks.
Concurrency & Parallelism in UE4
Multithreading Chapter 23.
Multithreading Tutorial
Threading And Parallel Programming Constructs
Conditions for Deadlock
Multithreading Tutorial
CSCI1600: Embedded and Real Time Software
Multithreading Tutorial
Userspace Synchronization
CSE 451: Operating Systems Autumn 2003 Lecture 7 Synchronization
CSE 451: Operating Systems Autumn 2005 Lecture 7 Synchronization
CSE 451: Operating Systems Winter 2003 Lecture 7 Synchronization
CSE 153 Design of Operating Systems Winter 19
CSE 153 Design of Operating Systems Winter 2019
CSCI1600: Embedded and Real Time Software
More concurrency issues
Presentation transcript:

Concurrency & Parallelism in UE4 Tips for programming with many CPU cores Gerke Max Preussner max.preussner@epicgames.com *** Prepared and presented by Gerke Max Preussner for West Coast MiniDevCon 2014, July 21-24th Hi, my name is Max. I work as a Sr. Engine Programmer at Epic Games and would like to give you a quick overview of multi-threading capabilities in Unreal Engine 4. Games and applications that perform concurrent tasks on one or multiple CPUs or CPU cores are generally hard to write, proof correct, debug and maintain. Not only are there many concepts and mechanisms you have to learn, but also a wide range of very subtle issues that are often dependent on the operating system and the processor it is running on. Unreal Engine provides a range of features, helpers and tools to develop for multi-core computers and mobile devices more easily.

Synchronization Primitives Atomics Locking Signaling Waiting FPlatformAtomics InterlockedAdd InterlockedCompareExchange (-Pointer) InterlockedDecrement (-Increment) InterlockedExchange (-Pointer) 64- and 128-bit overloads on supported platforms At the lowest level, the Engine provides a number of synchronization primitives used to synchronize memory access across different threads. Commonly used primitives are atomic operations, mechanisms for locking and signaling threads, as well as for waiting on events from other threads. Atomic operations allow the CPU to read and write a memory location in the same indivisible bus operation. Although this seems to be a rather trivial feature, it is the foundation of many higher level multi-threading mechanisms. The main advantage of atomics is that they are very quick compared to locks, and they avoid the common problems of mutual exclusion and deadlocks. Unreal Engine exposes various atomic operations, such as incrementing and compare-and-exchange in a platform agnostic API. Most atomic operations operate on 32-bit values, but we also provide APIs for 64- and 128-bit values, if the particular platform supports them.

Synchronization Primitives Atomics Locking Signaling Waiting // Example class FThreadSafeCounter { public: int32 Add( int32 Amount ) { return FPlatformAtomics::InterlockedAdd(&Counter, Amount); } private: volatile int32 Counter; }; Here is an example from our code base that uses atomics to implement a thread-safe counter. A normal increment instruction for integers may involve multiple bus operations that read the current value, increment it and write it back to memory. During these multiple steps it is possible – in principle and in practice – that the operating system suspends our thread, and another thread that also wishes to increment the counter modifies our value, causing that increment to be lost when our thread resumes. The use of ‘interlocked add’ in this example ensures that the increment operation cannot be interrupted. Because the counter value can change in asynchronous ways that the compiler cannot foresee, we use the volatile keyword to prevent optimizations of code that accesses the value.

Synchronization Primitives Atomics Locking Signaling Waiting Critical Sections FCriticalSection implements synchronization object FScopeLock for scope level locking using a critical section Fast if the lock is not activated Spin Locks FSpinLock can be locked and unlocked Sleeps or spins in a loop until unlocked Default sleep time is 0.1 seconds The main locking mechanisms used in Unreal Engine are Critical Sections and Spin Locks. Critical Sections provide mutually exclusive access to a single resource, usually the internals of a thread-safe class. In combination with Scope Locks they ensure that a given block of code can be executed by only one thread at a time. Compared to atomic operations, they can be a lot slower when the lock is accessed by more than one thread. Spin Locks also provide exclusive access, but instead of using the mutex facilities provided by the OS they sleep in a loop until the lock is removed. The sleep time is configurable (0.1 seconds by default), and for very small durations they may actually just do a no-op spin loop under the hood.

Synchronization Primitives Atomics Locking Signaling Waiting Semaphores Like mutex with signaling mechanism Only implemented for Windows and hardly used API will probably change Use FEvent instead Semaphores are similar to mutual exclusion locks, but they also contain a signaling mechanism. There is currently an implementation in FPlatformProcess, but it is not quite what one would expect, only implemented for Windows and only used in Lightmass right now. A better choice for signaling between threads would be the FEvent class…

Synchronization Primitives Atomics Locking Signaling Waiting FEvent Blocks a thread until triggered or timed out Frequently used to wake up worker threads FScopedEvent Wraps an FEvent that blocks on scope exit // Example for scoped events { FScopedEvent Event; DoWorkOnAnotherThread(Event.Get()); // stalls here until other thread triggers Event } … which also uses a mutex under the hood. We frequently use events in worker threads to notify them about newly added work items or that we wish to shut them down. If you just want to send a one-shot signal from another thread, you can pass an FScopedEvent into it, and your current thread will stall when the event exits its scope until it is triggered.

High Level Constructs Containers Helpers General Thread-safety Most containers (TArray, TMap, etc.) are not thread-safe Use synchronization primitives in your own code where needed TLockFreePointerList Lock free, stack based and ABA resistant Used by Task Graph system TQueue Uses a linked list under the hood Lock and contention free for SPSC Lock free for MPSC TDisruptor (currently not part of UE4) Lock free MPMC queue using a ring buffer Most container types and classes in the Engine, including TArray and TMap are not thread-safe. We usually just manually add atomics and mutual exclusion locks in our code on an as-needed basis. Of course, locks are not always desirable, so we are also starting to implement basic lock-free containers that are thread-safe. A very useful lock-free container is TQueue, which is lock- and contention-free for SPSC and lock-free for MPSC. It is used in many places that transfer data from one thread to another. (We also implemented a lock-free MPMC queue using the Disruptor pattern at some point – it is not part of UE4 right now, but if you need it, send me an email!)

High Level Constructs Containers Helpers FThreadSafeCounter FThreadSingleton Singleton that creates an instance per thread FMemStack Fast, temporary per-thread memory allocation TLockFreeClassAllocator, TLockFreeFixedSizeAllocator Another fast allocator for instances of T FThreadIdleStats Measures how often a thread is idle We also provide some high level helper objects that are built on top of the synchronization primitives, some of which are shown here. The thread-safe counter we have already seen in the code example earlier. Thread-safe singletons are special singletons that create one instance of a class per thread instead of per process. Memory stacks allow for ultra-fast per-thread memory allocations. We also have a way to measure thread idle times that is always enabled, even in Shipping builds.

Parallelization Threads Task Graph Processes Messaging FRunnable Platform agnostic interface Implement Init(), Run(), Stop() and Exit() in your sub-class Launch with FRunnableThread::Create() FSingleThreadRunnable when multi-threading is disabled FQueuedThreadPool Carried over from UE3 and still works the same way Global general purpose thread pool in GThreadPool Not lock free Traditionally, the main work horse of multi-threaded programming are operating system threads. We expose those in the form of the FRunnable base class that you can inherit from. If your thread object inherits from FSingleThreadRunnable, it can also be used when multi-threading is disabled or unavailable. The thread pool from Unreal Engine 3 is still available as well. You can create your own thread pools or just use the global one in GThreadPool.

Parallelization Threads Task Graph Processes Messaging Game Thread All game code, Blueprints and UI UObjects are not thread-safe! Render Thread Proxy objects for Materials, Primitives, etc. Stats Thread Engine performance counters Unreal Engine 4 uses threads for a number of things, but the most important ones are GameThread, RenderThread and StatsThread. All code that interfaces with UObjects must run on the GameThread, because UObjects are not thread-safe. The RenderThread allows us to perform rendering related tasks in parallel with the game logic. We recently also added the StatsThread that collects and processes a number of perforce counters for profiling.

Parallelization Threads Task Graph Processes Messaging Task Based Multi-Threading Small units of work are pushed to available worker threads Tasks can have dependencies to each other Task Graph will figure out order of execution Used by an increasing number of systems Animation evaluation Message dispatch and serialization in Messaging system Object reachability analysis in garbage collector Render commands in Rendering sub-system Various tasks in Physics sub-system Defer execution to a particular thread Threads are very heavyweight. Task Based Multi-Threading allows for more fine grained, lightweight use of multiple cores. A task is a small, parallelizable unit of work that can be scheduled to available worker threads that are managed by our Task Graph system. As the name suggests, tasks can also have dependencies on each other, so that certain tasks are not executed until other tasks completed successfully. The Task Graph is used by an increasing number of systems in the Engine, and we encourage you to use it, too.

The Profiling Tools in the Editor come with a handy feature for visually debugging threads. It shows the cycle counter based statistics, as well as Task Graph usage.

Parallelization Threads Task Graph Processes Messaging FPlatformProcess CreateProc() executes an external program LaunchURL() launches the default program for a URL IsProcRunning() checks whether a process is still running Plus many other utilities for process management FMonitoredProcess Convenience class for launching and monitoring processes Event delegates for cancellation, completion and output Unreal Engine 4 also has platform agnostic APIs for creating and monitoring external processes. You can start programs, launch web URLs or open files in other applications. If you need fine control over creating and monitoring external processes, you can use the FMonitoredProcess class.

Parallelization Threads Task Graph Processes Messaging Unreal Message Bus (UMB) Zero configuration intra- and inter-process communication Request-Reply and Publish-Subscribe patterns supported Messages are simple UStructs Transport Plug-ins Seamlessly connect processes across machines Only implemented for UDP right now (prototype) Messaging is another important architectural pattern for multi-threaded and distributed applications. Unreal Message Bus implements a message passing infrastructure that can be used to send commands, events or data to other threads, processes and even computers. Our main goal was to unify all the different network communication technologies we used for our tools in UE3, and to allow users to create their own distributed applications and tools without having to deal with low level technicalities, such as network protocols and how exactly messages get from A to B. Messages are simple Unreal structures (that currently require a tiny bit of boilerplate, but we will fix that soon). They can be sent between devices, even across the internet - if desired – and all our Unreal Frontend tools heavily rely on it. Messaging is a great way to get information out of your code classes in a loosely coupled way, and future use cases may include achievement systems, event processing for AI and real-time remote editing on game consoles and mobile phones.

Debugging the flow of messages tends to be quite cumbersome in code, so the Messaging system comes with a visual debugging tool as well. It is not quite finished yet, but already functional.

Upcoming Features Critical sections & events Better debugging and profiling support Task Graph Improvements and optimizations UObjects Thread-safe construction and destruction Parallel Rendering Implemented inside renderers, not on RHI API level Messaging UDP v2 (“Hammer”), BLOB attachments, more robust, breakpoints Named Pipes and other transport plug-ins More lock-free containers We are continuously working on improving and extending Unreal Engine 4. That UObjects are not thread-safe is sometimes a limitation, and we are currently working on allowing their construction and destruction on non-Game threads. We are also working on parallelizing the Renderer. This will be implemented not in the RHI API, but in the renderers themselves and utilize the Task Graph. The Messaging System will become very important for the new Editor tools we have scheduled for next year, so we will make it more powerful and easier to use as well. Much of our thread-safe code is currently lock based, and we have gotten good results with it, but we are also planning to add more generic lock-free containers.

Questions? Documentation, Tutorials and Help at: AnswerHub: Engine Documentation: Official Forums: Community Wiki: YouTube Videos: Community IRC: Unreal Engine 4 Roadmap lmgtfy.com/?q=Unreal+engine+Trello+ http://answers.unrealengine.com http://docs.unrealengine.com http://forums.unrealengine.com http://wiki.unrealengine.com http://www.youtube.com/user/UnrealDevelopmentKit #unrealengine on FreeNode With that I would like to conclude this talk. Make sure to check out our extensive materials on the internet, all of which are available for free – even if you don’t have a subscription yet. Are there any questions I can answer right now?