Fiber Based Job Systems Seth England

Preemptive Scheduling
Threads compete for resources. Synchronization primitives are used to prevent race conditions in shared data.

Preemptive Scheduling

There are some problems with this model. Locks are slow. It is very hard to reason about the state that our data is in. We want to be able to map multiple threads to a single engine system if necessary.

Cooperative Scheduling
We can improve upon all of these things. What if we could schedule the tasks in our engine in such a way that they never try to modify the same thing at the same time? This would eliminate locks and make multithreading much easier to reason about.

Cooperative Scheduling
Instead of letting our tasks compete for data, we can explicitly schedule them so that they never need to.

Cooperative Scheduling
It’s hard to multithread an engine this way if it looks like this.

Cooperative Scheduling
We need to break down the tasks in the engine in order to schedule them effectively. These broken-down tasks are usually referred to as jobs.

Jobs
A common solution for multithreading modern game engines is a job system. A job is a grouping of data and a transformation.
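
As a minimal sketch of "a grouping of data and a transformation", a job can be modeled as a function pointer plus a pointer to the data it transforms. The names Job, run_job, Positions, and move_right are hypothetical and are not taken from the slides.

```cpp
#include <cassert>
#include <vector>

// Hypothetical job type: a transformation plus the data it operates on.
struct Job {
    void (*transform)(void* data);  // the transformation
    void* data;                     // the data handed to it
};

// Run a job by applying its transformation to its data.
inline void run_job(const Job& job) { job.transform(job.data); }

// Example data and transformation, for illustration only.
struct Positions {
    std::vector<float> x;
};

inline void move_right(void* data) {
    Positions* p = static_cast<Positions*>(data);
    for (float& v : p->x) v += 1.0f;
}
```

A caller would build a Positions object, wrap it as Job{move_right, &positions}, and hand that to run_job. The void pointer matches the way the slides later pass the job through CreateFiber.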

Job System
A collection of worker threads. It assigns jobs to those threads from various queues.

Job System
In my job system there are 3 types of queues. In order of priority they are the stalled queue (jobs that have run but are not finished), the immediate queue, and the deferred queue. I’ve seen other systems with low, medium, and high priorities, but I find the decision is pretty binary (now or later?).
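
The priority order above can be sketched as a single pop that checks the queues from highest to lowest priority. JobQueues and pop_next are hypothetical names, jobs are represented by plain strings, and the sketch assumes every stalled job is ready to resume, which a real system would have to check.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>

// Hypothetical queue set mirroring the three queues on the slide.
struct JobQueues {
    std::deque<std::string> stalled;    // jobs that have run but are not finished
    std::deque<std::string> immediate;  // jobs to run now
    std::deque<std::string> deferred;   // jobs to run later

    // Pop the next job into `out`, checking queues in priority order.
    // Returns false when all three queues are empty.
    bool pop_next(std::string& out) {
        std::deque<std::string>* order[] = {&stalled, &immediate, &deferred};
        for (std::deque<std::string>* q : order) {
            if (!q->empty()) {
                out = std::move(q->front());
                q->pop_front();
                return true;
            }
        }
        return false;
    }
};
```

With one job in each queue, successive calls to pop_next return the stalled job first, then the immediate one, then the deferred one.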

Fibers
Fibers are units of execution that are manually scheduled by the application. They are not preempted by other fibers via the OS. They run on threads, assuming the identity of the thread they’re running on by swapping out a few values (stack pointer, instruction pointer, etc.). The API is operating system specific, though almost all OSes support this feature. They can do wonderful things that we will get to later.

Windows Functions
CreateFiber takes the size of the stack you want the fiber to have, the starting address (a function pointer), and a void pointer to be passed to that function. I pass the job as this pointer. Creating a fiber does not have nearly as much overhead as creating a thread. However, the stack needs to be allocated, and there’s no way (on Windows) to specify a location for the stack. This can be circumvented with a fiber pool, which we will go over later. CreateFiber returns what is effectively a handle that allows us to schedule that fiber and to delete it using DeleteFiber.

Windows Functions
ConvertThreadToFiber: since fibers can only be switched to from other fibers, worker threads need to call this once they are created.
SwitchToFiber: makes the calling thread assume the identity of the fiber, swapping out its instruction and stack pointers.
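
The calls named on these two slides can be put together in a short sketch. It is not the job system from the talk: DemoState, demo_job, and run_fiber_demo are hypothetical names, the 64 KB stack size is an arbitrary choice, and on non-Windows platforms the sketch compiles to a stub that returns an empty trace.

```cpp
#include <cassert>
#include <vector>

#ifdef _WIN32
#include <windows.h>

// Hypothetical state shared between the worker and the job fiber.
struct DemoState {
    void* worker_fiber = nullptr;  // fiber identity of the calling thread
    std::vector<int> trace;        // order in which the steps ran
};

// Start routine with the signature CreateFiber expects.
static void WINAPI demo_job(void* param) {
    DemoState* s = static_cast<DemoState*>(param);
    s->trace.push_back(2);
    SwitchToFiber(s->worker_fiber);  // yield; this fiber's state is saved
    s->trace.push_back(4);           // resumed here on the next switch
    SwitchToFiber(s->worker_fiber);  // never return from a fiber routine:
                                     // returning would exit the thread
}

std::vector<int> run_fiber_demo() {
    DemoState s;
    s.worker_fiber = ConvertThreadToFiber(nullptr);  // thread becomes a fiber
    if (!s.worker_fiber) return {};
    void* job = CreateFiber(64 * 1024, demo_job, &s);
    if (!job) {
        ConvertFiberToThread();
        return {};
    }
    s.trace.push_back(1);
    SwitchToFiber(job);  // runs demo_job up to its first switch
    s.trace.push_back(3);
    SwitchToFiber(job);  // continues demo_job after that switch
    DeleteFiber(job);
    ConvertFiberToThread();
    return s.trace;
}
#else
// Stub for platforms without the Windows fiber API.
std::vector<int> run_fiber_demo() { return {}; }
#endif
```

On Windows the returned trace is 1, 2, 3, 4, which shows the worker and the job fiber handing control back and forth, with the job continuing from where it yielded.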

Worker Threads
The job system uses a condition variable to signal when a worker thread needs to wake up.
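
A minimal sketch of a worker that sleeps on a condition variable until work arrives is shown here. It uses the C++ standard library rather than any Windows primitive, has a single queue instead of the three described earlier, and runs jobs as plain callables rather than fibers. The Worker class and its members are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical single-queue worker thread.
class Worker {
public:
    Worker() : thread_([this] { loop(); }) {}

    ~Worker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        wake_.notify_one();
        thread_.join();  // queued jobs are drained before the thread exits
    }

    void enqueue(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();  // signal the sleeping worker
    }

private:
    void loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                // Sleep until there is a job or a stop request.
                wake_.wait(lock, [this] { return stop_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stop requested, nothing left
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run the job outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> jobs_;
    bool stop_ = false;
    std::thread thread_;  // declared last so the other members exist first
};
```

The predicate passed to wait guards against spurious wakeups, and the destructor lets the worker finish any queued jobs before joining.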

Execute Job Queue

Example Job

Declare Job
Just a macro that produces the declaration that Windows requires for the function passed to CreateFiber.

Start Job
Basically sets up the “environment” of the job. Takes note of where the stack pointer is. Declares the start of the inner scope.

Start Job

End Job
Closes the scope. Defines a symbol to skip to the end of the job. Marks the job as finished. Switches to the worker thread.

End Job

Memory Leaks
This is a memory leak: the data structure never goes out of scope, because the job switches back to the worker thread instead of returning. This is why we declare an inner scope, and when we want to exit a job early, we jump to the end of that scope. The compiler will report an error if this goto would skip the initialization of a non-POD object.
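
The inner-scope idea can be sketched in portable C++ without fibers. Resource, g_live, and job_body are hypothetical names. The function returns where a real job would switch back to the worker thread, and the return value is the number of objects still alive at that point.

```cpp
#include <cassert>

// Counts live Resource objects so the effect of leaving the scope is visible.
static int g_live = 0;

struct Resource {
    Resource() { ++g_live; }
    ~Resource() { --g_live; }
};

// Hypothetical stand-in for a job body.
int job_body(bool exit_early) {
    {   // inner scope: everything the job owns lives in here
        Resource r;
        if (exit_early) goto end_of_job;  // leaving the scope destroys r
        Resource more;
        (void)more;
    }
end_of_job:
    // A fiber-based job would switch to the worker here instead of returning.
    // If the label sat inside the scope, after `Resource more;`, the goto
    // would skip that initialization and the code would not compile.
    return g_live;
}
```

Both job_body(true) and job_body(false) return 0, since the destructors run when control leaves the inner scope by either path.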

Reusing Fibers
Windows does not allow us to pass our own memory to use as the stack. To get around this, we save the stack pointer when the fiber first runs. When a job ends, we simply goto StartJobSymbol, which resets the instruction pointer, and we reset the stack pointer to the saved value. Now we can reuse fibers over and over.

Enqueuing Jobs
When SwitchToFiber is called, the state of the current fiber is saved. When SwitchToFiber is called again on the same fiber, the fiber continues running at the same place with the same stack as before! This is the flexibility of using fibers as opposed to threads. In a job system built on threads, when the job returns it will begin from the start of the job function when next called. We’ll use this flexibility to great effect later.

Enqueuing Jobs
When inside a job, we can enqueue other jobs to run. The enqueuing job is then put in the stalled queue. The other jobs run. When all the queued jobs have run, the job that enqueued them resumes from where it left off.

Enqueuing Jobs
Jobs are created on the stack of the job that enqueues them. This is OK because that stack remains valid memory for as long as the jobs have not gone out of scope.

Enqueuing Jobs

Execution Manager
A collection of “Execution Nodes”. An Execution Node is a collection of one or more jobs that are queued simultaneously. The nodes form a directed dependency graph, where a parent is connected to one or more children and vice versa. Connections represent a data dependency.

Execution Manager

Register Execution Node
Takes a config (name, type), an array of transformations, and an array of data. Each data pointer will be passed to each job every time it runs. Groups these arrays and creates jobs out of them, then puts these new jobs into execution nodes.

Add Dependency
Adds a reference to the child to the parent. Increments the child’s counter.

Register Execution Node

Execute Root
The starting point for the execution manager. It decrements a counter on each of its children. If that counter reaches 0, that child is queued. Each of its children does the same, and this happens recursively. The recursion progresses and then “unwinds”; when we get back to the root node, we’re done. Note that this recursion does not result in a large runtime stack: while nodes are waiting they aren’t actually running.

Execute Root

Execute Job Node
A job that queues up the jobs in the node. After those jobs finish, we queue up the children whose counter has reached zero (no more dependencies exist).
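
The counter scheme described for Add Dependency, Execute Root, and Execute Job Node can be sketched on a single thread as follows. ExecutionNode, add_dependency, and execute_root are hypothetical names. "Running" a node only records its name, and a local queue stands in for the job queues and worker threads, so the sketch does not reproduce the recursive fiber-based version from the talk.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical execution node: a name, its children, and a dependency counter.
struct ExecutionNode {
    explicit ExecutionNode(std::string n) : name(std::move(n)), pending(0) {}
    std::string name;
    std::vector<ExecutionNode*> children;
    int pending;  // parents that have not finished yet
};

// Add a reference to the child on the parent and increment the child's counter.
inline void add_dependency(ExecutionNode& parent, ExecutionNode& child) {
    parent.children.push_back(&child);
    ++child.pending;
}

// Run the graph from the root and return the order in which nodes ran.
inline std::vector<std::string> execute_root(ExecutionNode& root) {
    std::vector<std::string> order;
    std::queue<ExecutionNode*> ready;
    ready.push(&root);
    while (!ready.empty()) {
        ExecutionNode* node = ready.front();
        ready.pop();
        order.push_back(node->name);  // stand-in for running the node's jobs
        for (ExecutionNode* child : node->children) {
            // Queue the child once its last dependency has finished.
            if (--child->pending == 0) ready.push(child);
        }
    }
    return order;
}
```

For a diamond-shaped graph (root feeds a and b, both feed c), c is queued only after both a and b have run. The counters are consumed by one pass, so a system that runs the graph every frame would need to restore them first.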

Execute Job Node

Developer Workflow

Write code as you normally would. Put that code in job(s). Put those job(s) inside of execution node(s). Schedule the execution node(s) in the dependency graph. Basically the only difference between the single-threaded and multi-threaded workflow is that you have to schedule your code.

Developer Workflow
Write a visual tool; it will make your life much easier. When the game exe runs, it exports the registered execution nodes to a text file. The tool then reads the execution nodes from that file. Edit the connections in the tool and export a file. When the game runs, it reads the file exported from the tool as the dependencies. This could probably be made more robust with good reflection.

Developer Workflow

Uses
Core engine systems (graphics, physics, AI) are basically divided into 3 phases: an extraction phase where each system creates a copy of shared data, an update phase, and a sync phase where shared data is written back.
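
The three phases can be sketched for one system as follows. Transform and PhysicsSystem are hypothetical types, and the update is a single integration step chosen only to make the data flow visible.

```cpp
#include <cassert>

// Hypothetical shared data that several systems read and write.
struct Transform {
    float x = 0.0f;
    float velocity = 0.0f;
};

// Hypothetical system following the extract / update / sync split.
struct PhysicsSystem {
    Transform local;  // private copy; no other system touches it

    // Extraction phase: copy the shared data into the system.
    void extract(const Transform& shared) { local = shared; }

    // Update phase: work on the private copy only.
    void update(float dt) { local.x += local.velocity * dt; }

    // Sync phase: write the result back to the shared data.
    void sync(Transform& shared) const { shared.x = local.x; }
};
```

Because update touches only the private copy, several systems can run their update phases at the same time, and the shared data changes only during sync.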

Physics Uses
Extract transform position and rotation. Update rigidbody/collider matrices. Multi-threaded SAP. Collision detection: multi-threading is trivial. Constraints can be divided into non-dependent islands. Integration. Sync: write results back to the transform.

Graphics Uses
Extract position, rotation, and scale from the transform. Items are organized into “Render Lanes” of items with shared state (particles, static models, animated models). More info on multithreaded rendering can be found elsewhere. Rendering.

AI Uses
AI extracts data from the transform (position, orientation) and from physics (velocity, forces). AI calculates the deltas that need to be applied (forces). AI waits until physics is done integrating to apply these deltas.

Postmortem
Very few multithreading issues. Easily identifiable and fixable issues. No game logic multithreading. Very easy to use.