A Pipeline for Lockless Processing of Sound Data
David Thall, Insomniac Games


or: How I Learned to Stop Worrying and Love Concurrent Programming
David Thall, Insomniac Games

Our Goal
Remove fixed pipeline optimizations from the sound engine
– Stop packing runtime sound assets from dependency graphs in the builders
» Dependency graphs only describe ‘what’ to load (an expensive proposition)
– Loose-load sound assets using runtime statistics
» Precache sounds that require low latency and might play soon
» Load sounds on-demand if they can withstand greater latency
Learn more in “Next-Gen Asset Streaming Using Runtime Statistics”

Problems
Loose-loading requires us to:
– Load asynchronously to the main update
– Keep file system I/O contention to a minimum
– Defragment loaded data often enough to handle new requests
– Relocate during playback (many sounds are of indefinite length)
Unfortunately, our middleware sound API doesn’t properly support asynchronous handling of sound data.
– We can perform relocation during playback
» But the call blocks on a sync-point
» …so they ask the client to perform the move/fixup in a background thread
– But the API must lock every time sound bank data is updated
» The implementation uses a doubly-linked list to manage loaded sound bank data
– Therefore, the update will break if a load or unload request occurs at the same time as a relocation request (i.e., it must be synchronous)

Solutions
Attempt #1: Polling API
– Is it safe to move?
» No… someone is loading or updating… skip it
» Yes… move the data to a duplicate location, tell the sound API about the new fixup locations, and wait for the blocking call to return…
– But the state is still changing in the mixing thread
» So the sound API can still crash anyway!
DOESN’T WORK

Solutions
Attempt #2: Sync-point Callback API
– But now we need a lock on our end to make sure we don’t relocate while they are still processing a load or unload request
– DOESN’T WORK
Attempt #3: Synchronous Updates from a Background Thread
– COULD WORK

Our Solution
A solution that works
– No blind ‘lock-and-hope’ semantics
Is designed to be malleable
– The sound API is an inherently sequential system
And can run concurrent updates on data
– Such as loads and unloads behind playback

Staged Pipeline Updates
Each stage represents a job to be completed
Each subsequent stage’s counter checks whether or not it has pending jobs
– while (g_counters[LOAD_COMPLETED] < g_counters[LOAD_REQUESTED])
» Do some work on the request… then…
» Increment the LOAD_COMPLETED counter (must guarantee this happens last)
Jobs can run concurrently in separate threads without locks
And if we have a system that must run sequentially, we can manage that too.
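The counter scheme above can be sketched with C++11 atomics. The counter names come from the slide; the atomic wrappers, memory ordering, and the driver functions are assumptions for illustration:

```cpp
#include <atomic>

// Hypothetical stage counters, following the slide's g_counters idiom.
enum Stage { LOAD_REQUESTED, LOAD_COMPLETED, STAGE_COUNT };

static std::atomic<unsigned> g_counters[STAGE_COUNT];

// Producer: publish a new request by bumping the upstream counter.
void request_load() {
    g_counters[LOAD_REQUESTED].fetch_add(1, std::memory_order_release);
}

// Consumer: drain pending jobs. The completion counter is incremented
// last, after the work is done, so a downstream stage never observes a
// job that has not actually finished.
unsigned drain_load_stage() {
    unsigned done = 0;
    while (g_counters[LOAD_COMPLETED].load(std::memory_order_relaxed) <
           g_counters[LOAD_REQUESTED].load(std::memory_order_acquire)) {
        // ... do the actual load work for this request here ...
        g_counters[LOAD_COMPLETED].fetch_add(1, std::memory_order_release);
        ++done;
    }
    return done;
}
```

Because each stage only ever writes its own counter and reads its upstream neighbor's, no lock is needed between stages.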

Sound Loading Algorithm

Write load requests to a command queue (or a set of low- and high-latency queues), to be processed later...
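The command queue above can be sketched as a single-producer/single-consumer ring buffer, which stays lockless as long as exactly one thread pushes and one pops. The LoadCommand fields and queue size are illustrative assumptions:

```cpp
#include <atomic>
#include <cstdint>

// Minimal SPSC ring buffer for "write requests now, process later".
struct LoadCommand {
    uint32_t sound_id;    // which asset to load (hypothetical field)
    bool     low_latency; // route to precache vs. on-demand handling
};

template <unsigned N>  // N must be a power of two
class CommandQueue {
    LoadCommand           m_slots[N];
    std::atomic<unsigned> m_head{0};  // advanced by the consumer
    std::atomic<unsigned> m_tail{0};  // advanced by the producer
public:
    // Producer side (game update thread): returns false when full.
    bool push(const LoadCommand& cmd) {
        unsigned tail = m_tail.load(std::memory_order_relaxed);
        if (tail - m_head.load(std::memory_order_acquire) == N)
            return false;  // full; caller retries next frame
        m_slots[tail % N] = cmd;
        m_tail.store(tail + 1, std::memory_order_release);
        return true;
    }
    // Consumer side (loader thread): returns false when empty.
    bool pop(LoadCommand& out) {
        unsigned head = m_head.load(std::memory_order_relaxed);
        if (head == m_tail.load(std::memory_order_acquire))
            return false;  // nothing pending
        out = m_slots[head % N];
        m_head.store(head + 1, std::memory_order_release);
        return true;
    }
};
```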

Sound Loading Algorithm
If the staging buffer is empty, begin loading a request

Sound Loading Algorithm
Once the file has been loaded into the staging buffer, signal that the load is complete

Sound Loading Algorithm
Register the loaded sound file with the sound API. Flag the request as ready for playback.

Sound Loading Algorithm
Copy the file from the staging buffer to the main buffer
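Taken together, the loading steps above can be sketched as a small per-request state machine, where each transition is one pipeline job. The state names mirror the slides; the tick driver and the stub I/O flags are assumptions, not the engine's real code:

```cpp
// One request's progress through the loading stages.
enum LoadState {
    QUEUED,      // request written to the command queue
    LOADING,     // staging buffer claimed, file I/O in flight
    LOADED,      // file resident in the staging buffer
    REGISTERED,  // sound API knows about the data; playable
    COMMITTED    // copied from staging into the main buffer
};

struct LoadRequest { LoadState state = QUEUED; };

// Stubs standing in for real file I/O and the middleware calls.
static bool staging_buffer_empty = true;
static bool file_io_done         = false;

// Advance a request by at most one stage per call; because each
// transition is a separate job, the stages can run on different
// threads without locking each other.
void tick_load(LoadRequest& r) {
    switch (r.state) {
    case QUEUED:
        if (staging_buffer_empty) {         // claim the staging buffer
            staging_buffer_empty = false;
            r.state = LOADING;              // kick off the async read
        }
        break;
    case LOADING:
        if (file_io_done) r.state = LOADED; // I/O completion signal
        break;
    case LOADED:
        // register_with_sound_api(...);    // real call elided
        r.state = REGISTERED;               // now flagged playable
        break;
    case REGISTERED:
        // memcpy(main_buffer, staging_buffer, size);  // elided
        staging_buffer_empty = true;        // staging free for next load
        r.state = COMMITTED;
        break;
    case COMMITTED:
        break;
    }
}
```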

Sound Unloading Algorithm

Write unload requests to a command queue, to be processed later...

Sound Unloading Algorithm
If an unload request’s file is already loaded, flag the file as ready for an unload

Sound Unloading Algorithm
Copy the file from the main buffer to the staging buffer
Free the allocated memory…
Defrag the entire main buffer
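The defrag step above can be sketched as a simple compaction pass: live allocations are slid down over the freed hole, and their offsets are patched so relocation fixups can be issued afterwards. The Block bookkeeping is an assumption, not the engine's real allocator:

```cpp
#include <cstring>
#include <cstddef>

struct Block {
    size_t offset;  // position within the main buffer
    size_t size;    // bytes occupied
    bool   live;    // false once the sound has been unloaded
};

// Compact 'blocks' (sorted by offset) inside 'buffer'; returns the
// number of bytes in use after compaction.
size_t defrag(unsigned char* buffer, Block* blocks, size_t count) {
    size_t write = 0;  // next free offset at the low end
    for (size_t i = 0; i < count; ++i) {
        if (!blocks[i].live)
            continue;              // hole: skipped, so it gets squeezed out
        if (blocks[i].offset != write) {
            // Source and destination may overlap, so memmove, not memcpy.
            std::memmove(buffer + write, buffer + blocks[i].offset,
                         blocks[i].size);
            blocks[i].offset = write;  // record the new home for fixups
        }
        write += blocks[i].size;
    }
    return write;
}
```

This is exactly the operation that requires relocation during playback: any sound still playing out of a moved block needs its pointers fixed up, which is why the API's relocation restrictions matter.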

Sound Unloading Algorithm
Begin unloading the sound file

Sound Unloading Algorithm
When the sound file is completely unloaded, flag the request as completed and the staging buffer as empty

Lockless API Restrictions
Message-based API
– No state queries (immediate queries are meaningless)
– No accessors
– No handles to memory
– Asynchronous
– Unidirectional
– Pass by value
– Errors are deferred / propagated
– No required client synchronization
» However, the client may request a message to its input queue for syncing its own state data
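The message discipline above can be sketched as follows: every call is a value posted to a queue, errors come back as messages, and a client that needs a sync point asks for an echo into its own input queue. The message layout and type names are assumptions, and std::queue stands in for the lockless queue a real implementation would use:

```cpp
#include <cstdint>
#include <queue>  // placeholder for a lockless queue

enum MsgType : uint8_t { MSG_LOAD, MSG_UNLOAD, MSG_SYNC, MSG_ERROR };

struct Msg {
    MsgType  type;
    uint32_t sound_id;     // payload by value; no handles to memory
    uint32_t reply_queue;  // where MSG_SYNC / MSG_ERROR should land
};

std::queue<Msg> g_api_in;     // client -> sound system
std::queue<Msg> g_client_in;  // sound system -> client

// Client side: fire-and-forget; no return value, no state query.
void post(const Msg& m) { g_api_in.push(m); }

// Sound-system side: consume messages; a MSG_SYNC is echoed back so
// the client knows everything posted before it has been processed.
void pump_api() {
    while (!g_api_in.empty()) {
        Msg m = g_api_in.front();
        g_api_in.pop();
        if (m.type == MSG_SYNC)
            g_client_in.push(m);  // deferred acknowledgement
        // MSG_LOAD / MSG_UNLOAD would enqueue pipeline jobs here;
        // failures would push a MSG_ERROR instead of returning one.
    }
}
```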

Results
Updates are modular, fast, and scalable
The solution is general enough to be exported for use in other staged data-processing applications

Questions?

Thank you!