James the GIANT killer: evaluating locking schemes in 2010-02-27 james francis toy iv David Hemmendinger.

Presentation transcript:

james the GIANT killer: evaluating locking schemes in james francis toy iv David Hemmendinger

Purpose
Evaluate current locking scheme in FreeBSD
See if the locking methods can be improved
Evaluate both methods and form conclusions

WAIT! Why do we need locking?
Race conditions
– Expected result: “bad” “dog”
– Threads race each other on context switches
– Possible incorrect result: “bda dog” (interleaving)
– Alternative correct result: “dog” “bad”
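A minimal sketch of the “bad”/“dog” example, assuming two userland threads that each append their word to a shared buffer one letter at a time. Python's `threading.Lock` stands in for a kernel lock here; the function name `write_words` and the `use_lock` flag are made up for illustration and are not from the talk.

```python
import threading

def write_words(words, use_lock=True):
    """Start one thread per word; each appends its letters to a shared buffer."""
    buffer = []
    lock = threading.Lock()

    def writer(word):
        if use_lock:
            lock.acquire()  # critical section: the whole word is written as one unit
        try:
            for ch in word:
                buffer.append(ch)  # without the lock, letters from other threads may land in between
        finally:
            if use_lock:
                lock.release()

    threads = [threading.Thread(target=writer, args=(w,)) for w in words]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return "".join(buffer)
```

With the lock, `write_words(["bad", "dog"])` can only return `"baddog"` or `"dogbad"`, the two correct results. With `use_lock=False` every letter still arrives, but the order is up to the scheduler.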

When it matters
The most important thing is correct results
– Incorrect code at the kernel level (close to the metal) can cause userland applications to yield incorrect results
In some cases incorrect code can lead to death
– Therac-25: command sequences were typed so fast that input data was corrupted
– This resulted in overexposure to radiation

OK; so what is a GIANT_LOCK?
What the kernel currently uses for locking!

GIANT_LOCK
GIANT_LOCKs only allow one thread in the kernel at a time
This is the simple solution; however, it inhibits concurrency!
Why concurrency is important with Symmetric Multi-Processing (SMP)
– Logical concurrency w/o SMP
Is there a better solution?
[Diagram labels: kernel log, sysctl, scheduler, virtual mem]

Fine Grained Locking (FGL)
Locks only go around shared data structures: “critical sections” in subsystems
Seldom do threads bombard a specific subsystem (kernel design issue)
Developer communities currently always favor FGL implementation
– If everything is FGL then concurrency is promoted from the smallest subsystem to the largest
[Diagram labels: mem, sched, sysctl, kernel log]

GIANT_LOCK vs. FGL
GIANT_LOCK is safe!
– Guarantees no races and no deadlocks
– Problem: inhibits concurrency!
Fine grained locking promotes concurrency
– Problem: complexity?
– Possible problem: deadlocks? (subsystems mutually exclusive)
– Possible problem: locking overhead (how long FGL takes)
Will this pay off?
– May depend on the subsystem
Will the FGL code present a maintenance problem?
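The trade-off can be sketched in userland, assuming one lock object per subsystem for FGL and a single shared lock object for GIANT. The subsystem names come from the slides; `make_locks` and `can_enter_sysctl_while_klog_busy` are hypothetical helpers, and `threading.Lock` is only a stand-in for the kernel's primitives.

```python
import threading

SUBSYSTEMS = ("klog", "sysctl", "sched", "vm")  # labels taken from the slides

def make_locks(fine_grained):
    """Map each subsystem to its lock: one each (FGL) or one shared by all (GIANT)."""
    if fine_grained:
        return {name: threading.Lock() for name in SUBSYSTEMS}
    giant = threading.Lock()
    return {name: giant for name in SUBSYSTEMS}

def can_enter_sysctl_while_klog_busy(fine_grained):
    """Hold the klog lock, then try (without blocking) to take the sysctl lock."""
    locks = make_locks(fine_grained)
    with locks["klog"]:
        got = locks["sysctl"].acquire(blocking=False)
        if got:
            locks["sysctl"].release()
        return got
```

Under FGL the second acquire succeeds, so work in sysctl can proceed while klog is busy. Under GIANT both names point at the same lock, so the attempt fails until klog is done. The price of FGL is more locks to reason about, which is where the complexity and deadlock questions come from.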

DESIGN: locking in BSD land
Tools of the trade (sleep locks, read locks, read-write locks)
Two separate branches in a version control system: Fine Grained Locking and GIANT_LOCK
Specific subsystems being targeted and why
– Klog prints kernel events: low traffic
– Sysctl manages kernel variables: higher traffic

DESIGN: method of evaluation
Comparison of FGL and GIANT (control)
– Tests designed to hit specific subsystems (FGL a win?)
– sysctl: multi-threaded make
– klog: kernel event character device
Locking profiler
1. Set sysctl variable
2. Run thread(s)
3. Unset sysctl variable
– TOO MUCH data
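A rough userland sketch of the three profiling steps, assuming the sysctl variable acts as an on/off switch for recording lock statistics. Everything named here (`ProfiledLock`, `enabled`, `profile_run`) is hypothetical; it is not the FreeBSD locking profiler, only an illustration of the set / run / unset sequence.

```python
import threading
import time

class ProfiledLock:
    """Lock wrapper that records how long each acquire waited while profiling is on."""

    def __init__(self):
        self._lock = threading.Lock()
        self.enabled = False  # hypothetical stand-in for the sysctl variable
        self.waits = []       # one wait time (seconds) per profiled acquire

    def __enter__(self):
        start = time.perf_counter()
        self._lock.acquire()
        if self.enabled:
            # Appended while holding the lock, so the list itself is protected.
            self.waits.append(time.perf_counter() - start)
        return self

    def __exit__(self, *exc):
        self._lock.release()
        return False

def profile_run(n_threads=4, ops_per_thread=500):
    lock = ProfiledLock()
    counter = [0]

    def worker():
        for _ in range(ops_per_thread):
            with lock:
                counter[0] += 1

    lock.enabled = True                  # 1. set the variable
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:                    # 2. run the thread(s)
        t.start()
    for t in threads:
        t.join()
    lock.enabled = False                 # 3. unset the variable
    return counter[0], len(lock.waits)
```

Each acquire produces one sample, so even this toy run of 4 threads and 500 operations yields 2000 records, which hints at why a profiled kernel build produces too much data.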

Correctness and how I achieved it
Correctness
– Race conditions arise from writing to a shared resource
– Readers share a lock
– Writers pick up an exclusive lock
Process
– New branch of code
– Replace GIANT entry with fine grained locks
– Rebuild kernels in both configurations
– cron runs build test scripts (automated builds and profiling)
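The reader/writer rule can be illustrated with a small read-write lock built on `threading.Condition`. This is a sketch under assumptions, not the kernel's implementation: the class `RWLock` and the helper `run_writers` are invented for the example, and this simple version lets a steady stream of readers starve writers.

```python
import threading

class RWLock:
    """Many readers may hold the lock together; a writer holds it alone."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of readers currently inside
        self._writer = False   # True while a writer is inside

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # a waiting writer may now proceed

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

def run_writers(n_writers=4, increments=1000):
    """Each writer updates a shared value under the exclusive lock."""
    lock = RWLock()
    state = {"value": 0}

    def writer():
        for _ in range(increments):
            lock.acquire_write()
            try:
                state["value"] += 1
            finally:
                lock.release_write()

    threads = [threading.Thread(target=writer) for _ in range(n_writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["value"]
```

Because every write happens under the exclusive lock, `run_writers(4, 1000)` returns 4000 with no lost updates, while any number of `acquire_read` calls can be outstanding at once as long as no writer is inside.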

RESULTS: SYSCTL tests
[Chart: μs, kernel builds with 128 threads]

Conclusions
The sysctl tests did produce data that strongly support the initial hypothesis.
– 1/2% increase in system throughput (one small step)
– Max time saved: ~30 seconds on a 2-hour build (about 0.4% of the build time)
The klog test did not produce useful data because the locking mechanisms are around a device node, not a msg_buffer.
Extending FGL to more subsystems means more throughput.

:: what really matters ::
Goal was to determine if:
– FGL is detrimental to the system (complication)
– FGL is significantly faster than GIANT_LOCKING in a small subsystem like sysctl
Locking methods are very important to SMP
– Running as close to the metal as you can get
– If this fails to be correct or efficient, the rest of the programs run on the system fail too!
The FGL implementation should scale well.