Advanced Topics in Module Design: Threadsafety and Portability Aaron Bannert


Thread-safe
From The Free On-line Dictionary of Computing (09 FEB 02) [foldoc]:
thread-safe: A description of code which is either re-entrant or protected from multiple simultaneous execution by some form of mutual exclusion.

APR The Apache Portable Runtime

The APR Libraries APR –System-level “glue” APR-UTIL –Portable routines built upon APR APR-ICONV –Portable international character support APR-SERF –Portable HTTP client library

“Glue” vs. Portable Runtime
Currently Under Debate
“Glue” unifies system-level calls
– e.g. db2, db3, db4, gdbm, ...
Routines that embody portability
– e.g. Bucket Brigades, URI routines, ...

What Uses APR? Apache HTTPD Apache Modules Subversion Flood JXTA-C Various Internal Projects...

The Basics Some APR Primitive Types

A Who’s Who of Mutexes
apr_thread_mutex_t
apr_proc_mutex_t
apr_global_mutex_t
(in the calls below, "foo" stands for thread, proc, or global)
– apr_foo_mutex_lock(): grab the lock, or block until it is available
– apr_foo_mutex_unlock(): release the current lock
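
A minimal sketch of the lock/unlock pattern above. So that it compiles without APR installed, it uses POSIX threads directly rather than APR; with APR the matching calls would be apr_thread_mutex_lock() and apr_thread_mutex_unlock(). The names worker and run_counter_demo, and the thread and iteration counts, are illustrative assumptions.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 4
#define INCREMENTS  10000

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

/* Each worker increments the shared counter while holding the mutex. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);   /* grab the lock, or block */
        counter++;
        pthread_mutex_unlock(&counter_lock); /* release the lock */
    }
    return NULL;
}

/* Starts the workers, waits for all of them, and returns the final count. */
long run_counter_demo(void)
{
    pthread_t threads[NUM_THREADS];

    counter = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);
    }
    return counter;
}
```

Without the mutex the increments could interleave and the total would be unpredictable; with it, every increment is counted.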

Normal vs. Nested Mutexes
Normal Mutexes (aka Non-nested)
– Deadlock when the same thread locks twice
Nested Mutexes
– Allow the same thread to lock multiple times (each lock must still be matched by an unlock)
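
A small sketch of nested locking, assuming POSIX threads as a stand-in for APR (a POSIX recursive mutex plays the role of an APR nested mutex). The function name lock_nested is hypothetical. It locks the same mutex several times from one thread and then unrolls the same number of unlocks.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>

/* Locks a recursive mutex `depth` times from one thread, then unlocks it
 * the same number of times. Returns 0 on success, or the first error code. */
int lock_nested(int depth)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int rc;

    assert(depth >= 0);
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    rc = pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0) {
        return rc;
    }

    for (int i = 0; i < depth; i++) {
        rc = pthread_mutex_lock(&mutex);     /* would deadlock if non-nested */
        if (rc != 0) {
            return rc;
        }
    }
    for (int i = 0; i < depth; i++) {
        rc = pthread_mutex_unlock(&mutex);   /* unroll every lock */
        if (rc != 0) {
            return rc;
        }
    }
    return pthread_mutex_destroy(&mutex);
}
```

With a normal (non-nested) mutex, the second lock call in the first loop is where the thread would block on itself.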

Reader/Writer Locks
apr_thread_rwlock_t
– apr_thread_rwlock_rdlock(): grab the shared read lock; blocks while any writer holds the lock
– apr_thread_rwlock_wrlock(): grab the exclusive write lock, blocking new readers
– apr_thread_rwlock_unlock(): release the current lock
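
A sketch of the shared-read, exclusive-write pattern, again using POSIX threads as a stand-in for the APR calls listed above. The names table_read, table_write, and readers_can_overlap are assumptions made for this illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>

static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;
static int table_value;

/* Readers take the shared lock; many readers may hold it at once. */
int table_read(void)
{
    int value;
    int rc = pthread_rwlock_rdlock(&table_lock);
    assert(rc == 0);
    value = table_value;
    pthread_rwlock_unlock(&table_lock);
    return value;
}

/* Writers take the exclusive lock. */
void table_write(int value)
{
    int rc = pthread_rwlock_wrlock(&table_lock);
    assert(rc == 0);
    table_value = value;
    pthread_rwlock_unlock(&table_lock);
}

/* Returns 1 if a second read lock can be taken while the first is still
 * held, which shows that read locks are shared rather than exclusive. */
int readers_can_overlap(void)
{
    int ok;

    pthread_rwlock_rdlock(&table_lock);
    ok = (pthread_rwlock_tryrdlock(&table_lock) == 0);
    if (ok) {
        pthread_rwlock_unlock(&table_lock);  /* release the second read lock */
    }
    pthread_rwlock_unlock(&table_lock);      /* release the first read lock */
    return ok;
}
```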

Condition Variables
apr_thread_cond_t
– apr_thread_cond_wait(): sleep until any signal arrives
– apr_thread_cond_signal(): send a signal to one waiting thread
– apr_thread_cond_broadcast(): send a signal to all waiting threads
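
A sketch of the usual wait/signal idiom, written with POSIX threads as a stand-in for APR. Like apr_thread_cond_wait(), the POSIX wait call takes the mutex that protects the shared state. The names producer, wait_for_payload, ready, and payload are illustrative assumptions. The waiter re-checks its condition in a loop, since waking up does not by itself prove the condition holds.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER;
static int ready;     /* the condition being waited on, protected by `lock` */
static int payload;

/* Producer: publish a value, set the condition, wake one waiter. */
static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    payload = 7;
    ready = 1;
    pthread_cond_signal(&ready_cond);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Consumer: start the producer, wait for the condition, return the value. */
int wait_for_payload(void)
{
    pthread_t tid;
    int value;
    int rc;

    ready = 0;
    rc = pthread_create(&tid, NULL, producer, NULL);
    assert(rc == 0);

    pthread_mutex_lock(&lock);
    while (!ready) {
        /* Atomically releases `lock`, sleeps, and re-acquires it on wakeup. */
        pthread_cond_wait(&ready_cond, &lock);
    }
    value = payload;
    pthread_mutex_unlock(&lock);

    pthread_join(tid, NULL);
    return value;
}
```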

Threads
apr_thread_t
– apr_thread_create(): create a new thread (with specialized attributes)
– apr_thread_exit(): exit from the current thread (with a return value)
– apr_thread_join(): block until a particular thread calls apr_thread_exit()
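
A sketch of the create/exit/join life cycle using POSIX threads as a stand-in. One difference worth noting: the POSIX exit value is a pointer, whereas APR's apr_thread_exit() and apr_thread_join() pass an apr_status_t. The names square and square_in_thread are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Thread body: squares its argument and exits with that value. */
static void *square(void *arg)
{
    intptr_t n = (intptr_t)arg;
    pthread_exit((void *)(n * n));
    return NULL; /* not reached */
}

/* Runs square() in a new thread and returns the value it exited with. */
int square_in_thread(int n)
{
    pthread_t tid;
    void *result = NULL;
    int rc;

    rc = pthread_create(&tid, NULL, square, (void *)(intptr_t)n);
    assert(rc == 0);
    rc = pthread_join(tid, &result);   /* blocks until the thread exits */
    assert(rc == 0);
    return (int)(intptr_t)result;
}
```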

One-time Calls
apr_thread_once_t
– apr_thread_once_init(): initialize an apr_thread_once_t variable
– apr_thread_once(): execute the given function once
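
A sketch of one-time initialization with POSIX pthread_once(), used here as a stand-in for apr_thread_once(). The names init_tables, count_init_calls, and MAX_THREADS are assumptions for the example. Several threads race to initialize, but the function body runs only once.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 16

static pthread_once_t init_once = PTHREAD_ONCE_INIT;
static int init_calls;

/* The one-time initializer; counts how often it actually runs. */
static void init_tables(void)
{
    init_calls++;
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_once(&init_once, init_tables);  /* safe to call from every thread */
    return NULL;
}

/* Starts num_threads workers and returns how many times init_tables() ran. */
int count_init_calls(int num_threads)
{
    pthread_t threads[MAX_THREADS];
    int started = 0;

    assert(num_threads >= 0 && num_threads <= MAX_THREADS);
    for (int i = 0; i < num_threads; i++) {
        if (pthread_create(&threads[started], NULL, worker, NULL) == 0) {
            started++;
        }
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
    }
    return init_calls;
}
```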

Apache 2.0 Architecture A quick MPM overview

What’s new in Apache 2.0? Filters MPMs SSL lots more…

What is an MPM?
“Multi-processing Module”
– Different HTTP server process models
– Each gives us platform-specific features
Admin may choose a suitable MPM for:
– Reliability
– Performance
– Features

Prefork MPM
Classic Apache 1.3 model
1 connection per Child
Pros:
– Isolates faults
– Performs well
Cons:
– Scales poorly (high memory reqts.)
[Diagram: one Parent with 100s of Child processes]

Worker MPM
Hybrid Process/Thread
1 connection per Thread
Many threads per Child
Pros:
– Efficient use of memory
– Highly Scalable
Cons:
– Faults destroy all threads in that Child
– 3rd party libraries must be threadsafe
[Diagram: one Parent with 10s of Child processes, 10s of threads per Child]

WinNT MPM
Single Parent/Single Child
1 connection per Thread
Many threads per Child
Pros:
– Efficient use of memory
– Highly Scalable
Cons:
– Faults destroy all threads
[Diagram: one Parent and one Child with 100s of threads]

The MPM Breakdown
MPM       Multi-process   Multithreaded
Prefork   Yes             No
Worker    Yes             Yes
WinNT     No*             Yes
* The WinNT MPM has a single parent and a single child.

Other MPMs BeOS Netware Threadpool –Similar to Worker, experimental Leader-Follower –Similar to Worker, also experimental

Apache 2.0 Hooks Using mutexes within the Apache framework

Useful APR Primitives for Apache mutexes reader/writer locks condition variables shared memory...

Global Mutex Creation Create it in the Parent: –Usually in post_config hook Attach to it in the Child: – This is the child_init hook

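Example: Create a Global Mutex

    static int shm_counter_post_config(apr_pool_t *pconf, apr_pool_t *plog,
                                       apr_pool_t *ptemp, server_rec *s)
    {
        apr_status_t rv;
        shm_counter_scfg_t *scfg;

        /* Get the module configuration */
        scfg = ap_get_module_config(s->module_config, &shm_counter_module);

        /* Create a global mutex from the config directive */
        rv = apr_global_mutex_create(&scfg->mutex, scfg->shmcounterlockfile,
                                     APR_LOCK_DEFAULT, pconf);

        /* Error check added as a sketch; the slide stops at the call above. */
        if (rv != APR_SUCCESS) {
            ap_log_error(APLOG_MARK, APLOG_ERR, rv, s,
                         "Failed to create global mutex");
            return HTTP_INTERNAL_SERVER_ERROR;
        }
        return OK;
    }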

Example: Attach Global Mutex

    static void shm_counter_child_init(apr_pool_t *p, server_rec *s)
    {
        apr_status_t rv;
        shm_counter_scfg_t *scfg = ap_get_module_config(s->module_config,
                                                        &shm_counter_module);

        /* Now that we are in a child process, we have to
         * reconnect to the global mutex. */
        rv = apr_global_mutex_child_init(&scfg->mutex,
                                         scfg->shmcounterlockfile, p);

        /* Error check added as a sketch; the slide stops at the call above. */
        if (rv != APR_SUCCESS) {
            ap_log_error(APLOG_MARK, APLOG_CRIT, rv, s,
                         "Failed to attach to global mutex");
        }
    }

Common Pitfall The double DSO-load problem –Apache loads each module twice: First time to see if it fails at startup Second time to actually load it (They are loaded again upon each restart…)

Avoiding the Double DSO-load
Solution:
– Don’t create mutexes during the first load
1. First time in post_config we set a userdata flag
2. Next time through we look for that userdata flag
– if it is set, we create the mutex

What is Userdata? Just a hash table Associated with each pool Same lifetime as its pool Key/Value entries

Example: Double DSO-load

    static int shm_counter_post_config(apr_pool_t *pconf, apr_pool_t *plog,
                                       apr_pool_t *ptemp, server_rec *s)
    {
        apr_status_t rv;
        void *data = NULL;
        const char *userdata_key = "shm_counter_post_config";

        apr_pool_userdata_get(&data, userdata_key, s->process->pool);
        if (data == NULL) {
            /* WARNING: This must *not* be apr_pool_userdata_setn(). */
            apr_pool_userdata_set((const void *)1, userdata_key,
                                  apr_pool_cleanup_null, s->process->pool);
            return OK; /* This would be the first time through */
        }

        /* Proceed with normal mutex and shared memory creation... */

Summary
1. Create in the Parent (post_config)
2. Attach in the Child (child_init)
This works for these types:
– mutexes
– condition variables
– reader/writer locks
– shared memory
– etc…

Shared Memory Efficient and portable shared memory for your Apache module

Types of Shared Memory
Anonymous
– Requires process inheritance
– Created in the parent
– Automatically inherited in the child
Name-based
– Associated with a file
– Processes need not be related by inheritance
– Must deal with file permissions

Anonymous Shared Memory
1. Parent creates special Anonymous shared segment
2. Parent calls fork()
3. Children inherit the shared segment.
[Diagram: Parent and Child both attached to the Shared Segment]
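
The three steps above can be sketched with plain POSIX calls, assuming a Unix-like system with fork() and anonymous mmap(). This is one way such a segment can be created, not necessarily what APR does internally on any given platform; the function name share_with_child is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The parent maps an anonymous shared segment, forks, and the child
 * writes `value` into it. Returns what the parent reads back, or -1. */
int share_with_child(int value)
{
    int *segment;
    int seen;
    pid_t pid;

    /* 1. Parent creates the anonymous shared segment. */
    segment = mmap(NULL, sizeof(*segment), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (segment == MAP_FAILED) {
        return -1;
    }
    *segment = 0;

    /* 2. Parent calls fork(). */
    pid = fork();
    if (pid < 0) {
        munmap(segment, sizeof(*segment));
        return -1;
    }
    if (pid == 0) {
        /* 3. The child inherits the mapping; its write reaches the parent. */
        *segment = value;
        _exit(0);
    }

    if (waitpid(pid, NULL, 0) < 0) {
        munmap(segment, sizeof(*segment));
        return -1;
    }
    seen = *segment;
    munmap(segment, sizeof(*segment));
    return seen;
}
```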

Example: Anonymous Shmem static int shm_counter_post_config(apr_pool_t *pconf, apr_pool_t *plog, apr_pool_t *ptemp, server_rec *s) { int rv;... /* Create an anonymous shared memory segment by passing * a NULL as the shared memory filename */ rv = apr_shm_create(&scfg->counters_shm, sizeof(*scfg->counters), NULL, pconf);

Accessing the Segment

    scfg->counters = apr_shm_baseaddr_get(scfg->counters_shm);

Segment is mapped as soon as it is created
– It has a start address
– You can query that start address
Reminder: The segment may not be mapped to the same address in all processes.
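
Because the base address can differ between processes, data stored inside a segment should refer to other data in the segment by offset from the base, not by raw pointer. The sketch below illustrates that idea in plain C; record_t, record_at, sum_chain, and demo_relocated_sum are hypothetical names, and a second in-process buffer stands in for a mapping at a different address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A record stored inside a shared segment. `next_off` is a byte offset
 * from the segment base, so it stays valid wherever the segment is
 * mapped. Offset 0 means "no next record". */
typedef struct {
    int    value;
    size_t next_off;
} record_t;

/* Converts an offset into a pointer for this process's mapping. */
static record_t *record_at(void *base, size_t off)
{
    return (record_t *)((char *)base + off);
}

/* Sums the chain of records that starts at `first_off`. */
int sum_chain(void *base, size_t first_off)
{
    int total = 0;
    size_t off = first_off;

    while (off != 0) {
        total += record_at(base, off)->value;
        off = record_at(base, off)->next_off;
    }
    return total;
}

/* Builds a two-record chain in one buffer, copies the raw bytes to a
 * second buffer at a different address, and sums through the copy. */
int demo_relocated_sum(void)
{
    static record_t seg_a[4];
    static record_t seg_b[4];
    size_t first = 1 * sizeof(record_t);   /* slot 0 unused: offset 0 = none */
    size_t second = 2 * sizeof(record_t);

    assert(second + sizeof(record_t) <= sizeof(seg_a));
    record_at(seg_a, first)->value = 10;
    record_at(seg_a, first)->next_off = second;
    record_at(seg_a, second)->value = 32;
    record_at(seg_a, second)->next_off = 0;

    memcpy(seg_b, seg_a, sizeof(seg_a));
    return sum_chain(seg_b, first);
}
```

Had the records stored pointers into seg_a, the copy in seg_b would still point back into the original buffer.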

Windows Portability Windows can’t inherit shared memory –it has no fork() call! Stupid Windows Solution: –Just like with mutexes: Attach from the “child” process (hint: to be portable to Windows, we can only use name-based shared memory.)

Name-based Shared Memory
1. Process creates file
2. File is mapped to segment
3. Second process opens same file
4. Second process then maps the same shared segment.
[Diagram: First Process and Second Process attached to one Shared Segment through a file in the file system]
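
A sketch of the four steps using POSIX open() and mmap(), assuming a Unix-like system with a writable /tmp. For brevity both mappings are made from one process, where two unrelated processes would each open and map the file by name. The names map_file and share_through_file, and the temporary path, are assumptions.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Opens `path` and maps its first sizeof(int) bytes as a shared segment. */
static int *map_file(const char *path)
{
    void *addr;
    int fd = open(path, O_RDWR);

    if (fd < 0) {
        return NULL;
    }
    addr = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);   /* the mapping remains valid after the descriptor closes */
    return addr == MAP_FAILED ? NULL : addr;
}

/* Writes `value` through one mapping and reads it through another.
 * Returns the value seen through the second mapping, or -1 on error. */
int share_through_file(int value)
{
    char path[] = "/tmp/shm_sketch_XXXXXX";
    int *first, *second;
    int seen = -1;
    int fd;

    /* 1. Create the backing file and give it a size. */
    fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, sizeof(int)) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    first = map_file(path);    /* 2. file is mapped to a segment */
    second = map_file(path);   /* 3-4. same file opened and mapped again */
    if (first != NULL && second != NULL) {
        *first = value;
        seen = *second;
    }
    if (first != NULL) {
        munmap(first, sizeof(int));
    }
    if (second != NULL) {
        munmap(second, sizeof(int));
    }
    unlink(path);
    return seen;
}
```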

Sharing with external apps Must use name-based Associate it with a file Other programs attach to that file Beware of race conditions Beware of permission problems (note recent security problem in Apache scoreboard)

Example: Name-based Shmem

    static int shm_counter_post_config(apr_pool_t *pconf, apr_pool_t *plog,
                                       apr_pool_t *ptemp, server_rec *s)
    {
        apr_status_t rv;
        shm_counter_scfg_t *scfg;
        ...
        /* Get the module configuration */
        scfg = ap_get_module_config(s->module_config, &shm_counter_module);

        /* Create a name-based shared memory segment using the filename
         * out of our config directive */
        rv = apr_shm_create(&scfg->counters_shm, sizeof(*scfg->counters),
                            scfg->shmcounterfile, pconf);

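Example: Name-based Shmem (cont)

    static void shm_counter_child_init(apr_pool_t *p, server_rec *s)
    {
        apr_status_t rv;
        shm_counter_scfg_t *scfg = ap_get_module_config(s->module_config,
                                                        &shm_counter_module);

        rv = apr_shm_attach(&scfg->counters_shm, scfg->shmcounterfile, p);

        /* Error check added as a sketch; the slide does not test rv. */
        if (rv != APR_SUCCESS) {
            ap_log_error(APLOG_MARK, APLOG_CRIT, rv, s,
                         "Failed to attach to shared memory segment");
            return;
        }
        scfg->counters = apr_shm_baseaddr_get(scfg->counters_shm);
    }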

RMM (Relocatable Memory Manager) Provides malloc() and free() Works with any block of memory Estimates overhead Threadsafe Usable on shared memory segments

Efficiency Tricks of the Trade

Questions to ask yourself: Do we really need a mutex? Can we design so that we don’t need it? How often will this mutex be locked? What is the largest segment size we’ll need?

R/W Locks vs. Mutexes Reader/Writer locks allow parallel reads Mutexes use less overhead Reader/Writer locks tend to scale better

Comparison of Lock Types

APR Atomics Very Fast Operations Can implement a very fast mutex Pros: –Can be very efficient (sometimes it becomes just one instruction) Cons: –Produces non-portable binaries (e.g. a Solaris 7 binary may not work on Solaris 8)
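
A sketch of both ideas on this slide, a lock-free counter and a very small mutex built from an atomic flag, written with C11 <stdatomic.h> and POSIX threads rather than APR's apr_atomic_* calls so that it compiles without APR. All function names and the thread and iteration counts are illustrative assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NUM_THREADS 4
#define INCREMENTS  10000

static atomic_long atomic_hits;             /* updated by one atomic operation */
static long locked_hits;                    /* protected by the spinlock below */
static atomic_flag spin = ATOMIC_FLAG_INIT;

static void spin_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&spin, memory_order_acquire)) {
        /* busy-wait until the current holder clears the flag */
    }
}

static void spin_unlock(void)
{
    atomic_flag_clear_explicit(&spin, memory_order_release);
}

static void *atomic_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        atomic_fetch_add(&atomic_hits, 1);  /* no mutex needed */
    }
    return NULL;
}

static void *spin_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        spin_lock();
        locked_hits++;
        spin_unlock();
    }
    return NULL;
}

/* Runs `fn` in NUM_THREADS threads and waits for all of them. */
static void run_threads(void *(*fn)(void *))
{
    pthread_t threads[NUM_THREADS];

    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, fn, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);
    }
}

long count_with_atomics(void)
{
    atomic_store(&atomic_hits, 0);
    run_threads(atomic_worker);
    return atomic_load(&atomic_hits);
}

long count_with_spinlock(void)
{
    locked_hits = 0;
    run_threads(spin_worker);
    return locked_hits;
}
```

A spinlock like this burns CPU while waiting, so it only pays off when the protected section is very short.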

Threads Adding threads to your Apache modules

Why use threads in Apache? background work better asynchronous event handling pseudo-event-driven models

Thread Libraries
Three major types
– 1:1 one kthread = one userspace thread
– 1:N one kthread = many userspace threads
– M:N many kthreads ~= many userspace threads

1:1 Thread Libraries
E.g.
– Linuxthreads
– NPTL (Linux 2.6?)
– Solaris 9’s threads
– etc...
Good with an O(1) scheduler
Can span multiple CPUs
Resource intensive
[Diagram: Userspace threads 1–3 in one Process, above Kernel kthreads 1–6]

1:N Thread Libraries
E.g.
– GnuPth
– FreeBSD <4.6?
– etc...
Shares one kthread
Can NOT span multiple CPUs
Not Resource Intensive
Poor with compute-bound problems
[Diagram: Userspace threads 1–3 in one Process, above Kernel kthreads 1–6]

M:N Thread Libraries
E.g.
– NGPT (from IBM)
– Solaris 6, 7, 8
– AIX
– etc...
Shares one or more kthreads
Can span multiple CPUs
Complicated Impl.
Good with crappy schedulers
[Diagram: Userspace threads 1–3 in one Process, above Kernel kthreads 1–6]

Pitfalls pool association cleanup registration proper shutdown async signal handling signal masks

Bonus: apr_reslist_t Resource Lists

Resource Pooling List of Resources Created/Destroyed as needed Useful for –persistent database connections –request servicing threads –...

Reslist Parameters
min
– min allowed available resources
smax
– soft max allowed available resources
hmax
– hard max on total resources
ttl
– max time an available resource may idle

Constructor/Destructor Registered Callbacks Create called for new resource Destroy called to expunge old Implementer must ensure threadsafety

Using a Reslist
1. Set up constructor/destructor
2. Set operating parameters
3. Main Loop
   1. Retrieve Resource
   2. Use
   3. Release Resource
4. Destroy reslist
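
The steps above can be sketched with a tiny hand-rolled pool. This is not the apr_reslist_t API: pool_t, pool_acquire, pool_release, and the other names are hypothetical, only the hard maximum is modelled (no min, smax, or ttl), and the pool is not thread-safe. It is meant only to show the constructor/destructor callbacks and the retrieve, use, release cycle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_HMAX 4   /* hard max on total resources (illustrative value) */

typedef void *(*ctor_fn)(void);
typedef void (*dtor_fn)(void *);

typedef struct {
    void   *idle[POOL_HMAX];  /* stack of available resources */
    int     n_idle;
    int     n_total;          /* idle plus checked-out resources */
    ctor_fn create;
    dtor_fn destroy;
} pool_t;

/* Steps 1 and 2: register the callbacks and start with an empty pool. */
void pool_init(pool_t *p, ctor_fn create, dtor_fn destroy)
{
    p->n_idle = 0;
    p->n_total = 0;
    p->create = create;
    p->destroy = destroy;
}

/* Step 3.1: reuse an idle resource, else construct one if under the max.
 * Returns NULL at the hard max; a real pool might block here instead. */
void *pool_acquire(pool_t *p)
{
    void *r;

    if (p->n_idle > 0) {
        return p->idle[--p->n_idle];
    }
    if (p->n_total >= POOL_HMAX) {
        return NULL;
    }
    r = p->create();
    if (r != NULL) {
        p->n_total++;
    }
    return r;
}

/* Step 3.3: hand the resource back for reuse. */
void pool_release(pool_t *p, void *r)
{
    assert(p->n_idle < POOL_HMAX);
    p->idle[p->n_idle++] = r;
}

/* Step 4: destroy every idle resource. */
void pool_destroy(pool_t *p)
{
    while (p->n_idle > 0) {
        p->destroy(p->idle[--p->n_idle]);
        p->n_total--;
    }
}

/* Demo resource: a heap-allocated int standing in for a connection. */
static int live_connections;

static void *conn_create(void)
{
    int *c = malloc(sizeof(*c));
    if (c != NULL) {
        *c = 0;
        live_connections++;
    }
    return c;
}

static void conn_destroy(void *c)
{
    free(c);
    live_connections--;
}

/* Returns 1 if a released resource is reused and cleaned up at the end. */
int demo_pool_reuse(void)
{
    pool_t pool;
    void *a, *b;
    int reused;

    pool_init(&pool, conn_create, conn_destroy);
    a = pool_acquire(&pool);
    if (a == NULL) {
        return 0;
    }
    pool_release(&pool, a);
    b = pool_acquire(&pool);            /* no second constructor call */
    reused = (a == b) && (live_connections == 1);
    pool_release(&pool, b);
    pool_destroy(&pool);
    return reused && (live_connections == 0);
}
```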

Thank You The End