Extended Memory Semantics for Thread Synchronization
Sheng Li, Ying Zhou
Operating System Progress Report, Nov 1st, 2007

Presentation transcript:

Extended Memory Semantics for Thread Synchronization
Sheng Li, Ying Zhou
Operating System Progress Report, Nov 1st, 2007

Problems
- Hardware multithreading is no longer a privilege of supercomputing; it is already part of mainstream microprocessors.
  - E.g. Sun Niagara 2 has 64 threads/chip and 256 threads/server.
- Concurrency management is one of the biggest challenges in multithreaded systems.
  - Key requirement: low-overhead, scalable thread synchronization
- Synchronization mechanisms
  - Atomic primitives (Test-and-Set, Compare-and-Swap, LL-SC)
  - Software routines built on these primitives have poor performance and scalability
  - Empty/Full bits: an extension bit on each memory location denotes its empty/full state
  - Empty/Full bits perform better [1], but still not well enough
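To make the "software routines built on atomic primitives" point concrete, here is a minimal sketch of a spinlock layered on test-and-set. It is an illustration only, not code from the project: Python has no hardware test-and-set, so the `TestAndSetWord` class (a hypothetical name) fakes the atomicity with an internal lock, and `run_spinlock` is an invented driver. The returned spin count shows the wasted retries that grow with contention.

```python
import threading
import time


class TestAndSetWord:
    """One memory word with an atomic test-and-set (hypothetical model).

    The internal lock only stands in for the atomicity that hardware
    provides; it is not part of the algorithm being illustrated.
    """

    def __init__(self):
        self._flag = False
        self._atomic = threading.Lock()

    def test_and_set(self):
        # Atomically set the flag and return its previous value.
        with self._atomic:
            old = self._flag
            self._flag = True
            return old

    def clear(self):
        with self._atomic:
            self._flag = False


class SpinLock:
    """Spinlock built in software on top of test-and-set."""

    def __init__(self):
        self._word = TestAndSetWord()

    def acquire(self):
        # Retry until the previous value was False (lock was free).
        # Returns the number of failed attempts, i.e. wasted work.
        spins = 0
        while self._word.test_and_set():
            spins += 1
            time.sleep(0)  # yield so the current holder can make progress
        return spins

    def release(self):
        self._word.clear()


def run_spinlock(num_threads=4, increments=500):
    """Increment one shared counter under the spinlock.

    Returns (final counter value, total failed acquire attempts).
    """
    lock = SpinLock()
    state = {"counter": 0, "spins": 0}

    def worker():
        for _ in range(increments):
            spins = lock.acquire()
            state["counter"] += 1      # protected by the spinlock
            state["spins"] += spins    # also updated while holding the lock
            lock.release()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"], state["spins"]


if __name__ == "__main__":
    counter, spins = run_spinlock()
    print("counter:", counter, "failed attempts:", spins)
```

Every failed `test_and_set` is a round trip that does no useful work, which is one way to read the slide's claim about poor scalability.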

Our Goal
- Solve the synchronization bottleneck by using Extended Memory Semantics (EMS)
  - Better performance and scalability
- Quantify the performance gain from EMS compared with other synchronization mechanisms (e.g. Empty/Full bits)

Extended Memory Semantics
- Memory instructions are characterized by their synchronization behavior.
  - Load.ff, Load.fe, Store.xf, Store.ef, Store.xe (f: full, e: empty, x: don't care)
- [Figure: a memory word with 64 bits of data/metadata plus an extension bit]
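The five instructions can be modeled in software as follows. This is a sketch under one stated assumption: the slide only gives the letter legend, so reading the first suffix letter as the state the word must be in before the access and the second as the state it is left in is our interpretation. The `EMSWord` class and the two helper functions are hypothetical names, and a condition variable stands in for whatever the hardware and EMS handler do with blocked requests.

```python
import threading


class EMSWord:
    """A memory word plus one full/empty extension bit (software model)."""

    def __init__(self):
        self._value = 0
        self._full = False              # extension bit: starts empty
        self._cond = threading.Condition()

    def _wait_for(self, want_full):
        # Caller holds self._cond. Block until the bit has the wanted state.
        while self._full != want_full:
            self._cond.wait()

    def load_ff(self):
        """Load.ff: wait until full, read, leave the word full."""
        with self._cond:
            self._wait_for(True)
            return self._value

    def load_fe(self):
        """Load.fe: wait until full, read, mark the word empty."""
        with self._cond:
            self._wait_for(True)
            self._full = False
            self._cond.notify_all()
            return self._value

    def store_xf(self, value):
        """Store.xf: write regardless of state, mark the word full."""
        with self._cond:
            self._value = value
            self._full = True
            self._cond.notify_all()

    def store_ef(self, value):
        """Store.ef: wait until empty, write, mark the word full."""
        with self._cond:
            self._wait_for(False)
            self._value = value
            self._full = True
            self._cond.notify_all()

    def store_xe(self, value):
        """Store.xe: write regardless of state, mark the word empty."""
        with self._cond:
            self._value = value
            self._full = False
            self._cond.notify_all()


def handoff(n):
    """One producer passes n values to one consumer through a single word."""
    cell = EMSWord()
    received = []

    def producer():
        for i in range(n):
            cell.store_ef(i)            # blocks until the consumer empties it

    def consumer():
        for _ in range(n):
            received.append(cell.load_fe())

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


def ems_increment(num_threads, increments):
    """Atomic counter: load_fe takes the word, store_ef puts it back."""
    cell = EMSWord()
    cell.store_xf(0)                    # initialize and mark full

    def worker():
        for _ in range(increments):
            value = cell.load_fe()      # other threads now block on empty
            cell.store_ef(value + 1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.load_ff()
```

Under this reading, a Load.fe followed by a Store.ef on the same word behaves like a lock acquire and release with the data carried in the word itself, so no separate lock variable is needed.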

EMS handler
- There is no free lunch…
- The EMS handler has overhead
  - Creating the handler threads
  - Queuing up memory requests and building the data structure

What we have done so far
- Built the EMS model, covering both the architecture and OS aspects, in the Structural Simulation Toolkit (SST)
  - SST is the simulation environment for massively lightweight multithreading, developed at Notre Dame and Sandia National Laboratories
- Modified glibc to use EMS
  - Especially the pthread library
- Designed benchmarks for different categories
- Ran the simulations to evaluate EMS performance

Tightly Coupled Parallel
- Each thread competes with the others for the only lock before updating the counter
- Very high contention, worst case

Loosely Coupled Parallel
- Each thread competes with the others for locks before updating the counters
- Mild contention

Embarrassingly Parallel
- No contention, no locks
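The three benchmark categories above (tightly coupled, loosely coupled, embarrassingly parallel) can be sketched as one counter-update workload with different lock layouts. This is an illustration with ordinary Python locks, not the benchmarks run in SST. The function name, the `kind` strings, the default sizes, and the round-robin choice of counter in the loosely coupled case are all assumptions, since the slides do not say how threads pick counters.

```python
import threading


def run_benchmark(kind, num_threads=4, updates_per_thread=1000, num_counters=8):
    """Run one counter-update workload and return the total count.

    kind == "tight":        one counter, one lock, every thread contends
    kind == "loose":        several counters, one lock each, mild contention
    kind == "embarrassing": one private counter per thread, no locks
    """
    if kind == "tight":
        counters = [0]
        locks = [threading.Lock()]
    elif kind == "loose":
        counters = [0] * num_counters
        locks = [threading.Lock() for _ in range(num_counters)]
    elif kind == "embarrassing":
        counters = [0] * num_threads
        locks = None
    else:
        raise ValueError("unknown benchmark kind: %r" % kind)

    def worker(tid):
        for i in range(updates_per_thread):
            if locks is None:
                # Private slot: only this thread ever touches counters[tid].
                counters[tid] += 1
            else:
                # Assumed access pattern: walk the counters round-robin.
                # With a single counter this always selects slot 0.
                slot = (tid + i) % len(counters)
                with locks[slot]:
                    counters[slot] += 1

    threads = [threading.Thread(target=worker, args=(tid,))
               for tid in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counters)


if __name__ == "__main__":
    for kind in ("tight", "loose", "embarrassing"):
        print(kind, run_benchmark(kind))
```

All three variants do the same amount of real work (`num_threads * updates_per_thread` increments); only the amount of lock sharing differs, which is what lets the categories isolate synchronization cost.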

Embarrassingly parallel and loosely coupled parallel
- Low synchronization overhead, guaranteed by EMS
- EMS shows very good scalability
- Synchronization distribution

Tightly Coupled Parallel
- Bad performance for EMS in the worst case
- Most of the threads are used for synchronization, not for real work

The Road Ahead
- Build/complete other synchronization mechanisms (e.g. Empty/Full bits) in SST
- Modify glibc to support other synchronization mechanisms
- Compare performance between EMS and other synchronization mechanisms

Thank you! Questions?

Bibliography
[1] Larry Carter, John Feo, Allan Snavely. Performance and Programming Experience on the Tera MTA. PPSC, 1999.

Backup Slides

Lightweight Threads
- Thread context (frame) is 32 double words (256 bytes)
  - Two double words are reserved for the thread status; the other 30 are general-purpose registers.
  - No other per-thread state, which makes multithreading easy.
- Frames are stored in memory (no register file)
  - Registers are aliases for memory locations
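The frame arithmetic can be checked with a few lines. The sizes come from the slide (32 double words, 256 bytes, so 8 bytes per double word); the position of the two status words at the start of the frame and the `register_address` helper are assumptions made for illustration, since the slide does not give the layout.

```python
WORD_BYTES = 8                                  # 256 bytes / 32 double words
FRAME_WORDS = 32                                # double words per thread frame
STATUS_WORDS = 2                                # reserved for thread status
GPR_COUNT = FRAME_WORDS - STATUS_WORDS          # general-purpose registers
FRAME_BYTES = FRAME_WORDS * WORD_BYTES          # size of one frame in memory


def register_address(frame_base, reg):
    """Memory address that general-purpose register `reg` aliases.

    ASSUMPTION: the two status words occupy the first two slots of the
    frame and the registers follow in order. The real layout may differ.
    """
    if not 0 <= reg < GPR_COUNT:
        raise ValueError("register index out of range: %d" % reg)
    return frame_base + (STATUS_WORDS + reg) * WORD_BYTES


if __name__ == "__main__":
    base = 0x1000                               # hypothetical frame address
    print("frame bytes:", FRAME_BYTES)
    print("registers:", GPR_COUNT)
    print("r0 at", hex(register_address(base, 0)))
    print("r29 at", hex(register_address(base, GPR_COUNT - 1)))
```

Because a register is just an offset into the frame, switching threads under this model amounts to changing the frame base address rather than saving and restoring a register file.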