Bronis R. de Supinski and John May Center for Applied Scientific Computing March 18, 1999 Benchmarking pthreads.

Slides:



Advertisements
Similar presentations
CS Lecture 4 Programming with Posix Threads and Java Threads George Mason University Fall 2009.
Advertisements

Chapter 2 Processes and Threads
The Who, What, Why and How of High Performance Computing Applications in the Cloud Soheila Abrishami 1.
Threads. What do we have so far The basic unit of CPU utilization is a process. To run a program (a sequence of code), create a process. Processes are.
Distributed systems Programming with threads. Reviews on OS concepts Each process occupies a single address space.
Extensibility, Safety and Performance in the SPIN Operating System Department of Computer Science and Engineering, University of Washington Brian N. Bershad,
MPI and C-Language Seminars Seminar Plan  Week 1 – Introduction, Data Types, Control Flow, Pointers  Week 2 – Arrays, Structures, Enums, I/O,
Chapter 5 Processes and Threads Copyright © 2008.
Distributed systems Programming with threads. Reviews on OS concepts Each process occupies a single address space.
Computer Architecture II 1 Computer architecture II Programming: POSIX Threads OpenMP.
July Terry Jones, Integrated Computing & Communications Dept Fast-OS.
Performance Analysis of MPI Communications on the SGI Altix 3700 Nor Asilah Wati Abdul Hamid, Paul Coddington, Francis Vaughan Distributed & High Performance.
Bronis R. de Supinski Center for Applied Scientific Computing Lawrence Livermore National Laboratory June 2, 2005 The Most Needed Feature(s) for OpenMP.
Scheduler Activations Effective Kernel Support for the User-Level Management of Parallelism.
Threads 1 CS502 Spring 2006 Threads CS-502 Spring 2006.
Nor Asilah Wati Abdul Hamid, Paul Coddington, Francis Vaughan School of Computer Science, University of Adelaide IPDPS - PMEO April 2006 Comparison of.
Threads© Dr. Ayman Abdel-Hamid, CS4254 Spring CS4254 Computer Network Architecture and Programming Dr. Ayman A. Abdel-Hamid Computer Science Department.
Pthread II. Outline Join Mutex Variables Condition Variables.
ThreadsThreads operating systems. ThreadsThreads A Thread, or thread of execution, is the sequence of instructions being executed. A process may have.
User-Level Process towards Exascale Systems Akio Shimada [1], Atsushi Hori [1], Yutaka Ishikawa [1], Pavan Balaji [2] [1] RIKEN AICS, [2] Argonne National.
Copyright © 2009 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Principles of Parallel Programming First Edition by Calvin Lin Lawrence Snyder.
OpenMP in a Heterogeneous World Ayodunni Aribuki Advisor: Dr. Barbara Chapman HPCTools Group University of Houston.
Chapter 4: Threads. 4.2CSCI 380 Operating Systems Chapter 4: Threads Overview Multithreading Models Threading Issues Pthreads Windows XP Threads Linux.
Introduction to Pthreads. Pthreads Pthreads is a POSIX standard for describing a thread model, it specifies the API and the semantics of the calls. Model.
CASC This work was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under.
CGS 3763 Operating Systems Concepts Spring 2013 Dan C. Marinescu Office: HEC 304 Office hours: M-Wd 11: :30 AM.
Introduction, background, jargon Jakub Yaghob. Literature T.G.Mattson, B.A.Sanders, B.L.Massingill: Patterns for Parallel Programming, Addison- Wesley,
Chapter 4: Threads.
Sam Uselton Center for Applied Scientific Computing Lawrence Livermore National Lab October 25, 2001 Challenges for Remote Visualization: Remote Viz Is.
Martin Schulz Center for Applied Scientific Computing Lawrence Livermore National Laboratory Lawrence Livermore National Laboratory, P. O. Box 808, Livermore,
Today’s topic Pthread Some materials and figures are obtained from the POSIX threads Programming tutorial at
Threads and Thread Control Thread Concepts Pthread Creation and Termination Pthread synchronization Threads and Signals.
Programming with POSIX* Threads Intel Software College.
ASC Tri-Lab Code Development Tools Workshop Thursday, July 29, 2010 Lawrence Livermore National Laboratory, P. O. Box 808, Livermore, CA This work.
Chapter 2 Processes and Threads Introduction 2.2 Processes A Process is the execution of a Program More specifically… – A process is a program.
Professor: Shu-Ching Chen TA: Samira Pouyanfar.  An independent stream of instructions that can be scheduled to run  A path of execution int a, b; int.
Threads CSCE Thread Motivation Processes are expensive to create. Context switch between processes is expensive Communication between processes.
1 Pthread Programming CIS450 Winter 2003 Professor Jinhua Guo.
Lecture 5: Threads process as a unit of scheduling and a unit of resource allocation processes vs. threads what to program with threads why use threads.
CS 838: Pervasive Parallelism Introduction to pthreads Copyright 2005 Mark D. Hill University of Wisconsin-Madison Slides are derived from online references.
Department of Computer Science and Software Engineering
12/22/ Thread Model for Realizing Concurrency B. Ramamurthy.
Lawrence Livermore National Laboratory BRdeS-1 Science & Technology Principal Directorate - Computation Directorate How to Stop Worrying and Learn to Love.
Operating Systems Inter-Process Communications. Lunch time in the Philosophy Department. Dining Philosophers Problem (1)
Bronis R. de Supinski and Jeffrey S. Vetter Center for Applied Scientific Computing August 15, 2000 Umpire: Making MPI Programs Safe.
Code Motion for MPI Performance Optimization The most common optimization in MPI applications is to post MPI communication earlier so that the communication.
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science Computer Systems Principles Synchronization Emery Berger and Mark Corner University.
1 OS Review Processes and Threads Chi Zhang
CSC Multiprocessor Programming, Spring, 2012 Chapter 8 – Applying Thread Pools Dr. Dale E. Parson, week 10.
The Globus eXtensible Input/Output System (XIO): A protocol independent IO system for the Grid Bill Allcock, John Bresnahan, Raj Kettimuthu and Joe Link.
Threads A thread is an alternative model of program execution
PThread Synchronization. Thread Mechanisms Birrell identifies four mechanisms commonly used in threading systems –Thread creation –Mutual exclusion (mutex)
LLNL-PRES This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
NCHU System & Network Lab Lab #6 Thread Management Operating System Lab.
Low Overhead Real-Time Computing General Purpose OS’s can be highly unpredictable Linux response times seen in the 100’s of milliseconds Work around this.
Thread Basic Thread operations include thread creation, termination, synchronization, data management Threads in the same process share:  Process address.
OpenMP Runtime Extensions Many core Massively parallel environment Intel® Xeon Phi co-processor Blue Gene/Q MPI Internal Parallelism Optimizing MPI Implementation.
Processes Chapter 3. Processes in Distributed Systems Processes and threads –Introduction to threads –Distinction between threads and processes Threads.
CMSC 421 Spring 2004 Section 0202 Part II: Process Management Chapter 5 Threads.
Community Grids Laboratory
Threads Threads.
Multithreading Tutorial
Performance Evaluation of Adaptive MPI
L21: Putting it together: Tree Search (Ch. 6)
Chapter 4: Threads.
Multithreading Tutorial
Threads and Concurrency
Multithreading Tutorial
Multithreading Tutorial
CS 5204 Operating Systems Lecture 5
Presentation transcript:

Bronis R. de Supinski and John May Center for Applied Scientific Computing March 18, 1999 Benchmarking pthreads

2 CASC Measuring thread performance Threading options —pthreads —OpenMP —Threaded libraries How to choose —Ease of use —Performance No threads benchmark suites!!

3 CASC Using SKaMPI Advantages —Extensible interface —Data collection and test management facilities —Will support mixed models benchmarks Drawbacks —Times single measurement action —Relies on MPI —Design flaws Challenges —Termination —CPU bindings

4 CASC Thread creation Measuring thread creation cost is difficult —Number of threads is limited —Overhead —Solution: recursive thread creation Using default attributes:  s Eventually will vary: —Detached state —Contention scope —Stack size?

5 CASC Context switch Bind threads to same CPU Repeatedly call sched_yield Running thread alternates quickly Measured time: 4.4  s

6 CASC Thread communication LogP model —Latency (wire time) —Overheads —Gap Measure —Ping-pong time: 2*(o S + L + o R ) —Repeatedly “sending”: o S —Repeatedly “receiving”: o R —Calculate L —Applicability of gap? oSoS L oRoR g

7 CASC Conditions Operations —pthread_cond_signal —pthread_cond_wait Associated mutex Ping-pong measurements —Unbound: 48.9  s (Sun: 25.0  s) —Same CPU: 29.2  s (Sun: 25.7  s) —Different CPUs: 74.3  s (Sun: 25.2  s) Overheads —Signal:  s —Wait: Different CPUs: 45.7  s

8 CASC Mutexes Ping-pong obstacles —Non-determinism —Idempotency required Measurements —Unbound: 3.7  s —Same CPU: 37.8  s —Different CPUs: 3.7  s —No contention:  s Thread 0 Unlock 0 Thread 1 Lock 0 Unlock 1Lock 1 Unlock 2Lock 2 Unlock 3Lock 3 Unlock 1Lock 1 Unlock 0Lock 0 Unlock 3Lock 3 Unlock 2Lock 2

9 CASC Summary SKaMPI modifications —Fills threads microbenchmark void —Can extend to mixed models Basic aspects of pthreads performance —Creation —Context switch —Timeslice —Communication IBM’s implementations —Mutexes are good —Conditions could be improved Plan to add other tests, variations

10 CASC Work performed under the auspices of the U. S. Department of Energy by Lawrence Livermore National Laboratory under Contract W-7405-Eng-48 UCRL-MI