CSCE 513 Computer Architecture

CSCE 513 Computer Architecture Lecture 14 Speculation One more time and Coarse Grain Thread Parallelism (Posix Threads) Topics Readings: Chapter 3 – ROB, Branch target buffers Gustafson's Law Chapter 4 Data Parallelism Posix threads October 30, 2017

Overview Last Time: Readings: Chapter 3, Reorder Buffers; previous slides on the web on ROB … Today's Lecture: Test 2 – November 23 !!!!! Readings: Chapter 3 – ROB one more time, Gustafson's Law, Speculation revisited, Branch Target Buffers, Interleaved memory; Thread Level Parallelism – POSIX - https://computing.llnl.gov/tutorials/pthreads/ , Sections 4.1-4.2

Gustafson's Law - motivation With P processors, just how much speedup can you obtain? Amdahl's Law answers this for a fixed problem size, but that is not always what we are really interested in. A better way to look at it: with my new parallel capability, how much bigger a problem can I solve in the same amount of time? http://en.wikipedia.org/wiki/Gustafson%27s_law
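For reference, Amdahl's fixed-workload speedup (the standard formula, with f the parallelizable fraction and P the number of processors):

\[
S_{\text{Amdahl}}(P) \;=\; \frac{1}{(1-f) + f/P} \;\le\; \frac{1}{1-f}
\]

As P grows, the speedup is capped by the serial fraction; Gustafson's Law instead asks how much larger a workload fits in the same wall-clock time.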

Gustafson's Law - notation F_enhanced – fraction enhanced = fraction parallelizable; the other portion is the serial fraction α = (1 - F_parallelizable); P – number of processors. Normalize the time for a typical problem on the parallel machine to α + (1 - α) = 1. In that same time, the amount of work that can be performed with P processors is α + (1 - α)·P, which is the scaled (Gustafson) speedup. http://en.wikipedia.org/wiki/Gustafson%27s_law

Gustafson's Law “It is based on the idea that if the problem size is allowed to grow monotonically with P, then the sequential fraction of the workload would not ultimately come to dominate.” Suppose you are multiplying n x n matrices. Freq_parallel = 80% - assume it is constant as n increases. (Note: since I/O is O(n²) and multiplying is O(n³), in practice it is better than this.) If a 100 x 100 matrix can be multiplied in 1 second with one core, what size matrix can be multiplied in 1 second with 100 processors, assuming no communication overhead? http://en.wikipedia.org/wiki/Gustafson%27s_law
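One way to work the example out, assuming (as the slide does) that the 80% parallel fraction stays fixed as n grows:

\[
W(100) = 0.2 + 0.8 \times 100 = 80.2 \quad\text{(relative work doable in the same 1 second)}
\]
\[
\left(\frac{n}{100}\right)^{3} = 80.2 \;\Rightarrow\; n = 100 \cdot \sqrt[3]{80.2} \approx 431
\]

So roughly a 431 x 431 matrix can be multiplied in the same one second, and the slide's note that the serial I/O part shrinks as O(n²)/O(n³) means the true answer would be somewhat larger still.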

Branch Prediction for Speculation
Multiple issue – delivering 4 to 8 instructions every cycle
Increase instruction delivery bandwidth
Multiple paths, multiplexers for selecting …
Handling branches – review, classical 5-stage pipeline:
Predict branch not taken
Calculate target address and branch decision in the Execute stage
On misprediction, turn instructions in the pipeline into “NOPs” = bubbles
Extra adder for the target address in Decode (saves 2 cycles)
Branch delay slot – schedule something that will be done regardless, to follow the branch
2-bit branch history table – indexed by the address of the branch (not the target)
Correlating branch predictors
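A minimal sketch of the 2-bit saturating-counter branch history table the slide refers to; the table size, the PC hashing, and the counter encoding below are illustrative assumptions, not details from the lecture:

#include <stdint.h>
#include <stdbool.h>

#define BHT_ENTRIES 1024                 /* assumed table size (power of two) */

/* 2-bit saturating counters: 0,1 = predict not taken; 2,3 = predict taken */
static uint8_t bht[BHT_ENTRIES];

static unsigned bht_index(uint64_t branch_pc) {
    /* index by the branch's own address (not its target) */
    return (unsigned)((branch_pc >> 2) & (BHT_ENTRIES - 1));
}

bool predict_taken(uint64_t branch_pc) {
    return bht[bht_index(branch_pc)] >= 2;
}

void update_predictor(uint64_t branch_pc, bool taken) {
    uint8_t *ctr = &bht[bht_index(branch_pc)];
    if (taken  && *ctr < 3) (*ctr)++;    /* saturate at 3 (strongly taken)     */
    if (!taken && *ctr > 0) (*ctr)--;    /* saturate at 0 (strongly not taken) */
}

Each branch's counter must be wrong twice in a row before the prediction flips, which is what makes the 2-bit scheme better than a 1-bit one on loop branches.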

Creating a Data Memory Trace in C

Links: Threads, Unix Processes, https://computing.llnl.gov/tutorials/pthreads/ http://en.wikipedia.org/wiki/POSIX_Threads http://download.oracle.com/javase/tutorial/essential/concurrency/procthread.html http://www.cis.temple.edu/~ingargio/cis307/readings/system-commands.html http://www.yolinux.com/TUTORIALS/LinuxTutorialPosixThreads.html

Unix System Related Commands ps, kill, top, nice, jobs, fg, bg, lscpu, /proc https://computing.llnl.gov/tutorials/pthreads/

What is a Thread? https://computing.llnl.gov/tutorials/pthreads/

Threads in the Unix Environment A thread:
Exists within a process and uses the process resources
Has its own independent flow of control as long as its parent process exists and the OS supports it
Duplicates only the essential resources it needs to be independently schedulable
May share the process resources with other threads that act equally independently (and dependently)
Dies if the parent process dies - or something similar
Is "lightweight" because most of the overhead has already been accomplished through the creation of its process
https://computing.llnl.gov/tutorials/pthreads/

Threads Sharing of Data Because threads within the same process share resources: Changes made by one thread to shared system resources (such as closing a file) will be seen by all other threads. Two pointers having the same value point to the same data. Reading and writing to the same memory locations is possible, and therefore requires explicit synchronization by the programmer. https://computing.llnl.gov/tutorials/pthreads/
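A minimal sketch of why the last point matters; the shared counter, thread count, and iteration count are invented for illustration. Two threads updating the same global without synchronization can lose updates:

#include <pthread.h>
#include <stdio.h>

static long counter = 0;                  /* shared by both threads */

static void *bump(void *arg) {
    for (int i = 0; i < 1000000; ++i)
        counter++;                        /* read-modify-write: not atomic */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump, NULL);
    pthread_create(&t2, NULL, bump, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);   /* usually far less than 2000000 */
    return 0;
}

The mutex routines later in these slides are the standard way to make the increment safe.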

Sharing of Data between Processes

What are Pthreads? Why Pthreads? What? POSIX is an acronym for Portable Operating System Interface; Pthreads are the standardized POSIX threads API for C programs. Why? The primary motivation for using Pthreads is to realize potential program performance gains. https://computing.llnl.gov/tutorials/pthreads/

Example /* http://en.wikipedia.org/wiki/POSIX_Threads */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

#define NUM_THREADS 5
https://computing.llnl.gov/tutorials/pthreads/

void *TaskCode(void *argument) {
    int tid;
    tid = *((int *) argument);
    printf("Hello World! It's me, thread %d!\n", tid);
    /* optionally: insert more useful stuff here */
    return NULL;
}

int main (int argc, char *argv[]) {
    pthread_t threads[NUM_THREADS];
    int thread_args[NUM_THREADS];
    int rc, i;
https://computing.llnl.gov/tutorials/pthreads/

    /* create all threads */
    for (i = 0; i < NUM_THREADS; ++i) {
        thread_args[i] = i;
        printf("In main: creating thread %d\n", i);
        rc = pthread_create(&threads[i], NULL, TaskCode, (void *) &thread_args[i]);
        assert(0 == rc);
    }

    /* wait for all threads to complete */
    for (i = 0; i < NUM_THREADS; ++i) {
        rc = pthread_join(threads[i], NULL);
        assert(0 == rc);
    }

    exit(EXIT_SUCCESS);
}
https://computing.llnl.gov/tutorials/pthreads/

Designing Threaded Programs
What type of parallel programming model to use?
Problem partitioning
Load balancing
Communications
Data dependencies
Synchronization and race conditions
Memory issues
I/O issues
Program complexity
Programmer effort/costs/time
https://computing.llnl.gov/tutorials/pthreads/

Scheduling Independent Routines https://computing.llnl.gov/tutorials/pthreads/

Programs suitable for Multithreading
Work that can be executed, or data that can be operated on, by multiple tasks simultaneously
Block for potentially long I/O waits
Use many CPU cycles in some places but not others
Must respond to asynchronous events
Some work is more important than other work (priority interrupts)
https://computing.llnl.gov/tutorials/pthreads/

Client-Server Applications

Common models for threaded programs
Manager/worker: a single thread, the manager, assigns work to other threads, the workers. Typically, the manager handles all input and parcels out work to the other tasks. At least two forms of the manager/worker model are common: static worker pool and dynamic worker pool.
Pipeline: a task is broken into a series of suboperations, each of which is handled in series, but concurrently, by a different thread.
Peer: similar to the manager/worker model, but after the main thread creates other threads, it participates in the work.
https://computing.llnl.gov/tutorials/pthreads/

Shared Memory Model https://computing.llnl.gov/tutorials/pthreads/

Thread-safeness Thread-safeness: in a nutshell, refers to an application's ability to execute multiple threads simultaneously without "clobbering" shared data or creating "race" conditions. For example, suppose that your application creates several threads, each of which makes a call to the same library routine: this library routine accesses/modifies a global structure or location in memory. As each thread calls this routine, they may try to modify this global structure/memory location at the same time. If the routine does not employ some sort of synchronization construct to prevent data corruption, then it is not thread-safe. https://computing.llnl.gov/tutorials/pthreads/
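A small illustrative sketch (the routine and buffer are invented, not from the LLNL tutorial) of the kind of library call the paragraph describes:

/* NOT thread-safe: every caller shares one static buffer. */
#include <stdio.h>

static char scratch[64];

char *format_id(int id) {
    sprintf(scratch, "id-%d", id);   /* two threads can interleave these writes */
    return scratch;                  /* returned pointer aliases shared state   */
}

A thread-safe version would either take a caller-supplied buffer or protect the shared state with a mutex.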

The Pthreads API Routine prefix – Functional group
pthread_ – Threads themselves and miscellaneous subroutines
pthread_attr_ – Thread attributes objects
pthread_mutex_ – Mutexes
pthread_mutexattr_ – Mutex attributes objects
pthread_cond_ – Condition variables
pthread_condattr_ – Condition attributes objects
pthread_key_ – Thread-specific data keys
pthread_rwlock_ – Read/write locks
pthread_barrier_ – Synchronization barriers
https://computing.llnl.gov/tutorials/pthreads/

Compiling Threaded Programs
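The compile command depends on the platform and compiler (the LLNL tutorial lists several); with GCC on Linux something like the following works, where the file and output names are just examples:

gcc -o thread1 thread1.c -pthread

The -pthread flag both sets the appropriate preprocessor defines and links against the pthreads library.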

Creating and Terminating Threads https://computing.llnl.gov/tutorials/pthreads/
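For reference, the standard POSIX prototype for thread creation (from <pthread.h>; this is the portable signature, not something specific to these slides):

#include <pthread.h>

int pthread_create(pthread_t *thread,              /* receives the new thread's ID */
                   const pthread_attr_t *attr,     /* NULL for default attributes  */
                   void *(*start_routine)(void *), /* function the thread runs     */
                   void *arg);                     /* single argument passed to it */

It returns 0 on success and an error number otherwise, as in the example program later in these slides.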

Thread Termination https://computing.llnl.gov/tutorials/pthreads/
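A thread normally terminates by returning from its start routine or by calling pthread_exit(); the standard POSIX prototypes (again not slide-specific) are:

#include <pthread.h>

void pthread_exit(void *value_ptr);                     /* end calling thread, pass back value_ptr */
int  pthread_join(pthread_t thread, void **value_ptr);  /* wait for a thread, collect its value    */
int  pthread_detach(pthread_t thread);                  /* reclaim resources automatically on exit */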

Mutex Variables
pthread_mutex_lock (mutex)
pthread_mutex_trylock (mutex)
pthread_mutex_unlock (mutex)
https://computing.llnl.gov/tutorials/pthreads/

The pthread_mutex_lock() routine is used by a thread to acquire a lock on the specified mutex variable. If the mutex is already locked by another thread, this call will block the calling thread until the mutex is unlocked. https://computing.llnl.gov/tutorials/pthreads/

pthread_mutex_trylock() will attempt to lock a mutex. However, if the mutex is already locked, the routine will return immediately with a "busy" error code. This routine may be useful in preventing deadlock conditions, as in a priority-inversion situation. pthread_mutex_unlock() will unlock a mutex if called by the owning thread. Calling this routine is required after a thread has completed its use of protected data if other threads are to acquire the mutex for their work with the protected data. An error will be returned if the mutex was already unlocked or if the mutex is owned by another thread. https://computing.llnl.gov/tutorials/pthreads/
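A minimal sketch tying these routines together; it fixes the unsynchronized shared-counter example sketched earlier in these notes (the names are invented, the lock/update/unlock pattern is the standard idiom):

#include <pthread.h>
#include <stdio.h>

static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *bump(void *arg) {
    for (int i = 0; i < 1000000; ++i) {
        pthread_mutex_lock(&counter_lock);    /* blocks while another thread holds the lock */
        counter++;                            /* protected read-modify-write                */
        pthread_mutex_unlock(&counter_lock);  /* must be called by the owning thread        */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump, NULL);
    pthread_create(&t2, NULL, bump, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);       /* now reliably 2000000 */
    return 0;
}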

Pthread1.c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

#define NUM_THREADS 5

void *TaskCode(void *argument) {
    int tid;
    tid = *((int *) argument);
    printf("Hello World! It's me, thread %d!\n", tid);
    /* optionally: insert more useful stuff here */
    return NULL;
}

int main (int argc, char *argv[]) {
    pthread_t threads[NUM_THREADS];
    int thread_args[NUM_THREADS];
    int rc, i;

    /* create all threads */
    for (i = 0; i < NUM_THREADS; ++i) {
        thread_args[i] = i;
        printf("In main: creating thread %d\n", i);
        rc = pthread_create(&threads[i], NULL, TaskCode, (void *) &thread_args[i]);
        assert(0 == rc);
    }

    /* wait for all threads to complete */
    for (i = 0; i < NUM_THREADS; ++i) {
        rc = pthread_join(threads[i], NULL);
        assert(0 == rc);
    }

    exit(EXIT_SUCCESS);
}

saluda> ./thread1
In main: creating thread 0
In main: creating thread 1
Hello World! It's me, thread 0!
Hello World! It's me, thread 1!
In main: creating thread 2
In main: creating thread 3
Hello World! It's me, thread 2!
In main: creating thread 4
Hello World! It's me, thread 3!
Hello World! It's me, thread 4!

https://computing.llnl.gov/tutorials/pthreads/