Multi-core Programming: Basic Concepts. Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered.


Multi-core Programming: Basic Concepts

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Objectives

After completion of this module you will be familiar with the basic concepts of:
- threads
- multithreaded programming

Note: Threads will be “agents” doing work; the slides themselves use no code examples.

Agenda

- Basics
- Design concepts
- Correctness concepts
- Performance concepts

Agenda

- Basics
  - Processes and threads
  - Parallelism and concurrency
- Design concepts
- Correctness concepts
- Performance concepts

Why Use Threads

Benefits
- Increased performance
- Easy method to take advantage of multi-core
- Better resource utilization
  - Reduce latency (even on single processor systems)
- Efficient data sharing
  - Sharing data through memory more efficient than message-passing

Risks
- Increases complexity of application
- Difficult to debug (data races, deadlocks, etc.)

Processes and Threads

- Modern operating systems load programs as processes
  - Resource holder
  - Execution
- A process starts executing at its entry point as a thread
- Threads can create other threads within the process
  - Each thread gets its own stack
- All threads within a process share code & data segments

[Diagram: a process with one code segment and one data segment, shared by thread main() and the other threads; each thread has its own stack]
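The sharing rules above can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the names `run_workers`, `worker`, `shared_log`, and `local_total` are hypothetical. Each thread keeps its own local variable (on its own stack) while all threads append to one list that belongs to the process.

```python
import threading

def run_workers(count=4):
    shared_log = []              # one object, visible to every thread in the process
    log_lock = threading.Lock()  # guards appends to shared_log

    def worker(worker_id):
        # local_total is a local variable: it lives on this thread's own stack
        local_total = sum(range(worker_id * 10))
        with log_lock:
            shared_log.append((worker_id, local_total))

    # The main thread creates the other threads within the same process
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(shared_log)

if __name__ == "__main__":
    print(run_workers())
```

Sorting the result hides the order in which the threads happened to finish, which is not guaranteed.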

Concurrency vs. Parallelism

- Concurrency: two or more threads are in progress at the same time
- Parallelism: two or more threads are executing at the same time
  - Multiple cores needed

[Diagram: timelines of Thread 1 and Thread 2 for each case]

Agenda

- Basics
- Design concepts
  - Threading for functionality or performance?
  - Threading for throughput or turnaround?
  - Decomposing the work
- Correctness concepts
- Performance concepts

Threading for Functionality

- Assign threads to separate functions done by application
  - Easiest method since overlap is unlikely
- Example: Building a house
  - Bricklayer, carpenter, roofer, plumber, …

Threading for Performance

- Increase the performance of computations
- Thread in order to improve turnaround or throughput
- Examples
  - Automobile assembly line
    - Each worker does an assigned function
  - Searching for pieces of Skylab
    - Divide up area to be searched
  - US Postal Service
    - Post office branches, mail sorters, delivery

Turnaround

- Complete single task in the smallest amount of time
- Example: Setting a dinner table
  - One to put down plates
  - One to fold and place napkins
  - One to place utensils
    - Spoons, knives, forks
  - One to place glasses

Throughput

- Complete the most tasks in a fixed amount of time
- Example: Setting up banquet tables
  - Multiple waiters each do separate tables
  - Specialized waiters for plates, glasses, utensils, etc.

Task Decomposition

- Divide computation based on natural set of independent tasks
  - Assign data for each task as needed
- Example: Paint-by-Numbers
  - Painting a single color is a single task
  - Number of tasks = number of colors
  - Two artists: one does even, other odd
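The paint-by-numbers example might look like this in code. It is a rough sketch under stated assumptions: `paint_color`, `artist`, and `paint_by_numbers` are hypothetical names, and "painting" is reduced to marking a slot in a list. One task corresponds to one color, and two threads split the tasks into even and odd color numbers.

```python
import threading

def paint_color(canvas, color):
    # One task: paint every region of a single color (here, just mark it done)
    canvas[color] = "painted"

def artist(canvas, colors):
    # An artist works through the list of tasks assigned to them
    for color in colors:
        paint_color(canvas, color)

def paint_by_numbers(num_colors=8):
    canvas = [None] * num_colors          # one slot per color; slots never overlap
    even = range(0, num_colors, 2)        # tasks for the first artist
    odd = range(1, num_colors, 2)         # tasks for the second artist
    artists = [
        threading.Thread(target=artist, args=(canvas, even)),
        threading.Thread(target=artist, args=(canvas, odd)),
    ]
    for t in artists:
        t.start()
    for t in artists:
        t.join()
    return canvas

if __name__ == "__main__":
    print(paint_by_numbers())
```

Because each task writes only its own slot, the two artists never touch the same data and no locking is needed in this sketch.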

Data Decomposition

- Large data sets whose elements can be computed independently
  - Divide data and associated computation among threads
- Example: Grading test papers
  - Multiple graders with same key
  - What if different keys are needed?
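As a hedged illustration of the grading example, the sketch below hands independent papers to a pool of grader threads that all read the same answer key. `ANSWER_KEY`, `grade`, and `grade_all` are hypothetical names chosen for this example; the pool from `concurrent.futures` is one convenient way to divide the data, not the only one.

```python
from concurrent.futures import ThreadPoolExecutor

ANSWER_KEY = ["a", "c", "b", "d"]   # hypothetical key, read by every grader

def grade(paper):
    # Score one paper; no paper depends on any other paper
    return sum(1 for given, correct in zip(paper, ANSWER_KEY) if given == correct)

def grade_all(papers, graders=3):
    # The pool divides the papers among the grader threads;
    # map() returns the scores in the same order as the papers
    with ThreadPoolExecutor(max_workers=graders) as pool:
        return list(pool.map(grade, papers))

if __name__ == "__main__":
    stack = [["a", "c", "b", "d"], ["a", "a", "a", "a"], ["d", "c", "b", "a"]]
    print(grade_all(stack))
```

If different papers needed different keys, the key would have to travel with each paper instead of being a single shared constant.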

Agenda

- Basics
- Design concepts
- Correctness concepts
  - Race Conditions and Synchronization
  - Deadlock
- Performance concepts

Race Conditions

- Threads “race” against each other for resources
  - Execution order is assumed but cannot be guaranteed
- Storage conflict is most common
  - Concurrent access of same memory location by multiple threads
  - At least one thread is writing
- Example: Musical Chairs
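A real race is timing-dependent and hard to reproduce on demand, so the sketch below replays one unlucky interleaving by hand in ordinary sequential code. It is an illustration only: `lost_update` and the variables `a_read` and `b_read` are hypothetical, and the "threads" A and B are simulated rather than real.

```python
def lost_update():
    counter = 0                 # the shared memory location

    # Both "threads" want to do: counter = counter + 1
    a_read = counter            # thread A reads 0
    b_read = counter            # thread B reads 0 before A has written
    counter = a_read + 1        # thread A writes 1
    counter = b_read + 1        # thread B also writes 1, overwriting A's update

    return counter              # two increments were requested, one was lost

if __name__ == "__main__":
    print(lost_update())
```

The program assumed that A would finish before B started, but nothing enforced that order.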

Mutual Exclusion

- Critical Region
  - Portion of code that accesses (reads & writes) shared variables
- Mutual Exclusion
  - Program logic to enforce single thread access to critical region
  - Enables correct programming structures for avoiding race conditions
- Example: Safe Deposit box
  - Attendants ensure mutual exclusion
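A minimal sketch of mutual exclusion, assuming Python's standard `threading.Lock` as the enforcing mechanism (the slides do not prescribe one). The function name `count_with_lock` is hypothetical. The critical region is the read-modify-write of the shared counter.

```python
import threading

def count_with_lock(num_threads=4, increments=10_000):
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            with lock:          # only one thread at a time may enter
                counter += 1    # critical region: read, add, write back

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(count_with_lock())
```

With the lock in place, the final count equals `num_threads * increments` regardless of how the threads are scheduled.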

Synchronization

- Synchronization objects used to enforce mutual exclusion
  - Lock, semaphore, critical section, event, condition variable, atomic
- One thread “holds” sync. object; other threads must wait
- When done, holding thread releases object; some waiting thread given object
- Example: Library book
  - One patron has book checked out
  - Others must wait for book to return
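The library-book example could be sketched with a semaphore as the synchronization object. This is an assumption-laden illustration: `lend_book`, `patron`, and the `copies` parameter are hypothetical, and the short sleep merely stands in for reading. The function reports the largest number of patrons that ever held the book at once.

```python
import threading
import time

def lend_book(patrons=5, copies=1):
    book = threading.Semaphore(copies)   # the synchronization object
    state_lock = threading.Lock()        # protects the two counters below
    holders = 0
    max_holders = 0

    def patron():
        nonlocal holders, max_holders
        with book:                       # wait here until a copy is available
            with state_lock:
                holders += 1
                max_holders = max(max_holders, holders)
            time.sleep(0.01)             # "reading" while holding the book
            with state_lock:
                holders -= 1
        # leaving the 'with book' block returns the copy for a waiting patron

    threads = [threading.Thread(target=patron) for _ in range(patrons)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_holders

if __name__ == "__main__":
    print(lend_book())
```

With a single copy the semaphore behaves like a lock: one patron holds the book and everyone else waits.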

Barrier Synchronization

- Threads pause at execution point
  - Threads waiting are idle; overhead
- When all threads arrive, all are released
- Example: Race starting line
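The starting-line example maps directly onto Python's `threading.Barrier`. The sketch below is illustrative only, with hypothetical names `race_start` and `runner`. Every runner records its arrival, waits at the barrier, and records its release once all runners have arrived.

```python
import threading

def race_start(runners=4):
    events = []
    events_lock = threading.Lock()
    start_line = threading.Barrier(runners)   # releases when 'runners' threads wait

    def runner(name):
        with events_lock:
            events.append(("arrived", name))
        start_line.wait()                     # idle here until everyone has arrived
        with events_lock:
            events.append(("released", name))

    threads = [threading.Thread(target=runner, args=(i,)) for i in range(runners)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events

if __name__ == "__main__":
    for event in race_start():
        print(event)
```

In the recorded log, every "arrived" entry precedes every "released" entry, although the order of runners within each group may vary.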

Deadlock

- Threads wait for some event or condition that will never happen
- Example: Traffic jam at intersection
  - Cars unable to turn or back up
- What is Livelock?
  - Threads change state in response to each other
  - Example: Robin Hood and Little John on log bridge
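A true deadlock would hang the program, so the sketch below uses a timeout to make the situation observable. Treat it as an illustration under assumptions: `demonstrate_deadlock` and `car` are hypothetical names, and the timeout exists only so the demonstration can finish. Two threads each hold one lock and then ask for the other's lock; neither request can ever succeed while both keep what they hold.

```python
import threading

def demonstrate_deadlock(timeout=0.2):
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    both_hold_one = threading.Barrier(2)   # both cars are in the intersection
    both_tried = threading.Barrier(2)      # both attempts have finished
    got_second = [None, None]

    def car(index, first, second):
        with first:
            both_hold_one.wait()
            # Without the timeout this call would block forever
            acquired = second.acquire(timeout=timeout)
            if acquired:
                second.release()
            both_tried.wait()              # keep 'first' held until both have tried
        got_second[index] = acquired

    threads = [
        threading.Thread(target=car, args=(0, lock_a, lock_b)),
        threading.Thread(target=car, args=(1, lock_b, lock_a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second

if __name__ == "__main__":
    print(demonstrate_deadlock())
```

Neither thread obtains its second lock. One common remedy, which the slides do not cover, is to have every thread acquire locks in the same fixed order.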

Agenda

- Basics
- Design concepts
- Correctness concepts
- Performance concepts
  - Speedup and Efficiency
  - Granularity and load balance

Speedup (Simple)

- Measure of how much faster the computation executes versus the best serial code
  - Serial time divided by parallel time
- Example: Painting a picket fence
  - 30 minutes of preparation (serial)
  - One minute to paint a single picket
  - 30 minutes of cleanup (serial)
  - Thus, 300 pickets takes 360 minutes (serial time)

Computing Speedup

- What if fence owner uses spray gun to paint 300 pickets in one hour?
  - Better serial algorithm
  - If no spray guns are available for multiple workers, what is maximum parallel speedup?

Number of painters   Time (minutes)         Speedup
1                    30 + 300 + 30 = 360    1.0X
2                    30 + 150 + 30 = 210    1.7X
10                   30 + 30 + 30 = 90      4.0X
100                  30 + 3 + 30 = 63       5.7X
Infinite             30 + 0 + 30 = 60       6.0X

- Illustrates Amdahl’s Law
  - Potential speedup is restricted by serial portion
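The fence figures can be written out as a short worked derivation. The symbols are assumptions introduced here for illustration: n is the number of painters, S(n) the speedup, and s the serial fraction of the 360-minute job (60 minutes of preparation and cleanup).

```latex
S(n) = \frac{T_{\mathrm{serial}}}{T_{\mathrm{parallel}}(n)}
     = \frac{30 + 300 + 30}{30 + \frac{300}{n} + 30}
     = \frac{360}{60 + \frac{300}{n}},
\qquad
\lim_{n \to \infty} S(n) = \frac{360}{60} = 6

s = \frac{60}{360} = \frac{1}{6},
\qquad
S(n) = \frac{1}{s + \frac{1 - s}{n}}
```

Both forms give the same values, for example S(10) = 360 / 90 = 4, and both show that the serial 60 minutes cap the speedup at 6 no matter how many painters are added.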

Efficiency

- Measure of how effectively computation resources (threads) are kept busy
  - Speedup divided by number of threads
  - Expressed as average percentage of non-idle time

Number of painters   Time (minutes)         Speedup   Efficiency
1                    360                    1.0X      100%
2                    30 + 150 + 30 = 210    1.7X      85%
10                   30 + 30 + 30 = 90      4.0X      40%
100                  30 + 3 + 30 = 63       5.7X      5.7%
Infinite             30 + 0 + 30 = 60       6.0X      very low
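The speedup and efficiency figures for the picket fence can be recomputed with a few lines of arithmetic. The sketch below is only a calculator for the example's numbers; the function and constant names are hypothetical.

```python
PREP = 30          # minutes of serial preparation
CLEANUP = 30       # minutes of serial cleanup
PICKETS = 300      # pickets to paint
PER_PICKET = 1     # minutes for one painter to paint one picket

def parallel_time(painters):
    # Only the painting itself is shared among the painters
    return PREP + PICKETS * PER_PICKET / painters + CLEANUP

def speedup(painters):
    # Serial time divided by parallel time
    return parallel_time(1) / parallel_time(painters)

def efficiency(painters):
    # Speedup divided by the number of painters (threads)
    return speedup(painters) / painters

if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        print(n, parallel_time(n), round(speedup(n), 1), f"{efficiency(n):.1%}")
```

Unrounded, two painters reach about 85.7% efficiency; the 85% quoted for that row corresponds to dividing the rounded speedup 1.7 by 2.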

Granularity

- Loosely defined as the ratio of computation to synchronization
- Be sure there is enough work to merit parallel computation
- Example: Two farmers divide a field. How many more farmers can be added?

Load Balance

- Most effective distribution is to have equal amounts of work per thread
  - Threads that finish first sit idle
  - Threads should finish close to same time
- Example: Busing banquet tables
  - Better to assign same number of tables to each bus person
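The cost of an uneven split can be estimated with simple arithmetic. In the sketch below, `finish_time`, `idle_time`, and the five minutes per table are hypothetical values chosen for illustration; the point is that the job ends only when the busiest bus person finishes, and everyone else waits.

```python
def finish_time(tables_per_person, minutes_per_table=5):
    # The whole job is done only when the most loaded bus person is done
    return max(tables_per_person) * minutes_per_table

def idle_time(tables_per_person, minutes_per_table=5):
    # Total minutes that bus people spend waiting for the slowest one
    done = finish_time(tables_per_person, minutes_per_table)
    return sum(done - tables * minutes_per_table for tables in tables_per_person)

if __name__ == "__main__":
    for split in ([4, 4, 4], [8, 2, 2]):
        print(split, finish_time(split), idle_time(split))
```

Both splits cover twelve tables, but the uneven one takes twice as long and leaves two bus people idle for most of that time.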

What’s Been Covered

- Processes and threads
- Parallelism and concurrency
- Design concepts
  - Threading for throughput or turnaround?
  - Threading for functionality or performance?
  - Decomposing the work
- Correctness concepts
  - Race Conditions and Synchronization
  - Deadlock
- Performance concepts
  - Speedup and Efficiency
  - Granularity and load balance


Alternative Slides

Turnaround

- Complete single task in the smallest amount of time
- Example: Decorating a cake
  - Frosting
    - One to frost top, one to frost sides
  - Decoration
    - One to do borders
    - One to place flowers
    - One to write “Happy Birthday”

Throughput

- Complete the most tasks in a fixed amount of time
- Example: Decorating cakes
  - Multiple bakers doing entire decoration
  - Assembly line of frosters and decorators

Turnaround

- Complete single task in the smallest amount of time
- Example: Parents to arrive in one hour
  - Hugues cleans living room
  - Otto cleans bathroom
  - Thomas cleans kitchen

Throughput

- Complete the most tasks in a fixed amount of time
- Example: Produce pins
  - “One man draws out the wire, another straights it, a third cuts it, a fourth points it, a fifth grinds it at the top for receiving the head; …” (A. Smith 1776)

Threading for Functionality

- Assign threads to separate functions done by application
  - Easiest method since overlap is unlikely
- Example: Coaching a football team
  - Head coach, position coaches, special teams coach
