Parallel Programming 0024 Mergesort Spring Semester 2010.


Outline  Discussion of last assignment  Presentation of new assignment – Introduction to Merge-Sort – Code Skeletons (see homepage) – Issues in Parallelizing Merge-Sort – Performance measurements  Questions/Comments?

Discussion of Homework 3

Part 2 – First Question  Why is it not sufficient to add the 'synchronized' keyword to the read() and write() methods to guarantee the specified behavior of the producer/consumer problem?  Solution: Synchronization ensures that the producer and the consumer cannot access the buffer at the same time. But it does not prevent the consumer from reading a value more than once, or the producer from overwriting a value that has not yet been read.
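One way to enforce the required alternation is to combine synchronized with wait()/notifyAll() guard loops, as in this minimal sketch (class and field names are ours, not the assignment skeleton's):

```java
// Minimal sketch: synchronized alone only serializes access; the
// wait()/notifyAll() guard loops additionally enforce strict alternation
// of producer and consumer.
class GuardedBuffer {
    private int value;
    private boolean full = false; // guard: is there an unread value?

    public synchronized void write(int v) throws InterruptedException {
        while (full) {   // producer must not overwrite an unread value
            wait();
        }
        value = v;
        full = true;
        notifyAll();     // wake up a waiting consumer
    }

    public synchronized int read() throws InterruptedException {
        while (!full) {  // consumer must not read the same value twice
            wait();
        }
        full = false;
        notifyAll();     // wake up a waiting producer
        return value;
    }
}
```

With one producer and one consumer thread, every written value is read exactly once, in order.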

Part 2 – Second Question  Would it be safe to use a boolean variable as a "guard" within the read() and write() methods instead of using the synchronized keyword?  Solution: No, because reading and writing a value is not atomic! – Can you tell me why, e.g., i++ is not atomic?
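i++ is really three operations: read i, add 1, write the result back. The sketch below replays one unlucky interleaving of two threads A and B as straight-line code (sequentially, so the outcome is deterministic) to show how an update gets lost:

```java
// i++ is three operations: read i, add 1, write i back.
// This sketch replays one unlucky interleaving of two "threads" A and B
// step by step to show how an increment gets lost.
class LostUpdateDemo {
    static int i = 0;

    static void unluckyInterleaving() {
        int a = i;  // A reads i         (i == 0, a == 0)
        int b = i;  // B reads i         (i == 0, b == 0)
        a = a + 1;  // A computes i + 1  (a == 1)
        b = b + 1;  // B computes i + 1  (b == 1)
        i = a;      // A writes back     (i == 1)
        i = b;      // B writes back     (i == 1: A's increment is lost!)
    }
}
```

After two increments i should be 2, but this interleaving leaves it at 1, and a boolean guard read/written the same non-atomic way has the same problem.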

Part 3 – First Question  Would it suffice to use a simple synchronized(this) within the run() method of each of the producer and the consumer to guard the updating of the buffer?  Solution: No. Since Producer and Consumer are different objects, the two synchronized(this) blocks lock different monitors  no mutual exclusion is guaranteed.
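The difference can be seen with a small counter sketch (names are ours, not from the assignment): mutual exclusion holds only when all threads synchronize on the same object, such as the shared buffer, instead of each thread locking on this:

```java
// Names are ours. Two threads increment a shared counter; because they
// both synchronize on the SAME lock object, the increments are mutually
// exclusive and none is lost. If each thread instead did
// synchronized (this), the locks would be different objects and the
// updates could race.
class SharedLockCounter {
    static final Object lock = new Object(); // one monitor for all threads
    static int count = 0;

    static Thread incrementer(int times) {
        return new Thread(() -> {
            for (int i = 0; i < times; i++) {
                synchronized (lock) { // same lock in every thread
                    count++;
                }
            }
        });
    }
}
```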

Part 3 – Second Question  What object should be used as the shared monitor (the object on which the threads are synchronized())?  Solution: The shared instance of UnsafeBuffer.  Question: What could you have used instead?

Part 3 – Third Question  What are the potential advantages/disadvantages of synchronizing the producer/consumer over synchronizing the buffer?  Advantages: – You can use arbitrary (even unsafe!) buffers. – You can do things in the Producer/Consumer that need to be done before the other thread can use the buffer (for example, print something to the console).  Disadvantages: – More work to do :-) – More error-prone

Presentation of Homework 4

MergeSort  Problem: Sort a given list 'l' of 'n' numbers  Example: Input: 5, 4, 3, 2, 1, 0 → Output: 0, 1, 2, 3, 4, 5  Algorithm: – Divide l into two sublists of size n/2 – Sort each sublist recursively by re-applying MergeSort – Merge the two sublists back into one sorted list  End of recursion: the size of a sublist becomes 1 (a one-element list is already sorted). If the recursion stops while a sublist still has size > 1, that sublist must be sorted by some other algorithm.
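The algorithm above can be sketched as a plain sequential Java implementation (method names are ours, not from the code skeletons):

```java
import java.util.Arrays;

// A minimal sequential merge sort following the three steps above.
class MergeSort {
    static int[] sort(int[] l) {
        if (l.length <= 1) return l; // end of recursion: size 1 is sorted
        int mid = l.length / 2;
        int[] left  = sort(Arrays.copyOfRange(l, 0, mid));        // divide
        int[] right = sort(Arrays.copyOfRange(l, mid, l.length)); // divide
        return merge(left, right);                                // combine
    }

    // Combine two sorted lists into one sorted list.
    static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length)      // take the smaller head
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        while (i < a.length) out[k++] = a[i++];   // copy leftovers
        while (j < b.length) out[k++] = b[j++];
        return out;
    }
}
```

For example, MergeSort.sort(new int[]{5, 4, 3, 2, 1, 0}) yields the sorted array 0, 1, 2, 3, 4, 5.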

Example: Divide into sublists
5, 4, 3, 2, 1, 0
5, 4, 3 | 2, 1, 0
5 | 4, 3 | 2 | 1, 0
5 | 4 | 3 | 2 | 1 | 0

Merging  Combine two sorted lists into one sorted list  Example: List 1: 0, 5; List 2: 3, 4, 45; Output: 0, 3, 4, 5, 45  Merging example: Create a list Output of size 5
0, 5 and 3, 4, 45: 0 < 3 → insert 0 in Output
0, 5 and 3, 4, 45: 3 < 5 → insert 3 in Output
0, 5 and 3, 4, 45: 4 < 5 → insert 4 in Output
0, 5 and 3, 4, 45: 5 < 45 → insert 5 in Output
Finally, insert 45 in Output

Example: Merging Sorted Sublists
5 | 4 | 3 | 2 | 1 | 0
5 | 3, 4 | 2 | 0, 1
3, 4, 5 | 0, 1, 2
0, 1, 2, 3, 4, 5

The Code Skeletons (Eclipse)

A Parallel MergeSort  Which operations can be done in parallel? – Sorting: each sublist can be sorted by a separate thread – Merging: two sorted sublists can be merged by a thread  Synchronization issues: Limitations in parallelization? A merge can only happen once the two sublists are sorted.  Performance issues: Number of threads? Size of the array to sort?
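A minimal sketch of the parallel step (names are ours, not the skeleton's): each half is sorted in its own thread, and join() is the synchronization point, since merging may only happen after both sublists are sorted.

```java
import java.util.Arrays;

// Sketch of the parallel step: sort each half in its own thread.
// join() is the synchronization point: merging may only happen once
// both sublists are sorted.
class ParallelMergeSort {
    static int[] sort(int[] l) throws InterruptedException {
        if (l.length <= 1) return l;
        int mid = l.length / 2;
        int[][] halves = new int[2][];
        Thread left = new Thread(() ->
            halves[0] = sortSequential(Arrays.copyOfRange(l, 0, mid)));
        Thread right = new Thread(() ->
            halves[1] = sortSequential(Arrays.copyOfRange(l, mid, l.length)));
        left.start();
        right.start();
        left.join();   // wait: merge needs both halves sorted
        right.join();
        return merge(halves[0], halves[1]);
    }

    static int[] sortSequential(int[] l) {
        if (l.length <= 1) return l;
        int mid = l.length / 2;
        return merge(sortSequential(Arrays.copyOfRange(l, 0, mid)),
                     sortSequential(Arrays.copyOfRange(l, mid, l.length)));
    }

    static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length)
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        while (i < a.length) out[k++] = a[i++];
        while (j < b.length) out[k++] = b[j++];
        return out;
    }
}
```

Here only the top level is parallel (two threads); the number of threads and the array size are exactly the performance parameters to experiment with.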

Load Balancing  What if size of array % numThreads != 0? – Simple (proposed) solution: assign the remaining elements to one thread – Balanced (more complicated) solution: distribute the remaining elements over several threads
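The balanced solution can be sketched as follows (names are ours): give each of the first n % numThreads threads one extra element, so that all chunk sizes differ by at most one.

```java
// Balanced partitioning sketch: for an array of length n and t threads,
// every thread gets n / t elements, and the first n % t threads get one
// extra, so chunk sizes differ by at most one.
class Partition {
    static int[] chunkSizes(int n, int t) {
        int[] sizes = new int[t];
        int base = n / t;   // every thread gets at least this many
        int extra = n % t;  // leftover elements to spread out
        for (int i = 0; i < t; i++)
            sizes[i] = base + (i < extra ? 1 : 0);
        return sizes;
    }
}
```

For example, 10 elements over 4 threads gives chunks of sizes 3, 3, 2, 2 instead of 2, 2, 2, 4.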

Performance Measurement  Measure the running time for combinations of: – # of threads: …, 1024? – array size: 100,000; 500,000; …; 10,000,000?

How to Measure Time?  System.currentTimeMillis() might not be exact: the granularity might be coarser than a millisecond, and it might be slightly inaccurate.  System.nanoTime(): nanosecond precision, but not nanosecond accuracy.  For our measurements, System.currentTimeMillis() is good enough.

How to Measure Time?
long start, end;
start = System.currentTimeMillis();
// some action
end = System.currentTimeMillis();
System.out.println("Time elapsed: " + (end - start));

Questions to Be Answered  Is the parallel version faster?  How many threads give the best performance?  What is the influence of the CPU model/CPU frequency?

The Harsh Realities of Parallelization  Ideally, upgrading from a uniprocessor to an n-way multiprocessor should provide an n-fold increase in computational power.  In the real world, most computations cannot be parallelized efficiently: sequential code, synchronization, and communication limit the gains.  Speedup = time(single processor) / time(n concurrent processors)
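As a small sketch (helper names are ours), speedup and the related efficiency (speedup divided by the number of processors) can be computed from two measured times:

```java
// Helper names are ours, for illustration only.
class Speedup {
    // speedup = time(single processor) / time(n concurrent processors)
    static double speedup(double singleTime, double parallelTime) {
        return singleTime / parallelTime;
    }

    // efficiency = speedup / n; 1.0 would be the ideal n-fold speedup
    static double efficiency(double singleTime, double parallelTime, int n) {
        return speedup(singleTime, parallelTime) / n;
    }
}
```

For example, 12 s sequentially versus 4 s on 4 processors gives a speedup of 3.0 and an efficiency of 0.75, short of the ideal 4-fold increase.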

My Tip
Thread t = …
t.start();
…
try {
    t.join(); // wait until t has finished
} catch (InterruptedException e) {
    e.printStackTrace();
}

My Tip
To print an int[] array:
System.out.println(Arrays.toString(array));
Or, for the first few entries:
System.out.println(Arrays.toString(array).substring(0, 20));
StringBuffer and StringBuilder are faster than String.

Any Questions?