Accelerated Cascading
Advanced Algorithms & Data Structures, Lecture Theme 16
Prof. Dr. Th. Ottmann, Summer Semester 2006
2 Fast computation of maximum
Input: An array A holding p elements from a linearly ordered universe S. We assume that all elements in A are distinct.
Output: The maximum element of the array A.
We use a Boolean array M such that M(k) = 1 if and only if A(k) is the maximum element in A.
Initialization: We allocate p processors to set each entry in M to 1.

3 Fast computation of maximum: Step 1
Step 1: Assign p processors to each element in A, p^2 processors overall. Consider the p processors allocated to A(j); call them P_1, P_2, ..., P_i, ..., P_p. Processor P_i compares A(j) with A(i): if A(i) > A(j), then M(j) := 0; otherwise it does nothing.

4 Fast computation of maximum: Step 2
Step 2: At the end of Step 1, M(k), 1 <= k <= p, is 1 if and only if A(k) is the maximum element. We allocate p processors, one for each entry in M. If the entry is 0, the processor does nothing; if the entry is 1, it outputs the index k of the maximum element.

5 Processor requirement and PRAM model
Complexity: The processor requirement is p^2 and the time complexity is O(1). Since several processors may write M(j) := 0 simultaneously, we need a concurrent-write facility, and hence the Common CRCW PRAM model.
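The two steps above can be simulated sequentially to check the logic. The sketch below (an assumption of this transcript, not part of the original slides) runs the p^2 comparisons of Step 1 as nested loops; on a real Common CRCW PRAM all of them would run in one time step, and the concurrent writes are harmless because every writer writes the same value 0.

```python
# Sequential simulation of the O(1)-time Common CRCW PRAM maximum:
# processor pair (i, j) compares A[i] with A[j] and knocks out A[j]
# if it is smaller. Assumes all elements are distinct.

def crcw_max(A):
    p = len(A)
    M = [1] * p                    # initialization: p processors set M[k] = 1
    for j in range(p):             # p processors per element A[j] ...
        for i in range(p):         # ... processor P_i compares A[j] with A[i]
            if A[i] > A[j]:
                M[j] = 0           # all concurrent writers write the same 0
    # Step 2: the unique k with M[k] = 1 indexes the maximum
    return next(k for k in range(p) if M[k] == 1)

print(crcw_max([3, 7, 1, 9, 4]))   # index 3 holds the maximum, 9
```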

6 Optimal computation of maximum
This is the same balanced binary tree algorithm which we used for adding n numbers: the elements sit at the leaves, each internal node computes the maximum of its two children, and the root holds the maximum of all n elements.

7 Optimal computation of maximum: Analysis
This algorithm takes O(n) processors and O(log n) time. We can reduce the processor complexity to O(n / log n): each processor first computes the maximum of log n elements sequentially, and the binary tree algorithm is then run on the resulting n / log n partial maxima. Hence the algorithm does optimal O(n) work.
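The work-optimal scheme can be sketched as follows (a sequential simulation, not the slides' own code): first a sequential pass over blocks of roughly log n elements, then the halving rounds of the binary tree reduction on the block maxima.

```python
import math

# Work-optimal maximum: each virtual processor first scans a block of
# about log2(n) elements sequentially (O(n) total work), then the
# binary-tree reduction runs on the ~ n / log n block maxima.

def optimal_max(A):
    n = len(A)
    b = max(1, int(math.log2(n)))                 # block size ~ log n
    level = [max(A[i:i + b]) for i in range(0, n, b)]
    while len(level) > 1:                         # O(log n) halving rounds
        level = [max(level[i:i + 2]) for i in range(0, len(level), 2)]
    return level[0]

print(optimal_max(list(range(100))))              # 99
```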

8 An O(log log n) time algorithm (1)
Instead of a binary tree, we use a more complex tree. Assume that n = 2^(2^k). The root of the tree has 2^(2^(k-1)) children. Each node at the i-th level has 2^(2^(k-i-1)) children, for 0 <= i <= k-1. Each node at level k has two children, which are leaves.

9 An O(log log n) time algorithm (2)
Some properties:
The depth of the tree is k + 1 = O(log log n), since n = 2^(2^k) implies k = log log n.
The number of nodes at the i-th level is 2^(2^k - 2^(k-i)), for 0 <= i <= k. Prove this by induction.
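The node count at level i follows directly by multiplying the child counts along the levels above it (a short derivation, using the child counts stated on the previous slide):

```latex
% Number of nodes N_i at level i of the doubly logarithmic tree, n = 2^{2^k}:
N_i \;=\; \prod_{j=0}^{i-1} 2^{2^{k-j-1}}
    \;=\; 2^{\,\sum_{j=0}^{i-1} 2^{k-j-1}}
    \;=\; 2^{\,2^{k-1} + 2^{k-2} + \cdots + 2^{k-i}}
    \;=\; 2^{\,2^{k} - 2^{k-i}}.
```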

10 An O(log log n) time algorithm (3)
The algorithm proceeds level by level, starting from the leaves. At every level, we compute the maximum of all the children of each internal node by the O(1) time algorithm. The time complexity is O(log log n) since the depth of the tree is O(log log n).
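One way to see the O(log log n) depth is to view the tree recursively: the root splits the input into roughly sqrt(n) groups of sqrt(n) elements each. The sketch below (a sequential simulation under that square-root-grouping view, not the slides' own code) has recursion depth O(log log n); the `max` over the group maxima stands in for the O(1) CRCW step.

```python
import math

# Doubly-logarithmic-tree maximum, simulated sequentially: split into
# ~sqrt(n) groups of ~sqrt(n), solve groups recursively, then combine
# the group maxima with the O(1) CRCW step. Depth is O(log log n).

def doubly_log_max(A):
    n = len(A)
    if n <= 2:
        return max(A)
    g = max(2, math.isqrt(n))                    # group size ~ sqrt(n)
    maxima = [doubly_log_max(A[i:i + g]) for i in range(0, n, g)]
    return max(maxima)                           # stands in for the O(1) step

print(doubly_log_max([5, 12, 3, 8, 42, 7, 19, 1, 0]))   # 42
```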

11 An O(log log n) time algorithm: Work complexity
Total work: Recall that the O(1) time algorithm needs O(p^2) work for p elements. Each node at the i-th level has 2^(2^(k-i-1)) children. So the total work for each node at the i-th level is O((2^(2^(k-i-1)))^2) = O(2^(2^(k-i))).

12 Total Work
Total work: There are 2^(2^k - 2^(k-i)) nodes at the i-th level. Hence the total work for the i-th level is: O(2^(2^k - 2^(k-i)) * 2^(2^(k-i))) = O(2^(2^k)) = O(n). For O(log log n) levels, the total work is O(n log log n). This is suboptimal.

13 Accelerated cascading: Idea
The first algorithm, which is based on a binary tree, is optimal but slow. The second algorithm is suboptimal but very fast. We combine these two algorithms through the accelerated cascading strategy: we start with the optimal algorithm until the size of the problem is reduced to a certain value, and then switch to the suboptimal but very fast algorithm.

14 Accelerated cascading: Phase 1
Phase 1. We apply the binary tree algorithm, starting from the leaves, for log log log n levels. The number of candidates reduces to n / 2^(log log log n) = n / log log n. The total work done so far is O(n) and the total time is O(log log log n).

15 Accelerated cascading: Phase 2
Phase 2. In this phase, we use the fast algorithm on the remaining n / log log n candidates. The total work is O((n / log log n) * log log n) = O(n). The total time is O(log log n).
Theorem: The maximum of n elements can be computed in O(log log n) time and O(n) work on the Common CRCW PRAM.
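The two phases combine into one routine. The sketch below is a sequential simulation under assumptions of this transcript (round counts are computed with floating-point logarithms, and the fast phase reuses the square-root-grouping view of the doubly logarithmic tree); it is meant to illustrate the cascading structure, not the exact processor schedule.

```python
import math

# Accelerated cascading, simulated sequentially.
# Phase 1: ~log log log n binary-tree halving rounds, leaving
#          ~n / log log n candidates at O(n) work.
# Phase 2: doubly-logarithmic-tree recursion on the survivors.

def fast_phase(A):
    if len(A) <= 2:
        return max(A)
    g = max(2, math.isqrt(len(A)))               # group size ~ sqrt(|A|)
    return fast_phase([max(A[i:i + g]) for i in range(0, len(A), g)])

def accelerated_max(A):
    level = list(A)
    n = len(level)
    lll = math.log2(max(2, math.log2(max(2, math.log2(max(2, n))))))
    for _ in range(max(1, int(lll))):            # Phase 1 halving rounds
        if len(level) <= 2:
            break
        level = [max(level[i:i + 2]) for i in range(0, len(level), 2)]
    return fast_phase(level)                     # Phase 2

print(accelerated_max(list(range(1000))))        # 999
```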