1 Sorting We have actually already seen two efficient ways to sort:

2 A kind of “insertion” sort Insert the elements into a red-black tree one by one. Traverse the tree in-order and collect the keys. Takes O(n log n) time.

3 Heapsort (Williams, Floyd, 1964) Put the elements in an array. Make the array into a heap. Repeatedly do a deletemin and put the deleted element at the last free position of the array.
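
A minimal sketch of this idea in Python, using the standard-library heapq module as the heap (an assumption for illustration; the slide's version works in place, reusing the freed last position of the array instead of a separate output list):

```python
import heapq

def heapsort(a):
    """Sort by building a heap, then repeatedly extracting the minimum."""
    heap = list(a)
    heapq.heapify(heap)                   # make the array into a heap, O(n)
    out = []
    while heap:
        out.append(heapq.heappop(heap))   # deletemin, O(log n) per element
    return out

print(heapsort([5, 2, 9, 1, 7]))          # [1, 2, 5, 7, 9]
```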

4 Quicksort (Hoare 1961)

5 Quicksort Input: an array A[p..r]

Quicksort(A, p, r)
  if p < r then
    q = Partition(A, p, r)    // q is the position of the pivot element
    Quicksort(A, p, q-1)
    Quicksort(A, q+1, r)

6 Partition trace on A[p..r]: the pivot is the last element (pivot = 4). Index i and everything to its left hold elements smaller than the pivot; j scans to the right; the elements between i and j are greater than the pivot; whenever A[j] is at most the pivot, i advances and A[i] is exchanged with A[j].

7 Partition trace, continued: at the end of the scan the elements ≤ pivot lie to the left of i and the elements > pivot lie between i and j; the pivot is then swapped into its final position.

8 Partition

Partition(A, p, r)
  x ← A[r]
  i ← p-1
  for j ← p to r-1 do
    if A[j] ≤ x then
      i ← i+1
      exchange A[i] ↔ A[j]
  exchange A[i+1] ↔ A[r]
  return i+1      // the partition point
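
A direct Python transcription of the two procedures above (a sketch; the indices follow the slides' convention of sorting A[p..r] inclusive):

```python
def partition(A, p, r):
    """Lomuto partition: pivot = A[r]; returns the pivot's final position."""
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def quicksort(A, p, r):
    if p < r:
        q = partition(A, p, r)    # q is the position of the pivot element
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)

A = [2, 8, 7, 1, 3, 5, 6, 4]
quicksort(A, 0, len(A) - 1)
print(A)                          # [1, 2, 3, 4, 5, 6, 7, 8]
```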

9 Analysis The running time is proportional to the number of comparisons. Each pair is compared at most once ⇒ O(n^2). In fact, for each n there is an input of size n on which quicksort makes cn^2 comparisons ⇒ Ω(n^2) in the worst case.

10 But Assume that the split is even in each iteration

11 Then T(n) = 2T(n/2) + n. How do we solve recurrences like this? (Read Chapter 4.)
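
One way to see the answer is to unroll the recurrence level by level (a sketch, assuming n is a power of 2 and T(1) = O(1)):

```latex
\[
T(n) = 2T(n/2) + n = 4T(n/4) + 2n = \cdots = 2^{k}\,T(n/2^{k}) + kn
     = n\,T(1) + n\log_2 n \qquad (k = \log_2 n),
\]
```

so T(n) = O(n log n); the recursion-tree picture on the next slides shows the same calculation.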

12 Recurrence tree: the root costs n and has two children, each a subproblem T(n/2).

13 Recurrence tree: the next level has two nodes of cost n/2 each, whose children are subproblems T(n/4).

14 Recurrence tree: the tree has depth log n, and in every level we do bn comparisons, so the total number of comparisons is O(n log n).

15 Analysis of 1:9 split

16 Analysis of 1:9 split, continued: even with a 1:9 split at every level, the recursion tree has depth log_{10/9} n = O(log n) and each level does at most n comparisons, so the total is still O(n log n).

17 Observations We can’t guarantee good splits But intuitively on random inputs we will get good splits

18 Randomized quicksort Use Randomized-Partition rather than Partition:

Randomized-Partition(A, p, r)
  i ← random(p, r)
  exchange A[r] ↔ A[i]
  return Partition(A, p, r)
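
A Python sketch of Randomized-Partition and the resulting randomized quicksort (random.randint plays the role of random(p, r); the partition function is repeated from the earlier sketch so the block is self-contained):

```python
import random

def partition(A, p, r):
    """Lomuto partition with pivot A[r], as in the earlier sketch."""
    x, i = A[r], p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def randomized_partition(A, p, r):
    """Exchange A[r] with a uniformly random element, then partition as usual."""
    i = random.randint(p, r)          # random(p, r): uniform index in [p, r]
    A[r], A[i] = A[i], A[r]
    return partition(A, p, r)

def randomized_quicksort(A, p, r):
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)

A = [3, 1, 4, 1, 5, 9, 2, 6]
randomized_quicksort(A, 0, len(A) - 1)
print(A)                              # [1, 1, 2, 3, 4, 5, 6, 9]
```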

19 On the same input we may get a different running time in each run! For a fixed input, look at the average of all these running times over the algorithm's random choices.

20 Expected # of comparisons Let X be the number of comparisons. This is a random variable. We want to know E(X).

21 Expected # of comparisons Let z_1, z_2, ..., z_n be the elements in sorted order. Let X_ij = 1 if z_i is compared to z_j, and 0 otherwise. So X = Σ_{i<j} X_ij. Note that every element is compared to the current pivot at most once: at the end of the phase the partition puts the elements on their proper sides of the pivot, so they are never compared with that pivot again.

22 By linearity of expectation, E[X] = E[Σ_{i<j} X_ij] = Σ_{i<j} E[X_ij].

23 Since X_ij is an indicator variable, E[X_ij] = Pr{z_i is compared to z_j}, so by linearity of expectation E[X] = Σ_{i<j} Pr{z_i is compared to z_j}.

24 Consider Z_ij ≡ {z_i, z_{i+1}, ..., z_j}. Claim: z_i and z_j are compared ⇔ either z_i or z_j is the first element of Z_ij chosen as a pivot. Proof, 3 cases: (1) z_i is chosen first from Z_ij: z_i and z_j are compared in this partition, and never again; (2) z_j is chosen first from Z_ij: the same; (3) some z_k with i < k < j is chosen first from Z_ij: z_i and z_j are not compared in this partition, the partition separates them, and no future partition involves both.

25 Pr{z_i is compared to z_j} = Pr{z_i or z_j is the first pivot chosen from Z_ij} (just explained) = Pr{z_i is the first pivot chosen from Z_ij} + Pr{z_j is the first pivot chosen from Z_ij} (mutually exclusive possibilities) = 1/(j-i+1) + 1/(j-i+1) = 2/(j-i+1).

26 Therefore E[X] = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} 2/(j-i+1). Simplify with a change of variable, k = j-i+1: the inner sum becomes Σ_{k=2}^{n-i+1} 2/k. Simplify and overestimate by adding terms: E[X] ≤ Σ_{i=1}^{n-1} Σ_{k=1}^{n} 2/k.

27 Sum of 1/k: the harmonic sum Σ_{k=1}^{n} 1/k = ln n + O(1) = O(log n), so E[X] ≤ 2(n-1)·O(log n) = O(n log n).

28 Lower bound for sorting in the comparison model We cannot argue about one particular algorithm; we must argue about the PROBLEM, i.e. about every possible algorithm.

29 A lower bound Comparison model: we assume that the only operations from which we deduce the order among keys are comparisons. Then we prove that Ω(n log n) comparisons are needed in the worst case.

30 Model the algorithm as a decision tree. Example: insertion sort. In the i-th iteration we make sure that the elements A[1], ..., A[i] are in the correct relative order (by exchanges).

31 Decision tree of insertion sort on 3 elements: the root compares 1:2; depending on the outcome we compare 2:3 or 1:3, possibly followed by one more comparison; each internal node finds the right order of one pair, and the six leaves are labelled with the six possible orders, from A[1] < A[2] < A[3] to A[3] < A[2] < A[1].

32 Decision tree of quicksort on 3 elements: the root compares 1:3 (the first comparison against the pivot), followed by 2:3 or 1:2 and possibly one more comparison; again the six leaves correspond to the six possible orders.

33 Important Observations Every comparison algorithm can be represented as a (binary) tree like this. Assume that for every node v there is an input on which the algorithm reaches v. Then the number of leaves is at least n!, since each of the n! possible orders must appear at some leaf.

34 Important Observations Each path corresponds to a run on some input The worst case # of comparisons corresponds to the longest path

35 The lower bound Let d be the length of the longest path (the depth of the tree). Then n! ≤ #leaves ≤ 2^d (perhaps some orders are represented by more than one leaf), and therefore log_2(n!) ≤ d.

36 Lower Bound for Sorting Any sorting algorithm based on comparisons between elements requires Ω(N log N) comparisons.
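
The step from log_2(n!) to Ω(n log n) can be filled in with a standard estimate (a sketch; Stirling's formula gives the same bound):

```latex
\[
\log_2(n!) \;=\; \sum_{i=1}^{n} \log_2 i
           \;\ge\; \sum_{i=\lceil n/2\rceil}^{n} \log_2 i
           \;\ge\; \frac{n}{2}\,\log_2\frac{n}{2}
           \;=\; \Omega(n \log n).
\]
```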

37 Lower bound for the average case One can show that the average number of comparisons is also Ω(n log n). The form of the proof: show that in a binary tree with k leaves, the average depth of a leaf is at least log_2 k. Proof by contradiction: let T be the smallest tree for which this fails, i.e. a tree with k leaves whose average leaf depth is less than log_2 k. The root of T has one or two children: (a) if it has a single child, removing the root gives a smaller counterexample, contradicting the minimality of T; (b) if it has two children, their subtrees have k_1 and k_2 leaves, where k_1 < k and k_2 = k - k_1.

38 Then the average leaf depth of T is 1 + (k_1/k)·avg(T_1) + (k_2/k)·avg(T_2) ≥ 1 + (k_1/k)·log_2 k_1 + (k_2/k)·log_2 k_2, using the claim for the smaller subtrees T_1 and T_2 (which holds by the minimality of T). We lower-bound this expression by finding its minimum subject to the constraint k_1 + k_2 = k; solving gives the minimum at k_1 = k_2.
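
A sketch of that minimization step, treating k_1 as a continuous variable (an assumption made for the calculation):

```latex
\[
f(k_1) \;=\; \frac{k_1}{k}\log_2 k_1 + \frac{k-k_1}{k}\log_2 (k-k_1), \qquad
f'(k_1) \;=\; \frac{1}{k}\bigl(\log_2 k_1 - \log_2 (k-k_1)\bigr) \;=\; 0
\;\iff\; k_1 = k_2 = \tfrac{k}{2}.
\]
```

At the minimum, f(k/2) = log_2(k/2) = log_2 k - 1, so the average depth of T is at least 1 + (log_2 k - 1) = log_2 k, contradicting the choice of T.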

39 Beating the lower bound We can beat the lower bound if we can deduce order relations between keys by means other than comparisons. Examples: counting sort, radix sort.

40 The sorts we have seen so far take O(n log n). Can we sort in less than O(n log n)? We saw a lower bound of Ω(n log n) when we know nothing about the keys; if we do know something, we can go lower. Example 1: if we know that the array A[1..n] contains exactly the keys 1, ..., n, then we can sort into an array B in O(n): B[A[i].key] = A[i]. Example 2: Count Sort. An array A[1], ..., A[n] with elements from 1, ..., k, where each value may appear several times; sort it by (1) counting the elements of each value and (2) writing them out into a result array. Details: Cormen. BIN/RADIX SORTING.
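
A minimal counting-sort sketch in Python, following the slide's Example 2 (the function name and the key range 1..k are assumptions for illustration):

```python
def counting_sort(A, k):
    """Sort a list of integer keys from 1..k in O(n + k) time, without comparisons."""
    count = [0] * (k + 1)
    for x in A:                        # phase 1: count the elements of each value
        count[x] += 1
    out = []
    for v in range(1, k + 1):          # phase 2: write each value out count[v] times
        out.extend([v] * count[v])
    return out

print(counting_sort([3, 1, 3, 2, 1], 3))   # [1, 1, 2, 3, 3]
```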

41 Under the conditions of Example 1 we can sort inside A itself: if A[i].key = j, exchange A[i] with A[j]:

for i = 1 to n do
  while A[i].key ≠ i do
    swap(A[i], A[A[i].key])

Cost: O(n) steps and O(n) exchanges (an element that has landed in its place is never moved again!). Example 3: BIN SORTING means distributing the elements into bins (BINs) and finally concatenating the bins. Example 1 is a simple BIN-SORT with bins of fixed size (1); in the general case the bin sizes vary. The operations we want are: (a) insert an element into a BIN, (b) concatenate two BINs.

42 Solution: (1) each BIN is a linked list; (2) HEADERS point to the beginning of each list. Insertion: O(1). Concatenation: O(1). We can now concatenate an arbitrary number of such lists into n bins.

43 Analysis: m = the number of possible key values (the number of bins), n = the number of keys. The n insertions cost O(n) and the concatenations cost O(m), for a total of O(m + n). If the number of keys is larger than the number of bins (m < n), this is O(n). If the number of keys is smaller than the number of bins (m > n), for example m = n^2, this is O(n^2). Example: sort the numbers i^2 for i = 1, 2, ..., 10, i.e. sort 1, 4, ..., 100. (Hanoch: sort the exams of a class with grades of the form xx.yy.)
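
A bin-sort sketch in Python, with an ordinary list per bin standing in for the linked lists above (an assumption: list append is O(1) amortized, but Python lists do not give O(1) concatenation, so the last loop simply walks the bins in order):

```python
def bin_sort(records, m, key=lambda r: r):
    """Distribute records into m bins by key (keys assumed to lie in 0..m-1), then concatenate."""
    bins = [[] for _ in range(m)]
    for rec in records:
        bins[key(rec)].append(rec)     # insert an element into its bin: O(1)
    out = []
    for b in bins:                     # concatenate the bins in order: O(m + n) overall
        out.extend(b)
    return out

print(bin_sort([5, 2, 7, 2, 0], 10))   # [0, 2, 2, 5, 7]
```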

44 Solution: prepare 10 bins, one per digit. Distribute the numbers into the bins by the least significant digit and concatenate the bins; then distribute the result by the more significant digit and concatenate again. (The slide's tables list the bins and their elements after each pass; for example, after the first pass bin 9 holds 9 and 49, and the final concatenation gives the sorted order.)

45 Why does it work? Suppose i = 10a + b and j = 10c + d, and suppose i < j. If a < c, then the second pass places i and j in the correct bins and the sort is correct. If a = c, then b < d, so the first pass puts i before j, and therefore i enters its bin before j in the second pass. Why is BIN SORT good? For key domains about which we have information, such as 1, ..., n^k (k a constant), or strings of length k.

46 Is it always good? No, not if k is very large!! Example: n = 100, k = 100. BIN SORT: nk operations (100 passes of 100 operations each). Another sort: n log n, and nk > n log n. But... be careful with the comparisons! In an ordinary sort a comparison of such keys costs O(k), so the computational model matters!!!

47 RADIX SORT We are given keys with k fields f_1, ..., f_k and want to sort them in lexicographic order, that is, (a_1, ..., a_k) < (b_1, ..., b_k) iff: (1) a_1 < b_1, or (2) a_1 = b_1 and a_2 < b_2, ..., or (k) a_1 = b_1, ..., a_{k-1} = b_{k-1} and a_k < b_k. It is like a generalized BIN SORT, except that each field type needs its own range of bins.
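
A least-significant-digit radix sort sketch in Python, applied to the squares example above (base 10 and non-negative integers are assumptions; each pass is a stable bin sort on one digit):

```python
def radix_sort(A, ndigits):
    """Sort non-negative integers that have at most ndigits decimal digits."""
    for d in range(ndigits):                        # least significant digit first
        bins = [[] for _ in range(10)]
        for x in A:
            bins[(x // 10 ** d) % 10].append(x)     # stable: preserves the previous order within a bin
        A = [x for b in bins for x in b]            # concatenate the bins
    return A

print(radix_sort([i * i for i in range(1, 11)], 3))
# [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```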

48 Linear time sorting Or assume something about the input: random, “almost sorted”

49 Sorting an almost sorted input Suppose we know that the input is “almost” sorted. Let I be the number of “inversions” in the input: the number of pairs a_i, a_j such that i < j and a_i > a_j.

50 Example: 1, 4, 5, 8, 3 has I = 3; 8, 7, 5, 3, 1 has I = 10.

51 Think of “insertion sort” using a list. When we insert the next item a_k, how deep does it get into the list? Exactly as deep as the number of inversions a_i, a_k with i < k and a_i > a_k; let us call this number I_k.

52 Analysis The running time is O(Σ_{k=1}^{n} (1 + I_k)) = O(n + I).
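
A list-based insertion-sort sketch in Python that makes the n + I cost visible (the names and the `steps` counter are illustrative only; note that Python's list.insert is itself O(n), unlike the linked list the slide has in mind, so this illustrates the counting rather than the true running time):

```python
def insertion_sort_list(a):
    """Insert each element into a sorted list; element k walks past exactly I_k larger items."""
    out, steps = [], 0
    for x in a:
        pos = len(out)
        while pos > 0 and out[pos - 1] > x:   # one step per inversion contributed by x
            pos -= 1
            steps += 1
        out.insert(pos, x)
        steps += 1                            # plus one step for the insertion itself
    return out, steps

print(insertion_sort_list([1, 4, 5, 8, 3]))   # ([1, 3, 4, 5, 8], 8), and n + I = 5 + 3 = 8
```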

53 Thoughts When I = Ω(n^2) the running time is Ω(n^2). But we would like it to be O(n log n) for any input, and faster when I is small.

54 Finger red black trees

55 Finger tree Take a regular search tree and reverse the direction of the pointers on the rightmost spine. We go up from the last leaf until we find the subtree containing the item, and then we descend into it.

56 Finger trees Say we search for a position at distance d from the end. Then we go up to height O(log d). Insertions and deletions still take O(log n) worst-case time. But the amortized costs are: tree modification = O(1), search = O(log d) (the contribution of this operation). So a search for the d-th position takes O(log d) time.

57 Back to sorting Suppose we implement insertion sort using a finger search tree, inserting the elements one by one from the input. If most elements are already sorted, then elements enter near the right end. When we insert item k we have d = O(I_k), so it takes O(log I_k) time to search.

58 Overall cost Since d = O(I_k), each search takes O(log(I_k + 1)) time. The tree modifications cost O(1) amortized each, O(n) in total; the searches cost Σ_{k=1}^{n} O(log(I_k + 1)).

59 Analysis The running time is O(n + Σ_{k=1}^{n} log(I_k + 1)). Since Σ_k I_k = I, this is at most O(n + n log(I/n + 1)).
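
A sketch of that last step, using the concavity of the logarithm (Jensen's inequality) together with Σ_k I_k = I:

```latex
\[
\sum_{k=1}^{n} \log(I_k + 1)
  \;\le\; n \,\log\!\Bigl(\frac{1}{n}\sum_{k=1}^{n} (I_k + 1)\Bigr)
  \;=\; n \,\log\!\Bigl(\frac{I}{n} + 1\Bigr),
\]
```

so the total running time is O(n + n log(I/n + 1)): linear when I = O(n), and never worse than O(n log n).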