Analysis of Algorithms


Analysis of Algorithms Lecture 13: Linear time sorting

More about sorting How many sorting algorithms do you know? What are their time complexities? What do they have in common? Can we do better than Θ(n log n)? Yes and no

Outline More about sorting Theoretical lower-bound Linear-time sorting algorithms Stability of sorting

Theoretical lower-bound Comparison sort: determines the relative order of two elements only by comparing them What else could you do? Textbook Ch. 8.1 shows that the theoretical lower bound for any comparison-based sorting algorithm is Ω(n log n)

Lower-bound of comparison-based sort Assume an array A with 3 distinct elements a1, a2, a3, e.g. A = [9 6 5], which insertion sort turns into A = [5 6 9]. How many comparisons are needed? [Decision tree: first ask a1 < a2?; on "yes" ask a2 < a3?, on "no" ask a1 < a3?; the tree reaches six leaves (1, 2, 3), (1, 3, 2), (3, 1, 2), (2, 1, 3), (2, 3, 1), (3, 2, 1), one per permutation.]

Lower-bound of comparison-based sort Assume all elements are distinct; each comparison then has two possible outcomes: ai < aj or ai > aj. Based on the outcome, the algorithm changes the relative order of some elements. The output is a permutation of the input. A correct sorting algorithm must handle any arbitrary input, and there are n! possible permutations, so the decision tree has at least n! leaves. Its height is therefore at least log(n!) = Θ(n log n) comparisons in the worst case.
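The log(n!) claim can be checked numerically; a small sketch comparing log2(n!) with n log2(n) for a few values of n:

```python
import math

# Any comparison sort needs at least ceil(log2(n!)) comparisons in the
# worst case: each comparison has two outcomes, and the algorithm must
# distinguish all n! input permutations.
for n in (4, 16, 256, 4096):
    lower_bound = math.log2(math.factorial(n))
    print(f"n={n:5d}  log2(n!)={lower_bound:10.1f}  n*log2(n)={n * math.log2(n):10.1f}")
```

The ratio log2(n!) / (n log2 n) approaches 1 as n grows, which is the Θ(n log n) statement on the slide.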

Sorting in linear time Is there a problem with the theory? No. We are going to sort without doing comparisons How is that possible? Key: knowledge about the data Examples: Almost sorted? All distinct? Many identical ones? Uniformly distributed? The more you know about your data, the more likely you can find a better algorithm

Counting sort Knowledge: the numbers fall in a small range Example 1: sort the final exam scores of a large class: 1000 students, scores are integers between 0 (minimum) and 100 (maximum) Example 2: sort students according to the first letter of their last name: many students, only 26 letters

Counting sort Input: A[1 . . n], where A[j] ∈ {1, 2, …, k} Output: B[1 . . n], sorted Auxiliary storage: C[1 . . k] Not an in-place sorting algorithm: requires Θ(n + k) additional storage besides the original array

Intuition Students S1: 100, S2: 90, S3: 85, S4: 100, S5: 90, … [Figure: bucket each student under their score (85: S3; 90: S2, S5; 100: S1, S4), then read the buckets out in score order: S3, …, S2, S5, …, S1, S4]

Intuition 50 students have score ≤ 75: what is the rank (lowest to highest) of a student with score = 75? 50. 200 students have score ≤ 90: what is the rank of a student with score = 90? 200 (or 199, if more than one student scored 90)
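This rank intuition is just a running sum over a score histogram; a sketch with made-up counts chosen to match the slide's totals (50 students at score 75, 60 at 85, 90 at 90, 30 at 100):

```python
from itertools import accumulate

# Hypothetical histogram: counts[s] is the number of students with score s.
counts = {75: 50, 85: 60, 90: 90, 100: 30}
scores = sorted(counts)

# Running sum: cumulative[s] = number of students with score <= s,
# i.e. the rank of the last student with score s.
cumulative = dict(zip(scores, accumulate(counts[s] for s in scores)))

print(cumulative[75])   # → 50
print(cumulative[90])   # → 200
```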

Counting sort
1. for i ← 1 to k
       do C[i] ← 0                      ⊳ initialize
2. for j ← 1 to n
       do C[A[j]] ← C[A[j]] + 1         ⊳ count: C[i] = |{key = i}|
3. for i ← 2 to k
       do C[i] ← C[i] + C[i–1]          ⊳ running sum: C[i] = |{key ≤ i}|
4. for j ← n downto 1
       do B[C[A[j]]] ← A[j]
          C[A[j]] ← C[A[j]] – 1         ⊳ re-arrange
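The four loops translate almost line for line into Python; a sketch (0-indexed, where the slide's arrays are 1-indexed):

```python
def counting_sort(A, k):
    """Sort a list A of integers in 1..k; returns a new sorted list B.

    Mirrors the slide's four loops: initialize C, count occurrences,
    take the running sum, then place elements into B scanning A from
    right to left, which makes the sort stable.
    """
    n = len(A)
    C = [0] * (k + 1)             # loop 1: C[i] = 0 for all keys
    for x in A:                   # loop 2: C[i] = |{key = i}|
        C[x] += 1
    for i in range(2, k + 1):     # loop 3: C[i] = |{key <= i}|
        C[i] += C[i - 1]
    B = [None] * n
    for j in range(n - 1, -1, -1):   # loop 4: right to left
        B[C[A[j]] - 1] = A[j]        # C[A[j]] is A[j]'s 1-based rank
        C[A[j]] -= 1
    return B

print(counting_sort([4, 1, 3, 4, 3], k=4))  # → [1, 3, 3, 4, 4]
```

Running it on the slide's example A = [4, 1, 3, 4, 3] reproduces the trace that follows.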

Counting-sort example A: [4 1 3 4 3] (n = 5, k = 4); B and C start empty

Loop 1: initialization C: [0 0 0 0]

Loop 2: count, scanning A left to right
after j = 1 (A[1] = 4): C: [0 0 0 1]
after j = 2 (A[2] = 1): C: [1 0 0 1]
after j = 3 (A[3] = 3): C: [1 0 1 1]
after j = 4 (A[4] = 4): C: [1 0 1 2]
after j = 5 (A[5] = 3): C: [1 0 2 2]

Loop 3: compute running sum
i = 2: C: [1 1 2 2]
i = 3: C: [1 1 3 2]
i = 4: C: [1 1 3 5]

Loop 4: re-arrange, scanning A right to left
j = 5 (A[5] = 3, C[3] = 3): B: [_ _ 3 _ _], C: [1 1 2 5]
j = 4 (A[4] = 4, C[4] = 5): B: [_ _ 3 _ 4], C: [1 1 2 4]
j = 3 (A[3] = 3, C[3] = 2): B: [_ 3 3 _ 4], C: [1 1 1 4]
j = 2 (A[2] = 1, C[1] = 1): B: [1 3 3 _ 4], C: [0 1 1 4]
j = 1 (A[1] = 4, C[4] = 4): B: [1 3 3 4 4], C: [0 1 1 3]

Analysis
1. for i ← 1 to k do C[i] ← 0                              ⊳ Θ(k)
2. for j ← 1 to n do C[A[j]] ← C[A[j]] + 1                 ⊳ Θ(n)
3. for i ← 2 to k do C[i] ← C[i] + C[i–1]                  ⊳ Θ(k)
4. for j ← n downto 1 do B[C[A[j]]] ← A[j]; C[A[j]] ← C[A[j]] – 1   ⊳ Θ(n)
Total: Θ(n + k)

Running time If k = O(n), then counting sort takes Θ(n) time. But the theoretical lower bound says sorting takes Ω(n log n) time! Problem with the theory? Answer: comparison sorting takes Ω(n log n) time, but counting sort is not a comparison sort. In fact, not a single comparison between elements occurs!

Stable sorting Counting sort is a stable sort: it preserves the input order among equal elements. For A: [4 1 3 4 3] → B: [1 3 3 4 4], the first 4 in A ends up before the second 4 in B, and likewise for the two 3s. Why is this important? What other algorithms have this property?

Task: sort students by alphabetical order of their addresses (state, city, street).


Stable sort Most Θ(n^2) sorting algorithms are stable (standard selection sort is not, but can be made so). Most Θ(n log n) sorting algorithms are not stable, except merge sort. Generic way to make any sorting algorithm stable: use two keys, where the second key is the original index of the element; when two elements are equal, compare their second keys. Example: 5, 6, 5, 1, 2, 3, 2, 6 becomes (5, 1), (6, 2), (5, 3), (1, 4), (2, 5), (3, 6), (2, 7), (6, 8), so (2, 5) < (2, 7) and (5, 1) < (5, 3).
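The two-key trick can be sketched in Python; here heapsort (via heapq) stands in for an arbitrary unstable base sort:

```python
import heapq

def heapsort(items):
    # A classic unstable O(n log n) base sort, built on heapq.
    h = list(items)
    heapq.heapify(h)
    return [heapq.heappop(h) for _ in range(len(h))]

def stable_sort(A):
    # The slide's trick: attach the original index as a second key.
    # Ties on the element are then broken by the index, so equal
    # elements come out in input order regardless of the base sort.
    decorated = [(x, i) for i, x in enumerate(A)]
    return [x for x, _ in heapsort(decorated)]

print(stable_sort([5, 6, 5, 1, 2, 3, 2, 6]))  # → [1, 2, 2, 3, 5, 5, 6, 6]
```

Because tuples compare lexicographically, no two decorated elements are ever equal, so even an unstable base sort produces a stable result.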

How to sort very large numbers? E.g. twenty 21-digit numbers such as 198099109123518183599, 340199540380128115295, 384700101594539614696, … These numbers are too large for the int type, so they are represented as strings. One method: use comparison-based sorting, but compare strings character by character, changing if (A[i] < A[j]) to if (compare(A[i], A[j]) < 0):

Compare(s, t)
  for i = 1 to length(s)
    if (s[i] < t[i]) return -1
    else if (s[i] > t[i]) return 1
  return 0

What is the cost to compare two strings, each with d characters? Θ(d). Total cost: Θ(d n log n).
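A sketch of the Compare routine in Python, plugged into a comparison sort via functools.cmp_to_key (the three sample numbers are taken from the slide):

```python
from functools import cmp_to_key

def compare(s, t):
    # The slide's Compare(s, t): scan character by character, Θ(d) per
    # call for d-digit strings.  Assumes s and t have the same length;
    # numbers of different lengths would need zero-padding first.
    for a, b in zip(s, t):
        if a < b:
            return -1
        if a > b:
            return 1
    return 0

nums = ["340199540380128115295",
        "198099109123518183599",
        "384700101594539614696"]
nums.sort(key=cmp_to_key(compare))
print(nums[0])  # → 198099109123518183599
```

With n strings of d characters each, the sort performs Θ(n log n) comparisons at Θ(d) apiece, giving the slide's Θ(d n log n) total.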

Radix sort Similar to sorting address books: treat each digit as a key, and start from the least significant digit. In 198099109123518183599, for example, the leading 1 is the most significant digit and the trailing 9 is the least significant.

Radix sort illustration Use simpler, 3-digit examples: 742, 748, 054, 688, 412, 230, 935, 116, 161, 434, 385, 666, 031, 013, 365, 173, 016

Radix sort illustration Sort by the last digit: 230, 161, 031, 742, 412, 013, 173, 054, 434, 935, 385, 365, 116, 666, 016, 748, 688

Radix sort illustration Sort by the second digit: 412, 013, 116, 016, 230, 031, 434, 935, 742, 748, 054, 161, 365, 666, 173, 385, 688

Radix sort illustration Sort by the first digit: 013, 016, 031, 054, 116, 161, 173, 230, 365, 385, 412, 434, 666, 688, 742, 748, 935

Time complexity Sort each of the d digits by counting sort. Total cost: Θ(d(n + k)), with k = 10, i.e. Θ(dn). Alternatively, partition the d digits into groups of 3; total cost: Θ((n + 10^3) d/3). Working in binary rather than decimal, partition the d bits into groups of r bits; total cost: Θ((n + 2^r) d/r). Choosing r = log n gives total cost Θ(dn / log n). Compare with Θ(dn log n) for comparison-based sorting. Catch: radix sort has a larger hidden constant factor.
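A sketch of LSD radix sort in Python, running a stable counting sort per digit (the input is the slide's 3-digit example list):

```python
def radix_sort(A, d, base=10):
    # LSD radix sort: d passes of a stable counting sort, one per digit,
    # least significant digit first.  Each pass costs Θ(n + base), so the
    # total is Θ(d(n + base)), i.e. Θ(dn) for a fixed base.
    for p in range(d):
        digit = lambda x: (x // base ** p) % base
        C = [0] * base
        for x in A:                  # count occurrences of each digit
            C[digit(x)] += 1
        for i in range(1, base):     # running sum
            C[i] += C[i - 1]
        B = [None] * len(A)
        for x in reversed(A):        # stable re-arrange, right to left
            C[digit(x)] -= 1
            B[C[digit(x)]] = x
        A = B
    return A

print(radix_sort([742, 748, 54, 688, 412, 230, 935, 116, 161,
                  434, 385, 666, 31, 13, 365, 173, 16], d=3))
# prints the list in sorted order
```

The stability of each pass is what makes radix sort correct: ties on the current digit preserve the order established by the earlier, less significant digits.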

Space complexity Radix sort calls counting sort, therefore Θ(n) additional storage is needed (each pass uses Θ(n + k) with k = O(n)).