Lecture 13: Quicksorting CS200: Computer Science

Lecture 13: Quicksorting
CS200: Computer Science, University of Virginia Computer Science
David Evans
http://www.cs.virginia.edu/evans

Menu: Improving insertsort; quicksort
14 February 2003, CS 200 Spring 2003

Last time: insertsort
    (define (insertsort cf lst)
      (if (null? lst)
          null
          (insertel cf (car lst) (insertsort cf (cdr lst)))))
    (define (insertel cf el lst)
      (if (null? lst)
          (list el)
          (if (cf el (car lst))
              (cons el lst)
              (cons (car lst) (insertel cf el (cdr lst))))))
insertsort is Θ(n²)
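
One way to see the Θ(n²) growth concretely is to count how often the comparison procedure is applied. The sketch below repeats the two definitions from the slide (with '() written in place of null so that it runs in most Scheme implementations) and adds two hypothetical helpers that are not part of the lecture: count-comparisons, which wraps < in a counting procedure, and countdown, which builds a list that is a worst case for this insertsort.

```scheme
; insertsort and insertel as defined on the slide ('() used instead of null).
(define (insertsort cf lst)
  (if (null? lst)
      '()
      (insertel cf (car lst) (insertsort cf (cdr lst)))))

(define (insertel cf el lst)
  (if (null? lst)
      (list el)
      (if (cf el (car lst))
          (cons el lst)
          (cons (car lst) (insertel cf el (cdr lst))))))

; Hypothetical helper: how many times the comparison is applied
; while sorting lst with <.
(define (count-comparisons lst)
  (let ((count 0))
    (insertsort (lambda (a b) (set! count (+ count 1)) (< a b)) lst)
    count))

; Hypothetical helper: the list (n n-1 ... 1).
(define (countdown n)
  (if (= n 0)
      '()
      (cons n (countdown (- n 1)))))

(insertsort < (list 5 2 8 1))       ; => (1 2 5 8)
(count-comparisons (countdown 8))   ; => 28
(count-comparisons (countdown 16))  ; => 120
```

Doubling the length from 8 to 16 roughly quadruples the number of comparisons (28 to 120), which is what Θ(n²) predicts.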

Can we do better?
    (insertel < 88 (list 1 2 3 5 6 23 63 77 89 90))
Suppose we had procedures (first-half lst) and (second-half lst) that quickly divided the list in two halves?

Insert Halves
    (define (insertelh cf el lst)
      (if (null? lst)
          (list el)
          (let ((fh (first-half lst))
                (sh (second-half lst)))
            (if (cf el (car fh))
                (append (cons el fh) sh)
                (if (null? sh)
                    (append fh (list el))
                    (if (cf el (car sh))
                        (append (insertelh cf el fh) sh)
                        (append fh (insertelh cf el sh))))))))

Evaluating insertelh
    > (insertelh < 3 (list 1 2 4 5 7))
    |(insertelh #<procedure:traced-<> 3 (1 2 4 5 7))
    | (< 3 1)
    | #f
    | (< 3 5)
    | #t
    | (insertelh #<procedure:traced-<> 3 (1 2 4))
    | |(< 3 1)
    | |#f
    | |(< 3 4)
    | |#t
    | |(insertelh #<procedure:traced-<> 3 (1 2))
    | | (< 3 1)
    | | #f
    | | (< 3 2)
    | | (insertelh #<procedure:traced-<> 3 (2))
    | | |(< 3 2)
    | | |#f
    | | (2 3)
    | |(1 2 3)
    | (1 2 3 4)
    |(1 2 3 4 5 7)
    (1 2 3 4 5 7)
Every time we call insertelh, the size of the list is approximately halved!

How much work is insertelh?
Assume first-half and second-half are Θ(1).
Each time we call insertelh, the size of lst halves. So, doubling the size of the list only increases the number of calls by 1.
    List size    Number of insertelh applications
    1            1
    2            2
    4            3
    8            4
    16           5
insertelh is Θ(log₂ n), where log₂ a = b means 2^b = a.
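
The table can be reproduced with a small model of the recursion. The procedure count-calls below is a hypothetical helper, not part of the lecture: it assumes each application of insertelh recurses on a list of about half the size and stops at a one-element list.

```scheme
; Hypothetical model: number of applications needed for a list of size n
; when every application recurses on a list of half the size.
(define (count-calls n)
  (if (<= n 1)
      1
      (+ 1 (count-calls (quotient n 2)))))

(map count-calls (list 1 2 4 8 16))  ; => (1 2 3 4 5)
(count-calls 1024)                   ; => 11
```

For n a power of two this model gives log₂ n + 1 applications, so each doubling of n adds exactly one.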

insertsorth
Same as insertsort, except it uses insertelh:
    (define (insertsorth cf lst)
      (if (null? lst)
          null
          (insertelh cf (car lst) (insertsorth cf (cdr lst)))))
insertsorth would be Θ(n log₂ n) if we had fast first-half and second-half procedures.

Is there a fast first-half procedure?
No! To produce the first half of a list of length n, we need to cdr down the first n/2 elements.
So first-half is Θ(n).
insertelh calls first-half every time, so insertelh is Θ(n) * Θ(log₂ n) = Θ(n log₂ n).
insertsorth is Θ(n) * Θ(n log₂ n) = Θ(n² log₂ n).
Yikes! We’ve done all this work, and it’s still worse than our simple bubblesort!
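
The slides do not show first-half or second-half. The following is a minimal sketch of what they might look like, assuming the first half takes the extra element of an odd-length list (this agrees with the split seen in the earlier insertelh trace). The helpers take-n and half-length are hypothetical names introduced here. Both halves have to walk the list, which is why neither can be Θ(1).

```scheme
; Hypothetical helper: the first n elements of lst. It cdrs down n cells.
(define (take-n lst n)
  (if (= n 0)
      '()
      (cons (car lst) (take-n (cdr lst) (- n 1)))))

; Hypothetical helper: size of the first half, rounded up so that a
; one-element list keeps its element. Note that length walks the whole list.
(define (half-length lst)
  (quotient (+ (length lst) 1) 2))

(define (first-half lst)
  (take-n lst (half-length lst)))

(define (second-half lst)
  (list-tail lst (half-length lst)))

; insertelh as defined on the "Insert Halves" slide.
(define (insertelh cf el lst)
  (if (null? lst)
      (list el)
      (let ((fh (first-half lst))
            (sh (second-half lst)))
        (if (cf el (car fh))
            (append (cons el fh) sh)
            (if (null? sh)
                (append fh (list el))
                (if (cf el (car sh))
                    (append (insertelh cf el fh) sh)
                    (append fh (insertelh cf el sh))))))))

(first-half (list 1 2 4 5 7))     ; => (1 2 4)
(second-half (list 1 2 4 5 7))    ; => (5 7)
(insertelh < 3 (list 1 2 4 5 7))  ; => (1 2 3 4 5 7)
```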

The Great Lambda Tree of Ultimate Knowledge and Infinite Power

Sorted Binary Trees
[Figure: a tree node holding an element el, with a left subtree and a right subtree.]
left: a tree containing all elements x such that (cf x el) is true
right: a tree containing all elements x such that (cf x el) is false

Tree Example (cf: <)
[Figure: a sorted binary tree containing the elements 1, 2, 3, 4, 5, 7, and 8; empty subtrees are null.]

Representing Trees
    (define (make-tree left el right)   ; left and right are trees (null is a tree)
      (list left el right))
    (define (get-left tree)             ; tree must be a non-null tree
      (first tree))
    (define (get-element tree)          ; tree must be a non-null tree
      (second tree))
    (define (get-right tree)            ; tree must be a non-null tree
      (third tree))

Trees as Lists
[Figure: a tree with root 5, whose left child is 2 (which has left child 1) and whose right child is 8.]
    (make-tree (make-tree (make-tree null 1 null) 2 null)
               5
               (make-tree null 8 null))

insertel-tree
    (define (insertel-tree cf el tree)
      (if (null? tree)
          (make-tree null el null)
          (if (cf el (get-element tree))
              (make-tree (insertel-tree cf el (get-left tree))
                         (get-element tree)
                         (get-right tree))
              (make-tree (get-left tree)
                         (get-element tree)
                         (insertel-tree cf el (get-right tree))))))
If the tree is null, make a new tree with el as its element and no left or right trees.
Otherwise, decide if el should be in the left or right subtree. Insert it into that subtree, but leave the other subtree unchanged.
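
The insertion rule can be tried out on the small tree from the "Trees as Lists" slide. This is a sketch rather than the course's exact code: it uses car, cadr, and caddr in place of first, second, and third, and '() in place of null, so that it runs in most Scheme implementations. The name example-tree is introduced here for illustration.

```scheme
; Tree representation from the "Representing Trees" slide.
(define (make-tree left el right) (list left el right))
(define (get-left tree) (car tree))
(define (get-element tree) (cadr tree))
(define (get-right tree) (caddr tree))

; Insert el into a sorted tree, rebuilding only the nodes on the path to it.
(define (insertel-tree cf el tree)
  (if (null? tree)
      (make-tree '() el '())
      (if (cf el (get-element tree))
          (make-tree (insertel-tree cf el (get-left tree))
                     (get-element tree)
                     (get-right tree))
          (make-tree (get-left tree)
                     (get-element tree)
                     (insertel-tree cf el (get-right tree))))))

; The example tree: root 5, left child 2 (with left child 1), right child 8.
(define example-tree
  (make-tree (make-tree (make-tree '() 1 '()) 2 '())
             5
             (make-tree '() 8 '())))

(insertel-tree < 4 example-tree)
; => (((() 1 ()) 2 (() 4 ())) 5 (() 8 ()))
(insertel-tree < 9 example-tree)
; => (((() 1 ()) 2 ()) 5 (() 8 (() 9 ())))
```

Inserting 4 touches only the nodes 5 and 2; the subtree holding 8 is reused unchanged.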

How much work is insertel-tree?
Each time we call insertel-tree, the size of the tree approximately halves (provided the tree is well balanced). So, doubling the size of the tree only increases the number of calls by 1!
insertel-tree is Θ(log₂ n), where log₂ a = b means 2^b = a.

insertsort-tree
    (define (insertsort cf lst)
      (if (null? lst)
          null
          (insertel cf (car lst) (insertsort cf (cdr lst)))))
    (define (insertsort-worker cf lst)
      (if (null? lst)
          null
          (insertel-tree cf (car lst) (insertsort-worker cf (cdr lst)))))
No change… but insertsort-worker evaluates to a tree, not a list!
    (((() 1 ()) 2 ()) 5 (() 8 ()))

extract-elements
We need to make a list of all the tree elements, from left to right.
    (define (extract-elements tree)
      (if (null? tree)
          null
          (append (extract-elements (get-left tree))
                  (cons (get-element tree)
                        (extract-elements (get-right tree))))))
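
As a quick check, extract-elements can be applied directly to the tree value shown on the insertsort-tree slide. The sketch uses car, cadr, and caddr for the accessors and '() for null, which is an adaptation for portability rather than the slide's exact wording.

```scheme
; Accessors for the list-based tree representation.
(define (get-left tree) (car tree))
(define (get-element tree) (cadr tree))
(define (get-right tree) (caddr tree))

; Left subtree first, then the element, then the right subtree.
(define (extract-elements tree)
  (if (null? tree)
      '()
      (append (extract-elements (get-left tree))
              (cons (get-element tree)
                    (extract-elements (get-right tree))))))

(extract-elements '(((() 1 ()) 2 ()) 5 (() 8 ())))  ; => (1 2 5 8)
(extract-elements '())                               ; => ()
```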

How much work is insertsort-tree?
    (define (insertsort-tree cf lst)
      (define (insertsort-worker cf lst)
        (if (null? lst)
            null
            (insertel-tree cf (car lst) (insertsort-worker cf (cdr lst)))))
      (extract-elements (insertsort-worker cf lst)))
Θ(n) applications of insertel-tree, each of which is Θ(log₂ n): Θ(n log₂ n)
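
The Θ(n log₂ n) figure relies on the tree staying reasonably balanced, which is what one expects for a random input list. The sketch below puts the pieces together and counts comparisons for two seven-element inputs: one whose insertion order yields a balanced tree, and one that is already sorted. The helper count-tree-comparisons is hypothetical, and the code uses '() and car/cadr/caddr for portability.

```scheme
(define (make-tree left el right) (list left el right))
(define (get-left tree) (car tree))
(define (get-element tree) (cadr tree))
(define (get-right tree) (caddr tree))

(define (insertel-tree cf el tree)
  (if (null? tree)
      (make-tree '() el '())
      (if (cf el (get-element tree))
          (make-tree (insertel-tree cf el (get-left tree))
                     (get-element tree)
                     (get-right tree))
          (make-tree (get-left tree)
                     (get-element tree)
                     (insertel-tree cf el (get-right tree))))))

(define (extract-elements tree)
  (if (null? tree)
      '()
      (append (extract-elements (get-left tree))
              (cons (get-element tree)
                    (extract-elements (get-right tree))))))

(define (insertsort-tree cf lst)
  (define (insertsort-worker cf lst)
    (if (null? lst)
        '()
        (insertel-tree cf (car lst) (insertsort-worker cf (cdr lst)))))
  (extract-elements (insertsort-worker cf lst)))

; Hypothetical helper: comparisons made while sorting lst with <.
(define (count-tree-comparisons lst)
  (let ((count 0))
    (insertsort-tree (lambda (a b) (set! count (+ count 1)) (< a b)) lst)
    count))

(insertsort-tree < (list 3 5 2 8 4 7 1))              ; => (1 2 3 4 5 7 8)
(insertsort-tree > (list 3 5 2 8 4 7 1))              ; => (8 7 5 4 3 2 1)
; Elements are inserted from the end of the list, so 4 becomes the root here.
(count-tree-comparisons (list 1 3 2 5 7 6 4))         ; => 10 (balanced tree)
(count-tree-comparisons (list 1 2 3 4 5 6 7))         ; => 21 (tree is one long left spine)
```

With sorted input every insertion walks the full depth of the tree, so the count grows like n²/2 rather than n log₂ n.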

[Figure: Growth of time to sort a random list, with curves for bubblesort, n log₂ n, and insertsort-tree.]

Comparing sorts
    > (testgrowth bubblesort)
    n = 250, time = 110
    n = 500, time = 371
    n = 1000, time = 2363
    n = 2000, time = 8162
    n = 4000, time = 31757
    (3.37 6.37 3.45 3.89)
    > (testgrowth insertsort)
    n = 250, time = 40
    n = 500, time = 180
    n = 1000, time = 571
    n = 2000, time = 2644
    n = 4000, time = 11537
    (4.5 3.17 4.63 4.36)
    > (testgrowth insertsorth)
    n = 250, time = 251
    n = 500, time = 1262
    n = 1000, time = 4025
    n = 2000, time = 16454
    n = 4000, time = 66137
    (5.03 3.19 4.09 4.02)
    > (testgrowth insertsort-tree)
    n = 250, time = 30
    n = 500, time = 250
    n = 1000, time = 150
    n = 2000, time = 301
    n = 4000, time = 1001
    (8.3 0.6 2.0 3.3)
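
The slides do not show testgrowth itself. The arithmetic suggests that the list printed after each run is the ratio of each timing to the previous one, and the sketch below computes it under that assumption. The name growth-ratios is hypothetical.

```scheme
; Hypothetical helper: ratio of each timing to the one before it.
(define (growth-ratios times)
  (if (or (null? times) (null? (cdr times)))
      '()
      (cons (exact->inexact (/ (cadr times) (car times)))
            (growth-ratios (cdr times)))))

(growth-ratios (list 110 371 2363 8162 31757))
; approximately (3.37 6.37 3.45 3.89), the bubblesort row
(growth-ratios (list 30 250 150 301 1001))
; approximately (8.3 0.6 2.0 3.3), the insertsort-tree row
```

When n doubles, a Θ(n²) procedure should show ratios near 4, while a Θ(n log₂ n) procedure should show ratios only slightly above 2. Small timings are noisy, which would explain outliers such as 0.6.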

Can we do better?
Making all those trees is a lot of work.
Can we divide the problem in two halves, without making trees?

Quicksort
C. A. R. (Tony) Hoare, 1962
Divide the problem into:
  Sorting all elements el in the rest of the list where (cf el (car list)) is true (el is < the first element)
  Sorting all elements el in the rest of the list where (not (cf el (car list))) is true (el is >= the first element)
Will this do better?

Quicksort
    (define (quicksort cf lst)
      (if (null? lst)
          lst
          (append (quicksort cf (filter (lambda (el) (cf el (car lst))) (cdr lst)))
                  (list (car lst))
                  (quicksort cf (filter (lambda (el) (not (cf el (car lst)))) (cdr lst))))))

filter
    (define (filter f lst)
      (insertl (lambda (el rest) (if (f el) (cons el rest) rest))
               lst
               null))
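
This slide relies on insertl, which is not defined in this lecture. The sketch below assumes insertl combines the elements of a list from right to left, taking its arguments in the order used on the slide (procedure, list, starting value). That definition is an assumption, and the filter is renamed my-filter here only to avoid clashing with a built-in filter.

```scheme
; Assumed definition of insertl: apply f to each element and the result of
; processing the rest of the list, using start for the empty list.
(define (insertl f lst start)
  (if (null? lst)
      start
      (f (car lst) (insertl f (cdr lst) start))))

; The slide's filter, under the name my-filter.
(define (my-filter f lst)
  (insertl (lambda (el rest) (if (f el) (cons el rest) rest))
           lst
           '()))

(my-filter even? (list 1 2 3 4 5 6))               ; => (2 4 6)
(my-filter (lambda (el) (< el 5)) (list 8 2 7 1))  ; => (2 1)
```

Because the list is processed from the right, the kept elements stay in their original order.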

How much work is quicksort?
    (define (quicksort cf lst)
      (if (null? lst)
          lst
          (append (quicksort cf (filter (lambda (el) (cf el (car lst))) (cdr lst)))
                  (list (car lst))
                  (quicksort cf (filter (lambda (el) (not (cf el (car lst)))) (cdr lst))))))
What if the input list is sorted? Worst case: Θ(n²)
What if the input list is random? Expected: Θ(n log₂ n)
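
The gap between the two cases shows up even on eight elements. The sketch below counts applications of the comparison procedure for an input that splits evenly and for an already sorted input. To stay self-contained it uses keep, a hypothetical stand-in for filter, and count-quicksort-comparisons, a hypothetical counting helper. Neither name comes from the lecture.

```scheme
; Hypothetical stand-in for filter: the elements of lst for which f holds.
(define (keep f lst)
  (cond ((null? lst) '())
        ((f (car lst)) (cons (car lst) (keep f (cdr lst))))
        (else (keep f (cdr lst)))))

; quicksort with both halves sorted recursively around the first element.
(define (quicksort cf lst)
  (if (null? lst)
      lst
      (append (quicksort cf (keep (lambda (el) (cf el (car lst))) (cdr lst)))
              (list (car lst))
              (quicksort cf (keep (lambda (el) (not (cf el (car lst)))) (cdr lst))))))

; Hypothetical helper: comparisons made while sorting lst with <.
(define (count-quicksort-comparisons lst)
  (let ((count 0))
    (quicksort (lambda (a b) (set! count (+ count 1)) (< a b)) lst)
    count))

(quicksort < (list 4 2 1 3 6 5 7 8))                  ; => (1 2 3 4 5 6 7 8)
(count-quicksort-comparisons (list 4 2 1 3 6 5 7 8))  ; => 26
(count-quicksort-comparisons (list 1 2 3 4 5 6 7 8))  ; => 56
```

On sorted input the first element is always the smallest, so one side of every split is empty and the recursion is n levels deep instead of about log₂ n.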

Comparing sorts
    > (testgrowth insertsort-tree)
    n = 250, time = 20
    n = 500, time = 80
    n = 1000, time = 151
    n = 2000, time = 470
    n = 4000, time = 882
    n = 8000, time = 1872
    n = 16000, time = 9654
    n = 32000, time = 31896
    n = 64000, time = 63562
    n = 128000, time = 165261
    (4.0 1.9 3.1 1.9 2.1 5.2 3.3 2.0 2.6)
    > (testgrowth quicksort)
    n = 250, time = 20
    n = 500, time = 80
    n = 1000, time = 91
    n = 2000, time = 170
    n = 4000, time = 461
    n = 8000, time = 941
    n = 16000, time = 2153
    n = 32000, time = 5047
    n = 64000, time = 16634
    n = 128000, time = 35813
    (4.0 1.1 1.8 2.7 2.0 2.3 2.3 3.3 2.2)
Both are Θ(n log₂ n), but the absolute time of quicksort is much faster.

Good enough for VISA?
Θ(n log₂ n): how long to sort 800M items?
n = 128000, time = 35813: 36 seconds to sort 128000 items with quicksort.
    > (log 4)
    1.3862943611198906
    > (* 128000 (log 128000))
    1505252.5494914246
    > (/ (* 128000 (log 128000)) 36)
    41812.57081920624
    > (/ (* 128000 (log 128000)) 41812.6)
    35.99997487578923
    > (/ (* 800000000 (log 800000000)) 41812.6)
    392228.6064130373
392000 seconds ~ 4.5 days

Quicksorting 800M items?
    > (log 4)
    1.3862943611198906
    > (* 128000 (log 128000))
    1505252.5494914246
    > (/ (* 128000 (log 128000)) 36)
    41812.57081920624
    > (/ (* 128000 (log 128000)) 41812.6)
    35.99997487578923
    > (/ (* 1000000 (log 1000000)) 41812.6)
    330.4150078675871
    > (/ (* 100000000 (log 100000000)) 41812.6)
    44055.33438234496
    > (/ (* 800000000 (log 800000000)) 41812.6)
    392228.6064130373
    > (/ (/ (/ (/ (* 800000000 (log 800000000)) 41812.6) 60) 60) 24)
    4.539682944595339
4.5 days with quicksort is a lot better than 20,000 years with bubblesort… but still not good enough for VISA.

Later in the course…
So far, we have been talking about the amount of work a procedure requires.
In a few weeks, we will learn how to talk about the amount of work a problem requires.
That is, how much work is the best possible sorting procedure?
For the general case, you can’t do better than Θ(n log₂ n).
VISA’s problem is simpler, so they can do much better: Θ(n)

Charge
Monday: Tyson’s essay
Wednesday: Tim Koogle. Remember to send me a question for him. That is your “ticket” to come to class Weds.
Friday: Exam Review. Ask any questions you want before picking up exam 1.