Lecture 5. The Incompressibility Method  A key problem in computer science: analyze the average case performance of a program.  Using the Incompressibility.

Slides:



Advertisements
Similar presentations
Covers, Dominations, Independent Sets and Matchings AmirHossein Bayegan Amirkabir University of Technology.
Advertisements

QuickSort Average Case Analysis An Incompressibility Approach Brendan Lucier August 2, 2005.
Approximation, Chance and Networks Lecture Notes BISS 2005, Bertinoro March Alessandro Panconesi University La Sapienza of Rome.
Theory of Computing Lecture 3 MAS 714 Hartmut Klauck.
Lecture 12: Lower bounds By "lower bounds" here we mean a lower bound on the complexity of a problem, not an algorithm. Basically we need to prove that.
CSC 2300 Data Structures & Algorithms March 16, 2007 Chapter 7. Sorting.
© The McGraw-Hill Companies, Inc., Chapter 2 The Complexity of Algorithms and the Lower Bounds of Problems.
Deterministic Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms.
Spring 2015 Lecture 5: QuickSort & Selection
The number of edge-disjoint transitive triples in a tournament.
1 Introduction to Computability Theory Lecture12: Reductions Prof. Amos Israeli.
Chapter 3 Growth of Functions
Courtesy Costas Busch - RPI1 A Universal Turing Machine.
1 Linear Bounded Automata LBAs. 2 Linear Bounded Automata are like Turing Machines with a restriction: The working space of the tape is the space of the.
2 -1 Chapter 2 The Complexity of Algorithms and the Lower Bounds of Problems.
CSC 2300 Data Structures & Algorithms March 27, 2007 Chapter 7. Sorting.
Randomized Algorithms and Randomized Rounding Lecture 21: April 13 G n 2 leaves
Sorting. Introduction Assumptions –Sorting an array of integers –Entire sort can be done in main memory Straightforward algorithms are O(N 2 ) More complex.
CSE 421 Algorithms Richard Anderson Lecture 4. What does it mean for an algorithm to be efficient?
2 -1 Chapter 2 The Complexity of Algorithms and the Lower Bounds of Problems.
Fall 2004COMP 3351 A Universal Turing Machine. Fall 2004COMP 3352 Turing Machines are “hardwired” they execute only one program A limitation of Turing.
CSC 2300 Data Structures & Algorithms March 20, 2007 Chapter 7. Sorting.
DAST 2005 Week 4 – Some Helpful Material Randomized Quick Sort & Lower bound & General remarks…
The Complexity of Algorithms and the Lower Bounds of Problems
CS 202, Spring 2003 Fundamental Structures of Computer Science II Bilkent University1 Sorting CS 202 – Fundamental Structures of Computer Science II Bilkent.
Lecture 3. Relation with Information Theory and Symmetry of Information Shannon entropy of random variable X over sample space S: H(X) = ∑ P(X=x) log 1/P(X=x)‏,
CSE 373 Data Structures Lecture 15
Ch. 8 & 9 – Linear Sorting and Order Statistics What do you trade for speed?
Lecture 2 We have given O(n 3 ), O(n 2 ), O(nlogn) algorithms for the max sub-range problem. This time, a linear time algorithm! The idea is as follows:
2 -1 Chapter 2 The Complexity of Algorithms and the Lower Bounds of Problems.
Theory of Computing Lecture 15 MAS 714 Hartmut Klauck.
1 Introduction to Approximation Algorithms. 2 NP-completeness Do your best then.
Order Statistics The ith order statistic in a set of n elements is the ith smallest element The minimum is thus the 1st order statistic The maximum is.
C++ Programming: Program Design Including Data Structures, Fourth Edition Chapter 19: Searching and Sorting Algorithms.
2IL50 Data Structures Fall 2015 Lecture 2: Analysis of Algorithms.
Jessie Zhao Course page: 1.
Theory of Computing Lecture 17 MAS 714 Hartmut Klauck.
CS 221 Analysis of Algorithms Instructor: Don McLaughlin.
1 1 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 15-1 Mälardalen University 2012.
Elementary Sorting Algorithms Many of the slides are from Prof. Plaisted’s resources at University of North Carolina at Chapel Hill.
The Selection Problem. 2 Median and Order Statistics In this section, we will study algorithms for finding the i th smallest element in a set of n elements.
Sorting Fun1 Chapter 4: Sorting     29  9.
Télécom 2A – Algo Complexity (1) Time Complexity and the divide and conquer strategy Or : how to measure algorithm run-time And : design efficient algorithms.
Lecture 5. The Incompressibility Method  A key problem in computer science: analyze the average case performance of a program.  Using the Incompressibility.
CS 361 – Chapters 8-9 Sorting algorithms –Selection, insertion, bubble, “swap” –Merge, quick, stooge –Counting, bucket, radix How to select the n-th largest/smallest.
CSE 589 Part VI. Reading Skiena, Sections 5.5 and 6.8 CLR, chapter 37.
1 Linear Bounded Automata LBAs. 2 Linear Bounded Automata (LBAs) are the same as Turing Machines with one difference: The input string tape space is the.
1 Turing’s Thesis. 2 Turing’s thesis: Any computation carried out by mechanical means can be performed by a Turing Machine (1930)
1/6/20161 CS 3343: Analysis of Algorithms Lecture 2: Asymptotic Notations.
Lecture 5. The Incompressibility Method  A key problem in computer science: analyze the average case performance of a program.  Using the Incompressibility.
1 Covering Non-uniform Hypergraphs Endre Boros Yair Caro Zoltán Füredi Raphael Yuster.
15.082J & 6.855J & ESD.78J September 30, 2010 The Label Correcting Algorithm.
Sorting & Lower Bounds Jeff Edmonds York University COSC 3101 Lecture 5.
Chapter 3 Chapter Summary  Algorithms o Example Algorithms searching for an element in a list sorting a list so its elements are in some prescribed.
Discrete Mathematics Chapter 2 The Fundamentals : Algorithms, the Integers, and Matrices. 大葉大學 資訊工程系 黃鈴玲.
1 A Universal Turing Machine. 2 Turing Machines are “hardwired” they execute only one program A limitation of Turing Machines: Real Computers are re-programmable.
CSE322 PUMPING LEMMA FOR REGULAR SETS AND ITS APPLICATIONS
The Incompressibility Method using Kolmogorov Complexity
Lecture 5. The Incompressibility Method
HIERARCHY THEOREMS Hu Rui Prof. Takahashi laboratory
Lecture 6. Prefix Complexity K
The Complexity of Algorithms and the Lower Bounds of Problems
Enumerating Distances Using Spanners of Bounded Degree
Lecture 5-1. The Incompressibility Method, continued
Data Structures Review Session
Formal Languages, Automata and Models of Computation
CSE 373 Sorting 2: Selection, Insertion, Shell Sort
Locality In Distributed Graph Algorithms
Presentation transcript:

Lecture 5. The Incompressibility Method  A key problem in computer science: analyze the average case performance of a program.  Using the Incompressibility Method:  Give the program a random input of length n, say of complexity n- log n (or sometimes complexity n).  Analyze the program with respect to this single and fixed input. This is usually easier than average case using the fact this input is almost incompressible.  If we used complexity n- log n, the running time for this single input is the average case running time of all inputs, since a (1-1/n)th fraction of all inputs have this high complexity!

Example: Show L={0 k 1 k | k>0} not regular. By contradiction, assume that DFA M accepts L. Choose k so that C(k) >> 2|M|. Simulate M: 000 … … 1 C(k) < |M| + |q| + O(1) < 2|M|. Contradiction. Remark. Generalizes to iff condition: more powerful & easier to use than “pumping lemmas”. kk M q stop here Formal language theory

Combinatorics Theorem. There is a tournament (complete directed graph) T of n players that contains no large transitive subtournaments ( >1 + 2 log n ). Proof by Picture: Choose a random T. One bit codes a directed edge, each tournament is encoded in string of n(n-1)/2 bits, and each string of n(n-1)/2 bits codes a tournament. Choose T such that C(T | n)  n(n-1)/2. If there is a large transitive subtournament on v(n) nodes, then a large number of edges are given for free! Subgraph-edges = v(n)(v(n)-1)/2. Overhead = v(n) log n. Overhead ≥ subgraph edges since C(T | n)≤ n(n  1)/2  subgraph-edges  overhead T Linearly ordered subgraph. Easy to describe

Combinatorics Theorem. Let w(n) be the largest integer such that every tournament T has disjoint node sets A and B both of cardinality w(n) such that AxB is a subset of the ordered edge set of T. Then, w(n) ≤ 2 log n. Proof. Choose T with C(T|n) ≥ n(n-1)/2. Add descriptions A and B in 2 w(n) log n bits (in lexicographic order, say). Save bits describing edges between A and B in w(n) ² bits. Add – Save ≥ 0. QED

Graphs Consider undirected labeled graphs. A clique is a subset of nodes with edges between every pair. An anticlique is a subset of nodes without edges between any pair. Encode graph G s.t. The set of node pairs are lexicographically ordered without repetition, {i,j} with i < j, and the corresponding bit is 1 if there is an edge, and 0 otherwise. Theorem. There is an undirected labeled graph G on n nodes that contains no clique or anticlique on >1+2 log n nodes. Proof. Let G be an undirected labeled graph of high Kolmogorov complexity, C(G|n) ≥ n(n-1)/2. The proof is now isomorphic to that of the transitive subtournaments.

Graphs Lemma. A fraction of at least 1 – 1/2^d(n) of all labeled undirected graphs on n nodes have C(G|n,d) ≥ n(n-1)/2 -d(n). Proof. There are at most 2^{n(n-1)/2 – d(n)} -1 programs of length < n(n-1)/2 -d(n). QED Remark. Hence a property that holds for such graphs holds with high probability and in expectation (on average). Lemma. All nodes of a graph with d(n)=o(n) have degree n/2+-o(n). Proof. Choose G s.t. C(G|n) ≥ n(n-1)/2 - d(n). For every node i, the scattered substring of bits corresponding to {i,j} or {j,i} has complexity ≥ n-d(n)- 2 log n, since otherwise its description + description i +the literal remainder of G|n gives a description of G|n of length < n(n-1)/2 – d(n). Let d(n)=o(n). Since the substring has complexity ≥ n-o(n), we have by similar reasoning to that of the last frame of lecture 2 that the substring contains n/2 +- O(√ o(n)n) = n/2 +- o(n) bits 1, and hence node i has degree n/2+-o(n). QED

Graphs Lemma. All graphs with d(n)=o(n) have diameter 2. Proof. Diameter 1 is a complete graph G with C(G|n)=O(1). Assume there is a shortest path of length >2 between nodes i,j. Add identity of nodes i,j in 2 log n bits. Save n/2-o(n) bits from omitting edge bits (k,j) (which are all 0) for every k for which there is an edge (i,k). There are >n/2-o(n) of them by previous lemma. QED Remark. There is some discrepancy between add and save here. We can in fact strengthen the theorem to show that all such graphs have n/4 -o(n) disjoint paths of length 2 between every pair of nodes.

Unlabeled Graphs # of labeled undirected graphs on n nodes is 2^{n(n-1)/2}. Theorem (Harary, Palmer 1973) # of unlabeled undirected graphs on n nodes is asymptotic to 2^{n(n-1)/2} / n! Proof by incompressibility (Sketch). There are n! ways to relabel a graph on n nodes for every graph. But, for example, the complete graph stays the same under every relabeling. So the automorphism group of that graph has cardinality n! A Kolmogorov random graph stays the same only under identity relabeling. Its automorphism group has cardinality 1 (such graps are called rigid.)‏ By incompressiblity we estimate the number of graphs (what is their minimum complexity and maximum complexity) which have automorphism groups of given cardinality. This gives the theorem. QED

Fast adder Example. Fast addition on average. Ripple-carry adder: n steps adding n-bit numbers. Carry-lookahead adder: 2 log n steps (divide-and-conquer). Burks-Goldstine-von Neumann (1946): log n expected length of carry sequence, so log n expected steps. S= x  y; C= carry sequence; while (C≠0) { S= S  C; C= new carry sequence; } Average case analysis: Fix x, take random y s.t. C(y|x)≥|y| x = … u1 … (Max such u is precise carry length)‏ Low order bits right. y = … û1 …, û is complement of u If |u| > log n, then C(y|x)<|y|. Average over all y, get log n. QED

Sorting  Given n elements (in an array). Sort them into ascending order.  This is the most studied fundamental problem in computer science.  Shellsort (1959): p passes. In each pass, compare in subarrays (length related to increment) adjacent elements and move larger elements to the right (Bubblesort) so that the large elements `bubble’ to front.  Open for over 40 years: a nontrivial general average case complexity lower bound of Shellsort?

Shellsort Algorithm Using p increments h 1, …, h p, with h p =1 At k-th pass, the array is divided in h k separate sublists of length n/h k (taking every h k -th element). Each sublist is sorted by insertion/bubble sort Application: Sorting networks --- n log 2 n comparators, easy to program, competitive for medium size lists to be sorted.

Shellsort history Invented by D.L. Shell [1959], using p k = n/2 k for step k. It is a Θ(n 2 ) time algorithm Papernow&Stasevitch [1965]: O(n 3/2 ) time by destroying regularity in Shell’s geometric sequence. Pratt [1972]: All quasi geometric sequences use O(n 3/2 ) time.Θ(nlog 2 n) time for p=(log n)^2 with increments 2^i3^j. Incerpi-Sedgewick, Chazelle, Plaxton, Poonen, Suel (1980’s) – best worst case, roughly, Θ(nlog 2 n / (log logn) 2 ). Average case: Knuth [1970’s]: Θ(n 5/3 ) for p=2 Yao [1980]: p=3 characterization, no running time. Janson-Knuth [1997]: O(n 23/15 ) for p=3. Jiang-Li-Vitanyi [J.ACM, 2000]: Ω(pn 1+1/p ) for every p.

Shellsort Average Case Lower bound Theorem. p-pass Shellsort average case T(n) ≥ pn 1+1/p Proof. Fix a random permutation Π with Kolmogorov complexity nlogn. I.e. C(Π)≥ nlogn. Use Π as input. (We ignore the self-delimiting coding of the subparts below. The real proof uses better coding.)‏ For pass i, let m i,k be the number of steps the kth element moves. Then T(n) = Σ i,k m i,k From these m i,k 's, one can reconstruct the input Π, hence Σ log m i,k ≥ C(Π) ≥ n logn Maximizing the left, all m i,k must be the same (maintaining same sum). Call it m. So Σ m = pnm = Σ i,k m i,k Then, Σ log m = pn log m ≥ Σ log m i,k ≥ nlogn  m p ≥ n. So T(n) = pnm > pn 1+1/p. ■ Corollary: p=1: Bubblesort Ω(n 2 ) average case lower bound. p=2: n 3/2 lower bound. p=3, n 4/3 lower bound (4/3=20/15); and only p=Θ(log n) can give average time O(n log n).

Heapsort 1964, JWJ Williams [CACM 7(1964), ] first published Heapsort algorithm Immediately it was improved by RW Floyd. Worst case O(n logn). Open for 40 years: Which is better in average case: Williams or Floyd? (choose between n log n and 2n log n)‏ R. Schaffer & Sedgewick (1996). Ian Munro provided the solution here.

Heapsort average analysis (I. Munro)‏ Average-case analysis of Heapsort. Heapsort: (1) Make Heap. O(n) time. (2) Delete max at root, restore heap, repeat. d d log n WilliamsFloyd 2 log n - 2dlog n + d Fix random heap H, C(H) > n log n. Simulate Step (2). Each round, encode the red path in log n -d bits. The n paths describe the heap! Hence, total n paths, length  n log n, hence d must be a constant. Floyd takes n log n comparisons, and Williams takes 2n log n. comparisons/round Compare sons; Compare largest with candidate. 2 comparisons/ step Compare sons, Repeat this for largest son. 1 comparison/step

A selected list of results proved by the incompressibility method Ω (n 2 ) for simulating 2 tapes by 1 (30 years)‏ k heads > k-1 heads for PDAs (15 years)‏ k one-ways heads can’t do string matching (13 yrs)‏ 2 heads are better than 2 tapes (40 years)‏ Average case analysis for heapsort (30 years)‏ k tapes are better than k-1 tapes. (20 years)‏ Many theorems in combinatorics, formal language/automata, parallel computing, VLSI Simplify old proofs (Hastad Lemma). Shellsort average case lower bound (40 years)‏

More on formal language theory Lemma (Li-Vitanyi) Let L  V *, and L x ={y: xy  L}. Then L is regular implies there is c for all x,y,n, let y be the n-th element in L x, we have C(y|x) ≤ C(n)+c. Proof. Like example. QED. Example 2. {1 p : p is prime} is not regular. Proof. Let p i, i=1,2 …, be the list of primes. Then p k+1 is the first element in L Pk, hence by Lemma, C(p k+1 |p k )≤O(1). Impossible since p k+1 -p k → ∞ for k→∞ QED

Characterizing regular sets For an lexicographic enumeration of  *={y 1,y 2, …}, define characteristic sequence X= X 1 X 2 …of L x ={y i : xy i  L} by X i = 1 iff xy i  L Theorem. L is regular iff there is a c for all x,n, C(X 1:n |n) < c Proof. L is regular (finite-state) iff L is the union of finitely many disjoint sets {x}L x (The Myhill-Nerode Theorem). Hence every X of L x is a recursive sequence. This shows the `if’ side. The `only if’ side depends on a sophisticated lemma, see textbook.