Approximate Medians and other Quantiles in One Pass and with Limited Memory. Researchers: G. S. Manku, S. Rajagopalan & B. Lindsay. Lecturer: Eitan Ben Amos, 2003

Lecture Structure
– Problem definition
– A deterministic algorithm
– Proof
– Complexity analysis
– Comparison to other algorithms
– A randomized solution
– Pros & cons of the randomized solution

Problem Definition
Given a large data set of N values, design an algorithm that computes approximate quantiles (φ) in a single pass. The approximation guarantee is an input parameter (ε). The algorithm should work for any distribution of values and any arrival order, should compute multiple quantile values at no extra cost, and should have low memory requirements.

Quantiles
Given a stream of N values, the φ-quantile, for φ ∈ [0,1], is the value located at position ⌈φN⌉ in the sorted input stream. When φ = 0.5 it is the median. An element is an ε-approximate φ-quantile if its rank in the sorted input stream is between ⌈(φ−ε)N⌉ and ⌈(φ+ε)N⌉. There can be several values in this range.
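To make the definition concrete, here is a small Python sketch (the helper names are mine, not from the lecture) that computes the exact φ-quantile of a stream and checks the ε-approximation condition:

```python
import math

def phi_quantile(stream, phi):
    # Exact phi-quantile: the element at position ceil(phi * N), 1-based,
    # in the sorted stream.
    s = sorted(stream)
    return s[max(0, math.ceil(phi * len(s)) - 1)]

def is_approx_quantile(stream, value, phi, eps):
    # value is an eps-approximate phi-quantile if one of its ranks lies
    # between ceil((phi - eps) * N) and ceil((phi + eps) * N).
    s = sorted(stream)
    n = len(s)
    lo, hi = math.ceil((phi - eps) * n), math.ceil((phi + eps) * n)
    ranks = [i + 1 for i, v in enumerate(s) if v == value]
    return any(lo <= r <= hi for r in ranks)
```

For a stream of 1..100 with φ = 0.5 and ε = 0.05, any element of rank 45..55 passes the check.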

Database Applications
Quantiles are used for query optimization. Parallel DB systems use them to split inserted data among the servers into approximately equal parts. Distributed parallel sorting uses quantiles to split the value ranges between the machines.

Algorithm Framework
An algorithm is parameterized by two integers, b and k: it uses b buffers, each storing k elements, so memory usage is b·k plus a constant. Every buffer x is associated with a positive integer weight w(x). The weight denotes the number of input elements represented by each element in x.

Algorithm Framework (cont'd)
Buffers are labeled either “Empty” or “Full”; initially all buffers are “Empty”. The values of b and k are calculated so that they enforce the approximation guarantee (ε) while minimizing the memory requirement b·k, and so that the algorithm can process N elements.

Framework Basic Operations
(1) NEW takes an empty buffer as input, populates it with the next k elements from the input stream, and assigns the now-“Full” buffer a weight of 1. If fewer than k elements remain, an equal number of +∞ and −∞ values are added to fill the buffer. The input stream with the additional ±∞ elements is called the “augmented stream”.

Quantile in the augmented stream
The length of the augmented stream is βN, β ≥ 1. Let φ' = (2φ + β − 1)/(2β). The φ-quantile of the original stream is the φ'-quantile of the augmented stream. Proof: (β − 1)N elements were added, half of which (the −∞ values) appear before the φ-quantile in the sorted stream, so ⌈φ'βN⌉ = ⌈φN + (β − 1)N/2⌉ = ⌈(N/2)(2φ + β − 1)⌉.
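The conversion from φ to φ' is simple arithmetic; this sketch (the function name is mine) also lets us verify the identity φ'·βN = φN + (β−1)N/2 used in the proof:

```python
def phi_prime(phi, beta):
    # Quantile to query in the augmented stream of length beta*N so that it
    # coincides with the phi-quantile of the original stream.
    return (2 * phi + beta - 1) / (2 * beta)
```

Note that for φ = 0.5 the result is 0.5 for every β: padding with equally many +∞ and −∞ values leaves the median in the middle.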

Basic Operations (cont'd)
(2) COLLAPSE takes c ≥ 2 “Full” input buffers X_1,…,X_c and outputs a buffer Y (all of size k). All input buffers are marked “Empty”; the output buffer Y is marked “Full”. The weight of Y is the sum of the weights of all input buffers: w(Y) = Σ w(X_i).

Collapsing Buffers
Conceptually, make w(X_i) copies of each element in X_i and sort the elements from all buffers together. The elements of Y are k equally spaced elements of this sorted sequence: if w(Y) is odd, they are at positions j·w(Y) + (w(Y)+1)/2 for j = 0,…,k−1; if w(Y) is even, they are at positions j·w(Y) + w(Y)/2 or j·w(Y) + (w(Y)+2)/2.

Collapsing Buffers (cont'd)
Two successive COLLAPSE operations with even w(Y) alternate between the two choices. Define offset(Y) = (w(Y)+z)/2 with z ∈ {0,1,2}; then Y takes the elements at positions j·w(Y) + offset(Y). Collapsing buffers does not actually require creating multiple copies of elements: a single scan of the elements, in a manner similar to merge sort, suffices.
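A possible implementation of COLLAPSE along these lines (a sketch with hypothetical names; the single-scan trick counts weighted positions instead of materializing the w(X_i) copies):

```python
def collapse(buffers, weights, k, take_low_offset=True):
    # COLLAPSE c >= 2 sorted "Full" buffers of size k into one buffer of size k.
    # Conceptually each element of buffers[i] repeats weights[i] times in a big
    # sorted sequence; Y keeps the elements at 1-based positions
    # j*w(Y) + offset(Y), j = 0..k-1, found in a single merge-like scan.
    wY = sum(weights)
    if wY % 2 == 1:
        offset = (wY + 1) // 2
    elif take_low_offset:            # caller alternates this flag between
        offset = wY // 2             # successive even-weight collapses
    else:
        offset = (wY + 2) // 2
    targets = [j * wY + offset for j in range(k)]
    merged = sorted((v, i) for i, buf in enumerate(buffers) for v in buf)
    out, pos, t = [], 0, 0
    for v, i in merged:
        pos += weights[i]            # v covers weighted positions pos-w+1 .. pos
        while t < len(targets) and targets[t] <= pos:
            out.append(v)
            t += 1
    return out, wY
```

For example, collapsing [1,3,5] (weight 2) with [2,4,6] (weight 1) gives w(Y)=3, offset 2, targets 2, 5, 8 in the weighted sequence 1,1,2,3,3,4,5,5,6.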

COLLAPSE example

Lemma 1
Let C be the number of COLLAPSE operations made by the algorithm, and let W be the sum of the weights of the output buffers produced by these COLLAPSE operations. Lemma: the sum of the offsets of all COLLAPSE operations is at least (W + C − 1)/2.

Proof
Write C = C_odd + C_even (the numbers of COLLAPSE operations with w(Y) odd and even, respectively), and C_even = C_even1 + C_even2 (the numbers of COLLAPSE operations with offset(Y) = w(Y)/2 and offset(Y) = (w(Y)+2)/2, respectively). The sum of all offsets is (W + C_odd + 2·C_even2)/2.

Proof (cont'd)
Since COLLAPSE alternates between the two offset choices for even w(Y), either C_even1 = C_even2 (so C_even = 2·C_even2) or C_even1 = C_even2 + 1 (so C_even = 2·C_even2 + 1). In either case C_even2 ≥ (C_even − 1)/2, so the sum of offsets is at least (W + C − 1)/2.

Basic Operations (cont'd)
(3) OUTPUT is performed only once, just before termination. It takes c ≥ 2 “Full” input buffers X_1,…,X_c of size k and outputs a single element, corresponding to the φ'-quantile of the augmented stream.

OUTPUT (cont'd)
Conceptually, it makes w(X_i) copies of each element in buffer X_i, sorts all input buffers together, and outputs the element at position ⌈φ'kW⌉, where W = w(X_1) + … + w(X_c).
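OUTPUT can use the same weighted single-scan idea instead of making copies (a sketch; the function name is mine):

```python
import math

def output(buffers, weights, k, phi_p):
    # Return the element at weighted position ceil(phi_p * k * W), where each
    # element of buffers[i] counts weights[i] times in the sorted order.
    W = sum(weights)
    target = math.ceil(phi_p * k * W)
    pos = 0
    for v, i in sorted((v, i) for i, buf in enumerate(buffers) for v in buf):
        pos += weights[i]
        if pos >= target:
            return v
```

With two weight-1 buffers [1,2,3] and [4,5,6] and φ' = 0.5, the target position is 3 and the answer is 3.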

COLLAPSE policies
Different COLLAPSE policies mean different criteria for when to invoke the NEW/COLLAPSE operations:
– Munro & Paterson
– Alsabti, Ranka & Singh
– the new algorithm

Munro & Paterson
If there are empty buffers, invoke NEW. Otherwise, invoke COLLAPSE on two buffers having the same weight. Following is an example of the operations for b = 6.

Munro & Paterson

Alsabti, Ranka & Singh
Fill b/2 “Empty” buffers by invoking NEW and then invoke COLLAPSE on them; repeat this b/2 times. Finally, invoke OUTPUT on the b/2 resulting buffers. Following is an example of the operations for b = 10.

Alsabti, Ranka & Singh

New Algorithm
Associate with every buffer X an integer l(X) denoting its level. Let l be the minimum among the levels of the currently “Full” buffers. If there is exactly one “Empty” buffer, invoke NEW and assign it level l. If there are at least two “Empty” buffers, invoke NEW on each of them and assign them level 0.

New Algorithm (cont'd)
If there are no “Empty” buffers, invoke COLLAPSE on the set of buffers with level l and assign the output buffer level l + 1. Following is an example of the operations for b = 5, h = 4.
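The scheduling rules above can be simulated without touching element values, tracking only buffer levels and weights. This sketch is my own reading of the policy (it places no cap on the height h) and reports the number of COLLAPSE operations C, the sum of their output weights W, and w_max:

```python
def simulate_new_algorithm(b, n_leaves):
    # Track only (level, weight) of each "Full" buffer; each NEW consumes
    # one leaf's worth of input.
    full, leaves = [], 0
    C = W = w_max = 0
    while leaves < n_leaves:
        empties = b - len(full)
        if empties >= 2:
            take = min(empties, n_leaves - leaves)   # NEW on each empty, level 0
            full += [(0, 1)] * take
            leaves += take
        elif empties == 1:
            l = min(lv for lv, _ in full)            # NEW at the minimum level
            full.append((l, 1))
            leaves += 1
        else:
            l = min(lv for lv, _ in full)            # COLLAPSE the level-l set
            wy = sum(w for lv, w in full if lv == l)
            full = [(lv, w) for lv, w in full if lv != l]
            full.append((l + 1, wy))
            C += 1
            W += wy
            w_max = max(w_max, wy)
    return C, W, w_max, full
```

A useful invariant to check: the weights of the remaining “Full” buffers always sum to the number of leaves consumed (this is Lemma 2 below, applied to the partial tree).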

New Algorithm

Tree representation
The sequence of operations can be seen as a tree. The vertices (except the root) represent the set of all logical buffers (initial, intermediate, final). The leaves correspond to the initial buffers, which are populated from the input stream by the NEW operation.

Tree representation (cont'd)
An edge is drawn from every input buffer to its output buffer (created by COLLAPSE). The root corresponds to the final OUTPUT operation; the children of the root are the final buffers produced by COLLAPSE operations. Broken edges are drawn toward the children of the root.

Definitions
User specified:
– N: size of the input stream
– φ: quantile to be computed
– ε: approximation guarantee
Others:
– b: number of buffers
– k: size of each buffer
– φ': quantile in the augmented stream

Definitions (cont'd)
More:
– C: number of COLLAPSE operations
– W: sum of the weights of all COLLAPSE operations
– w_max: weight of the heaviest COLLAPSE
– L: number of leaves in the tree
– h: height of the tree

Approximation Guarantees
We will prove the following: the difference in rank between the true φ-quantile of the original data set and the output of the algorithm is at most w_max + (W − C − 1)/2.

Lemma 2
Lemma: the sum of the weights of the top buffers (the children of the root) is L, the number of leaves.

Proof
Every buffer that is filled by NEW has a weight of 1, and COLLAPSE creates a buffer whose weight is the sum of the weights of its input buffers. Looking at the tree of operations, every node weighs exactly as much as all its children together. Applying this recursively downward, the weight of a top buffer equals the number of leaves in the subtree rooted at it.

Definitely Small/Large
Let Q be the output of the algorithm. An element of the input stream is DS (DL) if it is definitely smaller (larger) than Q. To identify all the DS (DL) elements we start from the top buffers (the children of the root) and move towards the leaves: mark the elements of the top buffers as DS (DL) if they are smaller (larger) than Q.

Definitely Small/Large (cont'd)
When going from a parent to its children, mark as DS (DL) all elements in the child buffers that are smaller (larger) than the DS (DL) elements in their parent. We will pursue a way of counting how many DS (DL) elements exist.

Weighted DS/DL bound
The weight of an element is the weight of the buffer it is in. The weighted DS (DL) sum adds w(X) for every element of buffer X that is DS (DL). Let DS_top (DL_top) denote the weighted sum of the DS (DL) elements among the top buffers.

Lemma 3
⌈φ'kL⌉ − w_max ≤ DS_top ≤ ⌈φ'kL⌉ − 1. Right side: OUTPUT returns the element at position ⌈φ'kL⌉ of the weighted buffers, so fewer than ⌈φ'kL⌉ elements are smaller than it.

Lemma 3 (cont'd)
Left side: surrounding Q there are w(X_i) − 1 elements that are copies of Q. Had we asked for a slightly different quantile, we would have gotten a different copy of Q as the output, although it would have been a different element of the input stream. This error can be as large as w(X_i), which is bounded by w_max. Excluding the copies of Q below the position of Q, all earlier elements are DS for sure.

Lemma 3 (cont'd)
kL − ⌈φ'kL⌉ − w_max + 1 ≤ DL_top ≤ kL − ⌈φ'kL⌉. Right side: there are kL elements in total in the augmented stream, and Q is at position ⌈φ'kL⌉, so there are kL − ⌈φ'kL⌉ elements after the position of Q, some of which might be copies of Q.

Lemma 3 (cont'd)
Left side: there are kL − ⌈φ'kL⌉ elements after the position of Q. Of these, at most w_max − 1 are copies of Q (w_max copies including Q itself), after which all elements are DL.

Weighted DS
Consider a node Y of the tree corresponding to a COLLAPSE operation, and let Y have s > 0 DS elements. Consider the largest of these DS elements: it appears at position (s − 1)·w(Y) + offset(Y) in the sorted sequence of the elements of Y's children, with each element duplicated according to the weight of the buffer it originates from.

Weighted DS (cont'd)
Therefore the weighted sum of the DS elements among the children of Y is at least (s − 1)·w(Y) + offset(Y), which equals s·w(Y) − (w(Y) − offset(Y)).

Weighted DL
Similarly, let Y have l > 0 DL elements and consider the smallest of these DL elements. Counting from the end of the stream towards its beginning, it appears at position (l − 1)·w(Y) + (w(Y) − offset(Y)) in the sorted sequence of the elements of Y's children, with each element duplicated according to the weight of the buffer it originates from.

Weighted DL (cont'd)
Therefore the weighted sum of the DL elements among the children of Y is at least (l − 1)·w(Y) + (w(Y) − offset(Y)), which equals l·w(Y) − offset(Y).

DS/DL Conclusion
The weighted sum of the DS (DL) elements among the children of a node Y is smaller than the weighted sum of the DS (DL) elements of Y itself by at most w(Y) − offset(Y). So we can count DS (DL) elements from the top buffers towards the leaves, subtracting w(Y) − offset(Y) for each COLLAPSE on the way.

How many leaves?
Let DS_leaves (DL_leaves) denote the number of definitely-small (definitely-large) elements among the leaf buffers of the operations tree. Since the weight of a leaf is 1, DS_leaves (DL_leaves) is in fact the number of definitely-small (large) elements in the augmented stream.

Lemma 4
DS_leaves ≥ DS_top − (W − C + 1)/2
DL_leaves ≥ DL_top − (W − C + 1)/2

Lemma 4 – Proof
Starting at the top buffers, the initial weighted sum of DS (DL) elements is DS_top (DL_top). Each COLLAPSE that creates a node Y diminishes the weighted sum by at most w(Y) − offset(Y). Travelling down to the leaves, we account for this loss over all COLLAPSE operations.

Lemma 4 – Proof (cont'd)
The sum of w(Y) over all COLLAPSE operations is W, and the sum of offset(Y) over all COLLAPSE operations is at least (W + C − 1)/2 by Lemma 1. The total loss is therefore at most W − (W + C − 1)/2 = (W − C + 1)/2, which yields Lemma 4.

Lemma 5
The difference in rank between the true φ-quantile of the original input stream and the output of the algorithm is at most (W − C − 1)/2 + w_max.

Lemma 5 – Proof
Since there are L leaves, each of size k, there are k·L elements in total in the augmented input stream. The true φ'-quantile of the augmented stream is at position ⌈φ'kL⌉. The output of the algorithm can be any element that is neither DS nor DL.

Lemma 5 – Proof (cont'd)
So the output can be as small as the element at position DS_leaves + 1, or as large as the one at position kL − DL_leaves. The difference between the true φ'-quantile and the output can therefore be as large as ⌈φ'kL⌉ − DS_leaves − 1 or kL − DL_leaves − ⌈φ'kL⌉. Substituting DS_leaves from Lemma 4 gives: ⌈φ'kL⌉ − DS_leaves − 1 ≤ ⌈φ'kL⌉ − DS_top + (W − C + 1)/2 − 1.

Lemma 5 – Proof (cont'd)
Substituting ⌈φ'kL⌉ − DS_top ≤ w_max from Lemma 3, we get: ⌈φ'kL⌉ − DS_leaves − 1 ≤ w_max + (W − C + 1)/2 − 1 = w_max + (W − C − 1)/2. The same bound can be established for the quantity kL − DL_leaves − ⌈φ'kL⌉.

Approx. bound Munro-Paterson
Requires two buffers at the leaf level and one buffer at every other level except the root, so the height is at most b. The original paper assumes there are exactly 2^(b-1) leaves and that the final OUTPUT operation takes two buffers of weight 2^(b-2) as inputs.

Approx. bound Munro-Paterson
W = (b-2)*2^(b-1), since the total weight of the nodes at each level is 2^(b-1) and COLLAPSE outputs occupy all levels except the leaves and the root. C = 2^(b-1)-2, since a tree of height b-1 (ignoring the leaves) has 2^(b-1)-1 nodes; removing the root yields the proper value. w_max = 2^(b-2), since the heaviest COLLAPSE output covers the entire tree under one child of the root.

Approx. bound Munro-Paterson
Plugging these values into Lemma 5 yields: (W-C-1)/2 + w_max = (b-2)*2^(b-2) + 1/2. This value has to be smaller than ε·N for the output to be an ε-approximate quantile.
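A quick numeric check of this closed form (a sketch; it assumes the 2^(b-1)-leaf setting above, and the function name is mine):

```python
def munro_paterson_bound(b):
    # (W - C - 1)/2 + w_max for the Munro-Paterson schedule with 2**(b-1) leaves.
    W = (b - 2) * 2 ** (b - 1)
    C = 2 ** (b - 1) - 2
    w_max = 2 ** (b - 2)
    return (W - C - 1) / 2 + w_max
```

The value agrees with (b-2)*2^(b-2) + 1/2 for every b ≥ 4.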

Approx. bound Alsabti-Ranka-Singh
b is assumed to be even (since b/2 is used). C = b/2. W = (b/2)^2, since there are b/2 COLLAPSE operations, each combining b/2 buffers of weight 1 into a buffer of weight b/2. w_max = b/2, since all COLLAPSE operations are identical. L = (b/2)^2, since the root has b/2 children, each of which has b/2 children.

Approx. bound Alsabti-Ranka-Singh
Plugging these values into Lemma 5 yields: (W-C-1)/2 + w_max = [(b^2)/4 - b/2 - 1]/2 + b/2 = (b^2)/8 + b/4 - 1/2. This value has to be smaller than ε·N for the output to be an ε-approximate quantile.
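The same kind of check for this scheme (sketch, name mine; b even):

```python
def ars_bound(b):
    # (W - C - 1)/2 + w_max for the Alsabti-Ranka-Singh schedule, b even.
    C, W, w_max = b // 2, (b // 2) ** 2, b // 2
    return (W - C - 1) / 2 + w_max
```

The value agrees with (b^2)/8 + b/4 - 1/2 for every even b ≥ 4.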

Approx. bound new-Algorithm
The values W, C, w_max are a function of the height of the tree, denoted h, in addition to b. The height of the tree is not restricted by b, unlike in the previous schemes. Assume h ≥ 3 (so that there is a level of COLLAPSE operations besides the leaves and the root).

Approx. bound new-Algorithm

Plugging these values into Lemma 5 yields a bound that has to be smaller than ε·N for the output to be an ε-approximate quantile.

Memory Usage Comparison

Memory Usage (cont'd)
Why does the curve of the Munro-Paterson algorithm have these kinks? We optimize under two constraints, the first of which is (b-2)*2^(b-2) + 1/2 ≤ ε·N. As N increases, k is increased until N reaches a threshold at which adding 1 to b (constraint 1) halves k, roughly halving the memory usage.

Multiple Quantiles
The analysis did not assume that only a single quantile is requested, nor did it use the specific quantile until the last operation (OUTPUT), which selects a single element from the top buffers. Conclusion: any algorithm in this framework can output multiple quantiles at the same memory cost as computing a single quantile.

Space Complexity
The best space complexity is achieved for b = h. The ε-approximation constraint can be relaxed a little to show that b = h = O(log(εN)).

Space Complexity (cont'd)
The second constraint is kL ≥ N. Replacing L with its value yields k = (1/ε)·O(b) = (1/ε)·O(log(εN)) = O((1/ε)·log(εN)).

Space Complexity (cont'd)
The overall space complexity is b·k = O((1/ε)·log^2(εN)).
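These asymptotics can be turned into a rough parameter picker. This is a heuristic sketch with illustrative constants (not the paper's exact optimization, and the function name is mine):

```python
import math

def suggest_parameters(N, eps):
    # b = h = O(log(eps*N)) buffers of size k = O((1/eps) * log(eps*N)).
    b = max(3, math.ceil(math.log2(max(2.0, eps * N))))
    k = math.ceil(b / eps)
    return b, k
```

For N = 10^6 and ε = 0.01 this suggests on the order of a dozen buffers of a few thousand elements each, far less than storing the stream.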

Parallel Version
The new algorithm scales very well on parallel machines. The input stream can be divided among the processors either statically (each one takes T values) or dynamically. Up to the point where the top buffers (the children of the root, which are the inputs of the OUTPUT operation) are produced, parallelism is straightforward.

Sampling-based Algorithm
The deterministic algorithm presented earlier, coupled with sampling, can reduce the memory requirements dramatically. Interestingly, we will achieve a space bound that is independent of N. We add a new input parameter, δ: the probability that the output is correct is required to be 1 − δ.

Hoeffding's Inequality
Let X_1, …, X_n be independent random variables with 0 ≤ X_i ≤ 1 for i = 1,…,n, and let X = X_1 + … + X_n. Let E(X) denote the expected value of X. Then, for any λ > 0: Pr[X − E(X) ≥ λ] ≤ exp(−2λ²/n).
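A small Monte-Carlo sanity check of the inequality (a sketch using uniform [0,1] variables and a fixed seed; the function name and parameters are mine):

```python
import math
import random

def hoeffding_check(n=500, lam=20.0, trials=2000, seed=1):
    # Empirically compare Pr[X - E(X) >= lam] with exp(-2*lam**2/n)
    # for X a sum of n independent Uniform(0,1) variables (E(X) = n/2).
    rng = random.Random(seed)
    exceed = sum(
        (sum(rng.random() for _ in range(n)) - n / 2) >= lam
        for _ in range(trials)
    )
    return exceed / trials, math.exp(-2 * lam ** 2 / n)
```

For these parameters the bound is exp(−1.6) ≈ 0.20, while the empirical frequency is far smaller, as expected from a tail bound.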

Lemma 7
Let ε = ε₁ + ε₂. A total of S samples (for a suitable S) drawn from a population of N elements is enough to guarantee that the set of elements between the pair of positions ⌈(φ ± ε₁)·S⌉ in the sorted sequence of samples is a subset of the set of elements between the pair of positions ⌈(φ ± ε)·N⌉ in the sorted sequence of the N elements.

Proof
We say a sample is “bad” if it does not satisfy the property above; otherwise it is “good”. Let N_(φ−ε) (N_(φ+ε)) denote the elements preceding (succeeding) the (φ−ε)-quantile ((φ+ε)-quantile) among the N elements. A sample of size S is “bad” iff more than ⌈(φ − ε₁)·S⌉ elements are drawn from N_(φ−ε), or more than S − ⌈(φ + ε₁)·S⌉ elements are drawn from N_(φ+ε).

Proof (cont'd)
The probability that more than ⌈(φ − ε₁)·S⌉ elements are drawn from N_(φ−ε) is bounded as follows: drawing S elements from a population of N can be seen as S independent coin tosses, each succeeding with probability φ − ε, so the expected number of successful tosses is (φ − ε)·S.

Proof (Cont ’ d) The probability that this occurs is:

Values of ε₁, ε₂
When ε₁ is close to ε (ε₂ close to 0), the number of samples required becomes very large. When ε₁ is close to 0, the approximation guarantee required from the deterministic algorithm becomes very strict. In either case the memory requirement is high.

Values of ε₁, ε₂ (cont'd)
We need to optimize ε₁, ε₂ to reduce the memory usage. The theoretical complexity can be determined by setting ε₁ = ε₂ = ε/2; the resulting sample size S and the new algorithm's space complexity then follow.

Values of ε₁, ε₂ (cont'd)
The space required to run the new algorithm on the samples is:

Multiple Quantiles
We want p different quantiles, each with error bound ε and confidence 1 − δ. Let ε = ε₁ + ε₂. We choose S samples and feed them all to the deterministic algorithm, which is ε₂-approximate, and read the p quantiles from its output buffers.

Multiple Quantiles (cont'd)
All quantiles are guaranteed, with probability ≥ 1 − δ, to be ε-approximate. Using Lemma 7 and substituting δ with δ' = δ/p, we compute the number of samples. The probability that any particular quantile is not ε-approximate is at most δ', so the probability that some quantile is not ε-approximate is at most p·δ' = δ.

Pros & Cons
(Pros) The randomized algorithm has a space complexity that is not a function of N. (Cons) When computing multiple quantiles the deterministic algorithm is unchanged, whereas the randomized algorithm requires a larger sample as the number of quantiles increases.