Presentation is loading. Please wait.

Presentation is loading. Please wait.

Charalampos (Babis) E. Tsourakakis SODA 2011 25 th January ‘11 SODA '111.

Similar presentations


Presentation on theme: "Charalampos (Babis) E. Tsourakakis SODA 2011 25 th January ‘11 SODA '111."— Presentation transcript:

1 Charalampos (Babis) E. Tsourakakis ctsourak@math.cmu.edu SODA 2011 25 th January ‘11 SODA '111

2 Richard Peng SCS, CMU Gary L. Miller SCS, CMU Russell Schwartz SCS & BioSciences CMU SODA '112

3  Motivation  Related Work  “Vanilla” DP algorithm  Our contributions  Analysis of the recurrence  Halfspaces and DP  Multiscale Monge optimization  Conclusions SODA '113

4 4 Array based comparative genomic hybridization (aCGH) log T/R for humans R=2 Genome

5  Near-by probes (genomic positions) tend to have the same DNA copy number.  Treat the data as 1d time series.  Fit piecewise constant segments. SODA '115 log T/R for humans R=2 Genome

6  Other applications  Histogram construction  Speech recognition  Data mining  Biology  and many more… SODA '116

7  Input: Noisy sequence (P 1,..,P n )  Output: (F 1,..,F n ) which minimizes  Digression: Constant C is determined by training on data with ground truth. SODA '117 Goodness of fitRegularization/avoid overfitting

8  Motivation  Related Work  “Vanilla” DP algorithm  Our contributions  Analysis of the recurrence  Halfspaces and DP  Multiscale Monge optimization  Conclusions SODA '118

9 9 Don Knuth Optimal BSTs in O(n 2 ) time Frances Yao Recurrence Quadrangle inequality (Monge) Then we can turn the naïve O(n 3 ) algorithm to O(n 2 )

10  Gaspard Monge SODA '1110 Quadrangle inequality Inverse Quadrangle inequality Transportation problems (1781)

11 SODA '1111 Eppstein Galil Giancarlo Larmore Schieber

12  SMAWK algorithm : finds all row minima of a totally monotone matrix NxN in O(N) time!  Bein, Golin, Larmore, Zhang showed that the Knuth-Yao technique is implied by the SMAWK algorithm.  Weimann [PhD thesis] improved state of the art results in several problems on planar graphs. SODA '1112

13 SODA '1113 Guha Koudas Shim Synopsis of data distributions: fit K segments to 1d time series. Monotonicity properties of the key quantities involved. (1+ε) approximation O(n+Κ 3 logn+K 2 /ε) time

14  Motivation  Related Work  “Vanilla” DP algorithm  Our contributions  Analysis of the recurrence  Halfspaces and DP  Multiscale Monge optimization  Conclusions SODA '1114

15  Recurrence for our optimization problem: SODA '1115

16  Compute nxn matrix M where M j,i = mean squared error for fitting a segment from point j to point i.  This can be done in O(n 2 ) time by keeping first and second moments in “online” way SODA '1116

17 SODA '1117 First Moments Squared Errors Recurse

18  Is the O(n 2 ) running time tight?  Probably not  What do halfspace queries have to do with this problem?  What is Multiscale Monge analysis? SODA '1118

19  Motivation  Related Work  “Vanilla” DP algorithm  Our contributions  Analysis of the recurrence  Halfspaces and DP  Multiscale Monge optimization  Conclusions SODA '1119

20  Technique 1: Using halfspace queries we get an approximation algorithm with ε additive error which runs in O(n 4/3+δ log(U/ε) ) time.  Technique 2: break carefully the original problem into a “small” number of Monge optimization problems. Approximates the optimal answer for the shifted objective within a factor of (1+ε), O(nlogn/ε) time. SODA '1120 ~

21  Motivation  Related Work  “Vanilla” DP algorithm  Our contributions  Analysis of the recurrence  Halfspaces and DP  Multiscale Monge optimization  Conclusions SODA '1121

22 SODA '1122

23  Let Claim: SODA '1123 This term kills the Monge property!

24  Basically, because we cannot be searching for the optimum breakpoint in a restricted range.  E.g., for C=1 and the sequence (0,2,0,2,….,0,2) : fit a segment per point (0,2,0,2,….,0,2,1): fit one segment for all points SODA '1124

25  Motivation  Related Work  “Vanilla” DP algorithm  Our contributions  Analysis of the recurrence  Halfspaces and DP  Multiscale Monge optimization  Conclusions SODA '1125

26 SODA '1126 ~ - ~

27 SODA '1127

28 SODA '1128 i fixed, binary search query constant ~ ~

29 29 Eppstein Agarwal Matousek

30  Hence the algorithm iterates through the indices i=1..n, and maintains the Eppstein et al. data structure containing one point for every j<i.  It performs binary search on the value, which reduces to emptiness queries  It provides an answer within ε additive error from the optimal one.  Running time: O(n 4/3+δ log(U/ε) ) SODA '1130 ~

31  Motivation  Related Work  “Vanilla” DP algorithm  Our contributions  Analysis of the recurrence  Halfspaces and DP  Multiscale Monge optimization  Conclusions SODA '1131

32  By simple algebra we can write our weight function w(j,i) as w’(j,i)/(i-j)+C where  The weight function w’ is Monge!  Key Idea: approximate i-j by a constant!  But how? SODA '1132

33  For each i, we break the choices of j into intervals [l k,r k ] s.t i-l k and i-r k differ by at most 1+ε.  Ο(logn/ε) such intervals suffice to get a 1+ε approximation.  However, we need to make sure that when we solve a specific subproblem, the optimum lies in the desired interval.  How? SODA '1133

34  M is a sufficiently large positive constant  Running time O(nlogn/ε) using an O(n) time per subproblem SODA '1134 Larmore Schieber

35  Motivation  Related Work  “Vanilla” DP algorithm  Our contributions  Analysis of the recurrence  Halfspaces and DP  Multiscale Monge optimization  Conclusions SODA '1135

36  Two new techniques for approximate DP for a recurrence not treated by existing methods:  Halfspace emptiness queries  Multiscale Monge Decomposition SODA '1136

37 SODA '1137

38 Thanks a lot! SODA '1138


Download ppt "Charalampos (Babis) E. Tsourakakis SODA 2011 25 th January ‘11 SODA '111."

Similar presentations


Ads by Google