Published by Καλόγερος Στεφανόπουλος. Modified over 5 years ago.
1
Approximating Points by a Piecewise Linear Function: II. Dealing with Outliers
Danny Z. Chen and Haitao Wang
Computer Science and Engineering, University of Notre Dame, Indiana, USA
2
Problem Definitions
Input: A point set $P$ in 2-D
Output: An approximation function $f$
Error: Vertical distance, $e(P,f) = \max\{\text{error of each point}\}$
(Figure label: error)
3
Two Problem Versions
min-#: Given: An error tolerance $\varepsilon \ge 0$. Goal: $f$ of minimized size, with $e(P,f) \le \varepsilon$
min-ε: Given: $k > 0$. Goal: $f$ of minimized error $e(P,f)$, with size $\le k$
4
Problem Variations
Step function (SF)
(Figure label: error)
5
Problem Variations (cont.)
Piecewise-linear function (PF)
(Figure label: error)
6
Problem Variations (cont.)
Weighted versions for both SF and PF: WSF and WPF
Every point $p_i$ has a weight $u_i$
Error of each point: $u_i \times$ vertical distance
7
Problem Variations (cont.)
Violation versions: Allow a given number $g$ of violation points
$e(P,f) = \max\{\text{error of every non-violation point}\}$
(Figure labels: error, violation)
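As a small illustration of the violation-version error, the sketch below assumes the per-point errors against some function f have already been computed. The function name `violation_error` and the sample values are hypothetical and not from the slides. Dropping the g largest errors is the best choice of violation points once f is fixed.

```python
def violation_error(errors, g):
    """Return e(P, f) when up to g points may be treated as violations.

    errors: per-point errors, already measured against some fixed f.
    The g largest errors are dropped; the result is the largest error
    that remains, or 0.0 if no point remains.
    """
    if g < 0:
        raise ValueError("g must be non-negative")
    kept = sorted(errors, reverse=True)[g:]
    return kept[0] if kept else 0.0


# Hypothetical vertical distances of five points to some f.
point_errors = [0.5, 7.0, 1.0, 0.2, 3.0]
print(violation_error(point_errors, 0))  # no violations allowed
print(violation_error(point_errors, 1))  # the worst point is a violation
print(violation_error(point_errors, 2))  # the two worst points are violations
```

With g = 0 this reduces to the plain maximum error of the first slide.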
8
Problem Variations (cont.)
Violation versions of SF and WSF: VSF and VWSF
(Figure labels: error, violation)
9
Problem Variations (cont.)
Violation versions of PF and WPF: VPF and VWPF
(Figure labels: error, violation)
10
Problem Summary
VSF, VPF
Weighted versions: VWSF, VWPF
Each in two versions: min-# and min-ε
11
Motivations
Applications where some noisy data must be removed
Find the outliers optimally
12
Previous Work (rows: problem; columns: min-#, min-ε)
VSF: min-#: $O(ng^2)$ [Fournier, Vigneron, 08]; min-ε: $O(ng^2 \log n)$ [Fournier, Vigneron, 08]
VWSF, VPF, VWPF: (no entries)
13
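Our Results (rows: problem; columns: min-#, min-ε)
VSF: min-#: $O(ng^2)$; min-ε: $O(ng^2 \log n)$ [Fournier, Vigneron, 08], $O(ng^3 k \log(\log^* n))$
VWSF: min-#: $O(ng^2)$; min-ε: $O(n^2 + ng^2 \log n)$
VPF, VWPF: min-#: $O(ng^4 \log^2 n)$; min-ε: $O(nkg^2(n \log g + g^3 n^\delta))$, $O(ng \log n\,(n \log g + g^3 n^\delta))$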
14
Technical Contributions
One min-# framework
One min-ε framework
A min-ε algorithm for VWSF
A min-ε algorithm for VWPF
Data structures which are of independent interest
15
min-# Framework
Problem: Given $\varepsilon$. Goal: A function of minimized size. Constraint: The function error is $\le \varepsilon$
Dynamic programming
Sub-problems $N(i,t)$: The minimum number of segments for $P_{i,n}$ with $t$ violations, for any $1 \le i \le n$, $0 \le t \le g$
Goal: $N(1,g)$
$N(i,t) = 1 + \min\{N(r_i^q + 1,\ t - q)\}$ for any $0 \le q \le t$
16
min-# Framework (cont.)
$r_i^q$: The rightmost point such that all points from $i$ to $r_i^q$ can be approximated by one segment with $q$ violations and the error is $\le \varepsilon$
(Figure labels: $q$ violations, point $r_i^q$, $p_i$)
17
min-# Framework (cont.)
$N(i,t) = 1 + \min\{N(r_i^q + 1,\ t - q)\}$ for any $0 \le q \le t$
Use one segment with $q$ violations to approximate the points from $i$ to $r_i^q$, and use $N(r_i^q + 1, t - q)$ segments with $t - q$ violations to approximate the points from $r_i^q + 1$ to $n$
(Figure labels: $N(i,t)$, $i$, $r_i^q$, $r_i^q + 1$, $n$, $N(r_i^q + 1, t - q)$)
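The recurrence can be sketched for the simplest case, unweighted step functions (VSF). This is only an illustration under stated assumptions: the names `one_segment_ok` and `min_segments` are hypothetical, indices are 0-based, "q violations" is read as "at most q violations", and each $r_i^q$ is found by brute force rather than by the data structures discussed in the talk.

```python
from functools import lru_cache


def one_segment_ok(ys, q, eps):
    """Can ys be fit by one horizontal segment with error <= eps
    after discarding at most q values?"""
    need = len(ys) - q              # number of values that must be covered
    if need <= 0:
        return True
    s = sorted(ys)
    # Some run of `need` consecutive sorted values must span at most 2*eps.
    return any(s[a + need - 1] - s[a] <= 2 * eps
               for a in range(len(s) - need + 1))


def min_segments(ys, g, eps):
    """min-# for unweighted step functions, with brute-force r_i^q."""
    n = len(ys)

    def r(i, q):
        # Rightmost 0-based index j such that ys[i..j] fits one segment
        # with at most q violations. Feasibility is monotone in j.
        j = i
        while j + 1 < n and one_segment_ok(ys[i:j + 2], q, eps):
            j += 1
        return j

    @lru_cache(maxsize=None)
    def N(i, t):
        if i >= n:                  # no points left to approximate
            return 0
        return 1 + min(N(r(i, q) + 1, t - q) for q in range(t + 1))

    return N(0, g)


ys = [1, 1, 9, 1, 5, 5, 5]
print(min_segments(ys, 0, 0))       # no violations, zero tolerance
print(min_segments(ys, 1, 0))       # one violation allowed
print(min_segments(ys, 0, 2))       # no violations, tolerance 2
```

Only the y-coordinates matter for step functions once the points are sorted by x, which is why the sketch takes a plain list of values.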
18
min-# Framework (cont.)
Time of the framework: $O(ng^2 + T)$, where $O(T)$ is the time for computing all $ng$ values $r_i^q$
Major component: Compute $r_i^q$ for all $1 \le i \le n$, $0 \le q \le g$
19
min-# Framework (cont.)
A scheme for computing all $r_i^q$'s
Key: A fully dynamic data structure that maintains a point set and supports
Point insertion: $O(I)$ time
Point deletion: $O(D)$ time
Feasibility test: Whether the current set can be approximated by one segment with $q$ violations, $O(F)$ time
Time of the scheme: $O(ng(I + D + F + g))$
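One way to read this scheme is as a sliding window over the points for each fixed q: insert on the right, test feasibility, delete on the left. The sketch below follows that reading for unweighted step functions. The class `NaiveWindow` is a hypothetical stand-in built on a sorted list; it is not the structure from the talk and does not achieve the stated bounds. Indices are 0-based and "q violations" is read as "at most q".

```python
import bisect


class NaiveWindow:
    """Stand-in dynamic structure: the sorted y-values of the current set."""

    def __init__(self):
        self.sorted_ys = []

    def insert(self, y):
        bisect.insort(self.sorted_ys, y)

    def delete(self, y):
        # Assumes y is currently in the set.
        self.sorted_ys.pop(bisect.bisect_left(self.sorted_ys, y))

    def feasible(self, q, eps):
        """One horizontal segment, error <= eps, at most q violations?"""
        s = self.sorted_ys
        need = len(s) - q
        if need <= 0:
            return True
        return any(s[a + need - 1] - s[a] <= 2 * eps for a in range(q + 1))


def rightmost_indices(ys, q, eps):
    """r_i^q for every i and one fixed q, using a sliding window."""
    n = len(ys)
    window = NaiveWindow()
    result = []
    right = -1                        # window holds ys[i..right]
    for i in range(n):
        if right < i:                 # window is empty: restart at point i
            window.insert(ys[i])
            right = i
        # Extend while adding the next point keeps the set feasible.
        while right + 1 < n:
            window.insert(ys[right + 1])
            if window.feasible(q, eps):
                right += 1
            else:
                window.delete(ys[right + 1])
                break
        result.append(right)
        window.delete(ys[i])          # slide the left end past point i
    return result


ys = [1, 1, 9, 1, 5, 5, 5]
print(rightmost_indices(ys, 0, 0))
print(rightmost_indices(ys, 1, 0))
```

Because $r_i^q$ never decreases as i grows, the window performs O(n) insertions, deletions, and feasibility tests per value of q.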
20
VSF and VWSF
Data structure: Four arrays, and the range-minima
Time: $O(g)$ per operation (insertion, deletion, feasibility test)
Computing all $r_i^q$'s: $O(ng^2)$ time
min-# algorithm: $O(ng^2)$ time
21
VPF and VWPF
Data structure: The fully dynamic convex hull data structure [Overmars, 81] (not the one in [Brodal, 02])
Insertion and deletion: $O(\log^2 n)$
Feasibility test: Algorithm for 2-D general LP with $q$ violations, $O(q^3 \log^2 n)$
Computing all $r_i^q$'s: $O(ng^4 \log^2 n)$ time
22
min-ε Framework
Problem: Given $k$. Goal: A function of minimized error. Constraint: The function size is $\le k$
Dynamic programming
Sub-problems $E(j,l,t)$: The minimum error for $P_{1,j}$ using $l$ segments and $t$ violations, for any $1 \le j \le n$, $0 \le l \le k$, $0 \le t \le g$
Goal: $E(n,k,g)$
$E(j,l,t) = \min\{\max\{E(i-1,\ l-1,\ t-q),\ w_{ij}^q\}\}$ for $1 < i \le j$, $0 \le q \le t$
23
min-ε Framework (cont.)
$w_{ij}^q$: The minimum error to approximate $P_{ij}$ with $q$ violations by one segment
(Figure labels: $q$ violations, $w_{ij}^q$, $p_i$, $p_j$)
24
min-ε Framework (cont.)
$E(j,l,t) = \min\{\max\{E(i-1,\ l-1,\ t-q),\ w_{ij}^q\}\}$ for $1 < i \le j$, $0 \le q \le t$
Use one segment with $q$ violations to approximate $P_{ij}$, and use $l-1$ segments with $t-q$ violations to approximate $P_{1,i-1}$
(Figure labels: $E(j,l,t)$, $1$, $i-1$, $i$, $j$, $q$ outliers, $E(i-1, l-1, t-q)$)
25
min-ε Framework (cont.)
A straightforward way: $O(n^2 g^2 k W)$, where $O(W)$ is the time for computing each $w_{ij}^q$
Not efficient
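The straightforward evaluation can be sketched for unweighted step functions as follows. This is an illustration only: the names `w` and `min_error` are hypothetical, the last segment is allowed to start at i = 1 with an assumed base case E(0, l, t) = 0 (the slides do not spell out the base case), and l and t are read as "at most l segments" and "at most t violations".

```python
import math


def w(ys, q):
    """Minimum error of one horizontal segment on ys with at most q violations."""
    need = len(ys) - q
    if need <= 0:
        return 0.0
    s = sorted(ys)
    # Keep `need` consecutive sorted values; the error is half their span.
    return min(s[a + need - 1] - s[a] for a in range(q + 1)) / 2.0


def min_error(ys, k, g):
    """Direct evaluation of E(j, l, t), trying every i and q."""
    n = len(ys)
    # E[j][l][t]: best error for the first j points.
    E = [[[math.inf] * (g + 1) for _ in range(k + 1)] for _ in range(n + 1)]
    for l in range(k + 1):
        for t in range(g + 1):
            E[0][l][t] = 0.0          # assumed base case: empty prefix
    for j in range(1, n + 1):
        for l in range(1, k + 1):
            for t in range(g + 1):
                best = math.inf
                for i in range(1, j + 1):     # last segment covers points i..j
                    for q in range(t + 1):
                        cand = max(E[i - 1][l - 1][t - q], w(ys[i - 1:j], q))
                        best = min(best, cand)
                E[j][l][t] = best
    return E[n][k][g]


ys = [1, 1, 9, 1, 5, 5, 5]
print(min_error(ys, 1, 0))
print(min_error(ys, 3, 0))
print(min_error(ys, 2, 1))
```

The five nested loops make the $n^2 g^2 k$ factor visible; each call to `w` plays the role of the $O(W)$ term.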
26
Improvement
Transform 3-D to 2-D
A totally monotone 2-D matrix
Use row-minima algorithm [Aggarwal et al., 87]
Time: $O(ng^2 kW)$ (instead of $O(n^2 g^2 kW)$)
27
Major Components
Compute $w_{ij}^q$: The minimum error to approximate $P_{ij}$ with $q$ violations
(Figure labels: $q$ violations, $w_{ij}^q$, $p_i$, $p_j$)
28
VSF and q-range-minima
Given: An array $A[1 \dots n]$
Query $(i,j)$: The $q$ smallest elements in $A[i \dots j]$, in sorted order
An extension of range-minima
29
VSF and q-range-minima (cont.)
A simple solution: $(n,\ q \log q)$, by range-minima
Our solution: $(n \log^2 q,\ q \log(\log^* n))$
VSF solution: $O(nkg^3 \log(\log^* n))$
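A common way to answer such queries on top of range-minima is to keep a heap of subranges keyed by their minimum and split a range around its minimum each time it is popped. The sketch below follows that idea; it is an assumption that this matches the "simple solution" on the slide, and the pair of bounds is read here as (preprocessing, query). The sparse table used for range-minima takes O(n log n) preprocessing rather than O(n); the function names are hypothetical and indices are 0-based.

```python
import heapq


def build_sparse_table(a):
    """table[p][i] = index of the minimum of a[i : i + 2**p]."""
    n = len(a)
    table = [list(range(n))]
    p = 1
    while (1 << p) <= n:
        prev, half = table[-1], 1 << (p - 1)
        row = []
        for i in range(n - (1 << p) + 1):
            left, right = prev[i], prev[i + half]
            row.append(left if a[left] <= a[right] else right)
        table.append(row)
        p += 1
    return table


def argmin(a, table, lo, hi):
    """Index of the minimum of a[lo..hi] (inclusive), in O(1) time."""
    p = (hi - lo + 1).bit_length() - 1
    left, right = table[p][lo], table[p][hi - (1 << p) + 1]
    return left if a[left] <= a[right] else right


def q_smallest(a, table, i, j, q):
    """The q smallest elements of a[i..j] (inclusive), in sorted order."""
    out = []
    if i > j:
        return out
    m = argmin(a, table, i, j)
    heap = [(a[m], m, i, j)]          # (value, position of min, lo, hi)
    while heap and len(out) < q:
        value, m, lo, hi = heapq.heappop(heap)
        out.append(value)
        # Split the range around m; each side offers its own minimum.
        for sub_lo, sub_hi in ((lo, m - 1), (m + 1, hi)):
            if sub_lo <= sub_hi:
                sm = argmin(a, table, sub_lo, sub_hi)
                heapq.heappush(heap, (a[sm], sm, sub_lo, sub_hi))
    return out


a = [5, 2, 8, 1, 9, 3, 7]
table = build_sparse_table(a)
print(q_smallest(a, table, 1, 5, 3))
```

Each of the q pops adds at most two heap entries, so a query costs O(q log q) once the table is built.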
30
VPF and VWPF
To compute $w_{ij}^q$: 3-D feasible LP with $q$ violations
An $O(n \log q + q^3 n^\delta)$ time solution in [Matousek, 94]
Total time: $O(ngk(n \log q + q^3 n^\delta))$
31
min-ε Algorithm for VWSF
Observation: $\varepsilon^*$ is determined by two points in $P$
Idea: Consider all pairs of points, using the min-# algorithm as a decision procedure
Time: $O(n^2 + ng^2 \log n)$
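The idea of searching over pair-determined candidates with a decision procedure can be sketched as follows. Several parts are assumptions for illustration: the candidate formula $u_i u_j |y_i - y_j| / (u_i + u_j)$ is the error at which two weighted points can first share one horizontal segment, weights are taken to be positive, the candidates are explicitly sorted (which costs more than the bound on the slide), and `fits_one_segment` is a toy decision procedure for k = 1 and g = 0 standing in for the min-# algorithm. Exact rational arithmetic avoids rounding at the boundary.

```python
from fractions import Fraction
from itertools import combinations


def candidate_errors(ys, us):
    """Sorted candidate values for eps*: 0 plus one value per pair of points."""
    pts = [(Fraction(y), Fraction(u)) for y, u in zip(ys, us)]
    cands = {Fraction(0)}
    for (y1, u1), (y2, u2) in combinations(pts, 2):
        # Error at which the weighted intervals of the two points first touch.
        cands.add(u1 * u2 * abs(y1 - y2) / (u1 + u2))
    return sorted(cands)


def smallest_feasible(cands, decide):
    """Binary search for the smallest candidate accepted by a monotone decide.

    Assumes the largest candidate is accepted.
    """
    lo, hi = 0, len(cands) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if decide(cands[mid]):
            hi = mid
        else:
            lo = mid + 1
    return cands[lo]


def fits_one_segment(ys, us, eps):
    """Toy decision procedure (k = 1, g = 0):
    is there a y with u_i * |y_i - y| <= eps for every point?"""
    low = max(Fraction(y) - eps / Fraction(u) for y, u in zip(ys, us))
    high = min(Fraction(y) + eps / Fraction(u) for y, u in zip(ys, us))
    return low <= high


ys, us = [0, 4, 10], [1, 3, 1]
cands = candidate_errors(ys, us)
best = smallest_feasible(cands, lambda eps: fits_one_segment(ys, us, eps))
print(best)
```

Swapping the toy procedure for a test of the form "the min-# answer for this tolerance is at most k" gives the general shape described on the slide.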
32
min-ε Algorithm for VWPF
Observation: $\varepsilon^*$ is equal to some $w_{ij}^q$
A naïve solution: Compute all $w_{ij}^q$'s and then find $\varepsilon^*$
Not efficient: There are $O(n^2 g)$ $w_{ij}^q$'s
Improved solution: Binary search on sorted arrays
Only $O(ng \log n)$ $w_{ij}^q$'s need to be computed
Use the min-# algorithm as a decision procedure
33
Thank You
Questions?