Approximating Points by A Piecewise Linear Function: II

Approximating Points by A Piecewise Linear Function: II. Dealing with Outliers
Danny Z. Chen and Haitao Wang
Computer Science and Engineering, University of Notre Dame, Indiana, USA

Problem Definitions
Input: A point set P in 2-D
Output: An approximation function f
Error: Vertical distance, e(P,f) = max{error of each point}
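As a small illustration of the error measure, the sketch below evaluates e(P,f) for an arbitrary function f. The function name max_vertical_error and the sample data are hypothetical and only serve the example.

```python
def max_vertical_error(points, f):
    """e(P, f): the largest vertical distance |y - f(x)| over all points (x, y)."""
    return max(abs(y - f(x)) for x, y in points)

# Hypothetical data: three points and the constant function f(x) = 1.
P = [(0, 0), (1, 2), (2, 1)]
print(max_vertical_error(P, lambda x: 1))
```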

Two Problem Versions
min-#: Given: An error tolerance ε ≥ 0. Goal: f of minimized size, with e(P,f) ≤ ε
min-ε: Given: k > 0. Goal: f of minimized error e(P,f), with size ≤ k

Problem Variations
Step function (SF)

Problem Variations (cont.)
Piecewise-linear function (PF)

Weighted versions for both SF and PF: WSF and WPF
Every point p_i has a weight u_i
Error of each point: u_i × vertical distance

Violation versions
Allow a given number g of violation points
e(P,f) = max{error of every non-violation point}
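For a fixed function f, the best g violation points are simply the g points with the largest (weighted) errors, so the violation error is the (g+1)-th largest error. The sketch below shows this. The name violation_error and the optional weights argument are hypothetical, and weights default to 1 for the unweighted versions.

```python
def violation_error(points, f, g, weights=None):
    """Error of f on `points` when the g worst points may be ignored.

    Each point contributes u * |y - f(x)|; dropping the g largest
    contributions leaves the (g+1)-th largest as the maximum.
    """
    if weights is None:
        weights = [1.0] * len(points)
    errs = sorted((u * abs(y - f(x)) for (x, y), u in zip(points, weights)),
                  reverse=True)
    # If every point may be a violation, nothing is left to measure.
    return errs[g] if g < len(errs) else 0.0

# Hypothetical data: one obvious outlier at y = 10.
P = [(0, 0), (1, 10), (2, 1)]
print(violation_error(P, lambda x: 0.5, g=0))  # outlier counted
print(violation_error(P, lambda x: 0.5, g=1))  # outlier ignored
```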

Violation versions of SF and WSF: VSF and VWSF

Violation versions of PF and WPF: VPF and VWPF

Problem Summary
VSF, VPF
Weighted versions: VWSF, VWPF
min-# and min-ε

Motivations
Applications where some noisy data must be removed
Find the outliers optimally

Previous Work
        min-#                                min-ε
VSF     O(n g^2) [Fournier, Vigneron, 08]    O(n g^2 log n) [Fournier, Vigneron, 08]
VWSF    -                                    -
VPF     -                                    -
VWPF    -                                    -

Our Results
            min-#               min-ε
VSF         O(n g^2)            O(n g^2 log n) [Fournier, Vigneron, 08]; O(n g^3 k log(log* n))
VWSF        O(n g^2)            O(n^2 + n g^2 log n)
VPF, VWPF   O(n g^4 log^2 n)    O(n k g^2 (n log g + g^3 n^δ)); O(n g log n (n log g + g^3 n^δ))

Technical Contributions
One min-# framework
One min-ε framework
A min-ε algorithm for VWSF
A min-ε algorithm for VWPF
Data structures which are of independent interest

min-# Framework
Problem: Given ε. Goal: A function of minimized size. Constraint: The function error is ≤ ε
Dynamic programming
Sub-problems N(i,t): The minimum number of segments for P_{i,n} (the points p_i, ..., p_n) with t violations, for any 1 ≤ i ≤ n, 0 ≤ t ≤ g
Goal: N(1,g)
N(i,t) = 1 + min{N(r_{i,q}+1, t-q)} for any 0 ≤ q ≤ t

min-# Framework (cont.)
r_{i,q}: The rightmost point such that all points from i to r_{i,q} can be approximated by one segment with q violations and error ≤ ε

N(i,t) = 1 + min{N(r_{i,q}+1, t-q)} for any 0 ≤ q ≤ t
Use one segment with q violations to approximate the points from i to r_{i,q}, and use N(r_{i,q}+1, t-q) segments with t-q violations to approximate the points from r_{i,q}+1 to n

Time of the framework: O(n g^2 + T), where O(T) is the time for computing all ng values r_{i,q}
Major component: Compute r_{i,q} for all 1 ≤ i ≤ n, 0 ≤ q ≤ g
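The recurrence for N(i,t) can be sketched for the step-function case as follows. This is a brute-force illustration, not the O(n g^2) algorithm of the talk: the names feasible_sf and min_segments are hypothetical, the points are assumed sorted by x and represented only by their y-values, indices are 0-based, and "q violations" is read as "at most q violations".

```python
from functools import lru_cache

def feasible_sf(ys, q, eps):
    """True if one horizontal segment fits ys within eps after dropping <= q values."""
    need = len(ys) - q
    if need <= 0:
        return True
    s = sorted(ys)
    # Some run of `need` consecutive sorted values must span at most 2 * eps.
    return any(s[a + need - 1] - s[a] <= 2 * eps
               for a in range(len(s) - need + 1))

def min_segments(ys, g, eps):
    """Minimum number of step-function segments with error <= eps and <= g violations."""
    n = len(ys)

    def r(i, q):
        # Rightmost index j such that ys[i..j] fits one segment with q violations.
        j = i
        while j + 1 < n and feasible_sf(ys[i:j + 2], q, eps):
            j += 1
        return j

    @lru_cache(maxsize=None)
    def N(i, t):
        if i >= n:          # no points left: no segments needed
            return 0
        return 1 + min(N(r(i, q) + 1, t - q) for q in range(t + 1))

    return N(0, g)

# Hypothetical data with one outlier (the value 10).
ys = [0, 0, 10, 0, 0, 5, 5]
print(min_segments(ys, g=0, eps=0), min_segments(ys, g=1, eps=0))
```

Allowing a single violation lets the first five points share one segment, which is the effect the violation versions are designed to capture.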

A scheme for computing all r_{i,q}'s
Key: A fully dynamic data structure that maintains a point set and supports
Point insertion: O(I) time
Point deletion: O(D) time
Feasibility test: Whether the current set can be approximated by one segment with q violations, O(F) time
Time of the scheme: O(n g (I + D + F + g))
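One way to picture how insertion, deletion, and feasibility tests combine is a two-pointer sweep: for a fixed q, the rightmost reachable index never moves left as i advances, so each point is inserted and deleted a bounded number of times. The sketch below is a simplified illustration for step functions and is not claimed to be the scheme from the talk. SortedWindow and all_rightmost are hypothetical names, and the feasibility test here is a plain linear scan rather than an efficient O(F) structure.

```python
import bisect

class SortedWindow:
    """Toy dynamic set of y-values kept in sorted order."""
    def __init__(self):
        self.ys = []

    def insert(self, y):
        bisect.insort(self.ys, y)

    def delete(self, y):
        self.ys.pop(bisect.bisect_left(self.ys, y))

    def feasible(self, q, eps):
        # One horizontal segment fits if some run of (size - q) sorted
        # values spans at most 2 * eps.
        need = len(self.ys) - q
        if need <= 0:
            return True
        return any(self.ys[a + need - 1] - self.ys[a] <= 2 * eps
                   for a in range(len(self.ys) - need + 1))

def all_rightmost(ys, g, eps):
    """r[i][q]: rightmost index j such that ys[i..j] fits one segment with q violations."""
    n = len(ys)
    r = [[0] * (g + 1) for _ in range(n)]
    for q in range(g + 1):
        win, j = SortedWindow(), -1      # window holds ys[i..j]
        for i in range(n):
            while j + 1 < n:
                win.insert(ys[j + 1])
                if win.feasible(q, eps):
                    j += 1
                else:
                    win.delete(ys[j + 1])  # undo the failed extension
                    break
            r[i][q] = j
            win.delete(ys[i])              # slide the left end forward
    return r

print(all_rightmost([0, 0, 10, 0, 0, 5, 5], g=1, eps=0))
```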

VSF and VWSF
Data structure: Four arrays, and the range-minima
Time: O(g) per operation (insertion, deletion, feasibility test)
Computing all r_{i,q}'s: O(n g^2) time
min-# algorithm: O(n g^2) time

VPF and VWPF
Data structure: The fully dynamic convex hull data structure [Overmars, 81] (not the one in [Brodal, 02])
Insertion and deletion: O(log^2 n)
Feasibility test: Algorithm for 2-D general LP with q violations, O(q^3 log^2 n)
Computing all r_{i,q}'s: O(n g^4 log^2 n) time

min-ε Framework
Problem: Given k. Goal: A function of minimized error. Constraint: The function size is ≤ k
Dynamic programming
Sub-problems E(j,l,t): The minimum error for P_{1,j} (the points p_1, ..., p_j) using l segments and t violations, for any 1 ≤ j ≤ n, 0 ≤ l ≤ k, 0 ≤ t ≤ g
Goal: E(n,k,g)
E(j,l,t) = min{max{E(i-1, l-1, t-q), w_{i,j,q}}} for 1 < i ≤ j, 0 ≤ q ≤ t

min-ε Framework (cont.)
w_{i,j,q}: The minimum error to approximate P_{i,j} with q violations by one segment

E(j,l,t) = min{max{E(i-1, l-1, t-q), w_{i,j,q}}} for 1 < i ≤ j, 0 ≤ q ≤ t
Use one segment with q violations to approximate P_{i,j}, and use l-1 segments with t-q violations to approximate P_{1,i-1}

A straightforward way: O(n^2 g^2 k W), where O(W) is the time for computing each w_{i,j,q}
Not efficient
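The straightforward evaluation of the recurrence for E(j,l,t) can be sketched for step functions as below. This is only the inefficient baseline, under several assumptions that are not from the talk: the names min_error, w, and E mirror the slide symbols but are hypothetical helpers, points are given by their y-values in x-order, the first segment may start at i = 1 with the base case E(0,·,·) = 0, and "q violations" is read as "at most q".

```python
from functools import lru_cache

def min_error(ys, k, g):
    """Minimum step-function error using at most k segments and g violations."""
    n = len(ys)

    @lru_cache(maxsize=None)
    def w(i, j, q):
        # Best one-segment error for points i..j (1-based) with q violations:
        # half the smallest span of (count - q) consecutive sorted values.
        s = sorted(ys[i - 1:j])
        need = len(s) - q
        if need <= 0:
            return 0.0
        return min(s[a + need - 1] - s[a]
                   for a in range(len(s) - need + 1)) / 2

    @lru_cache(maxsize=None)
    def E(j, l, t):
        if j == 0:               # empty prefix costs nothing
            return 0.0
        if l == 0:               # points remain but no segments are left
            return float("inf")
        return min(max(E(i - 1, l - 1, t - q), w(i, j, q))
                   for i in range(1, j + 1) for q in range(t + 1))

    return E(n, k, g)

ys = [0, 0, 10, 0, 0, 5, 5]      # hypothetical data with one outlier
print(min_error(ys, k=2, g=0), min_error(ys, k=2, g=1))
```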

Improvement
Transform 3-D to 2-D: a totally monotone 2-D matrix
Use the row-minima algorithm [Aggarwal et al., 87]
Time: O(n g^2 k W) (instead of O(n^2 g^2 k W))

Major Components
Compute w_{i,j,q}: The minimum error to approximate P_{i,j} with q violations

VSF and q-range-minima
Given: An array A[1…n]
Query (i,j): The q smallest elements in A[i…j], in sorted order
An extension of range-minima

VSF and q-range-minima (cont.)
A simple solution: (n, q log q), by range-minima
Our solution: (n log^2 q, q log(log* n))
VSF solution: O(n k g^3 log(log* n))
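The simple range-minima approach can be sketched as follows: take the minimum of the query range, split the range around it, and keep the minima of the resulting subranges in a heap, popping q times. The code is an illustration under assumptions, not the structure from the talk. It uses a sparse table, so preprocessing is O(n log n) rather than linear, indices are 0-based, and the names build_sparse, argmin, and q_smallest are hypothetical. Each query performs at most q pops and 2q pushes, giving O(q log q) time.

```python
import heapq

def build_sparse(A):
    """Sparse table of argmin indices; table[p][i] covers A[i : i + 2**p]."""
    n = len(A)
    table = [list(range(n))]
    p = 1
    while (1 << p) <= n:
        prev, half = table[-1], 1 << (p - 1)
        row = []
        for i in range(n - (1 << p) + 1):
            a, b = prev[i], prev[i + half]
            row.append(a if A[a] <= A[b] else b)
        table.append(row)
        p += 1
    return table

def argmin(A, table, lo, hi):
    """Index of a minimum of A[lo..hi] (inclusive) in O(1) time."""
    p = (hi - lo + 1).bit_length() - 1
    a, b = table[p][lo], table[p][hi - (1 << p) + 1]
    return a if A[a] <= A[b] else b

def q_smallest(A, table, i, j, q):
    """The q smallest elements of A[i..j], in sorted order."""
    out = []
    m = argmin(A, table, i, j)
    heap = [(A[m], m, i, j)]         # (value, position, range lo, range hi)
    while heap and len(out) < q:
        val, m, lo, hi = heapq.heappop(heap)
        out.append(val)
        if lo < m:                   # left part of the split range
            a = argmin(A, table, lo, m - 1)
            heapq.heappush(heap, (A[a], a, lo, m - 1))
        if m < hi:                   # right part of the split range
            b = argmin(A, table, m + 1, hi)
            heapq.heappush(heap, (A[b], b, m + 1, hi))
    return out

A = [5, 2, 9, 1, 7, 3, 8]            # hypothetical array
T = build_sparse(A)
print(q_smallest(A, T, 1, 5, 3))
```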

VPF and VWPF
To compute w_{i,j,q}: 3-D feasible LP with q violations
An O(n log q + q^3 n^δ) time solution in [Matousek, 94]
Total time: O(n g k (n log q + q^3 n^δ))

min-ε Algorithm for VWSF
Observation: ε* is determined by two points in P
Idea: Consider all pairs of points, using the min-# algorithm as a decision procedure
Time: O(n^2 + n g^2 log n)
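The idea of searching over pair-determined candidates with a min-# decision procedure can be sketched for weighted step functions as below. This is a brute-force illustration and does not achieve the stated running time. The assumptions are mine: two points (y_a, u_a) and (y_b, u_b) are taken to determine the candidate u_a u_b |y_a - y_b| / (u_a + u_b), weights are positive, k ≥ 1, points are given in x-order as (y, u) pairs, and the names feasible, min_size, and min_eps are hypothetical. Exact rationals avoid rounding trouble at the candidate values.

```python
from fractions import Fraction
from functools import lru_cache

def feasible(pts, q, eps):
    """True if some height leaves at most q points with u * |y - height| > eps."""
    need = len(pts) - q
    if need <= 0:
        return True
    events = []
    for y, u in pts:
        events.append((y - eps / u, 0))   # interval opens (sorts first at ties)
        events.append((y + eps / u, 1))   # interval closes
    depth = best = 0
    for _, kind in sorted(events):
        depth += 1 if kind == 0 else -1
        best = max(best, depth)
    return best >= need

def min_size(pts, g, eps):
    """min-# decision procedure: fewest segments with error <= eps, <= g violations."""
    n = len(pts)

    def r(i, q):
        j = i
        while j + 1 < n and feasible(pts[i:j + 2], q, eps):
            j += 1
        return j

    @lru_cache(maxsize=None)
    def N(i, t):
        if i >= n:
            return 0
        return 1 + min(N(r(i, q) + 1, t - q) for q in range(t + 1))

    return N(0, g)

def min_eps(pts, k, g):
    """Smallest candidate error for which at most k segments suffice."""
    pts = [(Fraction(y), Fraction(u)) for y, u in pts]
    cands = sorted({Fraction(0)} |
                   {ua * ub * abs(ya - yb) / (ua + ub)
                    for a, (ya, ua) in enumerate(pts)
                    for (yb, ub) in pts[a + 1:]})
    lo, hi = 0, len(cands) - 1
    while lo < hi:                        # binary search on sorted candidates
        mid = (lo + hi) // 2
        if min_size(pts, g, cands[mid]) <= k:
            hi = mid
        else:
            lo = mid + 1
    return cands[lo]

print(min_eps([(0, 1), (6, 2)], k=1, g=0))
```

The binary search is valid because the minimum size returned by the decision procedure never increases as ε grows.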

min-ε Algorithm for VWPF
Observation: ε* is equal to some w_{i,j,q}
A naïve solution: Compute all w_{i,j,q}'s and then find ε*
Not efficient: There are O(n^2 g) w_{i,j,q}'s
Improved solution: Binary search on sorted arrays
Only O(n g log n) w_{i,j,q}'s need to be computed
Use the min-# algorithm as a decision procedure

Thank You
Questions?
