Optimal Bandwidth Selection for MLS Surfaces
Hao Wang, Carlos E. Scheidegger, Claudio T. Silva
SCI Institute – University of Utah
Shape Modeling International 2008 – Stony Brook University
Point Set Surfaces
- Levin's MLS formulation
Neighborhood and Bandwidth
- Both steps of Levin's MLS involve three parameters: the weight function, the neighborhood, and the bandwidth.
- The second step of Levin's MLS is a weighted least-squares polynomial fit (a minimal sketch follows this slide).
- Bandwidth determination matters because of the overfitting/underfitting trade-off: too small a bandwidth overfits the noise, too large a bandwidth underfits the shape.
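A minimal sketch of a weighted least-squares polynomial fit in 1-D, in the spirit of the second MLS step. The Gaussian-style weight, the quadratic degree, and the function name are assumptions made for illustration, not the paper's exact formulation.

```python
import numpy as np

def weighted_poly_fit(x, y, x0, h, degree=2):
    """Weighted least-squares polynomial fit around x0 (1-D sketch).

    Points are weighted by a Gaussian-like kernel of bandwidth h,
    mirroring the second step of an MLS-style fit.
    """
    w = np.exp(-((x - x0) ** 2) / (h ** 2))   # per-point weights
    V = np.vander(x - x0, degree + 1)         # polynomial basis in (x - x0)
    W = np.diag(w)
    # Solve the normal equations (V^T W V) c = V^T W y
    coeffs = np.linalg.solve(V.T @ W @ V, V.T @ W @ y)
    return coeffs[-1]                         # constant term = fit value at x0
```

With noisy samples, a very small h reproduces the noise (overfitting) while a very large h flattens the underlying shape (underfitting), which is exactly the trade-off the bandwidth controls.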
Neighborhood and Bandwidth
- Common practice: exponential weight function, spherical neighborhood, heuristic bandwidth.
- Problems: optimality, anisotropic data.
- Weight function: the exponential is computationally expensive; other functions may be cheaper or reconstruct better. The APSS paper (SIGGRAPH 2007) uses (1 - x^2)^4 as the weight function.
- Neighborhood: a spherical neighborhood may not handle anisotropic data well; other neighborhood shapes may reconstruct better.
- Bandwidth: choosing the bandwidth empirically by trial and error requires human interaction, and even when heuristics produce visually acceptable reconstructions, the result may not be geometrically accurate. (The two weight functions are written out after this slide.)
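For reference, the two weight functions mentioned above, written as functions of the distance d and bandwidth h. The exact normalization (using d/h as the argument of the APSS weight) is an assumption here, not taken verbatim from the slide.

```latex
% Exponential (Gaussian-type) MLS weight and the APSS weight:
\theta_{\mathrm{exp}}(d) = \exp\!\left(-\frac{d^{2}}{h^{2}}\right),
\qquad
\theta_{\mathrm{APSS}}(d) = \left(1 - \frac{d^{2}}{h^{2}}\right)^{4}
\ \text{for } d < h,\ \ 0 \text{ otherwise.}
```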
Related Work
- Other MLS formulations: Alexa et al., Guennebaud et al.
- Robust feature extraction: Fleishman et al.
- Bandwidth determination: Adamson et al., Lipman et al., Guennebaud et al. ("APSS", SIGGRAPH 2007)
Locally Weighted Kernel Regression
- Problem: points are sampled from an underlying function with added white noise; the noise terms are i.i.d. random variables; reconstruct the function under a least-squares criterion.
- Approach: consider each point p individually; reconstruct p from the information in its neighborhood; weight the influence of each neighboring point by its distance from p. (A minimal sketch follows this slide.)
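A minimal sketch of locally weighted kernel regression in 1-D. The choice of a local linear fit and the Epanechnikov kernel are assumptions for illustration; the function names are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, zero outside [-1, 1]."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def local_linear(x, y, x0, h):
    """Local linear kernel regression estimate at x0 with bandwidth h."""
    w = epanechnikov((x - x0) / h)
    X = np.column_stack([np.ones_like(x), x - x0])  # basis [1, (x - x0)]
    sw = np.sqrt(w)
    # Weighted least squares: minimize sum_i w_i (y_i - X_i beta)^2
    beta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
    return beta[0]                                   # intercept = estimate at x0

# Usage sketch: estimate a noisy sine at x = 0.5.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 200))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, x.size)
estimate = local_linear(x, y, 0.5, h=0.08)
```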
Kernel Regression vs. MLS Surfaces
- Kernel regression is essentially the same as the second step of Levin's MLS; the only difference lies in how the weights are defined (kernel weighting vs. MLS weighting).
Kernel Regression vs. MLS Surfaces
- Difference: kernel weighting is defined for functional data, MLS weighting for manifold data.
- Advantages of kernel regression: a more mature technique for processing noisy sample points; the behavior of the neighborhood and kernel is better studied.
- Goal: adapt kernel-regression techniques to MLS surfaces and extend the theoretical results of kernel regression to MLS surfaces.
Weight Function
- Common choices of weight functions in kernel regression: Epanechnikov, Normal, Biweight (written out after this slide).
- The Epanechnikov kernel is optimal, but the choice of weight function is not critical: most weight functions produce results with negligible differences.
- Implications: the exponential weight function in Levin's MLS can be replaced by weight functions requiring less computational effort, and the weight function in Levin's MLS is in fact a near-optimal choice.
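The standard forms of these three kernels, as used in the kernel-regression literature:

```latex
% Common kernel-regression weight functions (support [-1, 1] unless noted).
K_{\mathrm{Epan}}(u)     = \tfrac{3}{4}\,(1 - u^{2})\,\mathbf{1}_{\{|u|\le 1\}}, \\
K_{\mathrm{Normal}}(u)   = \tfrac{1}{\sqrt{2\pi}}\,e^{-u^{2}/2}, \\
K_{\mathrm{Biweight}}(u) = \tfrac{15}{16}\,(1 - u^{2})^{2}\,\mathbf{1}_{\{|u|\le 1\}}.
```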
Evaluation of Kernel Regression: MSE
- MSE = Mean Squared Error
- Evaluates the result of the fit at each individual point.
Evaluation of Kernel Regression: MISE
- MISE = Mean Integrated Squared Error: the integral of the MSE over the domain.
- Evaluates the global performance of kernel regression. (Both quantities are written out after this slide.)
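In standard kernel-regression notation, with m the underlying function and \hat m_h the estimate with bandwidth h, these two criteria read as below. Weighting the integral by a function w (often the design density) is a common convention and an assumption here, not necessarily the slide's exact definition.

```latex
\mathrm{MSE}(x; h) = \mathbb{E}\big[(\hat m_h(x) - m(x))^{2}\big]
                   = \mathrm{Bias}\big[\hat m_h(x)\big]^{2} + \mathrm{Var}\big[\hat m_h(x)\big], \\
\mathrm{MISE}(h)   = \int \mathrm{MSE}(x; h)\, w(x)\, \mathrm{d}x .
```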
Optimal Bandwidth
- Optimality: the bandwidth that minimizes the MSE / MISE; each point can have a different optimal bandwidth.
- Computation: the MSE / MISE is approximated by a Taylor polynomial. The polynomial involves the bandwidth and can therefore be treated as a function of the bandwidth; the minimizing bandwidth is found analytically by setting the derivative to zero and solving the resulting equation. (The standard asymptotic expansion is sketched after this slide.)
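As a concrete instance, the standard asymptotic MSE expansion for a local linear fit from the kernel-regression literature, not necessarily the exact polynomial used on the slide:

```latex
% Kernel constants: \mu_2(K) = \int u^2 K(u)\,du and R(K) = \int K(u)^2\,du.
\mathrm{AMSE}(x; h) \;=\;
  \underbrace{\tfrac{1}{4}\, h^{4}\, \mu_2(K)^{2}\, m''(x)^{2}}_{\text{squared bias}}
  \;+\;
  \underbrace{\frac{\sigma^{2} R(K)}{n\, h\, f(x)}}_{\text{variance}} .
% Setting d(AMSE)/dh = 0 and solving for h gives the optimal bandwidth.
```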
Optimal Bandwidth: Approach
- Computing the bandwidth is difficult because the Taylor polynomial involves unknown quantities: derivatives of the underlying function, the variance of the noise variables, and (possibly) the density of the point set.
- Solution: approximate the underlying function by an ordinary least-squares fit and take its derivatives from that fit; estimate the noise variance by statistical inference; estimate the point-set density by kernel density estimation. (A pilot-estimation sketch follows this slide.)
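A sketch of the 1-D plug-in estimation steps under stated assumptions: a global quartic OLS pilot fit, residual-based variance estimation, a rule-of-thumb Gaussian KDE, and the standard local-linear plug-in formula with an Epanechnikov kernel. All names and constants here are illustrative, not the paper's implementation.

```python
import numpy as np

# Kernel constants for the Epanechnikov kernel K(u) = 3/4 (1 - u^2) on [-1, 1]:
MU2 = 1.0 / 5.0          # mu_2(K) = \int u^2 K(u) du
RK  = 3.0 / 5.0          # R(K)    = \int K(u)^2 du

def plug_in_bandwidth(x, y, x0, pilot_degree=4):
    """1-D plug-in bandwidth at x0 (sketch of the estimation steps)."""
    n = x.size
    # 1. Derivatives via ordinary least-squares polynomial fitting.
    pilot = np.polynomial.Polynomial.fit(x, y, pilot_degree)
    m2 = pilot.deriv(2)(x0)                       # estimated m''(x0)
    # 2. Noise variance from the pilot residuals.
    dof = n - (pilot_degree + 1)
    sigma2 = np.sum((y - pilot(x)) ** 2) / dof
    # 3. Design density via a simple Gaussian KDE (rule-of-thumb bandwidth).
    hd = 1.06 * np.std(x) * n ** (-1.0 / 5.0)
    f0 = np.mean(np.exp(-0.5 * ((x0 - x) / hd) ** 2)) / (hd * np.sqrt(2 * np.pi))
    # 4. AMSE-optimal local-linear bandwidth.
    return (sigma2 * RK / (n * f0 * MU2 ** 2 * m2 ** 2)) ** (1.0 / 5.0)
```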
Optimal Bandwidth in 2-D
- Optimal bandwidth based on MSE: [formula; a standard form is sketched after this slide].
- Interpretation: higher noise level -> larger bandwidth; higher curvature -> smaller bandwidth; higher density -> smaller bandwidth; more point samples -> smaller bandwidth.
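The slide's formula is not recoverable from this export. The standard MSE-optimal bandwidth for a local linear fit, which is consistent with the interpretation above (larger sigma^2 gives a larger h; larger m'', f, or n give a smaller h), is:

```latex
h_{\mathrm{opt}}(x) \;=\;
  \left[ \frac{\sigma^{2}\, R(K)}{n\, f(x)\, \mu_2(K)^{2}\, m''(x)^{2}} \right]^{1/5},
\qquad
R(K) = \int K(u)^{2}\,\mathrm{d}u, \quad \mu_2(K) = \int u^{2} K(u)\,\mathrm{d}u .
```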
Optimal Bandwidth in 3-D
- Kernel function and kernel shape: [formula and shape figure not recoverable from this export]
Optimal Bandwidth in 3-D
- Optimal spherical bandwidth based on MSE: [formula].
- Optimal spherical bandwidth based on MISE: [formula].
- (A hedged sketch of the standard bivariate form follows this slide.)
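The slide's formulas are not recoverable from this export. One standard form, under the assumption that the surface is treated locally as a bivariate height function over the reference plane and a local linear fit with a spherical bandwidth is used; this is the textbook kernel-regression result, not necessarily the paper's exact formula:

```latex
% Bivariate (d = 2) local-linear AMSE-optimal spherical bandwidth, with
% \Delta m the Laplacian (trace of the Hessian) of the local height function:
h_{\mathrm{MSE}}(x) \;=\;
  \left[ \frac{2\,\sigma^{2}\, R(K)}
              {n\, f(x)\, \mu_2(K)^{2}\, \big(\Delta m(x)\big)^{2}} \right]^{1/6}.
% The MISE-based bandwidth is obtained by integrating the squared bias and
% variance over the domain before minimizing in h.
```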
Experiments
- The bandwidth selectors choose near-optimal bandwidths. [result plots not recoverable from this export]
Optimal Bandwidth in MLS
- From the functional domain to the manifold domain. There are two ways to do this (a high-level sketch of the second follows this slide):
  1. Choose a functional domain (using k-NN or a constant-radius sphere) and apply kernel regression with the optimal bandwidth.
  2. Use kernel regression with a modification: approximate the underlying surface by MLS with heuristically chosen bandwidths, then approximate the unknown quantities in the optimal-bandwidth formulas from that approximated MLS surface.
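A high-level sketch of the second option. The three callables (mls_project, local_frame, plug_in_bandwidth_2d) are hypothetical placeholders for the corresponding steps, not functions from the paper.

```python
import numpy as np

def optimal_mls_bandwidths(points, pilot_h, mls_project, local_frame,
                           plug_in_bandwidth_2d):
    """Sketch of the plug-in pipeline for MLS surfaces (option 2).

    Placeholder callables (assumptions, not the paper's API):
      mls_project(points, p, h)      -> pilot MLS projection of p,
      local_frame(points, q, h)      -> (origin, 3x3 basis) of the reference plane,
      plug_in_bandwidth_2d(u, v, w)  -> optimal bandwidth for a bivariate fit.
    """
    bandwidths = np.empty(len(points))
    for i, p in enumerate(points):
        # 1. Pilot MLS surface with a heuristic bandwidth.
        q = mls_project(points, p, pilot_h)
        # 2. Express neighbors as heights (u, v, w) over the pilot reference plane.
        origin, basis = local_frame(points, q, pilot_h)
        local = (points - origin) @ basis.T
        # 3. Estimate the unknown quantities and plug them into the formula.
        bandwidths[i] = plug_in_bandwidth_2d(local[:, 0], local[:, 1], local[:, 2])
    return bandwidths
```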
Robustness
- Insensitivity to error in the first step of Levin's MLS.
- Experiment: the reference plane found by the first step is rotated by an angle theta (x-axis); the y-axis shows the mean bandwidth / reconstruction error.
- The angle-vs.-bandwidth curve resembles a cosine because rotating by theta leaves roughly cos(theta) of the original point set contributing to the fit.
Comparison
- Constant h: not suitable for non-uniformly sampled points because it lacks local adaptation.
- k-NN: cannot distinguish a high noise level from high curvature.
- MSE/MISE-based plug-in method: the most robust and flexible; it works for both regularly and irregularly sampled points and can distinguish noise from curvature.
Comparison
- The MSE/MISE-based plug-in method outperforms the heuristic methods.
Comparison
- Heuristic methods can produce visually acceptable, but not geometrically accurate, reconstructions.
Future Work
- Nonlinear kernel-regression bandwidth selector in 3-D: the derivation is mathematically intense and the formulas need to be simplified.
- Compute the optimal bandwidth implicitly: avoiding an analytical bandwidth computation may lead to a simpler method.
- Extend the method to other MLS formulations.