Fractional-Random-Weight Bootstrap

Applications of the Fractional-Random-Weight Bootstrap
William Q. Meeker, Department of Statistics, Center for Nondestructive Evaluation, Iowa State University
Chris Gotwalt, Director of JMP Statistical R&D, JMP Division, SAS Institute
Copyright © 2013, SAS Institute Inc. All rights reserved.

Fractional Weight Bootstrap
Overview
- Bootstrap background
- Weighted estimation and random-weight bootstrap
  - Integer weights
  - Fractional weights
- Fractional-random-weight methods
  - Uniform Dirichlet weights
  - iid weights
- Motivating example: Bearing cage field failure data
- Other examples
  - Rocket motor failures
  - Power transformer failure prediction
  - Generalized gamma fitting to limited data
  - Variable selection in a designed experiment
- Other potential applications
- Concluding remarks

Bootstrap Background
Popular statistical tool:
- Leads to improved inferences (e.g., tests and confidence intervals)
- Requires mild assumptions
- Easy to implement
- Applies even to procedures where there is less classical theory to offer guidance

Common Bootstrap: Basic Ideas
The most common bootstrapping approach is done via a Monte Carlo simulation:
- A new data set is created by
  - Resampling the rows of the original data with replacement, or
  - Parametrically simulating
- A statistical procedure (e.g., model fitting, a hypothesis test, or a confidence interval) is applied to the simulated data set and some results (estimates, p-values, etc.) are stored
- This is repeated a number of times (e.g., 2,500 times) and then the saved results are processed (how depends on the application) to make inferences (e.g., construct confidence intervals)
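The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values are made up, the function names are hypothetical, and a simple percentile interval stands in for the unspecified "processing" step.

```python
import random
import statistics

def resampling_bootstrap(data, statistic, n_boot=2500, seed=0):
    """Return n_boot bootstrap replicates of statistic(data)."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Create a new data set by resampling the rows with replacement.
        sample = rng.choices(data, k=n)
        # Apply the statistical procedure and store the result.
        replicates.append(statistic(sample))
    return replicates

def percentile_interval(replicates, level=0.95):
    """Process the stored results into a simple percentile interval."""
    ordered = sorted(replicates)
    alpha = (1.0 - level) / 2.0
    lower = ordered[int(alpha * len(ordered))]
    upper = ordered[int((1.0 - alpha) * len(ordered)) - 1]
    return lower, upper

# Made-up illustrative data (not from the presentation).
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 6.1, 3.9, 4.4]
reps = resampling_bootstrap(data, statistics.mean)
low, high = percentile_interval(reps)
```

Other interval constructions (for example, bias-corrected ones) would replace only the last step; the resampling loop stays the same.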

Weighted Data and Weighted Estimation
- It is often convenient to put “weights” (or frequencies or counts) on observations
  - Binary data (replace with number of 0s and 1s)
  - Censored life test data (e.g., failure times plus the number of units that survived a 1,000 hour test)
  - Data compression where similar observations are put into bins (e.g., a histogram)
  - Observations with non-constant variance (weights inversely proportional to variance)
- Many estimation methods allow specification of weights or frequencies
  - Sample moments (means and variances)
  - Weighted least squares
  - Likelihood
- Sample mean and standard deviation with weights
- Weighted likelihood
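The formulas for the last two items did not survive in this transcript. As a hedged reminder, common forms are given below, with weights w_i treated as frequencies. The symbols and the variance denominator are assumptions (conventions differ between software packages), and L_i(θ) denotes the likelihood contribution of observation i, for example a density for a failure or a survival probability for a right-censored unit.

```latex
\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i},
\qquad
s_w^2 = \frac{\sum_{i=1}^{n} w_i \,(x_i - \bar{x}_w)^2}{\sum_{i=1}^{n} w_i - 1},
\qquad
\ell_w(\theta) = \sum_{i=1}^{n} w_i \log L_i(\theta)
```

With all w_i = 1 these reduce to the ordinary sample mean, sample variance, and log-likelihood.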

n=15 Integer Weight Bootstrap Samples

n=15 Integer and Fractional Random-Weight Bootstrap Samples

Choosing Bootstrap Random Weights
- Integer weights (resampling)
  - Sampling observations with replacement
  - Equivalent to drawing a sample of weights from a uniform multinomial distribution
  - The integer weights have a mean of 1 and a variance of 1 − 1/n (approximately 1)
- Fractional (non-integer) weights
  - Sample from a uniform Dirichlet distribution (weights will sum to n). Suggested by Rubin (1981) as the “Bayesian bootstrap”
  - Sample from some other distribution that has a mean and a variance of 1 (expected value of the sum of the weights will be equal to n)
- There is large-sample theory to support both methods
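The three weighting schemes can be sketched as follows. This is a minimal illustration with hypothetical function names. The Exponential(1) choice for the iid case is an assumption: it is one distribution with mean 1 and variance 1, and the slides do not say which distribution was used.

```python
import random
from collections import Counter

def multinomial_weights(n, rng):
    """Integer weights: row counts from resampling n rows with replacement."""
    counts = Counter(rng.choices(range(n), k=n))
    return [counts.get(i, 0) for i in range(n)]

def dirichlet_weights(n, rng):
    """Fractional weights: n times a uniform Dirichlet draw (sum to n).

    Normalizing iid Exponential(1) draws yields a uniform Dirichlet vector.
    """
    g = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(g)
    return [n * gi / total for gi in g]

def iid_exponential_weights(n, rng):
    """Fractional weights drawn iid from Exponential(1): mean 1, variance 1."""
    return [rng.expovariate(1.0) for _ in range(n)]

rng = random.Random(1)
n = 15
w_int = multinomial_weights(n, rng)      # some entries are typically 0
w_dir = dirichlet_weights(n, rng)        # all entries positive, sum is n
w_iid = iid_exponential_weights(n, rng)  # sum equals n only in expectation
```

The practical difference is that integer weights usually contain zeros, so some rows drop out of a given bootstrap sample, while the fractional weights keep every row with a positive weight.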

Fractional-Random-Weight Bootstrap Background
- The fractional-random-weight bootstrap is also known as the
  - Random-weight bootstrap
  - Weighted likelihood bootstrap
  - Weighted bootstrap
  - Perturbation bootstrap
  - Bayesian bootstrap
- Operationally, the fractional-random-weight bootstrap is used in the same way as the resampling bootstrap
- The fractional-random-weight bootstrap has important advantages in many applications
  - Like resampling, the method is nonparametric
  - All original observations remain in the bootstrap samples
  - Estimation difficulties caused by dropped observations do not arise

BOOTSTRAPPING IN JMP PRO
In JMP Pro, it is easy to bootstrap almost any analysis. Fractional weights are not the default in JMP Pro, but they are easy to select.

Bearing Cage Field Failure Data Event Plot
- 6 failures
- 1697 right-censored observations
- Will the bootstrap work?

Weibull Analysis Using Life Distribution

Will the Bootstrap Work with Heavy Censoring?
- If the expected number failing is too small, there could be bootstrap samples with only 0 or 1 failures, causing JMP’s ML algorithm to fail
- For the Bearing Cage example, the probability of obtaining a bootstrap sample with 0 or 1 failures using the integer weight method is 0.017
- Using the fractional weight method, the probability is 0
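The 0.017 figure can be checked directly. When 1703 rows (6 failures plus 1697 right-censored) are resampled with replacement, the number of failures in a bootstrap sample is binomial with 1703 trials and success probability 6/1703. The function name below is hypothetical.

```python
def prob_at_most_one_failure(n_failures, n_total):
    """P(a resample of n_total rows contains 0 or 1 failure rows)."""
    p = n_failures / n_total
    p_zero = (1.0 - p) ** n_total
    p_one = n_total * p * (1.0 - p) ** (n_total - 1)
    return p_zero + p_one

prob = prob_at_most_one_failure(6, 6 + 1697)
print(round(prob, 3))  # 0.017
```

With uniform Dirichlet weights every row, including each of the 6 failures, receives a strictly positive weight, which is why the corresponding probability for the fractional weight method is 0.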

Bootstrap Results for Estimating the Bearing Cage Weibull Distribution Shape Parameter (panels: Integer Weights (resampling) and Fractional Weights, each marked with bootstrap confidence limits)
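A bootstrap distribution of a Weibull shape parameter under heavy right censoring could be produced along the following lines. This is a sketch, not the JMP implementation: the times are made up (they are not the bearing cage data), the function names are hypothetical, the weighted ML fit solves the standard profile score equation for the shape by bisection, and only 200 replicates are drawn to keep the run short.

```python
import math
import random

def weibull_weighted_mle(times, failed, weights):
    """Weighted ML fit of a Weibull to right-censored data: (shape, scale)."""
    # Keep only rows with positive weight (integer weights can be zero).
    rows = [(t, d, w) for t, d, w in zip(times, failed, weights) if w > 0]
    wf = sum(w * d for _, d, w in rows)  # weighted number of failures
    if wf <= 0:
        raise ValueError("no failures with positive weight")
    t_max = max(t for t, _, _ in rows)
    # Work with log(t / t_max) <= 0 so that exp() cannot overflow.
    data = [(math.log(t / t_max), d, w) for t, d, w in rows]
    mean_log_fail = sum(w * d * l for l, d, w in data) / wf

    def score(beta):
        # Profile score for the shape; its root is the ML estimate.
        s0 = sum(w * math.exp(beta * l) for l, _, w in data)
        s1 = sum(w * math.exp(beta * l) * l for l, _, w in data)
        return s1 / s0 - 1.0 / beta - mean_log_fail

    lo, hi = 1e-3, 1.0
    while score(hi) < 0.0 and hi < 1e3:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(60):                  # bisection
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    s0 = sum(w * math.exp(beta * l) for l, _, w in data)
    eta = t_max * (s0 / wf) ** (1.0 / beta)
    return beta, eta

def frw_bootstrap_shape(times, failed, n_boot=200, seed=0):
    """Shape estimates from fits with uniform Dirichlet weights."""
    rng = random.Random(seed)
    n = len(times)
    shapes = []
    for _ in range(n_boot):
        g = [rng.expovariate(1.0) for _ in range(n)]
        total = sum(g)
        w = [n * gi / total for gi in g]  # positive weights that sum to n
        shapes.append(weibull_weighted_mle(times, failed, w)[0])
    return shapes

# Made-up data: 6 failures and 60 right-censored units.
fail_times = [150, 320, 410, 700, 950, 1200]
cens_times = [100 + 32 * i for i in range(60)]
times = fail_times + cens_times
failed = [1] * len(fail_times) + [0] * len(cens_times)

shapes = frw_bootstrap_shape(times, failed)
ordered = sorted(shapes)
lower = ordered[int(0.025 * len(ordered))]
upper = ordered[int(0.975 * len(ordered)) - 1]
```

Passing integer resampling counts as `weights` to the same fitting function would raise the `ValueError` whenever a resample contains no failures, which is the situation the fractional weights avoid.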

Rocket Motor Field Data Event Plot
- 3 left-censored observations
- 1934 right-censored observations
- The usual resampling (integer weight) bootstrap will not work

Rocket Motor Weibull Analysis

Rocket Motor Weibull Analysis Resampling and Bootstrap Results (panels: Fractional Weights and Integer Weights (resampling), each showing the Weibull β distribution with bootstrap confidence limits)

Power Transformer Data from Hong, Meeker, and McCalley (2009)
- 710 power transformers with 62 failed units
- Some units more than 60 years old
- Units still in service at the “data freeze” date in March 2008 are right censored
- Data records begin in 1980
- Units installed before 1980 are observations from a truncated distribution
- Several explanatory variables
- Purpose: predict future failures

Event Plot of a Subset of the Power Transformer Field Failure Data

Power Transformer Model and Likelihood
- Fit Weibull and lognormal distributions
- Stratification based on whether manufactured before or after 1987
- The likelihood function is [equation not preserved in this transcript], where t_ij is the failure time or the censoring time, and the remaining symbols (also not preserved) are the lower truncation time and the censoring and truncation indicators for transformer i in stratum j
- How to bootstrap/simulate to get prediction intervals?
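Since the slide's equation is missing here, the following is a standard form of the likelihood for left-truncated, right-censored data, offered as a plausible reading rather than the authors' exact expression. The symbols are assumptions: δ_ij = 1 for a failure and 0 for a right-censored unit, ν_ij = 1 if the unit is left truncated and 0 otherwise, τ^L_ij is the lower truncation time, and f and F are the density and cdf for stratum j. The second line shows how random weights w_ij would enter a weighted log-likelihood.

```latex
L(\theta) = \prod_{j} \prod_{i=1}^{n_j}
  \frac{\left[ f(t_{ij};\theta_j) \right]^{\delta_{ij}}
        \left[ 1 - F(t_{ij};\theta_j) \right]^{1-\delta_{ij}}}
       {\left[ 1 - F(\tau^{L}_{ij};\theta_j) \right]^{\nu_{ij}}}
\\[1ex]
\ell_w(\theta) = \sum_{j} \sum_{i=1}^{n_j} w_{ij}
  \Big\{ \delta_{ij} \log f(t_{ij};\theta_j)
       + (1-\delta_{ij}) \log\!\left[ 1 - F(t_{ij};\theta_j) \right]
       - \nu_{ij} \log\!\left[ 1 - F(\tau^{L}_{ij};\theta_j) \right] \Big\}
```

The original paper may code the truncation indicator the other way around, so the exponents should be checked against Hong, Meeker, and McCalley (2009) before use.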

Power Transformer Fleet Predictions Based on the Fractional-Random-Weight Bootstrap

Power Transformer Individual Predictions based on the Fractional-Weight Bootstrap

Ball Bearing Failure Time Event Plot Analyzed in Meeker and Escobar (1998)

Ball Bearing Failure Times Generalized Gamma Analysis on Weibull Probability Paper

Ball Bearing Failure Times Resampling and Fractional-Random-Weight Bootstrap Results (panels: Integer Weights (resampling) and Fractional Weights; parameter shown: lambda)

Ball Bearing Failure Times Generalized Gamma, Weibull, and Lognormal distributions on Weibull Probability Paper

DESIGNED EXPERIMENTS
- Designed experiments (DOEs) are a common approach to problem solving in science and industry
- Experiments are often costly in time and resources
- Goal: get as much information as possible about the relationship between inputs (X) and response variables (y)
- DOEs are specially constructed combinations of X values that optimize information gained in a small number of runs
- We are interested in the variable selection aspect of DOEs: identifying the subset of the X variables (and their interaction and quadratic effects) that best explain variation in y

DOEs AND THE RESAMPLING BOOTSTRAP
- Removing observations can drastically change the properties of the design, especially the set of models that can be fit
- The resampling bootstrap will almost always drop rows from the analysis
- Different bootstrap samples will have different sets of identifiable models
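How often resampling drops rows follows from elementary probability. The sketch below uses hypothetical function names and evaluates the quantities for a 32-run design, the size of the experiment discussed later in the deck.

```python
import math

def prob_row_missing(n):
    """Probability that a given row is absent from a resample of size n."""
    return (1.0 - 1.0 / n) ** n

def expected_distinct_rows(n):
    """Expected number of distinct original rows in a resample of size n."""
    return n * (1.0 - prob_row_missing(n))

def prob_all_rows_present(n):
    """Probability that no row is dropped: n! / n**n."""
    return math.factorial(n) / n ** n

n_runs = 32
missing = prob_row_missing(n_runs)           # a bit over one third
distinct = expected_distinct_rows(n_runs)    # roughly 20 of the 32 runs
all_present = prob_all_rows_present(n_runs)  # vanishingly small
```

So a typical resample of a 32-run design keeps only about 20 distinct runs, and a resample that keeps the whole design is essentially never observed.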

TWO OTHER BOOTSTRAP PROCEDURES
Two well-known alternatives:
- Parametric bootstrap
  - Choose a model
  - Simulate new data from it
- Bootstrapping errors
  - Choose a model
  - Calculate the errors
  - Simulate new data by adding the permuted residual errors to the model predictions
Both of these require choosing a model!
The fractional-random-weight bootstrap provides a nonparametric bootstrap that works for DOEs…
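The "bootstrapping errors" steps can be sketched for a straight-line model. The data are made up and the function names are hypothetical. Residuals are permuted, as the slide describes, although many treatments resample them with replacement instead.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def residual_bootstrap_slopes(x, y, n_boot=1000, seed=0):
    rng = random.Random(seed)
    b0, b1 = fit_line(x, y)                         # 1. choose and fit a model
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]  # 2. calculate the errors
    slopes = []
    for _ in range(n_boot):
        shuffled = resid[:]
        rng.shuffle(shuffled)                       # 3. permute the residuals
        y_star = [fi + ei for fi, ei in zip(fitted, shuffled)]
        slopes.append(fit_line(x, y_star)[1])       # refit on simulated data
    return slopes

# Made-up illustrative data.
x = list(range(1, 11))
y = [2.3, 2.9, 4.1, 4.8, 6.2, 6.8, 8.1, 9.2, 9.7, 11.1]
slopes = residual_bootstrap_slopes(x, y)
```

Every simulated data set is built from the fitted straight line, which is the sense in which this procedure requires choosing a model first.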

NITROGEN OXIDE RSM
- Nitrogen oxides (NOx) are toxic greenhouse gases that are common by-products of burning organic compounds.
- An experiment was done on an industrial burner to study the amount of NOx it created.
- A 32-run I-optimal RSM design was created with 7 continuous factors:
  - Hydrogen Fraction in primary fuel
  - Air/Fuel Ratio
  - Lance Position X
  - Lance Position Y
  - Secondary Fuel Fraction
  - Dispersant
  - Ethanol Percentage in primary fuel

NITROGEN OXIDE RSM
- We want to assess the importance of the input variables
- We apply a forward stepwise procedure that selects a model using the AICc model selection criterion
- We can apply the fractional-random-weight bootstrap to obtain selection probabilities of the model terms
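For reference, the usual definition of AICc is given below. The symbols are assumptions, since none appear on the slide: k is the number of estimated parameters, n is the number of runs, and L-hat is the maximized likelihood of the candidate model.

```latex
\mathrm{AICc} = -2 \log \hat{L} + 2k + \frac{2k(k+1)}{n - k - 1}
```

The last term is the small-sample correction. It grows quickly as k approaches n − 1, which matters for a design with only 32 runs and many candidate terms.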

NITROGEN OXIDE RSM
- Run the Distribution script and you can see the variables that were selected less often
- These will have a histogram “spike” at zero

NITROGEN OXIDE RSM
- We can obtain probabilities that terms enter the model
- We can interpret these loosely as posterior probabilities of whether a term belongs to the model
- Use a cutoff on the model selection probabilities, such as 0.5, to determine which terms should be kept
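The tallying step can be sketched as follows, assuming each bootstrap refit has already produced the set of selected terms. The term names and the four example refits are placeholders, not results from the NOx experiment, and treating a probability exactly equal to the cutoff as "keep" is an arbitrary choice.

```python
from collections import Counter

def selection_probabilities(selected_sets):
    """Fraction of bootstrap refits in which each term was selected."""
    counts = Counter(term for s in selected_sets for term in set(s))
    n = len(selected_sets)
    return {term: c / n for term, c in counts.items()}

def keep_terms(probs, cutoff=0.5):
    """Terms whose selection probability reaches the cutoff."""
    return sorted(term for term, p in probs.items() if p >= cutoff)

# Hypothetical output of four bootstrap refits (placeholder term names).
runs = [{"A", "B"}, {"A"}, {"A", "B", "C"}, {"A", "B"}]
probs = selection_probabilities(runs)  # A: 1.0, B: 0.75, C: 0.25
kept = keep_terms(probs)               # ['A', 'B']
```

In practice the list of selected sets would come from rerunning the stepwise procedure once per set of fractional random weights.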

NITROGEN OXIDE RSM
- Lance Position Y(5,10) has a large p-value in the subset model’s analysis.
- This probably reflects that it was often pulled in along with its various interaction terms to preserve effect heredity.
- We recommend removing it from the final model (or selecting the JMP option not to enforce effect heredity).

Concluding Remarks
- With vastly improved computing capabilities and developed theory, bootstrapping provides an important and useful tool for obtaining
  - Trustworthy confidence intervals
  - Trustworthy prediction intervals
  - Better regression models
- It should also be useful for Generalized Pivotal Quantity (aka Generalized Fiducial) statistical methods
- The fractional-random-weight bootstrap tremendously expands the potential areas of application of the bootstrap

Some References
- Rubin, D. B. (1981). The Bayesian bootstrap. Annals of Statistics 9, 130–134.
- Newton, M. A. and A. E. Raftery (1994). Approximate Bayesian inference with the weighted likelihood bootstrap. Journal of the Royal Statistical Society, Series B 56, 3–48.
- Jin, Z., Z. Ying, and L. Wei (2001). A simple resampling method by perturbing the minimand. Biometrika 88, 381–390.
- Hong, Y., W. Q. Meeker, and J. D. McCalley (2009). Prediction of remaining life of power transformers based on left truncated and right censored lifetime data. Annals of Applied Statistics 3, 857–879.
- Meeker, W. Q., G. J. Hahn, and L. A. Escobar (2017). Statistical Intervals: A Guide for Practitioners and Researchers. Wiley.
