Using Resampling Techniques to Measure the Effectiveness of Providers in Workers’ Compensation Insurance David Speights Senior Research Statistician HNC.

Presentation transcript:

Using Resampling Techniques to Measure the Effectiveness of Providers in Workers’ Compensation Insurance
David Speights, Senior Research Statistician
HNC Insurance Solutions, Irvine, California

Presentation Outline
Introduction to the problem
Introduction to bootstrap resampling
Two resampling approaches for comparing two groups
Examples
Conclusions

Introduction to the Problem
Compare two groups from observational data
  – Outcome (Y), e.g. claim cost
  – Characteristics (X) have distributions F_1 and F_2
Difficulties
  – F_1 ≠ F_2
  – X is associated with Y (i.e. X is a confounder)
  – Example: claim severity is associated with claim cost

Introduction to the Problem
Ideal solution
  – Randomize subjects into the two groups
  – Usually not possible
Alternate solution (topic of the paper)
  – Identify characteristics where F_1 ≠ F_2
  – Adjust the distribution of Y to account for the differing distributions of X

Introduction to Bootstrap Resampling
Purpose
  – Obtain the distribution of a parameter estimate (i.e. its sampling distribution)
  – Avoid relying on assumptions about the underlying distribution
  – Often used when the distribution of the parameter estimate is difficult to obtain or relies heavily on unrealistic assumptions

Introduction to Bootstrap Resampling
Given data
  – {X_1, X_2, …, X_n}, where X_i is a p x 1 vector
  – X has unspecified distribution F
Parameter of interest
  – θ = T(F)
We want the distribution of the estimate of θ

Introduction to Bootstrap Resampling
Distribution of the parameter estimate
  – Usually obtained from theoretical properties, assuming repeated sampling from a population with a known distribution of X
  – Bootstrap techniques resample from the data to simulate repeated sampling from a population with an unknown distribution of X

Introduction to Bootstrap Resampling
Example: population mean
Resample with replacement from the data
  – Data are (X_1, …, X_n)
  – Each data point is equally likely to be selected
  – Resampled data are (X^(b)_1, …, X^(b)_n)
  – The mean of the resampled data is the b-th bootstrap estimate of the population mean

Introduction to Bootstrap Resampling
Example: population mean
  – B bootstrap samples are drawn
  – The distribution of the sample mean is estimated with the empirical distribution function of the B bootstrap estimates
  – The mean and variance of this empirical distribution are used to estimate the mean and variance of the sample mean
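
The population-mean example can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function name `bootstrap_means`, the seed, and the claim-cost values are all made up for the example.

```python
import random
import statistics

def bootstrap_means(data, B=500, seed=0):
    """Return B bootstrap estimates of the mean of `data` (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(B):
        # Resample n points with replacement; each point is equally likely.
        resample = rng.choices(data, k=n)
        estimates.append(statistics.fmean(resample))
    return estimates

# Hypothetical claim costs, invented for illustration only.
claim_costs = [1200.0, 850.0, 430.0, 9800.0, 2100.0, 610.0, 3300.0, 150.0]

boot = bootstrap_means(claim_costs, B=500)
boot_mean = statistics.fmean(boot)   # estimates the mean of the sample mean
boot_se = statistics.stdev(boot)     # estimates its standard error
print(f"bootstrap mean of the sample mean: {boot_mean:.1f}")
print(f"bootstrap standard error: {boot_se:.1f}")
```

The list `boot` plays the role of the empirical distribution of the B bootstrap estimates; any other summary (percentiles, for instance) can be read off the same list.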

Two Resampling Methods for Comparing Two Groups
Method 1: Normalized comparisons
  – Y is a response of interest
  – X is a categorical variable, a confounder
  – Z=1 for group 1, Z=2 for group 2
  – F(Y|Z=1) is normalized for the distribution of X in group 2
  – F(Y|Z=2) is not normalized
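
The formulas for the normalized distributions did not survive in the transcript. One plausible form, assuming a direct standardization over the J categories x_1, …, x_J of X (the superscript * and the index J are notation introduced here, not taken from the slides), is:

```latex
F^{*}(y \mid Z=1) = \sum_{j=1}^{J} F(y \mid X = x_j, Z=1)\, P(X = x_j \mid Z=2),
\qquad
F(y \mid Z=2) = \sum_{j=1}^{J} F(y \mid X = x_j, Z=2)\, P(X = x_j \mid Z=2).
```

Under this reading, group 1's outcome distribution is reweighted to group 2's mix of X, while the second identity is just the law of total probability, which is why the group 2 distribution needs no adjustment.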

Two Resampling Methods for Comparing Two Groups
Method 1: Normalized comparisons
  – Resample from (Y_i, X_i) separately for groups 1 and 2
  – Construct estimates of F(Y|X=x_j) and P(X=x_j) for the two groups
  – Construct estimates of the normalized distribution functions on the previous slide
  – Parameter estimates (e.g. percentiles) can be obtained from these estimated distributions
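
The steps of Method 1 might be sketched as follows. This is an illustrative reading rather than the author's implementation: it assumes the normalized distribution is a reweighted empirical distribution (each group 1 observation in category x gets weight P(X=x | Z=2) divided by the group 1 count in x), and every function name and all data below are hypothetical.

```python
import random
from collections import Counter

def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalized weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cum = 0.0
    for v, w in pairs:
        cum += w
        if cum >= q * total - 1e-12:
            return v
    return pairs[-1][0]

def normalized_quantile(group1, group2, q):
    """Quantile of Y in group 1 after reweighting X to group 2's mix."""
    n2 = len(group2)
    p2 = {x: c / n2 for x, c in Counter(x for _, x in group2).items()}
    n1_by_x = Counter(x for _, x in group1)
    # Weight for a group-1 claim in category x: P(X=x | Z=2) / n1(x).
    # Categories missing from the group-1 resample drop out, and the
    # remaining weights are rescaled inside weighted_quantile.
    ys = [y for y, _ in group1]
    ws = [p2.get(x, 0.0) / n1_by_x[x] for _, x in group1]
    return weighted_quantile(ys, ws, q)

def bootstrap_difference(group1, group2, q=0.5, B=500, seed=0):
    """Bootstrap the normalized group-1 quantile minus the group-2 quantile."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(B):
        # Resample (Y, X) pairs separately within each group.
        b1 = rng.choices(group1, k=len(group1))
        b2 = rng.choices(group2, k=len(group2))
        q1 = normalized_quantile(b1, b2, q)
        q2 = weighted_quantile([y for y, _ in b2], [1.0] * len(b2), q)
        diffs.append(q1 - q2)
    return diffs

# Synthetic data: cost depends only on severity, but the groups differ in mix.
data_rng = random.Random(1)

def make_group(n, severity_weights):
    out = []
    for _ in range(n):
        x = data_rng.choices([1, 2, 3], weights=severity_weights)[0]
        out.append((1000.0 * x + data_rng.gauss(0, 200), x))
    return out

group1 = make_group(200, [0.6, 0.3, 0.1])
group2 = make_group(200, [0.2, 0.3, 0.5])
diffs = sorted(bootstrap_difference(group1, group2, q=0.5, B=200))
print("rough 95% percentile interval for the normalized median difference:",
      round(diffs[5], 1), round(diffs[194], 1))
```

Because the synthetic costs depend on severity alone, the unadjusted medians differ between the groups while the normalized difference should sit near zero, which is the kind of confounding the method is meant to remove.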

Two Resampling Methods for Comparing Two Groups
Method 2: Bootstrapping linear regression
  – Y is a response of interest
  – X is a vector of variables, confounders
  – Z=1 for group 1, Z=2 for group 2
  – Use a linear regression model for Y with X and the group indicator Z as predictors

Two Resampling Methods for Comparing Two Groups
Method 2: Bootstrapping linear regression
  – Estimate the regression coefficients by least squares on the original data
  – Resample with replacement from the residuals
  – Construct the b-th bootstrap value of Y as the fitted value plus a resampled residual
  – The b-th bootstrap sample consists of these bootstrap values of Y together with the original X and Z

Two Resampling Methods for Comparing Two Groups
Method 2: Bootstrapping linear regression
  – Re-estimate the regression coefficients by least squares on each bootstrap sample
  – Using the B bootstrap estimates of the coefficients, construct the distribution of the parameters of interest
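
A residual bootstrap for a linear regression could look like the sketch below. The model equation is not recoverable from the transcript, so this assumes a simple form with an intercept, one confounder x, and a 0/1 indicator for group 1 (the slides code groups as Z=1 and Z=2). The helper names and the synthetic data are hypothetical, and least squares is solved through the normal equations to keep the example dependency-free.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def ols(rows, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)

def residual_bootstrap(rows, y, B=500, seed=0):
    """Return the original fit and B bootstrap coefficient vectors."""
    rng = random.Random(seed)
    coef = ols(rows, y)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    boot = []
    for _ in range(B):
        # Bootstrap response: fitted value plus a residual drawn with replacement.
        e_star = rng.choices(resid, k=len(resid))
        y_star = [f + e for f, e in zip(fitted, e_star)]
        # Refit on the bootstrap responses, keeping the design matrix fixed.
        boot.append(ols(rows, y_star))
    return coef, boot

# Synthetic data (assumed model): y = 2 + 0.5*x - 0.3*[group 1] + noise.
data_rng = random.Random(2)
rows, y = [], []
for i in range(100):
    x = data_rng.uniform(1, 10)
    z = 1.0 if i % 2 == 0 else 0.0   # 1.0 marks group 1, 0.0 marks group 2
    rows.append([1.0, x, z])
    y.append(2.0 + 0.5 * x - 0.3 * z + data_rng.gauss(0, 0.2))

coef, boot = residual_bootstrap(rows, y, B=200)
group_effects = sorted(b[2] for b in boot)
print("group effect estimate:", round(coef[2], 3))
print("rough 95% percentile interval:",
      round(group_effects[5], 3), round(group_effects[194], 3))
```

The sorted `group_effects` list is the bootstrap distribution of the group coefficient; with a log-cost response, as in the example later in the deck, that coefficient would be read as an approximate proportional difference in cost between the groups.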

Examples Using Data from a Nationwide Database of Workers’ Compensation Claims
Normalized comparisons of percentiles
  – Y = total claim cost
  – Group 1: providers in network A
  – Group 2: providers not in network A
  – X is a 10-level variable representing claim severity, derived from the ICD-9 code on a claim
  – B = 500 bootstrap samples drawn
  – Median, 75th, and 95th percentiles compared
  – Normalization relative to group 1

Examples Using Data from a Nationwide Database of Workers’ Compensation Claims
Normalized comparisons of percentiles

Examples Using Data from a Nationwide Database of Workers’ Compensation Claims
Bootstrapping linear regression
  – Y = log(total indemnity costs)
  – X consists of several variables
      NCCI body part designation, nature of injury designation, accident cause, industry class code, and injury type
      10-level claim severity measure derived from the ICD-9 code
      Age and gender
  – Group 1: specific provider of interest (Provider Z)
  – Group 2: all other providers
  – B = 500 bootstrap samples

Examples Using Data from a Nationwide Database of Workers’ Compensation Claims

Conclusions
Bootstrap methodology is a flexible, robust method for deriving sampling distributions
It can be used to compare two groups while accounting for possible confounding variables
It is a useful method for observational studies
Only a few examples are shown in this paper/presentation; the approach has much wider potential