
2000 SEMINAR ON REINSURANCE
PITFALLS IN FITTING LOSS DISTRIBUTIONS
CLIVE L. KEATINGE

2 PITFALL #1 Using Aggregate Loss Development Factors to Develop Individual Losses

3 For example, suppose that every year has loss experience as shown in Year 1 below.

4 The aggregate loss development factor to project from the first evaluation to the second evaluation is: 2.00 = 1,000,000/(5*100,000) If this factor is used to develop the individual losses from the first evaluation of Year 2, there will be five projected losses of $200,000 each. The aggregate total of $1,000,000 will be correct, but the distribution will be incorrect.
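To make the pitfall concrete, here is a minimal Python sketch of the arithmetic on this slide. The five $100,000 losses and the $1,000,000 second-evaluation total come from the example above; the comment about how the losses might actually develop is illustrative only.

first_eval_losses = [100_000] * 5      # Year 2: five losses of $100,000 at the first evaluation
second_eval_total = 1_000_000          # Year 1: aggregate reported at the second evaluation

# Aggregate loss development factor: 1,000,000 / 500,000 = 2.00
agg_ldf = second_eval_total / sum(first_eval_losses)

# Pitfall: applying the aggregate factor to each individual loss
developed = [loss * agg_ldf for loss in first_eval_losses]
print(agg_ldf)      # 2.0
print(developed)    # five projected losses of 200,000 each

# The aggregate total of 1,000,000 is right, but the distribution is wrong:
# in reality one loss might close at 0 while another develops to 600,000.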

5 Developing individual losses with aggregate development factors does not work because losses do not all develop by the same percentage. Even if all reported losses were fully developed, fitting a distribution to them would still produce inaccurate results because unreported losses are likely to have a different distribution (usually thicker tailed) than reported losses.

6 Ideally, we would like to have probability distributions for each loss size and for each evaluation point that would project probabilities for where a loss of a given size and maturity will develop. We would also need a reporting pattern and probability distributions for the distribution of newly reported losses at each evaluation point. Unfortunately, to estimate the parameters of such a model would require more data than is generally available.

7 Recently, Philbrick and Holler (CAS Forum, Winter 1996) and Gillam and Couret (PCAS 1997) have postulated models in this spirit that rely on grouping losses of similar sizes together. However, both of these proposals still require quite a bit of data and make modeling assumptions that may or may not be appropriate.

8 Here I will suggest a simple method to account for development that does not require as much data and will give unbiased estimates of survival probabilities with virtually no modeling assumptions. This method is based on computing age-to-age differences in survival probabilities and then summing the differences. This is analogous to computing age-to-age factors and multiplying the factors together. I will demonstrate this method with some actual large loss data (which has already been trended). We will assume that we are interested in projecting the loss distribution excess of $1,000,000. Note: A survival probability at a given point of a distribution is simply 1 minus the value of the cumulative distribution function.
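Below is a minimal Python sketch of this calculation for a single loss size, using a hypothetical triangle of empirical survival probabilities (it is not the data on the following slides). For simplicity each age-to-age difference is taken as the difference of the average survival probabilities over the accident years observed at both evaluations; the actual slides pool loss counts across accident years instead.

# Hypothetical triangle: rows are accident years, columns are evaluations,
# entries are empirical survival probabilities at one loss size; None marks
# evaluations not yet available.
triangle = [
    [0.10, 0.16, 0.19, 0.20],   # oldest accident year
    [0.08, 0.15, 0.18, None],
    [0.12, 0.17, None, None],
    [0.09, None, None, None],   # newest accident year
]

n_evals = len(triangle[0])

# Start with the survival probability at the first evaluation, pooled over all years.
estimate = sum(row[0] for row in triangle) / len(triangle)

# Add the age-to-age differences, each computed only from the accident years
# observed at both evaluations (the analogue of an age-to-age factor).
for j in range(n_evals - 1):
    both = [row for row in triangle if row[j + 1] is not None]
    older = sum(row[j] for row in both) / len(both)
    newer = sum(row[j + 1] for row in both) / len(both)
    estimate += newer - older

print(estimate)   # estimated ultimate survival probability at this loss size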

9 We have a triangle of 12 accident years of data with 10 total losses reported excess of $1,000,000 as of the first evaluation. The survival probabilities are:

10 For the first 11 accident years, there are 8 total losses reported as of the first evaluation and 32 total losses reported as of the second evaluation. The survival probabilities are:

11 The age-to-age survival probability differences through the 4th evaluation are:

12 The age-to-age survival probability differences from the 4th through the 8th evaluation are:

13 The age-to-age survival probability differences from the 8th through the 12th evaluation are:

14 The sum of the survival probability differences is shown here next to the survival probabilities calculated using the latest evaluation of the 106 nonzero reported losses without adjustment for development.

15 Given the usual assumption that the loss distribution and its development pattern do not vary by accident year (and that development is complete by the 12th evaluation), the sums of the survival probability differences are unbiased estimates of the true survival probabilities (each age-to-age difference is an unbiased estimate of the true difference, and expectations add). The estimates are subject to the usual uncertainty resulting from the random nature of the loss process. In this case, the unadjusted survival probability estimates generally appear to be too high at small loss sizes and too low at large loss sizes. Although some of this may simply be a result of randomness, this pattern does make intuitive sense.

16 A loss distribution may be fit to the estimated survival probabilities using grouped maximum likelihood estimation with the loss sizes shown as the group boundaries. Given the reversals in the estimated survival probabilities, it may be desirable to smooth out the estimated survival probabilities before fitting a distribution to them. However, this is not necessary. Maximum likelihood estimation will still work even with negative coefficients on some of the terms of the loglikelihood function.
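As an illustration of grouped maximum likelihood in this setting, here is a minimal Python/SciPy sketch that fits a single shifted exponential (a much simpler model than the mixed exponential on the next slide) to estimated survival probabilities. The boundaries, probabilities, and starting value are placeholders, not the data in the exhibits.

import numpy as np
from scipy.optimize import minimize

boundaries = np.array([1e6, 2e6, 3e6, 5e6, 10e6])      # hypothetical loss sizes (group boundaries)
surv_hat   = np.array([1.00, 0.55, 0.33, 0.16, 0.05])  # estimated S(x) at each boundary

# Coefficients for each group plus the open tail; with reversals in the
# estimated survival probabilities some of these could be negative, and the
# maximization still goes through.
mass = np.append(-np.diff(surv_hat), surv_hat[-1])

def neg_loglik(log_theta):
    theta = np.exp(log_theta)                           # exponential mean, shifted at the first boundary
    surv_model = np.exp(-(boundaries - boundaries[0]) / theta)
    group_probs = np.append(-np.diff(surv_model), surv_model[-1])
    return -np.sum(mass * np.log(group_probs))

fit = minimize(neg_loglik, x0=[np.log(2e6)], method="Nelder-Mead")
print(np.exp(fit.x[0]))   # fitted exponential mean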

17 This shows a mixed exponential distribution fit to the estimated survival probabilities. The mixed exponential distribution has means of 1,132,090, 4,929,981 and 63,027,959, with weights of , and , respectively (with loss sizes shifted by $1,000,000).

18 It is not necessary to use a whole triangle of data. Just as can be done with age-to-age factors, it is also possible to use only the most recent diagonals of data. Survival probability estimates from data with various attachment points and policy limits can be made using the Kaplan-Meier Product-Limit Estimator. This has historically been used extensively in survival analysis. It is covered briefly in Loss Models, by Klugman, Panjer and Willmot, and is covered in more detail in Survival Analysis: Techniques for Censored and Truncated Data, by Klein and Moeschberger, and in Survival Models and Their Estimation, by London.
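For what it is worth, here is a minimal Python sketch of the product-limit idea applied to loss sizes, treating an attachment point as left truncation and a policy limit as right censoring. The four losses are hypothetical, the estimate is conditional on a loss exceeding the lowest attachment point, and the references above should be consulted for real work.

# Each record: (observed size, attachment point, censored at the policy limit?)
losses = [
    (1.5e6, 1.0e6, False),
    (2.0e6, 1.0e6, True),    # capped at a 2.0e6 limit; true size unknown
    (3.2e6, 2.0e6, False),
    (5.0e6, 1.0e6, False),
]

# Distinct uncensored loss sizes, in increasing order.
event_sizes = sorted({size for size, _, censored in losses if not censored})

surv = {}
prob = 1.0
for s in event_sizes:
    # Risk set: losses attaching below s that have not dropped out before s.
    at_risk = sum(1 for size, attach, _ in losses if attach < s <= size)
    events = sum(1 for size, _, censored in losses if size == s and not censored)
    prob *= 1.0 - events / at_risk
    surv[s] = prob           # estimated Pr(loss > s), given loss > lowest attachment

for s, p in surv.items():
    print(f"S({s:,.0f}) = {p:.3f}")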

19 When combining data of various maturities, there is inherently going to be more uncertainty in the resulting estimates than if all losses had emerged and were fully developed. Accounting for development by using age-to-age differences in survival probabilities is (as far as I know) a new, untested idea. However, I think it has the potential to be a very useful actuarial technique to attack a difficult problem.

20 PITFALL #2 Failing to Use a Loss Distribution that Fits the Available Data

21 Here is the result of a Pareto distribution fit to some actual data. We assume that losses have been trended and that any necessary adjustments to the empirical distribution to account for loss development have already been made. The Pareto distribution was fit using grouped maximum likelihood estimation with the loss sizes shown as the group boundaries. The very poor fit is a result of attempting to fit a Pareto over the entire range from $0 to $1,000,000. For this Pareto, the scale parameter is 982 and the shape parameter is

22 Here is the result of a Pareto distribution fit with all group boundaries between $0 and $100,000 removed (so the first group is $0-$100,000). The fit is much better above $100,000, but worse below $100,000. One option would be to use another distribution from $0-$100,000 and this distribution from $100,000-$1,000,000. For this Pareto, the scale parameter is 11,157 and the shape parameter is

23 Instead, the actuary in this case used a judgmental method of moments procedure to come up with the following Pareto distribution. Method of moments is generally not a good estimation method to use, and the resulting Pareto does not adequately fit the data anywhere. For this Pareto, the scale parameter is 29,792 and the shape parameter is 2.1.

24 Another good alternative would be to use a mixed exponential distribution as shown below. A mixed exponential is virtually guaranteed to fit well throughout the range of data available. This mixed exponential has means of 867, 10,890, 45,362, 225,607 and infinity (more will be said about this later), with weights of , , , and , respectively.
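For reference, a mixed exponential survival function has the form S(x) = w1*exp(-x/m1) + ... + wk*exp(-x/mk). The Python sketch below uses the means quoted on this slide, but the fitted weights did not come through in this transcript, so the weights shown are placeholders only (nonnegative and summing to 1); a component with an infinite mean contributes a constant to S(x).

import math

means   = [867, 10_890, 45_362, 225_607, math.inf]   # means quoted on this slide
weights = [0.50, 0.30, 0.15, 0.049, 0.001]           # placeholders, not the fitted weights

def survival(x):
    # Mixed exponential survival probability S(x)
    return sum(w * (1.0 if math.isinf(m) else math.exp(-x / m))
               for w, m in zip(weights, means))

for x in (100_000, 500_000, 1_000_000):
    print(x, round(survival(x), 4))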

25 PITFALL #3 Extrapolating Beyond the Available Data

26 This exhibit shows the expected number of claims excess of loss sizes from $100,000 to $100,000,000 for a lognormal fit with a first group of $0-$100,000 (mu=8.19 and sigma=1.86), the Pareto fit with these groups, and the mixed exponential, which was fit with all groups from $0-$1,000,000.
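The exhibit itself is not reproduced here, but one column of it can be sketched as follows: the expected number of claims excess of a loss size is a total expected claim count times the fitted survival probability at that size. In the Python/SciPy sketch below, the lognormal parameters (mu = 8.19, sigma = 1.86) are from this slide; the total claim count of 10,000 is a placeholder, and the Pareto and mixed exponential columns would be built the same way from their own survival functions.

import numpy as np
from scipy.stats import lognorm

mu, sigma = 8.19, 1.86
total_claims = 10_000                              # hypothetical total expected claim count

for x in (1e5, 1e6, 1e7, 1e8):
    sf = lognorm.sf(x, s=sigma, scale=np.exp(mu))  # Pr(loss > x) under the fitted lognormal
    print(f"{x:>15,.0f}  {total_claims * sf:12.2f}")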

27 All the distributions show similar behavior up to $1,000,000, where the data stops. Beyond this, their behavior differs greatly. There is no way to tell with any reliability what the true distribution looks like in the tail where no data is available. One must instead refer to data from similar risks or rely on judgment; extrapolation is inappropriate. The mixed exponential survival probabilities are constant in the tail, because this distribution has a small weight on a mean of infinity. In other cases, where the mixed exponential does not have a mean of infinity, the survival probabilities can tail off to zero very fast. This simply illustrates that the mixed exponential, like any other distribution, cannot be used to extrapolate.

28 The fundamental reason that loss distributions are useful is that we generally believe that the probability distribution underlying a process that generates loss data is reasonably smooth, certainly smoother than the empirical distribution. Thus, by smoothing the data, we expect to obtain better estimates than if we just used the empirical data. We virtually never have any particular reason to believe that data comes from one type of distribution or another. Distributions are just a smoothing device. Thus, the use of ill-fitting parametric distributions, or extrapolation with either ill-fitting or well-fitting parametric distributions, is poor actuarial practice. The mixed exponential distribution is a particularly useful distribution for fitting loss data because it is flexible enough to virtually always provide a good fit and yet it still maintains an appropriate degree of smoothness. For those who are interested, my 1999 Proceedings paper on the mixed exponential distribution is available on the CAS Web site.