Bivariate Poisson regression models for automobile insurance pricing Lluís Bermúdez i Morata Universitat de Barcelona IME 2007 Piraeus, July
Introduction Bivariate Poisson regression models The database Results Conclusions
Introduction Designing a tariff structure for insurance is one of the main tasks of actuaries. This task is especially problematic in automobile insurance because portfolios are very heterogeneous. One way to handle this heterogeneity, called tariff segmentation or a priori ratemaking, consists of segmenting the portfolio into homogeneous classes so that all the insured who belong to a class pay the same premium. For this purpose, we must first determine the factors used to classify each risk (a priori variables), verifying statistically that the probability of occurrence of the risk depends on these factors, and secondly, measure their influence.
A priori classification is conveniently achieved with generalized linear models (GLMs). The GLM most commonly used for this tariff system is the Poisson regression model and its generalizations (Dionne and Vanasse, 1989). Although these models can be applied with the total number of claims as the response variable, automobile insurance policies cover different risks, so the models are usually fitted to the number of claims for each class of guarantee. A premium is therefore obtained for each class of guarantee, depending on different factors. Then, assuming independence between the classes, the total premium is obtained as the sum of these premiums.
Here, we consider two types of guarantee: automobile third-party liability insurance (type I) and the rest of automobile guarantees (type II). Assuming independence between types, the premium is obtained as the sum of the premiums for each type of guarantee and depends on the rating factors chosen. HOWEVER... Is the independence assumption realistic? If we relax it, how does this affect the tariff system? These important questions are the aim of this work.
Bivariate Poisson regression models Let $N_1$ be the number of claims for automobile third-party liability insurance and $N_2$ the number of claims for the rest of automobile guarantees. The usual methodology assumes that $N_1 \sim \mathrm{Poisson}(\lambda_1)$ and $N_2 \sim \mathrm{Poisson}(\lambda_2)$, with independence between them. Finally, the total premium is obtained by $E[N_1] + E[N_2] = \lambda_1 + \lambda_2$.
If we relax the independence assumption, we can use bivariate Poisson regression models, which are suitable for paired count data exhibiting correlation. Consider independent random variables $X_i$ ($i = 1, 2, 3$) distributed as Poisson with parameters $\lambda_i$ respectively. Then the random variables $N_1 = X_1 + X_3$ and $N_2 = X_2 + X_3$ jointly follow a bivariate Poisson distribution (Kocherlakota and Kocherlakota, 1992): $(N_1, N_2) \sim \mathrm{BP}(\lambda_1, \lambda_2, \lambda_3)$.
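The construction above can be checked by simulation. The following is a minimal sketch, assuming NumPy is available; the function name `simulate_bivariate_poisson` and the parameter values are illustrative choices, not taken from the talk.

```python
import numpy as np

def simulate_bivariate_poisson(lam1, lam2, lam3, size, seed=0):
    """Draw (N1, N2) pairs via N1 = X1 + X3 and N2 = X2 + X3."""
    rng = np.random.default_rng(seed)
    x1 = rng.poisson(lam1, size)
    x2 = rng.poisson(lam2, size)
    x3 = rng.poisson(lam3, size)  # shared component that induces dependence
    return x1 + x3, x2 + x3

# Illustrative (made-up) parameter values, not estimates from the talk.
n1, n2 = simulate_bivariate_poisson(0.10, 0.15, 0.05, size=200_000)
sample_cov = np.cov(n1, n2)[0, 1]
print("mean N1:", n1.mean(), "mean N2:", n2.mean(), "cov:", sample_cov)
```

With these inputs the sample means should be close to 0.10 + 0.05 and 0.15 + 0.05, and the sample covariance close to 0.05, the parameter of the shared component.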
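And its joint probability function is given by: $P(N_1 = n_1, N_2 = n_2) = e^{-(\lambda_1 + \lambda_2 + \lambda_3)} \frac{\lambda_1^{n_1}}{n_1!} \frac{\lambda_2^{n_2}}{n_2!} \sum_{i=0}^{\min(n_1, n_2)} \binom{n_1}{i} \binom{n_2}{i} \, i! \left( \frac{\lambda_3}{\lambda_1 \lambda_2} \right)^i$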
Several properties are interesting and useful for our purpose: the model allows for positive dependence between $N_1$ and $N_2$, and $\mathrm{Cov}(N_1, N_2) = \lambda_3$ is a measure of this dependence; if $\lambda_3 = 0$, the two random variables are independent and the product of two independent Poisson distributions, referred to as the double Poisson distribution, is obtained; the marginal distributions are Poisson with $E[N_1] = \lambda_1 + \lambda_3$ and $E[N_2] = \lambda_2 + \lambda_3$; and the total premium can be obtained by $E[N_1] + E[N_2] = \lambda_1 + \lambda_2 + 2\lambda_3$.
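These properties follow in a few lines from the construction $N_1 = X_1 + X_3$, $N_2 = X_2 + X_3$ with independent Poisson components. A short worked derivation, using only the independence of the $X_i$ and the fact that a Poisson variable has equal mean and variance:

```latex
\begin{align*}
\mathrm{Cov}(N_1, N_2)
  &= \mathrm{Cov}(X_1 + X_3,\; X_2 + X_3) \\
  &= \mathrm{Cov}(X_1, X_2) + \mathrm{Cov}(X_1, X_3)
     + \mathrm{Cov}(X_3, X_2) + \mathrm{Var}(X_3) \\
  &= 0 + 0 + 0 + \lambda_3 = \lambda_3, \\[4pt]
E[N_1] &= E[X_1] + E[X_3] = \lambda_1 + \lambda_3, \qquad
E[N_2] = E[X_2] + E[X_3] = \lambda_2 + \lambda_3, \\[4pt]
E[N_1] + E[N_2] &= \lambda_1 + \lambda_2 + 2\lambda_3 .
\end{align*}
```

Because $\lambda_3 \ge 0$, the covariance cannot be negative, which is why the model only accommodates positive dependence.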
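If covariates are introduced to model $\lambda_i$ ($i = 1, 2, 3$), we can define a bivariate Poisson regression model with the following scheme, for policyholder $j$: $(N_{1j}, N_{2j}) \sim \mathrm{BP}(\lambda_{1j}, \lambda_{2j}, \lambda_{3j})$, with $\log \lambda_{ij} = x_{ij}^{T} \beta_i$ for $i = 1, 2, 3$, where $x_{ij}$ is the vector of covariates used for $\lambda_i$ and $\beta_i$ the corresponding vector of regression coefficients. For the calculations, we used the EM algorithm provided by Karlis and Ntzoufras (2005) and its implementation in R. Standard errors for the parameters were calculated using standard bootstrap methods.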
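To give a feel for how an EM algorithm treats the shared component $X_3$ as missing data, here is a simplified sketch for the case without covariates (constant $\lambda_1$, $\lambda_2$, $\lambda_3$). It is not the R implementation used for the calculations; the function names `bp_pmf` and `fit_bp_em`, the starting values, and the simulated data are assumptions made for the example.

```python
from math import exp, factorial
import numpy as np

def bp_pmf(n1, n2, l1, l2, l3):
    """Bivariate Poisson pmf as a sum over the shared component X3 = i."""
    return exp(-(l1 + l2 + l3)) * sum(
        l1 ** (n1 - i) / factorial(n1 - i)
        * l2 ** (n2 - i) / factorial(n2 - i)
        * l3 ** i / factorial(i)
        for i in range(min(n1, n2) + 1)
    )

def fit_bp_em(n1, n2, n_iter=300):
    """EM estimates of (lam1, lam2, lam3) for a constant-parameter model."""
    data = np.column_stack([n1, n2])
    pairs, counts = np.unique(data, axis=0, return_counts=True)
    m1, m2 = data[:, 0].mean(), data[:, 1].mean()
    l3 = 0.1 * min(m1, m2)          # arbitrary starting value (assumption)
    l1, l2 = m1 - l3, m2 - l3
    for _ in range(n_iter):
        # E-step: conditional expectation of X3 given each observed pair.
        s = np.array([
            l3 * bp_pmf(a - 1, b - 1, l1, l2, l3) / bp_pmf(a, b, l1, l2, l3)
            if min(a, b) > 0 else 0.0
            for a, b in pairs.tolist()
        ])
        # M-step: Poisson means of the completed data.
        l3 = float((s * counts).sum() / counts.sum())
        l1, l2 = m1 - l3, m2 - l3
    return l1, l2, l3

# Simulated data with made-up parameters (0.5, 0.7, 0.3), for illustration only.
rng = np.random.default_rng(1)
x1, x2, x3 = (rng.poisson(lam, 50_000) for lam in (0.5, 0.7, 0.3))
obs1, obs2 = x1 + x3, x2 + x3
est1, est2, est3 = fit_bp_em(obs1, obs2)
print(est1, est2, est3)
```

In the regression setting, the M-step would instead fit a Poisson GLM with log link for each component; the sketch replaces that step with sample means to stay short.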
A problem arises when we look more closely at the joint probability function. In our database, the proportion of (0, 0) observations is clearly much larger than that of the other cases. Therefore, it is reasonable to fit a zero-inflated model. There are few papers discussing zero-inflated models for bivariate discrete distributions. Here we follow the zero-inflated bivariate Poisson model proposed by Karlis and Ntzoufras (2005). In fact, they propose an extension of the simple zero-inflated model which inflates the probabilities on the diagonal of the probability table.
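For reference, the diagonal-inflated model can be written as a mixture. The notation below ($p$ for the mixing proportion, $D(\cdot \mid \theta)$ for the discrete distribution placed on the diagonal, $f_{BP}$ for the bivariate Poisson probability function) follows the usual presentation of Karlis and Ntzoufras (2005) and is an assumption here, since the talk does not display the formula:

```latex
\[
f_{ID}(n_1, n_2) =
\begin{cases}
(1 - p)\, f_{BP}(n_1, n_2 \mid \lambda_1, \lambda_2, \lambda_3),
  & n_1 \neq n_2, \\[4pt]
(1 - p)\, f_{BP}(n_1, n_1 \mid \lambda_1, \lambda_2, \lambda_3)
  + p\, D(n_1 \mid \theta),
  & n_1 = n_2 .
\end{cases}
\]
```

The simple zero-inflated model is the special case in which $D$ puts all its mass at zero, so that only the $(0, 0)$ cell is inflated.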
The database The original sample is a 10% sample of the automobile portfolio of a Spanish insurance company, contains policyholders. The exogenous variables, which are also used in Pinquet et al. (2001), are the ones described here:
The cross-tabulation of the number of claims for third-party liability, $N_1$, against the number of claims for the rest of guarantees, $N_2$, is shown here:
Results First, in order to show the convenience of using the bivariate Poisson model, we fitted the simple bivariate Poisson model (with constant $\lambda_1$, $\lambda_2$, $\lambda_3$). The estimated values for these parameters are: $\lambda_1$ = , $\lambda_2$ = , $\lambda_3$ = . We obtained an AIC of for the bivariate Poisson model, which is better than the values for the double Poisson model ( ) and the saturated model ( ). We see that even with a small correlation between $N_1$ and $N_2$, including $\lambda_3$ produces a better fit for the data used.
Results: zero-inflated bivariate Poisson
Conclusions We show that the use of bivariate Poisson models, which include a term to measure the dependence, produces a better fit than univariate models assuming independence. Therefore, answering the first question, the usual independence assumption is not realistic here. The next step would be to analyze more carefully the consequences for the tariff system in order to answer the second question. Is the independence assumption realistic? If we relax it, how does this affect the tariff system?