Bivariate Poisson regression models for automobile insurance pricing Lluís Bermúdez i Morata Universitat de Barcelona IME 2007 Piraeus, 10-12 July.



• Introduction
• Bivariate Poisson regression models
• The database
• Results
• Conclusions

Introduction
• Designing a tariff structure is the main task of actuaries. This task is especially difficult in automobile insurance because portfolios are very heterogeneous.
• One way to handle this heterogeneity, called tariff segmentation or a priori ratemaking, is to segment the portfolio into homogeneous classes so that all insureds who belong to a class pay the same premium.
• For this purpose, we must first determine the factors used to classify each risk (a priori variables), verifying statistically that the probability of occurrence of the risk depends on these factors, and then measure their influence.

• A priori classification is conveniently achieved with generalized linear models (GLMs). The most common GLM for this tariff system is the Poisson regression model and its generalizations (Dionne and Vanasse, 1989).
• Although these models can be applied with the total number of claims as the response variable, automobile insurance policies cover different risks, so the models are instead fitted to the number of claims for each class of guarantee.
• A premium is therefore obtained for each class of guarantee, depending on different factors. Then, assuming independence between the classes, the total premium is obtained as the sum of these premiums.

• Here, we assume two different types of guarantee: automobile third-party liability insurance (type I) and the rest of automobile guarantees (type II).
• Assuming independence between types, the premium is obtained as the sum of the premiums for each type of guarantee and depends on the rating factors chosen.
HOWEVER... Is the independence assumption realistic? If we relax it, how does this affect the tariff system?
• These important questions are the subject of this work.

Bivariate Poisson regression models
• Let
  $N_1$: the number of claims for the automobile third-party liability insurance.
  $N_2$: the number of claims for the rest of automobile guarantees.
• The usual methodology assumes that $N_1 \sim \text{Poisson}(\lambda_1)$ and $N_2 \sim \text{Poisson}(\lambda_2)$, and independence between them.
• Finally, the total premium is obtained by $E[N_1] + E[N_2] = \lambda_1 + \lambda_2$.

• If we relax the independence assumption, we can use bivariate Poisson regression models, which are suitable for paired count data exhibiting correlation.
• Consider independent random variables $X_i$ ($i = 1, 2, 3$), Poisson distributed with parameters $\lambda_i$ respectively.
• Then the random variables $N_1 = X_1 + X_3$ and $N_2 = X_2 + X_3$ jointly follow a bivariate Poisson distribution (Kocherlakota and Kocherlakota, 1992): $(N_1, N_2) \sim BP(\lambda_1, \lambda_2, \lambda_3)$.
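
The construction $N_1 = X_1 + X_3$, $N_2 = X_2 + X_3$ can be checked by simulation. The following is a minimal sketch, not the author's code: the function names, the parameter values (1.0, 2.0, 0.5) and the sample size are illustrative assumptions, and the Poisson sampler is Knuth's simple multiplication method from the standard library only.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_bp(lam1, lam2, lam3, n, seed=0):
    """Simulate n pairs (N1, N2) with N1 = X1 + X3 and N2 = X2 + X3."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x1 = poisson_draw(lam1, rng)
        x2 = poisson_draw(lam2, rng)
        x3 = poisson_draw(lam3, rng)  # shared component that induces dependence
        pairs.append((x1 + x3, x2 + x3))
    return pairs


def sample_moments(pairs):
    """Return the two sample means and the sample covariance."""
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs) / (n - 1)
    return m1, m2, cov


# Illustrative (assumed) parameter values, not estimates from the presentation.
pairs = simulate_bp(1.0, 2.0, 0.5, 20000)
mean1, mean2, cov = sample_moments(pairs)
print(mean1, mean2, cov)
```

With these assumed values the sample means should be close to $\lambda_1 + \lambda_3 = 1.5$ and $\lambda_2 + \lambda_3 = 2.5$, and the sample covariance close to $\lambda_3 = 0.5$.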

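• Its joint probability function is given by:
$$P(N_1 = n_1, N_2 = n_2) = e^{-(\lambda_1 + \lambda_2 + \lambda_3)} \frac{\lambda_1^{n_1}}{n_1!} \frac{\lambda_2^{n_2}}{n_2!} \sum_{k=0}^{\min(n_1, n_2)} \binom{n_1}{k} \binom{n_2}{k} k! \left( \frac{\lambda_3}{\lambda_1 \lambda_2} \right)^k$$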
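
As a sketch of how the joint probability function can be evaluated, the code below sums over the possible values $k$ of the shared component $X_3$, using the equivalent form $e^{-(\lambda_1+\lambda_2+\lambda_3)} \sum_k \lambda_1^{n_1-k} \lambda_2^{n_2-k} \lambda_3^{k} / ((n_1-k)!\,(n_2-k)!\,k!)$, which avoids dividing by $\lambda_1 \lambda_2$. The function names and the parameter values are assumptions made for illustration.

```python
import math


def poisson_pmf(n, lam):
    """Univariate Poisson probability P(N = n)."""
    return math.exp(-lam) * lam ** n / math.factorial(n)


def bp_pmf(n1, n2, lam1, lam2, lam3):
    """Bivariate Poisson probability P(N1 = n1, N2 = n2).

    Sums over k, the value of the shared component X3, so that
    X1 = n1 - k and X2 = n2 - k.
    """
    if n1 < 0 or n2 < 0:
        return 0.0
    total = 0.0
    for k in range(min(n1, n2) + 1):
        total += (
            lam1 ** (n1 - k) / math.factorial(n1 - k)
            * lam2 ** (n2 - k) / math.factorial(n2 - k)
            * lam3 ** k / math.factorial(k)
        )
    return math.exp(-(lam1 + lam2 + lam3)) * total


# Illustrative (assumed) parameter values.
grid_total = sum(bp_pmf(a, b, 1.0, 2.0, 0.5) for a in range(40) for b in range(40))
print(grid_total)  # should be very close to 1

# With lam3 = 0 the pmf factorizes into two independent Poisson pmfs.
print(bp_pmf(2, 1, 1.0, 2.0, 0.0), poisson_pmf(2, 1.0) * poisson_pmf(1, 2.0))
```

Writing the sum in terms of $n_1 - k$ and $n_2 - k$ keeps the function well defined when $\lambda_3 = 0$, which is the double Poisson case.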

• Several properties are useful for our purpose:
  - the model allows for positive dependence between $N_1$ and $N_2$, and $\mathrm{Cov}(N_1, N_2) = \lambda_3$ is a measure of this dependence;
  - if $\lambda_3 = 0$, the two random variables are independent and the product of two independent Poisson distributions, referred to as the double Poisson distribution, is obtained;
  - the marginal distributions are Poisson with $E[N_1] = \lambda_1 + \lambda_3$ and $E[N_2] = \lambda_2 + \lambda_3$; and
  - the total premium can be obtained as $E[N_1] + E[N_2] = \lambda_1 + \lambda_2 + 2\lambda_3$.
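
These properties follow directly from the construction $N_1 = X_1 + X_3$, $N_2 = X_2 + X_3$ with independent Poisson components. A short derivation, added here as a worked step (the correlation formula in the last line is a standard consequence and is not stated on the slide):

```latex
\begin{align*}
E[N_1] &= E[X_1] + E[X_3] = \lambda_1 + \lambda_3, \\
E[N_2] &= E[X_2] + E[X_3] = \lambda_2 + \lambda_3, \\
\operatorname{Cov}(N_1, N_2) &= \operatorname{Cov}(X_1 + X_3,\; X_2 + X_3)
  = \operatorname{Var}(X_3) = \lambda_3, \\
E[N_1] + E[N_2] &= \lambda_1 + \lambda_2 + 2\lambda_3, \\
\operatorname{Corr}(N_1, N_2) &= \frac{\lambda_3}{\sqrt{(\lambda_1 + \lambda_3)(\lambda_2 + \lambda_3)}}.
\end{align*}
```

The covariance reduces to $\operatorname{Var}(X_3)$ because all cross terms involve independent components, and since $\lambda_3 \ge 0$ the model can only represent non-negative dependence.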

• If covariates are introduced to model $\lambda_i$ ($i = 1, 2, 3$), we can define a bivariate Poisson regression model with the following scheme:
$$(N_{1i}, N_{2i}) \sim BP(\lambda_{1i}, \lambda_{2i}, \lambda_{3i}), \qquad \log \lambda_{ki} = x_{ki}^{\top} \beta_k, \quad k = 1, 2, 3$$
• For the calculations, we used the EM algorithm provided by Karlis and Ntzoufras (2005) and its implementation in R. Standard errors for the parameters were calculated using standard bootstrap methods.
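
To make the role of the covariates concrete, here is a minimal sketch that maps one policyholder's covariates to the three parameters and then to the expected claim counts. It assumes a log link for each $\lambda_k$, as in the Karlis and Ntzoufras formulation; the covariate vector, the coefficient values and the function names are hypothetical and are not taken from the presentation.

```python
import math


def log_link(x, beta):
    """Return exp(x . beta), the Poisson parameter under a log link."""
    return math.exp(sum(xi * bi for xi, bi in zip(x, beta)))


def expected_claims(x, beta1, beta2, beta3):
    """Expected claim counts for one policyholder with covariates x.

    Assumes all three parameters share the same covariate vector x,
    which is a simplification made for this sketch.
    """
    lam1 = log_link(x, beta1)
    lam2 = log_link(x, beta2)
    lam3 = log_link(x, beta3)
    return {
        "E_N1": lam1 + lam3,               # third-party liability claims
        "E_N2": lam2 + lam3,               # claims for the rest of guarantees
        "cov": lam3,                       # Cov(N1, N2)
        "total": lam1 + lam2 + 2 * lam3,   # E[N1] + E[N2]
    }


# Hypothetical policyholder: intercept plus one binary rating factor.
x = [1.0, 1.0]
# Hypothetical coefficients, chosen only to give small claim frequencies.
result = expected_claims(x, [-2.5, 0.3], [-2.0, 0.1], [-4.0, 0.0])
print(result)
```

The log link keeps every $\lambda_k$ positive whatever the coefficient values, which is why it is the usual choice for Poisson-type parameters.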

• A problem arises when we look more closely at the joint probability function. In our database, the proportion of (0, 0) observations is clearly much larger than that of the other cases. It is therefore reasonable to fit a zero-inflated model.
• Few papers discuss zero-inflated models for bivariate discrete distributions. Here we follow the zero-inflated bivariate Poisson model proposed by Karlis and Ntzoufras (2005). In fact, they propose an extension of the simple zero-inflated model that inflates the probabilities on the diagonal of the probability table.
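
The inflation idea can be sketched as a mixture: with probability $1 - p$ the pair comes from the bivariate Poisson distribution, and with probability $p$ it is placed on the diagonal cell $(j, j)$ according to some discrete distribution. The simple zero-inflated model is the special case where all the extra mass sits on $(0, 0)$. The code below is an illustration under those assumptions; the function names, the mixing proportion and the parameter values are hypothetical.

```python
import math


def bp_pmf(n1, n2, lam1, lam2, lam3):
    """Bivariate Poisson probability P(N1 = n1, N2 = n2)."""
    if n1 < 0 or n2 < 0:
        return 0.0
    total = sum(
        lam1 ** (n1 - k) / math.factorial(n1 - k)
        * lam2 ** (n2 - k) / math.factorial(n2 - k)
        * lam3 ** k / math.factorial(k)
        for k in range(min(n1, n2) + 1)
    )
    return math.exp(-(lam1 + lam2 + lam3)) * total


def diag_inflated_pmf(n1, n2, lam1, lam2, lam3, p, diag_probs):
    """Diagonal-inflated bivariate Poisson probability.

    diag_probs maps j to the share of the extra mass p placed on the
    diagonal cell (j, j); its values are assumed to sum to 1.
    """
    prob = (1.0 - p) * bp_pmf(n1, n2, lam1, lam2, lam3)
    if n1 == n2:
        prob += p * diag_probs.get(n1, 0.0)
    return prob


def zero_inflated_pmf(n1, n2, lam1, lam2, lam3, p):
    """Simple zero-inflated case: all extra mass on the (0, 0) cell."""
    return diag_inflated_pmf(n1, n2, lam1, lam2, lam3, p, {0: 1.0})


# Hypothetical values: small claim frequencies and 30% extra mass at (0, 0).
params = (0.1, 0.2, 0.05)
p00 = zero_inflated_pmf(0, 0, *params, 0.3)
grid_total = sum(
    zero_inflated_pmf(a, b, *params, 0.3) for a in range(30) for b in range(30)
)
print(p00, grid_total)
```

Because the inflated model is a proper mixture, its probabilities still sum to one, and setting $p = 0$ recovers the plain bivariate Poisson model.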

The database
• The data are a 10% sample of the policyholders in the automobile portfolio of a Spanish insurance company.
• The exogenous variables, which are also used in Pinquet et al. (2001), are the ones described here:

• The cross-tabulation of the number of claims for third-party liability, $N_1$, against the number of claims for the rest of guarantees, $N_2$, is shown here:

Results
• First, to show the convenience of using the bivariate Poisson model, we fitted the simple bivariate Poisson model (with constant $\lambda_1$, $\lambda_2$, $\lambda_3$).
• The estimated values for these parameters are: $\lambda_1 =$ , $\lambda_2 =$ , $\lambda_3 =$
• We obtained an AIC of for the bivariate Poisson model, which is better than the values for the double Poisson model ( ) and the saturated model ( ). We see that even with a small correlation between $N_1$ and $N_2$, including $\lambda_3$ produces a better fit for the data used.
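
The comparison between the simple bivariate Poisson model and the double Poisson model can be illustrated on synthetic data. The sketch below is not the R implementation used for the presentation: it is a small EM routine for the constant-parameter case only, run on simulated pairs with assumed parameter values (1.0, 2.0, 0.5). The E-step uses the identity $E[X_3 \mid n_1, n_2] = \lambda_3 \, P(n_1 - 1, n_2 - 1) / P(n_1, n_2)$, and AIC is computed as $-2 \log L + 2k$.

```python
import math
import random
from collections import Counter


def bp_pmf(n1, n2, lam1, lam2, lam3):
    """Bivariate Poisson probability P(N1 = n1, N2 = n2)."""
    if n1 < 0 or n2 < 0:
        return 0.0
    total = sum(
        lam1 ** (n1 - k) / math.factorial(n1 - k)
        * lam2 ** (n2 - k) / math.factorial(n2 - k)
        * lam3 ** k / math.factorial(k)
        for k in range(min(n1, n2) + 1)
    )
    return math.exp(-(lam1 + lam2 + lam3)) * total


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def log_likelihood(counts, lam1, lam2, lam3):
    """Log-likelihood of tabulated pairs {(n1, n2): frequency}."""
    return sum(
        w * math.log(bp_pmf(a, b, lam1, lam2, lam3)) for (a, b), w in counts.items()
    )


def fit_bp_em(counts, iters=300):
    """EM for the constant-parameter bivariate Poisson model."""
    n = sum(counts.values())
    mean1 = sum(a * w for (a, _), w in counts.items()) / n
    mean2 = sum(b * w for (_, b), w in counts.items()) / n
    lam3 = 0.5 * min(mean1, mean2)  # arbitrary starting value
    lam1, lam2 = mean1 - lam3, mean2 - lam3
    for _ in range(iters):
        # E-step: expected shared component X3 for each observed cell.
        s = sum(
            w * lam3 * bp_pmf(a - 1, b - 1, lam1, lam2, lam3)
            / bp_pmf(a, b, lam1, lam2, lam3)
            for (a, b), w in counts.items()
            if a > 0 and b > 0
        )
        # M-step: complete-data Poisson means.
        lam3 = s / n
        lam1, lam2 = mean1 - lam3, mean2 - lam3
    return lam1, lam2, lam3, mean1, mean2


def aic(loglik, n_params):
    """Akaike information criterion; smaller is better."""
    return -2.0 * loglik + 2 * n_params


# Synthetic data with assumed parameters; not the insurance database.
rng = random.Random(2007)
counts = Counter()
for _ in range(5000):
    x1, x2, x3 = (poisson_draw(lam, rng) for lam in (1.0, 2.0, 0.5))
    counts[(x1 + x3, x2 + x3)] += 1

lam1, lam2, lam3, mean1, mean2 = fit_bp_em(counts)
aic_bp = aic(log_likelihood(counts, lam1, lam2, lam3), 3)
# Double Poisson: lam3 = 0, so the estimates are the two sample means.
aic_dp = aic(log_likelihood(counts, mean1, mean2, 0.0), 2)
print(lam1, lam2, lam3, aic_bp, aic_dp)
```

On data generated with positive dependence, the bivariate Poisson fit should show the lower AIC despite its extra parameter. Tabulating the pairs first keeps each EM iteration cheap, since the likelihood only needs one evaluation per distinct cell.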

Results

Results: zero-inflated bivariate Poisson

Conclusions
• We show that bivariate Poisson models, which include a term to measure the dependence, produce a better fit than univariate models assuming independence.
• Therefore, answering the first question, the usual independence assumption is not realistic here. The next step would be to analyze more carefully the consequences for the tariff system in order to answer the second question.
Is the independence assumption realistic? If we relax it, how does this affect the tariff system?