Modeling risk of accidents in the Property and Casualty Insurance Industry
By Torna Omar Soro


DEPENDENT VARIABLE: Number of claims (0, 1, 2)

ATTRIBUTES:
 1. Total number of vehicles on a policy
 2. Total number of drivers on a policy
 3. Anti-theft device
 4. Driver training (1 if driver has training, 0 otherwise)
 5. Age of the oldest driver
 6. Age of the youngest driver
 7. Territory (driver's location): cost of territory (numerical value; take the log)
 8. SDIP: the Safe Driver Insurance Plan
 9. Credit score flag (1 if driver has a credit score, 0 otherwise)
 10. Credit score (numerical value; take the log)
 11. Business source (1 if it's a book transfer, 0 if walk-in)
 12. Group insurance flag (1 if driver is from a group insurance plan, 0 otherwise)

 The Poisson regression model:

P(y = k) = e^(-μ) μ^k / k!,  k = 0, 1, 2, …  with ln(μ) = xβ

 Where μ is the expected value (mean) of y.
 An unusual property: Var(y) = E(y) = μ.
 This model can be estimated by maximum likelihood.
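The Poisson maximum-likelihood fit described above can be sketched in a few lines. The following is a minimal illustration on simulated data using iteratively reweighted least squares (Newton's method for the log link); the data and variable names are made up for illustration, not taken from the thesis data.

```python
import numpy as np

# Simulate Poisson counts with a log link: ln(mu) = X @ beta_true.
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ beta_true))

# IRLS / Newton iterations for the Poisson log-likelihood.
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)      # under the model, E(y) = Var(y) = mu
    XtW = X.T * mu             # X' diag(mu)
    beta = beta + np.linalg.solve(XtW @ X, X.T @ (y - mu))

print(beta)  # close to beta_true
```

With a few thousand observations the estimates land near the true coefficients, which is the MLE behavior the slide relies on.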

 Overdispersion (often Var(y) > E(y)) or underdispersion.
 While overdispersion does not bias the coefficients, it does lead to underestimates of the standard errors.
 Overdispersion also implies that the conventional MLE is not efficient.
 One reason for overdispersion: excess zeros.
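A quick empirical check for overdispersion is to compare the sample mean and variance, since a Poisson fit assumes they are equal. This sketch (simulated data, not the thesis data) contrasts Poisson draws with negative-binomial draws chosen to have mean 2 and variance 4:

```python
import numpy as np

rng = np.random.default_rng(1)
poisson_y = rng.poisson(2.0, size=100_000)
# negative binomial with size n=2, p = n/(n+mean): mean 2, variance 4
negbin_y = rng.negative_binomial(2, 2 / (2 + 2.0), size=100_000)

print(poisson_y.mean(), poisson_y.var())  # both near 2: equidispersed
print(negbin_y.mean(), negbin_y.var())    # mean near 2, variance near 4
```

A sample variance well above the sample mean, as in the second case, is the Var(y) > E(y) pattern named above.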

 State wildlife biologists want to model how many fish are caught by fishermen at a state park.
 Visitors are asked how long they stayed, how many people were in the group, whether there were children in the group, and how many fish were caught.
 Some visitors do not fish, but there is no data on whether a person fished or not.
 Some visitors who did fish did not catch any fish, so there are excess zeros in the data because of the people who did not fish.

 Property and casualty insurance: the excess zeros may come from different sources:
 1. Censored data create more zeros.
 2. Some drivers drive less or only occasionally; they prefer taking public transportation.

 It's a generalization of the Poisson model that allows for correction of overdispersion.
 A disturbance term ε is included in the model, which accounts for the overdispersion:

ln(μ) = xβ + ε,  where exp(ε) has a standard gamma distribution

 so that Var(y) = μ + αμ², where α is a constant overdispersion parameter (Poisson: α = 0).
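Under the negative-binomial variance form Var(y) = μ + αμ², a simple method-of-moments estimate of the overdispersion constant is α̂ = (s² − ȳ) / ȳ². A sketch on simulated data (the parameterization below maps α to NumPy's size/probability parameters; all values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, alpha = 3.0, 0.5          # target mean and overdispersion constant
size = 1 / alpha              # NB "size" parameter r
p = size / (size + mu)        # NB success probability
y = rng.negative_binomial(size, p, size=200_000)

# Var(y) = mu + alpha * mu^2, so alpha = (variance - mean) / mean^2
alpha_hat = (y.var() - y.mean()) / y.mean() ** 2
print(alpha_hat)  # near 0.5; alpha_hat near 0 would point back to Poisson
```

An α̂ close to zero suggests the plain Poisson model is adequate; a clearly positive α̂ supports the negative-binomial correction described above.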

 An alternative response to modeling overdispersion.
 Some zeros result from fishing and not catching any fish; in the case of insurance, these zeros result from driving and not causing accidents.
 Other zeros result from not fishing at all; in the case of insurance, some zeros result from not driving a lot, and censored data may also result in more zeros.
 Zero-inflated models allow one to model each process separately.

 The ZIP model has two parts: a Poisson count model and a logit model for predicting excess zeros. With probability π the response is a structural zero, and with probability 1 − π it comes from the Poisson count model:

P(y = 0) = π + (1 − π) e^(-μ)
P(y = k) = (1 − π) e^(-μ) μ^k / k!,  k = 1, 2, …

 A logit model for π is used together with the count model for μ.
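The two-part mixture can be written directly as a probability mass function. A minimal sketch (function name and parameters are illustrative):

```python
import math

def zip_pmf(k, pi, mu):
    """Zero-inflated Poisson: structural zero with probability pi,
    otherwise a Poisson(mu) count."""
    poisson = math.exp(-mu) * mu ** k / math.factorial(k)
    if k == 0:
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson

# sanity check: the probabilities sum to one (truncating the tail)
total = sum(zip_pmf(k, 0.3, 2.0) for k in range(50))
print(total)  # ~1.0
```

Note that P(y = 0) mixes both sources of zeros, which is exactly why the plain Poisson model underpredicts zeros on this kind of data.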

 Deviance = 19,119 and DF = 55,595.
 We have a large degree of freedom (DF) relative to the deviance (55,595 > 19,119): underdispersion (less variation than the model assumes).
 Overestimate of standard errors
 MLE not efficient
 Inadequate fit of the Poisson model
 Estimation of negative binomial and zero-inflated Poisson models

Quality of fit: deviance and AIC (Akaike Information Criterion).

AIC = 2k − 2 ln(L)

 k = number of parameters in the model
 L = maximum value of the likelihood function
 The preferred model is the one with the minimum AIC.
 Here AIC (NB) < AIC (Poisson), so the negative binomial (NB) is better.
 AIC rewards goodness of fit and imposes a penalty that is an increasing function of the number of estimated parameters; the penalty discourages overfitting.
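The AIC rule above is a one-line computation. In this sketch the log-likelihood values are made-up numbers for illustration (the thesis reports its own fitted values), chosen so the extra NB parameter is outweighed by the better fit:

```python
def aic(k, loglik):
    # AIC = 2k - 2 ln(L)
    return 2 * k - 2 * loglik

aic_poisson = aic(k=13, loglik=-9600.0)  # hypothetical Poisson fit
aic_negbin = aic(k=14, loglik=-9400.0)   # hypothetical NB fit: one extra parameter

best = min((aic_poisson, "Poisson"), (aic_negbin, "NB"))
print(best)  # the NB wins: its fit improvement exceeds the +2 penalty
```

This mirrors the comparison on the slide: the NB pays a penalty for its overdispersion parameter but is still preferred when its likelihood gain is large enough.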

The Vuong test is a likelihood-ratio-based test for model selection between non-nested models, e.g. comparing the zero-inflated Poisson model against the standard Poisson model.

 Given our unbalanced data, we cannot use SVM.
 A conditional random field (CRF) is a particular case of log-linear models.

 A log-linear model can be written as:

p(y | x; w) = exp( Σ_j w_j F_j(x, y) ) / Z(x, w)

 Special case: Poisson.
 Where the partition function Z(x, w) = Σ_{y'} exp( Σ_j w_j F_j(x, y') ) normalizes the distribution.
 Given x, the label predicted by the model is:

ŷ = argmax_y p(y | x; w)

 F_j is called a feature-function.
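A toy log-linear classifier makes the pieces concrete: weighted feature-functions, the partition function Z, and prediction by argmax. The label set, features, and weights below are made up for illustration:

```python
import math

labels = ["noun", "verb"]

def features(x, y):
    # F_j(x, y): indicator feature-functions on an (input word, label) pair
    return [1.0 if x.endswith("ing") and y == "verb" else 0.0,
            1.0 if y == "noun" else 0.0]

w = [2.0, 0.5]  # one weight per feature-function

def prob(x, y):
    score = lambda lab: math.exp(sum(wj * fj
                                     for wj, fj in zip(w, features(x, lab))))
    z = sum(score(lab) for lab in labels)  # partition function Z(x, w)
    return score(y) / z

y_hat = max(labels, key=lambda lab: prob("running", lab))
print(y_hat)  # "verb": the argmax_y of p(y | x; w)
```

Dividing by Z is what makes the exponentiated scores a proper probability distribution over labels; the argmax step needs only the unnormalized scores.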

 A feature-function is any mapping F_j : X × Y → R from an (input, label) pair to a real number.

 Linear-chain CRF: a case of a multilevel model.
 Given a sentence, each word can be tagged as: noun, verb, adjective, preposition, etc.
 F_j is a sum along the sentence, for i = 1 to i = n, where n is the length of the sentence:

F_j(x, y) = Σ_{i=1}^{n} f_j(y_{i−1}, y_i, x, i)
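The global feature F_j as a sum of local features f_j over positions can be sketched directly; the sentence, tags, and local feature below are illustrative, with a start tag standing in for y_0:

```python
def f_j(y_prev, y_i, x, i):
    # local feature: word at position i is capitalized and tagged "noun"
    return 1.0 if x[i][0].isupper() and y_i == "noun" else 0.0

def F_j(x, y):
    # pad with a start tag so f_j sees (y_{i-1}, y_i) at every position
    tags = ["<s>"] + y
    return sum(f_j(tags[i], tags[i + 1], x, i) for i in range(len(x)))

x = ["Alice", "runs"]
y = ["noun", "verb"]
print(F_j(x, y))  # 1.0: only the first position fires the feature
```

In a real linear-chain CRF the local features also condition on the previous tag y_{i−1}, which is what lets the model score tag transitions, not just individual words.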

 Software for CRFs:
 1. CRF++
 2. CRFSGD (stochastic gradient descent)
 3. MALLET (UMass Amherst)

The current CRF may not be best suited for a model where the response variable is a count, due to the way the feature-functions are built: the feature-functions describe the interactions between response variables and covariates. I found (below) an extension of the CRF that can be applied to count models, and I may try this one:

 Eunho Yang, Pradeep K. Ravikumar, Genevera I. Allen, Zhandong Liu (UT Austin; UT Austin; Rice University; Baylor College of Medicine), "Conditional Random Fields via Univariate Exponential Families," 2013 (Neural Information Processing Systems Foundation).

 They introduce a "novel subclass of CRFs," derived by requiring that the node-wise conditional distributions of the response variables, conditioned on the rest of the responses and the covariates, arise from univariate exponential families.
 This allows them to derive novel multivariate CRFs given any univariate exponential family distribution, including the Poisson, negative binomial, and exponential distributions.
