A Primer on the Exponential Family of Distributions David Clark & Charles Thayer American Re-Insurance GLM Call Paper
2 Agenda
  Brief Introduction to GLM
  Overview of the Exponential Family
  Some Specific Distributions
  Suggestions for Insurance Applications
3 Context for GLM
  Linear Regression:          Y ~ Normal
  Generalized Linear Models:  Y ~ Exponential Family
  Maximum Likelihood:         Y ~ Any Distribution
4 Advantages over Linear Regression
  Instead of a linear combination of covariates, we can use a function of a linear combination of covariates
  Response variable stays in original units
  Great flexibility in variance structure
5 Transforming the Response versus Transforming the Covariates
  Linear Regression:  $E[g(y)] = X \cdot \beta$
  GLM:                $E[y] = g^{-1}(X \cdot \beta)$
  Note that if $g(y) = \ln(y)$, then Linear Regression cannot handle any points where $y \le 0$.
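The contrast on this slide can be sketched numerically. The example below is a minimal illustration, not the authors' method: the data are made up, the function name `fit_poisson_log_link` is hypothetical, and a Poisson response with log link is assumed purely for convenience. It shows that taking ln(y) loses the observation with y = 0, while a GLM with a log link fits E[y] = exp(X·β) using every point.

```python
import numpy as np

def fit_poisson_log_link(X, y, n_iter=50, tol=1e-10):
    """Fit E[y] = exp(X @ beta) by iteratively reweighted least squares.

    Assumes a Poisson variance function and that column 0 of X is an intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())          # start from a constant model
    for _ in range(n_iter):
        eta = X @ beta                  # linear predictor
        mu = np.exp(eta)                # inverse of the log link
        z = eta + (y - mu) / mu         # working response
        XtW = X.T * mu                  # working weights equal mu here
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Illustrative (made-up) data containing one zero response
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 1.0, 1.0, 3.0, 4.0, 9.0])
X = np.column_stack([np.ones_like(x), x])

# Transforming the response: ln(0) is not finite, so that point is unusable
with np.errstate(divide="ignore"):
    log_y = np.log(y)
n_unusable = int(np.sum(~np.isfinite(log_y)))

# Transforming the mean instead: all six points contribute to the fit
beta_glm = fit_poisson_log_link(X, y)
fitted = np.exp(X @ beta_glm)
```

Because the log is applied to the fitted mean rather than to the data, the fitted values stay in the original units of y.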
6 Advantages of this Special Case of Maximum Likelihood
  Pre-programmed in many software packages
  Direct calculation of standard errors of key parameters
  Convenient separation of Mean parameter from “nuisance” parameters
7 Advantages of this Special Case of Maximum Likelihood
  GLM is useful when theory is immature, but experience gives clues about:
    How the mean response is affected by external influences (covariates)
    How variability relates to the mean
    Independence of observations
    Skewness/symmetry of the response distribution
8 General Form of the Exponential Family
  Note that $y_i$ can be transformed with any function $e(\cdot)$.
9 “Natural” Form of the Exponential Family
  Note that $y_i$ is no longer within a function. That is, $e(y_i) = y_i$.
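For reference, one common textbook way of writing the natural form is shown below. This is an assumption about notation rather than a recovery of the slide's own formula: the functions $b(\cdot)$ and $c(\cdot,\cdot)$ and the canonical parameter $\theta_i$ are the usual textbook symbols, while $a(\phi)$ matches the dispersion function used on later slides.

```latex
f(y_i;\theta_i,\phi)
  = \exp\!\left[\frac{y_i\,\theta_i - b(\theta_i)}{a(\phi)} + c(y_i,\phi)\right],
\qquad
E[y_i] = \mu_i = b'(\theta_i),
\qquad
\mathrm{Var}(y_i) = a(\phi)\, b''(\theta_i).
```

Under this notation the response enters only through the product $y_i\,\theta_i$, which is the sense in which $e(y_i) = y_i$, and $b''(\theta_i)$ expressed as a function of $\mu_i$ plays the role of a variance function.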
10 Specific Members of the Exponential Family
  Normal (Gaussian)
  Poisson
  Negative Binomial
  Gamma
  Inverse Gaussian
11 Some Other Members of the Exponential Family
  Natural Form:
    Binomial
    Logarithmic
    Compound Poisson/Gamma (Tweedie)
  General Form [use ln(y) instead of y]:
    Lognormal
    Single Parameter Pareto
12 Normal Distribution Natural Form: The dispersion parameter, $\phi$, is replaced with $\sigma^2$ in the more familiar form of the Normal Distribution.
13 Poisson Distribution Natural Form: “Over-dispersed” Poisson allows $\phi \ne 1$. Variance/Mean ratio = $\phi$
14 Negative Binomial Distribution Natural Form: The parameter k must be selected by the user of the model.
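15 Gamma Distribution Natural Form: Constant Coefficient of Variation (CV): CV = $\alpha^{-1/2}$, where $\alpha$ is the Gamma shape parameter (equivalently CV = $\phi^{1/2}$ with $\phi = 1/\alpha$).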
16 Inverse Gaussian Distribution Natural Form:
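17 Table of Variance Functions
  Distribution        Variance Function
  Normal              $\mathrm{Var}(y) = \phi$
  Poisson             $\mathrm{Var}(y) = \phi \cdot \mu$
  Negative Binomial   $\mathrm{Var}(y) = \phi \cdot \mu + (\phi/k) \cdot \mu^2$
  Gamma               $\mathrm{Var}(y) = \phi \cdot \mu^2$
  Inverse Gaussian    $\mathrm{Var}(y) = \phi \cdot \mu^3$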
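The variance functions in this table can be written as a small lookup. This is only an illustrative sketch: the function name `variance` and the distribution keys are hypothetical, and `phi` and `k` stand for the dispersion parameter and the Negative Binomial parameter from the slides.

```python
def variance(dist, mu, phi=1.0, k=None):
    """Return Var(y) for a given mean mu under the listed distribution."""
    if dist == "normal":
        return phi
    if dist == "poisson":
        return phi * mu
    if dist == "negative_binomial":
        if k is None:
            raise ValueError("the Negative Binomial requires a user-selected k")
        return phi * mu + (phi / k) * mu ** 2
    if dist == "gamma":
        return phi * mu ** 2
    if dist == "inverse_gaussian":
        return phi * mu ** 3
    raise ValueError(f"unknown distribution: {dist}")

# Same mean and dispersion, very different variances
for name in ("normal", "poisson", "gamma", "inverse_gaussian"):
    print(name, variance(name, mu=10.0, phi=1.0))
print("negative_binomial", variance("negative_binomial", mu=10.0, phi=1.0, k=5.0))
```

Setting `phi=1.0` gives the value of each function with the dispersion factored out.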
18 The Unit Variance Function
  We define the “Unit Variance” function as $V(\mu) = \mathrm{Var}(y) / a(\phi)$. That is, $\phi = 1$ in the previous table.
19 Uniqueness Property
  The unit variance function $V(\mu)$ uniquely identifies its parent distribution type within the natural exponential family: $f(y) \leftrightarrow V(\mu)$
20 Table of Skewness Coefficients
  Distribution        Skewness
  Normal              0
  Poisson             CV
  Negative Binomial   $[1 + \mu/(\mu + k)] \cdot$ CV
  Gamma               2·CV
  Inverse Gaussian    3·CV
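The skewness table can be turned into a small helper that makes the ordering of the distributions explicit at a fixed CV. This is an illustrative sketch only; the function name `skewness` and the dictionary keys are hypothetical.

```python
def skewness(dist, cv, mu=None, k=None):
    """Skewness coefficient as a multiple of the coefficient of variation."""
    if dist == "normal":
        return 0.0
    if dist == "poisson":
        return cv
    if dist == "negative_binomial":
        if mu is None or k is None:
            raise ValueError("the Negative Binomial needs both mu and k")
        return (1.0 + mu / (mu + k)) * cv
    if dist == "gamma":
        return 2.0 * cv
    if dist == "inverse_gaussian":
        return 3.0 * cv
    raise ValueError(f"unknown distribution: {dist}")

# At a common CV, the Negative Binomial falls between Poisson (1 x CV)
# and Gamma (2 x CV), since mu / (mu + k) lies strictly between 0 and 1.
cv = 0.5
ordered = [
    skewness("poisson", cv),
    skewness("negative_binomial", cv, mu=10.0, k=10.0),
    skewness("gamma", cv),
    skewness("inverse_gaussian", cv),
]
```

The multiplier on CV is one way to compare how heavy a right tail each distribution implies for the same relative variability.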
21 Graph of Skewness versus CV
22 The Big Question: What should the variance function look like for insurance applications?
23 What is the Response Variable?
  Number of Claims
  Frequency (# claims per unit of exposure)
  Severity
  Aggregate Loss Dollars
  Loss Ratio (Aggregate Loss / Premium)
  Loss Rate (Aggregate Loss per unit of exposure)
24 An Example for Considering Variance Structure How would you calculate the mean and variance in these loss ratios?
25 Defining a Variance Structure
  We intuitively know that variance changes with loss volume, but how? This is the same as asking “$V(\mu) = ?$”
26 Defining a Variance Structure
  We want CV to decrease with loss size, but not too quickly. GLM provides several approaches:
  Negative Binomial   $\mathrm{Var}(y) = \phi \cdot \mu + (\phi/k) \cdot \mu^2$
  Tweedie             $\mathrm{Var}(y) = \phi \cdot \mu^p$, $1 < p < 2$
  Weighted L-S        $\mathrm{Var}(y) = \phi / w$
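How quickly CV falls with loss size under the first two approaches follows directly from the variance functions. The sketch below uses illustrative parameter values (phi = 1, k = 10, p = 1.5) that are assumptions, not values from the presentation, and the function names are hypothetical.

```python
import math

def cv_negative_binomial(mu, phi=1.0, k=10.0):
    """CV = sqrt(Var)/mu with Var = phi*mu + (phi/k)*mu^2."""
    return math.sqrt(phi / mu + phi / k)

def cv_tweedie(mu, phi=1.0, p=1.5):
    """CV = sqrt(Var)/mu with Var = phi*mu^p, for 1 < p < 2."""
    return math.sqrt(phi) * mu ** ((p - 2.0) / 2.0)

means = [1.0, 10.0, 100.0, 1000.0, 10000.0, 100000.0]
nb_cvs = [cv_negative_binomial(m) for m in means]
tw_cvs = [cv_tweedie(m) for m in means]

# The Negative Binomial CV levels off at sqrt(phi/k) as mu grows,
# while the Tweedie CV keeps shrinking toward zero.
nb_floor = math.sqrt(1.0 / 10.0)

for m, nb, tw in zip(means, nb_cvs, tw_cvs):
    print(f"mu={m:>9.0f}  NB CV={nb:.4f}  Tweedie CV={tw:.4f}")
```

Under these assumptions both CVs decrease with the mean, but only the Negative Binomial retains a non-zero floor for very large risks.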
27 The Negative Binomial
  The variance function: $\mathrm{Var}(y) = \phi \cdot \mu + (\phi/k) \cdot \mu^2$
  The first term, $\phi \cdot \mu$, is the random variance; the second term, $(\phi/k) \cdot \mu^2$, is the systematic variance.
28 The “Tweedie” Distribution
               Tweedie                          Neg. Binomial
  Frequency    Poisson                          Poisson
  Severity     Gamma (exponential when p=1.5)   Logarithmic
  Both the Tweedie and the Negative Binomial can be thought of as intermediate cases between the Poisson and Gamma distributions.
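The link between the Tweedie variance function and its Poisson frequency / Gamma severity components can be checked with a few lines of arithmetic. The mapping below is the standard compound Poisson/Gamma parameterization for 1 < p < 2; the function names and the sample inputs are hypothetical, chosen only for illustration.

```python
def tweedie_to_compound(mu, phi, p):
    """Map Tweedie (mu, phi, p) to Poisson rate and Gamma shape/scale."""
    if not 1.0 < p < 2.0:
        raise ValueError("compound Poisson/Gamma requires 1 < p < 2")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))   # expected claim count
    alpha = (2.0 - p) / (p - 1.0)               # Gamma shape of severity
    theta = phi * (p - 1.0) * mu ** (p - 1.0)   # Gamma scale of severity
    return lam, alpha, theta

def compound_moments(lam, alpha, theta):
    """Mean and variance of a Poisson sum of Gamma severities."""
    mean = lam * alpha * theta
    var = lam * alpha * (alpha + 1.0) * theta ** 2   # lam * E[X^2]
    return mean, var

mu, phi, p = 250.0, 2.0, 1.5
lam, alpha, theta = tweedie_to_compound(mu, phi, p)
mean, var = compound_moments(lam, alpha, theta)

# alpha is 1 when p = 1.5, i.e. the severity is exponential;
# the aggregate mean is mu and the aggregate variance is phi * mu^p.
print(alpha, mean, var, phi * mu ** p)
```

As p moves toward 1 the implied severity shape grows and the aggregate behaves more like a scaled Poisson; as p moves toward 2 it behaves more like a Gamma.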
29 Defining a Variance Structure Negative Binomial Tweedie
30 Defining a Variance Structure
31 Weighted Least-Squares
  Use the Normal Distribution but set $a(\phi) = \phi / w_i$, so that the variance is inversely proportional to some external exposure weight $w_i$.
  This is equivalent to weighted least-squares: L-S = $\sum_i (y_i - \mu_i)^2 \cdot w_i$
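The weighted least-squares criterion can be minimized directly with a standard least-squares solver by scaling each row by the square root of its weight. The sketch below uses made-up loss ratios and premium weights, and the function names are hypothetical.

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Minimize sum_i w_i * (y_i - X_i @ beta)^2 over beta."""
    sw = np.sqrt(w)
    # Scaling rows by sqrt(w_i) turns the weighted problem into an ordinary one
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def weighted_sse(beta, X, y, w):
    """The L-S criterion: sum of squared residuals times exposure weights."""
    return float(np.sum(w * (y - X @ beta) ** 2))

# Illustrative (made-up) loss ratios with premium volume as the weight
y = np.array([0.62, 0.71, 0.55, 0.90, 0.66])
w = np.array([100.0, 250.0, 400.0, 50.0, 300.0])
X = np.column_stack([np.ones(5), np.arange(5.0)])   # intercept and a trend

beta_wls = weighted_least_squares(X, y, w)
sse_at_optimum = weighted_sse(beta_wls, X, y, w)
```

With an intercept-only design matrix this reduces to the exposure-weighted average of the responses, so large-volume observations pull the fitted mean more strongly than small ones.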
32 Conclusion A model fitted to insurance data should reflect the variance structure of the phenomenon being modeled. GLM provides a flexible tool for doing this.