Published by Moris Curtis. Modified over 9 years ago.
1 GLM I: Introduction to Generalized Linear Models
By Curtis Gary Dean, Distinguished Professor of Actuarial Science, Ball State University
2 Classical Multiple Linear Regression
$Y_i = a_0 + a_1 X_{i1} + a_2 X_{i2} + \dots + a_m X_{im} + e_i$
$Y_i$ are the response variables
$X_{ij}$ are the predictors
Subscript $i$ denotes the $i$th observation
Subscript $j$ identifies the $j$th predictor
3 One Predictor: $Y_i = a_0 + a_1 X_i$
4 Classical Multiple Linear Regression: Solving for $a_i$
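
The formulas on this slide did not survive extraction. As a rough illustration only, the one-predictor case can be solved in closed form by least squares. The sketch below assumes the usual textbook formulas; the function name `fit_simple_ols` and the data are hypothetical and do not come from the presentation.

```python
def fit_simple_ols(x, y):
    """Return (a0, a1) minimising sum((y_i - a0 - a1 * x_i) ** 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope = sample covariance of (x, y) divided by sample variance of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    a1 = sxy / sxx
    # The fitted line passes through the point of means.
    a0 = y_bar - a1 * x_bar
    return a0, a1


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
a0, a1 = fit_simple_ols(x, y)
```

With more than one predictor the same idea leads to a system of linear equations in the coefficients, which is normally solved by a statistics package.
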
5 Classical Multiple Linear Regression
$\mu_i = E[Y_i] = a_0 + a_1 X_{i1} + \dots + a_m X_{im}$
$Y_i$ is a Normally distributed random variable with constant variance $\sigma^2$
Want to estimate $\mu_i = E[Y_i]$ for each $i$
7 Problems with Traditional Model
Number of claims is discrete
Claim sizes are skewed to the right
Probability of an event is in $[0, 1]$
Variance is not constant across data points $i$
Nonlinear relationship between X's and Y's
8 Generalized Linear Models - GLMs
Fewer restrictions
Y can model number of claims, probability of renewing, loss severity, loss ratio, etc.
Large and small policies can be put into one model
Y can be a nonlinear function of the X's
Classical linear regression model is a special case
9 Generalized Linear Models - GLMs
Same goal
Predict: $\mu_i = E[Y_i]$
10 Generalized Linear Models - GLMs
$g(\mu_i) = a_0 + a_1 X_{i1} + \dots + a_m X_{im}$
$g(\cdot)$ is the link function
$E[Y_i] = \mu_i = g^{-1}(a_0 + a_1 X_{i1} + \dots + a_m X_{im})$
$Y_i$ can be Normal, Poisson, Gamma, Binomial, Compound Poisson, …
Variance can be modeled
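
To make the role of the link function concrete, here is a minimal sketch of how a mean $\mu_i$ is recovered from the linear predictor through $g^{-1}$. The names `LINKS`, `linear_predictor`, and `predict_mean`, and the coefficient values, are hypothetical and chosen only for illustration.

```python
import math

# Each entry maps a link name to the pair (g, g_inverse).
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (math.log, math.exp),
    "logit": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
}


def linear_predictor(coefs, x):
    """a_0 + a_1*x_1 + ... + a_m*x_m, with coefs = [a_0, a_1, ..., a_m]."""
    return coefs[0] + sum(a * xj for a, xj in zip(coefs[1:], x))


def predict_mean(coefs, x, link="identity"):
    """Apply the inverse link to the linear predictor: mu = g^{-1}(eta)."""
    _, inverse_link = LINKS[link]
    return inverse_link(linear_predictor(coefs, x))


# Hypothetical coefficients and a single predictor value.
coefs = [0.5, 0.2]
x = [1.0]
mu_identity = predict_mean(coefs, x, "identity")
mu_log = predict_mean(coefs, x, "log")
mu_logit = predict_mean(coefs, x, "logit")
```

The same linear predictor gives three different means depending on the link: unrestricted for the identity, positive for the log, and between 0 and 1 for the logit.
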
11 GLMs Extend Classical Linear Regression
If the link function is the identity: $g(\mu_i) = \mu_i$
And $Y_i$ has a Normal distribution
→ GLM gives the same answer as Classical Linear Regression*
* Least squares and MLE are equivalent for the Normal distribution.
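
The footnote can be checked with a short derivation. This is a sketch assuming independent observations $y_1, \dots, y_n$ with $Y_i \sim N(\mu_i, \sigma^2)$ and $\mu_i = a_0 + a_1 X_{i1} + \dots + a_m X_{im}$; the symbols $n$ and $\ell$ are introduced here and are not on the slide.

```latex
\ell(a_0, \dots, a_m)
  = \sum_{i=1}^{n} \ln\!\left[ \frac{1}{\sqrt{2\pi\sigma^2}}
      \exp\!\left( -\frac{(y_i - \mu_i)^2}{2\sigma^2} \right) \right]
  = -\frac{n}{2} \ln(2\pi\sigma^2)
    - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \mu_i)^2
```

Only the last sum depends on the coefficients, and it enters with a negative sign, so maximizing $\ell$ over $a_0, \dots, a_m$ is the same as minimizing the sum of squared errors.
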
12 Exponential Family of Distributions – Canonical Form
13 Some Math Rules: Refresher
14 Normal Distribution in Exponential Family
15 Normal Distribution in Exponential Family (slide labels $\theta$ and $b(\theta)$)
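
The slide's own derivation was an image and is not recoverable. The sketch below assumes the commonly used canonical form $f(y; \theta, \phi) = \exp\{[y\theta - b(\theta)]/a(\phi) + c(y, \phi)\}$, which matches the $a(\cdot)$ and $b(\cdot)$ notation used elsewhere in this presentation but is an assumption about what the slide showed.

```latex
f(y) = \frac{1}{\sqrt{2\pi\sigma^2}}
       \exp\!\left( -\frac{(y - \mu)^2}{2\sigma^2} \right)
     = \exp\!\left\{ \frac{y\mu - \mu^2/2}{\sigma^2}
       - \frac{y^2}{2\sigma^2} - \frac{1}{2}\ln(2\pi\sigma^2) \right\}

\theta = \mu, \qquad b(\theta) = \frac{\theta^2}{2}, \qquad a(\phi) = \sigma^2,
\qquad b'(\theta) = \theta = \mu, \qquad b''(\theta) = 1
```

Under this form the Normal variance function is constant, $V(\mu) = 1$.
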
16 Poisson Distribution in Exponential Family (slide labels $\theta$)
17 Poisson Distribution in Exponential Family
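
The slide's derivation was an image and is not recoverable. As a sketch, assuming the commonly used canonical form $f(y; \theta, \phi) = \exp\{[y\theta - b(\theta)]/a(\phi) + c(y, \phi)\}$, the Poisson probability function with mean $\mu$ can be rewritten as follows.

```latex
f(y) = \frac{\mu^{y} e^{-\mu}}{y!}
     = \exp\{\, y \ln\mu - \mu - \ln y! \,\}

\theta = \ln\mu, \qquad b(\theta) = e^{\theta}, \qquad a(\phi) = 1,
\qquad b'(\theta) = e^{\theta} = \mu, \qquad b''(\theta) = e^{\theta} = \mu
```

So the mean is $b'(\theta) = \mu$ and the variance function is $V(\mu) = \mu$.
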
18 Compound Poisson Distribution
$Y = C_1 + C_2 + \dots + C_N$
$N$ is a Poisson random variable
$C_i$ are i.i.d. with a Gamma distribution
This is an example of a Tweedie distribution
$Y$ is a member of the Exponential Family
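
A small simulation can make the construction $Y = C_1 + \dots + C_N$ tangible. This is only a sketch: the parameter values, the seed, and the helper names `sample_poisson` and `sample_aggregate_loss` are hypothetical, and the Poisson sampler uses Knuth's multiplication method, which is adequate only for small claim frequencies.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw a Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_aggregate_loss(rng, lam, shape, scale):
    """Y = C_1 + ... + C_N with N ~ Poisson(lam), C_i ~ Gamma(shape, scale)."""
    n_claims = sample_poisson(rng, lam)
    return sum(rng.gammavariate(shape, scale) for _ in range(n_claims))


rng = random.Random(2024)
lam, shape, scale = 2.0, 2.0, 500.0  # hypothetical frequency and severity
losses = [sample_aggregate_loss(rng, lam, shape, scale) for _ in range(20000)]

sample_mean = sum(losses) / len(losses)
theoretical_mean = lam * shape * scale  # E[N] * E[C]
share_zero = sum(1 for v in losses if v == 0) / len(losses)
```

The simulated losses have a point mass at zero (policies with no claims) and a continuous, right-skewed part, which is the shape typically seen in loss ratio and pure premium data.
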
19 Members of the Exponential Family
Normal
Poisson
Binomial
Gamma
Inverse Gaussian
Compound Poisson
20 Variance Structure
$E[Y_i] = \mu_i = b'(\theta_i)$ → $\theta_i = b'^{(-1)}(\mu_i)$
$\mathrm{Var}[Y_i] = a(\Phi_i)\, b''(\theta_i) = a(\Phi_i)\, V(\mu_i)$
Common form: $\mathrm{Var}[Y_i] = \Phi\, V(\mu_i) / w_i$
$\Phi$ is constant across the data, but weights $w_i$ are applied to data points
21 Variance Functions $V(\mu)$
Distribution       $V(\mu)$
Normal             $\mu^0 = 1$
Poisson            $\mu$
Binomial           $\mu(1-\mu)$
Tweedie            $\mu^p$, $1 < p < 2$
Gamma              $\mu^2$
Inverse Gaussian   $\mu^3$
Recall: $\mathrm{Var}[Y_i] = \Phi\, V(\mu_i) / w_i$
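
The variance functions above are easy to tabulate in code. The sketch below is illustrative only; the names `VARIANCE_POWER`, `variance_function`, and `variance_of_y` are hypothetical, and the Binomial case assumes $\mu$ is expressed as a probability.

```python
# Power-type variance functions V(mu) = mu ** power.
VARIANCE_POWER = {
    "normal": 0.0,
    "poisson": 1.0,
    "gamma": 2.0,
    "inverse_gaussian": 3.0,
}


def variance_function(mu, family, p=None):
    """Return V(mu) for the named distribution."""
    if family == "binomial":
        return mu * (1.0 - mu)
    if family == "tweedie":
        if p is None or not 1.0 < p < 2.0:
            raise ValueError("Tweedie (compound Poisson) needs 1 < p < 2")
        return mu ** p
    return mu ** VARIANCE_POWER[family]


def variance_of_y(mu, family, phi=1.0, weight=1.0, p=None):
    """Var[Y_i] = phi * V(mu_i) / w_i."""
    return phi * variance_function(mu, family, p) / weight


# Same mean, very different variances depending on the distribution.
variances_at_mu_3 = {
    name: variance_function(3.0, name) for name in VARIANCE_POWER
}
```

A larger weight $w_i$, for example a larger exposure, reduces the variance assigned to that data point.
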
22 Variance at Point and Fit
A Practitioner's Guide to Generalized Linear Models: A CAS Study Note
23 Variance of $Y_i$ and Fit at Data Point $i$
$\mathrm{Var}(Y_i)$ is big → looser fit at data point $i$
$\mathrm{Var}(Y_i)$ is small → tighter fit at data point $i$
24 Why Exponential Family?
Distributions in the Exponential Family can model a variety of problems
Standard algorithm for finding coefficients $a_0, a_1, \dots, a_m$
25 Modeling Number of Claims
Policy  Sex  Territory  Number of Claims in 5 Years
1       M    02         0
2       F    01         0
3       F    01         0
4       F    02         1
5       F    01         0
6       F    02         1
7       M    02         2
8       M    02         2
9       M    02         1
10      F    01         1
:       :    :          :
26 Assume a Multiplicative Model
$\mu_i$ = expected number of claims in five years
$\mu_i = B_{F,01} \times C_{\mathrm{Sex}(i)} \times C_{\mathrm{Terr}(i)}$
If $i$ is Female and Terr 01 → $\mu_i = B_{F,01} \times 1.00 \times 1.00$
27 Multiplicative Model
$\mu_i = \exp(a_0 + a_S X_{S(i)} + a_T X_{T(i)})$
$\mu_i = \exp(a_0) \times \exp(a_S X_{S(i)}) \times \exp(a_T X_{T(i)})$
$i$ is Female → $X_{S(i)} = 0$; Male → $X_{S(i)} = 1$
$i$ is Terr 01 → $X_{T(i)} = 0$; Terr 02 → $X_{T(i)} = 1$
28 Values of Predictor Variables
Policy  Sex  $X_{S(i)}$  Territory  $X_{T(i)}$
1       M    1           02         1
2       F    0           01         0
3       F    0           01         0
4       F    0           02         1
5       F    0           01         0
6       F    0           02         1
7       M    1           02         1
8       M    1           02         1
9       M    1           02         1
10      F    0           01         0
29 Natural Log Link Function
$\ln(\mu_i) = a_0 + a_S X_{S(i)} + a_T X_{T(i)}$
$\mu_i$ is in $(0, \infty)$
$\ln(\mu_i)$ is in $(-\infty, \infty)$
30 Poisson Distribution in Exponential Family
31 Natural Log is Canonical Link for Poisson
$\theta_i = \ln(\mu_i)$
$\theta_i = a_0 + a_S X_{S(i)} + a_T X_{T(i)}$
32 Estimating Coefficients $a_1, a_2, \dots, a_m$
Classical linear regression uses least squares
GLMs use the Maximum Likelihood Method
A solution will exist for distributions in the exponential family
33 Likelihood and Log Likelihood
34 Find $a_0$, $a_S$, and $a_T$ for Poisson
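
The equations on this slide were not captured. The following sketch shows what the maximum likelihood problem looks like for the claim-count example, assuming independent Poisson observations $y_i$ with $\ln(\mu_i) = a_0 + a_S X_{S(i)} + a_T X_{T(i)}$; the symbol $\ell$ is introduced here.

```latex
\ell(a_0, a_S, a_T) = \sum_{i} \left[ y_i \ln\mu_i - \mu_i - \ln(y_i!) \right],
\qquad \mu_i = \exp\!\left( a_0 + a_S X_{S(i)} + a_T X_{T(i)} \right)

\frac{\partial \ell}{\partial a_0} = \sum_{i} (y_i - \mu_i) = 0, \qquad
\frac{\partial \ell}{\partial a_S} = \sum_{i} (y_i - \mu_i)\, X_{S(i)} = 0, \qquad
\frac{\partial \ell}{\partial a_T} = \sum_{i} (y_i - \mu_i)\, X_{T(i)} = 0
```

These three equations are nonlinear in the coefficients because $\mu_i$ depends on them through the exponential, which is why a numerical procedure is needed.
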
35 Iterative Numerical Procedure to Find $a_i$'s
Use a statistical package or actuarial software
Specify link function and distribution type
"Iterative weighted least squares" is the numerical method used
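
For readers curious what such a package does internally, here is a minimal sketch of iteratively reweighted least squares for a Poisson model with log link, written with the standard library only. It is not the presenter's code. The data are the ten policies visible in the transcript, with the territory of policies 3, 7, 8, and 9 taken from the $X_{T(i)}$ indicator values on slide 28; the full dataset is not shown, so the fitted values differ from the presentation's solution. Function names are hypothetical.

```python
import math


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_poisson_log_link(X, y, max_iter=50, tol=1e-10):
    """Fit ln(mu_i) = x_i . beta by iteratively reweighted least squares."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(max_iter):
        eta = [sum(b * xj for b, xj in zip(beta, row)) for row in X]
        mu = [math.exp(e) for e in eta]
        # Working response; for Poisson with log link the weights equal mu.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        xtwx = [[sum(m * r[j] * r[k] for m, r in zip(mu, X))
                 for k in range(p)] for j in range(p)]
        xtwz = [sum(m * r[j] * zi for m, r, zi in zip(mu, X, z))
                for j in range(p)]
        new_beta = solve_linear_system(xtwx, xtwz)
        converged = max(abs(a - b) for a, b in zip(new_beta, beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


# (sex, territory, claims in 5 years) for the ten visible policies.
policies = [("M", "02", 0), ("F", "01", 0), ("F", "01", 0), ("F", "02", 1),
            ("F", "01", 0), ("F", "02", 1), ("M", "02", 2), ("M", "02", 2),
            ("M", "02", 1), ("F", "01", 1)]
X = [[1.0, 1.0 if s == "M" else 0.0, 1.0 if t == "02" else 0.0]
     for s, t, _ in policies]
y = [float(c) for _, _, c in policies]

beta = fit_poisson_log_link(X, y)  # [a_0, a_S, a_T]
mu_f01 = math.exp(beta[0])
mu_f02 = math.exp(beta[0] + beta[2])
mu_m02 = math.exp(beta[0] + beta[1] + beta[2])
```

Because these ten rows fall into only three sex and territory combinations, the three-parameter model reproduces the observed average claim count in each combination. With the full dataset the fit would instead smooth across cells.
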
36 Solution to Our Example
$a_0 = -0.288$ → $\exp(-0.288) = 0.75$
$a_S = 0.262$ → $\exp(0.262) = 1.3$
$a_T = 0.095$ → $\exp(0.095) = 1.1$
$\mu_i = \exp(a_0) \times \exp(a_S X_{S(i)}) \times \exp(a_T X_{T(i)})$
$\mu_i = 0.75 \times 1.3^{X_{S(i)}} \times 1.1^{X_{T(i)}}$
$i$ is Male, Terr 01 → $\mu_i = 0.75 \times 1.3^{1} \times 1.1^{0} = 0.975$
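
The fitted coefficients can be turned into expected claim counts for every sex and territory combination. The sketch below uses the coefficient values quoted on this slide; the function and variable names are hypothetical.

```python
import math

# Coefficients quoted in the presentation's solution.
a0, a_s, a_t = -0.288, 0.262, 0.095


def expected_claims(is_male, is_terr_02):
    """mu = exp(a0 + a_S * X_S + a_T * X_T) with 0/1 indicator variables."""
    x_s = 1 if is_male else 0
    x_t = 1 if is_terr_02 else 0
    return math.exp(a0 + a_s * x_s + a_t * x_t)


# Expected five-year claim counts keyed by (sex, territory).
expected = {
    (sex, terr): expected_claims(sex == "M", terr == "02")
    for sex in ("F", "M")
    for terr in ("01", "02")
}
```

Using the unrounded coefficients gives values slightly different from the rounded factors 0.75, 1.3, and 1.1; for example, Male in Terr 01 comes out near 0.974 rather than 0.975.
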
37 Testing New Drug Treatment
38 Multiple Linear Regression
39 Logistic Regression Model
$p$ = probability of cure, $p$ in $[0, 1]$
Odds: $p/(1-p)$ in $[0, +\infty)$
Log odds: $\ln[p/(1-p)]$ in $(-\infty, +\infty)$
Link function: $\ln[p/(1-p)] = a + b_1 X_1 + b_2 X_2$
40 Logistic Regression Model
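
The content of this slide was not captured. As a sketch of the step that usually follows, the link equation $\ln[p/(1-p)] = a + b_1 X_1 + b_2 X_2$ can be solved for $p$; the shorthand $\eta$ is introduced here and is not on the slides.

```latex
\eta = a + b_1 X_1 + b_2 X_2, \qquad
\frac{p}{1 - p} = e^{\eta}
\;\Longrightarrow\;
p = \frac{e^{\eta}}{1 + e^{\eta}} = \frac{1}{1 + e^{-\eta}}
```

Whatever value the linear predictor takes, the resulting $p$ lies strictly between 0 and 1.
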
41 Which Exponential Family Distribution?
Frequency: Poisson, {Negative Binomial}
Severity: Gamma
Loss ratio: Compound Poisson
Pure Premium: Compound Poisson
How many policies will renew: Binomial
42 What Link Function?
Additive model: identity
Multiplicative model: natural log
Modeling probability of event: logit (logistic regression)