The Use of Multivariate Analysis Techniques to Design a Class Plan. Mark Scully, Tillinghast-Towers Perrin. March 11-12, 1999, Nashville, Tennessee.

Presentation transcript:

The Use of Multivariate Analysis Techniques to Design a Class Plan
Mark Scully, Tillinghast-Towers Perrin
1999 CAS Seminar on Ratemaking, March 11-12, 1999, Nashville, Tennessee

2 Overview of Presentation
- Background
- Multivariate analysis techniques:
  - Generalized Linear Models (GLMs)
  - Classification and Regression Trees (CART, CHAID)
- Implementation:
  - Pricing
  - Marketing
  - Agents’ compensation
  - Results monitoring

3 Several Factors are Converging toward Better Analysis of Customer and Prospect Attributes
- Greater emphasis on pricing vs. underwriting
- Increased familiarity with techniques
- Faster computers
- Influence of direct writers, non-standard companies, and banks
- Use of multiple distribution channels
- Increased competition

4 Why Multivariate Statistical Techniques?
- Most rating variables are correlated.
- Different variables may be showing the same underlying effect.
- Repeated use of univariate techniques leads to double-counting of the same effects.
- Multivariate techniques can capture interactions.
- They provide standard errors as well as point estimates.
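The double-counting point can be made concrete with a small sketch. The exposures and claim counts below are invented for illustration and are not from the presentation. In this made-up book, claim frequency depends on driver age only, but young drivers mostly drive sports cars, so a one-way analysis of vehicle type picks up the age effect as well.

```python
# Hypothetical book: (age group, vehicle type) -> (exposures, claims).
# By construction, claim frequency depends on age only:
# 0.20 for young drivers, 0.10 for old drivers.
cells = {
    ("young", "sports"): (800, 160),
    ("young", "standard"): (200, 40),
    ("old", "sports"): (200, 20),
    ("old", "standard"): (800, 80),
}


def one_way_frequency(position, level):
    """Claim frequency for one level of one variable, ignoring the other."""
    exposures = sum(e for key, (e, _) in cells.items() if key[position] == level)
    claims = sum(c for key, (_, c) in cells.items() if key[position] == level)
    return claims / exposures


# Univariate (one-way) relativities.
age_relativity = one_way_frequency(0, "young") / one_way_frequency(0, "old")
vehicle_relativity = one_way_frequency(1, "sports") / one_way_frequency(1, "standard")

# Stacking the one-way relativities for a young driver in a sports car,
# relative to an old driver in a standard car.
stacked = age_relativity * vehicle_relativity

# The relativity actually observed between those two cells.
exp_hi, claims_hi = cells[("young", "sports")]
exp_lo, claims_lo = cells[("old", "standard")]
actual = (claims_hi / exp_hi) / (claims_lo / exp_lo)

print(f"one-way age relativity:     {age_relativity:.2f}")
print(f"one-way vehicle relativity: {vehicle_relativity:.2f}")
print(f"stacked one-way relativity: {stacked:.2f}")
print(f"actual cell relativity:     {actual:.2f}")
```

In this example the stacked one-way relativity overstates the actual differential, because the vehicle relativity is largely the age effect counted a second time.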

5 Different Rating Variables may be Manifestations of the Same Underlying Effect
[Diagram: the underlying effect, Driving Intensity, is linked to the rating variables Annual Mileage, Vehicle Make/Model, and Driver Age.]

6 Interactions Arise when the Combined Effect of two Variables Differs from the Sum of their Single Effects
Example: the differential between female and male differs by age.
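A minimal sketch of the gender-by-age example follows. The frequencies are hypothetical, and the sketch works on a multiplicative scale (ratios rather than sums), which is an assumption made here for illustration: without an interaction, the male/female ratio would be the same in every age band.

```python
# Hypothetical claim frequencies by (age band, gender); not from the slides.
frequency = {
    ("17-25", "male"): 0.20,
    ("17-25", "female"): 0.12,
    ("26-59", "male"): 0.08,
    ("26-59", "female"): 0.08,
}


def gender_differential(age_band):
    """Male-to-female frequency ratio within one age band."""
    return frequency[(age_band, "male")] / frequency[(age_band, "female")]


differentials = {band: gender_differential(band) for band in ("17-25", "26-59")}

# With no interaction, the ratio would be identical across age bands.
has_interaction = max(differentials.values()) - min(differentials.values()) > 1e-9

for band, ratio in differentials.items():
    print(f"{band}: male/female differential = {ratio:.2f}")
print("interaction present:", has_interaction)
```

A single gender relativity applied to all ages could not reproduce both ratios, which is what an interaction term in the model is meant to capture.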

7 Confidence Intervals Indicate the Degree of Certainty Inherent in Relativity Estimates
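As a rough sketch of how such an interval might be computed, assume a model with a log link, so that a relativity is the exponential of a fitted coefficient, and assume the coefficient estimate is approximately normal. The coefficient and standard error below are hypothetical values, not results from the presentation.

```python
import math
from statistics import NormalDist


def relativity_interval(beta, standard_error, level=0.95):
    """Point estimate and confidence interval for exp(beta).

    Assumes a log link and an approximately normal estimate of beta.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    point = math.exp(beta)
    lower = math.exp(beta - z * standard_error)
    upper = math.exp(beta + z * standard_error)
    return point, lower, upper


# Hypothetical fitted coefficient and standard error for one rating level.
point, lower, upper = relativity_interval(beta=0.25, standard_error=0.05)
print(f"relativity {point:.3f}, 95% interval ({lower:.3f}, {upper:.3f})")
```

A wide interval signals that the data support the relativity only weakly, for example because the level has little exposure.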

8

9

10 Statistical Rating Techniques Indicate the Relative Explanatory Power of each Variable...
...and the extent to which variables are correlated.
[Chart labels: Variable A, Variable B]

11 What statistical techniques do we commonly use?
- Generalized Linear Models (GLMs)
- Classification and regression trees:
  - CHAID
  - CART

12 What are GLMs?
- Statistical procedure for measuring the effect of one or more independent variables upon a dependent variable
- For ratemaking, the dependent variables are typically frequency and severity
- GLMs allow extreme flexibility in model structure and design:
  - multiplicative or additive plans (or others)
  - different error distributions
  - variable interactions
- Explicitly produce relativity estimates (and more)

13 Basic Theory of GLMs (I)
Let Y_i, i = 1, 2, …, n, be observations from a random variable. We model them as follows:
[Equation not preserved in the transcript.]
Where:
- h = the link function
- x_i = a vector of variables associated with the i-th observation
- the offset = a scalar parameter for the i-th observation
- β = the parameter vector
- e_i = an error term (with mean equal to 0)
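The model equation itself did not survive in the transcript. Under the usual GLM convention, and using only the quantities the slide defines, it would plausibly read as below. This is a reconstruction, not the slide's own formula. The symbol \xi_i for the offset is a notational assumption, and h is taken to map the mean to the linear predictor, so its inverse appears on the right-hand side.

```latex
% Reconstructed under the standard GLM convention; \xi_i (offset) is assumed notation.
\[
  Y_i = h^{-1}\!\left( x_i^{\top}\beta + \xi_i \right) + e_i ,
  \qquad i = 1, 2, \dots, n ,
\]
% Equivalently, since E[e_i] = 0:
\[
  h\!\left( \operatorname{E}[Y_i] \right) = x_i^{\top}\beta + \xi_i .
\]
```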

14 Basic Theory of GLMs (II)
Typically, the random term e_i is chosen from the exponential family, with density in the following general form:
[Equation not preserved in the transcript.]
Where θ and φ are parameters and w is the weight of each observation. If we denote the mean of this distribution as μ, then its variance may be expressed as V(μ)φ/w, where V() is referred to as the variance function.
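The density formula is also missing from the transcript. A standard way of writing the exponential family that is consistent with the parameters, weight, and variance expression on this slide is sketched below. The functions b and c are assumed notation that does not appear in the surviving text.

```latex
% Reconstruction in the standard exponential-family form; b(.) and c(.) are assumed notation.
\[
  f(y;\theta,\phi)
    = \exp\!\left\{ \frac{w}{\phi}\,\bigl[\, y\theta - b(\theta) \,\bigr]
      + c\!\left(y, \tfrac{\phi}{w}\right) \right\}
\]
% Mean and variance implied by this form:
\[
  \mu = \operatorname{E}[Y] = b'(\theta),
  \qquad
  \operatorname{Var}(Y) = b''(\theta)\,\frac{\phi}{w} = V(\mu)\,\frac{\phi}{w} .
\]
```

For instance, in this form V(\mu) = \mu corresponds to the Poisson family and V(\mu) = \mu^2 to the gamma family.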

15 Basic Theory of GLMs (III)

16 Literature on GLMs
- Generalized Linear Models, Second Edition, P. McCullagh and J.A. Nelder, Chapman & Hall, 1989 (ISBN )
- “Statistical Motor Rating: Making Effective Use of Your Data”, M.J. Brockman and T.S. Wright, JIA 119, III (April 1992)
- “Technical Aspects of Domestic Lines Pricing”, Greg Taylor, University of Melbourne Research Paper 45 (ISBN )

17 GLMs: Some Practical Considerations (I)
- A log link function produces multiplicative relativities.
- Separate models for frequency and severity:
  - Better understanding of data
  - Appropriate distributions exist
- Typical error distributions for frequency:
  - Poisson/Quasi-Poisson
  - Negative binomial
- Typical distributions for severity:
  - Normal
  - Gamma
  - Inverse Gaussian
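The first point can be illustrated with a short sketch. With a log link, the linear predictor is a sum of coefficients, and exponentiating a sum gives a product, so each coefficient becomes a multiplicative relativity. The base frequency and coefficients below are hypothetical, chosen only to make the arithmetic visible.

```python
import math

# Hypothetical fitted values on the log scale (not from the presentation).
intercept = math.log(0.10)  # base claim frequency of 0.10
coefficients = {
    "age": {"adult": 0.0, "young": math.log(1.80)},
    "territory": {"rural": 0.0, "urban": math.log(1.25)},
}


def predicted_frequency(risk):
    """Invert the log link: exp(intercept + sum of the selected coefficients)."""
    linear_predictor = intercept + sum(
        coefficients[variable][level] for variable, level in risk.items()
    )
    return math.exp(linear_predictor)


def relativities():
    """Exponentiated coefficients, i.e. the multiplicative rating factors."""
    return {
        variable: {level: math.exp(beta) for level, beta in levels.items()}
        for variable, levels in coefficients.items()
    }


risk = {"age": "young", "territory": "urban"}
factors = relativities()
product_form = math.exp(intercept) * factors["age"]["young"] * factors["territory"]["urban"]

print(f"model prediction:      {predicted_frequency(risk):.4f}")
print(f"base x relativities:   {product_form:.4f}")
```

Both lines print the same value, since exp(a + b + c) = exp(a) exp(b) exp(c).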

18 GLMs: Some Practical Considerations (II)
- Variables may be modeled as continuous covariates or categorical factors.
- An array of statistical and practical tests exists for model testing:
  - Variable significance tests
  - Quantile plots
  - Residual plots
  - Comparison of actual data to model

19 Comparison of Actual to Model Helps to Identify Areas Currently Under- or Overpriced
Loss segments:
- How much do we write?
- Are we growing here?
- How many $ are involved?
- Other reasons to stay here?
Profit segments:
- How much do we write?
- Are we losing business?
- How many $ are involved?
- How do we get more?

20 The Significance of these Profit/Loss Areas Depends also on their Volume of Business
Note: Gain/(Loss) = (Current PP - Indicated PP) x Exposures
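The note's formula can be applied segment by segment, as in the sketch below. The segment names, PP values, and exposures are hypothetical. Ranking by the size of the gain or loss shows why volume matters: a small per-unit gap on a large segment can outweigh a large gap on a small one.

```python
# Hypothetical segments: (name, current PP, indicated PP, exposures).
segments = [
    ("Segment A", 500, 450, 1000),
    ("Segment B", 400, 460, 2000),
    ("Segment C", 700, 500, 100),
]


def gain_or_loss(current_pp, indicated_pp, exposures):
    """Gain/(Loss) = (Current PP - Indicated PP) x Exposures."""
    return (current_pp - indicated_pp) * exposures


results = {
    name: gain_or_loss(current, indicated, exposures)
    for name, current, indicated, exposures in segments
}

# Rank segments by the absolute size of their gain or loss.
ranked = sorted(results.items(), key=lambda item: abs(item[1]), reverse=True)

for name, amount in ranked:
    label = "gain" if amount >= 0 else "loss"
    print(f"{name}: {label} of {abs(amount):,}")
```

Here Segment C has the largest per-unit gap but the smallest total, because it has few exposures.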

21 What are classification and regression trees?
- Procedures for successively subdividing data into homogeneous groups
- Like GLMs, they use a dependent variable and one or more independent ones
- The result is not necessarily symmetric
- They implicitly capture the natural interactions between factors
- They can produce a simpler rating plan or form a single rating variable out of many
- They produce homogeneous groups (i.e., a tree structure) but no rating plan or relativities
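One step of this successive subdivision can be sketched as follows. This is a simplified regression-tree style split that picks the threshold minimizing the within-group sum of squared errors. It is an illustration of the general idea under that assumption, not the specific algorithm of any package, and the data are invented.

```python
def sum_squared_errors(values):
    """Sum of squared deviations from the group mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((value - mean) ** 2 for value in values)


def best_binary_split(xs, ys):
    """Return (threshold, total SSE) of the best split 'x < threshold'."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sum_squared_errors(left) + sum_squared_errors(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical data: vehicle age (years) and average claim cost.
vehicle_age = [1, 2, 3, 4, 6, 8, 10, 12]
claim_cost = [900, 950, 880, 920, 500, 520, 480, 510]

threshold, total_sse = best_binary_split(vehicle_age, claim_cost)
print(f"best split: vehicle age < {threshold}")
```

Repeating the same search within each resulting group, and over every candidate variable, grows the tree. Each branch may end up split on a different variable, which is why the result need not be symmetric.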

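22 Classification and Regression Trees produce an asymmetrical grouping of the data
[Tree diagram, labels translated from German: the portfolio (Bestand) is split on no-claims class (SF, e.g. 1-3 and 11-15), gender (male/female), vehicle age (< 2 vs. > 2), vehicle type (Typ), garage (garage/none), and civil-servant status (Beamte). Different branches are split on different variables.]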

23 Some differences between CHAID and CART
- Dependent variable for CHAID must be categorical; for CART it can be metric
- Different splitting algorithms (e.g., CHAID uses a chi-squared test on contingency tables)
- CHAID splits into multiple groups; CART makes binary splits
- Different stopping criteria
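The chi-squared test mentioned above can be sketched on a small contingency table. The code computes the Pearson chi-squared statistic for pairs of categories and compares it with the 5% critical value for one degree of freedom. This is only the core test. A full CHAID procedure also iterates over merges and adjusts significance levels, which this sketch omits. The territories and counts are hypothetical.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts per territory: [policies with a claim, policies without].
territory = {
    "A": [30, 970],
    "B": [32, 968],
    "C": [60, 940],
}

CRITICAL_VALUE_5PCT_1DF = 3.841  # chi-squared critical value, 1 degree of freedom

stat_ab, dof_ab = chi_squared_statistic([territory["A"], territory["B"]])
stat_ac, dof_ac = chi_squared_statistic([territory["A"], territory["C"]])

print(f"A vs B: chi-squared = {stat_ab:.3f} (not significantly different)")
print(f"A vs C: chi-squared = {stat_ac:.3f} (significantly different)")
```

On this data, A and B would be candidates for merging into one group while C stays separate, which is how a multi-way split can arise.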

24 GLMs may be Used to Produce a Rating Plan with Variables Generated by CART or CHAID
[Flow diagram: Potential Rating Variables -> CART/CHAID Analysis -> CART/CHAID Variables -> GLM Analysis]

25 Results from the Rating Analysis Can be Used Beyond the Production of a Rating Plan
[Diagram elements:
- Rating Analysis
- Actuarially Optimal Model
- Constraints: Regulatory, Agents, Stability, Competition, etc.
- Rating Plan Actually Implemented
- Marketing
- UW Guidelines
- Agents’ Compensation
- Monitoring]