Optimal scaling for a logistic regression model with ordinal covariates Sanne JW Willems, Marta Fiocco, and Jacqueline J Meulman Leiden University & Stanford University s.j.w.willems@math.leidenuniv.nl www.SanneJWWillems.nl
Optimal scaling for generalized linear models with nonlinear covariates
Goal: Reducing linearity in Generalized Linear Models using Optimal Scaling transformations
Generalized Linear Models Linear predictor: Link function - (nonlinear) relation between the linear predictor and the outcome:
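In conventional GLM notation (the symbols below are the usual textbook ones and an assumption here, not taken from the slides), the two ingredients named on this slide can be written as:

```latex
\eta_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij},
\qquad
g(\mu_i) = \eta_i,
\quad \text{where } \mu_i = \mathbb{E}[Y_i \mid x_{i1}, \dots, x_{ip}]
```

Here \eta_i is the linear predictor for observation i, the \beta_j are regression coefficients, and g is the link function that connects the expected outcome \mu_i to the linear predictor.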
Generalized Linear Models Nonlinear predictor: Link function - (nonlinear) relation between the linear predictor and the outcome:
Why?
Data types
Data types – Nominal (categorical). Property: grouping. Coding: dummy coding.
Data types – Ordinal (categorical). Properties: grouping, ordering. Coding: dummy coding.
Data types – Ordinal (categorical). Properties: grouping, ordering. Coding: continuous variable via integer coding.
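The two coding options for an ordinal covariate can be illustrated with a minimal sketch. The category labels and data below are hypothetical, and using the lowest category as the reference for dummy coding is an assumption made for illustration.

```python
import numpy as np

# Hypothetical ordinal covariate with three ordered categories.
levels = ["low", "medium", "high"]
x = np.array(["low", "high", "medium", "high", "low"])

# Integer coding: a single column that keeps the ordering, but also
# treats the distance between adjacent categories as equal (1, 2, 3).
codes = np.array([levels.index(v) for v in x]) + 1

# Dummy coding: one 0/1 indicator column per non-reference category
# ("low" is the assumed reference). This keeps the grouping, but the
# model no longer knows that the categories are ordered.
dummies = np.column_stack([(x == lev).astype(int) for lev in levels[1:]])

print(codes)
print(dummies)
```

Integer coding costs one coefficient and imposes equal spacing; dummy coding costs one coefficient per non-reference category and imposes no ordering at all.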
Data types – Numeric. Properties: grouping, ordering, equal relative spacing. Coding: continuous variable, which keeps grouping, ordering, and equal relative spacing.
What if the linear predictor should be nonlinear? For ordinal variables: keep the ordinal property, but do not introduce equal relative spacing. For numeric variables: remove the property of equal relative spacing.
Solution: Optimal Scaling transformations Transform variables: Scaling levels: nominal, nominal spline, ordinal, ordinal spline, numeric
How?
Optimal Scaling Generalized Linear Models Nonlinear predictor: Link function - (nonlinear) relation between the linear predictor and the outcome:
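One common way to write the optimally scaled model is sketched below. The notation is an assumption, not taken from the slides: \varphi_j denotes the transformation of covariate j, restricted according to its scaling level, \eta_i is the predictor for observation i, and \mu_i is the expected outcome.

```latex
\eta_i = \beta_0 + \sum_{j=1}^{p} \beta_j \, \varphi_j(x_{ij}),
\qquad
g(\mu_i) = \eta_i
```

The predictor is still linear in the transformed covariates \varphi_j(x_{ij}), but it is nonlinear in the original x_{ij} unless \varphi_j is restricted to be linear (the numeric scaling level). For logistic regression the link g is the logit, g(\mu_i) = \log\{\mu_i / (1 - \mu_i)\}.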
Algorithm
Optimal Scaling step: Apply restrictions according to the chosen scaling level
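A minimal sketch of how such restrictions are classically imposed in optimal scaling follows. It is not necessarily the authors' implementation. It assumes that an earlier step has produced an unrestricted target value per observation (`target`, a hypothetical name), and it covers only the nominal, ordinal, and numeric levels: category means, weighted monotone regression of those means, and a weighted linear fit on the category values, respectively. The spline levels and any normalisation of the result are omitted.

```python
import numpy as np

def pava(values, weights):
    """Weighted pool-adjacent-violators: nondecreasing least-squares fit."""
    blocks = []  # each block holds [mean, total weight, number of categories]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool the last two blocks while they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, n2 = blocks.pop()
            v1, w1, n1 = blocks.pop()
            blocks.append([(v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2, n1 + n2])
    return np.concatenate([np.full(n, v) for v, _, n in blocks])

def quantify(codes, target, level):
    """Return one quantification per category, in sorted category order."""
    cats = np.unique(codes).astype(float)
    counts = np.array([(codes == c).sum() for c in cats], dtype=float)
    means = np.array([target[codes == c].mean() for c in cats])
    if level == "nominal":   # no restriction beyond grouping
        return means
    if level == "ordinal":   # keep the category order, spacing is free
        return pava(means, counts)
    if level == "numeric":   # linear in the original category values
        xbar = np.average(cats, weights=counts)
        mbar = np.average(means, weights=counts)
        slope = (np.sum(counts * (cats - xbar) * (means - mbar))
                 / np.sum(counts * (cats - xbar) ** 2))
        return mbar + slope * (cats - xbar)
    raise ValueError(f"unknown scaling level: {level}")

# Hypothetical data: four ordered categories, two observations each.
codes = np.array([1, 1, 2, 2, 3, 3, 4, 4])
target = np.array([0.0, 0.2, 1.0, 1.2, 0.5, 0.7, 2.0, 2.2])
for level in ("nominal", "ordinal", "numeric"):
    print(level, quantify(codes, target, level))
```

In this toy example the nominal level returns the raw category means, the ordinal level pools categories 2 and 3 because their means are out of order, and the numeric level forces equal steps between adjacent categories.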
Example: logistic regression Inpatient treatment or day clinic treatment?
Result nominal scaling level
Result ordinal scaling level
Predictions for training data: nominal vs ordinal
Nominal: Sensitivity = 0.924, Specificity = 0.829, Efficiency (correct classification rate) = 0.880
Ordinal: Sensitivity = 0.918, Specificity = 0.823, Efficiency (correct classification rate) = 0.874
Predictions for training data: ordinal vs numeric
Ordinal: Sensitivity = 0.918, Specificity = 0.823, Efficiency (correct classification rate) = 0.874
Numeric: Sensitivity = 0.864, Specificity = 0.810, Efficiency (correct classification rate) = 0.839
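For reference, the three measures reported above can be computed from observed and predicted classes as in the sketch below. The labels and predictions are hypothetical toy data, not the study data, and which treatment counts as the positive class (coded 1) is an assumption.

```python
import numpy as np

def classification_summary(y_true, y_pred):
    """Sensitivity, specificity, and efficiency for 0/1 class labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    tn = np.sum((y_true == 0) & (y_pred == 0))  # true negatives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    return {
        "sensitivity": tp / (tp + fn),           # share of positives found
        "specificity": tn / (tn + fp),           # share of negatives found
        "efficiency": (tp + tn) / y_true.size,   # correct classification rate
    }

# Hypothetical observed classes and model predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
summary = classification_summary(y_true, y_pred)
print(summary)
```

Because these figures are computed on the training data, they describe fit rather than out-of-sample performance.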
Summary: Optimal Scaling GLMs. More flexibility by transforming variables. Can be helpful when the linear predictor should be nonlinear.