The log-rate model
Statistical analysis of occurrence-exposure rates
16 January 2019
References
Holford, T.R. (1980) The analysis of rates and survivorship using log-linear models. Biometrics, 36:299-305.
Laird, N. and D. Olivier (1981) Covariance analysis of censored survival data using log-linear analysis techniques. Journal of the American Statistical Association, 76(374):231-240.
Yamaguchi, K. (1991) Event history analysis. Sage, Newbury Park. Chapter 4: ‘Log-rate models for piecewise constant rates’.
Data: leaving parental home
The log-rate model: the occurrence matrix and the exposure matrix
Occurrences: number leaving home by age and sex, 1961 birth cohort: $n_{ij}$
Exposures: number of months living at home (includes censored observations): $PM_{ij}$
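The log-rate model
Expected number of occurrences: $\lambda_{ij} = E[N_{ij}]$
Rate: $\lambda_{ij} / PM_{ij}$, where the exposure $PM_{ij}$ is treated as fixed (the offset)
The log-rate model is a log-linear model with an OFFSET (a known term whose coefficient is fixed at 1)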
The log-rate model
Multiplicative form: $\lambda_{ij} = PM_{ij} \exp(\eta_{ij})$
Additive form: $\ln \lambda_{ij} = \ln PM_{ij} + \eta_{ij}$
where $\lambda_{ij} = E[N_{ij}]$ is the expected number of occurrences
$\ln PM_{ij}$: offset
$\eta_{ij}$: linear predictor
The log-rate model in two steps
1. Use the model to predict the counts from the marginal distribution of occurrences and from the exposures: IPF (iterative proportional fitting).
2. Estimate the parameters of the log-rate model from the predicted values using conventional log-linear modelling.
The model: Occ = Exp * exp[overall + sex + timing]
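The two steps can be sketched in Python as follows. This is a minimal illustration, not the procedure used by any of the packages shown in this deck: the function name `ipf` and the fixed number of iterations are assumptions, and the occurrence and exposure matrices are the leaving-home data listed with the SPSS syntax (rows: SEX 1, 2; columns: TIMING 1, 2).

```python
import math

# Leaving-home data (rows: SEX 1, 2; columns: TIMING 1, 2).
occurrences = [[135, 143], [74, 178]]
exposures = [[15113, 4876], [16202, 9114]]

def ipf(seed, row_totals, col_totals, n_iter=200):
    """Scale `seed` alternately by rows and by columns so that it matches both margins."""
    table = [row[:] for row in seed]
    for _ in range(n_iter):
        for i, target in enumerate(row_totals):
            factor = target / sum(table[i])
            table[i] = [cell * factor for cell in table[i]]
        for j, target in enumerate(col_totals):
            factor = target / sum(row[j] for row in table)
            for row in table:
                row[j] *= factor
    return table

# Step 1: predict the counts from the occurrence margins and the exposures.
row_totals = [sum(row) for row in occurrences]
col_totals = [sum(col) for col in zip(*occurrences)]
fitted = ipf(exposures, row_totals, col_totals)

# Step 2: read the parameters off the fitted log-rates,
# taking the cell SEX = 2, TIMING = 2 as reference category.
log_rate = [[math.log(fitted[i][j] / exposures[i][j]) for j in range(2)]
            for i in range(2)]
constant = log_rate[1][1]
sex_1 = log_rate[0][1] - constant
timing_1 = log_rate[1][0] - constant
```

Because the fitted table has the form exposure times a row factor times a column factor, the three values obtained in step 2 describe all four fitted rates.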
The log-rate model in SPSS: unsaturated model
Model and Design Information
Model: Poisson
Design: Constant + SEX + TIMING

Parameter Estimates (asymptotic 95% CI)
Parameter   Estimate   SE      Lower   Upper   Note
1           -3.9818    .0694   -4.12   -3.85   ln(170/9114) (ref. cat.)
2             .5070    .0878     .33     .68   ln(151/4876) + 3.9818
3             .0000    .         .       .     ref. cat.
4           -1.3044    .0897   -1.48   -1.13   ln(82/16202) + 3.9818
5             .0000    .         .       .     ref. cat.
The log-rate model in SPSS: unsaturated model
PM * exp[linear predictor] = predicted occurrences; exp[linear predictor] = RATE

PM      Linear predictor          Predicted   RATE
 9114   -3.982                    170.0       0.01865
16202   -3.982 - 1.304             82.0       0.00506
15113   -3.982 - 1.304 + 0.507    127.0       0.00840
 4876   -3.982 + 0.507            151.0       0.03096
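The arithmetic on this slide can be reproduced with a few lines of Python. The sketch below assumes that parameter 2 is the effect of SEX = 1 and parameter 4 the effect of TIMING = 1, with SEX = 2 and TIMING = 2 as reference categories; that reading is inferred from the exposures used above, and the function name `linear_predictor` is illustrative.

```python
import math

# Parameter estimates as printed in the SPSS output (rounded).
constant, sex_1, timing_1 = -3.9818, 0.5070, -1.3044

# Exposures PM by (SEX, TIMING); SEX = 2 and TIMING = 2 are the reference categories.
exposure = {(1, 1): 15113, (2, 1): 16202, (1, 2): 4876, (2, 2): 9114}

def linear_predictor(sex, timing):
    """Sum of the parameters that apply to the given cell."""
    return constant + (sex_1 if sex == 1 else 0.0) + (timing_1 if timing == 1 else 0.0)

rates = {cell: math.exp(linear_predictor(*cell)) for cell in exposure}
predicted = {cell: exposure[cell] * rates[cell] for cell in exposure}

for cell in sorted(exposure):
    print(cell, round(predicted[cell], 1), round(rates[cell], 5))
```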
The log-rate model in SPSS: unsaturated model

Data:
SEX   TIMING   NUMBER   EXPOSURE
1     1        135      15113
2     1         74      16202
1     2        143       4876
2     2        178       9114

Syntax:
GENLOG timing sex
  /CSTRUCTURE=exposure
  /MODEL=POISSON
  /PRINT FREQ ESTIM CORR COV
  /CRITERIA =CIN(95) ITERATE(20) CONVERGE(.001) DELTA(0)
  /DESIGN sex timing
  /SAVE PRED .
The log-rate model in GLIM: unsaturated model
Occ = Exp * exp[overall + sex]
DATA: occurrence matrix and exposure matrix (2*2)

[i] $fit +sex$
[o] scaled deviance = 218.48 (change = -14.80) at cycle 4
[o]            d.f. = 2      (change = -1    )
[o]
[i] $d e$
[o]      estimate    s.e.       parameter
[o]  1   -4.275      0.05997    1
[o]  2   -0.3344     0.08697    SEX(2)
[o] scale parameter taken as 1.000

Females: 278 = 19989 * exp[-4.275]            RATE = exp[-4.275] = 0.0139
Males:   252 = 25316 * exp[-4.275 - 0.3344]   RATE = exp[-4.275 - 0.3344] = 0.0100

[i] $d r$
[o] unit   observed   fitted   residual
[o]  1     135        210.19   -5.186
[o]  2      74        161.28   -6.873
[o]  3     143         67.81    9.130
[o]  4     178         90.72    9.163
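With sex as the only covariate, the estimates above have a closed form: the fitted rate for each sex is its total number of occurrences divided by its total exposure. The sketch below checks this against the GLIM output. It assumes that SEX 1 denotes females, as the totals 278 and 19989 suggest; the variable names are illustrative.

```python
import math

# Occurrences and exposures by timing category (TIMING 1, 2) for each sex.
occurrences = {"female": [135, 143], "male": [74, 178]}
exposures = {"female": [15113, 4876], "male": [16202, 9114]}

# ML estimate of the rate for each sex: total events / total exposure.
rate = {s: sum(occurrences[s]) / sum(exposures[s]) for s in occurrences}

overall = math.log(rate["female"])         # GLIM parameter 1
sex_2 = math.log(rate["male"]) - overall   # GLIM parameter SEX(2)

# Fitted counts and scaled deviance. The term sum(n - fitted) of the Poisson
# deviance vanishes here because the fitted totals equal the observed totals.
fitted = {s: [pm * rate[s] for pm in exposures[s]] for s in exposures}
deviance = 2 * sum(
    n * math.log(n / f)
    for s in occurrences
    for n, f in zip(occurrences[s], fitted[s])
)
```

The large deviance on 2 degrees of freedom indicates that sex alone does not describe the rates, which motivates adding timing to the model.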
The log-rate model in GLIM: unsaturated model
Occ = Exp * exp[overall + sex + timing]
The log-rate model in TDA
The basic exponential model with time-constant covariates (Blossfeld and Rohwer, pp. 87ff)
Occ = Exp * exp[overall + sex]

SN  Org  Des  Episodes  Weighted  Duration  TS Min  TF Max  Excl
----------------------------------------------------------------------------
 1    0    0        53     53.00    128.47    0.00  144.00  -
 1    0    1       530    530.00     72.63    0.00  140.00  -
Sum                583    583.00
Number of episodes: 583
Successfully created new episode data.

Idx  SN  Org  Des  MT  Variable   Coeff    Error    C/Error   Signif
-------------------------------------------------------------------
  1   1    0    1  A   Constant  -4.6098   0.0630  -73.1777   1.0000
  2   1    0    1  A   SEX1       0.3344   0.0870    3.8451   0.9999

Log likelihood (starting values): -2887.5967
Log likelihood (final estimates): -2880.1982

command file: ehd21.cf
data file: test.dat (micro data)
LOG-RATE MODEL IN TDA: PROGRAMME

# ehd2.cf  Basic exponential model with covariate SEX
nvar(
  dfile = test.dat,     # data file
  ID = c1,              # identification number
  SN = c2,              # spell number
  TF = c3,              # time of leaving home (= ending time), measured from age 0
  TF15 = TF-180,        # measured from age 15
  SEX = c4,             # sex
  REASON = c5,          # reason
  SEX1 = SEX[1],        # see book p. 61: SEX1 = 1 for females and 0 for males (males are ref. cat.)
  SEX2 = SEX[2],        # = 1 for males
  DES = if eq(REASON,4) then 0 else 1,   # destination
  TFP = TF15,           # Blossfeld: TF+1
);
edef(                   # define single episode data
  ts = 0,               # starting time
  tf = TFP,             # ending time
  org = 0,              # origin state
  des = DES,            # destination state
);
# BASIC exponential model (Blossfeld-Rohwer p. 90-91)
rate(
  xa (0,1) = SEX1,
  pres = ehd21.res,
) = 2;
Related models
Parameters of these models are related.
Poisson distribution: counts have a Poisson distribution (total number not fixed)
  - Poisson regression
  - Log-linear model: model of count data (log of counts)
Binomial and multinomial distributions: counts follow a multinomial distribution (total number is fixed)
  - Logit model: model of proportions [and odds (log of odds)]
  - Logistic regression
Log-rate model: log-linear model with OFFSET (constant term)
The unsaturated model Similarity with log-rate model
The unsaturated log-linear model
Assume: two-way classification; counts unknown but marginal totals given
Predict the expected counts (cell entries)
The unsaturated log-linear model as a log-rate model
Odds ratio = 1
With $PM_{ij} = 1$ for all cells, the log-rate model reduces to the unsaturated log-linear model.
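A small sketch of this special case: with every exposure equal to 1, the expected counts of the unsaturated model follow from the marginal totals alone, and their odds ratio is 1. The margins used below are the row and column sums of the leaving-home occurrences (278 and 252 by sex, 209 and 321 by timing); the variable names are illustrative.

```python
# Marginal totals of the leaving-home occurrences.
row_totals = [278, 252]   # by sex
col_totals = [209, 321]   # by timing
total = sum(row_totals)

# Expected counts under the unsaturated (independence) model.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# The odds ratio of the expected counts equals 1 (no interaction).
odds_ratio = (expected[0][0] * expected[1][1]) / (expected[0][1] * expected[1][0])
```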
Update a table
Similarity with the log-rate model
Illustration: migration analysis with incomplete data
Migration is a realisation of a Poisson process
Literature: “Indirect estimation of migration”, special issue of Mathematical Population Studies, A. Rogers (ed.), Vol. 7, no. 3 (1999)
Updating a table: the log-rate model in two steps
Odds ratio = 2.270837
Log-rate model: rate = events/exposure
Gravity / spatial interaction model
The factors indexed by i and by j are balancing factors
IPF and biproportional adjustment
Log-likelihood function:
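For reference, a sketch of the standard Poisson log-likelihood, assuming the counts $N_{ij}$ are independent Poisson variables. The notation is chosen here and is not taken from the slides: $n_{ij}$ is the observed count, $\lambda_{ij}$ the expected count, $a_i$ and $b_j$ the balancing factors, and $m_{ij}$ the initial table entry (or exposure).

```latex
\ln L = \sum_{i}\sum_{j} \left[ n_{ij} \ln \lambda_{ij} - \lambda_{ij} - \ln (n_{ij}!) \right],
\qquad \lambda_{ij} = a_i \, b_j \, m_{ij}
```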
Biproportional adjustment method
RAS method (Richard A. Stone: Input-output models, 1962)
DSF procedure (DSF = Deming, Stephan, Furness) (Sen and Smith, 1995, p. 374)
See e.g. Willekens (1983) Log-linear analysis of spatial interaction
Biproportional adjustment
Step 0: s (step) = 0
Step 1
Step 2
Step 3: go to Step 1 unless the convergence criterion is met. The stopping criterion is met when the change in the adjustment factors is less than $10^{-6}$ for all i and j.
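The steps can be sketched in Python as follows. The contents of Steps 1 and 2 are assumptions (row factors first, then column factors, as in the usual RAS scheme), the function names are illustrative, and the 2x2 seed table and target margins are hypothetical numbers, not data from the slides.

```python
def biproportional(seed, row_targets, col_targets, tol=1e-6, max_steps=1000):
    """Return (table, a, b) with table[i][j] = a[i] * seed[i][j] * b[j]."""
    n_rows, n_cols = len(seed), len(seed[0])
    a = [1.0] * n_rows   # Step 0: start from unit adjustment factors
    b = [1.0] * n_cols
    for _ in range(max_steps):
        # Step 1 (assumed): row factors that reproduce the row targets given b
        new_a = [row_targets[i] / sum(seed[i][j] * b[j] for j in range(n_cols))
                 for i in range(n_rows)]
        # Step 2 (assumed): column factors that reproduce the column targets given new_a
        new_b = [col_targets[j] / sum(seed[i][j] * new_a[i] for i in range(n_rows))
                 for j in range(n_cols)]
        # Step 3: stop once no adjustment factor changed by more than tol
        change = max(abs(x - y) for x, y in zip(new_a + new_b, a + b))
        a, b = new_a, new_b
        if change < tol:
            break
    table = [[a[i] * seed[i][j] * b[j] for j in range(n_cols)]
             for i in range(n_rows)]
    return table, a, b

def odds_ratio(t):
    """Cross-product ratio of a 2x2 table."""
    return (t[0][0] * t[1][1]) / (t[0][1] * t[1][0])

# Hypothetical seed table and new marginal totals.
seed = [[40.0, 10.0], [20.0, 30.0]]
updated, a, b = biproportional(seed, row_targets=[60.0, 60.0],
                               col_targets=[50.0, 70.0])
```

The updated table matches the new margins while keeping the odds ratio of the seed table, because the row and column factors cancel in the cross-product ratio.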
Likelihood equations may be written as:
Marginal totals are sufficient statistics
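A sketch of what these equations look like for independent Poisson counts with expected values $\lambda_{ij} = a_i b_j m_{ij}$. The symbols are chosen here for illustration: $n_{ij}$ is the observed count, $n_{i+}$ and $n_{+j}$ are its row and column totals, and a hat denotes the fitted value. Setting the derivatives of the log-likelihood with respect to $\ln a_i$ and $\ln b_j$ to zero gives:

```latex
\sum_{j} \left( n_{ij} - \hat{\lambda}_{ij} \right) = 0
\;\Rightarrow\; \sum_{j} \hat{\lambda}_{ij} = n_{i+},
\qquad
\sum_{i} \left( n_{ij} - \hat{\lambda}_{ij} \right) = 0
\;\Rightarrow\; \sum_{i} \hat{\lambda}_{ij} = n_{+j}
```

The data enter only through the marginal totals, which is why these totals are sufficient statistics.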
A different way of writing the spatial interaction model:
Link Poisson - Multinomial
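The link can be stated as a standard result, written here in notation that is assumed rather than taken from the slides: if the counts $N_{ij}$ are independent Poisson variables with means $\lambda_{ij}$, then, conditional on the total $N_{++} = n$, they follow a multinomial distribution with probabilities $p_{ij}$.

```latex
P\left( \{ N_{ij} = n_{ij} \} \mid N_{++} = n \right)
= \frac{n!}{\prod_{i,j} n_{ij}!} \prod_{i,j} p_{ij}^{\,n_{ij}},
\qquad p_{ij} = \frac{\lambda_{ij}}{\lambda_{++}}
```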
The gravity model is a log-linear model
The entropy model is a log-linear model
The RAS model is a log-linear (log-rate) model
Parameter estimation
Maximise the (log-)likelihood function: the probability that the model predicts the data
Expectation: predict $E[N_{rs}] = \lambda_{rs}$ given the model and initial parameter estimates.
Maximisation: maximise the ‘complete-data’ log-likelihood.
The log-rate model: piecewise constant hazard model
Kidney Transplant Histocompatibility Study
The data describe the survival of the kidney graft (organ) following kidney transplant operations. The risk factor 'donor relationship' has two categories: cadaveric nonrelated donor (CAD) and living related donor (LRD). The sample in this follow-up study consists of 1975 transplant operations.
Source: Laird, N. and D. Olivier (1981) Covariance analysis of censored survival data using log-linear analysis techniques. Journal of the American Statistical Association, 76(374):231-240.
The authors claim that they go beyond Holford (1980) ‘The analysis of rates and survivorship using log-linear models’, Biometrics, 36:299-305.
d:\s\data\laird\kidney\laird.doc
Kidney Transplant Study: life-table data on graft survival
Exposure (Exp) is calculated as follows:
Exp = [E - 0.5(W + D)] * #, in days
where
E = number entered
W = number withdrawn
D = number died
# = width of the interval (the last, open interval was taken as having 180 days)
Example: 608*90 + 30*45
d:\s\data\laird\laird_lt.xls
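A minimal sketch of the exposure formula in Python. The function name and the split of the 30 exits into 10 withdrawals and 20 deaths are hypothetical; only the figures 608, 30, 90 and 45 come from the example above, and the number entered (638) is inferred as 608 + 30.

```python
def exposure(entered, withdrawn, died, width_days):
    """Person-days in an interval, assuming exits occur on average at mid-interval."""
    return (entered - 0.5 * (withdrawn + died)) * width_days

# Hypothetical split of the 30 exits: 10 withdrawn, 20 died, in a 90-day interval.
exp_days = exposure(entered=638, withdrawn=10, died=20, width_days=90)

# Equivalent view: 608 grafts are at risk for the full 90 days,
# and the 30 exits contribute half the interval (45 days) each.
same_thing = 608 * 90 + 30 * 45
```

Both expressions give the same number of person-days, which shows that the formula credits each withdrawal or death with half of the interval width.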
Kidney Transplant Study: death rates (* 1000; per day)
[Figure: death rates by donor type, CAD and LRD]
Kidney Transplant Study: SPSS
Model: Poisson
Design: Constant + TIME

Parameter          Estimate   SE
 1   Constant      -8.7857    .5774
 2   [TIME = 1]     3.6281    .5883
 3   [TIME = 2]     3.5743    .5879
 4   [TIME = 3]     3.6502    .5909
 5   [TIME = 4]     3.4168    .5893
 6   [TIME = 5]     3.1263    .5825
 7   [TIME = 6]     2.7212    .5858
 8   [TIME = 7]     2.0913    .5831
 9   [TIME = 8]     1.0350    .5951
10   [TIME = 9]     1.0087    .5963
11   [TIME = 10]     .1819    .5997
12   [TIME = 11]     .2822    .5986
13   [TIME = 12]     .1147    .6065
14   [TIME = 13]     .0197    .6191
15   [TIME = 14]     .0742    .6325
16   [TIME = 15]    -.7640    .7638
17 x [TIME = 16]     .0000    .

Deaths 9-12 m: 107325 * exp[-8.7857 + 1.0087] = 107325 * 0.0004193 = 45
Kidney Transplant Study
Model: Poisson
Design: Constant + DONOR TYPE + TIME (unsaturated model)

Parameter            Estimate   SE
 1   Constant        -9.2184    .5791
 2   [CAD = 1.00]      .8573    .0730
 3 x [LRD = 2.00]      .0000    .
 4   [P1 = 1]         3.4734    .5885
 5   [P1 = 2]         3.4260    .5880
 6   [P1 = 3]         3.5097    .5910
 7   [P1 = 4]         3.2837    .5893
 8   [P1 = 5]         3.0026    .5826
 9   [P1 = 6]         2.6089    .5858
10   [P1 = 7]         1.9928    .5832
11   [P1 = 8]          .9476    .5952
12   [P1 = 9]          .9258    .5963
13   [P1 = 10]         .1046    .5997
14   [P1 = 11]         .2116    .5986
15   [P1 = 12]         .0555    .6065
16   [P1 = 13]        -.0258    .6191
17   [P1 = 14]         .0349    .6325
18   [P1 = 15]        -.7890    .7638
19 x [P1 = 16]        (reference category)

Deaths 9-12 m: 53370 * exp[-9.2184 + 0.8573 + 0.9258] = 53370 * 0.000590 = 31.49
Observed: 30
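The predicted deaths for the 9-12 month interval can be recomputed from the printed estimates. The sketch below is illustrative: the function name `predicted_deaths` is an assumption, and the exposures (107325 and 53370 person-days) are the figures quoted on the slides.

```python
import math

def predicted_deaths(exposure_days, *coefficients):
    """Exposure times the rate implied by the sum of the log-rate coefficients."""
    return exposure_days * math.exp(sum(coefficients))

# Design: Constant + TIME, interval 9 (9-12 months), both donor types together.
deaths_time_only = predicted_deaths(107325, -8.7857, 1.0087)

# Design: Constant + DONOR TYPE + TIME, interval 9, CAD donors.
deaths_cad = predicted_deaths(53370, -9.2184, 0.8573, 0.9258)

# In the unsaturated model the CAD/LRD rate ratio is the same in every interval.
rate_ratio_cad_vs_lrd = math.exp(0.8573)
```

Because the model has no interaction between donor type and time, the single donor coefficient implies a constant rate ratio across all sixteen intervals.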