1 A. Analysis of count data
Introduction to log-linear models

2 Log-linear analysis
Also known as:
Contingency-table analysis
Categorical data analysis
Discrete multivariate analysis (Bishop, Fienberg and Holland, 1975)
Analysis of cross-classified data
Multivariate analysis of qualitative data (Goodman, 1978)
Count data analysis

3 Contrast Coding Log-linear models for two-way tables
Saturated log-linear model:
Overall effect (level)
Main effects (marginal frequencies)
Interaction effect
In the case of a 2 x 2 table: 4 observations but 9 parameters, hence the need for normalisation constraints
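The parameter count on this slide can be written out as a short sketch. The notation is an assumption, since the slide's own symbols did not survive: \lambda_{ij} stands for the expected count in cell (i, j), u for the overall effect, u^A and u^B for the main effects and u^{AB} for the interaction. The constraints shown are the usual sum-to-zero (effect coding) ones; other normalisations are possible.

```latex
% Saturated log-linear model for a two-way table (assumed notation)
\ln \lambda_{ij} = u + u^{A}_{i} + u^{B}_{j} + u^{AB}_{ij}, \qquad i, j \in \{1, 2\}

% Parameter count for a 2 x 2 table:
% 1 (overall) + 2 (u^A) + 2 (u^B) + 4 (u^{AB}) = 9 parameters, but only 4 cells

% One common normalisation (effect coding):
\sum_{i} u^{A}_{i} = 0, \qquad \sum_{j} u^{B}_{j} = 0, \qquad
\sum_{i} u^{AB}_{ij} = 0 \ \text{for all } j, \qquad \sum_{j} u^{AB}_{ij} = 0 \ \text{for all } i

% Free parameters that remain: 1 + 1 + 1 + 1 = 4, matching the 4 observations
```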

4 Survey: leaving parental home in the Netherlands

5 Descriptive statistics
Leaving home
Counts
Percentages
Odds of leaving home early rather than late
Reference category

6 Log-linear models for two-way tables: 4 models
Leaving home
Model 1: Null model or overall effect model
All categories are equiprobable (an observation is equally likely to fall into any cell), for all i and j
Exp(4.887) = 132.5 = 530/4
= s.e.
The expected count (frequency) in cell (ij) refers to category i of variable A (row) and category j of variable B (column)

7 Leaving home
The frequency in cell (ij) is generated by a Poisson process, and Var[aX] = a² Var[X], where a is a constant (e.g. Fingleton, 1984, p. 29)
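As a rough check of the null-model figures, the overall effect can be recomputed from the total of 530 quoted earlier. The standard error below is a sketch under stated assumptions, not the slide's own derivation: the total is treated as Poisson, Var[aX] = a² Var[X] is applied with a = 1/4, and the delta method carries the variance over to the log scale.

```python
import math

total = 530   # total number of observations quoted on the slide
cells = 4     # a 2 x 2 table has four cells

# Null model: every cell has the same expected count.
expected = total / cells
overall_effect = math.log(expected)

# Assumption: the total N is Poisson, so Var[N] = N.
# Var[aX] = a^2 Var[X] with a = 1/4 gives Var[N/4] = N/16.
var_expected = total / cells ** 2

# Assumption: delta method, Var[ln X] is approximately Var[X] / E[X]^2,
# which simplifies to 1/N here.
se_overall = math.sqrt(var_expected) / expected

print(f"expected count per cell: {expected:.1f}")
print(f"overall effect (log scale): {overall_effect:.3f}")
print(f"approximate s.e. of overall effect: {se_overall:.4f}")
```

Under these assumptions the standard error reduces to 1/sqrt(530); whether this matches the value printed on the original slide cannot be confirmed from the transcript.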

8 Log-linear models for two-way tables
Leaving home
Model 2: B null model
Categories of variable B (sex) are equiprobable within levels of variable A (age), for all j
GLIM output (columns: estimate, s.e., parameter, exp(parameter)); parameters: Overall effect, TIME(1), TIME(2)

9 Log-linear models for two-way tables
Leaving home
Model 3: A null model
Categories of variable A (age) are equiprobable within levels of variable B (sex), for all i
SPSS output (columns: estimate, s.e., parameter, exp(parameter)); parameters: Overall effect, TIME(1), TIME(2)

10 Log-linear models for two-way tables
Leaving home
Model 4: independence model (unsaturated model)
Categories of variable B (sex) are not equiprobable, but the probability is independent of the levels of variable A (time)
GLIM output (columns: estimate, s.e., parameter, exp(parameter)); parameters: Overall effect, TIME(2), SEX(2)

11 LOG-LINEAR MODEL: predictions
Females leaving home early: 109.62
Females leaving home late: * =
Males leaving home early: * = 99.37
Males leaving home late: * * =

12 Leaving home (SPSS)
Parameter Estimate SE
1 Overall effect 5.0280 .0721
Other parameters listed: Time(1), Time(2), Sex(1), Sex(2)

13 Log-linear models for two-way tables
Leaving home
Model 5: saturated model
The values of categories of variable B (sex) depend on the levels of variable A (time)
GLIM output (columns: estimate, s.e., parameter); parameters: Overall effect, TIME(2), SEX(2), TIME(2).SEX(2)

14 Leaving home (SPSS)
Parameter Estimate SE
1 Overall effect 5.1846 .0748
Other parameters listed: Time(1), Time(2), Sex(1), Sex(2), Time(1) * Sex(1), Time(1) * Sex(2), Time(2) * Sex(1), Time(2) * Sex(2)

15 LOG-LINEAR MODEL: predictions
Leaving home: expected frequencies
Columns: Observed, Model 1, Model 2, Model 3, Model 4, Model 5
Rows: Fem_<20 F, Mal_<20 F, Fem_>20 F, Mal_>20 F
D:\s\1\liebr\2_2\2_2.wq2

16 Relation log-linear model and Poisson regression model
The explanatory variables are dummy variables (0 if i or j is equal to 1, and 1 if i or j is equal to 2), and the interaction variable is their product
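The correspondence described on this slide can be sketched as a Poisson regression with two dummies and their product. The symbols are assumptions, because the slide's own equation was not preserved: \lambda_{ij} is the expected count, x_i and z_j are the dummies for the row and column variables, and the \beta's are regression coefficients.

```latex
% Saturated log-linear model written as a Poisson regression (assumed notation)
\ln \lambda_{ij} = \beta_0 + \beta_1 x_i + \beta_2 z_j + \beta_3 x_i z_j

% Dummy variables: 0 for category 1, 1 for category 2
x_i = \begin{cases} 0 & i = 1 \\ 1 & i = 2 \end{cases}
\qquad
z_j = \begin{cases} 0 & j = 1 \\ 1 & j = 2 \end{cases}

% Cell by cell, with cell (1,1) as the reference category:
\ln \lambda_{11} = \beta_0, \quad
\ln \lambda_{21} = \beta_0 + \beta_1, \quad
\ln \lambda_{12} = \beta_0 + \beta_2, \quad
\ln \lambda_{22} = \beta_0 + \beta_1 + \beta_2 + \beta_3
```

With this coding, \beta_0 plays the role of the overall effect for the reference cell, \beta_1 and \beta_2 are the main effects of the non-reference categories, and \beta_3 is the log odds ratio of the table.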

23 Log-linear model fit a model to a table of frequencies
Data: survey of political attitudes of British electors
Source: Payne, C. (1977) The log-linear model for contingency tables. In: C.A. O'Muircheartaigh and C. Payne (eds.) The analysis of survey data. Vol. 2: Model fitting, Wiley, New York, pp [data p. 106] (from Butler and Stokes, ‘Political change in Britain’, Macmillan, 2nd edition, 1974)

24 The classical approach
Geometric means (Birch, 1963)
Effect coding (mean is ref. cat.)
Birch, M.W. (1963) ‘Maximum likelihood in three-way contingency tables’, J. Royal Stat. Soc. (B), 25:

25 The basic model
Political attitudes
Overall effect: 22.98/4 = 5.7456
Effect of party:
Conservative: 11.49/ =
Labour: 11.49/ =
Effect of gender:
Male: 11.44/ =
Female: 11.54/ =
Interaction effects (gender-party interaction effect):
Male conservative: =
Female conservative: =
Male labour: =
Female labour: =
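The geometric-mean calculation on this slide can be retraced with a short sketch. It assumes the four cell counts quoted on the prediction slide of this deck (279, 352, 335 and 291) and the usual effect-coding definitions: the overall effect is the mean of the log counts, each main effect is a marginal mean minus the overall effect, and the interaction is whatever remains. The variable names are illustrative only.

```python
import math

# Counts as quoted on the prediction slide (gender, party).
counts = {
    ("male", "conservative"): 279,
    ("female", "conservative"): 352,
    ("male", "labour"): 335,
    ("female", "labour"): 291,
}
genders = ["male", "female"]
parties = ["conservative", "labour"]

logs = {cell: math.log(n) for cell, n in counts.items()}

# Overall effect: mean of the four log counts (log of the geometric mean).
overall = sum(logs.values()) / 4

# Main effects: marginal mean of the log counts minus the overall effect.
party_effect = {
    p: sum(logs[(g, p)] for g in genders) / 2 - overall for p in parties
}
gender_effect = {
    g: sum(logs[(g, p)] for p in parties) / 2 - overall for g in genders
}

# Interaction: what is left of each log count after the other terms.
interaction = {
    (g, p): logs[(g, p)] - overall - gender_effect[g] - party_effect[p]
    for g in genders
    for p in parties
}

print(f"overall effect: {overall:.4f}")
for p in parties:
    print(f"party effect, {p}: {party_effect[p]:+.4f}")
for g in genders:
    print(f"gender effect, {g}: {gender_effect[g]:+.4f}")
for cell, value in interaction.items():
    print(f"interaction, {cell[0]} {cell[1]}: {value:+.4f}")
```

The sum of the four log counts comes to about 22.98 and the overall effect to about 5.7456, in line with the figures that survive on the slide. By construction, the effects within each factor sum to zero.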

26 The basic model (Effect Coding: Mean)
Political attitudes
Birch, M.W. (1963) ‘Maximum likelihood in three-way contingency tables’, J. Royal Stat. Soc. (B), 25:
Coding: effect coding
Parameters are subject to constraints: normalisation constraints
Only first-order contrasts can be estimated:

27 Political attitudes The basic model (GLIM) Estimate S.E.

28 Political attitudes The basic model (SPSS)

29 The basic model (1) Political attitudes
ln 11 = = ln 12 = = ln 21 = = ln 22 = =

30 The design-matrix approach

31 I. Design matrix: Effect Coding unsaturated log-linear model
Number of parameters exceeds number of equations → need for additional equations
X’X is singular, so (X’X)^-1 does not exist → identify linear dependencies
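A sketch of how effect coding removes the linear dependencies in the unsaturated (independence) model. The notation is assumed, not taken from the slide: \lambda_{ij} is the expected count, and the cell order 11, 12, 21, 22 is an arbitrary choice.

```latex
% Over-parameterised form: 5 parameters, 4 cells, so X'X is singular
\ln \lambda_{ij} = u + u^{A}_{i} + u^{B}_{j}

% Additional equations (effect coding):
u^{A}_{2} = -u^{A}_{1}, \qquad u^{B}_{2} = -u^{B}_{1}

% Reduced system with a full-rank design matrix and 3 unknowns
\begin{pmatrix}
\ln \lambda_{11} \\ \ln \lambda_{12} \\ \ln \lambda_{21} \\ \ln \lambda_{22}
\end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 1 \\
1 & 1 & -1 \\
1 & -1 & 1 \\
1 & -1 & -1
\end{pmatrix}
\begin{pmatrix}
u \\ u^{A}_{1} \\ u^{B}_{1}
\end{pmatrix}
```

The columns of this design matrix are mutually orthogonal, so X'X = 4I and the inverse exists.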

32 I. Design matrix unsaturated log-linear model
(additional eq.) Coding!

33 3 unknowns → 3 equations, expressed in terms of the frequency predicted by the model

34 Political attitudes

35 Political attitudes
314.17 * 1.0040 * 0.9772 = 308.23
314.17 * [1/1.0040] * =
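The multipliers on this slide look like the effect-coded parameters of the independence model. The sketch below tries to retrace them, assuming the counts quoted on the prediction slide of this deck and the standard fitted values for independence (row total times column total, divided by the grand total). The layout of the table and the variable names are illustrative assumptions.

```python
import math

# Rows: male, female. Columns: conservative, labour (counts from the deck).
counts = [
    [279, 335],
    [352, 291],
]

total = sum(sum(row) for row in counts)
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]

# Fitted counts under independence.
fitted = [[r * c / total for c in col_totals] for r in row_totals]
log_fitted = [[math.log(v) for v in row] for row in fitted]

# Effect coding applied to the fitted log counts.
overall = sum(sum(row) for row in log_fitted) / 4
gender_male = sum(log_fitted[0]) / 2 - overall
party_conservative = (log_fitted[0][0] + log_fitted[1][0]) / 2 - overall

# Multiplicative form of the parameters.
level = math.exp(overall)
party_multiplier = math.exp(party_conservative)
gender_multiplier = math.exp(gender_male)

male_conservative = level * party_multiplier * gender_multiplier
male_labour = level / party_multiplier * gender_multiplier

print(f"overall level: {level:.2f}")
print(f"party multiplier (conservative): {party_multiplier:.4f}")
print(f"gender multiplier (male): {gender_multiplier:.4f}")
print(f"predicted male conservative: {male_conservative:.2f}")
print(f"predicted male labour: {male_labour:.2f}")
```

The results should land close to the slide's 314.17, 1.0040, 0.9772 and 308.23; small differences in the last digit are expected because the slide multiplies rounded values.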

36 Design matrix Saturated log-linear model

37 Political attitudes
exp[ ] = exp[5.6312] = 279
exp[ ] = 335

38 Political attitudes

39 Other Ways of Restricting II. Design Matrix: Contrast Coding

40 III. Design matrix: other restrictions on parameters saturated log-linear model
(SPSS)

41 Political attitudes

42 Political attitudes

43 Political attitudes

44 Political attitudes

45 Prediction of counts or frequencies:
Political attitudes
A. Effect coding
279 = * * *
352 = * * *
335 = * * *
291 = * * *
B. Contrast coding: GLIM
291 = 279 * * * (females voting labour)
279 = 279 * * * (males voting conservative = ref. cat.)
352 = 279 * * * (females voting conservative)
335 = 279 * * * (males voting labour)
C. Contrast coding: SPSS (SPSS adds 0.5 to observed values)
279.5 = * * *
352.5 = * * * 1
291.5 = * * * 1 (females voting labour = ref. cat.)
335.5 = * * * 1
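The multipliers missing from parts B and C can be sketched from the counts alone, assuming ordinary dummy (contrast) coding for a saturated 2 x 2 model: the reference cell gives the level, each main effect is a ratio to the reference cell, and the interaction is the odds ratio. The function name and argument names are hypothetical, and the values it produces are recomputed here, not copied from the slide.

```python
import math

counts = {
    ("male", "conservative"): 279,
    ("female", "conservative"): 352,
    ("male", "labour"): 335,
    ("female", "labour"): 291,
}


def flip(value, pair):
    """Return the other member of a two-element pair."""
    return pair[1] if value == pair[0] else pair[0]


def contrast_multipliers(table, ref_gender, ref_party):
    """Multiplicative dummy-coded parameters for a given reference cell."""
    other_gender = flip(ref_gender, ("male", "female"))
    other_party = flip(ref_party, ("conservative", "labour"))
    ref = table[(ref_gender, ref_party)]
    gender = table[(other_gender, ref_party)] / ref
    party = table[(ref_gender, other_party)] / ref
    interaction = table[(other_gender, other_party)] / (ref * gender * party)
    return ref, gender, party, interaction


# B. GLIM-style coding: males voting conservative as reference category.
glim = contrast_multipliers(counts, "male", "conservative")
female_labour = glim[0] * glim[1] * glim[2] * glim[3]

# C. SPSS-style coding: 0.5 added to every cell, females voting labour
# as reference category.
shifted = {cell: n + 0.5 for cell, n in counts.items()}
spss = contrast_multipliers(shifted, "female", "labour")
male_conservative = spss[0] * spss[1] * spss[2] * spss[3]

for name, params in (("GLIM", glim), ("SPSS", spss)):
    logs = ", ".join(f"{math.log(p):.4f}" for p in params)
    print(f"{name} log-scale parameters: {logs}")
print(f"females voting labour (GLIM coding): {female_labour:.1f}")
print(f"males voting conservative (SPSS coding): {male_conservative:.1f}")
```

In both codings the product of the four multipliers reproduces the cell furthest from the reference category, which is the pattern the slide lays out; cells closer to the reference simply use a multiplier of 1 for the terms that do not apply.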

