E&I for 2006 Canadian Census
Mike Bankier
Statistics Canada
NIM/CANCEIS
– Nearest-neighbour imputation methodology (NIM) used in the last 2 Censuses
– Processes both categorical and numeric variables simultaneously
– NIM finds nearest neighbours first and then determines the minimum number of variables to impute (the reverse of the Fellegi/Holt order)
– NIM does a good job of preserving distributions
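The order of operations described on this slide (find nearest neighbours first, then work out the fewest variables to impute) can be illustrated with a small Python sketch. This is a simplified illustration under stated assumptions, not the CANCEIS implementation: the distance function, the edit rule, the variable names and the brute-force subset search are all hypothetical choices made for the example.

```python
from itertools import combinations

def distance(a, b, numeric_scale):
    """Mixed distance (assumed for this sketch): 0/1 mismatch for
    categorical fields, scaled absolute difference for numeric fields."""
    total = 0.0
    for key in a:
        if key in numeric_scale:
            total += min(1.0, abs(a[key] - b[key]) / numeric_scale[key])
        else:
            total += 0.0 if a[key] == b[key] else 1.0
    return total

def passes(record, edits):
    """Each edit returns True when the record is in conflict."""
    return not any(edit(record) for edit in edits)

def minimum_change(failed, donor, edits):
    """Smallest set of donor values that makes the failed record pass."""
    differing = [k for k in failed if failed[k] != donor[k]]
    for size in range(1, len(differing) + 1):
        for subset in combinations(differing, size):
            candidate = dict(failed)
            for key in subset:
                candidate[key] = donor[key]
            if passes(candidate, edits):
                return candidate, subset
    return None

def impute(failed, donors, edits, numeric_scale, n_neighbours=2):
    """Step 1: pick the nearest donors. Step 2: minimise the change."""
    nearest = sorted(
        donors, key=lambda d: distance(failed, d, numeric_scale)
    )[:n_neighbours]
    best = None
    for donor in nearest:
        result = minimum_change(failed, donor, edits)
        if result is not None and (best is None or len(result[1]) < len(best[1])):
            best = result
    return best

# Hypothetical edit rule: a person under 15 cannot be married.
edits = [lambda r: r["age"] < 15 and r["marital"] == "married"]
donors = [
    {"age": 10, "marital": "single", "sex": "F"},
    {"age": 35, "marital": "married", "sex": "M"},
    {"age": 40, "marital": "married", "sex": "F"},
]
failed = {"age": 8, "marital": "married", "sex": "F"}

imputed, changed = impute(failed, donors, edits, numeric_scale={"age": 100})
print(imputed, changed)
```

Because the search for variables to change is restricted to the values of a few nearby donors, only a small number of candidate actions has to be examined for each failed record.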
Where NIM Used
– 100% of 2006 Canadian Census variables
– Canadian Survey of Household Spending
– Prototype software used in 2000 Brazil and Swiss Censuses
– CANCEIS licences signed by UK, USBC, Italy, New Zealand, Brazil, Peru, Australia, NASS, Netherlands, Switzerland, Ukraine
Extensions for 2006 Census
– In 2001 Census
  – DOS-based system, no Windows interface
  – able to do minimum change donor imputation
  – can process coded or numeric variables
– For 2006 Census
  – Windows interfaces added
  – can also do deterministic imputation
  – alphanumeric variables added
2006 Census Windows Interfaces
– specify the data dictionary
– specify edit rules using decision logic tables with access to the data dictionary
– submit and monitor jobs
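As a rough illustration of how edit rules written as a decision logic table relate to a data dictionary, the Python sketch below stores each rule as a column of Y/N entries against named conditions. The variable names, valid values and rules are hypothetical, and the layout is an assumption made for the example rather than the CANCEIS file format.

```python
# Hypothetical data dictionary: valid values for each variable.
DATA_DICTIONARY = {
    "age": range(0, 122),
    "marital": {"single", "married"},
    "employed": {"yes", "no"},
}

# Conditions form the rows of the decision logic table.
CONDITIONS = {
    "age_under_15": lambda r: r["age"] < 15,
    "is_married": lambda r: r["marital"] == "married",
    "is_employed": lambda r: r["employed"] == "yes",
}

# Each rule is one column: "Y" means the condition must hold, "N" that it
# must not hold; a condition left out of a rule is ignored by that rule.
RULES = [
    {"age_under_15": "Y", "is_married": "Y"},
    {"age_under_15": "Y", "is_employed": "Y"},
]

def invalid_values(record, dictionary):
    """Variables whose value is not listed in the data dictionary."""
    return [name for name, valid in dictionary.items()
            if record[name] not in valid]

def failed_rules(record, conditions, rules):
    """Indices of the rules (columns) that the record matches, i.e. fails."""
    failed = []
    for index, rule in enumerate(rules):
        if all(conditions[name](record) == (entry == "Y")
               for name, entry in rule.items()):
            failed.append(index)
    return failed

child = {"age": 8, "marital": "married", "employed": "no"}
adult = {"age": 30, "marital": "married", "employed": "yes"}
print(failed_rules(child, CONDITIONS, RULES))
print(failed_rules(adult, CONDITIONS, RULES))
```

Keeping the conditions separate from the rule columns means a new edit can be added by listing Y/N entries, without writing new logic for each rule.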
Concluding Remarks
– Finding nearest neighbours and then determining the minimum number of variables to impute has significant computational advantages
– Specialized Windows interfaces make it much easier to specify edits
– Developed collaboratively and iteratively over several censuses, resulting in a highly effective and efficient tool