National Center for Statistics & Analysis
People Saving People
28th Annual Traffic Records Forum, Orlando, FL
Session 38
Alcohol Imputation Model
Why Impute?
Joseph (Joe) M. Tessmer
Mathematical Analysis Division
Why?
There is a problem in alcohol reporting
Wide range of BAC reporting for drivers and non-occupants by state
  Levels of reporting of alcohol test results for drivers and non-occupants involved in fatal crashes ranged, by state, from less than 12 percent to more than 86 percent
Why Impute?
  Reduce potential biases in estimates
  14% to 88% of BAC test results are missing in FARS, depending on the state
  Nationally, approximately 60% of BAC data are missing for drivers and non-occupants
Why Impute?
  If the individuals selected for BAC testing are not a random sample, the estimates will be biased
  Often only drivers suspected of a high BAC are tested
  Using tested individuals alone, we would over-estimate BAC levels
  44% of tested individuals had BAC > 0
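The bias argument can be illustrated with a small simulation. This is a minimal sketch, not an analysis of FARS data: the population size, the 25% true drinking rate, and the testing probabilities (80% for drinking drivers, 20% for sober drivers) are all invented for illustration.

```python
import random

def simulate(n=100_000, p_drinking=0.25, p_test_if_drinking=0.8,
             p_test_if_sober=0.2, seed=1):
    """Return (true share with BAC > 0, share with BAC > 0 among tested drivers)."""
    rng = random.Random(seed)
    drinking_total = 0
    tested = 0
    tested_drinking = 0
    for _ in range(n):
        drinking = rng.random() < p_drinking
        # Hypothetical selection rule: drinking drivers are tested far more often.
        p_test = p_test_if_drinking if drinking else p_test_if_sober
        is_tested = rng.random() < p_test
        drinking_total += drinking
        if is_tested:
            tested += 1
            tested_drinking += drinking
    return drinking_total / n, tested_drinking / tested

true_rate, tested_rate = simulate()
print(f"true share with BAC > 0:    {true_rate:.3f}")
print(f"share among tested drivers: {tested_rate:.3f}")
```

Under these assumed rates, the share of drinking drivers among those tested is well above the true share, which is the over-estimation the slide warns about.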
Current Imputation Procedure
  Three-level discriminant analysis
  For each missing BAC value, calculate the probability that:
    1) BAC = 0
    2) 0 < BAC < 0.10
    3) BAC >= 0.10
  Note: the three probabilities add to 1
Current Imputation Procedure
  Provides some useful information
  It was a major step forward when introduced in 1986
  It is a rigid procedure and cannot be used to quantify the effect of the current 0.08 BAC legislation
Current Imputation Procedure
  Results cannot be used as input to other types of analysis
  Cannot be used as an independent variable in crash analysis
  Can be used as a weight
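Using the three probabilities as weights amounts to summing them over cases to get an expected count in each BAC category. A minimal sketch, with made-up probabilities for four hypothetical drivers whose BAC is missing:

```python
# Each tuple holds (P[BAC = 0], P[0 < BAC < 0.10], P[BAC >= 0.10]) for one
# driver with a missing test result. The numbers are invented for illustration.
probs = [
    (0.90, 0.05, 0.05),
    (0.20, 0.30, 0.50),
    (0.60, 0.25, 0.15),
    (0.10, 0.20, 0.70),
]

# The three probabilities for each driver must add to 1.
for p in probs:
    assert abs(sum(p) - 1.0) < 1e-9

# Summing each column gives the expected number of drivers per category.
expected = [sum(col) for col in zip(*probs)]
labels = ["BAC = 0", "0 < BAC < 0.10", "BAC >= 0.10"]
for label, count in zip(labels, expected):
    print(f"{label:>15}: {count:.2f} drivers")
```

Because the category boundaries are fixed in advance, weights like these cannot answer a question about a different threshold such as 0.08.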
Why Change to Multiple Imputation?
  State-of-the-art solution
  Imputed values are actual BAC levels, which can be used in additional analysis
Why Change to Multiple Imputation?
  Improves fidelity of results
  Permits analysis at any level of BAC
  The old technique uses the probability that a value falls within one of three ranges (more difficult to use)
Why Change to Multiple Imputation?
  Can calculate the standard error of the estimates
  Achieves greater confidence in results (narrower confidence limits)
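A standard error becomes available because the same estimate can be computed on each imputed data set and then pooled. The sketch below uses Rubin's combining rules, the usual approach for multiple imputation; the slides do not spell out the formula, so treat its use here as an assumption. The BAC values are invented, and only three imputations are shown to keep the example short.

```python
import math
from statistics import mean, variance

def share_at_or_above(bacs, threshold):
    """Proportion of cases with BAC >= threshold, and its sampling variance."""
    n = len(bacs)
    p = sum(b >= threshold for b in bacs) / n
    return p, p * (1 - p) / n

def rubin_combine(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)          # pooled point estimate
    w = mean(variances)              # within-imputation variance
    b = variance(estimates)          # between-imputation variance (m - 1 denominator)
    total = w + (1 + 1 / m) * b      # total variance
    return q_bar, math.sqrt(total)

# Three hypothetical completed data sets for the same 8 drivers.
imputations = [
    [0.00, 0.00, 0.14, 0.09, 0.00, 0.21, 0.00, 0.05],
    [0.00, 0.00, 0.12, 0.08, 0.00, 0.18, 0.00, 0.09],
    [0.00, 0.00, 0.16, 0.10, 0.00, 0.20, 0.00, 0.03],
]

# Any threshold can be used because the imputed values are actual BAC levels.
results = [share_at_or_above(d, threshold=0.08) for d in imputations]
estimate, std_error = rubin_combine([p for p, _ in results],
                                    [v for _, v in results])
print(f"share with BAC >= 0.08: {estimate:.3f} (SE {std_error:.3f})")
```

Changing `threshold` to another value reuses the same imputed data, which is what the fixed three-range probabilities do not allow.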
Multiple Imputation Advantages
  Uses several characteristics of the crash and of the driver or non-occupant to estimate BAC levels
Example 1
  Driver Characteristics
    Female driver
    36 years old
    Seat belt used
  Crash Characteristics
    8:20 a.m. Tuesday in October
    3 passengers, all children, in vehicle
    2 vehicles involved
    Police reported no drinking and no BAC data
  Estimated BAC = 0.00
Example 2
  Driver Characteristics
    Male driver
    23 years old
    Seat belt not used
  Crash Characteristics
    2:10 a.m. Saturday in July
    No passengers
    Single-vehicle crash
    Police reported drinking but no BAC data
  Estimated BAC = 0.14
Example 3
  Driver Characteristics
    Male driver
    23 years old
    Seat belt not used
  Crash Characteristics
    2:10 a.m. Saturday in July
    No passengers
    Single-vehicle crash
    Police reported no drinking and no BAC data
  Estimated BAC = 0.00
Multiple Imputation is a lot of work...
  Uses several characteristics of the crash and of the driver or non-occupant to estimate 10 BAC levels for each case with missing BAC
  Does it work?
  Are the numbers right?
Yes!!
Verification Test
  Select a year of FARS data
  Restrict the data to cases with known BAC
  Randomly recode 25% of the known BAC values as "missing"
Verification Test
  Impute the "missing" data based on the remaining 75% of the data
  Compare the estimates with the actual data
  Repeat the test
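The verification procedure can be sketched as a hold-out experiment. Everything in the block below is a stand-in: the records are synthetic, and the imputer is a simple random hot-deck draw within a "police reported drinking" group, not the multiple imputation model evaluated in this test. The point is only the mask, impute, compare, repeat loop.

```python
import random

rng = random.Random(42)

# Synthetic "known BAC" records: (police_reported_drinking, bac).
records = []
for _ in range(4000):
    drinking = rng.random() < 0.3
    bac = round(rng.uniform(0.02, 0.25), 2) if drinking else 0.0
    records.append((drinking, bac))

def run_trial(records, rng, holdout_share=0.25, threshold=0.08):
    """Mask a random share of known BACs, impute them, and compare shares."""
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * holdout_share)
    masked, kept = shuffled[:cut], shuffled[cut:]

    # Donor pools are built only from the remaining 75% of records.
    donors = {True: [], False: []}
    for drinking, bac in kept:
        donors[drinking].append(bac)

    # Stand-in imputer: draw a donor BAC from the matching group.
    imputed = [rng.choice(donors[drinking]) for drinking, _ in masked]
    actual = [bac for _, bac in masked]

    def share(values):
        return sum(v >= threshold for v in values) / len(values)

    return share(actual), share(imputed)

# Repeat the test several times, as the procedure calls for.
trials = [run_trial(records, rng) for _ in range(5)]
for known, estimated in trials:
    print(f"known: {known:.3f}   imputed: {estimated:.3f}")
```

Each printed pair compares the share of held-out cases at or above the threshold in the actual data with the share in the imputed data; close agreement across repeated trials is the evidence the test looks for.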
Verification Test
  BAC estimates for the 25% removed
  [Chart: Percent of total drivers involved, Known vs. MI]
  [Chart: Percent of total drivers killed (daytime), Known vs. MI]
QUESTIONS?