Published by Anabel Patrick. Modified over 9 years ago.
1
Using Modern Missing Data Analyses for Effective Inference about Hunters’ Satisfaction towards the OFW Program
Muhammad Imran Khan
2
Motivation of Study
– Hunting & fishing are part of Nebraska's heritage
– NGPC is interested in improving hunter/angler recruitment & retention (NGPC, 2008)
– Data were collected in 2013 to learn about hunters’ motivations & satisfaction towards OFW lands
– The purpose of this study is to compare estimates obtained with appropriate imputation methods
3
Missing Data
Missingness in Surveys (Groves et al., 2004)
– Noncoverage
– Unit Nonresponse
– Item Nonresponse
– Partial Nonresponse (Brick & Kalton, 1996)
– Data Entry Error (Anne & Andrea, 2014)
Missing Data Mechanism (Buuren, 2012)
– Missing Completely At Random (MCAR)
– Missing At Random (MAR)
– Missing Not At Random (MNAR)
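The three mechanisms are usually stated in terms of the probability of the missingness pattern. The block below is a sketch in common textbook notation, not notation taken from the slides: R is the response indicator, Y_obs and Y_mis are the observed and missing parts of the data, and ψ denotes the parameters of the missingness model (all symbols are assumptions introduced here).

```latex
\begin{aligned}
\text{MCAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, \psi) = P(R \mid \psi) \\
\text{MAR:}\quad  & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, \psi) = P(R \mid Y_{\text{obs}}, \psi) \\
\text{MNAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, \psi) \ \text{still depends on } Y_{\text{mis}}
\end{aligned}
```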
4
How much missing data is “problematic”?
Researchers assign some limits:
– > 5% (Schafer, 1999)
– > 10% (Bennett, 2001)
– > 20% (Peng et al., 2006)
– Widaman (2006) specified the following scale:
  o 1%-2% (Negligible)
  o 5%-10% (Minor)
  o 10%-25% (Moderate)
  o 25%-50% (High)
  o >50% (Excessive)
Important problems of missingness (Bell & Fairclough, 2013)
– Decrease in precision
– Increased bias in parameter estimation
5
NGPC & UNL conducted the survey
Sampling frame: hunters who purchased a hunting license for hunting in NE in 2012
– The survey contained three parts:
  o Where & what they hunt; Environment Impact
  o Motivations (Relatedness, Competence, Autonomy)
  o Socio-demographic factors
About the collected data
– Total questions = 42 (19 questions used for analysis)
– Sample size = 8181
– Completely filled = 1555 (19%)
– Unit nonresponse = 627 (8%)
– Item nonresponse = 5999 (73%)
  o Varies from 1 to 8 missing items per respondent across the 19 questions
– Unit + item nonresponse together = 81%
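The response breakdown can be checked with a few lines of arithmetic. This is a small sketch using only the counts reported on the slide; the function name `pct` is introduced here for illustration.

```python
# Counts reported on the slide.
sample_size = 8181
complete = 1555
unit_nonresponse = 627
item_nonresponse = 5999

def pct(count, total=sample_size):
    """Share of the full sample, rounded to a whole percent."""
    return round(100 * count / total)

# The three groups should partition the sample.
assert complete + unit_nonresponse + item_nonresponse == sample_size

breakdown = {
    "complete": pct(complete),
    "unit_nonresponse": pct(unit_nonresponse),
    "item_nonresponse": pct(item_nonresponse),
    "any_missingness": pct(unit_nonresponse + item_nonresponse),
}
print(breakdown)
```

The last entry corresponds to the 81% annotation: unit and item nonresponse together.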
6
Determining Type of Missing Data

M.    Satisf.  Rel_1  Rel_2  Comp.  Auto.  H_Days  “Harvest”  Educ.  Income  Age
Ns.   5171     332           345    397    5096    0          1088   1465    1263
%     0.685    0.04          0.046  0.053  0.675   0          0.144  0.194   0.167
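The proportions in the "%" row appear to be the missing counts divided by the number of respondents left after removing unit nonresponse (8181 − 627 = 7554). That denominator is an inference from the numbers, not something the slide states. A quick check, with the values listed in the order they appear on the slide:

```python
# Assumption: unit nonrespondents are excluded from the denominator.
respondents = 8181 - 627

# Values in the order they appear on the slide (not tied to variable names here).
missing_counts = [5171, 332, 345, 397, 5096, 0, 1088, 1465, 1263]
reported = [0.685, 0.04, 0.046, 0.053, 0.675, 0.0, 0.144, 0.194, 0.167]

computed = [round(n / respondents, 3) for n in missing_counts]
for n, r, c in zip(missing_counts, reported, computed):
    print(f"{n:>5}  reported={r:<6} computed={c}")

# Largest disagreement between reported and recomputed proportions.
max_gap = max(abs(c - r) for c, r in zip(computed, reported))
```

Under this assumption every recomputed value agrees with the slide to within rounding; the 0.04 entry looks like 332/7554 shown to two decimals.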
7
Data used for analysis
– 13 questions on motivation, based on SDT (self-determination theory)
– 5 questions on relatedness transformed to 2 factors
8
Data used for analysis
– 13 questions on motivation, based on SDT
– 4 questions on competence & autonomy, each transformed to 1 factor
9
Model used for the analysis
Satisfaction = Rel_1 + Rel_2 + Comp + Auto + Educ + Age + Income + H_Days + Harvest

Variable         Description of the variable [measured on 7-point Likert scale]
Satisfaction     How satisfied were you with your experience on private lands enrolled in the Open Fields and Waters (OFW)?
Relatedness_1    I enjoy mentoring other hunters
Relatedness_2    I go hunting primarily to spend time with others & people I care about
Competence       Overall, hunting makes me feel competent in other areas of my life
Autonomy         Hunting helps me to feel independent, self-sufficient and more in control in life
Education        Highest level of education that you have completed (<HS; HS; S.C; C; ≥ G)
Age              Age (approximately, in years)
Income           Total annual income for your household before taxes (8 different levels)
Hunting_Days     Visiting OFW sites allowed me to increase total days I spent hunting
“Harvest”        If you hunted in 2012 on an OFW site, did you harvest? (Yes/No)
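Written out with coefficients, the formula above would look roughly as follows. This assumes an ordinary linear regression with an additive error term, which the reported intercept, estimates, standard errors and p-values suggest but the slides do not state explicitly; the β and ε symbols are introduced here.

```latex
\begin{aligned}
\text{Satisfaction}_i ={}& \beta_0 + \beta_1\,\text{Rel\_1}_i + \beta_2\,\text{Rel\_2}_i
  + \beta_3\,\text{Comp}_i + \beta_4\,\text{Auto}_i + \beta_5\,\text{Educ}_i \\
 & + \beta_6\,\text{Age}_i + \beta_7\,\text{Income}_i + \beta_8\,\text{H\_Days}_i
  + \beta_9\,\text{Harvest}_i + \varepsilon_i
\end{aligned}
```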
10
Methods for Handling Missing Data
Deletion or non-imputing methods:
o List-wise Deletion (Pigott, 2001)
o Pair-wise Deletion (Bennett, 2001)
Nonstochastic or ad-hoc methods:
o Mean Imputation (Graham, 2003)
o Regression Imputation (Qin et al., 2007)
Stochastic or established methods:
o Stochastic Regression (Todd et al., 2013)
o Multiple Imputation (MI) (John et al., 2007)
o Full Information Maximum Likelihood (FIML)
o Expectation Maximization (EM) (Yiran & Chao-Ying, 2013)
11
Mean Imputation
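A minimal sketch of mean imputation on a single variable, using made-up 7-point Likert responses (the data are hypothetical, not from the survey). It shows the usual side effect: the mean is unchanged, but the variance shrinks because every filled-in value sits exactly at the mean.

```python
from statistics import mean, variance

# Hypothetical 7-point Likert responses; None marks item nonresponse.
responses = [5, None, 3, 7, None, 4, 6, None, 5, 5]

# Observed cases only (what deletion would keep for this variable).
observed = [x for x in responses if x is not None]

# Mean imputation: replace every missing value with the observed mean.
fill = mean(observed)
imputed = [fill if x is None else x for x in responses]

print("mean     observed:", mean(observed), " imputed:", mean(imputed))
print("variance observed:", variance(observed), " imputed:", variance(imputed))
```

The artificially reduced variance is one reason standard errors after mean imputation tend to look too small.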
12
Comparing Results

                 List-wise Deletion            Mean Imputation
Fitted Model     Estimate  Std. Err.  p-value  Estimate  Std. Err.  p-value
Intercept           0.415      0.205    0.043     0.381      0.062    0.000
Relatedness_1      -0.023      0.040    0.565    -0.005      0.010    0.614
Relatedness_2       0.038      0.045    0.401     0.017      0.011    0.120
Competence          0.147      0.079    0.062     0.023      0.019    0.227
Autonomy            0.049      0.075    0.514     0.009      0.018    0.619
Education          -0.045      0.039    0.241    -0.011      0.010    0.296
Age                -0.001      0.003    0.682     0.000      0.001    0.563
Income              0.003      0.022    0.903     0.002      0.006    0.754
Hunting_Days        0.135      0.017    0.000     0.162      0.007    0.000
“Harvest”           0.569      0.077    0.000     0.364      0.028    0.000

List-wise Deletion: 5999 cases (rows) are deleted
Mean Imputation: m=1, maxit=1
13
Multiple Imputation
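Multiple imputation fits the model on each of m completed datasets and then pools the results. The slides do not show the pooling step; the sketch below uses Rubin's rules, the standard choice, for a single coefficient. The function name `pool_rubin` and the input numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets with Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in std_errors])   # average sampling variance
    between = variance(estimates)                   # variance across imputations
    total = within + (1 + 1 / m) * between          # total variance
    return pooled_estimate, sqrt(total)

# Hypothetical estimates of one coefficient from m = 5 imputed datasets.
estimates = [0.50, 0.60, 0.55, 0.45, 0.65]
std_errors = [0.10, 0.10, 0.10, 0.10, 0.10]

pooled_estimate, pooled_se = pool_rubin(estimates, std_errors)
print(pooled_estimate, pooled_se)
```

The pooled standard error is larger than any single-dataset standard error because the between-imputation spread is added in. That term is what carries the uncertainty about the missing values.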
14
Comparing Results

                 List-wise Deletion            Mean Imputation               Multiple Imputation
Fitted Model     Estimate  Std. Err.  p-value  Estimate  Std. Err.  p-value  Estimate  Std. Err.  p-value
Intercept           0.415      0.205    0.043     0.381      0.062    0.000     0.316      0.183    0.093
Relatedness_1      -0.023      0.040    0.565    -0.005      0.010    0.614    -0.019      0.037    0.605
Relatedness_2       0.038      0.045    0.401     0.017      0.011    0.120     0.048      0.037    0.205
Competence          0.147      0.079    0.062     0.023      0.019    0.227     0.097      0.077    0.219
Autonomy            0.049      0.075    0.514     0.009      0.018    0.619     0.017      0.061    0.787
Education          -0.045      0.039    0.241    -0.011      0.010    0.296    -0.032      0.027    0.245
Age                -0.001      0.003    0.682     0.000      0.001    0.563    -0.001      0.002    0.731
Income              0.003      0.022    0.903     0.002      0.006    0.754     0.007      0.022    0.761
Hunting_Days        0.135      0.017    0.000     0.162      0.007    0.000     0.152      0.013    0.000
“Harvest”           0.569      0.077    0.000     0.364      0.028    0.000     0.575      0.060    0.000

List-wise Deletion: 5999 cases (rows) are deleted
Mean Imputation: m=1, maxit=1
Multiple Imputation: m=20, maxit=10
15
Comparing Results

                 List-wise Deletion            FIML Imputation               EM Imputation
Fitted Model     Estimate  Std. Err.  p-value  Estimate  Std. Err.  p-value  Estimate  Std. Err.  p-value
Intercept           0.415      0.205    0.043     0.309      0.185    0.096     0.301      0.155    0.053
Relatedness_1      -0.023      0.040    0.565    -0.012      0.032    0.713    -0.010      0.034    0.781
Relatedness_2       0.038      0.045    0.401     0.061      0.036    0.089     0.061      0.034    0.076
Competence          0.147      0.079    0.062     0.102      0.065    0.116     0.106      0.065    0.106
Autonomy            0.049      0.075    0.514     0.016      0.062    0.798     0.013      0.062    0.839
Education          -0.045      0.039    0.241    -0.034      0.034    0.319    -0.030      0.033    0.359
Age                -0.001      0.003    0.682    -0.001      0.002    0.779     0.005      0.020    0.803
Income              0.003      0.022    0.903     0.006      0.020    0.766    -0.001      0.002    0.752
Hunting_Days        0.135      0.017    0.000     0.148      0.014    0.000     0.148      0.015    0.000
“Harvest”           0.569      0.077    0.000     0.599      0.062    0.000     0.598      0.060    0.000

FIML = Full Information Maximum Likelihood; EM = Expectation Maximization
List-wise Deletion: 5999 cases (rows) are deleted
EM Imputation: EM algorithm (MLE) converges in 37 iterations
16
Comparison of Imputation Methods Summary
– EM only shows that Relatedness_2 is significant
– EM gives the smallest standard error for Income

% of estimates smaller than List-wise Deletion, out of 10 variables

Approaches                             Estimates  Std. Err.  P-value  Suggestions
List-wise Deletion                     Base                           Avoid
Mean Imputation                        60%        100%       40%      Careful use
Multiple Imputation                    30%        100%       20%      Better
Full Information Maximum Likelihood    30%        100%       20%      Better
Expectation Maximization               40%        90%        20%      Preferred if converged
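The slide does not define how the "% smaller" figures are computed. One reading is a strict signed comparison of each value against its list-wise deletion counterpart, and under that assumption the Mean Imputation row can be reproduced from the comparison tables. The helper `pct_smaller` is a name introduced here.

```python
# (estimate, std. err., p-value) for the 10 model terms, in table order,
# copied from the List-wise Deletion and Mean Imputation columns.
listwise = [
    (0.415, 0.205, 0.043), (-0.023, 0.040, 0.565), (0.038, 0.045, 0.401),
    (0.147, 0.079, 0.062), (0.049, 0.075, 0.514), (-0.045, 0.039, 0.241),
    (-0.001, 0.003, 0.682), (0.003, 0.022, 0.903), (0.135, 0.017, 0.000),
    (0.569, 0.077, 0.000),
]
mean_imputation = [
    (0.381, 0.062, 0.000), (-0.005, 0.010, 0.614), (0.017, 0.011, 0.120),
    (0.023, 0.019, 0.227), (0.009, 0.018, 0.619), (-0.011, 0.010, 0.296),
    (0.000, 0.001, 0.563), (0.002, 0.006, 0.754), (0.162, 0.007, 0.000),
    (0.364, 0.028, 0.000),
]

def pct_smaller(method, base, column):
    """Percent of terms whose value is strictly below the baseline value."""
    smaller = sum(m[column] < b[column] for m, b in zip(method, base))
    return 100 * smaller // len(base)

# Columns: 0 = estimate, 1 = std. err., 2 = p-value.
summary = tuple(pct_smaller(mean_imputation, listwise, c) for c in range(3))
print(summary)  # (60, 100, 40)
```

Ties count as "not smaller" here. With the rounded table values that choice matters for some other rows (for example, the Income standard error under Multiple Imputation equals the baseline at three decimals), so the slide's figures were presumably computed on unrounded output.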
17
Thanks for your kind attention
Special thanks to:
Dr. Andrew Tyre, University of Nebraska, Lincoln
Dr. Lisa Pennisi, University of Nebraska, Lincoln
Dr. Allan McCutcheon, University of Nebraska, Lincoln
Nebraska Game & Parks Commission
18
References
Anne-Kathrin, F., & Andrea, B. (2014). The economic performance of Swiss drinking water utilities. Journal of Productivity Analysis, 41, 383-397. doi:10.1007/s11123-013-0344-0
Bell, M. L., & Fairclough, D. L. (2013). Practical and statistical issues in missing data for longitudinal patient reported outcomes. Statistical Methods in Medical Research, 0(0), 1-20. doi:10.1177/0962280213476378
Bennett, D. A. (2001). How can I deal with missing data in my study? Australian and New Zealand Journal of Public Health, 25, 464-469.
Brick, J., & Kalton, J. (1996). Handling missing data in survey research. Statistical Methods in Medical Research, 5, 215–238. doi:10.1177/096228029600500302
Buuren, S. V. (2012). Flexible imputation of missing data. Taylor & Francis, FL: CRC Press.
John, W. G., Allison, E. O., & Tamika, D. G. (2007). How many imputations are really needed? Some practical clarifications of multiple imputation theory. Springer, 8, 206-213.
Graham, J. W. (2003). Adding missing-data-relevant variables to FIML based structural equation models. Structural Equation Modeling, 10, 80–100.
Groves, R., Fowler, F., Couper, M., Lepkowski, J., Singer, E., & Tourangeau, R. (2004). Survey methodology. Hoboken, NJ: John Wiley.
Little, R. J. A. (1988). A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association, 83, 1198-1202.
NGPC (2008). Nebraska 20 year hunter/angler recruitment, development and retention plan. Lincoln, NE.
Pigott, T. D. (2001). A review of methods for missing data. Educational Research and Evaluation, 7(4), 353-383.
Peng, C. Y., Harwell, M., Liou, S. M., & Ehman, L. H. (2006). Advances in missing data methods and implications for educational research. In S. Sawilowsky (Ed.), Real data analysis (pp. 31-78). Greenwich, CT: Information Age.
Qin, Y., Zhang, S., Zhu, X., Zang, J., & Zhang, C. (2007). Semi-parametric optimization for missing data imputation. Applied Intelligence, 27, 79-88. doi:10.1007/s10489-006-0032-0
Schafer, J. L. (1999). Multiple imputation: A primer. Statistical Methods in Medical Research, 8, 3-15.
Todd, D. L., Terrence, D. J., Kyle, M. L., & Whitney, M. (2013). On the joys of missing data. Journal of Pediatric Psychology, 1-12. doi:10.1093/jpepsy/jst048
Yiran, D., & Chao-Ying, J. P. (2013). Principled missing data methods for researchers. Springer, 2:222.
19
Contact Information: mik3.stat@gmail.com