Tetiana Ianevych and Veronika Serhiienko


Effect of using Tobit and Heckit models in regression estimation for data with many zeros
Tetiana Ianevych and Veronika Serhiienko
Taras Shevchenko National University of Kyiv, Ukraine
Jelgava, Latvia, 2018

Outline
The main problem
Tobit model
Heckit model
Simulation results
Conclusions

The main problem
The data contain a large number of zero observations.
Standard estimators then have undesirable or unacceptable precision.

Why are zeros present in the data?
As a result of censoring: Tobit model.
As a result of a decision that the researcher has no control over for some reason: Heckit model or some other models.

Tobit model & Censoring
Data are censored when we have only partial information about the value of a variable: we know it is beyond some boundary, but not how far above or below it. For example, censoring occurs when a value falls outside the range of a measuring instrument. A bathroom scale might only measure up to 140 kg; if a 160 kg individual is weighed on it, the observer would only know that the individual's weight is at least 140 kg.

Formally, the model is written in terms of a censoring threshold τ and a value τ_y that is recorded for censored observations. The most common choice is τ = τ_y = 0. This mathematical model was introduced by Tobin in 1958 and is known as the Tobit model or censored regression model.
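A common textbook way to write the censored regression model is sketched below. It is a reconstruction under the usual assumptions rather than the authors' own formula: the latent variable y_i^*, the covariates x_i, the coefficients β and the normal errors with variance σ² are assumed notation, while τ and τ_y are the symbols used on the slide.

```latex
y_i^* = \mathbf{x}_i^\top \boldsymbol{\beta} + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2),
\qquad
y_i =
\begin{cases}
  y_i^*,  & y_i^* > \tau, \\
  \tau_y, & y_i^* \le \tau .
\end{cases}
```

With τ = τ_y = 0, every latent value at or below zero is recorded as exactly zero, which is how the zero observations arise.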

Underlying generating process

Censored data
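The step from the underlying generating process to censored data can be illustrated with a small simulation. This is a minimal sketch, assuming a single normal covariate and the illustrative parameter values in the signature; the function name and all defaults are hypothetical and are not taken from the presentation.

```python
import numpy as np

def simulate_tobit_population(n_units=1000, beta0=1.0, beta1=1.0,
                              sigma=1.0, tau=0.0, seed=0):
    """Simulate a population whose response is left-censored at tau (tau_y = tau)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_units)                      # auxiliary variable (assumed normal)
    y_star = beta0 + beta1 * x + rng.normal(scale=sigma, size=n_units)  # latent response
    y = np.where(y_star > tau, y_star, tau)           # values at or below tau are recorded as tau
    return x, y_star, y

x, y_star, y = simulate_tobit_population()
share_zero = np.mean(y == 0.0)                        # proportion of censored (zero) observations
```

Lowering `beta0` in this sketch raises the share of censored values, which is one way to obtain populations with different percentages of zeros.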

Heckit model
This type of model is appropriate when y_i = 0 because the response is not observed. This means that knowing y_i = 0 is uninformative for estimating the determinants of the level of y_i. The Heckit model was introduced by Heckman in 1979.

The Heckit model can be formulated using two equations: a “participation” equation and a “consumption” equation.
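One common way to write the two equations is sketched below. The notation is assumed rather than taken from the slide: z_i and x_i are covariate vectors, γ and β are coefficient vectors, and the error pair (u_i, ε_i) is taken to be jointly normal with Var(u_i) = 1, Var(ε_i) = σ² and correlation ρ.

```latex
\begin{aligned}
\text{participation:}\quad
  & d_i^* = \mathbf{z}_i^\top \boldsymbol{\gamma} + u_i,
  \qquad d_i = \begin{cases} 1, & d_i^* > 0, \\ 0, & d_i^* \le 0, \end{cases} \\
\text{consumption:}\quad
  & y_i^* = \mathbf{x}_i^\top \boldsymbol{\beta} + \varepsilon_i,
  \qquad y_i = \begin{cases} y_i^*, & d_i = 1, \\ 0, & d_i = 0. \end{cases}
\end{aligned}
```

Under this formulation a zero says only that unit i did not participate; it carries no information about the level y_i^*.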

Underlying generating process

Zero-inflated data
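Zero-inflated data of this kind can be generated by simulating the participation and consumption equations jointly. The following is a minimal sketch under assumed settings: one covariate per equation, bivariate normal errors with correlation `rho`, and illustrative coefficient values. None of these choices are taken from the presentation.

```python
import numpy as np

def simulate_heckit_population(n_units=1000, gamma=(0.5, 1.0), beta=(5.0, 1.0),
                               sigma=1.0, rho=0.5, seed=0):
    """Simulate zero-inflated data from a two-equation (participation/consumption) model."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_units)                 # covariate of the consumption equation
    z = rng.normal(size=n_units)                 # covariate of the participation equation
    # Correlated errors: Var(u) = 1, Var(eps) = sigma^2, Corr(u, eps) = rho.
    cov = [[1.0, rho * sigma], [rho * sigma, sigma ** 2]]
    errors = rng.multivariate_normal([0.0, 0.0], cov, size=n_units)
    u, eps = errors[:, 0], errors[:, 1]
    participates = gamma[0] + gamma[1] * z + u > 0.0   # participation decision
    y_star = beta[0] + beta[1] * x + eps               # latent consumption level
    y = np.where(participates, y_star, 0.0)            # non-participants are recorded as zero
    return x, z, participates, y

x, z, participates, y = simulate_heckit_population()
share_zero = np.mean(~participates)              # proportion of zeros from non-participation
```

Unlike the censoring case, the zeros here are not the lower tail of the response: a unit with a large latent level can still be recorded as zero.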

Simulation study
Our objective is to investigate the effect of using Tobit and Heckit models in regression estimation for data containing different percentages of zero values under a simple random sampling (SRS) design.
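The two baseline estimators compared in the study, the Horvitz-Thompson estimator and the linear-model-assisted GREG estimator, can be sketched for SRS without replacement as follows. This is an illustrative implementation under stated assumptions (one auxiliary variable known for the whole population, a working model with intercept, a population total as the target); the function names and the toy population are hypothetical.

```python
import numpy as np

def ht_total_srs(y_sample, pop_size):
    """Horvitz-Thompson estimate of the population total under SRS without replacement."""
    y_sample = np.asarray(y_sample, dtype=float)
    # Every unit has inclusion probability n / N, so each sampled value gets weight N / n.
    return pop_size / len(y_sample) * y_sample.sum()

def greg_total_lm_srs(x_pop, sample_idx, y_sample):
    """GREG estimate of the total, assisted by a linear model with intercept (SRS)."""
    x_pop = np.asarray(x_pop, dtype=float)
    y_sample = np.asarray(y_sample, dtype=float)
    pop_size, n = len(x_pop), len(sample_idx)
    design_pop = np.column_stack([np.ones(pop_size), x_pop])
    design_s = design_pop[sample_idx]
    # Design weights are equal under SRS, so ordinary least squares is used.
    coef, *_ = np.linalg.lstsq(design_s, y_sample, rcond=None)
    predictions = design_pop @ coef
    residuals = y_sample - design_s @ coef
    # Sum of predictions over U plus the HT-weighted sum of sample residuals.
    return predictions.sum() + pop_size / n * residuals.sum()

# Toy population with zeros produced by censoring at zero (illustrative only).
rng = np.random.default_rng(1)
x_pop = rng.normal(size=1000)
y_pop = np.maximum(1.0 + x_pop + rng.normal(size=1000), 0.0)
sample_idx = rng.choice(1000, size=100, replace=False)
ht_estimate = ht_total_srs(y_pop[sample_idx], 1000)
greg_estimate = greg_total_lm_srs(x_pop, sample_idx, y_pop[sample_idx])
```

A model-assisted variant keeps the same structure and only replaces the linear predictions with predictions from another working model.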

Measures of efficiency
Absolute relative bias (ARB)
Relative root mean square error (RRMSE)
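Both measures are typically approximated by Monte Carlo over repeated samples. The sketch below assumes the commonly used definitions, ARB = 100·|mean(t̂) − t| / t and RRMSE = 100·sqrt(mean((t̂ − t)²)) / t, where t is the true total and t̂ ranges over the simulated estimates. These definitions and the function names are an assumption, not a quotation from the presentation.

```python
import numpy as np

def arb_percent(estimates, true_total):
    """Absolute relative bias, in percent, over repeated-sample estimates."""
    estimates = np.asarray(estimates, dtype=float)
    return 100.0 * abs(estimates.mean() - true_total) / abs(true_total)

def rrmse_percent(estimates, true_total):
    """Relative root mean square error, in percent, over repeated-sample estimates."""
    estimates = np.asarray(estimates, dtype=float)
    mse = np.mean((estimates - true_total) ** 2)
    return 100.0 * np.sqrt(mse) / abs(true_total)
```

For example, estimates of 90 and 110 for a true total of 100 are unbiased on average (ARB of 0%) but each miss by 10, giving an RRMSE of 10%.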

Tobit - 26%
Our first simulated population U consists of N = 1000 elements, for which we produce 26% zero values using censoring. The sample size is 100.
Measure  Horvitz-Thompson estimator (%)  GREG, LM assisted (%)  GREG, Tobit assisted (%)
ARB      0.139                           2.764                  0.185
RRMSE    9.287                           7.200                  7.073

Tobit - 51%
The second simulated population U consists of N = 1000 elements, for which we produce 51% zero values using censoring. The sample size is 100.
Measure  Horvitz-Thompson estimator (%)  GREG, LM assisted (%)  GREG, Tobit assisted (%)
ARB      0.380                           10.203                 0.690
RRMSE    13.986                          13.972                 13.782

Tobit - 72%
The third simulated population U consists of N = 1000 elements, for which we produce 72% zero values using censoring. The sample size is 100.
Measure  Horvitz-Thompson estimator (%)  GREG, LM assisted (%)  GREG, Tobit assisted (%)
ARB      0.109                           0.522                  1.170
RRMSE    21.039                          20.150                 28.089
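A Tobit-assisted GREG estimator replaces the linear-model predictions with predictions from a fitted Tobit model. One possible way to obtain such predictions is sketched below, assuming censoring at zero, maximum likelihood fitting with `scipy.optimize.minimize`, and the Tobit conditional mean E[y | x] = Φ(μ/σ)·μ + σ·φ(μ/σ) with μ = xᵀβ. The function names are hypothetical and this is not necessarily how the authors computed their results.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit_tobit(X, y):
    """Maximum likelihood fit of a Tobit model left-censored at zero; returns (beta, sigma)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    censored = y <= 0.0

    def neg_loglik(params):
        beta, sigma = params[:-1], np.exp(params[-1])   # log-parametrised so sigma stays positive
        mu = X @ beta
        # Censored units contribute P(y* <= 0) = Phi(-mu / sigma).
        ll_cens = norm.logcdf(-mu[censored] / sigma)
        # Uncensored units contribute the normal density of y around mu.
        ll_obs = norm.logpdf((y[~censored] - mu[~censored]) / sigma) - np.log(sigma)
        return -(ll_cens.sum() + ll_obs.sum())

    # Start from ordinary least squares on the censored data.
    beta_start, *_ = np.linalg.lstsq(X, y, rcond=None)
    start = np.append(beta_start, np.log(np.std(y - X @ beta_start)))
    result = minimize(neg_loglik, start, method="BFGS")
    return result.x[:-1], np.exp(result.x[-1])

def tobit_expected_value(X, beta, sigma):
    """E[y | x] for a Tobit model censored at zero."""
    mu = np.asarray(X, dtype=float) @ beta
    return norm.cdf(mu / sigma) * mu + sigma * norm.pdf(mu / sigma)

# Illustrative data: intercept 1, slope 1, sigma 1, censored at zero.
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
X = np.column_stack([np.ones(2000), x])
y = np.maximum(X @ np.array([1.0, 1.0]) + rng.normal(size=2000), 0.0)
beta_hat, sigma_hat = fit_tobit(X, y)
pred = tobit_expected_value(X, beta_hat, sigma_hat)
```

In a GREG estimator of a total, these predictions would be summed over the population and corrected by the weighted sum of sample residuals y − ŷ. Because the conditional mean is never negative, the working model respects the lower bound of the data, which a linear model does not.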

Heckit - 30%
The simulated population U consists of N = 1000 elements, for which we produce 30% zero values using the Heckit model. The sample size is 100.
Measure  Horvitz-Thompson estimator (%)  GREG, LM assisted (%)  GREG, Heckit assisted (%)
ARB      8.497                           14.869                 13.334
RRMSE    27.845                          23.302                 22.537

Heckit - 51%
The simulated population U consists of N = 1000 elements, for which we produce 51% zero values using the Heckit model. The sample size is 100.
Measure  Horvitz-Thompson estimator (%)  GREG, LM assisted (%)  GREG, Heckit assisted (%)
ARB      0.181                           0.941                  0.988
RRMSE    21.528                          16.331                 16.317

Heckit - 73%
The simulated population U consists of N = 1000 elements, for which we produce 73% zero values using the Heckit model. The sample size is 100.
Measure  Horvitz-Thompson estimator (%)  GREG, LM assisted (%)  GREG, Heckit assisted (%)
ARB      0.269                           1.429                  1.362
RRMSE    25.007                          19.627                 19.603

Conclusions
Using GREG estimators leads to biased results, but better ones with regard to accuracy.
Using the Tobit- and Heckit-based estimators improves the quality of the GREG estimator with regard to both bias and mean square error if the number of zero values is not large. If it is large, the improvement can be lost.
If the underlying process generating the zero values does not correspond well to the estimator, the improvement can be lost for both the Tobit and Heckit models.

Thank you for your attention!