Inequality: Empirical Issues

Inequality: Empirical Issues
Inequality and Poverty Measurement
Universitat Autònoma de Barcelona
Frank Cowell
http://darp.lse.ac.uk/uab2006
July 2006

Introduction
- Focus on an issue common to most empirical applications in distributional analysis: sensitivity to extreme values.
- We should be able to estimate inequality and other indices from sample data. But how do very low or very high observations affect the estimates?
- References are found in Cowell, F. A. and Flachaire, E. (2002) "Sensitivity of Inequality Measures to Extreme Values", Distributional Analysis Discussion Paper 60, STICERD, LSE, Houghton St., London, WC2A 2AE.

Motivation
- We are interested in sensitivity to extreme values for a number of reasons:
  - the welfare properties of the income distribution;
  - robustness in estimation;
  - intrinsic interest in the very rich and the very poor.

Sensitivity?
- How do we define a "sensitive" inequality measure?
- Ad hoc discussion of individual measures: empirical performance on actual data (Braulke 1983). Not satisfactory for characterising general properties.
- Welfare-theoretic approaches focus on transfer sensitivity (Shorrocks-Foster 1987), but do not provide a guide to the way measures may respond to extreme values.
- We need a general and empirically applicable tool.

Preliminaries
- Define two moments.
- A large class of inequality measures can be written in terms of these two moments (reconstruction below).
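The formulas on this slide did not survive transcription. A minimal reconstruction in the spirit of Cowell and Flachaire (2002); the exact symbols used on the slide are an assumption:

\[ \mu(F) = \int z \, dF(z), \qquad \nu_\alpha(F) = \int z^{\alpha} \, dF(z) \]

A large class of inequality measures can then be written as a smooth function of these two moments,

\[ I(F) = \psi\big(\mu(F),\, \nu_\alpha(F)\big). \]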

The Influence Function
- Mixture distribution.
- Influence function.
- For the class of inequality measures above, this yields an explicit expression (sketched below).
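A reconstruction of the missing definitions using the standard formulation; the slide's own notation is an assumption:

\[ F_\varepsilon = (1 - \varepsilon)\, F + \varepsilon\, \delta_z, \qquad \mathrm{IF}(z; I, F) = \lim_{\varepsilon \to 0^{+}} \frac{I(F_\varepsilon) - I(F)}{\varepsilon} \]

For an index of the form \( I(F) = \psi(\mu, \nu_\alpha) \), the IF of a moment \( \int g\, dF \) is \( g(z) - \int g\, dF \), so the chain rule gives

\[ \mathrm{IF}(z; I, F) = \psi_1 \cdot (z - \mu) + \psi_2 \cdot (z^{\alpha} - \nu_\alpha), \]

where \( \psi_1 \) and \( \psi_2 \) are the partial derivatives of \( \psi \) with respect to its two arguments.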

Some Standard Measures
- GE, Theil, MLD, Atkinson and Log variance (standard definitions below).
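The slide's formulas were lost; these are the standard textbook definitions, with the caveat that the slide's exact notation (in particular the version of the logarithmic variance) is not recoverable:

\[ I^{GE}_{\alpha} = \frac{1}{\alpha^{2} - \alpha}\left[\frac{\nu_\alpha}{\mu^{\alpha}} - 1\right], \quad \alpha \neq 0, 1 \]
\[ I^{Theil} = \int \frac{z}{\mu} \log\frac{z}{\mu}\, dF(z) \quad (\text{GE with } \alpha = 1) \]
\[ I^{MLD} = \int \log\frac{\mu}{z}\, dF(z) \quad (\text{GE with } \alpha = 0) \]
\[ I^{Atk}_{\varepsilon} = 1 - \left[\int \left(\frac{z}{\mu}\right)^{1-\varepsilon} dF(z)\right]^{1/(1-\varepsilon)}, \quad \varepsilon \neq 1 \]
\[ I^{LogVar} = \int \left[\log\frac{z}{\mu}\right]^{2} dF(z) \]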

…and their IFs
- GE, Theil, MLD, Atkinson, Log variance (the GE case is derived below as an illustration).
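The IF formulas themselves were lost in transcription. As an illustration, applying the chain-rule expression above to the GE index gives (a derivation sketch, not necessarily the slide's exact expression):

\[ \mathrm{IF}(z; I^{GE}_{\alpha}, F) = \frac{1}{\alpha^{2} - \alpha}\left[\frac{z^{\alpha} - \nu_\alpha}{\mu^{\alpha}} - \frac{\alpha\, \nu_\alpha}{\mu^{\alpha + 1}}\,(z - \mu)\right] \]

The \( z^{\alpha} \) term dominates in the upper tail when \( \alpha > 1 \) and in the lower tail when \( \alpha < 0 \), consistent with the tail-behaviour slide below.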

Special case
- The Gini coefficient and its IF (a standard definition is given below).
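The formulas were lost; a standard definition of the Gini coefficient (the slide's own notation is not recoverable). Its IF is known to grow linearly in z in the upper tail:

\[ I^{Gini}(F) = \frac{1}{2\,\mu(F)} \int\!\!\int |x - y|\; dF(x)\, dF(y) \]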

Tail behaviour (leading term of the IF)
- As z → 0: [log z]^2 for the Log Var; z^a for GE with a < 0; −log z for GE with a = 0.
- As z → ∞: z for the Gini; z^a for GE.

Implications
- Generalised Entropy measures with a > 1 are very sensitive to high incomes in the data.
- GE measures with a < 0 are very sensitive to low incomes.
- We cannot compare the speed of increase of the IF for different values of 0 < a < 1.
- If we do not know the income distribution, we cannot compare the IFs of different classes of measures.
- So let us take a standard model…

Singh-Maddala [figure: density functions for c = 0.7, c = 1.2 and c = 1.7]

Using S-M to get the IFs
- Take parameter values a = 100, b = 2.8, c = 1.7: a good model of the income distribution of German households.
- Use these to get the true values of the inequality measures, obtained from the moments.
- Normalise the IFs: use the relative influence function (see below).
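A sketch of the missing formulas, assuming the parameterisation of the Singh-Maddala distribution used by Cowell and Flachaire for these parameter values:

\[ F(z) = 1 - \left(1 + a z^{b}\right)^{-c}, \qquad \mathrm{E}\left[Z^{r}\right] = a^{-r/b}\, \frac{\Gamma(1 + r/b)\, \Gamma(c - r/b)}{\Gamma(c)}, \quad -b < r < bc \]

The true values of the moment-based indices follow by plugging these moments into the definitions above. The relative influence function normalises by the index itself:

\[ \mathrm{RIF}(z; I, F) = \frac{\mathrm{IF}(z; I, F)}{I(F)}. \]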

IFs based on S-M [figure: relative influence functions, each panel including the Gini for comparison]

IF using S-M: conclusions
- As z increases, the IF increases faster for high values of a.
- As z tends to 0, the IF increases faster for small values of a.
- The IF of the Gini index increases more slowly than the others, but is larger for moderate values of z.
- Comparing the Gini index with GE or the Log Variance does not lead to clear conclusions.

A simulation approach
- Use a simulation study to evaluate the impact of contamination in extreme observations.
- Simulate 100 samples of 200 observations from the S-M distribution.
- Contaminate just one randomly chosen observation, either multiplying it by 10 (high-value contamination) or dividing it by 10 (low-value contamination).
- Compute the relative change in the index between the contaminated distribution and the empirical distribution (a sketch follows).
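A minimal sketch of this experiment in Python, assuming the S-M parameterisation above and a relative-change criterion of the form RC(I) = (I_contaminated − I_empirical) / I_empirical; the function names are illustrative, not from the original:

```python
import numpy as np

rng = np.random.default_rng(0)

def singh_maddala(n, a=100.0, b=2.8, c=1.7):
    """Draw n observations from F(z) = 1 - (1 + a*z**b)**(-c) by CDF inversion."""
    u = rng.uniform(size=n)
    return (((1.0 - u) ** (-1.0 / c) - 1.0) / a) ** (1.0 / b)

def mld(y):
    """Mean logarithmic deviation, GE with a = 0."""
    return float(np.mean(np.log(np.mean(y) / y)))

def rc(index, y, factor):
    """Relative change in the index when one randomly chosen
    observation is multiplied by `factor` (assumed form of RC(I))."""
    y_cont = y.copy()
    y_cont[rng.integers(len(y))] *= factor
    return (index(y_cont) - index(y)) / index(y)

# 100 samples of 200 observations; contaminate one point by *10 or /10.
high = [rc(mld, singh_maddala(200), 10.0) for _ in range(100)]
low = [rc(mld, singh_maddala(200), 0.1) for _ in range(100)]
print(np.mean(high), np.mean(low))
```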

Contamination in high values
- Plot RC(I) for the 100 different samples, sorted so that the Gini realisations are increasing.
- The Gini is less affected by contamination than GE.
- The impact on the Log Var and GE (0 ≤ a ≤ 1) is relatively small compared with GE (a < 0) or GE (a > 1).
- GE (0 ≤ a ≤ 1) is less sensitive if a is smaller.
- The Log Var is slightly more sensitive than the Gini.

Contamination in low values
- Plot RC(I) for the 100 different samples, sorted so that the Gini realisations are increasing.
- The Gini is less affected by contamination than GE.
- The impact on the Log Var and GE (0 ≤ a ≤ 1) is relatively small compared with GE (a < 0) or GE (a > 1).
- GE (0 ≤ a ≤ 1) is less sensitive if a is larger.
- The Log Var is more sensitive than the Gini.

Influential Observations
- Drop the i-th observation from the sample and call the resulting inequality estimate Î(i).
- Compare I(F) with Î(i) using a leave-one-out statistic.
- Take a sorted sample of 5,000 observations and examine 10 from the bottom, middle and top (a sketch follows).
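Continuing the simulation sketch above (reusing its singh_maddala and mld helpers), a hypothetical leave-one-out computation; the exact statistic on the slide is not recoverable, so a simple difference Î − Î(i) is used:

```python
def leave_one_out(index, y):
    """Change in the index estimate when observation i is dropped,
    I_hat - I_hat_(i), computed for every i (assumed form of the statistic)."""
    full = index(y)
    return np.array([full - index(np.delete(y, i)) for i in range(len(y))])

# Sorted sample of 5,000 draws from the S-M distribution.
y = np.sort(singh_maddala(5000))
infl = leave_one_out(mld, y)

# Examine 10 observations from the bottom, middle and top.
print(infl[:10])        # smallest incomes
print(infl[2495:2505])  # middle of the sample
print(infl[-10:])       # largest incomes
```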

Influential observations: summary
- Observations in the middle of the sorted sample barely affect the estimates, compared with the smallest or highest observations.
- The highest values are more influential than the smallest values.
- The highest value is very influential for GE (a = 2): its estimate changes by nearly 0.018 if we remove it.
- GE (a = −1) is strongly influenced by the smallest observation.

Extreme values
- An extreme value is not necessarily an error or some sort of contamination.
- It could be an observation belonging to the true distribution, and it could convey important information.
- An observation is extreme in the sense that its influence on the inequality estimate is important.
- Call this a high-leverage observation.

High-leverage observations
- The term leaves open the question of whether such observations "belong" to the distribution.
- But they can have important consequences for the statistical performance of the measure.
- We can use this performance to characterise the properties of inequality measures under certain conditions.
- Focus on the Error in Rejection Probability (ERP) as a criterion.

Davidson-Flachaire (1)
- Even in very large samples, the ERP of an asymptotic or bootstrap test based on the Theil index can be significant.
- Such tests are therefore not reliable.
- Three main possible causes: nonlinearity, noise, and the nature of the tails.

Davidson-Flachaire (2)
Three main possible causes:
1. Indices are nonlinear functions of sample moments, which induces biases and non-normality in the estimates.
2. Estimates of the covariances of the sample moments used to construct the indices are often noisy.
3. Indices are often sensitive to the exact nature of the tails: a bootstrap sample with nothing resampled from the tail can have properties quite different from those of the population.
Simulation experiments show that cause 3 is often quantitatively the most important. Statistical performance should therefore be better with the MLD and GE (0 < a < 1) than with the Theil index.

Empirical methods
- The empirical distribution, defined via the indicator function.
- Empirical moments.
- The plug-in inequality estimate (reconstruction below).
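The slide's formulas were lost; a standard plug-in construction consistent with the notation above (the slide's own symbols are an assumption):

\[ \hat F(z) = \frac{1}{n} \sum_{i=1}^{n} \iota(y_i \le z), \qquad \hat\mu = \frac{1}{n} \sum_{i=1}^{n} y_i, \qquad \hat\nu_\alpha = \frac{1}{n} \sum_{i=1}^{n} y_i^{\alpha} \]

where \( \iota(\cdot) \) is the indicator function, and the inequality estimate is \( \hat I = \psi(\hat\mu, \hat\nu_\alpha) \).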

Testing
- Variance estimate of the index.
- For a given value I0, test the null hypothesis I = I0.
- Test statistic (reconstruction below).
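A sketch of the missing test statistic; the asymptotically standard-normal form below is standard, with the variance of Î obtained by the delta method from the sample variances and covariance of y_i and y_i^a (the slide's exact expression is an assumption):

\[ W = \frac{\hat I - I_0}{\big(\widehat{\mathrm{Var}}(\hat I)\big)^{1/2}} \]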

Bootstrap
- To construct a bootstrap test, resample from the original data. Bootstrap inference should be superior.
- For bootstrap sample j, j = 1, …, B, a bootstrap statistic W*j is computed almost as W is from the original data, but with I0 in the numerator replaced by the index Î estimated from the original data.
- The bootstrap P-value is then the proportion of bootstrap statistics more extreme than W (see below).
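The P-value formula was lost; a standard symmetric bootstrap P-value is shown below, though the exact variant used (symmetric or equal-tail) is an assumption:

\[ P^{*} = \frac{1}{B} \sum_{j=1}^{B} \iota\big(|W_j^{*}| > |W|\big) \]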

Error in Rejection Probability: A
- ERPs of asymptotic tests at the nominal level 0.05: the difference between the actual and nominal probabilities of rejection.
- Example: with N = 2,000 observations, the ERP of GE (a = 2) is 0.11.
- The asymptotic test over-rejects the null hypothesis: the actual level is 16% when the nominal level is 5%.

Error in Rejection Probability: B
- ERPs of bootstrap tests: the distortions are reduced for all measures.
- But the ERP of GE (a = 2) is still very large, even in large samples.
- The ERPs of GE (a = 0.5 and a = −1) are small only for large samples.
- GE (a = 0), the MLD, performs better than the others: its ERP is small for 500 or more observations.

More on ERP for GE
What would happen in very large samples? ERPs for the GE class:

              a = 2    a = −1   a = 0    a = 0.5  a = 1
N = 100,000   0.0492   0.0113   0.0024   0.0054   0.0096
N = 50,000    0.0415   0.0125   0.0043   0.0052   0.0096

ERP: conclusions
- The rate of convergence to zero of the ERP of asymptotic tests is very slow.
- The same applies to the bootstrap.
- Tests based on GE measures can be unreliable even in large samples.

Sensitivity: a broader perspective
- The results so far are for a specific Singh-Maddala distribution: realistic, but obviously special.
- Consider alternative parameter values, with a particular focus on behaviour in the upper tail.
- Consider alternative distributions: use other familiar and "realistic" functional forms, focusing on the lognormal and the Pareto.

Alternative distributions
- First consider comparative contamination performance for alternative distributions, holding the inequality index fixed.
- Use the same diagrammatic tool as before: the x-axis shows the 100 different samples, sorted so that the inequality realizations are increasing; the y-axis shows RC(I) for the MLD index.

Singh-Maddala
- Distribution function and moments as given earlier; the true inequality values are found from the moments.
- Parameter values: c = 0.7 ("heavy" upper tail), c = 1.2, c = 1.7.

MLD Contamination: S-M

Lognormal
- Distribution function and inequality values (standard closed forms below).
- Parameter values: s = 0.5, s = 0.7, s = 1.0 ("heavy" upper tail).
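The slide's formulas were lost; for a lognormal with log-scale m and shape s, the distribution function and some standard closed-form inequality values are (which of these the slide displayed is an assumption):

\[ F(z) = \Phi\!\left(\frac{\log z - m}{s}\right), \qquad I^{MLD} = I^{Theil} = \frac{s^{2}}{2}, \qquad I^{Gini} = 2\,\Phi\!\left(\frac{s}{\sqrt{2}}\right) - 1 \]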

MLD Contamination: Lognormal

Pareto
- Parameter values: a = 1.5 ("heavy" upper tail), a = 2.0, a = 2.5.
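For reference, the Pareto (type I) distribution function and its Gini coefficient; z_0 denotes the scale parameter, which does not appear in the transcript and is an assumption:

\[ F(z) = 1 - \left(\frac{z}{z_0}\right)^{-a}, \quad z \ge z_0, \qquad I^{Gini} = \frac{1}{2a - 1} \quad (a > 1) \]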

MLD Contamination: Pareto

ERP at nominal 5%: MLD [figure: asymptotic tests and bootstrap tests]

ERP at nominal 5%: Theil [figure: asymptotic tests and bootstrap tests]

Comparing Distributions
- Bootstrap tests usually improve numerical performance.
- The MLD is more sensitive to contamination in high incomes when the upper tail of the underlying distribution is heavy.
- The ERP of asymptotic and bootstrap tests based on the MLD or Theil index is larger when the upper tail of the underlying distribution is heavy.

Why the Gini…?
- Why use the Gini coefficient? It has obvious intuitive appeal.
- It is sometimes suggested that the Gini is less prone to the influence of outliers: it is less sensitive than the GE indices to contamination in high incomes.
- But there is little to choose between the Gini coefficient and the MLD, or between the Gini and the logarithmic variance.

The Bootstrap…?
- Does the bootstrap "get you out of trouble"? The bootstrap performs better than asymptotic methods, but does it perform well enough?
- In terms of the ERP, the bootstrap does well only for the Gini, the MLD and the logarithmic variance.
- If we use a distribution with a heavy upper tail, the bootstrap performs poorly in the case a = 0, even in large samples.