The 6th Conference on Survey Sampling in Economic and Social Research
September 21-22, 2009, Katowice, Poland

Criticalities in Applying the Neyman’s Optimality in Business Surveys: a Comparison of Selected Allocation Methods

Paola M. Chiodini a,d, Rita Lima c, Giancarlo Manzi b,d, Bianca Maria Martelli c,*, Flavio Verrecchia d

a. Department of Statistics, Università di Milano-Bicocca, Milan, Italy
b. Department of Economics, Business and Statistics, Università degli Studi di Milano, Milan, Italy
c. ISAE, Rome, Italy
d. ESeC, Assago (MI), Italy
AIM OF THE PAPER
DISCUSS POSSIBLE MORE EFFICIENT SAMPLE DESIGNS FOR THE ISAE BUSINESS TENDENCY SURVEY (BTS)
–BTS economic features
–BTS statistical features
–Operational bounds
TO MEET EVERYBODY’S NEEDS WHILE STRENGTHENING THE RELIABILITY OF OUTCOMES (INDUSTRIAL CONFIDENCE)
BTS ECONOMIC FEATURES
Business Tendency Surveys (BTS) investigate the CONFIDENCE of economic agents.
CONFIDENCE can be defined as the (positive) attitude of economic agents toward both firm-level (internal) and country-level (external) variables.
–The corresponding real value in the universe is unknown
To this end, BTS collect information on a wide range of variables selected for their ability, when analysed together, to give an overall picture of the industrial sector of the economy (OECD 2003).
BTS ECONOMIC FEATURES
The survey asks entrepreneurs and managers for assessments of current trends and expectations for the near future, regarding both their own business and the general situation of the economy.
Business Tendency Surveys thus collect qualitative information, mainly on a three-option ordinal scale.
CONFIDENCE
Answers obtained from the survey are quantified in the form of “balances”, i.e. differences between the percentages of positive and negative answers.
The statistical series derived from business tendency surveys are particularly suitable for monitoring and forecasting business cycles.
The aggregation of selected series (order book level, production expectations and stocks) gives the confidence indicator.
Confidence indicators (and some single series too) often have leading capabilities and are widely used in the analysis of the economic cycle (recessions/expansions).
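The balance defined on this slide can be illustrated with a minimal sketch. The answer labels ("up", "same", "down") and the sample replies are hypothetical, and the calculation is unweighted, whereas the actual survey applies weights.

```python
def balance(answers):
    """Balance = percentage of positive answers minus percentage of negative ones."""
    n = len(answers)
    if n == 0:
        raise ValueError("no answers to summarise")
    positive = sum(1 for a in answers if a == "up")    # hypothetical label
    negative = sum(1 for a in answers if a == "down")  # hypothetical label
    return 100.0 * (positive - negative) / n


# Ten made-up replies on a three-option ordinal scale.
replies = ["up", "up", "same", "down", "up", "same", "down", "up", "same", "same"]
print(balance(replies))  # 40% positive minus 20% negative
```

Neutral answers enter only through the denominator, so a balance near zero can hide either broad stability or an even split between optimists and pessimists.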
SHORT SURVEY HISTORY
The manufacturing survey began in 1959 on a quarterly basis and became monthly in 1962, with a limited number of questions (purposive panel).
Over the years the survey was broadly modified to meet emerging needs:
–1986: the sample was updated to provide information at the regional level as well, adopting a stratified (sector/region/size), partially random sample
–1998: Neyman’s optimal allocation of reporting units to sample strata, based on workforce variance, was introduced (Cochran 1977)
–2003: data processing was upgraded with a two-stage weighting system (sample weights and size weights) following OECD (2003), ensuring full comparability between local and national data
GDP and CONFIDENCE
Confidence tracks GDP shifts well.
Recently (since April 2009) the survey has given positive signals, while the last available GDP figures (Q2 2009) are very negative.
EUROPEAN REFERENCE FRAME
The survey is part of the Joint Harmonised Business and Consumer Survey (BCS) programme of the European Commission.
The project began in 1962, and ISAE (formerly ISCO) was one of the founding members.
The principle of harmonisation underlying the project aims to produce a set of comparable data for all European countries (EC 2007).
To achieve this goal institutes have to:
–Use the same harmonised questionnaire
–Strictly respect the Commission timetable for carrying out the survey and transmitting the results
Institutes are relatively free to define any other aspect of the process (apart from a minimum sample size).
BTS STATISTICAL FEATURES: SAMPLE DESIGN
FRAME: ASIA archive of Italian active firms (last update 2006)
+ complete universe of firms
– relatively late update
→ Keep ASIA as FRAME
QUESTIONNAIRE: fixed by the Commission; it can only be supplemented
DATA COLLECTION MODE: CATI (Computer-Assisted Telephone Interviewing), partly supplemented with fax (some CAWI foreseen)
→ MIXED MODE
OPERATIONAL CONSTRAINTS
–EC: recommended SAMPLE SIZE of about 4000 units (firms/kind-of-activity units), bound to the country population size
Very strict TIMING CONSTRAINTS:
–MONTHLY FREQUENCY
–12 DAYS FOR DATA COLLECTION
–1 WEEK FOR PROCESSING RESULTS
–NATIONAL: LOCAL INFORMATION
Governmental priority
Possible revenues
–ISAE: PRESERVING “LOYAL” FIRMS
Research purposes of longitudinal analyses
Conflicting with sampling theory (panel rotation)
BTS STATISTICAL FEATURES
As the total sample size is predetermined (about 4000 units), precision can be increased mainly by working on:
–Strata definition (partially predetermined and bound to economic and administrative settings)
–Allocation of units to strata
–Panel maintenance
–Non-response handling
–Weighting
STRATA DEFINITION
STRATA are defined according to:
ECONOMIC SECTORS
–19, close to EC requests, adapted to the Italian economy
AREAS (NUTS1)
–4, administrative classification, widely different in size
FIRM SIZE (by workforce)
–Small (10-49), Medium (50-249), Large (>=250). The distribution is right (positively) skewed because of the presence of a few “large” establishments and many “small” units
Minimum threshold of 10 employees
–About 80% of total workforce
FIRMS BY STRATA
Columns (areas): North-West, North-East, Centre, South and Islands, Total
Rows (economic sectors):
–Manufacture of food, beverages and tobacco products
–Manufacture of textiles
–Manufacture of wearing apparel
–Manufacture of leather and related products
–Manufacture of wood and paper products
–Printing and reproduction of recorded media
–Manufacture of coke and refined petroleum products
–Manufacture of chemical and pharmaceutical products
–Manufacture of rubber and plastic products
–Manufacture of other non-metallic mineral products
–Manufacture of basic metals
–Manufacture of fabricated metal products, except machinery and equipment
–Manufacture of computer, electronic and optical products
–Manufacture of electrical equipment
–Manufacture of machinery and equipment n.e.c.
–Manufacture of transport vehicles
–Manufacture of furniture
–Other manufacturing
–Repair and installation of machinery and equipment
–Total
TOTAL WORKFORCE BY STRATA
Columns (areas): North-West, North-East, Centre, South and Islands, Total
Rows (economic sectors):
–Manufacture of food, beverages and tobacco products
–Manufacture of textiles
–Manufacture of wearing apparel
–Manufacture of leather and related products
–Manufacture of wood and paper products
–Printing and reproduction of recorded media
–Manufacture of coke and refined petroleum products
–Manufacture of chemical and pharmaceutical products
–Manufacture of rubber and plastic products
–Manufacture of other non-metallic mineral products
–Manufacture of basic metals
–Manufacture of fabricated metal products, except machinery and equipment
–Manufacture of computer, electronic and optical products
–Manufacture of electrical equipment
–Manufacture of machinery and equipment n.e.c.
–Manufacture of transport vehicles
–Manufacture of furniture
–Other manufacturing
–Repair and installation of machinery and equipment
–Total
FIRM POPULATION BY SIZE
SIMULATION
UNIT ALLOCATION TO STRATA: SIMULATION SETTINGS
REFERENCE POPULATION: ASIA INDUSTRIAL SECTOR
–85710 ENTERPRISES
– PERSONS EMPLOYED
3 DIMENSIONS
–AREAS (NUTS1)
–ECONOMIC SECTORS
–FIRM SIZE
226 STRATA
500 REPLICATES
SIMULATION TECHNIQUE: SEQUENTIAL UNIT SELECTION
UNIT ALLOCATION TO STRATA: ALTERNATIVE ALLOCATION METHODS
UNIFORM (21 units per stratum)
PROPORTIONAL (sampling fraction $f_h$: 4.4%)
NEYMAN (x-optimal)
ISAE (NEYMAN x-optimal on areas; winsorised 5%)
AOSU($n_{1h}$): UNIFORM($n_{1h}$) + NEYMAN($n_{2h}$)
–$n_{1h}$ = 1, 2, …, 21
–$n_{2h} = n_h - n_{1h}$
–so that: $n_{1h} = 0$ gives AOSU0 = NEYMAN; $n_{1h} = 21$ gives AOSU21 = UNIFORM
APSU($n_{1h}$): UNIFORM($n_{1h}$) + PROPORTIONAL($n_{2h}$)
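The allocation rules listed on this slide can be sketched as follows. This is one reading of the slide, not the authors' code: AOSU is taken to reserve a fixed number of units per stratum and to share the remainder by the textbook Neyman rule (allocation proportional to stratum size times stratum standard deviation). The function names and the toy strata are hypothetical, allocations are left fractional, and no cap at the stratum size is applied.

```python
def proportional(n, sizes):
    """Share a sample of n units in proportion to the stratum sizes N_h."""
    total = sum(sizes)
    return [n * N_h / total for N_h in sizes]


def neyman(n, sizes, stds):
    """Neyman allocation: n_h proportional to N_h * S_h."""
    weights = [N_h * S_h for N_h, S_h in zip(sizes, stds)]
    total = sum(weights)
    return [n * w / total for w in weights]


def aosu(n, sizes, stds, n1):
    """Reserve n1 units per stratum, then Neyman-allocate the remainder."""
    remainder = n - n1 * len(sizes)
    if remainder < 0:
        raise ValueError("n1 is too large for the total sample size")
    return [n1 + n2 for n2 in neyman(remainder, sizes, stds)]


# Toy strata (hypothetical values): small, medium and large firms.
sizes = [1000, 200, 20]     # N_h: firms per stratum
stds = [5.0, 40.0, 300.0]   # S_h: standard deviation of workforce per stratum
n = 63                      # 21 units per stratum on average

for n1 in (0, 9, 21):
    print(n1, [round(x, 1) for x in aosu(n, sizes, stds, n1)])
```

With n1 = 0 the result coincides with pure Neyman allocation and with n1 = 21 it coincides with uniform allocation, matching the two limiting cases on the slide. An APSU variant would follow the same pattern with `proportional` in place of `neyman`.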
UNIT ALLOCATION TO STRATA: SIMULATION METHOD
[Flowchart] START → Random unit selection (sequentially ranked) → Replication → Simulation DW (loop back if replicate < 500; continue if replicate = 500) → Allocation methods (Neyman samples, ISAE samples, AOSU($n_1$) samples, …) → OUTPUT (overall stats, domain stats) → INFERENCE → END
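A replication loop of this kind can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the procedure used in the study: it draws a simple random sample without replacement inside each stratum (the slides describe a sequential unit selection whose details are not given), uses a small synthetic population, and runs 200 replicates instead of 500. All names and values are hypothetical.

```python
import random
import statistics


def estimate_total(strata, alloc, rng):
    """Stratified estimate of the total: sum over strata of N_h * sample mean."""
    total = 0.0
    for values, n_h in zip(strata, alloc):
        sample = rng.sample(values, n_h)  # SRS without replacement (assumption)
        total += len(values) * statistics.fmean(sample)
    return total


rng = random.Random(2009)  # fixed seed so the run is reproducible

# Hypothetical population: workforce per firm in three size classes.
strata = [
    [rng.randint(10, 49) for _ in range(500)],
    [rng.randint(50, 249) for _ in range(100)],
    [rng.randint(250, 2000) for _ in range(20)],
]
true_total = sum(sum(values) for values in strata)

alloc = [10, 10, 10]  # units drawn per stratum
replicates = 200      # the slides use 500

estimates = [estimate_total(strata, alloc, rng) for _ in range(replicates)]
print(true_total, round(statistics.fmean(estimates)))
```

Running the same loop once per allocation method gives one distribution of replicate estimates per method, which is the raw material for the overall and domain statistics.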
DISTRIBUTION OF REPLICATIONS (Total workforce)
[Figure] Panels: Neyman, ISAE, AOSU3, AOSU9, UNIFORM, PROP.
OVERALL POPULATION
REPLICATION BOX PLOT (Total workforce)
STATISTICS
Bias = $N\mu - N\mu_r$
Total Error (TE) = $|\mathrm{Bias}| + N\sigma_r$
Relative Total Error (RTE) = $\mathrm{TE} / (N\mu_r)$
Range = $\max(N\mu_r) - \min(N\mu_r)$, with maximum and minimum taken over the replicates
Where:
–$\mu$: population mean
–$\mu_r$: replication mean
–$\sigma_r$: replication STD
–$N$: number of enterprises
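The four statistics on this slide can be computed from a list of replicate means as in the sketch below. It assumes that the replication mean and STD are the mean and the population-style standard deviation of the replicate means (the slides do not say which STD variant is used); the function name and the input values are hypothetical.

```python
import statistics


def overall_stats(N, pop_mean, rep_means):
    """Bias, TE, RTE and Range of the estimated total over the replicates."""
    mu_r = statistics.fmean(rep_means)        # replication mean
    sigma_r = statistics.pstdev(rep_means)    # replication STD (assumption: population form)
    bias = N * pop_mean - N * mu_r
    te = abs(bias) + N * sigma_r
    rte = te / (N * mu_r)
    value_range = N * max(rep_means) - N * min(rep_means)
    return {"bias": bias, "te": te, "rte": rte, "range": value_range}


# Made-up example: 100 enterprises, true mean workforce 50, four replicates.
result = overall_stats(100, 50.0, [48.0, 50.0, 52.0, 54.0])
print(result)
```

Because TE adds the absolute bias to the spread of the replicates, an allocation can score well on bias alone and still rank poorly once its variability is taken into account.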
UNIT ALLOCATION: Statistics
Columns: |BIAS|, STD, TE, RANGE
Rows: isae, neyman, aosu, aosu, aosu, uniform, apsu, proportional
REPLICATION BOUNDED BOX PLOT (Total workforce)
BOUNDED UNIT ALLOCATION: Statistics
Bounds: maximum 50% allocation per stratum; minimum 3 units per stratum
Columns: |BIAS|, STD, TE, RANGE
Rows: aosu, aosu, aosu24 (i.e. uniform), apsu
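The two bounds on this slide can be applied to any allocation by clamping, as in the sketch below. It is only an illustration: the slides do not say how units removed or added by the bounds are redistributed, so no redistribution is done here, and the rule that the minimum wins in very small strata is an assumption. The function name and the input values are hypothetical.

```python
import math


def bound_allocation(alloc, sizes, min_units=3, max_share=0.5):
    """Clamp each n_h to at least min_units and at most max_share of N_h."""
    bounded = []
    for n_h, N_h in zip(alloc, sizes):
        lower = min(min_units, N_h)             # cannot draw more units than exist
        upper = math.floor(max_share * N_h)
        upper = max(upper, lower)               # tiny strata: the minimum wins (assumption)
        bounded.append(min(max(round(n_h), lower), upper))
    return bounded


# Made-up allocation: one starved stratum, one ordinary, one over-sampled.
alloc = [0.4, 12.6, 40.0]
sizes = [30, 200, 50]
print(bound_allocation(alloc, sizes))
```

After clamping, the total sample size generally differs from the target, so a complete procedure would need a further step to restore it.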
DOMAIN ANALYSIS
STRATA COVERAGE
AOSU, UNIF, PROP: 0 strata with 0% allocation
NEYMAN: 12 strata with 0% allocation
ISAE: 7 strata with 0% allocation
STRATA STATISTICS
${}_rCV_s = \sigma_{r,s} / \mu_{r,s}$
$\mathrm{Bias}_s = \mu_s - \mu_{r,s}$
Total Error ($\mathrm{TE}_s$) = $|\mathrm{Bias}_s| + \sigma_{r,s}$
Relative Total Error ($\mathrm{RTE}_s$) = $\mathrm{TE}_s / \mu_{r,s} = (|\mathrm{Bias}_s| / \mu_{r,s}) + {}_rCV_s$
Where:
–$\mu_s$: stratum population mean
–$\mu_{r,s}$: stratum replication mean
–$\sigma_{r,s}$: stratum replication STD
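The stratum-level statistics can be computed per stratum and then summarised by their maximum across strata, as in the sketch below. The stratum names and values are made up, and the replication STD is assumed to be the population-style standard deviation of the replicate means.

```python
import statistics


def stratum_stats(pop_mean, rep_means):
    """Relative |bias|, CV and RTE of one stratum from its replicate means."""
    mu_rs = statistics.fmean(rep_means)       # stratum replication mean
    sigma_rs = statistics.pstdev(rep_means)   # stratum replication STD (assumption)
    bias = pop_mean - mu_rs
    te = abs(bias) + sigma_rs
    return {
        "abs_rel_bias": abs(bias) / mu_rs,
        "cv": sigma_rs / mu_rs,
        "rte": te / mu_rs,
    }


# Hypothetical strata: (population mean, replicate means).
strata = {
    "small": (20.0, [18.0, 22.0, 20.0, 24.0]),
    "large": (400.0, [300.0, 500.0, 450.0, 350.0]),
}
stats = {name: stratum_stats(mu, reps) for name, (mu, reps) in strata.items()}
worst = max(stats, key=lambda name: stats[name]["rte"])
print(worst, round(stats[worst]["rte"], 4))
```

Reporting the maximum over strata highlights the worst-served domain, which an overall statistic can mask when a few large strata dominate the total.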
STRATA BOX PLOT: |Bias| by strata ($|\mathrm{Bias}_s|$)
STRATA BOX PLOT: CV of replication means by strata (${}_rCV_s$)
STRATA BOX PLOT: Relative Total Error by strata ($\mathrm{RTE}_s$)
UNIT ALLOCATION TO STRATA: Statistics
Method         RTE      Max($|\mathrm{Bias}_s| / \mu_{r,s}$)   Max(${}_rCV_s$)   Max($\mathrm{RTE}_s$)
isae           0.0067   0.0315                                 0.5664            0.5979
neyman         0.0064   0.0315                                 0.5664            0.5979
aosu1          0.0066   0.0244                                 0.4250            0.4491
aosu3          0.0069   0.0202                                 0.2775            0.2778
aosu9          0.0072   0.0141                                 0.1549            0.1624
uniform        0.0183   0.0226                                 0.4042            0.4152
apsu3          0.0388   0.0582                                 1.0052            1.0094
proportional   0.0563   0.1033                                 1.6645            1.6713
CONCLUDING REMARKS AND OPEN QUESTIONS
Strata allocation: the best proposals seem to be:
Overall population: Neyman
Domain analysis: approaches based on Neyman and strata representativeness constraints
–The AOSU($n_1$) family
–ISAE
They strike a balance between theory and practical constraints.
THANK YOU FOR YOUR ATTENTION!