
Sampling bias in multi-agent simulation (MAS) models
Buysse, J. (1), Frija, A. (1), Van der Straeten, B. (1), Nolte, S. (1), Lauwers, L. (1,2), Claeys, D. (2) and Van Huylenbroeck, G.
(1) Ghent University, Department of Agricultural Economics
(2) Institute for Agricultural and Fisheries Research, Merelbeke, Belgium
122nd EAAE Seminar, Ancona, February 2011
UGent, Faculty of Bioscience Engineering, Department of Agricultural Economics

Overview of the presentation
- Why MAS models?
- Problem statement: the sampling bias in MAS
- Objectives
- Methodology
- Results
- Perspectives

Why MAS models?
- Heterogeneity of opportunities and constraints at the individual level
- Accurate estimation of the distributional effects of policy
- Accurate estimation of agent interactions (spatial effects, TC, propensity to innovate, etc.)
- But…

Problem statement
- The need for full population data
- With sampling, farms in the sample cannot interact with the real-world farms that are left out of the sample

Problem statement
[Figure: full population, showing Farmer i and Farmer j]

Problem statement
[Figure: sample, showing Farmer i and Farmer k]

Problem statement
- Systematic bias when TC between agents are simulated in MAS
- Most empirical MAS models rely on sample data
- Future large-scale MAS models on sampled data!

Problem statement
Illustration of sampling bias on a real model (Van der Straeten et al. 2010)
- A MAS model used to simulate PR exchange between 30,000 farmers in Flanders
- The bias is correlated with the sample size

[Table: bootstrap results; the numeric cells are not preserved in the transcript]
Columns: Bootstrap sample size | Number of repetitions | Average cost simulated | SD | Average cost simulated / Average cost of population
Rows: S = 100 (0.26%) | S = 200 (0.52%) | S = 500 (1.31%) | S = 750 (2%) | Full population (100%)

Objectives
‣ To test, illustrate, and quantify the sampling biases that arise when TC exist
‣ To develop and discuss mechanisms that can remove such sampling biases

Methodology
- Simplified MAS model that minimizes the transport costs of emissions and the cost of emission abatement:

$$\min \sum_n \Big( \sum_m c_{nm}\,\tau_{nm} + \omega_n\,p \Big) \quad \text{s.t.} \quad e_n + \sum_m \tau_{mn} - \sum_m \tau_{nm} \le r_n + \omega_n$$

where
‣ $n$ and $m$ are farm indices,
‣ $\tau_{nm}$ is the amount of emission transported from farm $n$ to farm $m$,
‣ $\omega_n$ is the amount of emission abatement of agent $n$,
‣ $e_n$ is the amount of emission of farm $n$,
‣ $r_n$ is the amount of emission rights of farm $n$,
‣ $c_{nm}$ is the transport cost per unit of emission transported from farm $n$ to farm $m$,
‣ $p$ is the penalty per overused emission right.
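The objective and the balance constraint above can be checked by hand on a very small case. The sketch below is not a solver and is not taken from the presentation: it only evaluates the cost of a candidate solution and tests the constraint for every farm. The two-farm numbers, the function names and the non-negativity check on transports and abatement are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Flows = Dict[Tuple[int, int], float]  # (from farm n, to farm m) -> transported amount


def total_cost(c: List[List[float]], tau: Flows, omega: List[float], p: float) -> float:
    """Objective: sum over n of (sum over m of c[n][m] * tau[n][m]) + omega[n] * p."""
    transport = sum(c[n][m] * amount for (n, m), amount in tau.items())
    return transport + p * sum(omega)


def is_feasible(e: List[float], r: List[float], tau: Flows,
                omega: List[float], tol: float = 1e-9) -> bool:
    """Check e[n] + inflow[n] - outflow[n] <= r[n] + omega[n] for every farm."""
    # Assumption: transports and abatement cannot be negative (not stated on the slide).
    if any(a < 0 for a in tau.values()) or any(w < 0 for w in omega):
        return False
    for n in range(len(e)):
        inflow = sum(a for (_, dst), a in tau.items() if dst == n)
        outflow = sum(a for (src, _), a in tau.items() if src == n)
        if e[n] + inflow - outflow > r[n] + omega[n] + tol:
            return False
    return True


# Hypothetical two-farm example: farm 0 exceeds its rights by 2, farm 1 has 5 spare.
e = [12.0, 5.0]
r = [10.0, 10.0]
c = [[0.0, 1.5], [1.5, 0.0]]
p = 4.0

ship = {(0, 1): 2.0}          # option A: transport the excess to farm 1
no_abatement = [0.0, 0.0]
abate = [2.0, 0.0]            # option B: abate the excess on farm 0

cost_ship = total_cost(c, ship, no_abatement, p)
cost_abate = total_cost(c, {}, abate, p)
print(cost_ship, cost_abate)
```

In this toy case both options satisfy the constraint, and transporting is cheaper than abating because the unit transport cost is below the penalty. If farm 1 is missing from a sample, only the expensive option remains, which is the mechanism behind the sampling bias.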

Methodology
- Applied to synthetic population data (500 farmers); average cost per farm: (value not preserved in the transcript)
‣ Emissions ($e_n$) are randomly sampled from a normal distribution,
‣ Emission rights ($r_n$) are randomly sampled from a normal distribution,
‣ Transport costs ($c_{nm}$) are randomly sampled from a uniform distribution
- We bootstrap on different sample sizes (the list of sizes is not preserved in the transcript)
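A minimal sketch of this resampling set-up follows, under several assumptions that are not in the slides: the distribution parameters, the sample sizes, the number of repetitions and the seeds are all made up, and samples are drawn without replacement. To keep the example free of an optimization solver, transport costs are set to zero, in which case the model reduces to abating the aggregate excess of emissions over rights, so the cost per farm is `penalty * max(0, sum(e) - sum(r)) / n`. It is a stand-in for the full model, not the authors' implementation.

```python
import random
import statistics


def make_population(n_farms=500, seed=1):
    """Synthetic farms as (emission, emission right) pairs drawn from normal distributions."""
    rng = random.Random(seed)
    emissions = [max(0.0, rng.gauss(100.0, 20.0)) for _ in range(n_farms)]
    rights = [max(0.0, rng.gauss(110.0, 20.0)) for _ in range(n_farms)]
    return list(zip(emissions, rights))


def cost_per_farm(farms, penalty=1.0):
    """Zero-transport-cost special case: only the aggregate excess has to be abated."""
    excess = sum(e for e, _ in farms) - sum(r for _, r in farms)
    return penalty * max(0.0, excess) / len(farms)


def bootstrap(population, sample_size, repetitions=200, seed=2):
    """Mean and standard deviation of the simulated cost over repeated random samples."""
    rng = random.Random(seed)
    costs = [cost_per_farm(rng.sample(population, sample_size))
             for _ in range(repetitions)]
    return statistics.mean(costs), statistics.stdev(costs)


population = make_population()
population_cost = cost_per_farm(population)
results = {size: bootstrap(population, size) for size in (10, 50, 250)}

print("full population:", population_cost)
for size, (mean_cost, sd_cost) in results.items():
    print(size, mean_cost, sd_cost)
```

Even in this stripped-down case, small samples tend to report a positive average cost although the full population has none, because some samples happen to hold more emissions than rights.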

Results
[Figure; labels: "Average cost of 27.7", "Variations"]

Results
- The order of magnitude of the sampling bias can be very large
- Nonlinear effect of the sample size on the bias
Cause:
- Subsamples do not always satisfy the real population balance
- Motivation for sampling bias correction via macrobalance coefficients
- The amount of emission is smaller than the total amount of emission rights ($\sum_n e_n < \sum_n r_n$)
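One way to read the macrobalance idea above is to rescale each sample so that its ratio of total emissions to total emission rights equals the ratio known for the full population. The slides do not define the coefficient, so the sketch below is an assumption about its form, and the population parameters, sample size, seed and function names are hypothetical.

```python
import random


def emission_ratio(farms):
    """Total emissions divided by total emission rights."""
    return sum(e for e, _ in farms) / sum(r for _, r in farms)


def macrobalance_coefficient(sample, population_ratio):
    """Scaling factor that moves the sample ratio onto the population ratio (assumed form)."""
    return population_ratio / emission_ratio(sample)


def apply_macrobalance(sample, coefficient):
    """Rescale emissions only; emission rights are left untouched."""
    return [(e * coefficient, r) for e, r in sample]


rng = random.Random(3)
# Hypothetical population in which total emissions are below total rights.
population = [(max(1.0, rng.gauss(100.0, 20.0)), max(1.0, rng.gauss(110.0, 20.0)))
              for _ in range(500)]
population_ratio = emission_ratio(population)

sample = rng.sample(population, 20)
coefficient = macrobalance_coefficient(sample, population_ratio)
corrected = apply_macrobalance(sample, coefficient)

print(population_ratio, emission_ratio(sample), emission_ratio(corrected))
```

After the correction the sample reproduces the population balance by construction, so it again satisfies the condition that total emissions stay below total rights. Only the aggregate ratio of the population is needed, not farm-level population data.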

Results
[Figure; labels: "Average cost of 27.7", "Average costs of samples"]

Remove remaining bias with calibration
- Calibration is the comparison of two measurements:
  - the measurement of a device with known correctness (the full population model)
  - is used to correct another measurement made by another device (the sample model)
  - once calibrated, the second device can make correct measurements (the sample model can be used for simulations)
- Resampling data are used to estimate the calibration function: a prediction of the bias as a function of the sample size
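The calibration step can be sketched as follows: fit a polynomial of the average simulated cost on the sample size, read the predicted bias as the difference between the fitted value and the full-population cost, and subtract it from a sample-based result. The numbers below are toy values generated from a made-up quadratic, not the presentation's results, the sample size is expressed as a share of the population to keep the fit well scaled, and a degree of 2 is used only to keep the example short.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first) via the normal equations."""
    k = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, k):
            factor = a[row][col] / a[col][col]
            for j in range(col, k):
                a[row][j] -= factor * a[col][j]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, k))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def toy_cost(share):
    """Made-up average simulated cost as a function of the sample share (illustration only)."""
    return 10.0 + 5.0 * (1.0 - share) ** 2


shares = [0.1 * i for i in range(1, 11)]        # sample size as a share of the population
average_costs = [toy_cost(s) for s in shares]   # would come from resampling in practice
full_population_cost = toy_cost(1.0)            # reference from the full population model

coeffs = polyfit(shares, average_costs, degree=2)


def calibrated(sample_cost, share):
    """Subtract the predicted bias for this sample share from a sample-based cost."""
    predicted_bias = predict(coeffs, share) - full_population_cost
    return sample_cost - predicted_bias


print(coeffs)
print(calibrated(toy_cost(0.2), 0.2))
```

The sketch relies on a full-population reference value, which matches the caveat in the conclusion that such corrections are not possible without full population data.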

Results
Coefficients of the polynomial of the simulated average costs on the sample size
(cells marked … are truncated or lost in the transcript)

Term               Estimate    Std. Error  t value  Pr(>|t|)  Signif.
(sample size)^0    6.11E+05    2.68E…      …        < 2e-16   ***
(sample size)^1    -2.53E+04   2.62E…      …        < 2e-16   ***
(sample size)^2    5.64E+02    9.35E…      …        …e-09     ***
(sample size)^3    -7.46E+00   1.69E…      …        …e-05     ***
(sample size)^4    6.24E-02    1.78E…      …        …         ***
(sample size)^5    -3.42E-04   1.16E…      …        …         **
(sample size)^6    1.24E-06    4.84E…      …        …         *
(sample size)^7    -2.95E-09   1.29E…      …        …         *
(sample size)^8    4.42E-12    2.13E…      …        …         *
(sample size)^9    -3.78E-15   1.98E…      …        …
(sample size)^10   1.41E-18    7.91E…      …        …

Results
[Figure; label: "Average cost of 27.7"]

Conclusion
- Macrobalance correction is very useful
  - Only macrobalance is necessary
  - Also useful in models without heterogeneous interactions
- Calibration correction is promising
  - such corrections are not possible if we do not have full population data
  - necessity to assign correction factors based on information available in sample datasets
- Corrected sampling in MAS is important
  - more complex analyses become possible
  - more datasets at sample level could be used
  - MAS can be applied in large-scale empirical models

Further research
- Check for:
  - Impact on variance
  - Impact of changes in model structure
  - Impact of using a synthetic full population as calibration reference
- Search for:
  - Calibration correction without availability of full population data (see first attempts in paper)

THANK YOU