Celine Dondeynaz, Joint Research Centre, Italy and University of Liverpool. Dr C.Camona-moreno, Prof D.Chen, A. Leone PhD. Vienna, EGU, 5 April 2011. Inter-relationships among water, governance, human development variables in developing countries.

Presentation transcript:

Inter-relationships among water, governance, human development variables in developing countries
Celine Dondeynaz, Joint Research Centre, Italy and University of Liverpool
Dr C.Camona-moreno, Prof D.Chen, A. Leone PhD
Vienna, EGU, 5 April 2011
Photo: pit latrine in Lalibela, Ethiopia, C.Dondeynaz

Presentation structure
1. Thematic and objective
2. Database building: data collection, data framework, data formatting
3. African dataset coherence: PCA analysis, linear regression analysis
4. Extension of the dataset
Photo: South Africa, EuropeAid, F.Lefèbvre

Theme and questions
The efficiency of water supply and sanitation (WSS) management in a given developing country = a combination of a wide range of variables¹ => a complex and cross-cutting issue
OBJECTIVE: better understand the key elements involved in improved WSS management.
Main QUESTIONS
1. Are the different variables and data coherent enough to establish spatial-temporal behaviours?
2. Can measurable protocols/models be established, and can patterns be extrapolated in time?
¹ Integrated water resources management principles laid down at the International Conference on Water and the Environment held in Dublin in January 1992

Data collection
- International data providers: UNEP, FAO, JRC, WB ...
- Scale: national (country) level, worldwide
- Time series: a consistency issue that requires strict examination of data coherence and methodologies; year of reference
Variable selection criteria
- Relevance: potential role regarding water supply and sanitation
- Data availability: enough observations
- Reliability: produced by trusted providers, with described methods
132 indicators analysed => shortlist of 53 indicators

Data framework
Environmental cluster
- Water resources availability (Water Poverty Index, water stress, water bodies...)
- Land cover indicators (dryland coverage, biodiversity index...)
Human pressure cluster
- Activities pressure (water demand, irrigation level, industrial pollution, production indexes...)
- Demographic pressure (growth, urban-rural distribution)
Accessibility to WSS cluster
- Population access to sanitation
- Population access to water supply
Country well-being cluster
- Health indicators (water-borne disease, mortality, life expectancy...)
- Poverty indicators (HDI, national poverty index, education level...)
- Education indicators
Official development aid flow: global and WSS ODA
Governance cluster
- Stability and level of violence, government effectiveness, rule of law, regulatory quality, control of corruption

Data formatting
Process
1. Normalization
2. Missing data treatment: imputation
Step 1: Variable normalization
Standard normalization (SQRT, LOG, OLS) is not possible on the worldwide dataset because of strongly heterogeneous behaviour among countries
=> As a preliminary phase, restriction to Africa = 52 countries
Test of what?
- Missing data methods
- Methods used for data coherency
- Foreseen modelling methods
Normalization issue: processing the tails of the distributions
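The normalization step can be illustrated with a minimal sketch. It assumes strictly positive indicator values and covers only the SQRT and LOG transforms named on the slide; choosing the transform by smallest absolute skewness is an assumption made here for illustration, and the function names are hypothetical.

```python
import numpy as np

def skewness(x):
    """Sample skewness (third standardized moment)."""
    x = np.asarray(x, dtype=float)
    c = x - x.mean()
    return float(np.mean(c ** 3) / np.mean(c ** 2) ** 1.5)

def best_transform(x):
    """Pick, among identity, sqrt and log, the transform whose output is least skewed.

    Assumption: x is strictly positive, so that log is defined.
    """
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("this sketch assumes strictly positive values")
    candidates = {"none": x, "sqrt": np.sqrt(x), "log": np.log(x)}
    name = min(candidates, key=lambda k: abs(skewness(candidates[k])))
    return name, candidates[name]

# Usage on synthetic, right-skewed data (not the indicators from the talk)
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=0.0, sigma=1.0, size=2000)
chosen, transformed = best_transform(sample)
```

A long right tail, as in the synthetic sample, is the typical case where a log transform brings the distribution closer to symmetric.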

Data formatting
Step 2: Missing data treatment
Objective: qualitative approach => find the order of magnitude rather than the exact value
Method: Expectation-Maximization algorithm combined with bootstraps (EMB)¹
Assumptions:
- the complete data (that is, both observed and unobserved) are multivariate normal
- the data are missing at random (MAR)
STEP-by-STEP imputation process, starting from the ones with the least missing data and moving to the more incomplete ones.
¹ Amelia II software is provided by Honaker James, King Gary, Blackwell Matthew,
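The idea behind EMB can be sketched as follows. This is not the Amelia II implementation: it is a simplified illustration that bootstraps the rows, fits a multivariate normal by EM on each bootstrap sample, and fills the gaps with conditional means (Amelia II draws imputations rather than using means). All function names are hypothetical.

```python
import numpy as np

def conditional_fill(row, mu, sigma):
    """Fill NaNs in one row with their conditional mean given the observed entries.

    Returns the filled row and the conditional covariance of the missing
    entries, zero-padded to the full p x p shape.
    """
    m = np.isnan(row)
    o = ~m
    cov = np.zeros_like(sigma)
    if not m.any():
        return row.copy(), cov
    if not o.any():
        return mu.copy(), sigma.copy()
    # Regression coefficients of the missing block on the observed block
    B = sigma[np.ix_(m, o)] @ np.linalg.pinv(sigma[np.ix_(o, o)])
    filled = row.copy()
    filled[m] = mu[m] + B @ (row[o] - mu[o])
    cov[np.ix_(m, m)] = sigma[np.ix_(m, m)] - B @ sigma[np.ix_(o, m)]
    return filled, cov

def em_mvn(X, n_iter=30):
    """EM estimate of mean and covariance of a multivariate normal with NaNs."""
    n = X.shape[0]
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0) + 1e-8)
    for _ in range(n_iter):
        rows, covs = zip(*(conditional_fill(r, mu, sigma) for r in X))
        filled = np.vstack(rows)
        mu = filled.mean(axis=0)
        centered = filled - mu
        sigma = (centered.T @ centered + sum(covs)) / n
    return mu, sigma

def emb_impute(X, n_boot=5, seed=0):
    """Return n_boot completed copies of X, one per bootstrap EM fit."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    completed = []
    for _ in range(n_boot):
        sample = X[rng.integers(0, len(X), size=len(X))]
        mu, sigma = em_mvn(sample)
        completed.append(np.vstack([conditional_fill(r, mu, sigma)[0] for r in X]))
    return completed
```

The spread of an imputed value across the completed copies gives a rough sense of how well it is determined, which fits the stated goal of recovering an order of magnitude rather than an exact value.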

Dataset coherency verification
1. Principal Component Analysis (PCA): checking the coherence of variable relationships
Coherency of the dataset on Africa
Figure: the first two PCA factors of the variables (cumulative variability 43.02%), with variable groups 1 to 4 marked
Adjusted R² = (3 components)
On the F1 axis, groups 1-2 represent society development and poverty
On the F2 axis, groups 3-4 represent the balance between water demand and resources
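A PCA of this kind can be sketched with NumPy alone. The sketch assumes a complete, numeric country-by-indicator matrix and standardizes each variable first; the function name is hypothetical and the software used for the talk is not stated on the slide.

```python
import numpy as np

def pca(X, n_components=2):
    """PCA on standardized variables via singular value decomposition.

    Returns component scores, variable loadings (correlations between
    variables and components) and the explained-variance ratio of every
    component.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T * s[:n_components] / np.sqrt(len(Z) - 1)
    return scores, loadings, explained

# Usage on synthetic data: 52 rows (countries) by 6 columns (indicators)
rng = np.random.default_rng(0)
base = rng.normal(size=(52, 2))
data = np.hstack([base, base @ rng.normal(size=(2, 4))])
data = data + 0.3 * rng.normal(size=data.shape)
scores, loadings, explained = pca(data, n_components=2)
cumulative = explained[:2].sum()  # share of variability carried by F1 and F2
```

Plotting the two columns of `loadings` against each other gives the kind of variable map described on the slide, where variables that sit close together on an axis behave alike.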

Dataset coherency verification
2. Linear regression
Objectives:
- Look for incoherent behaviours
- Test whether linear models could be used at a later stage
Water supply coverage and sanitation coverage are analysed separately.
The coherency of the final model relies on:
- the significance of the variables
- the confidence intervals
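The quantities used to judge such a model can be computed as in the sketch below. It is an illustration only: the function name is hypothetical, and the confidence interval uses a normal quantile as an approximation, where statistical packages normally use the Student t distribution.

```python
import numpy as np
from statistics import NormalDist

def ols_summary(X, y, level=0.95):
    """Ordinary least squares with standard errors, t statistics,
    approximate confidence intervals and adjusted R-squared."""
    A = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    y = np.asarray(y, dtype=float)
    n, k = A.shape
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - k
    s2 = resid @ resid / dof                      # residual variance
    se = np.sqrt(np.diag(s2 * np.linalg.inv(A.T @ A)))
    z = NormalDist().inv_cdf(0.5 + level / 2)     # normal approximation
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return {
        "B": beta,
        "std_error": se,
        "t": beta / se,
        "lower": beta - z * se,
        "upper": beta + z * se,
        "adj_r2": 1 - (1 - r2) * (n - 1) / dof,
    }

# Usage on synthetic data with 52 observations
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(52, 2))
y_demo = 1 + 2 * X_demo[:, 0] - 3 * X_demo[:, 1] + 0.1 * rng.normal(size=52)
result = ols_summary(X_demo, y_demo)
```

A coefficient whose interval includes zero, or whose interval is very wide, is the kind of incoherent behaviour this check is meant to surface.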

Preliminary phase on Africa
ANOVA with stepwise method
Dependent variable: water supply access level (AIWS)
Adjusted R² =
Standard parameters of the final model (Model 1)
Table columns: Unstandardized Coefficients (B, Std. Error); Standardized Coefficients (Beta); t; Sig.; 95% Confidence Interval for B (Lower Bound, Upper Bound)
Table rows: (Constant); Children mortality under 5 years; Environmental governance level; Withdrawal industrial
a. Dependent Variable: TOT.AIS.2004

Preliminary phase on Africa
ANOVA with stepwise method
Dependent variable: sanitation access level (AIS)
Adjusted R² =
Standard parameters of the final model (Model 5)
Table columns: Unstandardized Coefficients (B, Std. Error); Standardized Coefficients (Beta); t; Sig.; 95% Confidence Interval for B (Lower Bound, Upper Bound)
Table rows: (Constant); Health expenditure; Water use intensity in agriculture; Urban pop level; Environmental gov; Corruption perception index
a. Dependent Variable: TOT.AIS.2004
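Stepwise variable selection can be sketched as a forward search. The slide does not state the entry criterion that was used, so this sketch swaps in a simple one: a variable enters only if it raises adjusted R² by at least `min_gain`. Function names and the threshold are hypothetical.

```python
import numpy as np

def adjusted_r2(X, y):
    """Adjusted R-squared of an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    n, k = A.shape
    return 1 - (1 - r2) * (n - 1) / (n - k)

def forward_stepwise(X, y, min_gain=0.01):
    """Add, one at a time, the column that most improves adjusted R-squared."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = {j: adjusted_r2(X[:, selected + [j]], y) for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best < min_gain:
            break  # no candidate improves the model enough
        selected.append(j_best)
        remaining.remove(j_best)
        best = scores[j_best]
    return selected, best

# Usage on synthetic data: only columns 0 and 3 drive the response
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(52, 6))
y_demo = 3 * X_demo[:, 0] - 2 * X_demo[:, 3] + 0.1 * rng.normal(size=52)
chosen, score = forward_stepwise(X_demo, y_demo)
```

With few observations relative to the number of candidate variables, any stepwise search can pick up spurious predictors, so the selected set is best read as indicative.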

Conclusions of the preliminary phase on Africa
Good points:
1. The dataset is coherent, IF the data are considered qualitative/estimates
2. Linear models explain most of the variability
Limits:
1. Too few observations (52 countries) relative to the number of variables (45 variables)
2. Part of the variability (38%) remains unexplained in both cases => complex relationships between variables

Extension of the dataset
SOLVING POINT 1: too few observations
Available options:
1. Increasing the number of observations
2. Grouping variables
We start with option 1: increasing the dataset by adding countries whose behaviour is similar to that of African countries
-> clustering the worldwide list of countries
-> using different Agglomerative Hierarchical Clustering (AHC) methods with several distances
-> looking at the stability of the results
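The clustering and stability check can be sketched as follows. This is a naive illustration written for small datasets: the linkage rules (single, complete, average), the two distances and the use of the Rand index to compare partitions are assumptions, since the slide does not list the methods or distances that were tried.

```python
import numpy as np

def ahc_labels(X, n_clusters, linkage="average", metric="euclidean"):
    """Naive agglomerative hierarchical clustering; returns one label per row."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    if metric == "euclidean":
        D = np.sqrt((diff ** 2).sum(axis=2))
    elif metric == "manhattan":
        D = np.abs(diff).sum(axis=2)
    else:
        raise ValueError("unknown metric")
    reduce = {"single": np.min, "complete": np.max, "average": np.mean}[linkage]
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = reduce(D[np.ix_(clusters[a], clusters[b])])
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)      # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = np.empty(len(X), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

def rand_index(l1, l2):
    """Share of point pairs on which two partitions agree (1.0 = identical)."""
    same1 = l1[:, None] == l1[None, :]
    same2 = l2[:, None] == l2[None, :]
    iu = np.triu_indices(len(l1), k=1)
    return float((same1[iu] == same2[iu]).mean())

# Usage on two synthetic, well-separated groups of 10 points each
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.5, size=(10, 2)), rng.normal(10, 0.5, size=(10, 2))])
runs = [ahc_labels(pts, 2, linkage=l, metric=m)
        for l in ("single", "complete", "average")
        for m in ("euclidean", "manhattan")]
agreement = [rand_index(runs[0], r) for r in runs[1:]]
```

Countries that land in the same cluster as African countries under most method and distance combinations would be the more defensible candidates for extending the dataset.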

Thank you for your attention. Questions?