4.2.3 Data Quality, Composite Indicators and Aggregation (UPA Package 4, Module 2)

Presentation transcript:

Slide 1: Data Quality, Composite Indicators and Aggregation (UPA Package 4, Module 2)

Slide 2: Data Quality, Composite Indicators and Aggregation
- Data errors
- Less is more; benefit-cost of data
- Data cleaning
- Composite indicators
- Introduction exercise: exploring data sets
- Introduction exercise: aggregation

Slide 3: Data Errors
- Biased data
- Outliers: error or extreme value
- Sample too small
- Too much precision or regularity (too good to be true)
- Missing values
- Inconsistencies
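Some of the data errors listed above can be screened for automatically. The following is a minimal sketch, not a method prescribed by the slides: it counts missing values, flags candidate outliers with the common 1.5 x IQR rule (an assumed choice), and flags one kind of inconsistency. The income figures, the field names `household_size` and `children`, and all function names are hypothetical.

```python
from statistics import quantiles


def missing_count(values):
    """Count entries recorded as None (missing values)."""
    return sum(1 for v in values if v is None)


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]; None entries are skipped."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in present if v < low or v > high]


def inconsistent_households(records):
    """Flag records whose fields contradict each other (more children than members)."""
    return [r for r in records if r["children"] > r["household_size"]]


# Hypothetical monthly incomes; 9500 may be an error or a genuine extreme value.
incomes = [210, 250, 240, None, 230, 260, 9500, 245, None, 255]
# Hypothetical household records; record 2 is internally inconsistent.
households = [
    {"id": 1, "household_size": 4, "children": 2},
    {"id": 2, "household_size": 2, "children": 5},
]

print("missing:", missing_count(incomes))
print("candidate outliers:", iqr_outliers(incomes))
print("inconsistent ids:", [r["id"] for r in inconsistent_households(households)])
```

A flagged value is only a candidate: whether it is an error or a real extreme value still has to be decided by checking the source.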

Slide 4: Less is More; Benefit and Cost of Data
- Quality (full coverage and maintenance)
- Quantity (many variables, but missing values and outdated)
- Physical characteristics of a building
- Ownership characteristics of a building

Slide 5: Benefit and Cost of Data
Data benefit and costs
- Strategy and clear objectives for developing databases
- Data (and functionalities) requirement study
- Data benefit: the value of information and its quantification
  - cost reduction, effectiveness/priorities of (public) resource allocation
  - transparency, awareness, involvement
- Data costs are high (acquisition, editing, conversion, updates, maintenance)

Slide 6: Benefit and Cost of Data
Primary and secondary data, data sharing
- Primary: ad-hoc, single use of data, (too) expensive
- Secondary: matching with requirements for poverty studies
- Combination of existing data and samples
- Data collection embedded into institutional settings: from data projects to data processes

Slide 7: Composite Indicators
- Poverty without reliable income data
- Slums Composite Indicator
- Human Development Indicator, Poverty Index
- Proxy indicators (consumption / income)
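As an illustration of how proxy indicators can be combined when income data are unreliable, here is a minimal sketch of one common construction: each indicator is rescaled to the 0 to 1 range (min-max scaling) and the scaled values are averaged with equal weights. The slides do not specify a construction method, so the scaling, the weighting, the indicator names and the numbers are all assumptions.

```python
def min_max(values):
    """Rescale values to the range 0..1; a constant series maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_index(indicators, weights=None):
    """Weighted sum of min-max scaled indicators, one score per area.

    indicators: dict mapping indicator name -> list of values (one per area).
    weights: dict mapping indicator name -> weight; equal weights if omitted.
    """
    names = list(indicators)
    if weights is None:
        weights = {name: 1 / len(names) for name in names}
    scaled = {name: min_max(indicators[name]) for name in names}
    n_areas = len(indicators[names[0]])
    return [
        sum(weights[name] * scaled[name][i] for name in names)
        for i in range(n_areas)
    ]


# Hypothetical deprivation proxies for three areas (higher = more deprived).
proxies = {
    "no_piped_water": [0.10, 0.40, 0.70],
    "persons_per_room": [1.0, 2.0, 3.0],
    "non_durable_housing": [0.05, 0.25, 0.45],
}
scores = composite_index(proxies)
print([round(s, 3) for s in scores])
```

All indicators must point in the same direction (here, higher means more deprived) before they are combined; the choice of weights is a judgment that should be stated alongside the index.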

Slide 8: Composite Indicators

Slide 9: Aggregation
- Aggregate cases into a single summary case
- A break variable defines each group, and one summary case is created per group (e.g. per neighborhood)
- Aggregate functions: summary, fractions
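The aggregation step described above can be sketched as follows, assuming household cases stored as dictionaries. The break variable `neighborhood`, the fields `income` and `piped_water`, and the function name are hypothetical; the summary functions (count, sum, mean) and the fraction are examples of the two kinds of aggregate function named on the slide.

```python
from collections import defaultdict


def aggregate(cases, break_var, value_var, flag_var):
    """Collapse cases into one summary case per value of the break variable."""
    groups = defaultdict(list)
    for case in cases:
        groups[case[break_var]].append(case)

    summary = {}
    for key, members in groups.items():
        values = [m[value_var] for m in members]
        summary[key] = {
            "n": len(members),                      # number of cases in the group
            "sum": sum(values),                     # summary function
            "mean": sum(values) / len(values),      # summary function
            # fraction of cases in the group for which the flag is true
            "fraction": sum(1 for m in members if m[flag_var]) / len(members),
        }
    return summary


# Hypothetical household cases.
households = [
    {"neighborhood": "A", "income": 100, "piped_water": True},
    {"neighborhood": "A", "income": 200, "piped_water": False},
    {"neighborhood": "A", "income": 300, "piped_water": True},
    {"neighborhood": "B", "income": 400, "piped_water": True},
    {"neighborhood": "B", "income": 600, "piped_water": True},
]
by_neighborhood = aggregate(households, "neighborhood", "income", "piped_water")
for name, row in by_neighborhood.items():
    print(name, row)
```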

Slide 10: Small Area Statistics
- Limited (existing) data, limited funds for data collection
- Sample survey and auxiliary data sets (+ analytical skills) = small area statistics
- By developing a model of the relationship between the survey and the auxiliary data, more reliable estimates can be made, and estimates can be extrapolated to areas not covered by a household survey
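The idea of modelling the survey-auxiliary relationship can be illustrated with a deliberately simplified sketch: a one-variable least-squares line fitted on surveyed areas and then applied to areas outside the survey. Real small area estimation uses richer models and reports the uncertainty of each estimate; the variable choice (share of durable housing from a census as the auxiliary variable) and all numbers here are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x_new):
    """Apply the fitted line to auxiliary values of areas outside the survey."""
    return [intercept + slope * xi for xi in x_new]


# Surveyed areas: auxiliary variable (census) and mean consumption (survey).
durable_share = [0.2, 0.4, 0.6, 0.8]
mean_consumption = [110, 150, 190, 230]

slope, intercept = fit_line(durable_share, mean_consumption)

# Areas not covered by the household survey; only the census value is known.
unsurveyed_share = [0.5, 0.9]
estimates = predict(slope, intercept, unsurveyed_share)
print(round(slope, 2), round(intercept, 2), [round(e, 1) for e in estimates])
```

Extrapolation is only as good as the model: estimates for areas whose auxiliary values lie outside the surveyed range deserve extra caution.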

Slide 11: Introduction Exercise: Exploring Datasets
- Classify interval data (number of foreigners, income, family size) into meaningful groups (e.g. low income, medium income, high income).
- Create cross tables and analyze the relationships between the resulting ordinal variables.
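The two steps of the exercise can be sketched in a few lines: classify each interval value by comparing it with a list of upper class limits, then count the combinations of classes. The class limits, labels and municipality figures below are hypothetical and are not taken from the exercise data set.

```python
from collections import Counter


def classify(value, upper_limits, labels):
    """Return the label of the first class whose upper limit is >= value.

    labels must have one more entry than upper_limits; the last label
    covers everything above the highest limit.
    """
    for limit, label in zip(upper_limits, labels):
        if value <= limit:
            return label
    return labels[-1]


def cross_table(rows, cols):
    """Count how often each (row class, column class) pair occurs."""
    return Counter(zip(rows, cols))


labels = ["Low", "Medium", "High"]

# Hypothetical municipalities: (mean income, mean house value), in thousands.
municipalities = [(18, 120), (25, 200), (35, 300), (22, 140), (31, 260)]

income_class = [classify(inc, [20, 30], labels) for inc, _ in municipalities]
house_class = [classify(val, [150, 250], labels) for _, val in municipalities]

table = cross_table(house_class, income_class)
for house in labels:
    print(house, [table[(house, income)] for income in labels])
```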

Slide 12: Introduction Exercise
Cross table (mean income x mean house value), municipalities in the Netherlands
- Count table: rows housecl (Low, Medium, High, Total); columns incomecl (Low, Medium, High, Very High, Total)
- Symmetric measures: Pearson's R (interval by interval), Spearman correlation (ordinal by ordinal); N of valid cases: 538
- Reported columns: Value, Asymp. Std. Error (a), Approx. T (b), Approx. Sig. (c)
- a. Not assuming the null hypothesis. b. Using the asymptotic standard error assuming the null hypothesis. c. Based on normal approximation.
[Cell counts and statistic values are not preserved in the transcript.]
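The two symmetric measures named on this slide can be computed as sketched below. Pearson's r uses the values themselves; Spearman's correlation is Pearson's r applied to ranks, with tied values given their average rank. The sample data are hypothetical, and the sketch does not reproduce the standard errors or significance values of the original statistical output.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson's r computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data with a monotonic but non-linear relationship.
income = [1, 2, 3, 4, 5]
house_value = [1, 4, 9, 16, 25]
print(round(pearson(income, house_value), 3), round(spearman(income, house_value), 3))
```

For a monotonic but curved relationship such as this one, the rank-based measure reaches 1 while Pearson's r stays below 1, which is one reason to report both.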

Slide 13: Introduction Exercise: Aggregation
Central Bureau of Statistics of the Netherlands, three main spatial units:
- Municipalities (n=538)
- Districts (n=2382)
- Neighbourhoods (n=10737)
Aggregation, summarizing data: why and what
- Spatially homogeneous versus heterogeneous variables
- Which statistics to use (mean or other statistical figures)
- Simple and weighted aggregates
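The difference between a simple and a weighted aggregate shows up when neighbourhood means are combined into a district figure. In this minimal sketch the weights are household counts, which is an assumed but typical choice; the figures and function names are hypothetical.

```python
def simple_mean(values):
    """Unweighted mean: every neighbourhood counts equally."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Weighted mean: each neighbourhood counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical district made up of three neighbourhoods.
mean_income = [20, 30, 40]      # mean household income per neighbourhood
households = [100, 300, 600]    # number of households per neighbourhood

print(simple_mean(mean_income))
print(weighted_mean(mean_income, households))
```

The simple aggregate treats a neighbourhood of 100 households the same as one of 600; the weighted aggregate recovers the mean over all households in the district.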

Slide 14: Introduction Exercise 4.3.2

Slide 15: Introduction Exercise 4.3.2

Slide 16: Introduction Exercise 4.2.3

Slide 17: Introduction Exercise 4.2.3