SADC Course in Statistics Assessing data critically Module B1 Session 17.



Objectives

At the end of this session the students will be able to:
- Apply basic techniques for error detection
- Ask relevant questions that allow for the explanation or correction of discrepancies

Detecting errors in primary data

Checks to detect errors in primary data should be made at various stages:
- Immediately after data collection (and during data entry)
- After data computerisation
- During exploratory data analysis

Checking for errors after data collection

- Have all questions been answered? If not, are the reasons for non-response clear?
- Are recorded values within their expected range?
- Do all questions or items have meaningful entries? Are they internally consistent?
- Are any zero entries genuinely zeros?
- Are IDs unique?
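The questions above can be turned into simple automated checks. The following is a minimal sketch only: the records, field names and plausible ranges are all hypothetical and would need to be replaced by those of the actual questionnaire.

```python
# Hypothetical questionnaire records; field names and values are illustrative only.
records = [
    {"id": 1, "age": 34, "hh_size": 5, "meals_per_day": 3},
    {"id": 2, "age": 210, "hh_size": 4, "meals_per_day": 2},    # age out of range
    {"id": 2, "age": 41, "hh_size": None, "meals_per_day": 0},  # duplicate ID, missing answer, zero
]

# Assumed plausible ranges (inclusive) for each numeric item.
EXPECTED_RANGE = {"age": (0, 120), "hh_size": (1, 30), "meals_per_day": (0, 6)}


def check_records(records, expected_range):
    """Return a list of (record id, field, problem) tuples."""
    problems = []
    seen_ids = set()
    for rec in records:
        rid = rec["id"]
        # Are IDs unique?
        if rid in seen_ids:
            problems.append((rid, "id", "duplicate ID"))
        seen_ids.add(rid)
        for field, (low, high) in expected_range.items():
            value = rec.get(field)
            if value is None:
                # Has the question been answered?
                problems.append((rid, field, "missing answer"))
            elif not low <= value <= high:
                # Is the recorded value within its expected range?
                problems.append((rid, field, "out of range"))
            elif value == 0:
                # Zero is allowed here, but should be confirmed as a genuine zero.
                problems.append((rid, field, "zero entry to confirm"))
    return problems


for problem in check_records(records, EXPECTED_RANGE):
    print(problem)
```

Each flagged item is a prompt to go back to the questionnaire or the enumerator, not an automatic correction.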

Checking for errors after data entry

Compute new (temporary) variables to check if:
- Rates recorded per 1000 of population are less than 1000
- Percentages expected to be less than 100% are indeed so
- There is internal consistency amongst variables, and between tables. For example:
  - the date of interviewing should be earlier than the date when the supervisor checked the questionnaire
  - totals are consistent across different tables, and sub-totals add to overall totals
- Codes for missing values have been identified correctly according to their reason for missing, and have been set as missing in the database to be used for analysis
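As an illustration of these post-entry checks, the sketch below applies them to one data row at a time. The variable names, the missing-value codes (-9 and -8) and the example rows are assumptions made for this example, not part of the course material.

```python
from datetime import date

# Assumed coding scheme for missing values (hypothetical).
MISSING_CODES = {-9: "refused", -8: "not applicable"}


def recode_missing(value, missing_codes=MISSING_CODES):
    """Return None for a missing-value code so it is excluded from analysis."""
    return None if value in missing_codes else value


def entry_checks(row):
    """Return the names of the checks that a data row fails."""
    failed = []
    rate = recode_missing(row["rate_per_1000"])
    if rate is not None and not 0 <= rate <= 1000:
        failed.append("rate_per_1000")
    pct = recode_missing(row["percent"])
    if pct is not None and not 0 <= pct <= 100:
        failed.append("percent")
    # The interview should not come after the supervisor's check
    # (a same-day check is allowed here, which is an assumption).
    if row["interview_date"] > row["supervisor_check_date"]:
        failed.append("date_order")
    # Sub-totals should add to the overall total.
    if sum(row["subtotals"]) != row["total"]:
        failed.append("subtotals")
    return failed


good_row = {
    "rate_per_1000": 42.5, "percent": 87.0,
    "interview_date": date(2007, 3, 1), "supervisor_check_date": date(2007, 3, 3),
    "subtotals": [120, 80], "total": 200,
}
bad_row = {
    "rate_per_1000": 1250, "percent": -9,  # -9 is treated as missing, not as a value
    "interview_date": date(2007, 3, 5), "supervisor_check_date": date(2007, 3, 3),
    "subtotals": [120, 80], "total": 210,
}

print(entry_checks(good_row))
print(entry_checks(bad_row))
```

Recoding the missing-value codes before the range checks matters: otherwise a code such as -9 would be reported as an impossible percentage rather than as a missing answer.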

Tips for error detection

- Look for counts or categories that do not make sense
- If you have a series of data in chronological order, look for jumps in the data. They may be errors
- Always check your totals
  - Make sure they add to the expected total (e.g. 100%)
  - When looking at multiple tables in a single study, the sample size should be consistent in all tables
- What is expected to tally should tally!
- Don't just look at the numbers, look at the definitions that the numbers represent
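Two of these tips lend themselves to a quick automated screen. The sketch below flags sudden jumps in a chronological series and checks that parts add to an expected total. The threshold (`max_ratio`), the rounding tolerance and the example figures are arbitrary assumptions chosen for illustration.

```python
def find_jumps(series, max_ratio=2.0):
    """Return positions where a value is more than max_ratio times,
    or less than 1/max_ratio of, the value before it."""
    jumps = []
    for i in range(1, len(series)):
        prev, curr = series[i - 1], series[i]
        if prev > 0 and curr > 0:
            ratio = curr / prev
            if ratio > max_ratio or ratio < 1 / max_ratio:
                jumps.append(i)
    return jumps


def adds_to(parts, expected_total, tolerance=0.5):
    """Check that parts sum to the expected total, allowing for rounding."""
    return abs(sum(parts) - expected_total) <= tolerance


# Fictitious monthly counts: the fourth value looks like a keying error.
monthly_counts = [120, 125, 118, 1210, 122]
print(find_jumps(monthly_counts))

# Fictitious percentage breakdowns.
print(adds_to([33.3, 33.3, 33.4], 100))
print(adds_to([40, 35, 30], 100))
```

A flagged jump is only a candidate error. It may equally reflect a real change or a change in definition, which is why the definitions behind the numbers need checking as well.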

Checks during Exploratory Data Analysis

Simple one-way or two-way tables can help identify errors.

(a) Results are from a socio-economic survey in Uganda. Are these results reasonable?

Average number of meals taken by HH in past week    Frequency
Total                                               9652

Checks during Exploratory Data Analysis

(b) A second example from the British Crime Survey, 2000

Number of times something was stolen from respondent's hands, pockets, bag or case since 1 Jan 99    Frequency
Total                                                                                                 463

Can the last figure be correct?
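A one-way frequency table of this kind takes only a few lines to produce. The sketch below uses fictitious values (not the survey figures) and an assumed plausible range, purely to show how an isolated extreme value stands out in the table.

```python
from collections import Counter


def one_way_table(values):
    """Return (value, frequency) pairs sorted by value, and the total count."""
    counts = Counter(values)
    return sorted(counts.items()), sum(counts.values())


def implausible(values, low, high):
    """Return the distinct values that fall outside the range low..high."""
    return sorted({v for v in values if not low <= v <= high})


# Fictitious responses: number of times something was stolen.
thefts = [1, 1, 1, 2, 1, 3, 1, 97]

table, total = one_way_table(thefts)
for value, freq in table:
    print(value, freq)
print("Total", total)

# Assumed plausible range of 0 to 20 incidents (an illustrative choice).
print(implausible(thefts, 0, 20))
```

An isolated large value such as 97 may turn out to be a code for a missing or "don't know" answer that was not set as missing, which can only be confirmed against the questionnaire documentation.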

Checks during Exploratory Data Analysis

(c) Detection rate of property crimes in one police force. (Data are fictitious)

Property crime        Jan Feb Mar
Vandalism
Burglary
Vehicle thefts
Bicycle thefts        433
Thefts from person    325
Other thefts          7911

Checks during Exploratory Data Analysis

Consistency checks across related variables

The following examples show:
(i) Current number of cars at household versus whether respondent was worried about having car stolen.
(ii) Current number of cars at household versus whether respondent was worried about having things stolen from car.
(iii) Distance to reach any type of formal court versus distance from nearest Magistrate's Court.
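Example (iii) can be expressed as a simple rule, on the assumption that a Magistrate's Court counts as a formal court: the distance to any formal court should then never exceed the distance to the nearest Magistrate's Court. The sketch below uses hypothetical field names and fictitious distances.

```python
def court_distance_problems(rows):
    """Return IDs of rows where the distance to any formal court is
    greater than the distance to the nearest Magistrate's Court."""
    return [
        r["id"]
        for r in rows
        if r["dist_any_court_km"] > r["dist_magistrate_km"]
    ]


# Fictitious responses; field names are assumptions for this example.
responses = [
    {"id": 101, "dist_any_court_km": 5, "dist_magistrate_km": 12},
    {"id": 102, "dist_any_court_km": 20, "dist_magistrate_km": 8},  # inconsistent
    {"id": 103, "dist_any_court_km": 7, "dist_magistrate_km": 7},
]

print(court_distance_problems(responses))
```

Whether a flagged row is an error depends on how the two questions were defined, so the rule should be checked against the questionnaire wording before any value is changed.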

Use of cross-tabulations

Table 1. Cross-tabulation of current number of cars at household versus extent to which respondent is worried about having car stolen (Source: BCS, 2000)

Use of cross-tabulations

Table 2. Cross-tabulation of current number of cars at household versus extent to which respondent is worried about having things stolen from the car (Source: BCS, 2000)
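A cross-tabulation like those in Tables 1 and 2 can be built by counting pairs of values. The sketch below uses fictitious responses and hypothetical field names and category labels. It assumes that a household with no car should be recorded as "not applicable" on the worry question, so any other combination is listed for follow-up.

```python
from collections import Counter


def crosstab(rows, row_key, col_key):
    """Count how often each (row value, column value) pair occurs."""
    return Counter((r[row_key], r[col_key]) for r in rows)


def inconsistent_car_worry(rows):
    """Return IDs of households with no car that still report a level of worry."""
    return [
        r["id"]
        for r in rows
        if r["cars"] == 0 and r["worry_car_stolen"] != "not applicable"
    ]


# Fictitious survey responses (not BCS data).
households = [
    {"id": 1, "cars": 0, "worry_car_stolen": "not applicable"},
    {"id": 2, "cars": 1, "worry_car_stolen": "very worried"},
    {"id": 3, "cars": 0, "worry_car_stolen": "very worried"},  # inconsistent
    {"id": 4, "cars": 2, "worry_car_stolen": "not worried"},
    {"id": 5, "cars": 1, "worry_car_stolen": "very worried"},
]

table = crosstab(households, "cars", "worry_car_stolen")
for (cars, worry), count in sorted(table.items()):
    print(cars, worry, count)

print(inconsistent_car_worry(households))
```

Cells of the table that should be empty, such as "0 cars, very worried", are the ones to inspect first.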

Detecting errors in secondary data

Procedures similar to the above can be undertaken, but in addition:
- Ask questions regarding the source from which the data arose, e.g. to assess competence, adequacy of funding, motivation for the study, etc.
- Ask about the data collection procedure and associated documentation. In particular, seek answers to what, who, why, when, where, and how.
- It is important to follow the whole data chain.
