
UNSD-ESCWA Regional Workshop on Census Data Processing in the ESCWA region: Contemporary technologies for data capture, methodology and practice of data editing


1 UNSD-ESCWA Regional Workshop on Census Data Processing in the ESCWA region: Contemporary technologies for data capture, methodology and practice of data editing
Doha, State of Qatar, 18-22 May 2008
Data Capture Overview
United Nations Statistics Division

2 Overview of Presentation
- Definition of data capture
- Methods of data capture:
  - Different methods
  - Advantages and disadvantages
- Issues to consider

3 What’s Data Capture?
“Data capture is the system used to convert the information obtained in the census to a format that can be interpreted by a computer.”
Source: United Nations Principles and Recommendations for Population and Housing Censuses, Rev. 2, p.68.

4 Data Capture Methods
1) Keyboard data entry
2) Optical mark recognition/reading (OMR)
3) Optical character recognition/intelligent character recognition (OCR/ICR)
4) Personal digital assistant (PDA)
5) Internet
- Advantages/disadvantages/costs/impacts at both data capture and later stages
- Combination of more than one of the above methods

5 Keyboard Data Entry
- Response codes from the census form are manually entered into computers
- A more sophisticated version involves computer-assisted key entry, where the operator selects a response from options displayed on the screen
- Use of the method is based on time and cost considerations, and on the feasibility of implementing more sophisticated technology
- The method is also used to process textual responses into classification categories

6 Advantages and Disadvantages of Keyboard Data Entry
Advantages
- Method requires simple software systems and low-end computing hardware
- Less costly (depending on the costs of manpower)
- There will be a large number of PCs available for other uses after the census
Disadvantages
- Requires more staff
- Task takes much longer to complete than with automated data entry
- Potential for errors during data entry
- Standardization of operations is difficult, as performance may depend on the individual operator

7 Data Capture Technologies
- Imaging and intelligent character recognition offer great potential and benefits for data capture
- Technology should be used to make data capture more effective and efficient, not for technology’s sake
- Successful implementation of intelligent character recognition requires awareness of the long lead times and the technology infrastructure involved

8 Optical Mark Recognition/Reading (OMR)
- OMR is a form-scanning method whereby responses are read into a computer without a keyboard
- OMR technology reads responses to “tick-box” type questions on specially designed paper
- Only presence or absence of a mark is detected by the machine
- The scanned responses are transformed into codes
- Handwritten responses must be manually entered or coded using computer-assisted methods
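The mark-detection step described above can be sketched as follows. This is a minimal illustration, not how any particular OMR product works: each tick box is assumed to arrive as a small grid of binarized pixels (1 = dark), and the threshold value, function names, and response codes are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical threshold: share of dark pixels above which a box counts as marked.
MARK_THRESHOLD = 0.30


def is_marked(box_pixels: List[List[int]], threshold: float = MARK_THRESHOLD) -> bool:
    """Return True when the share of dark pixels (value 1) exceeds the threshold."""
    total = sum(len(row) for row in box_pixels)
    if total == 0:
        return False
    dark = sum(sum(row) for row in box_pixels)
    return dark / total > threshold


def read_question(boxes: Dict[str, List[List[int]]]) -> Optional[str]:
    """Map the tick boxes of one question to a response code.

    Returns the code of the single marked box, or None when no box or more
    than one box is marked (such cases would be referred for manual review).
    """
    marked = [code for code, pixels in boxes.items() if is_marked(pixels)]
    return marked[0] if len(marked) == 1 else None


# Toy 3x3 boxes: only presence or absence of a mark matters.
empty = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
ticked = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]

question = {"1": ticked, "2": empty}  # hypothetical response codes
print(read_question(question))
```

Returning None for blank or multiply marked questions keeps ambiguous responses out of the coded data until an operator has looked at them.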

9 Advantages and Disadvantages of OMR
Advantages
- Improved data accuracy
- Data capture faster than keyboard data entry
- Equipment is relatively inexpensive
- Relatively simple to install and run
- A well-established technology that’s been used in many countries
Disadvantages
- Restrictions as to form design
- Restrictions on type of paper and ink
- Precision required in printing process/cutting of sheets
- Response boxes should be correctly marked with appropriate pen or pencil
- Won’t capture textual responses

10 Optical Character Recognition (OCR)/Intelligent Character Recognition (ICR)
- OCR and ICR combine scanning and character recognition technology to scan the whole form and interpret the responses
- OCR technology recognizes machine-printed characters only
- ICR technology reads both machine-printed and handwritten responses in specific locations of the page and transforms the responses into codes
- For OCR, handwritten responses must be manually entered or coded using computer-assisted methods

11 Advantages of OCR/ICR
- Form design is not as stringent as for OMR
- Processing time can be reduced due to automated nature of the process
- Allows for digital filing of questionnaires, resulting in efficient storage and retrieval of questionnaires for future use
- Some handwritten responses can be automatically coded, thereby improving data quality

12 Disadvantages of OCR/ICR
- Higher costs of equipment (sophisticated hardware/software required)
- High-calibre IT staff required to support the system
- Handwriting on census forms must be as close as possible to the model handwriting to avoid recognition errors
- Possibility of character substitution errors, which would affect data quality
- Tuning of the recognition engine to accurately recognize characters is critical, with a trade-off between quality and cost
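One way to picture the quality/cost trade-off in tuning is a confidence threshold: fields recognized with high confidence are accepted automatically, and the rest are sent to operators for key correction. The sketch below assumes the recognition engine reports a confidence score between 0 and 1 for each field; the `FieldResult` type, the scores, and the thresholds are hypothetical and not taken from any specific ICR product.

```python
from typing import List, NamedTuple, Tuple


class FieldResult(NamedTuple):
    field: str
    value: str
    confidence: float  # hypothetical score in [0, 1] reported by the engine


def route(
    results: List[FieldResult], accept_threshold: float
) -> Tuple[List[FieldResult], List[FieldResult]]:
    """Split results into auto-accepted fields and fields needing manual correction."""
    accepted = [r for r in results if r.confidence >= accept_threshold]
    manual = [r for r in results if r.confidence < accept_threshold]
    return accepted, manual


results = [
    FieldResult("age", "34", 0.98),
    FieldResult("age", "71", 0.62),
    FieldResult("occupation", "teacher", 0.85),
]

# A higher threshold means fewer substitution errors slip through (quality)
# but more fields go to operators (cost).
for threshold in (0.60, 0.90):
    accepted, manual = route(results, threshold)
    print(f"threshold={threshold:.2f} accepted={len(accepted)} manual={len(manual)}")
```

With these toy scores, the lower threshold accepts all three fields automatically, while the higher one sends two of them to manual correction.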

13 Personal Digital Assistant (PDA)
- Contents of the census form are stored on the PDA so that the questions appear sequentially on the screen
- Data are entered into a hand-held computer instead of onto a paper census form
- Data are then electronically transmitted to an NSO database for further processing

14 Advantages and Disadvantages of Use of the PDA
Advantages
- Instant data capture at the point of collection, reducing manual input errors
- Immediate data validation, reducing re-verification at a later stage
- Time-effective, with real-time logical validation rules that reduce logical errors
- Faster processing of census information, leading to timely availability of results
Disadvantages
- Setting up the process may take a long time, as it requires extensive testing
- Requires that enumerators are able to use the device, which may require administering a test
- Requires intensive training of enumerators on use of the device (training is more complicated)
- Need to recharge the battery, which could run out during enumeration
- Possibility of equipment failure
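Real-time logical validation on a hand-held device can be sketched as a list of rules checked as soon as a record is entered. The field names, age limits, and rule wording below are illustrative assumptions only; actual edit rules are defined by each census programme.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]

# Illustrative rules only; thresholds and field names are assumptions.
RULES: List[Tuple[str, Callable[[Record], bool]]] = [
    ("age must be between 0 and 120",
     lambda r: 0 <= r["age"] <= 120),
    ("a person under 12 cannot be recorded as married",
     lambda r: not (r["age"] < 12 and r["marital_status"] == "married")),
    ("head of household must be at least 15 years old",
     lambda r: not (r["relationship"] == "head" and r["age"] < 15)),
]


def validate(record: Record) -> List[str]:
    """Return the messages of all rules the record violates (empty if none)."""
    return [message for message, check in RULES if not check(record)]


# The enumerator is prompted immediately, while still with the respondent.
record = {"age": 8, "marital_status": "married", "relationship": "child"}
for message in validate(record):
    print("Check entry:", message)
```

Because the check runs at the point of collection, the enumerator can resolve the inconsistency with the respondent instead of leaving it for later editing.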

15 Internet-based Data Collection
- Use of the Internet for census data collection is growing
  - However, the method is always complementary to other more established methods
- As with PDAs, the on-line form is not a downloadable version of the paper form
- Use of this method requires a password in order to access and fill in the form
- Development of the Internet system for data collection is generally outsourced for lack of in-house expertise

16 Advantages/Disadvantages of Use of the Internet
Advantages
- Reduced resources necessary for form handling and data capture
- Better opportunity to enumerate difficult-to-reach and difficult-to-enumerate geographic areas and population groups
- Automatic filtering of irrelevant questions
- Better quality data due to in-built interactive verification mechanism
- Faster availability of census results through simplified data entry and editing
Disadvantages
- Requires that respondents have a computer with Internet access
- Management of responses can be problematic, e.g., ensuring that households have responded once and only once
- Requires high security system to ensure safe transfer of data
- Need to build parallel processing system as not everyone will use the Internet
- Requires mechanism to check for omitted and duplicate submissions
- Is costly and requires a lot of resources to set up and adequately test the system
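A check for omitted and duplicate submissions can be sketched by comparing the identifiers of received forms against the list of households expected to respond on-line. The household identifiers and the function name below are hypothetical; a real system would draw them from the census address register and the submission database.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def check_submissions(
    expected_ids: Set[str], submitted_ids: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (omitted, duplicates) for a batch of Internet submissions.

    omitted: expected households with no submission at all
    duplicates: households that submitted more than once
    """
    counts = Counter(submitted_ids)
    omitted = sorted(expected_ids - set(counts))
    duplicates = sorted(hid for hid, n in counts.items() if n > 1)
    return omitted, duplicates


expected = {"HH-001", "HH-002", "HH-003", "HH-004"}  # hypothetical identifiers
submitted = ["HH-001", "HH-003", "HH-003", "HH-004"]

omitted, duplicates = check_submissions(expected, submitted)
print("Omitted:", omitted)
print("Duplicates:", duplicates)
```

Omitted households would then be followed up through the parallel (non-Internet) collection channel, and duplicates resolved before the data are processed.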

17 Issues to Consider in Choosing a Method
- Method to use is dependent on national circumstances
- Choice of method should be part of the overall strategic objective of the census in terms of timeliness, accuracy and cost
- Choice of processing system and technology to use needs to be established early in the census cycle
- Enough time is required to test and implement the system
- When imaging technology is used for data capture, extensive testing is required well in advance of the census
- Possibility to outsource when the required expertise is not available in-house

18 Issues to consider (cont.)
- Extensive testing of the system is also critical when data collection is either by PDA or via the Internet
- Design and paper quality of census form should be linked to method of data capture
- When imaging technology is to be used, adequate training of enumerators on how to properly fill in the forms is crucial

19 Thank you

