Disseminating ICT data TRAINING COURSE ON THE PRODUCTION OF STATISTICS ON THE INFORMATION ECONOMY Module B-5: Disseminating ICT data — UNCTAD Manual Chapter 8
Objectives of the module
After completing this part of the course you will be able to:
- Disseminate data that is nationally representative and internationally comparable
- Publish results according to ICT indicators
- Define a tabulation plan
- Produce statistical tables
- Define a publication format
- Disseminate metadata and information on survey methodology and evaluation
Contents of module B5
Dissemination of data and metadata:
5.1 Definition of a tabulation plan
5.2 Dissemination of metadata
5.3 Survey implementation reports
B5.1. Definition of a tabulation plan
Tabulation plan: ways of dissemination
A minimum of 24 tables, including:
- The 12 Partnership core indicators (B1 to B12) on the use of ICT in business, broken down by business size and by industry (see Box 22 with UNCTAD recommendations and paragraph 174 with the OECD reference)
- Additional tables presenting core indicators broken down by location of the business (rural/urban) whenever feasible
* See chapter 8 of the Manual
B5.1. Definition of a tabulation plan
Model table for the publication of ICT indicators broken down by economic activity
Types of dissemination:
- Static dissemination (the most frequent): a pre-defined set of tables produced by the NSO, released as paper or electronic publications
- Tailored dissemination: tailored tabulations produced by the NSO at users' request (for a fee)
- Dynamic dissemination: tables designed and obtained by users through Web-based technology implemented by the NSO
It is important to indicate reliability when presenting data tables. Figures with a low level of precision should be highlighted (for instance, those with a coefficient of variation higher than 20 per cent).
General issues when designing tabulations:
- Specify the level of detail of table cells according to the sample design (in order to provide reliable figures) and the statistical disclosure control rules.
- Specify how statistical estimates are presented in the tables: absolute figures or proportions.
Statistical disclosure control rules prevent the display of statistical aggregates derived from a small number of potentially identifiable businesses.
When presenting proportions, the reference population and the value of the denominators used should be attached to the tables.
* See chapter 8 of the Manual
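The two table-design safeguards above — highlighting low-precision figures and suppressing cells based on too few businesses — can be sketched as follows. This is a minimal illustration with hypothetical thresholds (CV above 20 per cent; fewer than 3 contributing units) and a hypothetical helper name; actual disclosure control rules are set by each NSO.

```python
# Sketch of two common table-publication checks (hypothetical thresholds):
# flag estimates whose coefficient of variation (CV) exceeds 20 per cent,
# and suppress cells derived from fewer than 3 businesses.

def prepare_cell(estimate, standard_error, n_contributors,
                 cv_threshold=0.20, min_units=3):
    """Return the value to publish for one table cell, with a quality flag."""
    if n_contributors < min_units:
        return "c"  # cell suppressed for confidentiality
    cv = standard_error / estimate if estimate else float("inf")
    flag = "*" if cv > cv_threshold else ""  # "*" marks low precision
    return f"{estimate}{flag}"

print(prepare_cell(1000, 150, 25))  # CV = 0.15, published as-is
print(prepare_cell(1000, 250, 25))  # CV = 0.25, flagged as unreliable
print(prepare_cell(1000, 100, 2))   # too few businesses, suppressed
```

The same checks can be applied cell by cell across a whole tabulation before release, so that static, tailored and dynamic dissemination all go through identical reliability and confidentiality screening.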
B5.2. Dissemination of metadata
Metadata at the indicator level
The provision of associated metadata at both the indicator level and the survey level:
- Improves data usability, and
- Is key for assessing comparability with other national and international data.
Quality frameworks (defined by NSOs) constitute useful guidelines for:
- Determining the metadata that should be disseminated with ICT data, and
- Promoting its integration into the statistical production process as a joint activity.
Accuracy: the degree to which an estimate correctly describes the phenomenon it was designed to measure. It covers both sampling error and non-sampling error (bias).
Sampling error: the precision of an estimate can be indicated by:
- The standard error (the square root of the sampling variance),
- The coefficient of variation (CV), or relative standard error, or
- A confidence interval.
For proportions (since most ICT indicators are proportions), the relative standard error may be a more easily understood measure of precision.
* See chapter 8 of the Manual
B5.2. Dissemination of metadata
Precision of an indicator
Do you remember the classical definition of the coefficient of variation? CV = σ/μ, the standard deviation divided by the mean.
The coefficient of variation of an estimate of a total is the sampling counterpart of that formula:
CV(Ŷ) = sqrt(V̂(Ŷ)) / Ŷ
where V̂(Ŷ) is the estimated sampling variance of the estimate Ŷ.
On the other hand, the 95% confidence interval for a total Ŷ (assuming a normal distribution) is expressed as the approximation:
Ŷ ± 1.96 · sqrt(V̂(Ŷ))
CV is usually shown as a percentage.
* See chapter 8 of the Manual
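As a worked illustration of these precision measures, the snippet below computes the standard error, CV and approximate 95% confidence interval for an estimated proportion, assuming simple random sampling (the variance formula p(1−p)/n holds only under that assumption; complex designs need design-based variance estimates). The function name and figures are illustrative.

```python
# Illustrative precision measures for an estimated proportion p_hat from a
# simple random sample of size n (hypothetical example values).
import math

def precision(p_hat, n, z=1.96):
    """Standard error, CV and approximate 95% CI for a proportion (SRS)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p_hat
    cv = se / p_hat                          # coefficient of variation
    ci = (p_hat - z * se, p_hat + z * se)    # normal-approximation 95% CI
    return se, cv, ci

# e.g. 60% of 400 sampled businesses use the Internet
se, cv, ci = precision(0.60, 400)
print(f"SE = {se:.4f}, CV = {cv:.1%}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Here the CV comes out around 4 per cent, well under a typical 20 per cent publication threshold, so the estimate would be published without a low-precision flag.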
B5.2. Dissemination of metadata
Possible causes of bias in the estimates (bias = non-sampling error)
Bias (often referred to as non-sampling error) is caused by various imperfections of the measurement system; it is not sampling error. Its possible causes include:
- Non-response (where the characteristics of the responding population differ from those of the non-responding population);
- Respondent errors (e.g. a tendency to underestimate income);
- Errors in the population frame (e.g. coverage errors, misclassification errors);
- Sub-optimal questionnaire design (e.g. unclear instructions or definitions, poor flow);
- Systematic errors by interviewers (e.g. leading respondents to particular answers); and
- Processing errors (e.g. in data entry or editing).
Measuring bias: it is usually not possible to give a measure of bias. Bias errors can be in opposite directions and can therefore cancel to some extent, but it is necessary to inform users about possible sources of bias and the attempts made to minimize it.
* See chapter 8 of the Manual
B5.2. Dissemination of metadata
Scope of ICT core indicators
Scope: the population to which an indicator refers.
The scope of an indicator is defined by the population to which it refers. Most indicators on the use of ICT by businesses are proportions; the denominator is determined by the scope specification of the survey in terms of size, economic activity, etc.
* See chapter 8 of the Manual
B5.2. Dissemination of metadata
Reference date and period for ICT core indicators
Metadata should contain:
- The reference date and period (to which the indicators refer) used, and
- The explanation of any discrepancies arising from changes or from delays in data collection.
Such information would typically be included in table headings, as notes to tables and/or in a survey execution report.
NOTE: All indicators produced from the survey will share these metadata. They relate to the type of data source (be it a stand-alone survey or a module attached to an existing sample survey or census), the scope and coverage of the survey, classifications and definitions, and methodological issues, including any technicalities of data collection.
* See chapter 8 of the Manual
B5.3. Survey implementation reports
Survey implementation reports
The metadata for a survey can be presented as a 'survey execution report' containing the following (suggested) topics (page 104):
- General information
- Statistical units, scope and coverage
- Concepts, classifications and definitions
- Information on the questionnaire
- Population frame
- Sample design
- Weighting procedures
- Unit non-response and misclassification
- Item non-response
- Accuracy and precision measures
* See chapter 8 of the Manual
B5.3. Survey implementation reports
Topics to be included in metadata reporting for ICT use surveys (1)
Topic: General information
Description (metadata to be included):
- Rationale for the survey
- Data sources used
- Reference period and date
- Date of survey
- Survey vehicle (where applicable)
- Data collection methods
- Pilot tests undertaken (if any)
- Methodological differences with previous data collection exercises
- Timeliness and punctuality, including changes over time
- Data accessibility
B5.3. Survey implementation reports
Topics to be included in metadata reporting for ICT use surveys (2)
Topic: Statistical units, scope and coverage
Description (metadata to be included):
- Definition of the statistical units used
- Differences between national unit concepts and international standards
- Reporting, observation and analytical units
- Definition of the scope and target population (economic activity, size and geography)
- Description of any coverage limitations in respect of the scope
Rationale: a description of the legislation that refers to the origin of the data collection exercise, and details of decisions taken to implement the operation (such as a recommendation by a national statistical council).
Description of data sources: the nature of the data source(s) used for the calculation of ICT indicators:
- Administrative records,
- Stand-alone ICT surveys, and
- Modules in existing surveys.
This point is particularly important in the case of indicators expressed as a proportion, since the numerator and denominator may be obtained from different data sources.
Timeliness and punctuality: timeliness can be defined as the time interval between the availability of results and the reference date of the information presented. Punctuality is the measurement of the delay between the anticipated date of release and the actual date of release.
Data accessibility: the ease with which data users can obtain statistical results and associated metadata. The key aspects are:
- The physical means available for data publication (paper, electronic, web-based),
- The requirements for access (subscription, payment, free of charge, etc.), and
- Users' awareness of available data and how it can be accessed.
Statistical units: the statistical units used (establishments, enterprises, etc.), and how they have been defined. Any distinctions between reporting, observation and analytical units should be made clear.
Impacts on the estimates from deviations from the recommended unit (the enterprise) or changes over time should be described, even if it is not possible to quantify them.
Scope and coverage: the scope of the survey in terms of at least size and economic activity (and often geography) should be stated. Any coverage limitations related to the scope should be specified, e.g. whether some geographical areas have not been included in the survey or have been treated differently.
Response rate: the final response rate for the survey (overall and for major disaggregations). The response rate is calculated as the proportion of live (eligible) units responding to the survey. Disaggregations of the response rate, by size for example, are useful in conveying an indication of non-response bias.
* See chapter 8 of the Manual
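The response-rate calculation described above — responding units divided by live (eligible) units, overall and by disaggregation — can be sketched as follows. The sample data and size classes are invented for illustration; real surveys would draw these counts from the survey management system.

```python
# Sketch of the response-rate calculation: responding units over live
# (eligible) units, overall and disaggregated by business size.
# The sample records below are hypothetical.
from collections import defaultdict

# (size class, eligible?, responded?) for each sampled unit
sample = [
    ("small", True, True), ("small", True, False), ("small", False, False),
    ("medium", True, True), ("medium", True, True),
    ("large", True, True), ("large", True, False),
]

eligible = defaultdict(int)
responded = defaultdict(int)
for size, is_eligible, has_responded in sample:
    if is_eligible:  # out-of-scope or dead units are excluded from the base
        eligible[size] += 1
        responded[size] += has_responded

overall = sum(responded.values()) / sum(eligible.values())
print(f"overall response rate: {overall:.1%}")
for size in eligible:
    print(f"  {size}: {responded[size] / eligible[size]:.1%}")
```

Publishing the disaggregated rates alongside the overall rate lets users see, for instance, that small businesses responded at a lower rate than medium ones, which signals where non-response bias is most likely.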
B5.3. Survey implementation reports
Topics to be included in metadata reporting for ICT use surveys (3)
Topic: Concepts, classifications and definitions
Description (metadata to be included):
- Concepts and their basis; changes over time
- Classifications used; differences with international standards
- Classification categories (e.g. size and geographical categories)
- Definitions of key terms (e.g. computer); differences with international standards and changes over time
Statistical standards: concepts, classifications and definitions
1. Major concepts and definitions used should be described in the metadata set (e.g. concepts underlying the measurement of e-commerce).
2. Classificatory variables: the metadata for the survey should indicate to what extent the classifications used correspond to international classifications (ISIC, for example). Metadata should also describe any classificatory concepts that could be ambiguous (e.g. "small and medium businesses").
3. Definitions (for instance, of 'broadband' or 'computer') and classifications are key elements for assessing the international comparability of ICT indicators and their coherence with alternative information sources (such as private surveys). Changes in definitions and classifications can also affect the comparability of indicators over time and should be well documented.
Topics to be included in metadata reporting for ICT use surveys (4)
Topic: Information on the questionnaire
Description (metadata to be included):
- Questionnaire used in the survey
- Indications of significant changes over time; deviations from international model questions
Data collection method and questionnaire: the sample design and the method of data collection used (face-to-face interviews, telephone interviews, mailed questionnaires).
Publishing the questionnaire used to collect data is generally of great help for more advanced users, who may benefit from knowing the exact wording of the questions.
* See chapter 8 of the Manual
B5.3. Survey implementation reports
Topics to be included in metadata reporting for ICT use surveys (5)
Topic: Population frame
Description (metadata to be included):
- Description of the population frame or underlying business register used: origin, updating periodicity, available segmentation variables, and any known shortcomings
- Changes in the frame over time
B5.3. Survey implementation reports
Thank you very much
Looking forward to receiving your ICT statistics data