Published by Aron Mills. Modified over 9 years ago.
1
Coding closed questions
Training session 5
GAP Toolkit 5: Training in basic drug abuse data management and analysis
2
Objectives
To establish a set of practical coding rules for closed questions
To explain the importance of assigning numbers to characteristics
To construct a framework for recording missing values
To introduce identification numbers as a method of ensuring the anonymity of respondents while maintaining a link between files and questionnaires
3
Components of a data file
Cases or observations
Variables
Values
4
Coding
The identification of the possible values of a variable and the assignment of numbers to those values
The numbers, representing the values, are stored in a data file
5
Closed questions/categorical variables
A limited number of values
The values are mutually exclusive
The values are collectively exhaustive
Code by assigning a number to each value
6
Example
Coding gender
Possible values: male; female
Coding scheme: 1 = Male; 2 = Female
7
Why numbers?
Efficient use of computers
Quicker to enter
Not subject to spelling mistakes
8
Why numbers?
Some statisticians define measurement as necessarily resulting in numbers
“To measure a property means to assign numbers to units as a way of representing that property.” (D. S. Moore, Statistics: Concepts and Controversies, 2nd ed. (New York, W. H. Freeman Press, 1985))
9
Pre-code
Coding takes place before the questionnaire is delivered
The possible responses to a question are anticipated
The coding appears on the questionnaire
10
Coding rules
Codes must be:
– Mutually exclusive
– Collectively exhaustive
– Consistent across variables
(J. Fielding, “Coding and managing data”, Researching Social Life, N. Gilbert, ed. (London, Sage Publications, 1993) and D. De Vaus, Surveys in Social Research (London, Routledge, 2002))
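The first two rules can be partly checked by machine. The Python sketch below is an illustration and is not part of the original slides: the function name `check_coding_scheme` and the dictionary layout are assumptions. It checks only that no label has been given two codes and that every observed response has a code; whether the categories themselves overlap in meaning still has to be judged by the researcher.

```python
def check_coding_scheme(scheme, observed_values):
    """Return a list of problems found in a coding scheme.

    scheme: dict mapping numeric code -> value label (hypothetical layout).
    observed_values: the raw responses the scheme has to cover.
    """
    problems = []

    # Mutual exclusivity (proxy): a label must not be assigned two codes.
    labels = list(scheme.values())
    for label in sorted(set(labels)):
        if labels.count(label) > 1:
            problems.append(f"label {label!r} has more than one code")

    # Collective exhaustiveness: every observed response needs a code.
    for value in sorted(set(observed_values) - set(labels)):
        problems.append(f"no code for value {value!r}")

    return problems


# Coding scheme for gender, as given earlier in the session.
gender_scheme = {1: "Male", 2: "Female"}

print(check_coding_scheme(gender_scheme, ["Male", "Female", "Female"]))  # []
print(check_coding_scheme(gender_scheme, ["Male", "Unknown"]))
```

An empty list means that no problems were found for the responses supplied.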
11
Continuous variables
Do not generally require coding as:
– They are already numerical
– There is a potentially infinite number of possible values
12
Coding in SPSS
The Values column in Variable View is used to implement coding in SPSS
Numbers are allocated to each of the categories of a variable
13
Example: coding Drug
In data file Ex1.sav, a variable called Drug was defined as a string variable and a number of drugs were entered

Case summaries (a)
Case      Drug
1         Heroin
2         Alcohol
3         Hashish
4         Bhang
5         Heroin
6         Hashish
Total N   6
a. Limited to first 100 cases.
14
Coding Drug
Decide on a set of numeric codes for the different categories, in this case drugs:
– 1 = Heroin
– 2 = Alcohol
– 3 = Hashish
– 4 = Bhang
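Outside SPSS, the same recoding can be sketched in a few lines of Python. This is an illustration only, not the SPSS procedure described in the session; the names `DRUG_CODES`, `CODE_FOR_DRUG`, `drug` and `drug2` are chosen here for the example, and the six values are those listed for Ex1.sav.

```python
# Coding scheme from the slide: numeric code -> drug name.
DRUG_CODES = {1: "Heroin", 2: "Alcohol", 3: "Hashish", 4: "Bhang"}

# Reverse lookup used to convert the string variable into numbers.
CODE_FOR_DRUG = {name: code for code, name in DRUG_CODES.items()}

# The six string values of Drug listed in the case summary for Ex1.sav.
drug = ["Heroin", "Alcohol", "Hashish", "Bhang", "Heroin", "Hashish"]

# Drug2 is the numeric, coded version of Drug.
drug2 = [CODE_FOR_DRUG[name] for name in drug]
print(drug2)  # [1, 2, 3, 4, 1, 3]

# The codes can be translated back to names at any time.
print([DRUG_CODES[code] for code in drug2])
```

A misspelt entry such as "Heroine" would raise a `KeyError` here, which is one practical reason to enter numbers rather than free text.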
15
Coding Drug
Create a new variable Drug2: type = numeric; width = 2; decimals = 0; label = Drug Coded
Click on the Values column and then on the three dots that appear to the right of the Values box to generate the following dialogue box:
16
Click to register code
18
Frequency count for Drug Coded:

Drug Coded
                 Frequency   Percentage   Valid percentage   Cumulative percentage
Valid  Heroin        2          33.3            33.3                 33.3
       Alcohol       2          33.3            33.3                 66.7
       Hashish       1          16.7            16.7                 83.3
       Bhang         1          16.7            16.7                100.0
       Total         6         100.0           100.0
19
Note
Coding data does not change the level of measurement
The level of measurement is a guide to the selection of appropriate statistics
20
SPSS
Value labels can be assigned to numeric variables and string variables of eight or fewer characters
By default, SPSS sets all numeric variables to Scale variables
21
Exercise: coding
22
Frequency count of Drug

Drug
                 Frequency   Percentage   Valid percentage   Cumulative percentage
Valid  Alcohol       3          25.0            25.0                 25.0
       Bhang         1           8.3             8.3                 33.3
       Hashish       3          25.0            25.0                 58.3
       Heroin        2          16.7            16.7                 75.0
       Mandrax       3          25.0            25.0                100.0
       Total        12         100.0           100.0
23
Frequency count of Condition

Condition Coded
                   Frequency   Percentage   Valid percentage   Cumulative percentage
Valid  Recovered       5          41.7            41.7                 41.7
       Relapsed        7          58.3            58.3                100.0
       Total          12         100.0           100.0
24
Missing values
25
Missing values: causes
The question is not applicable
The respondent does not know
The respondent refuses to answer
No response is marked on the questionnaire (i.e., truly missing, with no indication why)
(De Vaus, 2002)
26
Coding missing values
Use codes outside the range of valid values:
– e.g., 9, 99, -99, 999
If possible, retain the same codes for the various missing options for all variables
The default missing value in SPSS is displayed as a full stop (.) and is called the “system-missing value”
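The effect of a missing-value code on a frequency count can be sketched as follows. The percentage is based on all cases, whereas the valid percentage leaves out the cases coded as missing. This Python example is an illustration only: the code 99, the function name `percentages` and the eight responses are made up for the example and do not come from Ex1.sav.

```python
from collections import Counter

MISSING = 99  # assumed missing-value code, outside the valid range 1 to 4

# Hypothetical coded responses; 99 marks a missing answer.
responses = [1, 2, 99, 3, 1, 99, 4, 2]


def percentages(codes, missing_code):
    """Map each code to (frequency, percentage, valid percentage)."""
    counts = Counter(codes)
    n_all = len(codes)
    n_valid = n_all - counts[missing_code]
    table = {}
    for code, count in sorted(counts.items()):
        percent = round(100 * count / n_all, 1)
        if code == missing_code:
            valid_percent = None  # missing cases have no valid percentage
        else:
            valid_percent = round(100 * count / n_valid, 1)
        table[code] = (count, percent, valid_percent)
    return table


for code, row in percentages(responses, MISSING).items():
    print(code, row)
```

With two of the eight responses missing, code 1 accounts for 25.0 per cent of all cases but 33.3 per cent of the valid cases.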
27
SPSS: missing values
Part of the variable definition
Variable View: Missing column
– Click on the Missing cell in the row defining the variable
– Click on the three dots that appear to the right of the Missing cell and the following dialogue box will appear:
29
Exercise
Three additional observations are obtained for Ex1.sav:
– DAP1-0013; Alcohol; 39; ------------
– DAP1-0014; Hashish; --; Recovered
– DAP1-0015; ---------; 16; Relapsed
Code the necessary missing values for the variables
Run a frequency count on Drug and Condition, comparing percentage and valid percentage
30
Identification numbers
31
ID numbers: purpose
An ID number:
– Ensures anonymity
– Links a row in the data file to a physical questionnaire
32
ID numbers: characteristics
A unique identifier
Sometimes contains information in a compound form
33
Example
DAP1-001, DAP1-002, …:
– DAP is short for Drug Assessment Programme
– 001, 002 are consecutive numbers that uniquely identify each questionnaire or respondent
– There can be at most 999 respondents, as space has only been made available for 999 unique ID numbers
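A compound ID of this kind can be generated rather than typed. The Python sketch below is an illustration; the function name `make_id` and its `width` parameter are assumptions made for the example. It also shows why a three-digit field limits the study to 999 respondents.

```python
def make_id(prefix, number, width=3):
    """Build a compound ID such as DAP1-001 from a prefix and a serial number."""
    largest = 10 ** width - 1  # 999 when three digits are available
    if not 1 <= number <= largest:
        raise ValueError(f"serial number must be between 1 and {largest}")
    # Pad the serial number with leading zeros to a fixed width.
    return f"{prefix}-{number:0{width}d}"


ids = [make_id("DAP1", n) for n in range(1, 4)]
print(ids)  # ['DAP1-001', 'DAP1-002', 'DAP1-003']

# Every ID must be unique so that each row links to one questionnaire.
assert len(set(ids)) == len(ids)
```

Calling `make_id("DAP1", 1000)` raises a `ValueError`, because a fourth digit would not fit the format.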
34
Summary
Coding closed questions
Value labels
Frequency counts
Missing values
ID numbers