LECT. 8 INITIAL ANALYSIS OF RAW DATA
After data has been collected and before it is analysed, the researcher must examine it to ensure its validity. Blank responses, referred to as missing data, must be dealt with. If the questions were pre-coded, the responses can simply be transferred into a database. If they were not pre-coded, a coding system must be developed so that the responses can be entered into the database. The typical tasks involved are data editing (which includes dealing with missing data), coding, transformation, and data entry.
Data Editing
Before questionnaire data can be used, it must be edited, that is, inspected for completeness and consistency. Some inconsistencies may be corrected at this point. For example, a respondent may not have answered a question on marital status, but in other questions she responded that she had been married for 10 years and had three children under the age of 18. In such cases, the researcher may choose to fill in the unanswered marital status question. This carries some risk, because the individual may have been recently divorced. If that were the case, the researcher would introduce bias into the data by marking the married category. Thus, where possible, it is best to contact individuals to complete missing responses.
Editing also involves checking whether respondents understood the questions and followed the intended sequence in branching questions. Editing may result in the elimination of questionnaires. For example, if a large proportion of a questionnaire's data is missing, the entire questionnaire may have to be removed from the database.
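The editing checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`marital_status`, `years_married`, `children_under_18`), the function names, and the 50% cutoff for "a large proportion of missing data" are all hypothetical, since the lecture does not fix any of them. Note that the sketch only flags an inconsistent questionnaire for follow-up; it does not fill in the blank answer, for the reason given above.

```python
from typing import Optional

# Hypothetical questionnaire record; None marks a blank (missing) response.
Response = dict[str, Optional[object]]

# Assumed cutoff for "a large proportion of missing data"; not from the lecture.
MAX_MISSING_SHARE = 0.5


def missing_share(response: Response) -> float:
    """Return the fraction of questions left blank in one questionnaire."""
    if not response:
        return 1.0
    blanks = sum(1 for value in response.values() if value is None)
    return blanks / len(response)


def needs_marital_follow_up(response: Response) -> bool:
    """Flag a blank marital status when another answer suggests a value.

    The record is only flagged; nothing is filled in, because the
    respondent may have been recently divorced.
    """
    return (
        response.get("marital_status") is None
        and response.get("years_married") is not None
    )


def triage_questionnaires(responses: list[Response]):
    """Split questionnaires into kept, follow-up, and removed groups."""
    kept, follow_up, removed = [], [], []
    for response in responses:
        if missing_share(response) > MAX_MISSING_SHARE:
            removed.append(response)      # too incomplete to use
        elif needs_marital_follow_up(response):
            follow_up.append(response)    # contact the respondent if possible
        else:
            kept.append(response)
    return kept, follow_up, removed


# Made-up example data, for illustration only.
responses = [
    {"marital_status": "married", "years_married": 4, "children_under_18": 1},
    {"marital_status": None, "years_married": 10, "children_under_18": 3},
    {"marital_status": None, "years_married": None, "children_under_18": None},
]
kept, follow_up, removed = triage_questionnaires(responses)
```

With this made-up data, the first questionnaire is kept, the second is set aside for follow-up, and the third is removed because every answer is blank.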
The next step is to code the responses. Coding means assigning a number to a particular response so that the answer can be entered into a database. Responses can be coded either before the data is collected (pre-coding) or afterwards. For example, if a five-point Agree-Disagree scale is used, it must be decided whether Strongly Agree will be coded as 5 or as 1. Most researchers assign the largest number to Strongly Agree and the smallest to Strongly Disagree, for example, 5 = Strongly Agree and 1 = Strongly Disagree, with the points in between assigned 4, 3, and 2.
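As a minimal sketch, the Agree-Disagree coding scheme can be written as a lookup table in Python. The codes 5 and 1 follow the convention described above; the wording of the three middle categories ("Agree", "Neutral", "Disagree") and the names `AGREE_DISAGREE_CODES` and `code_response` are assumptions made for this example.

```python
from typing import Optional

# 5 = Strongly Agree and 1 = Strongly Disagree follow the lecture's convention.
# The labels for the three middle points are assumed wording.
AGREE_DISAGREE_CODES = {
    "Strongly Agree": 5,
    "Agree": 4,
    "Neutral": 3,
    "Disagree": 2,
    "Strongly Disagree": 1,
}


def code_response(answer: Optional[str]) -> Optional[int]:
    """Return the numeric code for an answer.

    Blank or unrecognised answers return None so that they can be
    handled as missing data during editing.
    """
    if answer is None:
        return None
    # Normalise spacing and capitalisation before the lookup.
    return AGREE_DISAGREE_CODES.get(answer.strip().title())


# Made-up raw answers, for illustration only.
raw_answers = ["Strongly Agree", "disagree", None, " Neutral "]
coded = [code_response(answer) for answer in raw_answers]
```

Keeping the scheme in a single table means every response is coded by the same rule, and reversing the direction of the scale later requires changing only that table.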
Codes are study-specific: they depend entirely on the purpose and findings of the study. When several researchers code the same data, they should make sure they are applying the same assumptions and standards by comparing and discussing their codes on a percentage of the data. This helps ensure that the researchers are thinking about the data in the same way and drawing similar conclusions. Analysis then proceeds from an interpretation of these codes and what they mean for the larger research questions.
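One simple way to compare two researchers' codes on a shared portion of the data is to compute the share of items on which they agree and to list the items where they differ, so that those items can be discussed. The sketch below assumes simple percent agreement as the measure; the lecture does not name a specific statistic, and the function names and example codes are hypothetical.

```python
def percent_agreement(codes_a: list, codes_b: list) -> float:
    """Return the share of items that two coders coded identically."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same items.")
    if not codes_a:
        raise ValueError("At least one coded item is required.")
    matches = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    return matches / len(codes_a)


def disagreements(codes_a: list, codes_b: list) -> list[int]:
    """Return the positions of items the two coders coded differently."""
    return [i for i, (a, b) in enumerate(zip(codes_a, codes_b)) if a != b]


# Made-up codes assigned by two researchers to the same five responses.
coder_a = ["price", "service", "price", "quality", "service"]
coder_b = ["price", "service", "quality", "quality", "service"]

agreement = percent_agreement(coder_a, coder_b)  # 4 of 5 items match
to_discuss = disagreements(coder_a, coder_b)     # positions to review together
```

Here the coders agree on four of the five items, and the third item (position 2, counting from zero) is the one they would need to discuss.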