Indicator 3.05 Interpret marketing information to test hypotheses and/or to resolve issues.
Need for physical inspection of paper-based marketing data If errors are caught early, they can be corrected. Look for standardizations and continuity Missing values/nonuniform data entry which means changing the value or formula of the response codes. Make sure information exists as required Data may not be presented in the form needed for further analysis ie incomplete responses (skips questions) data entry error (wrong entry/answer; recorded improperly –choice is B but entered as C) Inconsistencies in responses
Questions to ask when screening initial data Where did the data come from? Who did research? Who is source? Have the data been peer reviewed? Professionals have looked at the study before it was published and concluded that the study’s authors pretty much followed the rules of good scientific research How were the data collected? If survey, were the respondents selected at random? No cherry picking (selecting an area you know will give you the answers you hope for) Be skeptical when dealing with comparisons A correlation (relationship) DOES NOT mean causation Be aware of numbers taken out of context Numbers picked to look interesting may mean something entirely difference once it is placed in a different context
Actions that can be taken with incomplete records Cleanse the data which includes dropping problematic data. If caught early and still in field, can correct at once Recontact/reinterview respondent Check responses for each question as interview progresses Drop incomplete data from survey (in part or in full)
Failure of data entry technology Information is not recorded properly Reasons for data entry errors that limit the use of marketing research data Wrong entry Failure of data entry technology Information is not recorded properly
Inconsistencies in responses Reasons that questionable entries are found in raw marketing research data Honesty Inconsistencies in responses
Role of coding in processing marketing data Procedure converts the edited raw data into numbers or symbols Code book is created Identifies all of the variable names and numbers associated with each possible response to each question
Generally accepted coding principles Specify categories/classes ie Male = M or 1 Female = F or 2 Locate only one character in each column Use only numeric codes Use as many columns for fields assigned to variables Use standard coded for no information ie No answers = 9 Code in respondent ID on each record to tie coded data to questionnaire
Importance of editing coded information Must have systematic procedure for assessing the questionnaires developed by the research analyst Cleary defined decision rules should be in place and followed MUST not allow subjectivity to enter into editing process Data editors should be intelligent, experienced, and objective
Data-entry options On-line direct data entry Optical scanning Computer Assisted Telephone Interviewing (CATI) interviewer is sitting behind a terminal and asks the questions that appear on the screen. The respondent's answer is then typed into computer by the interviewer. Optical scanning For highly structured questionnaires Keyboarding Data entry via a computer keyboard; requires verification
Paper-based marketing information vs Paper-based marketing information vs. computer-based marketing information Computer based software saves time and money. The programs have error checking built in to prevent common errors. The availability of the item bank makes it easier to build questionnaires.
Importance of using a codebook to describe data and to indicate where and how they can be accessed Shows each question in the questionnaire Shows the variable name and label Shows the variable type, values and value labels.
Data Analysis After data is collected it must be analyzed so that conclusions can be made. Data summarization: is the process of describing a data set by computing a small number of measures that explain the data set
Purpose of Data Summarization Summarizes the data Make the data understandable Communicates patterns Shows how the sample findings relate to the overall population
What is descriptive statistics? Descriptive statistics are used simply to describe the sample you are concerned with. Descriptive statistics use measures of central tendency to explain data.
What is central tendency? A measure of central tendency is a measure that tells us where the middle of a bunch of data lies. It explains where data falls. The three most common measures of central tendency are the mean, the median, and the mode.
Mean It is simply the sum of the numbers divided by the number of numbers in a set of data. This is also known as average. Ex. What is the mean number of brothers and sisters? The responses are as follows: 2, 3, 1, 0, 9, 2, 3, 2, 4, 2. 2.8
Median The median is the middle score in a distribution of scores that have been ranked in numerical order. What is the median number of brothers and sisters? The responses are as follows: 2, 3, 1, 0, 9, 2, 3, 2, 4, 2. 2 A disadvantage of the median is if there is a large set of numbers, it would be time consuming to place each in order of size
Mode The mode is the most frequently occurring score in a set of scores. Ex. A researcher is interested in the effect of family size on self-esteem. To begin this study, 10 students are each asked how many brothers and sisters they have. The responses are as follows: 2, 3, 1, 0, 9, 2, 3, 2, 4, 2. What is the mode for this set of data? 2
When to use the mean, median, and mode The mean can be used to find an average/ typical number. Use the mean when numbers are closely related. If however, the numbers are very widely spread, are very unevenly distributed, or contain extreme values, e.g. 9, 10, 13, 17, 23, 30, 45; or a hundred values of 10 and one value of 50 then the mean can be misleading,. Use the median if looking for the central point. Mode – When you want to determine the most common response
When do you use measures of variability? Measure of variablity are used when the researcher wants to know the extent to which the data set differ from one another. Range, and Standard Deviation are computations for measuring variability.
Range The range is the difference between the highest and lowest scores in a distribution. A social researcher would like to know how many digits people in different age categories can recall with only one presentation of a list. She creates random lists of digits and presents them to participants. The number of digits recalled by the first 10 participants is as follows: 5, 9, 6, 10, 9, 7, 8, 7, 9, 12. What is the range of this data set? 7
Standard Deviation The standard deviation is a measure of how spread out your data are. How far away is the data from the average.