Data Analysis AMA Collegiate Marketing Research Certificate Program
Module Objectives
Introduce key data analysis procedures
Provide a basic understanding of how to interpret data
Data Analysis Metrics
What Will We Be Covering? Frequencies, Mode, Median, Mean, Standard Deviation, Cross-Tabulations, T-Tests, One-Way ANOVA. Data type plays a key role in data analysis and analytics.
Frequencies
Frequencies = Data Counts and Percentages. Relevant for all types of data.
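As a minimal sketch of a frequency table, the snippet below counts each response and converts the counts to percentages. The responses and the `frequency_table` helper are made up for illustration and do not come from the course data.

```python
from collections import Counter

# Hypothetical survey responses (illustrative only, not course data).
responses = ["In store", "Online", "In store", "Catalog", "In store", "Online"]

def frequency_table(values):
    """Return {category: (count, percent)}, most frequent category first."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, round(100 * n / total, 1)) for cat, n in counts.most_common()}

freq = frequency_table(responses)
for category, (count, percent) in freq.items():
    print(f"{category:10s} {count:3d} {percent:5.1f}%")
```

Because only counting is involved, the same approach works for nominal, ordinal, interval, and ratio data.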
Mode
Mode = Most Frequent Response
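A short sketch of finding the mode with Python's standard library. The movie choices are hypothetical; `multimode` is used so that ties, if any, are all reported.

```python
from statistics import multimode

# Hypothetical answers to "preferred movie type" (illustrative only).
movie_choices = ["Comedy", "Drama", "Comedy", "Romance", "Comedy", "Drama"]

# multimode returns every value tied for most frequent, in first-seen order.
modes = multimode(movie_choices)
print(modes)
```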
Median
Median = Middle Response
Numerical value that identifies the split between the high and low halves of sample responses
Notes:
(1) 18-34 has 32.1% of responses
(2) 45+ has 47% of responses
(3) Thus, the 50th percentile falls in the 35-44 age range (cumulative 32.1% to 53%)
Median = 35-44
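The reasoning in the notes can be sketched as a running cumulative percentage: the median category is the first one at which the cumulative share reaches 50%. The 20.9% share for 35-44 is not stated on the slide; it is derived here from the 32.1% to 53% range, and `median_category` is a hypothetical helper name.

```python
def median_category(distribution):
    """Return the first ordered category whose cumulative percent reaches 50.

    distribution: list of (label, percent) pairs, ordered from low to high.
    """
    cumulative = 0.0
    for label, percent in distribution:
        cumulative += percent
        if cumulative >= 50.0:
            return label
    raise ValueError("percentages never reach 50%")

# 18-34 and 45+ shares are from the slide; 35-44 is derived as 53.0 - 32.1.
age_distribution = [("18-34", 32.1), ("35-44", 20.9), ("45+", 47.0)]
median_age = median_category(age_distribution)
print(median_age)
```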
Mean
Mean = Average
Must be interval or ratio data
Convenience Store Food (percent of responses at each scale point, followed by the mean)
1 2 3 4 5 Mean
Not Fresh 0.3% 1.8% 18.2% 38.4% 41.4% Fresh 4.19
Bad Value 1.6% 21.6% 40.9% 35.6% Good Value 4.10
Taste Bad 0.4% 1.9% 20.2% 42.5% 35.0% Taste Good
Low Quality 0.5% 2.9% 26.1% 40.7% 29.8% High Quality 3.96
Few Take-Home Options 1.4% 5.0% 29.5% 35.9% 28.2% Many Take-Home Options 3.85
Little Variety 4.0% 29.3% 38.9% 27.5% Large Variety 3.89
Unhealthy 1.1% 7.4% 33.2% 31.9% 26.3% Healthy 3.75
Few Choices for Children 6.9% 35.7% 30.6% 25.8% Many Choices for Children 3.73
The children's food choices item can also be summarized with frequencies, the mode (3), and the median (4).
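To show how a mean follows from a percentage distribution, the sketch below weights each scale point (1 to 5) by the share of respondents who chose it. The two rows used are the complete "Fresh" and "Quality" rows from the slide; `mean_from_percentages` is a hypothetical helper, and dividing by the sum of the percentages is an assumption made to absorb rounding in the published figures.

```python
def mean_from_percentages(percentages):
    """Weighted mean of a 1..k rating scale, given the percent at each point."""
    total = sum(percentages)  # may differ slightly from 100 because of rounding
    weighted = sum(score * pct for score, pct in enumerate(percentages, start=1))
    return weighted / total

# Percent choosing 1, 2, 3, 4, 5 (values taken from the slide).
freshness = [0.3, 1.8, 18.2, 38.4, 41.4]  # Not Fresh (1) to Fresh (5)
quality = [0.5, 2.9, 26.1, 40.7, 29.8]    # Low Quality (1) to High Quality (5)

freshness_mean = round(mean_from_percentages(freshness), 2)
quality_mean = round(mean_from_percentages(quality), 2)
print(freshness_mean, quality_mean)
```

Both results agree with the means reported on the slide (4.19 and 3.96).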
Standard Deviation
Standard Deviation = Variation/Dispersion. Represents how much variation or dispersion there is in the distribution of responses. Less variation = lower standard deviation. Responses can be "standardized" (converted to z-scores) so that the distribution has a mean of 0 and a standard deviation of 1.
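A brief sketch of both ideas on this slide: a tighter set of ratings has a lower standard deviation, and standardizing scores (subtracting the mean, dividing by the standard deviation) yields values with mean 0 and standard deviation 1. The ratings and the `z_scores` helper are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed.

```python
from statistics import mean, stdev

# Hypothetical 1-5 satisfaction ratings (illustrative only).
consistent = [4, 4, 5, 4, 4, 5]  # responses cluster together
varied = [1, 5, 2, 5, 3, 5]      # responses are spread out

def z_scores(values):
    """Standardize values using the sample mean and sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

print(round(stdev(consistent), 2), round(stdev(varied), 2))

standardized = z_scores(varied)
print(round(mean(standardized), 6), round(stdev(standardized), 6))
```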
Standard Deviation and Margin of Error
Need interval or ratio data
Margin of error goes up as standard deviation goes up
Harder to detect group differences (e.g., male vs. female satisfaction) as standard deviation goes up
(Figure: margins of error of +/- 3%, +/- 5%, +/- 7%)
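One common way to see the link between standard deviation and margin of error is the formula for the margin of error of a mean, z * sd / sqrt(n). The sketch below assumes that formula, a 95% confidence level (z = 1.96), and a hypothetical sample of 400 respondents; none of these specifics are given on the slide.

```python
import math

def margin_of_error(sd, n, z=1.96):
    """Margin of error for a sample mean (assumes z = 1.96 for 95% confidence)."""
    return z * sd / math.sqrt(n)

n = 400  # hypothetical sample size
margins = [margin_of_error(sd, n) for sd in (0.5, 1.0, 1.5)]
for sd, moe in zip((0.5, 1.0, 1.5), margins):
    print(f"sd = {sd:.1f}  margin of error = +/- {moe:.3f}")
```

With the sample size held fixed, the margin of error grows in direct proportion to the standard deviation, which is why group differences become harder to detect.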
Cross-Tabulations
Cross-Tabulations
Comparing the distribution of responses across two categorical variables
Goal is to see if there is a statistically significant difference in the distribution of responses across groups
Males vs. females and preferred movie type
Age groups and smart phone ownership
Home owners vs. renters and Cable vs. Dish
Cross-Tabulations Considerations
Typically two categorical variables: Gender (M/F) and Movies (Comedy, Drama, Romance)
Although you can use interval scales in cross-tabulations, t-tests and ANOVA are more commonly used because of the ability to calculate means
Identify the independent and dependent variables
IV = which groups you are comparing (gender, age)
DV = what you are comparing (movies, smart phone)
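A minimal sketch of a cross-tabulation with a chi-square test of independence, using the gender-by-movie example. The counts are invented for illustration. The p-value shortcut `exp(-chi2 / 2)` is exact only when the degrees of freedom equal 2, as they do for this 2 x 3 table; for other table sizes a chi-square table or statistical software would be needed.

```python
import math

# Hypothetical counts: IV = gender (rows), DV = preferred movie type (columns).
observed = {
    "Male":   {"Comedy": 40, "Drama": 30, "Romance": 10},
    "Female": {"Comedy": 30, "Drama": 30, "Romance": 40},
}

def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    n = sum(row_totals.values())
    stat = 0.0
    for r in rows:
        for c in cols:
            expected = row_totals[r] * col_totals[c] / n
            stat += (table[r][c] - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(cols) - 1)
    return stat, df

chi2, df = chi_square(observed)
p_value = math.exp(-chi2 / 2)  # valid only because df == 2 here

# Row percentages show how each group's responses are distributed.
for group, counts in observed.items():
    total = sum(counts.values())
    shares = ", ".join(f"{c} {100 * v / total:.1f}%" for c, v in counts.items())
    print(f"{group}: {shares}")
print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.5f}")
```

If p is less than .05, the distribution of movie preferences differs between the two groups.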
Example: Buying Channel by Year. Need to determine statistical significance.
SPSS Output: Where Buy, Hobbyist vs. Professional. Conclusion: Because the reported p value of .000 (that is, p < .001) is less than .05, we can conclude that hobbyists are more likely to buy in store than professionals (74.6% vs. 68.5%).
T-Tests
T-Tests
Comparison of mean scores across two groups
Grouping variable = independent variable
Grouping variable must be categorical: male vs. female, smart phone vs. no smart phone
Dependent variable = must be something for which you can calculate a mean (interval/ratio): average satisfaction score, average importance rating, average number of times dining out per week
Independent Sample T-test
Two different (independent) groups
Male vs. female
Smart phone buyers vs. non-smart phone buyers
Hobbyists Professionals
Quality 4.1 4.7
Price 4.6 4.0
Service 4.8 4.9
Speed of Orders 4.2
Conclusions: Professionals put greater importance on quality, service and speed; hobbyists on price
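As a rough sketch of what an independent-samples t-test computes, the code below derives the t statistic and degrees of freedom for two separate groups. It uses Welch's version, which does not assume equal variances (SPSS reports this alongside the equal-variance version). The ratings are hypothetical, and the p-value lookup is left to a t table or statistical software because the standard library has no t distribution.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for two independent groups."""
    var_a = variance(group_a) / len(group_a)  # squared standard error, group A
    var_b = variance(group_b) / len(group_b)  # squared standard error, group B
    t = (mean(group_a) - mean(group_b)) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(group_a) - 1) + var_b ** 2 / (len(group_b) - 1)
    )
    return t, df

# Hypothetical importance-of-quality ratings (illustrative only).
hobbyists = [4, 4, 5, 3, 4, 5, 4, 4]
professionals = [5, 5, 4, 5, 5, 4, 5, 5]

t_stat, t_df = welch_t(hobbyists, professionals)
print(f"t = {t_stat:.3f}, df = {t_df:.1f}")
```

A negative t here means the first group (hobbyists) has the lower mean.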
SPSS Output: Purchase Likelihood, Hobbyist vs. Professional. Conclusion: Professionals are more likely to buy power tools. No significant difference for wood. Sig (p value) must be less than .05 for a difference to be significant.
Paired T-test
Paired T-test = same group, two different scores
Customer satisfaction 2011 vs. 2012 (same customers)
Comparison of attributes vs. competitor (see below)
Our Brand Brand X
Quality 4.9 4.2
Price 4.1 4.6
Service 4.8
Speed of Orders 4.7 4.3
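A paired t-test works on the difference between the two scores each respondent gives. The sketch below computes the t statistic and degrees of freedom from hypothetical ratings of "Our Brand" and "Brand X" by the same six respondents; `paired_t` is an illustrative helper, and the p-value would again come from a t table or statistical software.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired scores."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical quality ratings from the same respondents (illustrative only).
our_brand = [5, 4, 5, 4, 4, 5]
brand_x = [4, 3, 5, 4, 3, 4]

paired_stat, paired_df = paired_t(our_brand, brand_x)
print(f"t = {paired_stat:.3f}, df = {paired_df}")
```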
SPSS Output Comparing Perceptions of Price vs. Other Attributes Conclusion: Perceptions of price are lower than hand tool variety, power tool variety, and product quality
One-way ANOVA
One-Way ANOVA
Comparison of means across multiple groups
Purchase likelihood by age: 18-24, 25-45 and 45+ years old
Business type: Government, university, manufacturing
Conclusion: Each has a more likely purchase: G = Phones, U = Computers, M = Services
Government University Manufacturing
Computers 3.7 4.1 3.6
Phone Systems 3.4 3.3
Services 3.8 3.0 4.4
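A one-way ANOVA compares the variation between group means to the variation within groups. The sketch below computes the F statistic by hand for three hypothetical groups of purchase-likelihood ratings; the data and the `one_way_anova` helper are made up for illustration. A significant F says only that at least one group differs, so contrasts or post hoc tests are still needed to find which one.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F statistic, df between, df within) for a dict of group scores."""
    all_values = [v for scores in groups.values() for v in scores]
    grand_mean = mean(all_values)
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2 for scores in groups.values()
    )
    ss_within = sum(
        (v - mean(scores)) ** 2 for scores in groups.values() for v in scores
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical purchase-likelihood ratings for computers (illustrative only).
ratings = {
    "Government": [3, 4, 4, 4],
    "University": [4, 4, 5, 4],
    "Manufacturing": [3, 4, 3, 4],
}

f_stat, df_between, df_within = one_way_anova(ratings)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```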
SPSS Output: Comparing Overall Value by Channel. Conclusion: At least one significant difference exists. Retail buyers have the highest perceptions of overall value; catalog second (but need to look at contrasts).
You will get to see how to run these analyses in SPSS, including click-by-click tutorials and results.