Data Analysis: Review and Practical Application using SPSS

Data of Interest
– National Insurance Company: 1000 questionnaires sent, 285 respondents
– Questionnaire presentation: copy given in class

Coding
Coding broadly refers to the set of all tasks associated with transforming edited responses into a form that is ready for analysis.
Steps
– Transforming responses to each question into a set of meaningful categories
– Assigning numerical codes to the categories
– Creating a data set suitable for computer analysis

Transforming Responses into Meaningful Categories
– A structured question is pre-categorized.
– Responses to a nonstructured or open-ended question need to be grouped into a meaningful and manageable set of categories.
Q1: In this questionnaire, how many questions are not pre-categorized?

Missing-Value Category
A missing value can stem from
– A respondent's refusal to answer a question
– An interviewer's failure to ask a question or record an answer
– A "don't know" that does not seem legitimate
Best way to treat missing-value responses
– Sound questionnaire design
– Tight control over fieldwork

Assigning Numerical Codes
– Assign appropriate numerical codes to responses that are not already in quantified form.
– Assign the codes in a way that facilitates computer manipulation and analysis of the responses.

Multiple Response Question – Rank Order Question
Please rank the following insurance companies by placing a 1 beside the company you think is best overall, a 2 beside the company you think is second best, and so on.
__________ Progressive
__________ All State
__________ National
Q2: How would you code the previous question if it were added to the questionnaire?
This question requires as many variables (and columns) as there are objects to be ranked: 3 separate variables are needed.
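One possible way to picture the answer to Q2 is the short Python sketch below. The column order, the helper name `code_ranking`, and the missing-value code 9 are illustrative assumptions, not taken from the course coding sheet.

```python
# Hypothetical coding of the rank-order question: one variable (column) per company.
COMPANIES = ["Progressive", "All State", "National"]
MISSING = 9  # assumed missing-value code; not taken from the course coding sheet

def code_ranking(response):
    """Turn {company: rank} into one coded value per company, in a fixed column order."""
    return [response.get(company, MISSING) for company in COMPANIES]

# One respondent ranked National first, Progressive second, All State third.
row = code_ranking({"National": 1, "Progressive": 2, "All State": 3})
print(row)  # [2, 3, 1]

# A respondent who left All State blank gets the missing-value code in that column.
partial = code_ranking({"National": 1, "Progressive": 2})
print(partial)  # [2, 9, 1]
```

Each respondent therefore contributes three values, one per ranked object, which is why three separate variables are needed.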

Creating a Data Set
– A data set is an organized collection of data records.
– Each sample unit within the data set is called a case or observation.
Structure of a data set
– The number of observations is n.
– If the total number of variables embedded in the questionnaire is m, then the data set is an n x m matrix of numbers.
Importance of the coding sheet: anybody can enter or check the data set. (Copy of coding sheet)

SPSS Data Set
– Two views: Variable View and Data View
– Raw variables (labels and values)
– Transformed variables (compute and recode)

Preliminary Data Analysis: Basic Descriptive Statistics
– Preliminary data analysis examines the central tendency and the dispersion of the data on each variable in the data set.
– The measurement level dictates which statistics are appropriate (limitations on the next slide).
– The goal is to get a feeling for the data.
– Run descriptives. (outputs 1)
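As a rough illustration of what "run descriptives" produces, the sketch below computes the usual measures with Python's standard `statistics` module. The ratings are invented for the example and are not data from the National Insurance survey.

```python
import statistics

# Hypothetical ratings on a 10-point scale (not data from the National Insurance survey).
ratings = [7, 8, 8, 9, 6, 8, 10, 7, 8, 9]

summary = {
    "n": len(ratings),
    "mean": statistics.mean(ratings),       # needs metric (interval or ratio) data
    "median": statistics.median(ratings),   # needs at least ordinal data
    "mode": statistics.mode(ratings),       # usable even for nominal data
    "std_dev": statistics.stdev(ratings),   # sample standard deviation (n - 1)
    "range": max(ratings) - min(ratings),
}
for name, value in summary.items():
    print(f"{name}: {value}")
```

Which of these lines is meaningful depends on the measurement level of the variable, as the comments indicate.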

Measures of Central Tendency and Dispersion for Different Types of Variables

Why Averages May be Misleading
Researchers tested a new sauce product and found
– Mean rating of the taste test was close to the middle of the scale, which had "very mild" and "very hot" as its bipolar adjectives
Researcher’s conclusion
– Consumers need neither a really hot nor a really mild sauce

Why Averages May be Misleading (Cont’d)
Deeper examination revealed
– The existence of a large proportion of consumers who wanted the sauce to be mild and an equally large proportion who wanted it to be hot
Moral of the story:
– A clear understanding of the distribution of responses can help a researcher avoid erroneous inferences.
Talk about skewness and kurtosis.
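The sauce story can be reproduced with a few invented numbers. In the sketch below the 7-point scale and the ratings are assumptions made for illustration; the point is only that a mid-scale mean can hide two opposite groups.

```python
from collections import Counter
import statistics

# Hypothetical taste ratings: 1 = "very mild", 7 = "very hot".
# Half of the respondents want a mild sauce, half want a hot one.
ratings = [1, 1, 2, 2, 1, 2, 7, 7, 6, 6, 7, 6]

mean_rating = statistics.mean(ratings)
counts = Counter(ratings)

print("mean:", mean_rating)              # lands in the middle of the scale
for value in range(1, 8):
    print(value, "#" * counts[value])    # text histogram shows two separate peaks
```

Nobody in this invented sample chose the middle of the scale, yet the mean sits exactly there, which is why the frequency distribution should be inspected before interpreting an average.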

Crosstabs
– Count occurrences under specific conditions
– Used most of the time with categorical variables
– Examples to run

Cross-Tabulations- Comparing frequencies: Chi-square Contingency Test Technique used for determining whether there is a statistically significant relationship between two categorical (nominal or ordinal) variables

Cross-Tabulation Using SPSS for National Insurance Company One crucial issue in the customer survey of National Insurance Company was how a customer's education was associated with whether or not she or he would recommend National to a friend.

Need to Conduct Chi-square Test to Reach a Conclusion
The hypotheses are:
– $H_0$: There is no association between educational level and willingness to recommend National to a friend (the two variables are independent of each other).
– $H_a$: There is some association between educational level and willingness to recommend National to a friend (the two variables are not independent of each other).
Let’s do it….

Conducting the Test
The test involves comparing the actual, or observed, cell frequencies in the cross-tabulation with a corresponding set of expected cell frequencies ($E_{ij}$).

Expected Values
$$E_{ij} = \frac{n_i n_j}{n}$$
where $n_i$ and $n_j$ are the marginal frequencies, that is, the total number of sample units in category i of the row variable and category j of the column variable, respectively, and n is the total sample size

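Chi-square Test Statistic
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
where $O_{ij}$ and $E_{ij}$ are the observed and expected frequencies in cell (i, j), and r and c are the number of rows and columns, respectively, in the contingency table. The number of degrees of freedom associated with this chi-square statistic is given by the product (r - 1)(c - 1).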
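The expected frequencies, the chi-square statistic, and the degrees of freedom can be computed by hand as sketched below. The 2 x 3 table of counts is invented for illustration; it is not the National Insurance cross-tabulation.

```python
# Hypothetical 2 x 3 cross-tabulation (rows: would recommend yes/no;
# columns: three education levels). Counts are invented for illustration.
observed = [
    [30, 40, 50],
    [30, 20, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency for each cell: E_ij = n_i * n_j / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (O_ij - E_ij)^2 / E_ij
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print("expected:", expected)
print("chi-square:", round(chi_square, 3), "df:", df)
```

For these invented counts the statistic is 15 with 2 degrees of freedom, which exceeds the .05 critical value of 5.991, so the null hypothesis of independence would be rejected. SPSS reports the exact p-value, which this sketch does not compute.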

National Insurance Company Study: SPSS chi-square output (callouts: computed chi-square value, p-value)

National Insurance Company Study: P-Value Significance
– The actual significance level (p-value): the chances of getting a chi-square value as high as … when there is no relationship between education and recommendation are less than 19 in …
– The apparent relationship between education and recommendation revealed by the sample data is unlikely to have occurred because of chance.
– We can safely reject the null hypothesis.

Precautions in Interpreting Cross Tabulation Results
– Two-way tables cannot show conclusive evidence of a causal relationship
– Watch out for small cell sizes
– The risk of drawing erroneous inferences increases when more than two variables are involved

Overview of Techniques for Examining Associations
Spearman Correlation Coefficient
The technique is appropriate when
– The degree of association between two sets of ranks (pertaining to two variables) is to be examined
Illustrative research question(s) this technique can answer:
– Is there a significant relationship between motivation levels of salespeople and the quality of their performance? Assume that the data on motivation and quality of performance are in the form of ranks, say, 1 through 20, for 20 salespeople who were evaluated subjectively by their supervisor on each variable.

Overview of Techniques for Examining Associations (Cont’d)
Pearson Correlation Coefficient
This technique is appropriate when
– The degree of association between two metric-scaled (interval or ratio) variables is to be examined
Illustrative research question(s) this technique can answer:
– Is there a significant relationship between customers' age (measured in actual years) and their perceptions of our company's image (measured on a scale of 1 to 7)?

Spearman Correlation Coefficient
A Spearman correlation coefficient is a measure of association between two sets of ranks:
$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$
where $d_i$ = the difference between the ith sample unit's ranks on the two variables and n = the total sample size
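A minimal sketch of the Spearman computation follows, assuming the standard rank-difference formula for untied ranks. The rankings of five salespeople are invented, and the function name is an assumption made for the example.

```python
def spearman_from_ranks(ranks_x, ranks_y):
    """Spearman coefficient for two sets of untied ranks: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(ranks_x)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Hypothetical supervisor rankings of five salespeople (1 = highest).
motivation = [1, 2, 3, 4, 5]
performance = [2, 1, 3, 5, 4]

r_s = spearman_from_ranks(motivation, performance)
print(round(r_s, 2))  # 0.8
```

Identical rankings give a coefficient of 1 and exactly reversed rankings give -1. Tied ranks need a corrected formula, which this sketch does not cover.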

Pearson Correlation Coefficient
The Pearson correlation coefficient is the degree of association between variables that are interval- or ratio-scaled. The Pearson correlation coefficient ($r_{xy}$) between two such variables X and Y is given by
$$r_{xy} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{(n-1)\, s_x s_y}$$
where
– n = sample size (total number of data points)
– $\bar{X}$ and $\bar{Y}$ = means
– $X_i$ and $Y_i$ = values for any sample unit i
– $s_x$ and $s_y$ = standard deviations
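The Pearson formula translates almost term by term into code. The sketch below uses only the standard library; the ages and image ratings are invented for illustration.

```python
import statistics

def pearson_r(x, y):
    """r_xy = sum((X_i - mean_x) * (Y_i - mean_y)) / ((n - 1) * s_x * s_y)."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    s_x, s_y = statistics.stdev(x), statistics.stdev(y)  # sample standard deviations
    cross_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross_products / ((n - 1) * s_x * s_y)

# Hypothetical ages and image ratings (scale of 1 to 7) for six customers.
age = [25, 32, 41, 48, 55, 63]
image = [3, 4, 4, 5, 6, 6]

print(round(pearson_r(age, image), 3))
```

A perfectly linear increasing relationship returns 1 and a perfectly linear decreasing one returns -1. Whether a given coefficient is statistically significant is a separate test that SPSS reports alongside it.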

National Insurance Company – Computing Pearson Correlation Among Service Quality Constructs
National Insurance Company was interested in the correlations between respondents’ overall service-quality perceptions (on the 10-point scale) and their average ratings along each of the five dimensions of Service Quality

National Insurance Company– Computing Pearson Correlation Among Service Quality Constructs Using SPSS

Interpreting Pearson Correlation Coefficients
– Each of the five service-quality measures (reliability, empathy, tangibles, responsiveness, and assurance) is significantly related to the overall quality (OQ) at the .001 level of significance
– Responsiveness has the strongest correlation (.8625)
– Tangibles have the weakest correlation (.5038)
– All the correlations are strong enough to be meaningful

Comparing Means
– Mainly t-tests and ANOVAs
– T-test on OQ and gender

Independent T-tests
– Independent variable with at most 2 categories
– Check equality of variances (cf. output)
– p = .88: if there were no true difference, a difference at least as large as .04 would arise by chance about 88% of the time
– Cannot reject the null hypothesis
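For readers who want to see what the independent t-test computes, here is a minimal sketch of the equal-variance (pooled) t statistic. The two groups of OQ ratings are invented, and the function name is an assumption made for the example.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Independent-samples t statistic assuming equal variances; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_error
    return t, n_a + n_b - 2

# Hypothetical overall-quality (OQ) ratings for two groups of respondents.
group_1 = [7, 8, 6, 9, 8, 7]
group_2 = [8, 7, 7, 9, 6, 9]

t_value, df = pooled_t(group_1, group_2)
print(round(t_value, 3), df)
```

The sketch stops at t and its degrees of freedom. Turning them into a p-value requires the t distribution, and SPSS reports that p-value together with the equality-of-variance check.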

Analysis of Variance ANOVA is appropriate in situations where the independent variable is set at certain specific levels (called treatments in an ANOVA context) and metric measurements of the dependent variable are obtained at each of those levels

Example
– 24 stores chosen randomly for the study
– 8 stores randomly chosen for each treatment
– Treatment 1: store brand sold at the regular price
– Treatment 2: store brand sold at 50¢ off the regular price
– Treatment 3: store brand sold at 75¢ off the regular price
– Sales of the store brand monitored for a week in each store

Table 15.2 Unit Sales Data Under Three Pricing Treatments

ANOVA – Grocery Store Hypothesis
Grocery store example
– $H_0$: $\mu_1 = \mu_2 = \mu_3$
– $H_a$: At least one $\mu$ is different from one or more of the others
Hypotheses for k treatment groups or samples
– $H_0$: $\mu_1 = \mu_2 = \dots = \mu_k$
– $H_a$: At least one $\mu$ is different from one or more of the others
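The F-value that tests these hypotheses compares variation between treatment means with variation within treatments. The sketch below shows that computation on invented sales figures; they are not the Table 15.2 data, and only four stores per treatment are used to keep the example short.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of treatment groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of treatment means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual stores around their own treatment mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)

    df_between, df_within = k - 1, n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical weekly unit sales (NOT the Table 15.2 figures), 4 stores per treatment.
regular_price = [20, 22, 19, 23]
fifty_off = [26, 28, 25, 29]
seventy_five_off = [32, 35, 31, 34]

f_value, df_b, df_w = one_way_anova_f([regular_price, fifty_off, seventy_five_off])
print(round(f_value, 2), df_b, df_w)
```

A large F-value relative to its degrees of freedom is evidence against $H_0$. The associated probability is what the SPSS ANOVA output reports.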

Exhibit 15.1 SPSS Computer Output for ANOVA Analysis

Exhibit 15.1 SPSS Computer Output for ANOVA Analysis (Cont’d)
There is less than a .001 probability of obtaining an F-value as high as

ANOVA
– Recommendation and OQ
– Individual variable: OQ and EDUC (graph), and post hoc tests

Overview of Techniques for Examining Associations (Cont’d)
Simple Regression Analysis
This technique is appropriate when
– A mathematical function or equation linking two metric-scaled (interval or ratio) variables is to be constructed, under the assumption that values of one of the two variables are dependent on the values of the other

Overview of Techniques for Examining Associations–Simple Regression Analysis (Cont’d) Illustrative Research Question(s) this Technique Can Answer: – Are sales (measured in dollars) significantly affected by advertising expenditures (measured in dollars)? – What proportion of the variation in sales is accounted for by variation in advertising expenditures? How sensitive are sales to changes in advertising expenditures?

Overview of Techniques for Examining Associations (Cont’d)
Multiple Regression Analysis
This technique is appropriate under the same conditions as simple regression analysis, except that more than two variables are involved, wherein one variable is assumed to be dependent on the others

Overview of Techniques for Examining Associations (Cont’d) Illustrative Research Question(s) this Technique Can Answer: – Are sales significantly affected by advertising expenditures and price (where all three variables are measured in dollars)? – What proportion of the variation in sales is accounted for by advertising and price? How sensitive are sales to changes in advertising and price?

Simple Regression Analysis Generates a mathematical relationship (called the regression equation) between one variable designated as the dependent variable (Y) and another designated as the independent variable (X)

Independent Variable Vs. Dependent Variable
Independent variable
– Explanatory or predictor variable
– Often presumed to be a cause of the other
Dependent variable
– Criterion variable
– Influenced by the independent variable

Practical Applications of Regression Equations
– The regression coefficient, or slope, can indicate how sensitive the dependent variable is to changes in the independent variable
– The regression equation is a forecasting tool for predicting the value of the dependent variable for a given value of the independent variable
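Both uses of a regression equation, reading the slope as sensitivity and forecasting from the equation, can be seen in a small least-squares sketch. The advertising and sales figures are invented for illustration.

```python
import statistics

def simple_regression(x, y):
    """Least-squares fit of Y = a + b*X; returns (a, b, r_squared)."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx                 # slope: change in Y per unit change in X
    a = mean_y - b * mean_x         # intercept
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, 1 - ss_residual / ss_total

# Hypothetical advertising spend and sales, both in thousands of dollars.
advertising = [10, 20, 30, 40, 50]
sales = [120, 150, 170, 210, 230]

a, b, r_squared = simple_regression(advertising, sales)
forecast = a + b * 35   # prediction inside the observed range of X (10 to 50)
print(round(a, 2), round(b, 2), round(r_squared, 3), round(forecast, 2))
```

For these invented figures the slope is 2.8, so each additional thousand dollars of advertising goes with 2.8 thousand dollars of additional sales, and the forecast at 35 is 190. The forecast point was deliberately chosen inside the range of the data used to fit the equation.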

Precautions In Using Regression Analysis
– Only capable of capturing linear associations between dependent and independent variables
– A significant $R^2$ value does not necessarily imply a cause-and-effect association between the independent and dependent variables
– A regression equation may not yield a trustworthy prediction of the dependent variable when the value of the independent variable at which the prediction is desired is outside the range of values used in constructing the equation

Precautions In Using Regression Analysis (Cont’d)
– A regression equation based on relatively few data points cannot be trusted
– The ranges of data on the dependent and independent variables can affect the meaningfulness of a regression equation

Multiple Regression Analysis
$$\hat{Y}_i = a + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki}$$
where $\hat{Y}_i$ is the predicted value of the dependent variable for some unit i; $X_{1i}, X_{2i}, \dots, X_{ki}$ are values on the independent variables for unit i; $b_1, b_2, \dots, b_k$ are the regression coefficients; and a is the Y-intercept representing the prediction for Y when all independent variables are set to zero
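To show where a and the $b$ coefficients come from, the sketch below estimates them by solving the least-squares normal equations with plain Gauss-Jordan elimination. This is one textbook route chosen for illustration, not a description of how SPSS computes them, and the data are constructed so that the true coefficients are known.

```python
def fit_multiple_regression(rows, y):
    """Least-squares estimates [a, b_1, ..., b_k] via the normal equations (X'X) beta = X'y."""
    design = [[1.0] + list(row) for row in rows]      # leading 1 carries the intercept a
    p = len(design[0])
    # Build the augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(design, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]

# Hypothetical data: sales explained by advertising (X1) and price (X2).
predictors = [(10, 5), (20, 5), (30, 4), (40, 4), (50, 3), (60, 3)]
# Exact linear data built with a = 3, b1 = 2, b2 = -1, so the fit should recover them.
sales = [3 + 2 * x1 - 1 * x2 for x1, x2 in predictors]

a, b1, b2 = fit_multiple_regression(predictors, sales)
print(round(a, 6), round(b1, 6), round(b2, 6))
```

With real survey data the fit is not exact, and the coefficients should be read together with their significance tests and the $R^2$ value reported by SPSS.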

National Insurance Company – Multiple Regression Using SPSS
Jill and Tom were interested in conducting a multiple regression analysis wherein overall service-quality perceptions are the dependent variable and the average ratings along the five dimensions are the independent variables

Factor Analysis A data and variable reduction technique that attempts to partition a given set of variables into groups of maximally correlated variables

Factor Analysis Output and Its Interpretation
The primary output of factor analysis is a factor-loading matrix

Table 15.4 Factor-Loading Matrix Based on Data from Study of Star Customers (callouts: 3 variables load high on factor 1; 3 variables load high on factor 2)

Reducing Star Data
– X1, X4, and X6 can be combined into one factor
– X2, X3, and X5 can be combined into a second factor
– 6 variables can be reduced to two factors

Potential Applications of Factor Analysis
Used to
– Develop concise but comprehensive, multiple-item scales for measuring various marketing constructs
– Illuminate the nature of distinct dimensions underlying an existing data set
– Convert a large volume of data into a set of factor scores on a limited number of uncorrelated factors

Cluster Analysis
– Segments objects into groups so that members within each group are similar to one another in a variety of ways
– Useful for segmenting customers, market areas, and products

Use of Cluster Analysis
A firm offering recreational services wanted to enter a new region of the country. They gathered data on more than 100 characteristics including
– Demographics
– Expenditures on recreation
– Leisure time activities
– Interests of household members
The firm identified one or several household segments that are likely to be most responsive to its advertising and to its services

How Does Cluster Analysis Work? Cluster analysis measures the similarity between objects on the basis of their values on the various characteristics
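The slides do not name a specific clustering algorithm. As one common possibility, the sketch below uses k-means with Euclidean distance as the similarity measure. The household scores, the 1 to 10 scale, and the starting centroids are all invented for illustration.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain k-means on Euclidean distance; returns (labels, centroids)."""
    labels = []
    for _ in range(iterations):
        # Assign each object to its most similar (closest) centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the objects assigned to it.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical households scored 1 to 10 on (participation in, TV watching of) outdoor sports.
households = [(9, 2), (8, 3), (9, 1), (2, 8), (1, 9), (3, 9)]
labels, centers = k_means(households, centroids=[(9, 2), (2, 8)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Households with similar values on both characteristics end up in the same cluster: here, active participants in one group and TV watchers in the other.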

Exhibit 15.8 Clusters Formed by Using Data on Two Characteristics (axes, each running from low to high: extent of participation in outdoor sporting events; extent of watching outdoor sporting events on TV)