QM Spring 2002 Business Statistics Bivariate Analyses for Qualitative Data
Student Objectives Summarize regression analysis – Interpret regression statistics – Incorporate into report – Address questions concerning homework Discuss why regression won’t work with qualitative data Use crosstab approach for joint frequency distributions Use PivotTable feature of Excel for creating crosstabs
Let’s Wrap Up Regression Complete example from previous class Review interpretations of regression statistics – Describe the relationship – Assess the validity Summary of notation & terminology Address questions concerning the homework – Expectations – Mechanics (e.g., copy/paste) – Other... ?
Results of Analysis of TV Time versus Age Note: using complete data set Results b0 = hours/week b1 = hours per year of age R² = 56% Syx = hours/week Correlation (r): a single, multipurpose measure – Square root of R² – Same sign as b1 – r = – Summarizes the estimated strength of the relationship
Interpreting Regression Analyses (a) Describing the relationship – Intercept (b0): Base value for Y If it were possible for X to be 0, this is what Y would be – Slope (b1): How much Y changes when X changes by 1 unit The sensitivity of Y to changes in X (sometimes called the marginal value of X)
Interpreting Regression Analyses (b) Validity – R-Square (R²): we know Y varies, but how much (i.e., what percentage) of that variation is attributable to the variation in X? – Standard error (Syx): if we used the regression equation to predict Y, how far off, on average, should we expect to be?
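The four basic regression statistics, plus the correlation, can be computed by hand from their definitions. The sketch below is only an illustration: the `age` and `tv_hours` values are made up and are not the KIVZ data, and the variable names are assumptions chosen for readability.

```python
import math

# Hypothetical sample (NOT the KIVZ data): age in years (X), TV time in hours/week (Y).
age = [20, 30, 40, 50, 60]
tv_hours = [10, 14, 15, 21, 25]

n = len(age)
mean_x = sum(age) / n
mean_y = sum(tv_hours) / n

# Sums of squares and cross-products around the means.
sxx = sum((x - mean_x) ** 2 for x in age)
syy = sum((y - mean_y) ** 2 for y in tv_hours)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(age, tv_hours))

# Describing the relationship: slope and intercept.
b1 = sxy / sxx                 # change in Y per 1-unit change in X
b0 = mean_y - b1 * mean_x      # value of Y if X could be 0

# Assessing validity: R-square and standard error of the estimate.
predicted = [b0 + b1 * x for x in age]
sse = sum((y - p) ** 2 for y, p in zip(tv_hours, predicted))
r_squared = 1 - sse / syy              # share of Y's variation attributable to X
s_yx = math.sqrt(sse / (n - 2))        # typical size of a prediction error

# Correlation: square root of R-square, carrying the sign of the slope.
r = math.copysign(math.sqrt(r_squared), b1)

print(f"b0 = {b0:.2f} hours/week")
print(f"b1 = {b1:.2f} hours per year of age")
print(f"R-square = {r_squared:.1%}")
print(f"Syx = {s_yx:.2f} hours/week")
print(f"r = {r:.2f}")
```

Note that b0 and Syx carry the units of Y, b1 carries units of Y per unit of X, and R-square and r are unit-free.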
Questions About the Homework? Which data: – kivzdata.xls – All households, not just Ch.7 What analyses – Univariate Include: histogram and descriptive stats Variables: TV Time, Income – Bivariate Scatterplot (properly labeled) Regression statistics (the basic 4) The report – Integrate charts with text – Nontechnical language Other questions... ?
Regression, What Not to Do Typical modeling errors – Reverse Y and X – Treat qualitative variables as quantitative Use Excel shortcuts to create inflexible worksheets – Data analysis tool – Plot trend line
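Why reversing Y and X is an error can be seen numerically: the slope from regressing X on Y is not simply the reciprocal of the slope from regressing Y on X. The sketch below uses made-up numbers (not the KIVZ data), and `slope` is a hypothetical helper written for this illustration.

```python
# Hypothetical sample (NOT the KIVZ data): age in years, TV time in hours/week.
age = [20, 30, 40, 50, 60]
tv_hours = [10, 14, 15, 21, 25]


def slope(x, y):
    """Least-squares slope from regressing y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


slope_tv_on_age = slope(age, tv_hours)   # intended model: Y = TV time, X = age
slope_age_on_tv = slope(tv_hours, age)   # reversed model: Y = age, X = TV time

print(f"TV time on age: {slope_tv_on_age:.3f} hours per year of age")
print(f"Age on TV time: {slope_age_on_tv:.3f} years per hour of TV")
print(f"Reciprocal of the first slope: {1 / slope_tv_on_age:.3f}")
print(f"Product of the two slopes: {slope_tv_on_age * slope_age_on_tv:.3f}")
```

The product of the two slopes equals R-square, so the two regressions agree only when the fit is perfect. Choosing which variable is Y therefore changes the answer, not just its units.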
Now, Recall Analysis Depends on Data Type Univariate: – Quantitative data: histograms, averages, etc. – Qualitative data: bar charts, proportions Bivariate: – Both variables quantitative Scatterplots Regression analysis – Either or both variables qualitative Contingency tables, aka: –PivotTables (Excel) –Crosstabulations Chi-square analysis (beyond our scope)
Let’s Look at the Website Analytics Case Pilot sample of major eCommerce sites Note Internet business models – Virtual storefront (e.g., Amazon) – Content provider (e.g., WSJ) – Auction (e.g., eBay) – Several others, but these are the top three Major decision common in business – Make vs buy – Apply to site development What’s the research question here?
Examining the Question Does “make vs buy” depend upon type of business model? Start with simple frequency tables Doesn’t tell us about how these variables are related Need to go further: crosstab
Crosstabs: Many Flavors Joint frequency: basis for developing the other three Joint relative frequency (% of total) – Joint percentages – Margin percentages (same as univariate %) Analyzing relationships – Row percentage – Column percentage
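The four flavors can all be derived from the joint frequency table. The sketch below is a minimal illustration in plain Python; the ten `sites` records are invented for the example and are not the Website Analytics pilot sample, and the category labels are assumptions.

```python
from collections import Counter

# Hypothetical pilot sample (NOT the case data): (Development, Model) per site.
sites = [
    ("Make", "Storefront"), ("Make", "Storefront"), ("Buy", "Storefront"),
    ("Make", "Content"), ("Buy", "Content"), ("Buy", "Content"),
    ("Buy", "Auction"), ("Make", "Auction"), ("Make", "Auction"), ("Make", "Auction"),
]
rows = ["Make", "Buy"]                        # Development
cols = ["Storefront", "Content", "Auction"]   # Model

# Joint frequency: the basis for the other three tables.
joint = Counter(sites)
total = len(sites)
row_totals = {r: sum(joint[(r, c)] for c in cols) for r in rows}
col_totals = {c: sum(joint[(r, c)] for r in rows) for c in cols}

# Joint relative frequency: each cell as a % of the grand total.
pct_of_total = {(r, c): 100 * joint[(r, c)] / total for r in rows for c in cols}
# Row percentage: each cell as a % of its row total.
row_pct = {(r, c): 100 * joint[(r, c)] / row_totals[r] for r in rows for c in cols}
# Column percentage: each cell as a % of its column total.
col_pct = {(r, c): 100 * joint[(r, c)] / col_totals[c] for r in rows for c in cols}


def show(title, table):
    """Print one crosstab with Development in rows and Model in columns."""
    print(title)
    print(" " * 6 + "".join(f"{c:>12}" for c in cols))
    for r in rows:
        print(f"{r:<6}" + "".join(f"{table[(r, c)]:>12.1f}" for c in cols))
    print()


show("Joint frequency", joint)
show("% of total", pct_of_total)
show("Row %", row_pct)
show("Column %", col_pct)
```

The margin percentages are the row or column totals divided by the grand total, which is exactly what a univariate frequency table of each variable would report.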
Crosstabs: Relationships Relationship? – If so, the % of observations in a given category of the primary variable should differ substantially across categories of the explanatory variable – That is, depending upon type of table, Row % values differ down a given column, or Column % values differ across a given row Easier to analyze – With practice – Using basic probability concepts
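One way to apply this check is to compare the column percentages across a single row against that row's margin percentage. The sketch below assumes Development is the primary variable and Model the explanatory variable; the records are invented for illustration and are not the case data. It reports the spread but deliberately sets no cutoff, since what counts as "substantially" different is a judgment call.

```python
from collections import Counter

# Hypothetical pilot sample (NOT the case data): (Development, Model) per site.
sites = [
    ("Make", "Storefront"), ("Make", "Storefront"), ("Buy", "Storefront"),
    ("Make", "Content"), ("Buy", "Content"), ("Buy", "Content"),
    ("Buy", "Auction"), ("Make", "Auction"), ("Make", "Auction"), ("Make", "Auction"),
]
models = ["Storefront", "Content", "Auction"]

joint = Counter(sites)
model_totals = Counter(model for _, model in sites)

# Column % for the "Make" row: share of each model's sites that were built in-house.
make_pct = {m: 100 * joint[("Make", m)] / model_totals[m] for m in models}

# Margin %: share of ALL sites that were built in-house, ignoring the model.
overall_make = 100 * sum(joint[("Make", m)] for m in models) / len(sites)

# With no relationship, every column % would sit close to the margin %.
spread = max(make_pct.values()) - min(make_pct.values())

print(f"Make, all sites: {overall_make:.1f}%")
for m in models:
    print(f"Make, {m}: {make_pct[m]:.1f}%")
print(f"Spread across models: {spread:.1f} percentage points")
```

In probability terms, each column percentage estimates P(Make | Model) and the margin percentage estimates P(Make); a relationship shows up as conditional probabilities that differ from the unconditional one. With a sample this small, a large spread can also arise by chance.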
Using Excel’s PivotTable Feature for Crosstabs Select the data, including headings Click on Data | PivotTable Click twice on Next Click on Layout – Drag Development to row – Drag Model to column – Drag either to data – Double click on data button Select Count, then click on Options In Show Data As, select % of Total Click on OK – Click on OK Click on Finish
Homework Complete the KIVZ analysis/report Development vs Model for WA case – Try to create crosstabulation – Think about whether a relationship exists