BNAD 276: Statistical Inference in Management Spring 2016 Green sheets.

Presentation transcript:

BNAD 276: Statistical Inference in Management Spring 2016 Green sheets

Schedule of readings: Before our fourth and final exam (April 28th): OpenStax Chapters 1–13 (Chapter 12 is emphasized); Plous Chapter 17: Social Influences and Chapter 18: Group Judgments and Decisions.

Homework: On the class website, please complete homework worksheet #17, Hypothesis Testing with Correlations. Due: Thursday, April 14th.

By the end of lecture today (4/12/16): logic of hypothesis testing with correlations; interpreting correlations and scatterplots; simple regression; using correlation for predictions.

Exam 3: It went really well! Thanks for your patience and cooperation. We should have the grades up by Tuesday (it takes about a week).

Correlation: a measure of how two variables co-occur; it can also be used for prediction. It ranges between -1 and +1. The closer to zero, the weaker the relationship and the worse the prediction. It can be positive or negative. Remember, we’ll call the correlation “r”.

Positive correlation: as values on one variable go up, so do values for the other variable. Pairs of observations tend to occupy similar relative positions: higher scores on one variable tend to co-occur with higher scores on the second variable, and lower scores with lower scores. The scatterplot shows a cluster of points running from lower left to upper right. Remember, correlation = “r”.

Negative correlation: as values on one variable go up, values for the other variable go down. Pairs of observations tend to occupy dissimilar relative positions: higher scores on one variable tend to co-occur with lower scores on the second variable, and lower scores with higher scores. The scatterplot shows a cluster of points running from upper left to lower right. Remember, correlation = “r”.

Zero correlation: as values on one variable go up, values for the other variable go... anywhere. Pairs of observations tend to occupy seemingly random relative positions. The scatterplot shows no apparent slope.
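The three cases above can be illustrated with a small calculation. This is a minimal sketch using the standard Pearson formula; the function name `pearson_r` and all of the data are made up for illustration and do not come from the lecture.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical data, invented for this sketch
x = [1, 2, 3, 4, 5]
goes_up = [2, 4, 5, 4, 6]    # tends to rise as x rises
goes_down = [6, 4, 5, 4, 2]  # tends to fall as x rises
no_trend = [3, 5, 1, 5, 3]   # symmetric around the middle: no slope

r_pos = pearson_r(x, goes_up)
r_neg = pearson_r(x, goes_down)
r_zero = pearson_r(x, no_trend)
print(round(r_pos, 2), round(r_neg, 2), round(r_zero, 2))
```

The sign of r follows the direction of the cluster in the scatterplot, and r always stays between -1 and +1.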

Correlation does not imply causation. Is it possible that they are causally related? Yes, but the correlational analysis does not answer that question. What if it’s a perfect correlation: isn’t that causal? No, it feels more compelling, but it is neutral about causality. Remember the birthday cakes! (Scatterplot: Number of Birthday Cakes and Number of Birthdays.)

Correlation: how do numerical values change? [Scatterplot panels, each labeled with its value of r; the one surviving label is r = 0.61.]

Scatterplot of Height of Daughters (inches) and Height of Mothers (inches). This shows the strong positive (r = +0.8) relationship between the heights of daughters (in inches) and the heights of their mothers (in inches). The relationship is statistically significant (p < 0.05), so we reject the null hypothesis.
The scatterplot: both axes and values are labeled; both axes have real numbers listed; each variable name is listed clearly.
The description includes: both variables; strength (weak, moderate, strong); direction (positive, negative); estimated value (actual number).

Finding a statistically significant correlation. The result is “statistically significant” if:
the observed correlation is larger in absolute value than the critical correlation (we want our r to be big if we want it to be significantly different from zero: either negative or positive, but far away from zero);
the p value is less than 0.05, which is our alpha (we want our “p” to be small).
If so, we reject the null hypothesis, and then we have support for our alternative hypothesis.

Five steps to hypothesis testing
Step 1: Identify the research problem (hypothesis). Describe the null and alternative hypotheses. For correlation, the null is that r = 0 (no relationship).
Step 2: Decision rule. Alpha level? (α = .05 or .01)? Critical statistic (e.g. critical r) value from the table? Degrees of freedom = (n - 2), that is, df = # pairs - 2.
Step 3: Calculations.
Step 4: Make the decision whether or not to reject the null hypothesis. If the observed r is bigger than the critical r, then reject the null.
Step 5: Conclusion: tie the findings back in to the research problem.

Five steps to hypothesis testing, Problem 1: Is there a relationship between price and square feet? We measured 150 homes recently sold.

Five steps to hypothesis testing
Step 1: Identify the research problem (hypothesis): is there a relationship between the cost of a home and the size of the home? Describe the null and alternative hypotheses: the null is that there is no relationship (r = 0.0); the alternative is that there is a relationship (r ≠ 0.0).
Step 2: Decision rule: find the critical r (from the table). Alpha level: α = .05. Degrees of freedom = (n - 2), that is, df = # pairs - 2 = 150 - 2 = 148.

Critical r value from table: df = # pairs - 2 = 148, α = .05. Critical value r(148) = 0.195.

Five steps to hypothesis testing
Step 3: Calculations. Observed correlation: r(148) = 0.727.
Step 4: Make the decision whether or not to reject the null hypothesis. If the observed r is bigger than the critical r, then reject the null. Critical value r(148) = 0.195. Since 0.727 > 0.195, yes, we reject the null.

Conclusion: Yes, we reject the null. The observed r is bigger than the critical r (0.727 > 0.195), so it is significantly different from zero: something is going on. These data suggest a strong positive correlation between home prices and home size. This correlation was large enough to reach significance, r(148) = 0.73; p < 0.05.
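The decision rule in Steps 2 through 4 can be written out as a short sketch. The function name `correlation_test` is hypothetical; the numbers (150 pairs, observed r = 0.727, critical r = 0.195) are the ones quoted on the slides. The critical value is treated as a table lookup supplied by the caller, not something the code computes.

```python
def correlation_test(observed_r, n_pairs, critical_r):
    """Apply the decision rule for the null hypothesis r = 0.

    critical_r comes from a critical-r table (looked up by the caller).
    Returns the degrees of freedom and whether the null is rejected.
    """
    df = n_pairs - 2
    # Two-tailed rule: only the distance from zero matters, not the sign
    reject_null = abs(observed_r) > critical_r
    return df, reject_null

# Home price and square feet, values as quoted on the slides
df, reject = correlation_test(observed_r=0.727, n_pairs=150, critical_r=0.195)
verdict = "reject the null" if reject else "fail to reject the null"
print(f"r({df}) = {0.727:.2f}; {verdict}")
```

Taking the absolute value means a strong negative correlation is handled the same way as a strong positive one.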

Correlation matrices. Correlation matrix: a table showing correlations for all possible pairs of variables. Remember, correlation = “r”.
[Table: rows and columns are Education, Age, IQ, and Income. Diagonal entries are 1.0**. Off-diagonal entries shown: 0.65**, 0.52*, 0.27*, 0.41*, 0.38*; their cell positions did not survive in the transcript.]
* p < 0.05; ** p < 0.01

Correlation matrices: variable names. Make up any name that means something to you: VARX = “Variable X”, VARY = “Variable Y”, VARZ = “Variable Z”. The diagonal of the matrix holds the correlation of X with X, of Y with Y, and of Z with Z.

Correlation matrices: X with Y. The matrix lists the correlation of X with Y and the p value for the correlation of X with Y. Does this correlation reach statistical significance?

Correlation matrices: X with Z. The matrix lists the correlation of X with Z and the p value for the correlation of X with Z. Does this correlation reach statistical significance?

Correlation matrices: Y with Z. The matrix lists the correlation of Y with Z and the p value for the correlation of Y with Z. Does this correlation reach statistical significance?
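How such a matrix is laid out can be shown with a small sketch. It assumes the standard Pearson formula; the helper names and the data for VARX, VARY, and VARZ are invented for illustration, and p values are left out because they would need a t distribution that the Python standard library does not provide.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for paired observations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

def correlation_matrix(columns):
    """Correlation of every variable with every variable, keyed by name."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}

# Hypothetical data, invented for this sketch
data = {
    "VARX": [2, 4, 6, 8, 10],
    "VARY": [1, 3, 2, 5, 4],
    "VARZ": [9, 7, 8, 3, 4],
}
matrix = correlation_matrix(data)

# Print one row per variable; the diagonal is each variable with itself
for row_name, row in matrix.items():
    cells = "  ".join(f"{row[col]:+.2f}" for col in data)
    print(f"{row_name}  {cells}")
```

The diagonal is always 1 and the matrix is symmetric, so only the cells on one side of the diagonal carry new information.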

Correlation matrices: What do we care about? We measured the following characteristics of 150 homes recently sold: Price, Square Feet, Number of Bathrooms, Lot Size, Median Income of Buyers.

Critical r value from table: df = # pairs - 2 = 148, α = .05. Critical value r(148) = 0.195.

Correlation: independent and dependent variables. When correlation is used for prediction, we refer to the predicted variable as the dependent variable and to the predictor variable as the independent variable. What are we predicting?

Correlation: what do we need to define a line? Y-intercept = “a” (also “b0”): where the line crosses the Y axis. Slope = “b” (also “b1”): how steep the line is. The predicted variable goes on the “Y” axis and is called the dependent variable. The predictor variable goes on the “X” axis and is called the independent variable. Example with expenses per year and yearly income: if you spend this much, you probably make this much.

Angelina Jolie buys Brad Pitt a $24 million heart-shaped island for his 50th birthday. Dustin spends $12 for his birthday. On the plot of expenses per year and yearly income: Angelina spent this much, so Angelina probably makes this much; Dustin spent this much, so Dustin probably makes this much.

Assumptions Underlying Linear Regression: For each value of X, there is a group of Y values. These Y values are normally distributed. The means of these normal distributions of Y values all lie on the straight line of regression. The standard deviations of these normal distributions are equal.

Correlation: the prediction line. What is it good for? The prediction line:
makes the relationship easier to see (even if the specific observations, the dots, are removed);
identifies the center of the cluster of (paired) observations;
identifies the central tendency of the relationship (kind of like a mean);
can be used for prediction;
should be drawn to provide a “best fit” for the data;
should be drawn to provide maximum predictive power for the data;
should be drawn to provide minimum predictive error.
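One common way to make “best fit” and “minimum predictive error” concrete is least squares: choose the intercept a and slope b that minimize the sum of squared prediction errors. The slides do not name the fitting method, so treat least squares as an assumption here; the function names and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line Y' = a + b*X."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x              # slope
    a = mean_y - b * mean_x    # line passes through (mean_x, mean_y)
    return a, b

def sum_squared_error(a, b, xs, ys):
    """Total squared distance between observed Y and predicted Y'."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, invented for this sketch
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

a, b = fit_line(x, y)
best = sum_squared_error(a, b, x, y)
# Nudging either coefficient away from the fitted value increases the error
worse_slope = sum_squared_error(a, b + 0.1, x, y)
worse_intercept = sum_squared_error(a + 0.1, b, x, y)
print(a, b, best, worse_slope, worse_intercept)
```

Any other straight line through the same points gives a larger total squared error, which is the sense in which the fitted line has minimum predictive error.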

Predicting Restaurant Bill. Prediction line: Y’ = a + b1X1, where a is the Y-intercept and b1 is the slope; here Cost = a + b1(Persons). If “Persons” = 4, what is the prediction for “Cost”? Cost = a + b1(4) = 95.06: the expected cost for dinner for two couples (4 people) would be $95.06. If “Persons” = 1, what is the prediction for “Cost”? Cost = a + b1(1).

Predicting Rent. Prediction line: Y’ = a + b1X1, where a is the Y-intercept and b1 is the slope; here Rent = a + b1(SqFt). If “SqFt” = 800, what is the prediction for “Rent”? Rent = a + b1(800) = 990: the expected cost for rent on an 800 square foot apartment is $990. If “SqFt” = 2500, what is the prediction for “Rent”? Rent = a + b1(2500) = … + …,625 = 2,775.
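The coefficients of the rent equation did not survive in the transcript, but two of its predictions did: 990 at 800 square feet and 2,775 at 2,500 square feet. Because both points lie on the same prediction line, the slope and intercept can be recovered from them. The sketch below does that; the resulting values are inferred, not quoted from the slides, and the function names are hypothetical.

```python
def line_through(x1, y1, x2, y2):
    """Intercept a and slope b of the straight line through two points."""
    b = (y2 - y1) / (x2 - x1)  # rise over run
    a = y1 - b * x1            # solve y1 = a + b*x1 for a
    return a, b

# The two predictions quoted on the slide: (SqFt, Rent)
a, b = line_through(800, 990, 2500, 2775)

def predict_rent(sqft):
    """Predicted rent Y' = a + b*SqFt using the recovered coefficients."""
    return a + b * sqft

print(round(a, 2), round(b, 4))
print(round(predict_rent(800), 2), round(predict_rent(2500), 2))
```

Under this reconstruction the slope term at 2,500 square feet is b * 2500, which agrees with the “,625” fragment left in the transcript.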

Regression Example: Rory is the owner of a small software company and employs 10 sales staff. Rory sends his staff all over the world consulting, selling, and setting up his system. He wants to evaluate his staff in terms of who are the most (and least) productive salespeople, and also whether more sales calls actually result in more systems being sold. So he simply measures the number of sales calls made by each salesperson and how many systems they successfully sold.

Regression Example: Do more sales calls result in more sales made? Independent variable: number of sales calls made. Dependent variable: number of systems sold. [Scatterplot with points labeled by salesperson: Ethan, Isabella, Ava, Emma, Emily, Jacob, Joshua.] Step 1: Draw the scatterplot. Step 2: Estimate r.

Regression Example: Do more sales calls result in more sales made? Step 3: Calculate r: r = 0.71. Step 4: Is it a significant correlation? n = 10, df = 8, alpha = .05. The observed r is larger than the critical r (0.71 > 0.632), therefore we reject the null hypothesis. Yes, it is a significant correlation: r(8) = 0.71; p < 0.05.
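The slides take the critical r from a table. As a cross-check, a critical r can also be obtained from a critical t, using the standard relationship t = r * sqrt(df) / sqrt(1 - r^2). The sketch below assumes a two-tailed critical t of 2.306 for df = 8 at alpha = .05, which is a standard t-table value and is not given on the slides; the function names are hypothetical.

```python
from math import sqrt

def t_from_r(r, df):
    """t statistic that corresponds to a correlation r with df = n - 2."""
    return r * sqrt(df) / sqrt(1 - r ** 2)

def critical_r_from_t(critical_t, df):
    """Invert the relationship: the r that sits exactly at the critical t."""
    return critical_t / sqrt(critical_t ** 2 + df)

df = 10 - 2          # 10 salespeople, so 10 pairs
critical_t = 2.306   # assumed: two-tailed, alpha = .05, df = 8 (t table)

critical_r = critical_r_from_t(critical_t, df)
observed_t = t_from_r(0.71, df)
print(round(critical_r, 3), round(observed_t, 2))
```

Comparing the observed r with the critical r and comparing the observed t with the critical t lead to the same decision.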

Regression: Predicting sales. Step 1: Draw the prediction line. What are we predicting? Draw a regression line and state the regression equation, with r = 0.71, b = slope, and a = intercept.

Regression: Predicting sales
Step 1: Predict sales for a certain number of sales calls. What should you expect from a salesperson who makes 1 call (Madison, Joshua)?
Step 2: State the regression equation: Y’ = a + bx.
Step 3: Solve for some value of Y’: Y’ = a + b(1). If you make one sales call, you should sell Y’ systems.
If they sell more, they are overperforming. If they sell fewer, they are underperforming.

Regression: Predicting sales
Step 1: Predict sales for a certain number of sales calls. What should you expect from a salesperson who makes 2 calls (Isabella, Jacob)?
Step 2: State the regression equation: Y’ = a + bx.
Step 3: Solve for some value of Y’: Y’ = a + b(2). If you make two sales calls, you should sell Y’ systems.
If they sell more, they are overperforming. If they sell fewer, they are underperforming.
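The over- and underperforming comparison amounts to checking each person's actual sales against the value predicted by the regression line. The intercept, slope, and sales figures on the slides did not survive in the transcript, so everything numeric below is a hypothetical stand-in, as are the function names and the staff labels.

```python
def predicted_sales(calls, a, b):
    """Regression prediction Y' = a + b*x for a given number of calls."""
    return a + b * calls

def performance(calls, sold, a, b):
    """Compare actual systems sold with the regression prediction."""
    expected = predicted_sales(calls, a, b)
    if sold > expected:
        return "overperforming"
    if sold < expected:
        return "underperforming"
    return "as predicted"

# Hypothetical intercept and slope (the slide values are not recoverable)
A, B = 10.0, 5.0

# Hypothetical staff: name -> (sales calls made, systems sold)
staff = {
    "Seller A": (1, 20),
    "Seller B": (1, 12),
    "Seller C": (2, 20),
}

labels = {name: performance(calls, sold, A, B)
          for name, (calls, sold) in staff.items()}
for name, label in labels.items():
    print(name, label)
```

The difference between actual and predicted sales is the residual; its sign is what separates the two groups.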