
Business Statistics for Managerial Decision
Farideh Dehkordi-Vakil

Comparing Two Proportions
We often want to compare the proportions of two groups (such as men and women) that have some characteristic. We call the two groups being compared Population 1 and Population 2, and the two population proportions of successes p1 and p2. The data consist of two independent SRSs, of sizes n1 from population 1 and n2 from population 2.

Comparing Two Proportions
The proportion of successes in each sample estimates the corresponding population proportion. Here is the notation we will use:

Population   Population proportion   Sample size   Count of successes   Sample proportion
1            p1                      n1            X1                   p̂1 = X1/n1
2            p2                      n2            X2                   p̂2 = X2/n2

Sampling Distribution of p̂1 − p̂2
Choose independent SRSs of sizes n1 and n2 from two populations with proportions p1 and p2 of successes. Let D = p̂1 − p̂2 be the difference between the two sample proportions of successes. Then as both sample sizes increase, the sampling distribution of D becomes approximately Normal. The mean of the sampling distribution is p1 − p2. The standard deviation of the sampling distribution is

σ_D = √( p1(1 − p1)/n1 + p2(1 − p2)/n2 ).

Sampling Distribution of p̂1 − p̂2
The sampling distribution of the difference of two sample proportions is approximately Normal. Its mean and standard deviation are found from the two population proportions of successes, p1 and p2.
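A minimal simulation sketch (Python with NumPy) makes these facts concrete; the population proportions and sample sizes below are made-up values chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population proportions and sample sizes, for illustration only.
p1, p2 = 0.30, 0.20
n1, n2 = 400, 350

# Draw many pairs of independent SRSs and record the difference of sample proportions.
reps = 100_000
phat1 = rng.binomial(n1, p1, size=reps) / n1
phat2 = rng.binomial(n2, p2, size=reps) / n2
diff = phat1 - phat2

# Compare the simulated mean and standard deviation with the formulas above.
mean_theory = p1 - p2
sd_theory = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
print(diff.mean(), mean_theory)
print(diff.std(ddof=1), sd_theory)
```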

Confidence Interval
Just as in the case of estimating a single proportion, a small modification of the sample proportions greatly improves the accuracy of confidence intervals. The Wilson estimates of the two population proportions add one success and one failure to each sample:

p̃1 = (X1 + 1)/(n1 + 2),  p̃2 = (X2 + 1)/(n2 + 2).

Confidence Interval
The standard deviation of D = p̂1 − p̂2 is approximately

σ_D = √( p1(1 − p1)/n1 + p2(1 − p2)/n2 ).

To obtain a confidence interval for p1 − p2, we replace the unknown parameters in the standard deviation by estimates, obtaining an estimated standard deviation, or standard error,

SE_D = √( p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2 ).

Confidence Interval for Comparing Two Proportions
An approximate level C confidence interval for p1 − p2 is

(p̂1 − p̂2) ± z*·SE_D,

where z* is the upper (1 − C)/2 critical value of the standard Normal distribution.
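As a sketch of how this interval could be computed in practice (the function name and the example counts below are hypothetical, not taken from the slides):

```python
import math

def two_prop_ci(x1, n1, x2, n2, z_star=1.96, plus_four=False):
    """Approximate confidence interval for p1 - p2 from two independent SRSs.

    With plus_four=True, the Wilson-style 'plus four' adjustment described
    above is applied: one success and one failure are added to each sample.
    """
    if plus_four:
        x1, n1, x2, n2 = x1 + 1, n1 + 2, x2 + 1, n2 + 2
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    d = p1_hat - p2_hat
    return d - z_star * se, d + z_star * se

# Hypothetical counts, for illustration only:
low, high = two_prop_ci(120, 400, 90, 380, plus_four=True)
print(round(low, 3), round(high, 3))
```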

Example: "No Sweat" Garment Labels
Following complaints about working conditions in some apparel factories, both in the United States and abroad, a joint government and industry commission recommended in 1998 that companies that monitor and enforce proper standards be allowed to display a "No Sweat" label on their products. A survey of U.S. residents aged 18 or older asked a series of questions about how likely they would be to purchase a garment under various conditions.

Example: "No Sweat" Garment Labels
For some conditions, it was stated that the garment had a "No Sweat" label; for others, there was no mention of such a label. On the basis of the responses, each person was classified as a "label user" or a "label nonuser." About 16.5% of those surveyed were label users. One purpose of the study was to describe the demographic characteristics of users and nonusers.

Example: "No Sweat" Garment Labels
The study suggested that there is a gender difference in the proportion of label users. The data are summarized by the sample size n and the count of label users X for each group, with Population 1 the women and Population 2 the men.

Example: "No Sweat" Garment Labels
First calculate the standard error of the observed difference; the 95% confidence interval is then

(p̂1 − p̂2) ± 1.96·SE_D.

Example: "No Sweat" Garment Labels
With 95% confidence, we can say that the difference in the proportions is about 0.10 ± 0.06, that is, between roughly 0.04 and 0.16. Alternatively, we can report that women are about 10 percentage points more likely to be label users than men, with a 95% margin of error of 6 percentage points. In this example we chose women to be the first population. Had we chosen men as the first population, the estimate of the difference would be negative (−0.104). Because it is easier to discuss positive numbers, we generally choose the first population to be the one with the higher proportion. The choice does not affect the substance of the analysis.

Significance Tests
It is sometimes useful to test the null hypothesis that the two population proportions are the same. We standardize D = p̂1 − p̂2 by subtracting its mean p1 − p2 and then dividing by its standard deviation σ_D. If n1 and n2 are large, the standardized difference is approximately N(0, 1). To estimate σ_D, we take into account the null hypothesis that p1 = p2.

Significance Tests
If these two proportions are equal, we can view all of the data as coming from a single population. Let p denote the common value of p1 and p2. The standard deviation of D = p̂1 − p̂2 is then

σ_D = √( p(1 − p)(1/n1 + 1/n2) ).

Significance Tests
We estimate the common value of p by the overall proportion of successes in the two samples,

p̂ = (X1 + X2)/(n1 + n2).

This estimate of p is called the pooled estimate. To estimate the standard deviation of D, substitute p̂ for p in the expression for σ_D. The result is a standard error for D under the condition that the null hypothesis H0: p1 = p2 is true. The test statistic uses this standard error to standardize the difference between the two sample proportions.

Significance Tests for Comparing Two Proportions
To test H0: p1 = p2, compute the z statistic

z = (p̂1 − p̂2) / √( p̂(1 − p̂)(1/n1 + 1/n2) ),

where p̂ is the pooled estimate. In terms of a standard Normal variable Z, the approximate P-value is P(Z ≥ z) for Ha: p1 > p2, P(Z ≤ z) for Ha: p1 < p2, and 2·P(Z ≥ |z|) for Ha: p1 ≠ p2.
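The pooled test above could be coded as follows; this is a sketch, and the function name and the use of math.erf for the Normal CDF are my own choices rather than anything from the slides:

```python
import math

def two_prop_z_test(x1, n1, x2, n2):
    """Pooled z test of H0: p1 = p2 with a two-sided P-value."""
    p_pool = (x1 + x2) / (n1 + n2)                      # pooled estimate of p
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    # Standard Normal CDF via the error function: Phi(t) = 0.5 * (1 + erf(t / sqrt(2)))
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - phi)                             # two-sided alternative
    return z, p_value

# Hypothetical counts, for illustration only:
z, p = two_prop_z_test(120, 400, 90, 380)
print(z, p)
```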

Example: Men, Women, and Garment Labels
The previous example presented the survey data on whether consumers are "label users" who pay attention to label details when buying a shirt. Are men and women equally likely to be label users? The data summary again gives the sample size n and the count of label users X for Population 1 (women) and Population 2 (men).

Example: Men, Women, and Garment Labels
We compare the proportions of label users in the two populations (women and men) by testing the hypotheses

H0: p1 = p2
Ha: p1 ≠ p2

The pooled estimate of the common value of p is p̂ = (X1 + X2)/(n1 + n2); this is the proportion of label users in the entire sample.

Example: Men, Women, and Garment Labels
The test statistic z is calculated from the pooled standard error as described above. The observed difference is more than 3 standard deviations away from zero.

Example: Men, Women, and Garment Labels
The P-value is correspondingly small: with |z| > 3, the two-sided P-value is less than about 0.003. Conclusion: 21% of women are label users versus only 11% of men, and the difference is statistically significant.

Simple Regression
Simple regression analysis is a statistical tool that estimates the mathematical relationship between a dependent variable (usually called Y) and an independent variable (usually called X). The dependent variable is the variable for which we want to make a prediction. While various non-linear forms may be used, simple linear regression models are the most common.

Introduction
The primary goal of quantitative analysis is to use current information about a phenomenon to predict its future behavior. Current information is usually in the form of a set of data. In a simple case, when the data form a set of pairs of numbers, we may interpret them as representing the observed values of an independent (or predictor) variable X and a dependent (or response) variable Y.

Introduction
The goal of the analyst who studies the data is to find a functional relation between the response variable Y and the predictor variable X.

Regression Function
The statement that the relation between X and Y is statistical should be interpreted as providing the following guidelines:
1. Regard Y as a random variable.
2. For each X, take f(X) to be the expected value (i.e., mean value) of Y.
3. Given that E(Y) denotes the expected value of Y, call the equation E(Y) = f(X) the regression function.

Historical Origin of Regression
Regression analysis was first developed by Sir Francis Galton, who studied the relation between heights of sons and fathers. Heights of sons of both tall and short fathers appeared to "revert" or "regress" to the mean of the group.

Basic Assumptions of a Regression Model
A regression model is based on the following assumptions:
1. There is a probability distribution of Y for each level of X.
2. Given that E(Y) denotes the mean value of Y, the standard form of the model is

Y = E(Y) + ε,

where ε is a random variable with a normal distribution.
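To illustrate what these assumptions say, here is a small simulation sketch; the parameter values are hypothetical, and the linear form of E(Y) anticipates the formal model stated later:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical parameter values, for illustration only.
beta0, beta1, sigma = 10.0, 2.0, 3.0

x = np.linspace(0, 20, 50)                  # levels of X, treated as known constants
eps = rng.normal(0.0, sigma, size=x.size)   # independent N(0, sigma^2) deviations
y = beta0 + beta1 * x + eps                 # at each level of X, Y has its own distribution

# The mean of Y at a given x is beta0 + beta1 * x; eps produces the scatter around it.
```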

Statistical Relation between Lot Size and Number of Man-Hours (Westwood Company Example)

Pictorial Presentation of Linear Regression Model

Construction of Regression Models
1. Selection of independent variables
2. Functional form of the regression relation
3. Scope of the model

Uses of Regression Analysis
Regression analysis serves three major purposes:
1. Description
2. Control
3. Prediction
These purposes frequently overlap in practice.

Formal Statement of the Model
The general regression model is

Y_i = β0 + β1·X_i + ε_i

1. β0 and β1 are parameters.
2. X is a known constant.
3. The deviations ε_i are independent N(0, σ²).

Meaning of Regression Coefficients
The values of the regression parameters β0 and β1 are not known; we estimate them from data. β1 indicates the change in the mean response per unit increase in X.

Regression Line
If the scatter plot of our sample data suggests a linear relationship between the two variables, we can summarize the relationship by drawing a straight line on the plot. The least squares method gives us the "best" estimated line for our set of sample data.

Regression Line
We will write an estimated regression line based on sample data as

ŷ = b0 + b1·x.

The method of least squares chooses the values of b0 and b1 that minimize the sum of squared errors

SSE = Σ (y_i − b0 − b1·x_i)².

Regression Line
Using calculus, we obtain the estimating formulas

b1 = Σ (x_i − x̄)(y_i − ȳ) / Σ (x_i − x̄)²,
b0 = ȳ − b1·x̄.
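A short sketch of these formulas in code (the function name and the example data are hypothetical, chosen only to show the calculation):

```python
import numpy as np

def least_squares_line(x, y):
    """Least squares estimates b0, b1 for the line y-hat = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical (x, y) pairs, for illustration only:
b0, b1 = least_squares_line([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 8.1, 9.8])
print(b0, b1)
```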

Estimation of Mean Response
The fitted regression line can be used to estimate the mean value of Y for a given value of X.

Example: The weekly advertising expenditure (x) and weekly sales (y) are presented in the following table.

Point Estimation of Mean Response
From the previous table we obtain the sums and means needed above, and the least squares estimates of the regression coefficients follow from the estimating formulas.

Point Estimation of Mean Response
The estimated regression function is ŷ = b0 + b1·x with b1 = 10.8. This means that if the weekly advertising expenditure is increased by $1, we would expect weekly sales to increase by $10.80.

Point Estimation of Mean Response
Fitted values for the sample data are obtained by substituting the corresponding x value into the estimated regression function. For example, if the advertising expenditure is $50, the estimated sales are ŷ = b0 + b1(50). This is called the point estimate of the mean response (sales).
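As a concrete numerical sketch of this calculation: the slope of 10.8 comes from the text above, but the intercept below is a hypothetical placeholder, since the fitted intercept is not preserved in this transcript:

```python
b0 = 500.0   # hypothetical intercept, for illustration only
b1 = 10.8    # slope from the text: sales rise by $10.80 per extra $1 of advertising

x = 50                   # weekly advertising expenditure, in dollars
y_hat = b0 + b1 * x      # point estimate of mean weekly sales at x = 50
print(y_hat)             # 1040.0 with these illustrative numbers
```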

Residual
The residual is the difference between the observed value y_i and the corresponding fitted value ŷ_i:

e_i = y_i − ŷ_i.

Residuals are highly useful for studying whether a given regression model is appropriate for the data at hand.
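A brief sketch of computing residuals for a fitted line (the function name is my own):

```python
import numpy as np

def residuals(x, y, b0, b1):
    """Residuals e_i = y_i - (b0 + b1 * x_i) for a fitted straight line."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return y - (b0 + b1 * x)

# Plotting the residuals against x (or against the fitted values) is a quick
# visual check of whether the straight-line model is appropriate.
```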

Example: weekly advertising expenditure