SESSION 49 - 52 Last Update 17th June 2011 Regression.


1 SESSION 49 - 52 Last Update 17th June 2011 Regression

2 Lecturer: Florian Boehlandt
University: University of Stellenbosch Business School
Domain: http://www.hedge-fund-analysis.net/pages/vega.php

3 Learning Objectives
1. XY-Scatter Diagrams
2. Plotting the Regression Line
3. Coefficient Estimates
4. Pearson Coefficient of Correlation
5. Spearman Rank Correlation Coefficient

4 XY-Scatter Diagram To draw a scatter diagram we need data for two variables. In applications where one variable depends to some degree on the other variable, the dependent variable is labeled Y and the other, called the independent variable, X. The values for X and Y are combined into a single data point using the observations for X and Y as coordinates.

5 Example Temperature - Truck
Obs   Temp (x)   Trucks (y)
1     11         2.5
2     14         6.5
3     20         8.5
4     21         10.5
5     23         11
6     24         12
7     26         13
8     28         13.5
9     30         15.5
10    34         19

6 Regression Analysis Regression analysis is used to predict the value of one variable on the basis of one or more other variables. The first-order linear model describes the relationship between the dependent variable Y and the independent variable X. With a as the y-intercept and b as the slope coefficient, the regression model is of the form: $y = a + bx$

7 Example Temperature - Truck (same ten observations of temperature x and truckloads y as on slide 5) The estimators of the intercept a and the slope coefficient b are obtained by drawing a straight line through the sample data.

8 Intercept and Slope The intercept a is the y-coordinate of the point where the linear function intersects the y-axis. The slope coefficient b is defined as the change in y for a unit change in x.

9 Fitted Line With Residuals The line drawn through the points is called the regression line.

10 Residuals Squared The regression line, or least squares line, is the line that minimizes the sum of the squared vertical differences (residuals) between the data points and the line.
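The least squares property can be checked numerically. The sketch below is illustrative only: it uses the temperature-truck data from slide 5, fits the line with the standard closed-form least squares formulas, and compares its sum of squared residuals with that of a few nearby lines. The helper name `sse` and the perturbation sizes are arbitrary choices, not part of the slides.

```python
# Temperature (x) and truckloads (y) from the slide 5 example.
x = [11, 14, 20, 21, 23, 24, 26, 28, 30, 34]
y = [2.5, 6.5, 8.5, 10.5, 11, 12, 13, 13.5, 15.5, 19]


def sse(a, b):
    """Sum of squared residuals for the line y = a + b*x (hypothetical helper)."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


# Closed-form least squares estimates.
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
b_hat = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
a_hat = mean_y - b_hat * mean_x

best = sse(a_hat, b_hat)

# Shift the intercept and/or slope slightly; every shifted line fits worse.
nearby = [sse(a_hat + da, b_hat + db)
          for da in (-0.5, 0, 0.5)
          for db in (-0.05, 0, 0.05)
          if (da, db) != (0, 0)]

print("SSE of least squares line:", best)
print("smallest SSE among nearby lines:", min(nearby))
```

Any other intercept or slope gives a larger sum of squared residuals, which is what "least squares" means.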

11 Calculating Coefficients Raw data (y as the dependent and x as the independent variable): the ten temperature (x) and truckload (y) observations from slide 5.

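12 Solution
Obs     Temp (x)   Trucks (y)   xy      x^2
1       11         2.5          27.5    121
2       14         6.5          91      196
3       20         8.5          170     400
4       21         10.5         220.5   441
5       23         11           253     529
6       24         12           288     576
7       26         13           338     676
8       28         13.5         378     784
9       30         15.5         465     900
10      34         19           646     1156
Total   231        112          2877    5779
Step 1: Calculate the gradient b (beta):
$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} = \frac{10 \cdot 2877 - 231 \cdot 112}{10 \cdot 5779 - 231^2} = \frac{2898}{4429} \approx 0.654$$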

13 Solution Totals from the solution table: n = 10, sum of x = 231, sum of y = 112, sum of xy = 2877, sum of x^2 = 5779.
Step 2: Calculate the intercept a (alpha), using b ≈ 0.6543:
$$a = \frac{\sum y - b \sum x}{n} = \frac{112 - 0.6543 \cdot 231}{10} \approx -3.914$$
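Steps 1 and 2 can be reproduced in a few lines of Python. This is a minimal sketch that assumes the standard sum-based least squares formulas; the variable names are illustrative and do not come from the slides.

```python
# Temperature (x) and truckloads (y) from the example.
x = [11, 14, 20, 21, 23, 24, 26, 28, 30, 34]
y = [2.5, 6.5, 8.5, 10.5, 11, 12, 13, 13.5, 15.5, 19]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# Step 1: gradient (slope) b.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Step 2: intercept a, equivalent to mean(y) - b * mean(x).
a = (sum_y - b * sum_x) / n

print(f"totals: {sum_x}, {sum_y}, {sum_xy}, {sum_x2}")
print(f"b = {b:.3f}, a = {a:.3f}")
```

Because the code keeps b unrounded, the third decimal of a can differ slightly from a hand calculation that rounds b first.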

14 Interpreting the Coefficients The slope coefficient b may be interpreted as the change in the dependent variable y for a one-unit change in x. In the previous example, a one-unit increase in temperature is associated with b = 0.654 additional truckloads of cool drinks sold. The intercept a is the point at which the regression line and the y-axis intersect. If x = 0 lies far outside the range of sample values of x, the interpretation of the intercept is not straightforward. In the temperature-truck example, x = 0 lies outside the range between the smallest and largest values of x in the sample. Interpreting the intercept literally would imply that at a temperature of x = 0, soft-drink sales fall to -3.914 truckloads, which is not meaningful.

15 Point Prediction Once the coefficient estimates are obtained, we can predict the outcome for any x between the minimum and maximum sample observations (point prediction) using the regression function y = a + bx. For example (results computed with unrounded coefficients):
x = 16 degrees: y = -3.914 + 0.654 × 16 ≈ 6.554 ≈ 7 truckloads
x = 32 degrees: y = -3.914 + 0.654 × 32 ≈ 17.023 ≈ 17 truckloads
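A point prediction can be wrapped in a small function that refuses to extrapolate beyond the sample range. The sketch below is one possible way to do this; the function name `predict` and the choice to raise `ValueError` outside the observed range are assumptions made for illustration.

```python
# Temperature (x) and truckloads (y) from the example.
x = [11, 14, 20, 21, 23, 24, 26, 28, 30, 34]
y = [2.5, 6.5, 8.5, 10.5, 11, 12, 13, 13.5, 15.5, 19]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
a = (sum_y - b * sum_x) / n                                   # intercept


def predict(x_new):
    """Point prediction y = a + b*x, restricted to the observed x range."""
    if not min(x) <= x_new <= max(x):
        raise ValueError("x lies outside the sample range; prediction would extrapolate")
    return a + b * x_new


for temperature in (16, 32):
    estimate = predict(temperature)
    print(f"x = {temperature}: y = {estimate:.3f}, about {round(estimate)} truckloads")
```

Calling `predict(0)` raises an error, which mirrors the warning about interpreting the intercept at x = 0.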

16 Pearson Coefficient of Correlation The Pearson coefficient of correlation R measures the linear association between two variables and may be used to test whether a linear relationship exists between y and x. Variables may be positively or negatively correlated: R = 1 denotes perfect positive correlation and R = -1 denotes perfect negative correlation. R is defined for: $-1 \le R \le 1$

17 Type of Relationship
Direct linear relationship (small or wide dispersion): positive linear correlation exists, 0 < r < +1
Inverse linear relationship (small or wide dispersion): negative linear correlation exists, -1 < r < 0
No linear relationship: no correlation, r = 0

18 Coefficient of Determination Squaring the Pearson coefficient of correlation gives the coefficient of determination R^2 in regression. It may be interpreted as the proportion of variation in the dependent variable y that is explained by the variation in the explanatory variable x. R^2 is a measure of the strength of the linear relationship between y and x.
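The "proportion of explained variation" reading can be made concrete by splitting the variation in y into an explained and an unexplained part. The following sketch assumes the usual decomposition R^2 = 1 - SSE/SST for simple linear regression; the names `sst` and `sse` are illustrative.

```python
# Temperature (x) and truckloads (y) from the example.
x = [11, 14, 20, 21, 23, 24, 26, 28, 30, 34]
y = [2.5, 6.5, 8.5, 10.5, 11, 12, 13, 13.5, 15.5, 19]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares fit.
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x

# Total variation in y, and the part the line leaves unexplained.
sst = sum((yi - mean_y) ** 2 for yi in y)
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - sse / sst
print(f"SST = {sst:.2f}, SSE = {sse:.2f}, R^2 = {r_squared:.4f}")
```

A value close to 1 means that most of the variation in truckloads is accounted for by temperature in this sample.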

19 Solution Step 3: Calculate R and R^2
Obs     Temp (x)   Trucks (y)   xy      x^2     y^2
1       11         2.5          27.5    121     6.25
2       14         6.5          91      196     42.25
3       20         8.5          170     400     72.25
4       21         10.5         220.5   441     110.25
5       23         11           253     529     121
6       24         12           288     576     144
7       26         13           338     676     169
8       28         13.5         378     784     182.25
9       30         15.5         465     900     240.25
10      34         19           646     1156    361
Total   231        112          2877    5779    1448.5
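Step 3 can be sketched with the column totals alone. The formula used here is the standard computational form of the Pearson coefficient, R = (n Σxy - Σx Σy) / sqrt((n Σx² - (Σx)²)(n Σy² - (Σy)²)); the transcript does not show the slide's own formula, so treat this as an assumption.

```python
import math

# Column totals for the n = 10 observations in the table.
n = 10
sum_x, sum_y = 231, 112
sum_xy, sum_x2, sum_y2 = 2877, 5779, 1448.5

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))

r = numerator / denominator   # Pearson coefficient of correlation
r_squared = r ** 2            # coefficient of determination

print(f"R = {r:.4f}, R^2 = {r_squared:.4f}")
```

The result is close to +1, which is consistent with a strong direct linear relationship between temperature and truckloads.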

20 Spearman Rank Correlation The standard coefficient of correlation allows us to determine whether there is evidence of a linear relationship between two interval variables. When the variables are ordinal, or when both variables are interval but the normality requirement is not satisfied, a nonparametric statistic called the Spearman rank correlation coefficient may be used instead.

21 Objective: Comparing 2 Variables (analyzing the relationship between two variables)
Data type?
Nominal: Chi-square test of a contingency table
Ordinal: Spearman rank correlation
Interval: Population distribution?
  Error is normal, or x and y bivariate normal: simple linear regression
  x and y not bivariate normal: Spearman rank correlation

22 Example Ranking Below is a list of organizational strengths that were ranked independently by management and staff. The managing director wished to know how closely the two assessments were correlated:
Business Aspect           Management   Staff
Brand Equity              1            1
Financial Controls        2            3
Customer Service          3            2
Planning Systems          4            6
Research & Development    5            4
Company Morale            6            7
Productivity              7            5

23 Calculating R_S
Business Aspect           Obs   Management   Staff   d     d^2
Brand Equity              1     1            1       0     0
Financial Controls        2     2            3       -1    1
Customer Service          3     3            2       1     1
Planning Systems          4     4            6       -2    4
Research & Development    5     5            4       1     1
Company Morale            6     6            7       -1    1
Productivity              7     7            5       2     4
Total                                                      12
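The rank differences and the resulting coefficient can be computed as follows. The sketch assumes the standard Spearman formula for rankings without ties, R_S = 1 - 6 Σd² / (n(n² - 1)); the transcript does not show the slide's formula, so this choice is an assumption.

```python
# Ranks assigned by management and by staff, in the order of the table.
management = [1, 2, 3, 4, 5, 6, 7]
staff = [1, 3, 2, 6, 4, 7, 5]

# Rank differences d and their squares.
d = [m - s for m, s in zip(management, staff)]
sum_d2 = sum(di ** 2 for di in d)

n = len(d)
# Spearman rank correlation, valid in this form when there are no tied ranks.
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(f"sum of d^2 = {sum_d2}, R_S = {r_s:.4f}")
```

With no tied ranks, this shortcut gives the same value as applying the Pearson formula directly to the two rank columns.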

