Everyone thinks they know this stuff


Linear Regression: Everyone thinks they know this stuff

For example, the above is a data set in X and Y. We are interested in finding the line whose slope and intercept are such that the sum of the squared residuals (the deviations of the actual Y-values from the line at a given value of X) is minimized. The line above is adjusted so that this minimum is achieved. This is easily proved using a bit of calculus, which seems unnecessary here. In producing a linear regression, one uses this method of "least squares" to determine the parameters. The important parameters which are determined are the following:

- The slope of the line (denoted as a)
- The intercept of the line (the value of Y at X = 0), denoted as b
- The scatter of the dispersion around the fitted line, denoted as s (also referred to as the standard error of the estimate)
- The error in the slope of the line, denoted as sa
- The correlation coefficient, denoted as r

The accuracy of any prediction of Y for some future value of X depends on s and sa.
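As a concrete sketch of the above (the slides themselves work in Excel; the sample data here are made up for illustration), the least-squares parameters in the slide's notation can be computed directly:

```python
import numpy as np

def least_squares_fit(x, y):
    """Fit Y = a*X + b by least squares, in the slide's notation:
    slope a, intercept b, standard error of the estimate s,
    slope error sa, correlation coefficient r."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    xbar, ybar = x.mean(), y.mean()
    sxx = np.sum((x - xbar) ** 2)
    syy = np.sum((y - ybar) ** 2)
    a = np.sum((x - xbar) * (y - ybar)) / sxx      # slope
    b = ybar - a * xbar                            # intercept (Y at X = 0)
    resid = y - (a * x + b)                        # deviations from the line
    s = np.sqrt(np.sum(resid ** 2) / (n - 2))      # standard error of estimate
    sa = s / np.sqrt(sxx)                          # error in the slope
    r = a * np.sqrt(sxx / syy)                     # correlation coefficient
    return a, b, s, sa, r

# Made-up sample data for illustration:
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.9]
a, b, s, sa, r = least_squares_fit(x, y)
print(f"a={a:.3f}, b={b:.3f}, s={s:.3f}, sa={sa:.3f}, r={r:.4f}")
```

Note that sa shrinks when the x-values are spread out (large sxx), which is the point made a few slides below.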

Who gives a shit about r? r² is called the coefficient of determination. Thus if r = 0.9, then r² = 0.81, which is taken to mean that, on average, 81% of the variation in Y is explained by X. For us physical scientists this is a meaningless statement, and hence we will not make use of these parameters in this course. An example is provided below. Please guard against the not-uncommon situation of "statistically significant" correlations (i.e., small p-values) that explain miniscule variations in the data (i.e., small r² values). For example, with 100 data points, a correlation that explains just 4% of the variation in y (i.e., r = 0.2) would be considered statistically significant (i.e., p < .05). Here is what such data looks like in a scatter plot: the correlation may be statistically significant, but it is probably not important in understanding the variation in y.
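The n = 100, r = 0.2 example can be checked numerically with the standard t-test for a correlation coefficient. A small sketch (the critical value 1.984 for 98 degrees of freedom is an assumed approximation, not from the slides):

```python
import math

# The slide's example: n = 100 points, r = 0.2, so r^2 = 0.04 (only 4% of
# the variation in y "explained"), yet the correlation tests as significant.
n, r = 100, 0.2
r_squared = r ** 2
# t-statistic for testing H0: true correlation = 0
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
t_crit = 1.984   # assumed approximate two-tailed 5% critical value, 98 d.o.f.
print(f"r^2 = {r_squared:.2f}, t = {t:.3f}, significant at p < .05: {t > t_crit}")
```

The t-statistic lands just above the critical value, so the correlation is "significant" even though it explains almost nothing about y.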

sx and sy Matter

Refer to the Excel spreadsheet.

The things that matter: If you want to minimize the error in the slope, what do you want in your data set?
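One answer follows from the formula above: since sa = s / sqrt(Σ(x − x̄)²), you want many points spread over as wide a range of X as possible. A small illustrative sketch (the two x-designs below are made up):

```python
import numpy as np

def slope_error(x, noise_sd=1.0):
    """Standard error of the fitted slope, sa = s / sqrt(sum((x - xbar)^2)),
    assuming the residual scatter s is roughly noise_sd in both designs."""
    x = np.asarray(x, float)
    return noise_sd / np.sqrt(np.sum((x - x.mean()) ** 2))

# Two hypothetical designs with the same number of points and the same noise:
narrow = np.linspace(4.5, 5.5, 20)    # 20 x-values bunched together
wide = np.linspace(0.0, 10.0, 20)     # 20 x-values spread out
print(f"narrow-design slope error: {slope_error(narrow):.3f}")
print(f"wide-design slope error:   {slope_error(wide):.3f}")
```

Spreading the same 20 points over ten times the range cuts the slope error by a factor of ten; adding more points at a fixed spread helps too, but only as sqrt(n).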

χ²: This is a very simple and useful statistic for all kinds of curve fitting. The chi-square statistic can be used to see whether or not the observed counts (data) agree with the predicted counts from your functional fit:

χ² = Σ (O − E)² / E, where O = observed count and E = expected count.

We only care about MINIMIZING this number, never about its actual statistical p-value or degrees of freedom. We use this just to determine a best fit from available alternatives.
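A minimal sketch of using χ² this way, to choose between two candidate fits (the observed counts and both sets of model predictions below are made up for illustration):

```python
def chi_square(observed, expected):
    """chi^2 = sum over bins of (O - E)^2 / E; among candidate fits,
    the one with the smaller value is the better fit."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up observed counts and two hypothetical model predictions:
observed = [12, 18, 25, 31]
model_a = [10, 20, 30, 40]
model_b = [13, 17, 26, 30]

print(f"chi^2 for model A: {chi_square(observed, model_a):.3f}")
print(f"chi^2 for model B: {chi_square(observed, model_b):.3f}")
# We keep whichever model gives the smaller chi^2 -- no p-value needed.
```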

Example: The quality of the fit is totally determined by the Leaf Cutter Ants.

Maybe there is a bad data point in the Leaf Cutter Ants. It is therefore extremely useful to do one cycle of sigma-clipping. In science we often use 2.5σ as the one-cycle threshold (this corresponds to p = .01; use 2.35σ if you want to be exact): all points that deviate by more than ±2.5σ from the initial fit can be rejected to produce another fit. You only get to do this clipping one time. This is used quite a bit, for instance, in climate analysis, where oftentimes the data is just bad. Unless you have some a priori way of ensuring that you have no bad data, you can always use this approach.
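One cycle of sigma-clipping can be sketched as follows (hypothetical data with a single planted outlier; np.polyfit does the straight-line fits):

```python
import numpy as np

def line_fit(x, y):
    """Least-squares straight-line fit; returns slope, intercept,
    residuals, and the residual scatter sigma."""
    a, b = np.polyfit(x, y, 1)
    resid = y - (a * x + b)
    return a, b, resid, resid.std(ddof=2)

def sigma_clip_once(x, y, k=2.5):
    """One cycle of sigma-clipping: fit, reject points deviating by more
    than k*sigma from that fit, then fit the survivors once more."""
    _, _, resid, sigma = line_fit(x, y)
    keep = np.abs(resid) <= k * sigma
    return line_fit(x[keep], y[keep]) + (keep,)

x = np.arange(10.0)
y = 2.0 * x + 1.0          # a clean linear relation...
y[4] += 30.0               # ...with one planted bad data point
a, b, resid, sigma, keep = sigma_clip_once(x, y)
print(f"slope={a:.3f}, intercept={b:.3f}, sigma={sigma:.3f}, "
      f"kept {keep.sum()} of {len(x)} points")
```

The planted outlier is the only point rejected, and the refit recovers the underlying line with a much smaller sigma, which is exactly the improvement claimed on the next slide.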

Example: sigma = 16.3 before clipping, sigma = 14.4 after. The accuracy of my prediction has now been improved!