Download presentation
Presentation is loading. Please wait.
Published byJonah Hutchinson Modified over 9 years ago
1
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 1 Lecture 5 Slide 1 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Intermediate Lab PHYS 3870 Lecture 4 Comparing Data and Models— Quantitatively Linear Regression References: Taylor Ch. 8 and 9 Also refer to “Glossary of Important Terms in Error Analysis”
2
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 2 Lecture 5 Slide 2 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Intermediate Lab PHYS 3870 Errors in Measurements and Models A Review of What We Know
3
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 3 Lecture 5 Slide 3 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Quantifying Precision and Random (Statistical) Errors
4
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 4 Lecture 5 Slide 4 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Single Measurement: Comparison with Other Data Comparison of precision or accuracy?
5
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 5 Lecture 5 Slide 5 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Single Measurement: Direct Comparison with Standard Comparison of precision or accuracy?
6
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 6 Lecture 5 Slide 6 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Multiple Measurements of the Same Quantity
7
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 7 Lecture 5 Slide 7 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Standard Deviation Multiple Measurements of the Same Quantity
8
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 8 Lecture 5 Slide 8 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Standard Deviation of the Mean Multiple Sets of Measurements of the Same Quantity
9
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 9 Lecture 5 Slide 9 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Errors Propagation—Error in Models or Derived Quantities
10
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 10 Lecture 5 Slide 10 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Specific Rules for Error Propagation (Worst Case)
11
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 11 Lecture 5 Slide 11 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 General Formula for Error Propagation
12
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 12 Lecture 5 Slide 12 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 General Formula for Multiple Variables Worst case Best case
13
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 13 Lecture 5 Slide 13 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Error for a function of Two Variables: Addition in Quadrature Consider the arbitrary derived quantity q(x,y) of two independent random variables x and y. Expand q(x,y) in a Taylor series about the expected values of x and y (i.e., at points near X and Y). Error Propagation: General Case Fixed, shifts peak of distribution FixedDistribution centered at X with width σ X Best case
14
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 14 Lecture 5 Slide 14 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Independent (Random) Uncertaities and Gaussian Distributions
15
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 15 Lecture 5 Slide 15 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Gaussian Distribution Function Width of Distribution (standard deviation) Center of Distribution (mean) Distribution Function Independent Variable Gaussian Distribution Function Normalization Constant
16
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 16 Lecture 5 Slide 16 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Standard Deviation of Gaussian Distribution Area under curve (probability that –σ<x<+σ) is 68% 5% or ~2σ “Significant” 1% or ~3σ “Highly Significant” 1σ “Within errors” 0.5 ppm or ~5σ “Valid for HEP” See Sec. 10.6: Testing of Hypotheses
17
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 17 Lecture 5 Slide 17 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Mean of Gaussian Distribution as “Best Estimate” Principle of Maximum Likelihood Best Estimate of X is from maximum probability or minimum summation Consider a data set Each randomly distributed with {x 1, x 2, x 3 …x N } To find the most likely value of the mean (the best estimate of ẋ ), find X that yields the highest probability for the data set. The combined probability for the full data set is the product Minimize Sum Solve for derivative set to 0 Best estimate of X
18
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 18 Lecture 5 Slide 18 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Uncertaity of “Best Estimates” of Gaussian Distribution Principle of Maximum Likelihood Best Estimate of X is from maximum probability or minimum summation Consider a data set{x 1, x 2, x 3 …x N } To find the most likely value of the mean (the best estimate of ẋ ), find X that yields the highest probability for the data set. The combined probability for the full data set is the product Minimize Sum Solve for derivative wrst X set to 0 Best estimate of X Best Estimate of σ is from maximum probability or minimum summation Minimize Sum Best estimate of σ Solve for derivative wrst σ set to 0 See Prob. 5.26
19
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 19 Lecture 5 Slide 19 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Weighted Averages Question: How can we properly combine two or more separate independent measurements of the same randomly distributed quantity to determine a best combined value with uncertainty?
20
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 20 Lecture 5 Slide 20 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Weighted Averages Best Estimate of χ is from maximum probibility or minimum summation Minimize SumSolve for best estimate of χ Solve for derivative wrst χ set to 0 Note: If w 1 =w 2, we recover the standard result X wavg = (1/2) (x 1 +x 2 )
21
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 21 Lecture 5 Slide 21 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Intermediate Lab PHYS 3870 Comparing Measurements to Models Linear Regression
22
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 22 Lecture 5 Slide 22 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Motivating Regression Analysis Question: Consider now what happens to the output of a nearly ideal experiment, if we vary how hard we poke the system (vary the input). SYSTEM Input Output Uncertainties in Observations The Universe The simplest model of response (short of the trivial constant response) is a linear response model y(x) = A + B x
23
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 23 Lecture 5 Slide 23 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Questions for Regression Analysis Two principle questions: What are the best values of A and B, for: (see Taylor Ch. 8) A perfect data set, where A and B are exact? A data set with uncertainties? What confidence can we place in how well a linear model fits the data? (see Taylor Ch. 9) The simplest model of response (short of the trivial constant response) is a linear response model y(x) = A + B x
24
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 24 Lecture 5 Slide 24 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Intermediate Lab PHYS 3870 Graphical Analysis
25
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 25 Lecture 5 Slide 25 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Graphical Analysis An “old School” approach to linear fits. Rough plot of data Estimate of uncertainties with error bars A “best” linear fit with a straight edge Estimates of uncertainties in slope and intercept from the error bars This is a great practice to get into as you are developing an experiment!
26
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 26 Lecture 5 Slide 26 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Is it Linear? Adding 2D error bars is sometimes helpful. A simple model is a linear model You know it when you see it (qualitatively) Tested with a straight edge Error bar are a first step in gauging the “goodness of fit”
27
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 27 Lecture 5 Slide 27 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Making It Linear or Linearization A simple trick for many models is to linearize the model in the independent variable. Refer to Baird Ch.5 and the associated homework problems.
28
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 28 Lecture 5 Slide 28 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Special Graph Paper Linear Semilog Log-Log “Old School” graph paper is still a useful tool, especially for reality checking during the experimental design process. Semi-log paper tests for exponential models. Log-log paper tests for power law models. Both Semi-log and log -log paper are handy for displaying details of data spread over many orders of magnitude.
29
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 29 Lecture 5 Slide 29 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Intermediate Lab PHYS 3870 Linear Regression
30
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 30 Lecture 5 Slide 30 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Basic Assumptions for Regression Analysis We will (initially) assume: The errors in how hard you poke something (in the input) are negligible compared with errors in the response (see discussion in Taylor Sec. 9.3) The errors in y are constant (see Problem 8.9 for weighted errors analysis) The measurements of y i are governed by a Gaussian distribution with constant width σ y
31
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 31 Lecture 5 Slide 31 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Question 1: What is the Best Linear Fit (A and B)? Best Estimate of intercept, A, and slope, B, for Linear Regression or Least Squares- Fit for Line
32
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 32 Lecture 5 Slide 32 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Best Estimates of A and B are from maximum probibility or minimum summation Minimize SumBest estimate of A and B Solve for derivative wrst A and B set to 0 “Best Estimates” of Linear Fit
33
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 33 Lecture 5 Slide 33 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Best Estimates of A and B are from maximum probibility or minimum summation Minimize SumBest estimate of A and B Solve for derivative wrst A and B set to 0 “Best Estimates” of Linear Fit In a linear algebraic formThis is a standard eigenvalue problemWith solutions is the uncertainty in the measurement of y, or the rms deviation of the measured to predicted value of y
34
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 34 Lecture 5 Slide 34 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Least Squares Fits to Other Curves
35
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 35 Lecture 5 Slide 35 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Least Squares Fits to Other Curves
36
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 36 Lecture 5 Slide 36 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Least Squares Fits to Other Curves
37
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 37 Lecture 5 Slide 37 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Fitting a Polynomial This leads to a standard 3x3 eigenvalue problem, Which can easily be generalized to any order polynomial Extend the linear solution to include on more tem, for a second order polynomial This looks just like our linear problem, with the deviation in the summation replace by
38
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 38 Lecture 5 Slide 38 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Least Squares Fits to Other Curves
39
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 39 Lecture 5 Slide 39 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Least Squares Fits to Other Curves
40
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 40 Lecture 5 Slide 40 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Least Squares Fits to Other Curves
41
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 41 Lecture 5 Slide 41 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Problem 8.24
42
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 42 Lecture 5 Slide 42 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Problem 8.24
43
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 43 Lecture 5 Slide 43 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Problem 8.24
44
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 44 Lecture 5 Slide 44 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Problem 8.24
45
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 45 Lecture 5 Slide 45 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Problem 8.24
46
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 46 Lecture 5 Slide 46 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Intermediate Lab PHYS 3870 Correlations
47
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 47 Lecture 5 Slide 47 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Uncertaities in a Function of Variables Consider an arbitrary function q with variables x,y,z and others. Expanding the uncertainty in y in terms of partial derivatives, we have If x,y,z and others are independent and random variables, we have If x,y,z and others are independent and random variables governed by normal distributions, we have We now consider the case when x,y,z and others are not independent and random variables governed by normal distributions.
48
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 48 Lecture 5 Slide 48 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Covariance of a Function of Variables The standard deviation of the N values of q i is We then find the simple result for the mean of q Note partial derivatives are all taken at X or Y and are hence the same for each i We now consider the case when x and y are not independent and random variables governed by normal distributions. Assume we measure N pairs of data (x i,y i ), with small uncertaities so that all x i and y i are close to their mean values X and Y. Expanding in a Taylor series about the means, the value q i for (x i,y i ), 00
49
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 49 Lecture 5 Slide 49 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Covariance of a Function of Variables The standard deviation of the N values of q i is If x and y are independent
50
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 50 Lecture 5 Slide 50 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Schwartz Inequality Show that See problem 9.7
51
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 51 Lecture 5 Slide 51 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Schwartz Inequality Combining the Schwartz inequality With the definition of the covariance yields Then completing the squares And taking the square root of the equation, we finally have At last, the upper bound of errors is And for independent and random variables
52
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 52 Lecture 5 Slide 52 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Another Useful Relation
53
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 53 Lecture 5 Slide 53 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Question 2: Is it Linear? Consider the limiting cases for: r=0 (no correlation) [for any x, the sum over y-Y yields zero] r=±1 (perfect correlation). [Substitute yi-Y=B(xi-X) to get r=B/|B|=±1] y(x) = A + B x
54
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 54 Lecture 5 Slide 54 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Tabulated Correlation Coefficient Consider the limiting cases for: r=0 (no correlation) r=±1 (perfect correlation). To gauge the confidence imparted by intermediate r values consult the table in Appendix C. r value N data points Probability that analysis of N=70 data points with a correlation coefficient of r=0.5 is not modeled well by a linear relationship is 3.7%. Therefore, it is very probably that y is linearly related to x. If Prob N (|r|>r o )<32% it is probably that y is linearly related to x Prob N (|r|>r o )<5% it is very probably that y is linearly related to x Prob N (|r|>r o )<1% it is highly probably that y is linearly related to x
55
LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 55 Lecture 5 Slide 55 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Uncertainties in Slope and Intercept Taylor: Relation to R 2 value:
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.